Computational Toxicology Research Program
More on DSSTox File Names
A standard DSSTox file naming convention has been adopted for all DSSTox documentation and data files. The file naming convention serves two primary aims:
- to convey useful information about the DSSTox file; and
- to uniquely identify and provide easy linkage back to current or archived DSSTox data and documentation files.
A publication reporting use of any DSSTox data file, which specifies the DSSTox file name, allows others to trace the exact origins and content of that file.
DSSTox SDF file names have the general form:
NAMEID_[ADD]_v1a_#records_date.ext
NAMEID = 6-letter unique database identifier assigned by the DSSTox project to each main DSSTox database, chosen to have some relationship to the database Source or content:
Examples include: EPAFHM, CPDBAS, etc.
EPAFHM = Environmental Protection Agency's Fathead Minnow Acute Toxicity Database
CPDBAS = Carcinogenic Potency Data Base Project All-Species
ADD = optional extension to NAMEID to identify specialized SDF data files whose content augments or is a modification of the main SDF data file:
Examples include: EPAFHM_DOP, NCTRER_DOP3D
DOP = Defined Organic Parent, subset of Main SDF that excludes organometallics, inorganics and mixtures, and that includes 2D parent structures (i.e. not salt or complex forms) in the structure field; discontinued in versions dated later than Nov04.
DOP3D = same as DOP file, but containing 3D representations of parent structures
v1a = version number (1, 2, 3, ...) and revision letter (a, b, c, ...):
version number will be incrementally increased with major modifications (such as reformatting) or record or field additions to the database (i.e., adding new chemicals or properties);
revision letter will be incrementally increased (starting from "a" with each new version number) to indicate minor revisions or error corrections to the database
e.g., v1a, v1b, ..., v2a, v2b, ...
#records = total number of chemical records in the database, generally corresponding to the total number of structures (or substances) included, although a record may be included without a structure, e.g. if it corresponds to an undefined chemical mixture.
date = date of file creation or publication formatted ddMmmyyyy, where dd=day, Mmm=3 letter abbreviation for month, yyyy=year:
e.g., 10Apr2006, 03Dec2005, etc.
Note: The date format was changed from ddMmmyy to ddMmmyyyy to conform to Y2K conventions in the most recently revised DSSTox Standard Chemical Fields (Aug 2005).
ext = type of file
e.g., sdf, doc, pdf
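The components defined above can be pulled apart programmatically. The following is a minimal Python sketch, not a DSSTox tool: it assumes the components are joined by underscores in the order NAMEID, optional ADD, version and revision, record count, date, followed by a dot and the extension. The names `parse_file_name` and `DSSToxFileName`, and the record count in the usage note, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Assumed layout: NAMEID_[ADD]_v1a_#records_date.ext
FILE_NAME_RE = re.compile(
    r"^(?P<nameid>[A-Z]{6})"                  # 6-letter database identifier
    r"(?:_(?P<add>[A-Za-z0-9]+))?"            # optional ADD, e.g. DOP or DOP3D
    r"_v(?P<version>\d+)(?P<revision>[a-z])"  # version number + revision letter
    r"_(?P<records>\d+)"                      # total number of records
    r"_(?P<day>\d{2})(?P<month>[A-Z][a-z]{2})(?P<year>\d{4})"  # ddMmmyyyy
    r"\.(?P<ext>[A-Za-z0-9]+)$"               # file type, e.g. sdf, doc, pdf
)


@dataclass(frozen=True)
class DSSToxFileName:
    nameid: str
    add: Optional[str]
    version: int
    revision: str
    records: int
    created: date
    ext: str


def parse_file_name(file_name: str) -> DSSToxFileName:
    """Split a file name of the assumed form into its components."""
    m = FILE_NAME_RE.match(file_name)
    if m is None or m["month"] not in MONTHS:
        raise ValueError(f"not a DSSTox-style file name: {file_name!r}")
    created = date(int(m["year"]), MONTHS.index(m["month"]) + 1, int(m["day"]))
    return DSSToxFileName(
        nameid=m["nameid"],
        add=m["add"],  # None when no ADD extension is present
        version=int(m["version"]),
        revision=m["revision"],
        records=int(m["records"]),
        created=created,
        ext=m["ext"],
    )
```

For instance, `parse_file_name("EPAFHM_DOP_v1a_580_10Apr2006.sdf")` (an invented name, with an illustrative record count) would yield the NAMEID `EPAFHM`, the ADD extension `DOP`, version 1, revision `a`, and the creation date 10 April 2006.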
Sample DSSTox SDF file names include:
DSSTox supplementary data files correspond to a particular SDF file and provide alternate formatting and representations of the data in that file for other uses. They include:
NAMEID_[ADD]_v1a_#records_date_nostructures.xls
This is an MS Excel data file in spreadsheet form that exactly mirrors the full data content of the corresponding SDF file, with the single exception that the structure field is excluded for all records. The file name matches the corresponding SDF file name, with the .sdf extension replaced by "_nostructures.xls".
View Sample nostructures.xls File (MS Excel 2003 file) (32 KB)
NAMEID_[ADD]_v1a_#records_date_structures.pdf
This is a pdf file containing a tiled table view of all the chemical structures contained in the database, i.e., a graphical summary of the contents of the structure fields only. The file name matches the corresponding SDF file name, with the .sdf extension replaced by "_structures.pdf".
View Sample structures.pdf File (PDF, 4 pp., 90 KB)
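Because both supplementary file names are derived from the SDF file name by replacing the .sdf extension, they can be computed directly. The sketch below is illustrative only; the function name `supplementary_names` and the sample file name in the usage note are hypothetical.

```python
def supplementary_names(sdf_name: str) -> dict:
    """Derive the supplementary file names that accompany an SDF file name."""
    suffix = ".sdf"
    if not sdf_name.endswith(suffix):
        raise ValueError(f"expected a file name ending in {suffix}: {sdf_name!r}")
    stem = sdf_name[: -len(suffix)]
    return {
        # Spreadsheet mirroring the SDF data, minus the structure field
        "nostructures": stem + "_nostructures.xls",
        # Tiled table view of the chemical structures only
        "structures": stem + "_structures.pdf",
    }
```

Given an invented name such as `EPAFHM_v1a_617_10Apr2006.sdf`, the function returns `EPAFHM_v1a_617_10Apr2006_nostructures.xls` and `EPAFHM_v1a_617_10Apr2006_structures.pdf`.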
DSSTox SDF files of a given NAMEID are associated with a standard, common set of documentation files. These include the following, with their corresponding naming conventions:
The DSSTox Log File, offered as a pdf file, provides summary counts, quality assurance history, missing or incomplete structure-related information in the database, and a documented history of all version and revision modifications. View Sample LogFile (PDF, 4 pp., 90 KB)
The DSSTox Field Definition File, offered as a pdf or MS Word doc file, is a reference document containing a table of detailed definitions and descriptions for all fields included in the NAMEID SDF files. View Sample FieldDefFile (PDF, 10 pp., 57 KB)
The DSSTox Log File plays a centrally important role in providing a historical record of modifications to DSSTox SDF files resulting in version and revision updates. A table is included in the NAMEID_LogFile that documents specific modifications in the NAMEID SDF file(s) that resulted in incremental increases of the version number and revision letter.
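The version and revision increments recorded in the Log File follow a simple rule: a minor revision advances the letter, while a major modification advances the number and restarts the letter at "a". A minimal sketch of that rule follows; the function names are hypothetical, and the behavior after revision "z" is not specified by the convention, so the sketch simply raises an error in that case.

```python
import re

TAG_RE = re.compile(r"^v(\d+)([a-z])$")


def _split_tag(tag: str):
    """Split a tag such as 'v1a' into (version number, revision letter)."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"not a version tag of the form v1a: {tag!r}")
    return int(m.group(1)), m.group(2)


def bump_revision(tag: str) -> str:
    """Minor revision or error correction: v1a -> v1b."""
    version, revision = _split_tag(tag)
    if revision == "z":
        # Assumption: the convention does not say what follows revision z.
        raise ValueError("no revision letter after 'z'")
    return f"v{version}{chr(ord(revision) + 1)}"


def bump_version(tag: str) -> str:
    """Major modification: v1b -> v2a (revision letter restarts at 'a')."""
    version, _ = _split_tag(tag)
    return f"v{version + 1}a"
```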
The DSSTox_FileID field is a DSSTox Standard Chemical Field that is included in every DSSTox SDF file. The contents of this field include a record integer counter (1, 2, ..., total # records) followed by an underscore and the abbreviated DSSTox File Name (NAMEID_v#r).
For example, the 14th record of the DSSTox SDF file named:
has the corresponding DSSTox_FileID field entry:
This field uniquely labels every DSSTox SDF file record with its exact file location, which is essential for File Error Reports. In addition, if a user merges a particular DSSTox SDF file with other DSSTox files, or incorporates it into a larger corporate database, the DSSTox_FileID field allows all records from the original file to be located in a single search step. This greatly facilitates replacing the original DSSTox file records with a new, updated version of that DSSTox SDF file.
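The single-step lookup described above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of DSSTox: records are represented as plain dictionaries keyed by field name, the function names are hypothetical, and the version tags in the usage note are invented.

```python
def make_file_id(record_number: int, nameid: str, version: int, revision: str) -> str:
    """Build a DSSTox_FileID: record counter, underscore, then NAMEID_v#r."""
    return f"{record_number}_{nameid}_v{version}{revision}"


def split_file_id(file_id: str):
    """Split a DSSTox_FileID into (record counter, abbreviated file name)."""
    counter, abbreviated_name = file_id.split("_", 1)
    return int(counter), abbreviated_name


def records_from_file(records, abbreviated_name: str):
    """Select every record in a merged collection that came from one file."""
    return [
        record
        for record in records
        if split_file_id(record["DSSTox_FileID"])[1] == abbreviated_name
    ]
```

For example, `make_file_id(14, "EPAFHM", 1, "a")` gives `14_EPAFHM_v1a`, and calling `records_from_file(merged, "EPAFHM_v1a")` on a merged collection returns only the records that originated in that file, ready to be swapped for an updated version.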