

Computational Toxicology Research Program

SDF Download Page


TOXCST: Research Chemical Inventory for EPA's ToxCast Program
Structure-Index File


** Version 4a, revised 20 March 2012
Significant update and expansion of chemical inventory file to include 767 new PhaseII chemicals and 800 new e1k chemicals.
Newly added Source fields tracking sub-inventory listings (PhaseI_v1, PhaseI_v2, PhaseII, donated pharma, e1k), and notes pertaining to analytical QC and the testing status of compounds, past and present.
Elimination of test duplicate compound entries.


Quick & Easy File Downloads: FTP Download Instructions

Description
Source Contacts & Website
Main Citation
SDF Fields
SDF Content Summary
Version 4 Update
SDF Download Table
Acknowledgements, DSSTox Citation & Disclaimer

New Users: For general information, see DSSTox Project Goals and About DSSTox. For additional information on DSSTox SDF (Structure Data Format) files and their use in Chemical Relational Databases, see More on SDF and More on CRDs.

Description: EPA's National Center for Computational Toxicology is conducting research to develop the ability to forecast toxicity based on bioactivity profiling: the ToxCast™ Program for Prioritizing Toxicity Testing of Environmental Chemicals. The goal is to develop methods of prioritizing chemicals for further screening and testing to assist EPA in the management and regulation of environmental contaminants. For more information on the purpose, goals, and parameters of the program, see the ToxCast™ Program Website and the Main Citations. Within the ToxCast Program, data are being generated on a library of environmentally or toxicologically relevant chemicals using various medium- and high-throughput screening (HTS) assays (e.g., cell-based, biochemical, transcriptional) to evaluate a broad spectrum of bioactivities potentially relevant to toxicity (see ToxCast™ Assays). PhaseI_v1 of the ToxCast Program focused on a set of 309 unique chemicals (revised count is 310), most of which were registered pesticidal actives having guideline toxicity animal studies. Thus, ToxCast Phase I included several hundred toxicity reference chemicals that were representative of varied structural classes and phenotypic outcomes (e.g., tumorigens, reproductive toxicants, neurotoxicants). Phase I data have been used to develop models and predictive signatures for specific phenotypic outcomes related to carcinogenicity, and reproductive and developmental toxicity (see ToxCast™ Publications).

For the next phase of ToxCast testing, to supply compounds for new assay technologies and for use as assay replicates, most of the original ToxCast Phase I (v1) chemicals were reprocured (v2). A small set of compounds was excluded from PhaseI_v2 because analytical testing detected problems with sample stability and solubility in DMSO. Phase II of the ToxCast program added 767 new unique compounds, spanning many different use cases (e.g., pesticides, industrial chemicals, drugs, food additives, fragrances) and a much broader range of chemical functionality and structural diversity than in Phase I. Phase II also includes 135 "failed pharma" compounds, for which physical samples along with various levels of pre-clinical and clinical data were donated by 6 pharmaceutical company partners (Pfizer, Sanofi, GlaxoSmithKline, Merck, Roche, and Astellas). With the addition of Phase II testing results, the combined and updated ToxCast inventory (PhaseI_v2 and PhaseII) will have nearly a full complement of the original ToxCast HTS assays (a small number of assays are being discontinued), and new assay technologies are also being added. In addition, 800 unique chemicals not already included in the full ToxCast inventory have been added as part of the "e1k" compound set and are undergoing testing in a subset of the ToxCast HTS assays (approximately 50 assays) specifically related to endocrine activity. The ToxCast e1k inventory thus includes the full ToxCast inventory plus the additional 800 unique chemicals, for a total of 1860 unique compounds. Details of the ToxCast e1k program and a listing of subject HTS assays for this program will be provided on the ToxCast™ Program Website. Finally, the complete current ToxCast and e1k chemical inventories have been incorporated into EPA's larger contribution to the full Tox21 testing program (see TOX21S).

The DSSTox TOXCST Structure-Index File offers the full complement of DSSTox Standard Chemical Fields for the combined ToxCast testing library, past and present, spanning Phase I (v1, v2), Phase II (including the subset of DonatedPharma), and e1k (which includes all compounds being tested in the endocrine-related subset of ToxCast assays) chemical inventories. Additionally, indicator columns are included that denote presence (1) or absence (blank) of a compound within the specified inventory: PhI_v1, PhI_v2, e1k, PhII_DonatedPharma, Tox21. A field entitled ToxCast_TestingStatus communicates the status of chemicals as they move into or out of testing in the various testing phases, such as when chemicals were reprocured in slightly different form, thus replacing a previously tested compound with a close analog, or when chemicals are retired from testing due to known problems with decomposition or solubility in DMSO. Such problems are annotated within the newly added field AnalyticalQC_Problems. Additional notes, error corrections or version updates are included in the field Note_TOXCST. The DSSTox TOXCST chemical structure inventory is fully contained within the EPA ACToR online database and provides the primary chemical annotation for all ToxCast bioassay information published within EPA ACToR. The DSSTox TOXCST file provides easy access to high quality chemical structures associated with all published assay data and will be available for structure-searching from the DSSTox Structure-Browser, as well as in association with HTS assay data from various websites: EPA ACToR, NCGC Tox21 Browser, and PubChem.
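As a rough illustration of how the indicator columns might be used, the sketch below reads the data items of an SDF file with a minimal hand-written parser and selects the records flagged for one sub-inventory. It relies only on the general SDF conventions (records end with a `$$$$` line, data items start with a `> <FieldName>` header and end with a blank line). The exact spelling of the indicator field names in the distributed file, the sample records, and the SID values are assumptions made for illustration.

```python
import re

# Matches an SDF data-item header such as "> <PhI_v1>".
FIELD_HEADER = re.compile(r"^>.*<([^>]+)>")

def parse_sdf_fields(text):
    """Return one dict of data fields per SDF record (structure block is ignored)."""
    records = []
    for block in text.split("$$$$"):
        if not block.strip():
            continue
        fields = {}
        name = None
        for line in block.splitlines():
            header = FIELD_HEADER.match(line)
            if header:
                name = header.group(1)
                fields[name] = ""
            elif name is not None:
                if line.strip() == "":
                    name = None  # a blank line ends the current data item
                elif fields[name]:
                    fields[name] += "\n" + line
                else:
                    fields[name] = line
        records.append(fields)
    return records

def in_inventory(record, indicator):
    """True when the indicator field holds 1; blank or missing means absent."""
    return record.get(indicator, "").strip() == "1"

# Two hypothetical records; field names and values are illustrative only.
SAMPLE = """\
Hypothetical compound A
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <DSSTox_Generic_SID>
10001

> <PhI_v1>
1

> <e1k>
1

$$$$
Hypothetical compound B
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <DSSTox_Generic_SID>
10002

> <PhI_v1>

> <e1k>
1

$$$$
"""

records = parse_sdf_fields(SAMPLE)
phase1_sids = [r["DSSTox_Generic_SID"] for r in records if in_inventory(r, "PhI_v1")]
e1k_sids = [r["DSSTox_Generic_SID"] for r in records if in_inventory(r, "e1k")]
print(phase1_sids)  # ['10001']
print(e1k_sids)     # ['10001', '10002']
```

For real work, a cheminformatics toolkit that reads SDF natively is likely a better choice than this hand-written parser, which does not interpret the structure block at all.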

** Note: The DSSTox TOXCST current compound inventory is almost entirely incorporated within the larger (approx. 3700 compounds) EPA contribution to the DSSTox TOX21S structure-inventory file (with the exception of one compound included in a salt vs. a retired parent form in the two inventories). This provides significant chemical overlap between the ToxCast and Tox21 HTS testing programs, which differ in terms of their largely non-overlapping HTS assays and associated data. Hence, the bioassay profiles of the DSSTox TOXCST compounds will be significantly expanded within the Tox21 HTS program.


Source Contacts: For questions concerning the EPA ToxCast Program, contact Keith Houck, email: houck.keith@epa.gov or David Dix, email: dix.david@epa.gov. For questions concerning the DSSTox files or the TOXCST chemical inventory, contact Ann Richard, email: richard.ann@epa.gov.

Source Website: EPA National Center for Computational Toxicology ToxCast™ Program


Main Citations: Publications reporting use of the DSSTox TOXCST data files are asked to list the full DSSTox file name, including date stamp, and to cite as primary reference the following:

Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, W., Kavlock, R.J. (2007) The ToxCast program for prioritizing toxicity testing of environmental chemicals. Tox. Sci., 95:5-12.

Also see the ToxCast Website for more information on ToxCast Data Sets & Published Research: http://www.epa.gov/ncct/toxcast/data.html

TOXCST SDF Fields (28 total)

DSSTox Standard Chemical Fields (19)

ToxCast_e1k (new field)


TOXCST SDF Content Summary - 20 March 2012

Totals_v3a* Totals_v4a
# Records
DSSTox Standard Chemical Fields
TOXCST Source Fields
Total # Fields
Chemical Content
Counts_v3a Counts_v4a
defined organic
no structure (blank entry)
salt or complex
single chemical compound
mixture or formulation

* v3a contained 5 sets of duplicate compounds and 3 sets of triplicate compounds; from v4a onward, DSSTox file includes no duplicate substances.


Version 4a Update: The most significant change in this version is the addition of over 1500 unique compound records from the Phase II and e1k programs, including discontinued and newly added compounds, as well as the addition of 6 new inventory indicator fields marking the presence or absence of a particular compound in a subinventory. Additionally, the file carries updated documentation pertaining to testing status and some summary analytical QC results for PhaseI_v1 compounds, supporting reasons for discontinuation of a compound in the test library. A change in practice as of this date is to no longer include compound sample duplicates (i.e., compounds having the same DSSTox_Generic_SID) in any DSSTox Data File, thus eliminating the need for the ChemicalReplicateCount field. In addition, the Relationship_CID field has been eliminated, to be replaced by alternate means for mapping related chemicals throughout DSSTox files (feature in development). Minor corrections to structure-annotation fields resulting from continuing QC are documented in the Note_TOXCST field.
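The no-duplicates rule can be checked mechanically: no two records should share a DSSTox_Generic_SID. The following is a minimal sketch of such a check, not the DSSTox team's own procedure. It assumes the records have already been loaded as dictionaries keyed by field name; the sample records and SID values are hypothetical.

```python
from collections import Counter

def duplicate_sids(records, key="DSSTox_Generic_SID"):
    """Return {sid: count} for every SID that appears in more than one record."""
    counts = Counter(r[key] for r in records if r.get(key))
    return {sid: n for sid, n in counts.items() if n > 1}

def drop_duplicates(records, key="DSSTox_Generic_SID"):
    """Keep only the first record seen for each SID; records without a SID are kept."""
    seen = set()
    kept = []
    for record in records:
        sid = record.get(key)
        if sid and sid in seen:
            continue
        if sid:
            seen.add(sid)
        kept.append(record)
    return kept

# Hypothetical records in the older style, where sample duplicates were allowed.
older_style = [
    {"DSSTox_Generic_SID": "20001", "TestSubstance_ChemicalName": "Compound X"},
    {"DSSTox_Generic_SID": "20002", "TestSubstance_ChemicalName": "Compound Y"},
    {"DSSTox_Generic_SID": "20001", "TestSubstance_ChemicalName": "Compound X"},
]

print(duplicate_sids(older_style))                    # {'20001': 2}
deduplicated = drop_duplicates(older_style)
print(len(deduplicated), duplicate_sids(deduplicated))  # 2 {}
```

An empty result from `duplicate_sids` is what a file following the v4a practice would be expected to produce.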


File Download Notes: The following files are offered in the Download table below:

Log File (PDF) provides SDF data file version history and summary information (field, chemical counts, etc.), and a description of procedures and quality assurance checks used in SDF file creation;
Structure Data File (SDF) is the main DSSTox product, providing the complete inventory of chemical structures, DSSTox Standard Chemical Fields, and all Source-specific data fields [Note: the structure field is blank for all records containing mixtures or undefined substances];
Data Table MS Excel (MS Office 2003) file contains the full SDF data contents in spreadsheet table form, minus the chemical structure field [file created with CambridgeSoft ChemFinder plug-in to MS Excel 2003];
Structures Table (PDF) file contains a tiled format graphical view of all chemical structures contained in the SDF file, annotated with TestSubstance_CASRN and truncated TestSubstance_ChemicalName field entries for the tested form of the chemical [file created with ACD ChemFolder, ver. 12, ACD Labs].
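Since the structure field is blank for mixtures and undefined substances, it can be useful to identify those records before passing the file to structure-based tools. The sketch below reads the atom count from the V2000 counts line, which is the fourth line of each record's mol block, with the atom count in its first three characters. It assumes that a blank structure is written as a mol block with zero atoms; that encoding, and the two sample records, are assumptions for illustration.

```python
def split_records(sdf_text):
    """Split SDF text into records (lists of lines) on the '$$$$' delimiter line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records

def atom_count(record_lines):
    """Atom count from the V2000 counts line; 0 if the line is missing or unreadable."""
    if len(record_lines) < 4:
        return 0
    try:
        return int(record_lines[3][:3])
    except ValueError:
        return 0

# Two hypothetical records: one single-atom structure, one blank structure.
SAMPLE_SDF = "\n".join([
    "Hypothetical single compound",
    "  sketch",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <TestSubstance_ChemicalName>",
    "Hypothetical single compound",
    "",
    "$$$$",
    "Hypothetical mixture",
    "  sketch",
    "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <TestSubstance_ChemicalName>",
    "Hypothetical mixture",
    "",
    "$$$$",
])

sdf_records = split_records(SAMPLE_SDF)
blank_structures = [rec[0] for rec in sdf_records if atom_count(rec) == 0]
print(blank_structures)  # ['Hypothetical mixture']
```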

You will need Adobe Acrobat Reader, available as a free download, to view the Adobe PDF files on this page. See EPA's PDF page to learn more about PDF, and for a link to the free Acrobat Reader.
Zip files may be decompressed using a utility such as JZip.

File Types   Description   File Size   Format

Documentation Files: TOXCST
Log File (PDF)
Data Files: TOXCST
SDF Structure Data File (SDF)
• Data Table (no structures) (Excel)
• Structures Table (PDF)

These files constitute the main DSSTox products. Documentation Files use standard templates, and DSSTox Structure Data Files and DSSTox File Names adhere to strict formatting standards and conventions. For additional information, see More on DSSTox Standard Chemical Fields, Known Problems & Fixes, Chemical Information Quality Review Procedures, and How to Use DSSTox Files.



Acknowledgements: The DSSTox SDF file of the original and updated TOXCST chemical inventory was largely compiled from lengthy and careful review of primary Source documentation accompanying chemical sample procurements (i.e., mainly Certificates of Analysis and Material Safety Data Sheets). Chemical structure annotation, QA review, and problem resolution were carried out by Maritja Wolf (Lockheed Martin, Contractor for EPA) with assistance from Inthirany Thillainadarajah (EPA SEE grant program). Stephen Little (EPA) provided valuable assistance in reviewing and indexing Certificates of Analysis for chemicals in PhaseI_v1 of the ToxCast Research Project. We thank Keith Houck for valuable assistance in processing information for the donated pharmaceutical chemicals, Richard Judson for a major role in nominating and finalizing the Phase II and e1k chemical inventories, and the full NCCT ToxCast team for contributions to, and support of this effort. We also thank the staff at Compound Focus, Inc. (contractor to the US EPA) for expert assistance in the procurement, tracking, analysis, processing, and plating of the EPA ToxCast and e1k compound libraries.

DSSTox Citation: Houck, K., D. Dix, R. Judson, M. Martin, M. Wolf, R. Kavlock, and A.M. Richard (2012) DSSTox EPA ToxCast High Throughput Screening Testing Chemicals Structure-Index File: SDF File and Documentation, Updated version: TOXCST_v4a_1892_20Mar2012, www.epa.gov/ncct/dsstox/sdf_toxcst.html

Disclaimer: Every effort is made to ensure that DSSTox SDF files and associated documentation are error-free, but neither the DSSTox Source collaborators nor the EPA DSSTox project team make guarantees of accuracy, nor are any of these persons to be held liable for any subsequent use of these public data. The contents of this webpage and supporting documents have been subjected to review by the EPA National Center for Computational Toxicology and approved for publication. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use. See additional disclaimers.


