Computational Toxicology Research Program
SDF Download Page
CPDBAS: Carcinogenic Potency Database Summary Tables - All Species Database File
** Version 5d, revised 20 November 2008: Refinements of field descriptions and adjustments to a few ActivityCategory "inconclusive" assignments. Incorporation of a new field, Substance_Modify_yyyymmdd, to track structure modifications across the DSSTox inventory.
Quick & Easy File Downloads: FTP Download Instructions
- Source Website & Contact
- Main Citations
- Guidance for Use
- Version 5 Update
- SDF Fields
- Data Content Summary
- SDF Download Table
- Acknowledgements, DSSTox Citation & Disclaimer
For general information, see DSSTox Project Goals and About DSSTox. For additional information on DSSTox SDF (Structure Data Format) files and their use in Chemical Relational Databases, see More on SDF and More on CRDs.
The CPDB Summary Tables list summarized results for experiments on 1547 substances in the Carcinogenic Potency Database (CPDB). These Summary Tables report the strongest evidence of carcinogenicity for each chemical, in each sex/species and represent one of many possible summarizations of the data in the CPDB. The CPDB includes detailed results and analyses of 6540 chronic, long-term carcinogenesis bioassays reported in 1513 papers in the general literature and 452 Technical Reports of the National Cancer Institute/National Toxicology Program. Details on each experiment can be obtained from the CPDB Source website. For all experiments in the CPDB, detailed information is provided on each experiment, including the species, sex, strain, route of administration, duration of exposure and of experiment, dose levels, target sites, shape of the dose response, estimates of carcinogenic potency and the confidence limits surrounding it, statistical significance of the carcinogenic dose response, tumor incidences, and bibliographic citation to the published paper or to the NCI/NTP Technical Report. All experiments in the complete database meet a specific set of inclusion criteria that were designed to permit the estimation of carcinogenic potency; therefore, reasonable consistency of experimental protocols is assured. Bioassays are included in the database only if the test agent was administered alone rather than in combination with other substances; if the bioassay included a control group; if the route of administration was either diet, water, gavage, inhalation, intravenous injection or intraperitoneal injection; and if the length of experiment was at least half the standard lifespan for the species with dosing for at least one fourth lifespan. 
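The inclusion criteria listed above can be read as a simple predicate over a bioassay record. The following Python sketch is illustrative only: the `Bioassay` class, its field names, and the example values are hypothetical and are not part of the CPDB or of any DSSTox file. The standard lifespan for a species is passed in rather than assumed.

```python
from dataclasses import dataclass

# Routes of administration accepted by the inclusion criteria above.
ALLOWED_ROUTES = {
    "diet", "water", "gavage", "inhalation",
    "intravenous injection", "intraperitoneal injection",
}

@dataclass
class Bioassay:
    """Hypothetical record; all durations share one time unit."""
    administered_alone: bool
    has_control_group: bool
    route: str
    experiment_length: float
    dosing_length: float
    standard_lifespan: float  # supplied by the caller for the species

def meets_inclusion_criteria(b: Bioassay) -> bool:
    """True if the bioassay satisfies every criterion in the paragraph above."""
    return (
        b.administered_alone
        and b.has_control_group
        and b.route in ALLOWED_ROUTES
        and b.experiment_length >= b.standard_lifespan / 2
        and b.dosing_length >= b.standard_lifespan / 4
    )

# Made-up example: 60 units on test, 30 units dosed, lifespan of 100 units.
example = Bioassay(True, True, "gavage", 60.0, 30.0, 100.0)
print(meets_inclusion_criteria(example))  # True
```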
Evidence of carcinogenicity in an experiment was not determined for the CPDB; rather, the CPDB reports the evaluation of the published author of each experiment and the statistical significance of the tumorigenic dose response calculated by the CPDB. Fewer than thirty percent of the chemicals have been tested by the NCI/NTP.
A CPDB Margin of Exposure Graphic recently posted on the CPDB Source website offers a broad perspective on the difference between dose of a substance that induces tumors in rodent tests versus estimated human exposures, where such data are available. Margin of Exposure (MOE) indicates how many times lower the average human exposure is than the dose to give tumors to 10% of rats or mice in cancer tests (LTD10 in mg/kg/day). The graphic is accompanied by an MOE Table that ranks MOE (LTD10/Human Exposure) along with estimates of Human Intake Amount (mg/kg/day), Rat and Mouse LTD10, and corresponding literature references for more than 100 human exposures, about half of which are to synthetic chemicals and half to naturally-occurring chemicals.
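As a worked illustration of the ratio described above, the following Python sketch computes MOE as LTD10 divided by the estimated human exposure, both in mg/kg/day. The function name and the numeric values are hypothetical and are not taken from the MOE Table.

```python
def margin_of_exposure(ltd10_mg_kg_day: float, human_exposure_mg_kg_day: float) -> float:
    """Return MOE = LTD10 / human exposure (both in mg/kg/day)."""
    if human_exposure_mg_kg_day <= 0:
        raise ValueError("human exposure must be positive")
    return ltd10_mg_kg_day / human_exposure_mg_kg_day

# Made-up values for illustration only.
moe = margin_of_exposure(ltd10_mg_kg_day=100.0, human_exposure_mg_kg_day=0.25)
print(moe)  # 400.0
```

A larger MOE means the estimated human exposure lies further below the rodent LTD10.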
The CPDB Summary Tables provide an overview of bioassay results on a given chemical with information on: the sex-species groups tested, and the strongest level of evidence for carcinogenicity based on the opinion of a published author; the carcinogenic potency value (TD50), and target organs in each species tested. If there are both positive and negative experiments in a sex-species, the negative results are ignored in the Summary Table. When there is more than one positive experiment in a species, the TD50 value in the CPDB Summary Table is the harmonic mean of the most potent TD50 value from each positive experiment in the species. The target organs listed for each sex-species group may be from more than one positive experiment. Summary evaluations of mutagenicity in Salmonella (positive or negative) are also reported. CPDB Summary Table results are provided for rats, mice, hamsters, dogs, and nonhuman primates.
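The harmonic-mean rule described above can be sketched in Python as follows. The input layout (one list of TD50 values per positive experiment) and the numbers are hypothetical. The sketch takes the most potent value in an experiment to be the lowest TD50, since a lower TD50 corresponds to higher potency.

```python
from statistics import harmonic_mean

def summary_td50(positive_experiments):
    """Harmonic mean of the most potent (lowest) TD50 from each positive experiment.

    positive_experiments: list of lists of TD50 values (mg/kg/day),
    one inner list per positive experiment in the species.
    """
    most_potent = [min(td50_values) for td50_values in positive_experiments]
    return harmonic_mean(most_potent)

# Two made-up positive experiments in one species.
experiments = [[10.0, 40.0], [40.0]]
print(summary_td50(experiments))  # 16.0, i.e. 2 / (1/10 + 1/40)
```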
DSSTox SDF files were originally provided separately for each of 4 data tables. In CPDBAS_v2a and later versions, all data tables have been consolidated into a single SDF for easier implementation and use in a relational database application. In CPDBAS_v3b and later, URLs are provided to chemical-specific data page summaries posted on the CPDB Source Website. In CPDBAS_v5b and later, summary activity fields are provided for each rodent species, with primates grouped into a single summary field, consistent with representation of DSSTox CPDB Assays in PubChem.
The complete, updated CPDB of 6540 experiments, from which the Summary Tables and CPDBAS are derived, is available in several formats at http://potency.berkeley.edu/.
Please contact Lois Swirsky Gold for questions pertaining to the content of the CPDB Summary Tables; email: firstname.lastname@example.org. Please Contact Us for questions or comments pertaining to the DSSTox CPDBAS SDF files.
Publications reporting use of DSSTox SDF files for the CPDB Summary Tables are asked to list the full DSSTox file name(s), including date stamp, and to cite as primary references the following citations:
Gold, L.S., T.H. Slone, B.N. Ames, N.B. Manley, G.B. Garfinkel, and L. Rohrbach (1997) Chapter 1: Carcinogenic Potency Database. In: Gold, L.S., and E. Zeiger, Eds. Handbook of Carcinogenic Potency and Genotoxicity Databases. Boca Raton, FL: CRC Press, pp. 1-605. http://potency.berkeley.edu/text/methods.html
Gold, L.S., N.B. Manley, T.H. Slone, and J.M. Ward (2001) Compendium of chemical carcinogens by target organ: Results of chronic bioassays in rats, mice, hamsters, dogs and monkeys. Toxicol. Pathol. 29: 639-652. http://potency.berkeley.edu/text/ToxicolPathol.pdf
Gold, L. S., Manley, N. B., Slone, T. H., Rohrbach, L., and Garfinkel, G.B. Supplement to the Carcinogenic Potency Database (CPDB): Results of Animal Bioassays Published in the General Literature through 1997 and by the National Toxicology Program in 1997 and 1998. Toxicol. Sci. 85: 747-800 (2005). http://potency.berkeley.edu/pdfs/ToxSciPlot.pdf
Carcinogenic Potency Database (2008) http://potency.berkeley.edu/
Summary Table of Chemicals in the Carcinogenic Potency Database: Results for Positivity, Potency (TD50), and Target Sites: http://potency.berkeley.edu/chemicalsummary.html
See also, a full listing of Data Plots (from which the CPDB Summary Table was derived) and CPDB publications at:
These CPDB Summary Tables can be used as an overview of the literature of animal cancer tests, as an index of results in the CPDB, and to investigate associations between carcinogenic potency and other factors such as mutagenicity, teratogenicity, chemical structure, and human exposure. The complete CPDB, which is summarized in these tables, has been fully described in several data plots and publications (see Main Citations above), and details of the inclusion criteria, protocol characteristics, and derived variables are available at http://potency.berkeley.edu/methods.html. A user of the DSSTox CPDBAS SDF file is encouraged to consult the extensive documentation and content of the CPDB website [http://potency.berkeley.edu/] for further information, details, and references pertaining to each entry of the CPDB Summary Tables. In addition, the CPDBAS Field Definition File below contains definitions of each field in the DSSTox CPDBAS SDF file, and information pertaining to any discrepancies between these fields (and their contents) and the data in the original CPDB Summary Tables. The CPDBAS Field Definition File contains essential documentation and should be downloaded with, and accompany, any use of the DSSTox CPDBAS SDF file. The CPDBAS LogFile provides SDF file summary information (field, chemical counts, etc.), a description of procedures and quality assurance checks used in SDF file creation, and a listing of unavailable CASRN or Structure information in the SDF file. The Log File also contains a full version history, documenting modifications incorporated into version/revision updates of the DSSTox CPDBAS (and earlier) SDF files.
CPDBAS_v3b and later versions contain the full inventory of chemical-specific data page URLs from the CPDB Source Website Chemical Index that can be used in combination with structure-searching capabilities to link to the specific chemical data page summary on the CPDB website. These links can be accessed from within the newly launched DSSTox Structure-Browser (August 2007). The CPDB Source Website Chemical Index data pages also have directly incorporated some DSSTox Standard Chemical Field content (SMILES, InChI's).
To report errors in any CPDBAS documentation or data file, click on File Error Report here or below. For additional information on DSSTox SDF files and their use in Chemical Relational Databases, see More on SDF and More on CRDs.
Note: CPDBAS_v3a was previously posted to PubChem as a single "activity" Assay ID (AID) file, with activity defined by the ActivityOutcome_CPDBAS_SingleCellCall. CPDBAS_v5c and later versions are deposited under seven separate CPDBAS_AID listings, each representing one of the several indicators of summarized mutagenic and carcinogenic activity contained within CPDBAS_v5c, i.e. ActivityOutcome_CPDBAS_[Mutagenicity, Rat, Mouse, Hamster, etc], ActivityScore_CPDBAS_[Mutagenicity, Rat, Mouse, Hamster, etc].
CPDBAS_v5a represents a major update of the CPDBAS data file, with 66 new chemical records added and over 400 new or modified experimental results affecting nearly 100 existing data records. These changes correspond to the most recently published data in the CPDB Summary Tables posted on the CPDB Source website. In addition, further quality review and a user error report led to correction of 3 previously included chemical structures and a small number of previous experimental results containing transcription errors. For more information and version history, consult the CPDBAS Log File in the Download Table below.
The significant number of new chemical records added to v5a presents a unique opportunity to structure-activity modelers to use these new data for validation of existing carcinogenicity models; however, the significant number of modified and new experimental results in v5a should necessitate model rederivation prior to validation. Of the 66 new records, 59 are defined organics, 53 correspond to the parent structure (i.e., neither salt nor complex), and 51 are a "single chemical compound" as opposed to a macromolecule or mixture. To assist users in locating modified or newly added records from one DSSTox CPDBAS file version to the next, we have incorporated controlled text entries (separated by semicolons) into the Note_CPDBAS field where applicable, e.g. for this v5a update:
- chemical added v5a ....(66 instances)
- Rat added v5a ....(63 instances)
- Mouse added v5a ....(42 instances)
- TD50_Rat_Note modified v5a ....(19 instances)
- Mutagenicity_SAL_CPDB added v5a (33 instances)
- TD50_Rat modified v5a ....(13 instances)
- TargetSites_Mouse_Female modified v5a ....(8 instances)
- ... etc
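The controlled text entries listed above lend themselves to simple filtering. The following Python sketch splits a Note_CPDBAS value on semicolons and selects records carrying a given entry. The record layout (a list of dicts keyed by field name) and the sample records are hypothetical; only the field name Note_CPDBAS and the entry wording come from this page.

```python
def parse_note(note):
    """Split a semicolon-separated Note_CPDBAS value into clean entries."""
    return [entry.strip() for entry in note.split(";") if entry.strip()]

def records_with_entry(records, entry):
    """Return the records whose Note_CPDBAS field contains the exact entry."""
    return [r for r in records if entry in parse_note(r.get("Note_CPDBAS", ""))]

# Made-up records for illustration only.
records = [
    {"id": 1, "Note_CPDBAS": "chemical added v5a; Rat added v5a"},
    {"id": 2, "Note_CPDBAS": "TD50_Rat modified v5a"},
    {"id": 3, "Note_CPDBAS": ""},
]
new_chemicals = records_with_entry(records, "chemical added v5a")
print([r["id"] for r in new_chemicals])  # [1]
```

Exact matching on the split entries avoids false hits that a plain substring search could produce, for example "Rat added v5a" inside a longer, differently worded entry.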
Also of interest to structure-activity modelers should be the various summary activity representations provided in CPDBAS_v5 and later. These various options for considering carcinogenicity activity or inactivity should provide useful avenues for modelers to explore.
Version 5b Revision: CPDBAS_v5b includes no new carcinogenicity experimental data, but includes several new and modified summary activity fields for use in PubChem and structure-activity relationship studies (see CPDBAS SDF Field listing below). These include:
ActivityOutcome_CPDBAS_... fields for Mutagenicity (Salmonella), SingleCellCall, MultiCellCall, Rat, Mouse, Hamster, and Dog_Primates, where field entry is one of the following:
- "active"... at least one positive experiment reported in CPDB, with TD50 and tumor site listed
- "inactive" ... at least one experiment reported with "no positive results", and no other positive experiment reported in CPDB
- "inconclusive" ... NCI/NTP bioassays were the only available experiments in the species, and results for both sexes in the species were evaluated by NCI/NTP as inadequate; one case (Hamster) for which TD50 and tumor results were listed but were not statistically significant
- blank ... no experiments reported in CPDB
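The order in which the outcome definitions above apply can be sketched as a small decision function. This is a simplified reading, not the procedure used to build the file: the experiment representation (dicts with hypothetical `result` and `source` keys) is an assumption, the "both sexes" condition is reduced to "every experiment was judged inadequate", and the single Hamster special case is not modeled.

```python
def activity_outcome(experiments):
    """Return an ActivityOutcome-style label for one species.

    experiments: list of dicts with hypothetical keys
      "result": "positive", "negative", or "inadequate"
      "source": "NCI/NTP" or "literature"
    """
    if not experiments:
        return ""  # blank: no experiments reported
    if any(e["result"] == "positive" for e in experiments):
        return "active"
    if all(e["source"] == "NCI/NTP" and e["result"] == "inadequate"
           for e in experiments):
        return "inconclusive"
    if any(e["result"] == "negative" for e in experiments):
        return "inactive"
    # Combinations the definitions above do not describe.
    raise ValueError("combination not covered by the field definitions")

# Made-up example: one negative and one positive experiment.
print(activity_outcome([
    {"result": "negative", "source": "literature"},
    {"result": "positive", "source": "NCI/NTP"},
]))  # active
```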
ActivityScore_CPDBAS_... fields for Rat, Mouse, Hamster, and Dog_Primates, where field entry is one of the following:
- Log(1/TD50mmol), mapped onto 1-100 integer range ... for "active" chemicals (see above)
- "0" ... for "inactive" chemicals (see above)
- blank ... no experiments reported in CPDB
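This page does not give the exact transformation behind the 1-100 score, so the following Python sketch is an assumption rather than the DSSTox procedure: it uses a base-10 logarithm, converts TD50 from mg to mmol with a molecular weight, and rescales linearly between caller-supplied bounds `lo` and `hi`. All function and parameter names are hypothetical.

```python
import math

def log_potency(td50_mg_kg_day, mol_weight_g_mol):
    """log10(1 / TD50) with TD50 expressed in mmol/kg/day.

    g/mol is numerically equal to mg/mmol, so mg divided by MW gives mmol.
    """
    td50_mmol = td50_mg_kg_day / mol_weight_g_mol
    return math.log10(1.0 / td50_mmol)

def activity_score(value, lo, hi):
    """Linearly map value in [lo, hi] onto the integers 1..100 (assumed scheme)."""
    if not lo < hi:
        raise ValueError("lo must be smaller than hi")
    if not lo <= value <= hi:
        raise ValueError("value lies outside [lo, hi]")
    fraction = (value - lo) / (hi - lo)
    return 1 + round(fraction * 99)

# Made-up bounds: the lowest value maps to 1, the highest to 100.
print(activity_score(-2.0, lo=-2.0, hi=4.0))  # 1
print(activity_score(4.0, lo=-2.0, hi=4.0))   # 100
```

Under this assumed scheme, higher scores correspond to lower TD50 values, that is, to more potent carcinogens.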
Note: Some fields, such as ActivityOutcome_CPDBAS_SingleCellCall will have entries for all 1547 compounds, whereas others, such as ActivityOutcome_CPDBAS_Hamster will have non-blank entries only for the relatively small number of compounds in the CPDB with experiments reported in Hamsters (i.e., 87 total).
Since two species listed in the Source CPDB Summary Tables (Bush Baby and Tree Shrew) have data for only a single chemical record each, and the chemical record in each case is also listed in the CPDB Rat Mouse Summary Table, data for these species are provided in the new TD50_Dog_Primates_Note field of the corresponding chemical record. In addition, the new STRUCTURE_InChIKey field (25 character abbreviated InChI for use in structure-indexing applications) has been added as a DSSTox Standard Chemical Field to all DSSTox files.
Version 5c Revision: CPDBAS_v5c includes 2 structure modifications, and one correction to a record that incorrectly listed an inactive bioassay result for Mouse (but MultiCellCall was correct). See CPDBAS_LogFile for details.
For more information and version history, consult the CPDBAS_LogFile in the Download Table below and version update entries in the Note_CPDBAS field.
Version 5d Revision: CPDBAS_v5d includes modifications for 16 "inconclusive" entries in ActivityOutcome_CPDBAS_(Rat, Mouse, Hamster, SingleCellCall) fields based on further discussion with the Source Contact (L. Gold). One of the 16 (ActivityOutcome_CPDBAS_Hamster) was changed to "active". For the remaining 15 cases (ActivityOutcome_CPDBAS_Rat - 7 cases, ActivityOutcome_CPDBAS_Mouse - 5 cases, ActivityOutcome_CPDBAS_SingleCellCall - 3 cases), the "inconclusive" entry was changed to "unspecified". These changes have been incorporated in the corresponding PubChem CPDBAS bioassay entries. The corresponding ActivityScore of "0" remains unchanged. For further details, consult the CPDBAS_LogFile.
For more information and version history, and to locate specific updated chemical records, consult the CPDBAS_LogFile in the Download Table below and version update entries in the Note_CPDBAS field.
- DSSTox Standard Chemical Fields (19) * STRUCTURE_InChIKey *field added in v5b
- Substance_Modify_yyyymmdd *field added in v5d
- DSSTox Standard Toxicity Fields (3)
- ActivityOutcome_CPDBAS_Mutagenicity *modified in v5b (formerly Mutagenicity_SAL_CPDB)
- ActivityScore_CPDBAS_Rat *new to v5b
- ActivityOutcome_CPDBAS_Rat *new to v5b
- ActivityScore_CPDBAS_Mouse *new to v5b
- ActivityOutcome_CPDBAS_Mouse *new to v5b
- ActivityScore_CPDBAS_Hamster *new to v5b
- ActivityOutcome_CPDBAS_Hamster *new to v5b
- TD50_Dog_Primates_Note *modified in v5b
- ActivityOutcome_CPDBAS_Dog_Primates *new to v5b
- ActivityOutcome_CPDBAS_SingleCellCall *modified in v5b
- ActivityOutcome_CPDBAS_MultiCellCall *modified in v5b
- ActivityOutcome_CPDBAS_MultiCellCall_Details *modified in v5b
- Note_CPDBAS ......contains controlled text entries for version content updates
- ChemicalPage_URL .....(formerly Website_URL in v4a), contains a link to the record-specific CPDB Chemical Index data page, e.g. see ACETALDEHYDE.
* Note: For detailed information on SDF content, see CPDBAS Field Definition File in Download Table below.
CPDBAS_v5d contains 1547 total chemical substance records, but not all substances have been tested in all species categories. The table below provides a summary of the total numbers of substances tested for each of the species categories and the corresponding counts within the ActivityOutcome and ActivityScore fields derived from the CPDB Summary Tables. For definitions of these fields and field entries, refer to the field listing in the previous section or consult the CPDBAS_FieldDefFile available in the Download Table below.
|Active||Inactive||Unspecified||Total # Entries||Total # Entries*|
** ActivityScore_CPDBAS_Species fields are included only for CPDB species categories with at least one positive experiment in the CPDB, for which a TD50 value is therefore reported in the CPDB Summary Table. However, no ActivityScore field is included for ActivityOutcome_CPDBAS_Dog_Primates (which groups results for species Dog, Rhesus, and Cynomolgus) due to the combined-species nature of this field and the small number of chemicals with available data for each of the included species.
The following files are offered in the Download Table below:
- Log File (PDF) provides SDF data file version history and summary information (field, chemical counts, etc.), and a description of procedures and quality assurance checks used in SDF file creation;
- Field Definition File (PDF or MS Word doc file) provides field definitions and essential documentation, and should be downloaded with and accompany any use of the DSSTox SDF file;
- Structure Data File (SDF) is the main DSSTox product, providing the complete inventory of chemical structures, DSSTox Standard Chemical Fields, and all Source-specific data fields [Note: the structure field is blank for all records containing mixtures or undefined substances];
- Data Table MS Excel (MS Office 2003) file contains the full SDF data contents in spreadsheet table form, minus the chemical structure field [file created with CambridgeSoft ChemFinder plug-in to MS Excel 2003];
- Structures Table (PDF) file contains a tiled format graphical view of all chemical structures contained in the SDF file, annotated with TestSubstance_CASRN and truncated TestSubstance_ChemicalName field entries for the tested form of the chemical [file created with ACD ChemFolder, ver. 10.01, ACD Labs].
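For readers who want to inspect the data fields of the SDF without a cheminformatics toolkit, the following Python sketch reads the data items of each record using only the standard library. It relies on the general SDF layout (data headers of the form `>  <FieldName>`, values on the following lines, a blank line after each value, and `$$$$` closing each record) and ignores the structure block. The function name and the sample record, including its CASRN and outcome values, are made up for illustration.

```python
import re

# A data header line starts with ">" and carries the field name in <...>.
FIELD_HEADER = re.compile(r"^>.*?<([^>]+)>")

def iter_sdf_fields(lines):
    """Yield one dict of data fields per SDF record; structure blocks are skipped."""
    record, field = {}, None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":          # record delimiter
            yield record
            record, field = {}, None
            continue
        match = FIELD_HEADER.match(line)
        if match:
            field = match.group(1)
            record[field] = ""
        elif field is not None:
            if line == "":                  # blank line ends the current value
                field = None
            else:
                record[field] = f"{record[field]}\n{line}" if record[field] else line

# Made-up single record for illustration only.
sample = [
    "", "  hypothetical", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    ">  <TestSubstance_CASRN>", "000-00-0", "",
    ">  <ActivityOutcome_CPDBAS_Rat>", "active", "",
    "$$$$",
]
for rec in iter_sdf_fields(sample):
    print(rec["TestSubstance_CASRN"], rec["ActivityOutcome_CPDBAS_Rat"])
```

To read the downloaded file instead of the sample, an open file object can be passed directly, for example `with open("CPDBAS_v5d_1547_20Nov2008.sdf") as fh: records = list(iter_sdf_fields(fh))`. The number of non-blank entries in a field can then be counted with `sum(1 for r in records if r.get(name))`.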
|File Types||Description||File Size||Format|
|Log File||CPDBAS_LogFile_20Nov2008.pdf (PDF, 14 pp.)||
|Field Definition File||CPDBAS_FieldDefFile_20Nov2008.pdf (PDF, 14 pp.)||
|Data Files: CPDBAS|
|SDF Structure Data File||CPDBAS_v5d_1547_20Nov2008.sdf||
| Data Table
| Structures Table||CPDBAS_v5d_1547_20Nov2008_structures.pdf (PDF, 31 pp., 642KB)|
These files constitute the main DSSTox products. DSSTox Documentation Files use standard templates, and DSSTox Structure Data Files and DSSTox File Names adhere to strict formatting standards and conventions. For additional information, see More on DSSTox Standard Chemical Fields, Known Problems & Fixes, Chemical Information Quality Review Procedures, and How to Use DSSTox Files.
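Because publications are asked to cite the full file name including its date stamp, it can be convenient to split a data file name into its parts. The pattern in the following Python sketch is inferred only from the data file names shown on this page (for example CPDBAS_v5d_1547_20Nov2008.sdf) and is not taken from the formal DSSTox naming standard; the function name and the returned keys are hypothetical.

```python
import re
from datetime import datetime

# Inferred layout: <DATABASE>_v<version>_<record count>_<ddMonyyyy>...
NAME_PATTERN = re.compile(
    r"^(?P<database>[A-Z0-9]+)_v(?P<version>\d+[a-z])"
    r"_(?P<records>\d+)_(?P<date>\d{1,2}[A-Za-z]{3}\d{4})"
)

def parse_dsstox_name(filename):
    """Split a DSSTox-style data file name into database, version, count, and date."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    return {
        "database": match["database"],
        "version": match["version"],
        "records": int(match["records"]),
        # %b expects English month abbreviations in the default C locale.
        "date": datetime.strptime(match["date"], "%d%b%Y").date(),
    }

info = parse_dsstox_name("CPDBAS_v5d_1547_20Nov2008.sdf")
print(info["version"], info["records"], info["date"].isoformat())
# 5d 1547 2008-11-20
```

Documentation files such as CPDBAS_LogFile_20Nov2008.pdf carry no version or record count and are rejected by this pattern with a `ValueError`.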
The original DSSTox SDF file of the Carcinogenic Potency Database Summary Tables was created by ClarLynda Williams (EPA/NC Central Univ Student COOP; EPA) with the assistance of Jamie Burch (EPA/NC Central Univ Student COOP), Todd Stewart (EPA/UNC Student COOP), Adam Swank (EPA), James Beidler (EPA Summer Student Hire), and Ann Richard (EPA). In addition, a special debt of gratitude is owed to Lois Swirsky Gold and Thomas H. Slone (Univ Calif Berkeley, Carcinogenic Potency Project) for their collaboration and involvement throughout the initial DSSTox project phase and subsequently. In particular, we thank them for invaluable assistance in the quality review of the CPDBAS SDF files, for careful review of all documentation pertaining to the CPDB on this website, for assistance in locating structural information, and for numerous clarifications and suggestions for improvements to the launch-version databases and documentation. Thanks to Larry Claxton and Chandrika Moudgal for careful reviews of the v1a documentation files. All subsequent QA review and structure modifications to CPDBAS_v2 and later versions were carried out by Maritja Wolf (Lockheed Martin, Contractor for EPA). We thank Emilio Benfenati (Mario Negri, Institute of Pharmacological Research) for reporting 3 structure errors, single-instance chemicals from our earliest DSSTox data entry days. Thanks also to David Hollandsworth (CSC, Contractor to the US EPA) for performing a link-check on the full list of CPDBAS URLs.
Gold, L.S., T.H. Slone, C.R. Williams, J.M. Burch, T.W. Stewart, A.E. Swank, J. Beidler, and A.M. Richard (2008) DSSTox Carcinogenic Potency Database Summary Tables - All Species: SDF File and Documentation, Updated version: CPDBAS_v5d_1547_20Nov2008, www.epa.gov/ncct/dsstox/sdf_cpdbas.html
Every effort is made to ensure that DSSTox SDF files and associated documentation are error-free, but neither the DSSTox Source collaborators nor the EPA DSSTox project team make guarantees of accuracy, nor are any of these persons to be held liable for any subsequent use of these public data. The contents of this webpage and supporting documents have been subjected to review by the National Health and Environmental Effects Research Laboratory and approved for publication. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use. See additional disclaimers.