Guidance on Searching for Chemical Information and Data
Description of Selected Information Sources
As noted elsewhere in the search guide, the following information sources may be available through a variety of providers. Publications and some databases may be available in libraries. (Unless stated otherwise, EPA cannot provide access to these information sources or copies of publications.)
IMPORTANT: The fact that a resource is included in this guide does not mean that EPA is endorsing that source. Nor does it mean that EPA will automatically accept data included in or referenced by that source. Studies and data will need to meet the requirements as spelled out in the guidance document on data adequacy in order to be accepted under the HPV Challenge Program.
Further, this listing is not intended to be exhaustive. Rather, the listing provides a guide for sponsors to resources that might reference studies containing data on chemicals. Sponsors are encouraged to search sources beyond this list, using their familiarity with their chemicals in identifying other potential sources. Such sources might include proprietary sources such as trade associations or government agencies to which companies have submitted data, as well as public sources focusing on specific disciplines or industries. The latter sources, though containing publicly accessible data, may not be widely known outside a particular industry. Sponsors are welcome to search, at their discretion, any such sources of which they are aware.
In addition, sponsors should keep in mind that because many of these secondary sources index and catalog the same studies, the same study may show up in more than one database. Adding up the hits from several databases may therefore overstate the number of studies that have actually been done.
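The point about overlapping hits can be illustrated with a short sketch. It assumes search results have been exported as simple records with a title and a publication year. The field names, the matching rule, and the sample records are all hypothetical and are not part of any database described in this guide.

```python
import re


def study_key(title: str, year: int) -> tuple[str, int]:
    """Return a crude matching key: lowercased title without punctuation, plus year."""
    letters_only = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return (" ".join(letters_only.split()), year)


def count_unique_studies(hits_by_database: dict[str, list[dict]]) -> int:
    """Count distinct studies across all databases, ignoring repeat listings."""
    unique_keys = {
        study_key(hit["title"], hit["year"])
        for hits in hits_by_database.values()
        for hit in hits
    }
    return len(unique_keys)


# Hypothetical search results; titles and years are invented for illustration.
example_hits = {
    "TOXLINE": [
        {"title": "Acute Toxicity of Compound A to Daphnia magna.", "year": 1992},
        {"title": "Chronic effects of Compound A", "year": 1995},
    ],
    "AQUIRE": [
        {"title": "Acute toxicity of compound A to Daphnia magna", "year": 1992},
    ],
}

raw_total = sum(len(hits) for hits in example_hits.values())
unique_total = count_unique_studies(example_hits)
print(raw_total, unique_total)
```

A title-and-year key is only a rough heuristic. Real citations may differ in wording or year between databases, so a manual check of near-matches would still be needed.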
Finally, sources that focus on endpoints other than the SIDS endpoints might contain data relevant to the HPV Challenge Program. A separate listing of such sources is provided for informational purposes only.
The AQUIRE database is produced by
a group associated with the EPA Environmental Research Laboratory in Duluth,
Minnesota. AQUIRE data are extracted from literature published worldwide
and from independently compiled data files; documents to be abstracted
are identified through online literature searches, from review article
and criteria document bibliographies, and from existing aquatic toxicity
reprint collections. AQUIRE includes data on acute and chronic toxicity,
bioaccumulation, and sublethal effects from tests performed on freshwater
and saltwater species. Data on aquatic mammals, birds, and bacteria are
not included in AQUIRE. The data are formatted into records at the level
of the individual tests or observations. Each record contains chemical
substance information, test organism details, study protocol, experimental
details, and results for one test or one observation within a given reference
document. Thus, there can be multiple AQUIRE records for a given chemical
from a single paper as well as from several different papers.
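The record-per-test layout described above can be sketched as a small data structure. This is an illustrative model only: the class name, field names, and sample values are assumptions made for the example and do not reflect the actual AQUIRE file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToxicityRecord:
    """One test or observation from one reference document (hypothetical layout)."""
    chemical: str
    test_organism: str
    study_protocol: str
    experimental_details: str
    result: str
    reference: str


def records_per_reference(records: list[ToxicityRecord]) -> dict[tuple[str, str], int]:
    """Count how many records each (chemical, reference) pair contributes."""
    counts: dict[tuple[str, str], int] = defaultdict(int)
    for record in records:
        counts[(record.chemical, record.reference)] += 1
    return dict(counts)


# Invented sample data: one paper reporting two tests, and a second paper with one.
sample_records = [
    ToxicityRecord("Chemical A", "Species 1", "static", "48 h", "acute result", "Paper 1"),
    ToxicityRecord("Chemical A", "Species 2", "flow-through", "96 h", "acute result", "Paper 1"),
    ToxicityRecord("Chemical A", "Species 1", "static", "21 d", "chronic result", "Paper 2"),
]
summary = records_per_reference(sample_records)
```

Grouping by chemical and reference shows why a record count is not the same as a paper count: the first sample paper contributes two records for the same chemical.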
The most comprehensive one-volume
guide of its kind, this standard reference work has been newly revised
and expanded to present information on teratogenic agents in a ready-reference
format. The ninth edition includes approximately 1,200 additions, of which
250 are newly listed agents. The Catalog emphasizes human data and covers
pharmaceuticals, chemicals, environmental pollutants, food additives,
household products, and viruses. A special effort has been made to obtain
as much information as possible on drugs and other agents to which pregnant
women may be exposed. Substances are listed alphabetically, and each entry
briefly summarizes research procedures and results. As in previous editions,
a complete list of references is included for each agent.
The Catalog is also accessible as a database through CIS (Chemical Information System).
CCRIS (Chemical Carcinogenesis Research Information System) contains scientifically
evaluated data derived from carcinogenicity, mutagenicity, tumor promotion,
and tumor inhibition studies. It contains over 7,300 chemical records
and is sponsored by the National Cancer Institute. (The database is available
through CIS (Chemical Information System) and the National
Library of Medicine's TOXNET system.)
CIS includes about 30 databases
concerned with chemicals having an environmental impact or that are regulated
in some way. Originally developed by the National Institutes of
Health and EPA for managing chemical data and information, CIS is now
owned by Oxford Molecular. See the CIS Web page at http://dreamon.oxmol.co.uk/prods/cis for
a complete list of the CIS databases. The Structure
and Nomenclature Search System (SANSS) database serves as an index
to the CIS databases.
CIS can provide searches for a fee. Call 800-CIS-USER for more information on services.
ChemID is a Chemical Identification
file built and maintained by the National Library of Medicine (NLM). It
serves as an authority file for
the identification of chemical substances cited in NLM databases.
ChemID is searchable by name of chemical, name fragments, CAS Registry
Number, or molecular formula. ChemID is a "pointer system". In addition
to CAS Registry Numbers, molecular formulas, systematic names, and synonyms,
ChemID identifies other databases which contain data on the chemical in
question, as well as lists maintained by federal and state regulatory
agencies.
ChemID is accessible through NLM's Internet Grateful Med (IGM) service (http://igm.nlm.nih.gov). (Select "ChemID" from the list of databases on the IGM home page (http://igm.nlm.nih.gov).) For more information on ChemID see the NLM ChemID Fact Sheet (http://www.nlm.nih.gov/pubs/factsheets/chemidfs.html).
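Because CAS Registry Numbers are a common search key, it can help to check that a number is well formed before searching. The sketch below relies on the standard CAS check-digit rule: the final digit equals the weighted sum of the preceding digits, counted from the right, modulo 10. The function name is hypothetical, and the check is independent of ChemID or any other search system.

```python
import re


def is_valid_cas_number(cas: str) -> bool:
    """Check the format and check digit of a CAS Registry Number such as 7732-18-5."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    total = sum(
        position * int(digit)
        for position, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit
```

For example, `is_valid_cas_number("7732-18-5")` (water) passes the check. A number with a mistyped final digit, such as `7732-18-4`, does not.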
Contains citations for published
articles containing data on the environmental fate and the physical-chemical
properties of chemicals released into the environment. Available
through CIS (Chemical Information System).
DART is a Toxicology Literature
File on the National Library of Medicine's (NLM) TOXNET®
system. It covers teratology and
other aspects of developmental and reproductive toxicology. It contains
some 40,000 literature citations published since 1989.
DART is a continuation of the earlier ETICBACK (Environmental Teratology
Information Center Backfile) database, also
available on TOXNET. DART and ETICBACK are funded by the U.S. Environmental
Protection Agency, the National Institute
of Environmental Health Sciences, the National Center for Toxicological
Research of the Food and Drug Administration,
and the NLM. DART is also searchable as a subfile in the TOXLINE database.
ENVIROFATE, developed through
the collaborative efforts of the EPA's Office of Pollution Prevention
and Toxics (OPPT) and the Syracuse Research Corporation (SRC), contains
data on approximately 800 chemicals. ENVIROFATE contains summary
information concerning the environmental fate and the physical-chemical
properties of chemicals released into the environment. Chemicals selected
for inclusion in the database are produced annually in excess of one million
pounds. ENVIROFATE contains twenty-four types of data extracted from papers
published worldwide dealing with environmental fate and behavior studies.
It is available through CIS (Chemical Information System).
EMIC is a bibliographic database on
chemical, biological, and physical agents that have been tested for genotoxic
activity. EMIC covers publications from 1991 to present; earlier
years are covered in EMICBACK. These files are maintained with federal funding.
(The database can also be searched online through the TOXLINE
database and the TOXNET system.)
GENE-TOX is a collection of genetic assay
studies developed through the collaborative efforts of the National Institute
of Environmental Health Sciences (NIEHS), the Oak Ridge National Laboratory
(ORNL), and the Environmental Mutagen Information Center (EMIC).
The database is directed toward the goal of establishing standard genetic
testing and evaluation procedures for the purposes of regulating toxic
substances and determining the direction of research and development in
this area. Assays are compared on a chemical-by-chemical basis. Each GENE-TOX
record corresponds to an individual chemical and may incorporate several
studies in a single record. Assay results are included on mutagenicity
tables. These tables provide specific information on the type of assay,
the biological host, the assay endpoint, and the final qualitative results
of the assay. When available, reference information for the assays is included.
GENE-TOX data are derived from the results of assays selected from published primary papers and from reports from contractors submitted to EMIC. Chemical data included in the database come from the following sources: EMIC (Environmental Mutagen Information Center), ETIC (Environmental Teratogen Information Center), SOLM, WATER POLLUTION, OSHA (Occupational Safety and Health Administration), CHEMLINE, TSCA (Toxic Substances Control Act) Lists, and RTECS (Registry of Toxic Effects of Chemical Substances).
HSDB is an online, interactive
file composed of approximately 4000 comprehensive, peer-reviewed chemical
records. It is produced by the National Library of Medicine (NLM)
in cooperation with Oak Ridge National Laboratory. It contains toxicological,
pharmacological, environmental, occupational, manufacturing, and use information
as well as chemical and physical property data. Compounds selected
for HSDB include highly regulated chemicals, high volume production and
exposure chemicals, and drugs and pesticides exhibiting toxicity potential.
The database can be searched through the "Toxicology Data Search" option on the TOXNET system.
Integrated Risk Information System (IRIS) - http://www.epa.gov/iris
Prepared and maintained by
EPA, IRIS is an electronic database containing health risk and EPA regulatory
information on specific chemicals. IRIS was developed by EPA staff in
response to a growing demand for consistent risk information on chemical
substances for use in decision-making and regulatory activities. IRIS
is designed for EPA staff, but is also accessible to state and local environmental
health agencies. The information in IRIS is intended for users without
extensive training in toxicology, but with some knowledge of health sciences.
(IRIS is accessible through the EPA Web site at http://www.epa.gov/iris.
The database can also be searched online through the TOXNET system.)
List of IRIS Substances - http://www.epa.gov/docs/ngispgm3/iris/subst/index.html
The Merck Index is an internationally
recognized encyclopedia of chemicals, drugs, pesticides, and biologically
active substances. It is available in both print and electronic
versions. The online database, which is available through CIS (Chemical Information System) and DIALOG, contains nearly 10,000 records
with references to approximately 30,000 substances. Coverage runs from
the late 19th century to the present. The database is updated semi-annually
and is produced by Merck & Co., Inc. Each record in the database discusses a single
chemical entity or a small group of very closely related compounds.
Records include molecular formulas and weights, systematic chemical names,
generic and trivial names, CAS Registry numbers, physical and toxicity
data, uses, etc.
The National Institute for Occupational Safety and Health (NIOSH) was established by the Occupational Safety and Health Act of 1970. NIOSH is part of the Centers for Disease Control and Prevention (CDC) and is the only federal Institute responsible for conducting research and making recommendations for the prevention of work-related illnesses and injuries. The Institute's responsibilities include:
- Investigating potentially hazardous working conditions as requested by employers or employees
- Evaluating hazards in the workplace, ranging from chemicals to machinery
- Creating and disseminating methods for preventing disease, injury, and disability
- Conducting research and providing scientifically valid recommendations for protecting workers
- Providing education and training to individuals preparing for or actively working in the field of occupational safety and health
The National Library of Medicine
is, as its name indicates, one of the national libraries of the United
States. Located on the campus of the National Institutes of Health,
it provides a number of services and resources for use by the American
public. Among the resources of most interest for the purposes of
this search guide are ChemID, TOXLINE,
For fact sheets on NLM's toxicological databases, as well as manuals for selected databases, see the following:
The National Toxicology Program
(NTP) was established in 1978 by the Secretary of Health and Human Services
to coordinate toxicology research and testing activities within the Department, to provide
information about potentially toxic chemicals
to regulatory and research agencies and the public, and to strengthen
the science base in toxicology. In its seventeen years, the NTP has become
the world's leader in designing, conducting, and interpreting animal assays
The NTP consists of relevant toxicology activities of the National Institutes of Health's National Institute of Environmental Health Sciences (NIH/NIEHS), the Centers for Disease Control and Prevention's National Institute for Occupational Safety and Health (CDC/NIOSH), and the Food and Drug Administration's National Center for Toxicological Research (FDA/NCTR). The NIH's National Cancer Institute (NIH/NCI) was a charter agency; however, the NCI Carcinogenesis Bioassay Program was transferred to the NIEHS in 1981. The NCI remains active in the Program through membership on the NTP Executive Committee. EPA and other Federal health research and regulatory agencies also participate through the Executive Committee.
Information and Study Results (http://ntp-server.niehs.nih.gov/main_pages/NTP_ALL_STDY_PG.html)
The National Toxicology Program (NTP) conducts toxicity/carcinogenesis studies on agents suspected of posing hazards to human health. Chemical-related study information is submitted to NIEHS and is archived and maintained in a central location (Central Files) so that all study information can be monitored and tracked efficiently. Currently, more than 800 chemical studies are on file. NTP information is routinely provided to industry and the public on request.
Long Term Carcinogenesis
Short-term Toxicity Studies
Continuous Breeding Studies
Short-term Reproductive & Developmental Toxicity Studies
National Toxicology Program Technical Reports
NTP Chemical Health & Safety Data
Health and Safety information
has been collected on over 2000 chemicals studied by the NTP. There
are several ways to retrieve
data from these NTP files:
National Toxicology Program Web site
Environmental Health Information Service (EHIS)
NIOSHTIC is the National Institute for Occupational Safety and Health's (NIOSH) electronic, bibliographic database of literature in the field of occupational safety and health. NIOSHTIC is updated quarterly and is available on-line and on compact disk from several vendors. Information contained within NIOSHTIC is selected from the following sources:
- Technical articles selected from approximately 160 core journals, and additional information from proceedings of scientific meetings and symposia.
- All NIOSH documents including NIOSH Numbered Publications, Criteria Documents, Current Intelligence Bulletins, Hazard Evaluations and Technical Assistance Reports, Industrial Hygiene Surveys, Field Studies, and final Contract and Grant reports.
- Selected pre-1974 documents from CIS, the International Labour Organization's occupational safety and health activity headquartered in Geneva, Switzerland.
- Translations of non-English occupational safety and health articles acquired by NIOSH.
- References cited in NIOSH Criteria Documents and Current Intelligence Bulletins.
NIOSHTIC is accessible as a subfile in the TOXLINE database, as well as a separate database through a number of vendors.
PHYTOTOX is a database dealing with the
effects of organic chemicals on terrestrial vascular plants. The PHYTOTOX
database has been compiled in the Department of Botany and Microbiology
at the University of Oklahoma under sponsorship of the US Environmental
Protection Agency. All information in the database has been extracted
from the open literature. Each record in the database contains information
about the effects of the application of one concentration of a single
chemical on a particular plant species as reported in one paper. Papers
selected for inclusion in the PHYTOTOX database must satisfy three principal
criteria: (1) a terrestrial plant was studied; (2) organic chemicals were
applied; and (3) direct effects were evaluated. Data are then extracted
from these papers for inclusion in the PHYTOTOX database. Each record
in the database defines the chemical and plant species involved in the
test, provides dosage and application information, lists all noted effects
of the test on the plant, and provides a bibliographic citation for the
source of the data.
RTECS is maintained by NIOSH, contains over 100,000
records covering 1971 to the present, and is updated quarterly.
It is a comprehensive database of basic toxicity information for over
100,000 chemical substances. In addition to toxic effects and general
toxicology reviews, data on skin and/or eye irritation, mutation, reproductive
consequences, and tumorigenicity are provided. Toxic effects are linked
to literature citations from both published and unpublished government
reports (including unpublished test data from TSCATS, the EPA TSCA test
submissions database), and published articles from the scientific literature.
Access to RTECS:
The RTECS database is available from a number of vendors. While the RTECS database is no longer available free of charge on TOXNET on the Web, it can still be accessed through the TOXNET system via TELNET or dial-up modem to a toll-free number. These fee-based options allow precise command-line searching and require a user ID and a password. Fees range from $18 to $25 an hour. Please call 1-888-346-3656 to obtain an access code. You can access RTECS by using the following Internet access options:
1. TELNET directly to TOXNET
2. TELNET first to NLM using: MEDLARS.NLM.NIH.GOV , and then select menu item "T"
SANSS contains records for
more than 500,000 chemicals. SANSS serves as an index to most of
the other CIS (Chemical Information System) components/databases
as well as to over 100 other important sources of information on environmentally
significant chemicals (including EPA reports, state documents, international
lists, etc.). Included for each chemical are names, synonyms, molecular
formulas, structural images, and references to the appropriate source
collection. Information is retrievable in SANSS by searching on
any of the above parameters or by molecular weight.
SANSS is used to identify specific chemicals and information sources. It is a pointer to CIS sources such as RTECS, the Merck Index, and AQUIRE, as well as non-CIS sources such as IARC Monographs, Hazardous Substances Data Bank, and National Toxicology Program studies. The lists generated can then be imported into other CIS components to retrieve technical information.
While SANSS costs considerably less
to search than STN and other CAS systems, it does not contain information
on as many chemicals as STN.
TSCA test submissions are submitted by industry
to EPA under several provisions of the Toxic Substances Control Act (TSCA).
The TSCATS database was developed by Syracuse Research Corporation for
OTS to index these submissions, which include unpublished health and safety
studies, chemical test data, and substantial risk data submitted to EPA
under TSCA sections 4, 8(d), 8(e), and FYI. It conveys the Agency's
receipt of unpublished, non-confidential studies covering test results
and adverse effects of chemicals on health and ecological systems submitted
under TSCA. TSCATS catalogs the purpose of testing (observations sought);
the test organisms used; the routes of administration; and, where available,
a description of the nature of the chemical tested (e.g., pure, component
of a mixture). The title of the submission is given, as well as file identification
data. The database can also be searched online as part of the TOXLINE
file, available from a number of vendors, or on the Web via NLM's Internet
Grateful Med gateway. The 8(e) TRIAGE database is a subset
of the TSCATS database.
The actual studies can be purchased from the National Technical Information Service (NTIS) (http://www.ntis.gov) ($) and CIS (Chemical Information System). They can also be viewed on microfiche in the TSCA Non-Confidential Information Center (also known as the TSCA Docket).
TOXLINE® is the National
Library of Medicine's extensive collection of online bibliographic information
covering the biochemical, pharmacological, physiological, and toxicological effects of drugs and other chemicals. TOXLINE and its backfile TOXLINE65 together contain more than 2.5 million bibliographic citations, almost all with abstracts and/or indexing terms and CAS Registry Numbers. The information in TOXLINE is taken from secondary sources, which make up the subfiles listed below. Citations with publication year 1980 and older are located in the backfiles. [From the TOXLINE Fact Sheet (http://www.nlm.nih.gov/pubs/factsheets/toxlinfs.html)]
For a sample record and user manual, see the TOXLINE entry at http://sis.nlm.nih.gov/tox_chart.htm
The following subfiles are
included in TOXLINE:
Developmental and Reproductive Toxicology (DART)
Environmental Mutagen Information Center File (EMIC)
Environmental Teratology Information Center File (ETICBACK)
Epidemiology Information System (EPIDEM)
Federal Research in Progress (FEDRIP)
Hazardous Materials Technical Center (HMTC)
International Labour Office (CIS)
International Pharmaceutical Abstracts (IPA)
Pesticides Abstracts (PESTAB)
Poisonous Plants Bibliography (PPBIB)
Swedish National Chemicals Inspectorate (RISKLINE)
Toxic Substances Control Act Test Submissions (TSCATS)
Toxicity Bibliography (TOXBIB)
Toxicological Aspects of Environmental Health (BIOSIS)
Toxicology Document and Data Depository (NTIS)
Toxicology Research Projects (CRISP)
The TOXNET Web interface (http://toxnet.nlm.nih.gov) also allows users to search for toxicology data in the
following toxicology data files:
Hazardous Substances Data Bank, Chemical Carcinogenesis
Research Information System, Integrated Risk Information
System, and GENE-TOX, as well as EPA's Toxics
Release Inventory (TRI).
TOXNET (TOXicology Data NETwork) is a computerized system of files oriented to toxicology and related areas. It is managed by the National Library of Medicine's (NLM) Toxicology and Environmental Health Information Program (TEHIP) (http://sis.nlm.nih.gov/) and runs on Sun servers in a UNIX-based environment. TOXNET provides a free Web-based interface that permits easy searching of the following files:
Toxicology Data Files:
HSDB (Hazardous Substances Data Bank)
HSDB is a factual data bank focusing on the toxicology of over 4500 potentially hazardous chemicals. In addition to toxicity data, the file carries information in the areas of emergency handling procedures, environmental fate, human exposure, detection methods, and regulatory requirements. The data are fully referenced and peer-reviewed by a Scientific Review Panel composed of expert toxicologists and other scientists.
IRIS (Integrated Risk Information System)
IRIS is an online database built by the Environmental Protection Agency (EPA). It contains EPA carcinogenic and non-carcinogenic health risk information on over 500 chemicals. The risk assessment data have been scientifically reviewed by groups of EPA scientists and represent EPA consensus.
CCRIS (Chemical Carcinogenesis Research Information System)
Sponsored by the National Cancer Institute (NCI), CCRIS contains scientifically evaluated data derived from carcinogenicity, mutagenicity, tumor promotion and tumor inhibition tests on some 8000 chemicals.
GENE-TOX, created by EPA, contains genetic toxicology test results on over 3,000 chemicals. Selected mutagenicity assay systems and the source literature are reviewed by work panels of scientific experts for each of the test systems under evaluation. The GENE-TOX data bank is the product of these data review activities. Each test system in GENE-TOX has been peer reviewed and is referenced.
Toxic Releases Files (TRI):
TRI (Toxic Chemical Release Inventory)
TRI contains information on the annual estimated releases of toxic chemicals to the environment. It is mandated by the Emergency Planning and Community Right-to-Know Act and is based upon data submitted to the Environmental Protection Agency (EPA) from industrial facilities throughout the U.S.A. These data include names and addresses of the facilities, and the amounts of certain toxic chemicals they release to the air, water, or land, or transfer to waste sites. Information is included on over 600 chemicals and chemical categories. Separate TRI files are available for each year beginning with 1987. Since 1991, pollution prevention data are also reported by each facility for each chemical.
Toxicology Literature Files:
DART (Developmental and Reproductive Toxicology) and ETICBACK (Environmental
Teratology Information Center Backfile)
DART is a bibliographic database covering literature on teratology and other aspects of developmental toxicology. It is managed by NLM and funded by EPA, the National Institute of Environmental Health Sciences (NIEHS), and the National Center for Toxicological Research of the Food and Drug Administration. DART is a continuation of ETICBACK, which contains 49,000 citations to teratology literature published from 1950-1989.
EMIC (Environmental Mutagen Information Center) and EMICBACK (Environmental Mutagen Information Center Backfile)
EMIC is a bibliographic database containing some 20,000 citations to literature on chemical, biological, and physical agents that have been tested for genotoxic activity. It is produced by the Oak Ridge National Laboratory (ORNL) and funded by EPA and NIEHS. EMIC covers literature published since 1991. EMICBACK contains over 75,000 citations to literature published from 1950-1990.
More information on TOXNET can be found in the National Library of Medicine TOXNET Fact Sheet (http://www.nlm.nih.gov/pubs/factsheets/toxnetfs.html).
Send comments on this guide to the Chemical Right to Know staff (firstname.lastname@example.org).