Cheminformatics Modules Manual

Welcome to the User Manual!

The Cheminformatics Modules (CIM) are a set of prototype modules that compile information from many sites, databases and other sources, including U.S. federal and state sources and international bodies, saving the user time by providing the information in one location. The data are not reviewed by USEPA; the user must apply judgment in use of the information. The results do not indicate EPA’s position on the use or regulation of these chemicals.

This guide offers step-by-step instructions to help you access Cheminformatics Modules (CIM) data and features. For additional information or assistance, please contact CCTE@epa.gov.

On this page: 
Navigation | Hazard Module | Safety Module | Search Module

Navigation

At the top of any page of the Cheminformatics Modules (CIM) is the navigation banner. From here, you can access various modules associated with the application as a whole.

Cheminformatics navigation bar

Clicking on each of the individual modules will show the relevant screen associated with that particular module as described below.

  • Hazard: Create a hazard comparison profile comparing chemicals across multiple toxicity endpoints.
  • Safety: Create a safety comparison profile comparing chemicals across multiple data types including physical properties and safety data including shipping and firefighting.
  • Alerts: Profile chemicals relative to multiple structure-based alerts.
  • Predict 1.0: Batch prediction of Toxicity Estimation Software Tool (TEST) physicochemical property and toxicity endpoints.
  • Predict 2.0: New modeling approaches to property endpoint predictions.
  • ToxPrints: Generate ToxPrints and profile chemicals against ToxCast enrichments.
  • Utilities: Various tools to harvest data from CommonChemistry and PubChem APIs.

Hazard Module

The Hazard Module navigation bar is displayed below, with the search options used to input a collection of chemicals for profiling highlighted. Various parts of the navigation bar are activated once chemicals have been entered for profiling, while the reporting buttons (on the right-hand side) are activated once a profile has been generated.

Hazard toolbar in Cheminformatics

Selecting Chemicals for Profiling

The magnifying glass searches for chemicals by identifier, the paint brush opens a drawing editor so that a drawn structure can be used to retrieve a chemical or set of chemicals, and the up arrow uploads a set of chemicals to search.

Magnifying Glass – Identifier Search

Selecting the magnifying glass allows multiple identifiers to be input for a search, either all of one type or in mixed format (e.g., names plus CAS Registry Numbers plus DTXSIDs). Supported identifiers also include SMILES and InChIs, as highlighted in the input box shown below.

Following input of a list of identifiers, and selecting the SEARCH button, a set of chemicals will be returned.
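
The mixed-format input described above implies that each line is recognized by identifier type before lookup. The following is a minimal Python sketch of such a classification, not the CIM's actual implementation. The function names `classify_identifier` and `is_valid_casrn` are hypothetical, and the patterns are assumptions based on the public formats of each identifier (the DTXSID pattern in particular is assumed to be the prefix followed by digits). The CASRN check uses the standard CAS check-digit rule.

```python
import re

# Patterns are assumptions based on the public formats of each identifier type.
CASRN_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")
DTXSID_RE = re.compile(r"^DTXSID\d+$")
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def is_valid_casrn(text: str) -> bool:
    """Check the CASRN format and its check digit."""
    match = CASRN_RE.match(text)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... starting from the digit next to the check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify_identifier(text: str) -> str:
    """Return a coarse label for one line of search input."""
    text = text.strip()
    if is_valid_casrn(text):
        return "CASRN"
    if DTXSID_RE.match(text):
        return "DTXSID"
    if text.startswith("InChI="):
        return "InChI"
    if INCHIKEY_RE.match(text):
        return "InChIKey"
    # Names and SMILES cannot be separated reliably by pattern alone,
    # so anything else is left for the database or a structure parser.
    return "name or SMILES"


if __name__ == "__main__":
    for line in ["152-16-9", "Octamethylpyrophosphoramide", "InChI=1S/CH4/h1H4"]:
        print(line, "->", classify_identifier(line))
```

Pre-checking a list this way can catch mistyped CASRNs before they are submitted, which is one of the reasons an input string may fail to return a result.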

Chemical identifier search box
Populated identifier search bar

All 10 identifiers shown in the input box retrieve results: the interface reports that nine were found by searching the underlying database, while one, the SMILES string, was converted into the associated structure. Not every input string will necessarily return a result, especially in the case of misspellings or synonyms absent from the database.

When chemicals are retrieved from the database, a chemical name and CAS Registry Number (CASRN), where available, are displayed. From this page the relevant data can be harvested as downloadable files (highlighted by the red box); from left to right the formats are SDF, CSV, XLSX and SMILES only. Hovering over any button displays the relevant help text.

Results of chemical identifier search

It is also possible to retrieve a list which is already available in the CompTox Chemicals Dashboard. Open the Retrieve Comptox List tab, select the list and click retrieve to load the list for further processing. As an example, in the figure below showing list filtering and selection, the almost 500 CompTox Lists are filtered using “epapfas” as the substring filter, the relevant list is selected (as shown by the highlight) and then retrieved in total.

Comptox List tab

The next operations of interest are the selection of chemicals to include in the Profiling. These are select or deselect all, add to cart for processing, and deletion of all chemicals in the cart.

Profiling toolbar in the Hazard Module

Paint Brush – Structure/Substructure/Similarity Search

There are two ways to access the structure editor for inputting structures to add to the cart for processing. The first approach is selection from the navigation bar as indicated below, but this only allows for drawing or retrieving one structure at a time to add to the cart for processing.

Highlighting the paintbrush icon in the profiling toolbar

The alternative approach is via the Search by Structure tab which allows for retrieval of a set of chemicals based on a substructure or similarity search. A basic overview of how to draw a structure in the Ketcher drawing editor is available by clicking on the question mark, but more details will be provided in the Search part of this manual.

Search function in the Hazard Module
Ketcher overview

Once a structure is input, a substructure or similarity search can be performed by selecting the relevant search on the left-hand side. For a similarity search, set an appropriate Tanimoto similarity match factor and a maximum number of hits (a match factor of 0.6 and 20 hits in the figure below).
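
To make the two search parameters concrete, here is a minimal Python sketch of a Tanimoto similarity search with a match factor and a hit limit. It is an illustration only: the fingerprints the CIM actually uses are not described in this manual, so the sketch assumes each chemical is represented by a set of on-bit positions, and the names `tanimoto` and `similarity_search` are hypothetical. Treating the match factor as an inclusive lower bound is also an assumption.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold=0.6, max_hits=20):
    """Return up to max_hits (name, score) pairs scoring at or above threshold."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    # Most similar chemicals first.
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:max_hits]


if __name__ == "__main__":
    # Toy fingerprints, invented for illustration.
    query = {1, 2, 3, 4, 5}
    library = {
        "analogue_a": {1, 2, 3, 4, 6},  # 4 shared bits out of 6 distinct
        "analogue_b": {1, 2, 9, 10},    # 2 shared bits out of 7 distinct
        "analogue_c": {1, 2, 3, 4, 5},  # identical to the query
    }
    print(similarity_search(query, library))
```

Raising the match factor narrows the hit list to closer analogues, while the hit limit caps the list after ranking.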

Search by structure function

The resulting hit list, as shown below, can be edited using select/deselect of each chemical by clicking on the structure tile, then added to the cart for processing (see the button highlighted by the red box).

Hit list for structure search

Once the chemicals are retrieved and selected for processing then the Hazard Profile can be generated simply by processing the chemicals from the cart.

ChemInfo cart processing

The Hazard Profile that is generated is a heat map where each cell that has data available is represented by color, a letter, and a font for the letter.

Chemical Hazard Profile

The color-coding and letters on the tiles represent a grade as follows: Red – Very High (VH), Orange – High (H), Yellow – Medium (M), Green – Low (L), Grey – Inconclusive (I) and White – no data available. The color legend is shown at the top of the table. The font associated with the letter on the tile represents the data source authority with three levels available: Authoritative, Screening and QSAR Model. In order to list all sources under the three classes hover over the informational icon adjacent to the words for the sources authoritative, screening and QSAR models. For a detailed breakdown regarding the selection of the sources, access "An automated framework for compiling and integrating chemical hazard data".

Cheminfo authoritative sources

Each column can be sorted, and the interface displays the data in the resulting order.

Hazard profile sorting

Clicking on a single column (e.g., oral) sorts the rows from highest to lowest toxicity for that endpoint. Clicking on the column again sorts from low to high. The entire set of sorts can be cleared using the reset button, or removed one at a time by clicking on the “x” adjacent to the sorted column.
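
The column sort can be pictured as ordering rows by the rank of their grade letter. The Python sketch below is an illustration under stated assumptions, not the application's code: the row layout and the name `sort_by_endpoint` are hypothetical, and ranking Inconclusive and no-data cells below Low is an assumption, since the manual does not say where they fall.

```python
# Rank order for the grades. Placing Inconclusive (I) and cells with
# no data (None) below Low is an assumption.
GRADE_RANK = {"VH": 4, "H": 3, "M": 2, "L": 1, "I": 0, None: -1}


def sort_by_endpoint(rows, endpoint, high_to_low=True):
    """Sort chemical rows by the grade found in one endpoint column."""
    return sorted(
        rows,
        key=lambda row: GRADE_RANK[row.get(endpoint)],
        reverse=high_to_low,
    )


if __name__ == "__main__":
    # Hypothetical rows: one dict per chemical, one key per endpoint.
    rows = [
        {"name": "chem_a", "oral": "M"},
        {"name": "chem_b", "oral": "VH"},
        {"name": "chem_c"},  # no oral data
        {"name": "chem_d", "oral": "L"},
    ]
    first_click = sort_by_endpoint(rows, "oral")          # high to low
    second_click = sort_by_endpoint(rows, "oral", False)  # low to high
    print([r["name"] for r in first_click])
    print([r["name"] for r in second_click])
```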

Each column represents a specific endpoint and the relative rankings of Very High to Inconclusive are made based on available data sources and a dedicated scoring scheme for each endpoint. In order to review the details for each endpoint hover over the ellipsis icon.

To download all scoring schemes as a multi-worksheet Excel file, click on the hyperlinked word Toxicity.

Example of inhalation information
Toxicity color code

The Excel file contains all relevant scoring details for each endpoint and is a useful reference document.

Hazard report in Excel

The Hazard report generated in a particular session can be exported in two formats: 1) as an SDF file, which can be opened in cheminformatics software supporting that format, or 2) as a multi-worksheet Excel file containing the heat-map display as well as the underlying data in a second worksheet.

Cheminformatics Excel report

The Hazard Profile screen has additional data available for the chemicals displayed in the report. For example, hovering over the information icon adjacent to the chemical name displays general information about the chemical substance.

General chemical information

In addition to the basic information about the chemicals, links to other sites are available by clicking on the array of letters adjacent to the CAS Registry Number.

CAS Links

Each letter underneath the structure image is an active hyperlink to the relevant data source:

  • A – ATSDR: Agency for Toxic Substances and Disease Registry tox profiles
  • B – Bioactivity: ToxCast/Tox21 in the CompTox Chemicals Dashboard (CCD)
  • G – GenRA: Run Generalized Read-Across on the chemical in the GenRA app
  • I – IRIS: EPA Integrated Risk Information System risk assessments
  • P – PPRTV: Provisional Peer Reviewed Toxicity Value reports
  • T – ToxVal: View data associated with the ToxVal data in CCD

The substance information informational icon and the external hyperlinks discussed above are also available in the Safety module.

Hazard Use Cases

The use cases below draw on the features outlined above.

Retrieval of Single Chemical Data

This use case covers retrieval of data associated with a known chemical of interest registered in the DSSTox database, followed by generation and display of the results and their download.

The workflow simply requires inputting the relevant identifier (CASRN or name), searching the DSSTox database, generating the Hazard Profile, and downloading it.

  • Input the identifier into the search box or draw the structure in the structure drawing box to ensure the chemical is in the database
  • Generate the Hazard Profile
  • Download the relevant file format to the desktop, with XLSX being the generally preferred format

Example chemical: Octamethylpyrophosphoramide, CASRN: 152-16-9

Example chemical search

Generate the relevant Hazard Profile

Example chemical hazard profile

Export the Report

Cheminformatics Excel report

Retrieval of Analogues Based on Similarity

This simple workflow retrieves data following input of a chemical structure of interest, performing a search based on chemical similarity, and profiling the related chemicals of interest.

  • Draw the chemical to be searched based on similarity or retrieve from DSSTox
  • Set the threshold for similarity searching and retrieve the hits
  • The hit list resulting from a similarity search is paginated and the number of hits on a page can be adjusted (bottom left-hand corner of figure). The hits can be sorted based on similarity, mass or toxicity (top right-hand corner of figure). Send a single page of hits to the Hazard Comparison Module (HCD) or other modules as appropriate.
Chemical tiles based on similarity
  • A: Sort records by similarity (default), mass or toxicity. Generally choose toxicity sorting before sending chemicals to the Hazard module
  • B: Select number of chemicals on a page and shuttle between pages
  • C: Choose module to transfer a page of chemicals to the cart for further processing
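
The sorting and paging controls above can be sketched as follows. This is a hedged Python illustration rather than the application's logic: the hit record fields (`similarity`, `mass`, `has_tox_data`) and the function names are hypothetical, and the direction of the mass sort is an assumption. The toxicity sort simply brings records that have tox data to the top.

```python
from math import ceil

# One sort key per option. Field names are hypothetical.
SORT_KEYS = {
    "similarity": lambda hit: -hit["similarity"],     # most similar first
    "mass": lambda hit: hit["mass"],                  # lightest first (assumed)
    "toxicity": lambda hit: not hit["has_tox_data"],  # records with tox data first
}


def page_count(hits, page_size):
    """Number of pages needed to show every hit."""
    return ceil(len(hits) / page_size)


def get_page(hits, sort_by="similarity", page_size=10, page_number=1):
    """Sort the hit list, then return one 1-based page of it."""
    ordered = sorted(hits, key=SORT_KEYS[sort_by])
    start = (page_number - 1) * page_size
    return ordered[start:start + page_size]


if __name__ == "__main__":
    # Toy hit list, invented for illustration.
    hits = [
        {"name": "hit_1", "similarity": 0.95, "mass": 210.0, "has_tox_data": False},
        {"name": "hit_2", "similarity": 0.80, "mass": 180.5, "has_tox_data": True},
        {"name": "hit_3", "similarity": 0.72, "mass": 250.1, "has_tox_data": True},
    ]
    print(page_count(hits, page_size=2))
    print([h["name"] for h in get_page(hits, "toxicity", page_size=2)])
```

Because only one page of hits is transferred at a time, sorting by toxicity first means the page sent to the Hazard module holds the chemicals most likely to have data.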

Safety Module

The safety module delivers access to data that would be of primary value to emergency responders as it provides access to data such as ignitability, reactivity, personal protection equipment, GHS data and an abundance of other safety data. The data have primarily been harvested from PubChem for chemicals contained within the DSSTox database, integrated with additional data from that database, and delivered via the user interface in the Safety Module.

Safety data can be generated using a workflow similar to that used to generate a Hazard comparison profile. First, one or more chemicals must be added to the cart for processing by inputting the identifiers needed to retrieve them from the database. Follow the steps outlined in the Hazard module to retrieve the chemicals of interest, then generate the safety profile.

Chemical safety information
  • A. A full listing of property data
  • B. A full listing of safety data
  • C. Click each informational icon to see underlying data

Each chemical has multiple icons associated with it. From left to right these are the overall safety data (safety glasses icon), all property data (conical flask icon), relevant GHS icons, a National Fire Protection Association icon, and a number of information icons associated with other safety data types. To navigate into the underlying data, hover over the informational icons and use the arrows to open or close different sections.

Additional safety information

A report from the Safety Module can be generated in four formats: SDF, Excel, Word or PDF. The Excel format includes basic chemical details, GHS pictograms and hyperlinks to the various types of available data.

Safety module report choices
Safety module excel report

The Safety Module offers the ability to export specific sections of the safety data by selecting the “Export Arbitrary Section…” button and choosing the sections to export. Once the sections are selected, scroll to the bottom and choose the format for export.

Safety module export choices

The default display for the safety profile is “Full” but two other filtered views can be chosen: GHS Classification and Safety and Hazards. Selecting either profile will switch to the relevant filtered set for display.

GHS classification view

Search Module

The Search Module is primarily a structure-based search that can serve as the starting point for most of the other modules. It offers considerable flexibility for filtering the hit list before transferring it into other modules. The list of potential searches and their related filters is shown below.

Search module options

Each of these searches is discussed below; many aspects of the functionality carry over between search types. We start with the basics of the drawing editor.

A chemical can be drawn in the Ketcher editor by choosing from the default rings, chemicals from the template library or selecting from the periodic table and bond types. Learn more about Ketcher here.

Ketcher template library

Users are encouraged to review the Ketcher webpage linked above for details regarding how to draw a structure.

Above the structure drawing editor is an input box to search for chemicals based on input names, CAS RN, SMILES, DTXSID, InChIKeys and other identifiers. If a structure is returned to the drawing palette then it can be edited and used for further studies including similarity and substructure searches. Chemicals can also be added to the editor by copy-pasting of SMILES and molfiles through the clipboard.

Search module drawing space

Drawing a structure in Ketcher performs a realtime search against the underlying database of 1.2 million substances. The resulting search displays the hyperlinked DTXSID linked to the CompTox Chemicals Dashboard as shown in the figure below.

Search module example chemical

Substructure Searching

When a substructure is input and the search performed, the hit list is returned and can be filtered. In the example shown below, a substructure search was performed on diphenylamine (shown in the drawing editor). The resulting hit list of ~8600 chemicals contains a long list of other elements (indicated by callout bubble A in the figure) that can be useful as a basis for excluding chemicals from the hit list (described below).

Search module results

The filters indicated by callout bubble B above are “three-state” filters – one click filters the selection out (indicated by an x), a second click filters the selection in (indicated by a checkmark) and a third click switches the filter off. This is exemplified by selecting only the isotopes as shown below.
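
The three-state behavior can be modeled as a small state machine that advances on each click. The Python sketch below is illustrative only: the names `FilterState`, `click` and `apply_filter`, and the representation of hits as dictionaries with a `tags` set, are assumptions made for the example.

```python
from enum import Enum


class FilterState(Enum):
    OFF = "off"
    EXCLUDE = "x"          # first click: filter the selection out
    INCLUDE = "checkmark"  # second click: filter the selection in


# Each click advances to the next state; the third click returns to OFF.
_NEXT = {
    FilterState.OFF: FilterState.EXCLUDE,
    FilterState.EXCLUDE: FilterState.INCLUDE,
    FilterState.INCLUDE: FilterState.OFF,
}


def click(state: FilterState) -> FilterState:
    """Return the state a filter moves to after one click."""
    return _NEXT[state]


def apply_filter(hits, tag, state):
    """Filter hits (dicts with a 'tags' set) by one tag according to the state."""
    if state is FilterState.EXCLUDE:
        return [h for h in hits if tag not in h["tags"]]
    if state is FilterState.INCLUDE:
        return [h for h in hits if tag in h["tags"]]
    return list(hits)


if __name__ == "__main__":
    # Toy hit list, invented for illustration.
    hits = [
        {"name": "hit_1", "tags": {"isotopes"}},
        {"name": "hit_2", "tags": set()},
    ]
    state = click(click(FilterState.OFF))  # two clicks: INCLUDE
    print([h["name"] for h in apply_filter(hits, "isotopes", state)])
```

Selecting only the isotopes, as in the example above, corresponds to two clicks on that filter.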

isotopes selection

The sorting functionality indicated by callout bubble C allows the hit list to be sorted by similarity, by toxicity or by mass. Sorting by toxicity brings those chemicals with tox data to the top of the list, while the ranking sort, highlighted in the red box, reverses the sort order.

Sorting results by toxicity

Last updated on March 13, 2025