Toxicity Estimation Software Tool (TEST)


The Toxicity Estimation Software Tool (TEST) was developed to allow users to easily estimate the toxicity of chemicals using Quantitative Structure-Activity Relationship (QSAR) methodologies. QSARs are mathematical models that predict measures of toxicity from physical characteristics of a chemical's structure (known as molecular descriptors). The simplest QSAR models calculate toxicity as a linear function of molecular descriptors:

Toxicity = ax1 + bx2 + c

where x1 and x2 are the independent descriptor variables and a, b, and c are fitted parameters. The molecular weight and the octanol-water partition coefficient are examples of molecular descriptors. Additional examples are provided in our Molecular Descriptors Guide Version 1.0.2.
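
As a minimal sketch of the linear form above, the following Python function evaluates Toxicity = ax1 + bx2 + c for one chemical. The function name, the coefficient values, and the descriptor values are hypothetical placeholders chosen only for illustration; they are not fitted parameters taken from TEST.

```python
def linear_qsar(x1: float, x2: float, a: float, b: float, c: float) -> float:
    """Evaluate a two-descriptor linear QSAR model: a*x1 + b*x2 + c."""
    return a * x1 + b * x2 + c


# Hypothetical inputs: x1 = molecular weight, x2 = octanol-water partition
# coefficient. The coefficients a, b, c are illustrative, not values from TEST.
predicted = linear_qsar(x1=100.0, x2=2.0, a=0.01, b=0.5, c=1.0)
print(predicted)
```

In practice the parameters a, b, and c would come from fitting the model to a training set of chemicals with measured toxicity.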

TEST allows a user to estimate toxicity without requiring any external programs. Users input a chemical to evaluate by drawing it in an included chemical sketcher window, entering a structure text file, or importing it from an included database of structures. Once entered, the toxicity is estimated using one of several advanced QSAR methodologies. The required molecular descriptors are calculated within TEST.

QSAR Methodologies

Several QSAR methodologies have been developed:

  • Hierarchical method – The toxicity for a given query compound is estimated using the weighted average of the predictions from several different models. The different models are obtained by using Ward’s method to divide the training set into a series of structurally similar clusters. A genetic algorithm-based technique is used to generate models for each cluster. The models are generated prior to runtime.
  • FDA method – The prediction for each test chemical is made using a new model that is fit to the chemicals that are most similar to the test compound. Each model is generated at runtime.
  • Single-model method – Predictions are made using a multilinear regression model that is fit to the training set (using molecular descriptors as independent variables) using a genetic algorithm-based approach. The regression model is generated prior to runtime.
  • Group contribution method – Predictions are made using a multilinear regression model that is fit to the training set (using molecular fragment counts as independent variables). The regression model is generated prior to runtime.
  • Nearest neighbor method – The predicted toxicity is estimated by taking the average toxicity of the three chemicals in the training set that are most similar to the test chemical.
  • Consensus method – The predicted toxicity is estimated by taking an average of the predicted toxicities from each of the above QSAR methodologies.
  • Mode of action method – The predicted toxicity is calculated using a two-step process: (1) linear discriminant models are used to predict the aquatic toxicity mode of action and (2) the quantitative toxicity is predicted using the multiple linear regression model developed for that mode of action.
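
The nearest neighbor and consensus methods in the list above are both simple averages, so they can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TEST implementation: similarity is assumed here to be Euclidean distance between descriptor vectors, the consensus is assumed to skip methods that return no prediction, and all function names and data values are hypothetical.

```python
import math
from statistics import mean


def distance(u, v):
    """Euclidean distance between two descriptor vectors (assumed similarity measure)."""
    return math.dist(u, v)


def nearest_neighbor_prediction(query, training_set, k=3):
    """Average the toxicities of the k training chemicals closest to the query.

    training_set is a list of (descriptor_vector, toxicity) pairs.
    """
    ranked = sorted(training_set, key=lambda item: distance(query, item[0]))
    return mean(tox for _, tox in ranked[:k])


def consensus_prediction(predictions):
    """Average the predictions of the individual methods.

    Methods that could not make a prediction are passed as None and skipped
    (an assumption made for this sketch).
    """
    available = [p for p in predictions if p is not None]
    if not available:
        raise ValueError("no method produced a prediction")
    return mean(available)


# Hypothetical training data: (descriptor vector, measured toxicity).
training = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((10.0, 10.0), 9.0),
]
nn_value = nearest_neighbor_prediction((0.1, 0.1), training)
consensus_value = consensus_prediction([nn_value, 2.5, None, 3.0])
print(nn_value, consensus_value)
```

The distant fourth chemical does not affect the nearest neighbor estimate because only the three closest training chemicals are averaged.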

These methodologies are explained in detail in the publications below.

The software includes models for the following endpoints:

TEST is based on the Chemistry Development Kit, an open-source Java library for computational chemistry.

The software now contains models for the following physical properties:

Models for additional endpoints will be added as they are completed.


What's new in Version 4.2.1?

  • Corrected bug where FDA method was omitted from the list of method options.


Prior Version History

  • 4.2 (4/2016)
    • Added MOA-based method for calculating acute fathead minnow toxicity
    • Fixed bug involving selecting the output folder
    • Fixed inconsistencies in the calculation of the ALOGP descriptor
  • 4.1 (7/27/2012)
    • Results are now displayed for the most similar chemicals in the training and test sets (enables users to assess confidence in the predicted value)
    • The results pages now list which fragment is missing if the fragment constraint is violated
    • Fixed bug which occurred when saving results files to network drives
    • Fixed bug that occurred when editing chemicals in the batch list
    • Fixed bug where single model method was not included for batch mode predictions
    • Added the ability to copy the SMILES of the current structure to the clipboard
    • Added the ability to load recently analyzed structures from the File menu
    • Added the ability to load recently generated batch results files from the File menu
    • Improved the speed of loading large aromatic compounds from MDL SD files
    • Updated/added endpoints
  • 4.0 (6/7/11)
    • Physical properties are now estimated
    • Batch mode is improved:
      • Loading can now be interrupted
      • Chemicals with loading errors are displayed at the top of the batch table
      • Can now load SMILES files with no identifier field (chemicals are assigned arbitrary IDs)
    • Aromaticity detection is improved:
      • Can handle aromatic bond orders (bond order = 4) in MOL or SD files
      • The SMILES parser has been improved to better handle complicated aromatic ring systems
    • Added Options screen:
      • Added ability to change the output directory after it has been set
      • The program now remembers the previously selected output folder
      • The "Relax fragment constraint" checkbox was moved to Options screen
  • 3.3 (7/8/10)
    • Daphnia magna LC50 endpoint was added
    • AMES Mutagenicity endpoint was added
    • The following changes were made for binary endpoints such as developmental toxicity and AMES mutagenicity:
      • QSAR models now have stricter statistical standards (leave-one-out concordance = 0.8, sensitivity = 0.5, and specificity = 0.5)
      • Model statistics such as concordance, sensitivity, and specificity are now displayed in the results web pages
  • 3.2 (12/18/09)
    • Reproductive toxicity endpoint was added
    • Random forest QSAR method was added (for reproductive toxicity endpoint only)
  • 3.1 (6/23/09)
    • Fixed issue with running TEST in non-English-speaking countries
  • 3.0 (4/14/09)
    • Random selection is used to divide the data sets into training and test sets
    • Added BCF endpoint
    • Added consensus prediction method
  • 2.0 (2/24/09)
    • Each toxicity data set is now split into a training and test set
    • The toxicity models included in the software are now fit to the training sets (previously they were fit to the overall sets)
    • The batch mode was improved (chemicals can be added and the list can now be saved as an SDF)
  • 1.0.3 (10/24/08)
    • Fixed calculation of "ieadje" molecular descriptor
    • Fixed definitions of chi descriptors in numbered list in molecular descriptors guide
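
The 3.3 release notes above refer to concordance, sensitivity, and specificity for binary endpoints. As a rough sketch, these statistics can be computed from observed and predicted class labels using their standard definitions: concordance is the fraction of all chemicals classified correctly, sensitivity is the fraction of observed positives predicted positive, and specificity is the fraction of observed negatives predicted negative. The function name and the example labels are hypothetical, and this is not code from TEST.

```python
def binary_statistics(observed, predicted):
    """Return concordance, sensitivity, and specificity for binary labels.

    observed and predicted are equal-length sequences of 0/1 (or False/True).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # true positives
    tn = sum(1 for o, p in pairs if not o and not p)  # true negatives
    fp = sum(1 for o, p in pairs if not o and p)      # false positives
    fn = sum(1 for o, p in pairs if o and not p)      # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one observed positive and one observed negative")
    return {
        "concordance": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels: 1 = toxic (positive), 0 = non-toxic (negative).
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
stats = binary_statistics(observed, predicted)
print(stats)
```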


System Requirements

  • In Version 4.2, a separate copy of Java is installed with the software, which should reduce compatibility issues.
  • At least 2 GB of RAM is recommended.


Installation Instructions

  1. Save the appropriate installation file to your hard drive. Due to the large size of the file, the download may take 15 minutes or longer depending on the speed of the connection.
  2. Double-click the installation file. Linux users: open a shell, cd to the directory where you downloaded the installer, and at the prompt type: sh ./install.bin

Silent Installation Instructions for Network Administrators (for Windows users)

The software can be installed silently by issuing the following command at the command prompt: install -i silent

Download TEST (version 4.2.1)

Training and prediction sets (12 MB) used in T.E.S.T. (SDF format)

Structure Data Files (ZIP, 3 K) (such as an MDL SD file)



Publications

U.S. EPA (2016). "User’s Guide for T.E.S.T. (version 4.2) (Toxicity Estimation Software Tool): A Program to Estimate Toxicity from Molecular Structure."

Sushko, I.; Novotarskyi, S.; Körner, R.; Pandey, A. K.; Cherkasov, A.; Li, J.; Gramatica, P.; Hansen, K.; Schroeter, T.; Müller, K.-R.; Xi, L.; Liu, H.; Yao, X.; Öberg, T.; Hormozdiari, F.; Dao, F.; Sahinalp, C.; Todeschini, R.; Polishchuk, P.; Artemenko, A.; Kuz’min, V.; Martin, T.M.; Young, D. M.; Fourches, D.; Muratov, E.; Tropsha, A.; Baskin, I.; Horvath, D.; Marcou, G.; Varnek, A.; Prokopenko, V. V.; Tetko, I.V. (2010). “Applicability domains for classification problems: benchmarking of distance to models for AMES mutagenicity set.” J. Chem. Inf. Model., 50, 2094-2111.

Cassano, A.; Manganaro, A.; Martin, T.; Young, D.; Piclin, N.; Pintore, M.; Bigoni, D.; Benfenati, E. (2010). “The CAESAR models for developmental toxicity.” Chemistry Central Journal, 4(Suppl 1):S4.

Zhu, H.; Martin, T.M.; Young, D. M.; Tropsha, A. (2009). “Combinatorial QSAR Modeling of Rat Acute Toxicity by Oral Exposure.” Chemical Research in Toxicology, 22 (12), pp 1913-1921.

Benfenati, E., Benigni, R., Demarini, D.M., Helma, C., Kirkland, D., Martin, T.M., Mazzatorta, G., Ouedraogo-Arras, G., Richard, A.M., Schilter, B., Schoonen, W.G.E.J., Snyder, R.D., and C. Yang. (2009). “Predictive Models for Carcinogenicity and Mutagenicity: Frameworks, State-of-the-Art, and Perspectives.” Journal of Environmental Science and Health Part C, 27, 2: 57-90.

Young, D.M.; Martin, T.M.; Venkatapathy, R.; Harten, P. (2008). “Are the Chemical Structures in your QSAR Correct?” QSAR & Combinatorial Science, 27 (11-12), 1337-1345.

Martin, T.M., P. Harten, R. Venkatapathy, S. Das and D.M. Young. (2008). “A Hierarchical Clustering Methodology for the Estimation of Toxicity.” Toxicology Mechanisms and Methods, 18, 2: 251–266.

Martin, T.M., and D.M. Young. (2001). “Prediction of the Acute Toxicity (96-h LC50) of Organic Compounds in the Fathead Minnow (Pimephales Promelas) Using a Group Contribution Method.” Chemical Research in Toxicology, 14, 10: 1378–1385.


