ToxCast Data Generation: Processing and Analysis
Once testing is complete, the labs send the resulting data back to EPA for processing and quality assurance/quality checking. Because the ToxCast data arrive from numerous vendors in diverse formats, EPA developed a flexible high-throughput screening (HTS) data analysis pipeline capable of efficiently processing and storing large volumes of data. The procedure EPA uses to process these data is outlined below, and the pipeline itself is available as an R package (tcpl). Although tcpl was developed primarily for ToxCast, EPA has attempted to make the package generally applicable, so that the community can use it to process any high-throughput chemical screening data.
- The data, received in unique formats from each vendor, are transformed to a standard computable format and loaded into the ToxCast database (invitrodb) by vendor-specific R scripts.
- Once the data are loaded into the database, the ToxCast Pipeline (tcpl) uses the generalized processing functions implemented in the package to process, normalize, model, qualify, flag, inspect, and visualize the data.
- The processed data are available for download and in the Computational Toxicology Chemicals Dashboard.
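
The first step above, converting vendor-specific formats to one standard format, can be illustrated with a minimal sketch. It is written in Python for illustration only; EPA's actual loaders are vendor-specific R scripts, and the vendor names, column names, units, and standard field names below are all hypothetical rather than taken from invitrodb or tcpl.

```python
from typing import Callable, Dict, List

# Hypothetical common schema; these field names are illustrative only
# and are not the actual invitrodb fields.
STANDARD_FIELDS = ("sample_id", "assay", "conc_um", "response")

Row = Dict[str, str]
StandardRow = Dict[str, object]


def from_vendor_a(row: Row) -> StandardRow:
    # Hypothetical vendor A already reports concentration in micromolar.
    return {
        "sample_id": row["SampleID"],
        "assay": row["Assay"],
        "conc_um": float(row["Conc_uM"]),
        "response": float(row["Signal"]),
    }


def from_vendor_b(row: Row) -> StandardRow:
    # Hypothetical vendor B reports concentration in nanomolar,
    # so it is converted to micromolar here.
    return {
        "sample_id": row["cmpd"],
        "assay": row["readout"],
        "conc_um": float(row["conc_nM"]) / 1000.0,
        "response": float(row["value"]),
    }


# One converter per vendor, mirroring the "vendor-specific script" idea.
CONVERTERS: Dict[str, Callable[[Row], StandardRow]] = {
    "vendor_a": from_vendor_a,
    "vendor_b": from_vendor_b,
}


def standardize(vendor: str, rows: List[Row]) -> List[StandardRow]:
    """Convert one vendor's rows into the common schema."""
    convert = CONVERTERS[vendor]
    return [convert(r) for r in rows]
```

With this layout, every downstream step only ever sees the common schema, so supporting a new vendor means adding one converter rather than changing the processing code.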