Open Data Resources
Blood-Brain Barrier (logBB) data set
This data set contains the PubChem CIDs (Compound IDs), InChI (International Chemical Identifier) strings, and experimental logBB values for 438 chemicals. The logBB of a chemical is the base-10 logarithm of its steady-state brain-to-plasma concentration ratio.
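As an illustration of the logBB definition, the following minimal sketch computes the value from a pair of steady-state concentrations. The function name `log_bb` and the example concentrations are hypothetical and are not taken from the data set; the only requirement is that both concentrations are in the same units.

```python
import math


def log_bb(brain_conc: float, plasma_conc: float) -> float:
    """Return log10 of the steady-state brain-to-plasma concentration ratio.

    Both concentrations must be positive and expressed in the same units.
    """
    if brain_conc <= 0 or plasma_conc <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(brain_conc / plasma_conc)


if __name__ == "__main__":
    # Hypothetical values: a chemical ten times more concentrated in brain
    # than in plasma has logBB = 1; the reverse case gives logBB = -1.
    print(log_bb(10.0, 1.0))
    print(log_bb(1.0, 10.0))
```

A positive logBB means the chemical is more concentrated in the brain than in plasma at steady state, and a negative value means the opposite.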
Details regarding the aggregation and curation of this data set can be found in the following paper:
BCRP (Breast Cancer Resistance Protein) data set
This data set contains the names, InChI (International Chemical Identifier) strings, and binary values indicating evidence of BCRP inhibition at a concentration of 10 µM for 395 chemicals. BCRP is an ABC (ATP-Binding Cassette) transporter responsible for the efflux of chemicals out of cells in organs such as the intestines, brain, and placenta.
Details regarding the aggregation and curation of this data set can be found in the following papers:
Bioavailability data set
This data set contains the names, InChI (International Chemical Identifier) strings, and oral bioavailability (%F) values for 1070 chemicals.
Details regarding the aggregation and curation of this data set can be found in the following paper:
BSEP (Bile Salt Export Pump) data set
This data set contains the names, InChI (International Chemical Identifier) strings, and binary values indicating evidence of BSEP inhibition at a concentration of 100 µM for 725 chemicals. BSEP is an ABC (ATP-Binding Cassette) transporter responsible for secreting bile salts from hepatocytes into bile.
Details regarding the aggregation and curation of this data set can be found in the following paper:
DART (Developmental and Reproductive Toxicity) data set
This data set contains the CAS (Chemical Abstracts Service) numbers, InChI (International Chemical Identifier) strings, and experimental developmental and reproductive toxicity data for 696 chemicals.
Details regarding the aggregation and curation of this data set can be found in the following paper:
Estrogen receptor data set
This data set contains the InChI (International Chemical Identifier) strings and experimental estrogen receptor activity data for 2036 chemicals. It was merged from three data sets that were aggregated and curated for predicting estrogen receptor activity in the following papers:
Fathead minnow data set
This data set contains the names, InChI (International Chemical Identifier) strings, and experimental toxicity values for fathead minnows, expressed as the negative logarithm of concentration (µM), for 674 chemicals.
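To show how a negative-log toxicity value relates to a concentration, here is a minimal sketch of the conversion in both directions. It assumes a base-10 logarithm and a concentration in µM as stated above; the base is an assumption that should be checked against the curation paper, and the function names are hypothetical.

```python
import math


def neg_log_conc(conc_um: float) -> float:
    """Convert a concentration in µM to its negative base-10 logarithm."""
    if conc_um <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_um)


def conc_from_neg_log(value: float) -> float:
    """Invert neg_log_conc: recover the concentration in µM."""
    return 10.0 ** (-value)


if __name__ == "__main__":
    # Hypothetical example: a chemical that is toxic at a lower concentration
    # receives a larger value on the negative-log scale.
    print(neg_log_conc(1.0), neg_log_conc(100.0))
```

On this scale, larger values correspond to lower concentrations, that is, to more toxic chemicals.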
Details regarding the aggregation and curation of this data set can be found in the following paper:
Hepatotoxicity data set
This data set contains the PubChem CIDs (Compound IDs), InChI (International Chemical Identifier) strings, and experimental hepatotoxicity data for 3709 chemicals.
Details regarding the aggregation and curation of this data set can be found in the following paper:
Mulliner, D.; Schmidt, F.; Stolte, M.; Spirkl, H.-P.; Czich, A.; Amberg, A. Computational Models for Human and Animal Hepatotoxicity with a Global Application Scope. Chem. Res. Toxicol. 2016, 29 (5), 757–767. https://doi.org/10.1021/acs.chemrestox.5b00465.
MDR1 (Multidrug Resistance 1) transporter data set
This data set contains the names, InChI (International Chemical Identifier) strings, and binary values indicating evidence of MDR1 inhibition at a concentration of 10 µM for 1585 chemicals. MDR1, also known as P-glycoprotein (P-gp), is an ABC (ATP-Binding Cassette) transporter responsible for the efflux of foreign chemicals out of cells.
Details regarding the aggregation and curation of this data set can be found in the following papers: