Cheminformatics: solution delivers predictive models to chemistry and related sciences

Scientists at the University of Tartu (UT) are hosting and developing a smart repository solution that gives easy access to quantitative and qualitative structure–activity relationship, (Q)SAR, models and improves the publishing of research results.


(Q)SARs are important tools for describing and assessing the properties of molecules in chemistry and related fields such as biochemistry, medicinal chemistry, biotechnology and toxicology. These relationships have been used for decades to explain physical, chemical and biomedical phenomena and technological processes, and have also found use in decision support, for example in environmental risk assessment and drug design.


Publishing (Q)SAR models is demanding because model development is a complex process that can be divided into three phases. The first phase is the collection of relevant experimental data about the modelled phenomenon. The second phase characterises the relevant chemical structures with numerical parameters, called molecular descriptors, using various computational chemistry approaches. The third phase involves statistical analysis to establish and validate a mathematical relationship between the calculated molecular descriptors and the experimentally measured properties.
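The three phases can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the property values are synthetic placeholders rather than measurements, the single descriptor (a carbon count read from a SMILES string) is far cruder than real molecular descriptors, and the function names are hypothetical and not part of QsarDB.

```python
# Toy walk-through of the three phases; all names and numbers are illustrative.

# Phase 1: collect "experimental" data as (SMILES, property) pairs.
# These values are synthetic placeholders, not real measurements.
dataset = [
    ("C", 3.0),
    ("CC", 5.0),
    ("CCC", 7.0),
    ("CCCC", 9.0),
]


# Phase 2: characterise each structure with a numerical descriptor.
def carbon_count(smiles):
    """Count 'C' characters; only meaningful for simple alkanes."""
    return smiles.count("C")


# Phase 3: ordinary least squares for y = intercept + slope * x.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Goodness of fit on the training data (not an external validation)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [carbon_count(smiles) for smiles, _ in dataset]
ys = [value for _, value in dataset]
intercept, slope = fit_line(xs, ys)
fit_quality = r_squared(xs, ys, intercept, slope)


def predict(smiles):
    """Apply the fitted relationship to a new structure."""
    return intercept + slope * carbon_count(smiles)
```

A real study would use many descriptors, measured data and validation on compounds held out from fitting; the sketch only shows how the phases feed into one another.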


Mathematically, (Q)SARs range from simple linear and multilinear relationships (MLR) to complicated machine learning models such as decision trees (DT), artificial neural networks (ANN), random forests (RF), support vector machines (SVM) and Bayesian models. A correct, compact and machine-readable representation of these models, together with the relevant chemical data, is vital for applying published models.
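For the simplest case named here, a multilinear relationship is conventionally written in the following general form. The symbols are notation assumed for illustration, not taken from QsarDB:

\[
\hat{y} = b_0 + \sum_{i=1}^{n} b_i x_i
\]

Here \(\hat{y}\) is the predicted property, \(x_i\) are the molecular descriptors and \(b_0, b_i\) are coefficients fitted to the experimental data.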


The repository of (Q)SAR models (the QsarDB repository), created at the UT Institute of Chemistry, is an open and smart web environment for uploading developed and published (Q)SARs, presented in a data format designed specifically for the transparent archiving of (Q)SAR models. “QsarDB is not just an ordinary repository for storing static information; it also provides smart tools for exploring models, visualizing data and chemical structures, evaluating models’ performance, and making and evaluating predictions. No less important is the fact that models and the underlying raw data can be freely downloaded and evaluated independently,” said Uko Maran, senior researcher at the UT Institute of Chemistry.


QsarDB is the first repository to provide digital identifiers for the unique identification of published (Q)SAR models. In particular, QsarDB assigns DOI (digital object identifier) numbers and HDL identifiers to uploaded models. This makes (Q)SAR models citable and classical PDF research publications interactive. More importantly, the QsarDB repository saves much of the time and effort needed to use already available models.

Read more about the QsarDB repository:

QsarDB web site:


Additional information: Uko Maran, UT Institute of Chemistry, e-mail: