
Dataset Information


Enhanced taxonomy annotation of antiviral activity data from ChEMBL.


ABSTRACT: The discovery of antiviral drugs is a rapidly developing area of medicinal chemistry research, driven by the emergence of resistant variants and outbreaks of poorly studied viral diseases. The amount of antiviral activity data available in ChEMBL grows steadily, but the virus taxonomy annotation of these data is not sufficient for thorough studies of antiviral chemical space. We developed a procedure for semi-automatic extraction of antiviral activity data from ChEMBL and mapped the data to the virus taxonomy developed by the International Committee on Taxonomy of Viruses (ICTV). The procedure is based on lists of virus-related values of ChEMBL annotation fields and a dictionary of virus names and acronyms mapped to ICTV taxa. Applying this data extraction procedure retrieves from ChEMBL 1.6 times more assays, linked to 2.5 times more compounds and data points, than the ChEMBL web interface allows. Mapping these data to ICTV taxa makes it possible to analyze all the compounds tested against each viral species. Activity values and compound structures were standardized, and an antiviral activity profile was created for each standard structure. The data set compiled using this algorithm was named ViralChEMBL. As case studies, we compared descriptor and scaffold distributions for the full ChEMBL and its 'viral' and 'non-viral' subsets, identified the most studied compounds, and created a self-organizing map for ViralChEMBL. Our approach to data annotation proved to be a very efficient tool for the study of antiviral chemical space.
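
The dictionary-based mapping step mentioned in the abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the dictionary entries, the taxon labels (ICTV names change between releases), and the function name `map_to_taxa` are hypothetical and are not taken from the ViralChEMBL procedure itself, which is only summarized here.

```python
import re

# Hypothetical dictionary: lower-cased virus names and acronyms mapped to a
# taxon label. Entries are illustrative placeholders, not the paper's dictionary.
VIRUS_DICTIONARY = {
    "hiv-1": "Human immunodeficiency virus 1",
    "human immunodeficiency virus 1": "Human immunodeficiency virus 1",
    "hcv": "Hepacivirus C",
    "hepatitis c virus": "Hepacivirus C",
    "influenza a virus": "Influenza A virus",
}


def _compile(dictionary):
    """Build one case-insensitive pattern that matches any dictionary term."""
    # Longest terms first, so a full name wins over a shorter term inside it.
    terms = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # The lookarounds keep acronyms from matching inside longer tokens.
    return re.compile(r"(?<![\w-])(?:" + alternation + r")(?![\w-])", re.IGNORECASE)


_PATTERN = _compile(VIRUS_DICTIONARY)


def map_to_taxa(text):
    """Return the sorted taxon labels whose names or acronyms occur in text."""
    found = {VIRUS_DICTIONARY[m.group(0).lower()] for m in _PATTERN.finditer(text)}
    return sorted(found)


if __name__ == "__main__":
    # Made-up assay descriptions, used only to exercise the function.
    for description in (
        "Antiviral activity against HIV-1 in MT-4 cells",
        "Inhibition of hepatitis C virus (HCV) replicon",
        "Binding affinity to dopamine D2 receptor",
    ):
        print(description, "->", map_to_taxa(description))
```

In this sketch, a description that matches no dictionary term yields an empty list, which is one simple way to separate candidate 'viral' records from 'non-viral' ones; a semi-automatic procedure would presumably follow such a step with manual review.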

SUBMITTER: Nikitina AA 

PROVIDER: S-EPMC6367519 | biostudies-literature | 2019 Jan

REPOSITORIES: biostudies-literature


Publications

Enhanced taxonomy annotation of antiviral activity data from ChEMBL.

Nikitina Anastasia A, Orlov Alexey A, Kozlovskaya Liubov I, Palyulin Vladimir A, Osolodkin Dmitry I

Database: the journal of biological databases and curation, 2019-01-01



Similar Datasets

S-EPMC6323927 | biostudies-literature
S-EPMC3700754 | biostudies-literature
S-EPMC4353373 | biostudies-literature
S-EPMC4489243 | biostudies-literature
S-EPMC434490 | biostudies-literature
S-EPMC6003391 | biostudies-literature
S-EPMC8166773 | biostudies-literature
S-EPMC7888266 | biostudies-literature
S-EPMC5415185 | biostudies-literature
S-EPMC5210557 | biostudies-literature