ABSTRACT:
Motivation: Proteins containing tandem repeats (TRs) are abundant, frequently fold into elongated non-globular structures and perform vital functions. A number of computational tools have been developed to detect TRs in protein sequences. Because the boundary between imperfect TR motifs and non-repetitive sequences is blurred, the detected TRs need to be validated.
Results: Tally-2.0 is a scoring tool based on a machine learning (ML) approach that validates the results of TR detection. It was upgraded by using improved training datasets and additional ML features. Tally-2.0 achieves 93% sensitivity, 83% specificity and an area under the receiver operating characteristic curve of 95%.
Availability and implementation: Tally-2.0 is available as a web tool and as a standalone application published under Apache License 2.0 at https://bioinfo.crbm.cnrs.fr/index.php?route=tools&tool=27. It is supported on Linux. Source code is available upon request.
Supplementary information: Supplementary data are available at Bioinformatics online.
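The sensitivity, specificity and area under the ROC curve quoted in the abstract are standard binary-classification metrics. The sketch below shows how such figures are computed in general; it is not Tally-2.0's code. The function names, the labels, the scores and the 0.5 threshold are all hypothetical and chosen only for illustration (label 1 stands for a genuine TR, label 0 for a non-repetitive sequence).

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} auc={auc:.4f}")
```

Sensitivity and specificity depend on the chosen score threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the three numbers are usually reported together.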
SUBMITTER: Perovic V
PROVIDER: S-EPMC7214015 | biostudies-literature | 2020 May
REPOSITORIES: biostudies-literature
Perovic V, Leclercq JY, Sumonja N, Richard FD, Veljkovic N, Kajava AV
Bioinformatics (Oxford, England), 2020-05-01, issue 10