Dataset Information

Antimicrobial peptide similarity and classification through rough set theory using physicochemical boundaries.

ABSTRACT: Antimicrobial peptides attract considerable interest as novel agents to combat infections. Their long-time potency across bacteria, viruses and fungi as part of diverse innate immune systems offers a solution to overcome the rising concerns from antibiotic resistance. With the rapid increase of antimicrobial peptides reported in the databases, peptide selection becomes a challenge. We propose similarity analyses to describe key properties that distinguish between active and non-active peptide sequences building upon the physicochemical properties of antimicrobial peptides. We used an iterative supervised machine learning approach to classify active peptides from inactive peptides with low false discovery rates in a relatively short computational search time. By generating explicit boundaries, our method defines new categories of active and inactive peptides based on their physicochemical properties. Consequently, it describes physicochemical characteristics of similarity among active peptides and the physicochemical boundaries between active and inactive peptides in a single process. To build the similarity boundaries, we used the rough set theory approach; to our knowledge, this is the first time that this approach has been used to classify peptides. The modified rough set theory method limits the number of values describing a boundary to a user-defined limit. Our method is optimized for specificity over selectivity. Noting that false positives increase activity assays while false negatives only increase computational search time, our method provided a low false discovery rate. Published datasets were used to compare our rough set theory method to other published classification methods and based on this comparison, we achieved high selectivity and comparable sensitivity to currently available methods. We developed rule sets that define physicochemical boundaries which allow us to directly classify the active sequences from inactive peptides. Existing classification methods are either sequence-order insensitive or length-dependent, whereas our method generates the rule sets that combine order-sensitive descriptors with length-independent descriptors. The method provides comparable or improved performance to currently available methods. Discovering the boundaries of physicochemical properties may lead to a new understanding of peptide similarity.

SUBMITTER: Boone K

PROVIDER: S-EPMC6282327 | biostudies-other | 2018 Dec

REPOSITORIES: biostudies-other

ACCESS DATA

Publications

Antimicrobial peptide similarity and classification through rough set theory using physicochemical boundaries.

Boone Kyle K Camarda Kyle K Spencer Paulette P Tamerler Candan C

BMC bioinformatics 20181206 1

<h4>Background</h4>Antimicrobial peptides attract considerable interest as novel agents to combat infections. Their long-time potency across bacteria, viruses and fungi as part of diverse innate immune systems offers a solution to overcome the rising concerns from antibiotic resistance. With the rapid increase of antimicrobial peptides reported in the databases, peptide selection becomes a challenge. We propose similarity analyses to describe key properties that distinguish between active and no ...[more]

PMID: 30522443

Similar Datasets

Project description:Antimicrobial peptides (AMPs) are part of the inherent immune system. In fact, they occur in almost all organisms including, e.g., plants, animals, and humans. Remarkably, they show effectivity also against multi-resistant pathogens with a high selectivity. This is especially crucial in times, where society is faced with the major threat of an ever-increasing amount of antibiotic resistant microbes. In addition, AMPs can also exhibit antitumor and antiviral effects, thus a variety of scientific studies dealt with the prediction of active peptides in recent years. Due to their potential, even the pharmaceutical industry is keen on discovering and developing novel AMPs. However, AMPs are difficult to verify in vitro, hence researchers conduct sequence similarity experiments against known, active peptides. Unfortunately, this approach is very time-consuming and limits potential candidates to sequences with a high similarity to known AMPs. Machine learning methods offer the opportunity to explore the huge space of sequence variations in a timely manner. These algorithms have, in principal, paved the way for an automated discovery of AMPs. However, machine learning models require a numerical input, thus an informative encoding is very important. Unfortunately, developing an appropriate encoding is a major challenge, which has not been entirely solved so far. For this reason, the development of novel amino acid encodings is established as a stand-alone research branch. The present review introduces state-of-the-art encodings of amino acids as well as their properties in sequence and structure based aggregation. Moreover, albeit a well-chosen encoding is essential, performant classifiers are required, which is reflected by a tendency towards specifically designed models in the literature. Furthermore, we introduce these models with a particular focus on encodings derived from support vector machines and deep learning approaches. Albeit a strong focus has been set on AMP predictions, not all of the mentioned encodings have been elaborated as part of antimicrobial research studies, but rather as general protein or peptide representations.

Dataset Information

Antimicrobial peptide similarity and classification through rough set theory using physicochemical boundaries.

Publications

Antimicrobial peptide similarity and classification through rough set theory using physicochemical boundaries.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets