
Dataset Information


ProdMX: Rapid query and analysis of protein functional domain based on compressed sparse matrices.


ABSTRACT: Large-scale protein analysis has been used to characterize large numbers of proteins across numerous species. One application is high-throughput screening of genomes for pathogenicity. Unlike sequence homology methods, protein comparison at a functional level makes it possible to classify proteins by their functional structures without dealing with the sequence complexity of distantly related species. Protein functions can be abstractly described by a set of protein functional domains, such as PfamA domains; a set of genomes can then be mapped to a matrix in which each row represents a genome and each column represents the presence or absence of a given functional domain. However, a powerful tool is needed to analyze the large sparse matrices generated by the millions of genomes that will become available in the near future. ProdMX is a tool with user-friendly utilities developed to facilitate high-throughput analysis of proteins, and it can be included as a module in a high-throughput pipeline. ProdMX employs a compressed sparse matrix algorithm to reduce the computational resources and time needed for matrix manipulation during functional domain analysis. ProdMX is a free and publicly available Python package that can be installed through PyPI or Conda, or with a standard installer from the source code available in the ProdMX GitHub repository at https://github.com/visanuwan/prodmx.
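
The genome-by-domain matrix described in the abstract can be illustrated with a minimal sketch of compressed sparse row (CSR) storage. This is not the ProdMX API or its implementation; the genome names, the domain accessions, and the helper functions build_csr, row_domains, and domain_counts are all hypothetical and chosen only for illustration. Because every stored entry means "present", the sketch keeps only the row pointers and column indices and omits a separate values array.

```python
# Hypothetical toy input: genome name -> set of functional-domain accessions found in it.
genome_domains = {
    "genome_A": {"PF00005", "PF00072"},
    "genome_B": {"PF00072"},
    "genome_C": {"PF00005", "PF00072", "PF00486"},
}


def build_csr(genome_domains):
    """Encode a binary genome x domain matrix as CSR (indptr, indices).

    Row i holds the column indices indices[indptr[i]:indptr[i + 1]];
    absent domains (the zeros) are never stored.
    """
    genomes = sorted(genome_domains)
    domains = sorted(set().union(*genome_domains.values()))
    col_of = {d: j for j, d in enumerate(domains)}
    indptr, indices = [0], []
    for g in genomes:
        indices.extend(sorted(col_of[d] for d in genome_domains[g]))
        indptr.append(len(indices))
    return genomes, domains, indptr, indices


def row_domains(i, domains, indptr, indices):
    """Return the domains present in row (genome) i."""
    return [domains[j] for j in indices[indptr[i]:indptr[i + 1]]]


def domain_counts(domains, indices):
    """Count how many genomes carry each domain (column sums)."""
    counts = [0] * len(domains)
    for j in indices:
        counts[j] += 1
    return dict(zip(domains, counts))


genomes, domains, indptr, indices = build_csr(genome_domains)
print(indptr)
print(indices)
print(row_domains(genomes.index("genome_B"), domains, indptr, indices))
print(domain_counts(domains, indices))
```

Storage in this layout grows with the number of present domains plus the number of genomes, rather than with the full genomes-times-domains product, which is the property that makes sparse formats attractive for very large presence/absence matrices.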

SUBMITTER: Wanchai V 

PROVIDER: S-EPMC7719867 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

ProdMX: Rapid query and analysis of protein functional domain based on compressed sparse matrices.

Visanu Wanchai, Intawat Nookaew, David W. Ussery

Computational and Structural Biotechnology Journal, 2020-11-24



Similar Datasets

| S-EPMC3557083 | biostudies-literature
| S-EPMC4250157 | biostudies-literature
| S-EPMC5154690 | biostudies-literature
| S-EPMC4719663 | biostudies-literature
| S-EPMC3697978 | biostudies-literature
| S-EPMC9053777 | biostudies-literature
| S-EPMC10244210 | biostudies-literature
| S-EPMC4032817 | biostudies-literature
| S-EPMC8016461 | biostudies-literature
| S-EPMC3375009 | biostudies-literature