Dataset Information

ProdMX: Rapid query and analysis of protein functional domain based on compressed sparse matrices.

ABSTRACT: Large-scale protein analysis has been used to characterize large numbers of proteins across numerous species. One application is high-throughput screening of genomes for pathogenicity. Unlike sequence homology methods, protein comparison at the functional level makes it possible to classify proteins by their functional structures without dealing with the sequence complexity of distantly related species. Protein functions can be described abstractly by a set of protein functional domains, such as PfamA domains; a set of genomes can then be mapped to a matrix in which each row represents a genome and each column represents the presence or absence of a given functional domain. However, a powerful tool is needed to analyze the large sparse matrices that will be generated from the millions of genomes expected to become available in the near future. ProdMX is a tool with user-friendly utilities developed to facilitate high-throughput analysis of proteins, and it can be included as a module in a high-throughput pipeline. ProdMX employs a compressed sparse matrix algorithm to reduce the computational resources and time needed for matrix manipulation during functional domain analysis. ProdMX is a free and publicly available Python package that can be installed with popular package managers such as PyPI and Conda, or with a standard installer from the source code available in the ProdMX GitHub repository at https://github.com/visanuwan/prodmx.
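The genome-by-domain matrix described in the abstract can be illustrated with a minimal sketch of compressed sparse row (CSR) storage, written in plain Python. This is not the ProdMX API: the genome names, the Pfam-style accessions, and the helper functions `build_csr`, `has_domain`, and `domain_counts` are all hypothetical and exist only to show why storing just the "present" cells saves space when most domains are absent from most genomes.

```python
# Illustrative sketch only: hypothetical genomes and Pfam-style accessions.
# None of these names come from ProdMX itself.
genome_domains = {
    "genome_A": {"PF00001", "PF00005"},
    "genome_B": {"PF00005"},
    "genome_C": {"PF00001", "PF00005", "PF00072"},
}


def build_csr(genome_domains):
    """Build a binary presence/absence matrix in CSR form.

    Rows are genomes, columns are domains. Because every stored value
    is 1, only the column indices and the row pointers are kept.
    """
    genomes = sorted(genome_domains)
    domains = sorted(set().union(*genome_domains.values()))
    column_of = {domain: j for j, domain in enumerate(domains)}

    indptr = [0]   # indptr[i]:indptr[i + 1] delimits row i inside `indices`
    indices = []   # column indices of the domains present in each row
    for genome in genomes:
        indices.extend(sorted(column_of[d] for d in genome_domains[genome]))
        indptr.append(len(indices))
    return genomes, domains, indptr, indices


def has_domain(csr, genome, domain):
    """Return True if `domain` is present in `genome`."""
    genomes, domains, indptr, indices = csr
    row = genomes.index(genome)
    col = domains.index(domain)
    return col in indices[indptr[row]:indptr[row + 1]]


def domain_counts(csr):
    """Count how many genomes carry each domain (column sums)."""
    _, domains, _, indices = csr
    counts = [0] * len(domains)
    for col in indices:
        counts[col] += 1
    return dict(zip(domains, counts))


csr = build_csr(genome_domains)
print(has_domain(csr, "genome_B", "PF00001"))
print(domain_counts(csr))
```

Only six column indices are stored here instead of the nine cells of the dense 3 x 3 matrix; the gap widens as the number of genomes and domains grows and the matrix becomes sparser.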

SUBMITTER: Wanchai V

PROVIDER: S-EPMC7719867 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

ProdMX: Rapid query and analysis of protein functional domain based on compressed sparse matrices.

Wanchai V, Nookaew I, Ussery DW

Computational and Structural Biotechnology Journal, 2020-11-24

PMID: 33335686

Similar Datasets

Deterministic matrices matching the compressed sensing phase transitions of Gaussian random matrices. | S-EPMC3557083 | biostudies-literature

Sparse sign-consistent Johnson-Lindenstrauss matrices: compression with neuroscience-based constraints. | S-EPMC4250157 | biostudies-literature

Reconstruction of Gene Regulatory Networks based on Repairing Sparse Low-rank Matrices. | S-EPMC5154690 | biostudies-literature

ESTIMATION OF FUNCTIONALS OF SPARSE COVARIANCE MATRICES. | S-EPMC4719663 | biostudies-literature

Dissecting high-dimensional phenotypes with bayesian sparse factor analysis of genetic covariance matrices. | S-EPMC3697978 | biostudies-literature

RaPID-Query for Fast Identity by Descent Search and Genealogical Analysis. | S-EPMC10244210 | biostudies-literature

Hyperspectral image compressed processing: Evolutionary multi-objective optimization sparse decomposition. | S-EPMC9053777 | biostudies-literature

Multilevel sparse functional principal component analysis. | S-EPMC4032817 | biostudies-literature

Sparse Project VCF: efficient encoding of population genotype matrices. | S-EPMC8016461 | biostudies-literature

Reassessing a sparse energetic network within a single protein domain. | S-EPMC2290805 | biostudies-literature