Dataset Information

NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes.


ABSTRACT: The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families whose members have no hits to proteins of reference genomes or to Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, and 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families more than double the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.

SUBMITTER: Baltoumas FA 

PROVIDER: S-EPMC10767849 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature

Publications

NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes.

Fotis A Baltoumas, Evangelos Karatzas, Sirui Liu, Sergey Ovchinnikov, Yorgos Sofianatos, I-Min Chen, Nikos C Kyrpides, Georgios A Pavlopoulos

Nucleic Acids Research, 2024 Jan 1, issue D1


Similar Datasets

| S-EPMC6646334 | biostudies-literature
| S-EPMC10112233 | biostudies-literature
| S-EPMC8333335 | biostudies-literature
| S-EPMC6235447 | biostudies-literature
| S-EPMC6704655 | biostudies-literature
| S-EPMC7595945 | biostudies-literature
| S-EPMC4744870 | biostudies-literature
| S-EPMC6795848 | biostudies-literature
| S-EPMC9302077 | biostudies-literature
| S-EPMC2808889 | biostudies-literature