Project description:Next-generation sequencing (NGS) technologies have substantially improved the detection of new or previously unrecognized infectious agents. In this context, high-throughput NGS can be used to achieve comprehensive and unbiased sequencing of the nucleic acids present in a clinical sample (e.g., tissues). Metagenomic shotgun sequencing has emerged as a powerful high-throughput approach to analyze and survey microbial composition in the field of infectious diseases. By directly sequencing millions of nucleic acid molecules in a sample and matching the sequences to those available in databases, the pathogens behind an infectious disease can be inferred. Despite the large amount of metagenomic shotgun data produced, there is a lack of a comprehensive and easy-to-use data analysis pipeline that spares users complicated bioinformatics steps. Here we present HOME-BIO, a modular and exhaustive pipeline for biological entity estimation, specifically designed for shotgun-sequenced clinical samples. HOME-BIO provides comprehensive taxonomic classification by querying different source databases and carries out the main steps of a metagenomic investigation. HOME-BIO is a powerful tool for biologists without computational experience who are focused on metagenomic analysis. Its ease of use allows users to simply import raw sequenced read files and obtain the taxonomic profile of their samples.
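The idea of matching sequenced reads against database sequences to infer which organisms are present can be illustrated with a toy k-mer lookup. This is a minimal sketch under stated assumptions, not HOME-BIO's implementation: the function names, the reference sequences, the taxon labels, and the choice of k = 8 are all hypothetical and exist only for illustration. Real classifiers work on far larger databases and resolve ambiguous hits with a taxonomy tree.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(taxon)
    return index


def classify_read(read, index, k):
    """Return the taxon with the most k-mer hits.

    Returns None when the read has no hits or when the top two taxa tie.
    """
    votes = Counter()
    for km in kmers(read, k):
        for taxon in index.get(km, ()):
            votes[taxon] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def taxonomy_profile(reads, index, k):
    """Count how many reads were assigned to each taxon (None = unclassified)."""
    return Counter(classify_read(read, index, k) for read in reads)


# Hypothetical toy data: two made-up reference sequences and three reads.
K = 8
references = {
    "taxon_A": "ACGTACGTTAGCCGATAGGCTTAACG",
    "taxon_B": "TTGACCGGTTAACCGGATATCCGGAA",
}
reads = [
    "GTTAGCCGATAG",  # exact substring of taxon_A
    "CGGTTAACCGGA",  # exact substring of taxon_B
    "GGGGGGGGGGGG",  # matches neither reference
]

index = build_index(references, K)
profile = taxonomy_profile(reads, index, K)
for taxon, count in profile.items():
    print(taxon or "unclassified", count)
```

The per-taxon read counts in `profile` are the simplest form of a taxonomy profile; dividing each count by the number of reads would give relative abundances.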
Project description:Recent progress in unbiased metagenomic next-generation sequencing (mNGS) allows simultaneous examination of microbial and host genetic material in a single test. Leveraging affordable bronchoalveolar lavage fluid (BALF) mNGS data, we employed machine learning to create a diagnostic approach that distinguishes lung cancer from pulmonary infections, conditions prone to misdiagnosis in clinical settings. This prospective study analyzed BALF-mNGS data from lung cancer and pulmonary infection patients, delineating differences in DNA/RNA microbial composition, bacteriophage abundances, and host responses, including gene expression, transposable element levels, immune cell composition, and tumor fraction derived from copy number variation (CNV). Integrating these metrics into a host/microbe metagenomics-driven machine learning model (Model VI) demonstrated robustness in distinguishing lung cancer from pulmonary infections, achieving an AUC of 0.87 (95% CI = 0.857-0.883), sensitivity = 73.8%, and specificity = 84.5% in the training cohort, and an AUC of 0.831 (95% CI = 0.819-0.843), sensitivity = 67.1%, and specificity = 94.4% in the validation cohort. Applying a composite predictive model based on a rule-in and rule-out strategy significantly enhanced accuracy (ACC) in distinguishing lung cancer from tuberculosis (ACC=0.913), fungal infection (ACC=0.955), and bacterial infection (ACC=0.836). These findings highlight the potential of cost-effective mNGS-based analysis as a valuable tool for early differentiation between lung cancer and pulmonary infections, offering significant benefits through a single comprehensive test.
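For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, accuracy, and AUC are conventionally computed for a binary classifier. It is an illustration only, not the study's Model VI: the labels, scores, and the 0.5 decision threshold are made-up toy values, and the encoding 1 = lung cancer, 0 = pulmonary infection is an assumption made here for the example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = (TP+TN)/N."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy values (assumed): 1 = lung cancer, 0 = pulmonary infection.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
THRESHOLD = 0.5  # hypothetical decision threshold

y_pred = [1 if s >= THRESHOLD else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why abstracts such as this one report both kinds of figure.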