Project description:Recently developed methods that utilize partitioning of long genomic DNA fragments, and barcoding of shorter fragments derived from them, have succeeded in retaining long-range information in short sequencing reads. These so-called read cloud approaches represent a powerful, accurate, and cost-effective alternative to single-molecule long-read sequencing. We developed software, GROC-SVs, that takes advantage of read clouds for structural variant detection and assembly. We apply the method to two 10x Genomics data sets, one chromothriptic sarcoma with several spatially separated samples, and one breast cancer cell line, all Illumina-sequenced to high coverage. Comparison to short-fragment data from the same samples, and validation by mate-pair data from a subset of the sarcoma samples, demonstrate substantially improved specificity of breakpoint detection at comparable sensitivity, and improved sensitivity at comparable specificity, relative to short-fragment sequencing. The embedded long-range information also facilitates sequence assembly of a large fraction of the breakpoints; importantly, consecutive breakpoints that are closer than the average length of the input DNA molecules can be assembled together and their order and arrangement reconstructed, with some events exhibiting remarkable complexity. These features facilitated an analysis of the structural evolution of the sarcoma. In the chromothripsis, rearrangements occurred before copy number amplifications, and using the phylogenetic tree built from point mutation data, we show that single nucleotide variants and structural variants are not correlated. We predict significant future advances in structural variant science using 10x data analyzed with GROC-SVs and other read cloud-specific methods.
Project description:Repeatable, convergent outcomes are prima facie evidence for determinism in evolutionary processes. Among fishes, well-known examples include microevolutionary habitat transitions into the water column, where freshwater populations (e.g., sticklebacks, cichlids, and whitefishes) recurrently diverge toward slender-bodied pelagic forms and deep-bodied benthic forms. However, the consequences of such processes at deeper macroevolutionary scales in the marine environment are less clear. We applied a phylogenomics-based integrative, comparative approach to test hypotheses about the scope and strength of convergence in a marine fish clade with a worldwide distribution (snappers and fusiliers, family Lutjanidae) featuring multiple water-column transitions over the past 45 million years. We collected genome-wide exon data for 110 (∼80%) species in the group and aggregated data layers for body shape, habitat occupancy, geographic distribution, and paleontological and geological information. We also implemented approaches using genomic subsets to account for phylogenetic uncertainty in comparative analyses. Our results show independent incursions into the water column by ancestral benthic lineages in all major oceanic basins. These evolutionary transitions are persistently associated with convergent phenotypes, where deep-bodied benthic forms with truncate caudal fins repeatedly evolve into slender midwater species with furcate caudal fins. Lineage diversification and transition dynamics vary asymmetrically between habitats, with benthic lineages diversifying faster and colonizing midwater habitats more often than the reverse. Convergent ecological and functional phenotypes along the benthic-pelagic axis are pervasive among different lineages and across vastly different evolutionary scales, achieving predictable high-fitness solutions for similar environmental challenges, ultimately demonstrating strong determinism in fish body-shape evolution.
| S-EPMC7777220 | biostudies-literature
Project description:Comparison of short-read and long-read amplicon and metagenomic sequencing data from clinical microbiome samples
Project description:gnp_blan06_torpten - lst8 - The impact of the TOR pathway on growth and stress responses. Comparison of the lst8 mutant vs. WT in short-day and long-day conditions.
Project description:Pioneering studies (PXD014844) have identified many interesting molecules by LC-MS/MS proteomics, but the protein databases used to assign mass spectra were based on short Illumina reads of the Amblyomma americanum transcriptome and may not have captured the diversity and complexity of longer transcripts. Here we apply long-read Pacific Biosciences technologies to complement the previously reported short-read Illumina transcriptome-based proteome in an effort to increase spectrum assignments. Our dataset reveals a small increase in assignable spectra to supplement the previously released short-read transcriptome-based proteome.
Project description:Pioneering studies (PXD014844) have identified many interesting molecules in tick saliva by LC-MS/MS proteomics, but the protein databases used to assign mass spectra were based on short Illumina reads of the Amblyomma americanum transcriptome and may not have captured the diversity and complexity of longer transcripts. Here we apply long-read Pacific Biosciences technologies to complement the previously reported short-read Illumina transcriptome-based proteome in an effort to increase spectrum assignments. Our dataset reveals a small increase in assignable spectra to supplement the previously released short-read transcriptome-based proteome.