Project description: The increasing applicability and sensitivity of next-generation sequencing methods exacerbate one of the main issues in the molecular biology laboratory, namely cross-sample contamination. This type of contamination, which could massively increase the rate of false-positive calls in sequencing experiments, can originate at each step during the processing of multiple myeloma samples, such as CD138-selection of tumor cells, RNA and DNA isolation or the processing of sequencing libraries. Here we describe a Droplet Digital PCR (ddPCR) method and a simple bioinformatic solution for the detection of contamination in patient samples and derived sequencing data, which are based on the same principle: detection of alternative alleles for single-nucleotide polymorphisms (SNPs) that are homozygous according to the control (germ line) sample.

| S-EPMC9481068 | biostudies-literature
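The principle stated in the description above (looking for alternative alleles at SNPs that are homozygous in the germ line control) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `SiteCounts` record, the function names, the `min_depth` filter and the `error_rate` threshold are all hypothetical choices made for illustration, and the read counts are assumed to have already been extracted from the alignment.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SiteCounts:
    """Read counts in the test sample at one SNP (hypothetical record)."""
    ref_reads: int
    alt_reads: int
    germline_hom: str  # "ref" or "alt": the allele the control is homozygous for


def unexpected_allele_fraction(sites: Iterable[SiteCounts],
                               min_depth: int = 20) -> Optional[float]:
    """Pooled fraction of reads carrying the allele the control does NOT have.

    Sites below min_depth are skipped. Returns None if no site passes.
    """
    unexpected = 0
    total = 0
    for site in sites:
        depth = site.ref_reads + site.alt_reads
        if depth < min_depth:
            continue
        if site.germline_hom == "ref":
            unexpected += site.alt_reads
        elif site.germline_hom == "alt":
            unexpected += site.ref_reads
        else:
            raise ValueError(f"unknown genotype label: {site.germline_hom!r}")
        total += depth
    return unexpected / total if total else None


def looks_contaminated(sites: Iterable[SiteCounts],
                       error_rate: float = 0.005,
                       min_depth: int = 20) -> bool:
    """Flag the sample when unexpected reads exceed an assumed error rate."""
    fraction = unexpected_allele_fraction(sites, min_depth)
    return fraction is not None and fraction > error_rate


sites = [
    SiteCounts(ref_reads=98, alt_reads=2, germline_hom="ref"),
    SiteCounts(ref_reads=3, alt_reads=97, germline_hom="alt"),
    SiteCounts(ref_reads=5, alt_reads=0, germline_hom="ref"),  # too shallow
]
print(unexpected_allele_fraction(sites), looks_contaminated(sites))
```

The pooled fraction is only a rough indicator: it underestimates the true contamination level whenever the contaminating sample shares the patient's allele at a site, and in tumor samples somatic events can also produce unexpected alleles.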

Identification of Streptococcus parasanguinis DNA contamination in human buccal DNA samples.

Project description: BACKGROUND: Buccal swabs are a very popular method of collecting DNA in clinical and scientific studies because collection is non-invasive. However, contamination of the DNA sample may interfere with analysis. FINDINGS: Here we report the finding of Streptococcus parasanguinis bacterial DNA contamination in human buccal DNA samples, which led to preferential amplification of bacterial sequence with PCR primers designed against human sequence. CONCLUSION: Contamination of buccal-derived DNA with bacterial DNA can be significant, and may influence downstream genetic analysis. One needs to be aware of possible bacterial contamination when interpreting abnormal findings following PCR amplification of buccal swab DNA samples.

| S-EPMC4222080 | biostudies-literature

De novo hematopoiesis from the fetal lung.

Project description: Hemogenic endothelial cells (HECs) are specialized cells that undergo endothelial-to-hematopoietic transition (EHT) to give rise to the earliest precursors of hematopoietic progenitors that will eventually sustain hematopoiesis throughout the lifetime of an organism. Although HECs are thought to be primarily limited to the aorta-gonad-mesonephros (AGM) during early development, EHT has been described in various other hematopoietic organs and embryonic vessels. Though not defined as a hematopoietic organ, the lung houses many resident hematopoietic cells, aids in platelet biogenesis, and is a reservoir for hematopoietic stem and progenitor cells (HSPCs). However, lung HECs have never been described. Here, we demonstrate that the fetal lung is a potential source of HECs that have the functional capacity to undergo EHT to produce de novo HSPCs and their resultant progeny. Explant cultures of murine and human fetal lungs display adherent endothelial cells transitioning into floating hematopoietic cells, accompanied by the gradual loss of an endothelial signature. Flow cytometric and functional assessment of fetal-lung explants showed the production of multipotent HSPCs that expressed the EHT and pre-HSPC markers EPCR, CD41, CD43, and CD44. scRNA-seq and small molecule modulation demonstrated that fetal lung HECs rely on canonical signaling pathways to undergo EHT, including TGFβ/BMP, Notch, and YAP. Collectively, these data support the possibility that post-AGM development, functional HECs are present in the fetal lung, establishing this location as a potential extramedullary site of de novo hematopoiesis.

| S-EPMC10685174 | biostudies-literature

Ancestry-agnostic estimation of DNA sample contamination from sequence reads.

Project description: Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual at an early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify 10% contaminated samples at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to the genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or TRACE, while simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.

| S-EPMC7050530 | biostudies-literature

Estimation of DNA contamination and its sources in genotyped samples.

Project description: Array genotyping is a cost-effective and widely used tool that enables assessment of up to millions of genetic markers in hundreds of thousands of individuals. Genotyping array data are typically highly accurate but sensitive to mixing of DNA samples from multiple individuals before or during genotyping. Contaminated samples can lead to genotyping errors and consequently cause false positive signals or reduce power of association analyses. Here, we propose a new method to identify contaminated samples and the sources of contamination within a genotyping batch. Through analysis of array intensity and genotype data from intentionally mixed samples and 22,366 samples of the Michigan Genomics Initiative, an ongoing biobank-based study, we show that our method can reliably estimate contamination. We also show that identifying sources of contamination can implicate problematic sample processing steps and guide process improvements. Compared to existing methods, our approach can estimate the proportion of contaminating DNA more accurately, eliminate the need for external databases of allele frequencies, and provide contamination estimates that are more robust to the ancestral origin of the contaminating sample.

| S-EPMC6829038 | biostudies-literature

Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data.

Project description: DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%-20%), contamination-adjusted calls eliminate 48%-77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.

| S-EPMC4573246 | biostudies-literature
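To make the idea of modeling contamination during genotype calling more concrete, here is a minimal Python sketch of one simple way it could be done at a single biallelic site. It is an illustrative assumption, not the published method: reads are treated as independent draws, the contaminating DNA is assumed to carry the alternate allele at the population frequency `alt_freq`, and `alpha` (contamination fraction) and `error_rate` are supplied by the caller. All function and parameter names are hypothetical.

```python
import math


def alt_read_probability(genotype: int, alpha: float,
                         alt_freq: float, error_rate: float) -> float:
    """Probability that one read shows the alternate allele.

    genotype: number of alternate alleles in the intended sample (0, 1 or 2).
    A fraction alpha of reads is assumed to come from contaminating DNA
    whose alternate-allele probability is the population frequency alt_freq.
    """
    true_alt = (1.0 - alpha) * genotype / 2.0 + alpha * alt_freq
    # A sequencing error flips the observed allele.
    return true_alt * (1.0 - error_rate) + (1.0 - true_alt) * error_rate


def genotype_log_likelihoods(ref_reads: int, alt_reads: int, alpha: float,
                             alt_freq: float, error_rate: float = 0.01):
    """Binomial log-likelihood (constant term dropped) for genotypes 0, 1, 2."""
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must lie strictly between 0 and 0.5")
    result = []
    for genotype in (0, 1, 2):
        p_alt = alt_read_probability(genotype, alpha, alt_freq, error_rate)
        result.append(alt_reads * math.log(p_alt)
                      + ref_reads * math.log(1.0 - p_alt))
    return result


def call_genotype(ref_reads: int, alt_reads: int, alpha: float,
                  alt_freq: float, error_rate: float = 0.01) -> int:
    """Return the genotype (0, 1 or 2) with the highest log-likelihood."""
    scores = genotype_log_likelihoods(ref_reads, alt_reads, alpha,
                                      alt_freq, error_rate)
    return max(range(3), key=lambda g: scores[g])


# Same read counts, called while ignoring and while modeling contamination.
naive = call_genotype(ref_reads=32, alt_reads=8, alpha=0.0, alt_freq=0.9)
adjusted = call_genotype(ref_reads=32, alt_reads=8, alpha=0.2, alt_freq=0.9)
print(naive, adjusted)
```

Setting `alpha=0.0` recovers an ordinary call that ignores contamination, so the two calls in the usage lines differ only in whether the assumed 20% contamination is taken into account.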

Reply to: APP gene copy number changes reflect exogenous contamination.

Project description: Not available

| S-EPMC8522531 | biostudies-literature

Reply to: APP gene copy number changes reflect exogenous contamination.

Project description: Not available

| S-EPMC7507937 | biostudies-literature

Optimized DNA extraction and purification method for characterization of bacterial and fungal communities in lung tissue samples.

Project description: Human lungs harbor a scarce microbial community, so methods are needed that enhance the recovery of nucleic acids from bacteria and fungi and allow a more efficient analysis of the lung tissue microbiota. Here we describe five extraction protocols including pre-treatment, bead-beating and/or Phenol:Chloroform:Isoamyl alcohol steps, applied to lung tissue samples from autopsied individuals. The resulting total DNA yield and quality, the amount of bacterial and fungal DNA, and the microbial community structure were analyzed by qPCR and Illumina sequencing of bacterial 16S rRNA and fungal ITS genes. Bioinformatic modeling revealed that a large part of the microbiome from lung tissue is composed of microbial contaminants, although our controls clustered separately from biological samples. After removal of contaminant sequences, the effects of extraction protocols on the microbiota were assessed. The major differences among samples could be attributed to inter-individual variations rather than DNA extraction protocols. However, inclusion of the bead-beater and Phenol:Chloroform:Isoamyl alcohol steps resulted in changes in the relative abundance of some bacterial/fungal taxa. Furthermore, inclusion of a pre-treatment step increased microbial DNA concentration but not diversity, and it may help eliminate DNA fragments from dead microorganisms in lung tissue samples, bringing the microbial profile closer to the actual one.

| S-EPMC7562954 | biostudies-literature
