
Dataset Information


T-lex: a program for fast and accurate assessment of transposable element presence using next-generation sequencing data.


ABSTRACT: Transposable elements (TEs) are repetitive DNA sequences that are ubiquitous, extremely abundant and dynamic components of practically all genomes. Much effort has gone into annotation of TE copies in reference genomes. The sequencing cost reduction and the newly available next-generation sequencing (NGS) data from multiple strains within a species offer an unprecedented opportunity to study population genomics of TEs in a range of organisms. Here, we present a computational pipeline (T-lex) that uses NGS data to detect the presence/absence of annotated TE copies. T-lex can use data from a large number of strains and returns estimates of population frequencies of individual TE insertions in a reasonable time. We experimentally validated the accuracy of T-lex detecting presence or absence of 768 previously identified TE copies in two resequenced Drosophila melanogaster strains. Approximately 95% of the TE insertions were detected with 100% sensitivity and 97% specificity. We show that even at low levels of coverage T-lex produces accurate results for TE copies that it can identify reliably but that the rate of 'no data' calls increases as the coverage falls below 15×. T-lex is a broadly applicable and flexible tool that can be used in any genome provided the availability of the reference genome, individual TE copy annotation and NGS data.

SUBMITTER: Fiston-Lavier AS 

PROVIDER: S-EPMC3064797 | biostudies-other | 2011 Mar

REPOSITORIES: biostudies-other


Publications

T-lex: a program for fast and accurate assessment of transposable element presence using next-generation sequencing data.

Anna-Sophie Fiston-Lavier, Matthew Carrigan, Dmitri A. Petrov, Josefa González

Nucleic Acids Research, 2010-12-21, issue 6



Similar Datasets

| S-EPMC3562067 | biostudies-literature
| S-EPMC9682801 | biostudies-literature
| S-EPMC3548894 | biostudies-literature
| S-EPMC9234764 | biostudies-literature
| S-EPMC7914406 | biostudies-literature
| S-EPMC4074385 | biostudies-literature
| S-EPMC4009769 | biostudies-literature
| S-EPMC6580563 | biostudies-literature
2017-04-03 | PXD003804 | Pride
| S-EPMC5557969 | biostudies-literature