Ontology highlight
ABSTRACT: Premise
Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6-18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery.Methods
We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a 'mega353' target file, with each locus represented by 17-373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets.Results
Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%.Discussion
Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.
SUBMITTER: McLay TGB
PROVIDER: S-EPMC8312740 | biostudies-literature |
REPOSITORIES: biostudies-literature