Dataset Information

GMASS: a novel measure for genome assembly structural similarity.


ABSTRACT:

Background

Thanks to recent advances in next-generation sequencing (NGS) technologies, a large amount of genomic data, in the form of short DNA sequences known as reads, has been accumulating. Diverse assemblers have been developed to generate high-quality de novo assemblies from NGS reads, but their outputs differ substantially because of algorithmic differences. However, there are no well-structured measures for showing the similarity or difference between assemblies.

Results

We developed a new measure, called the GMASS score, for comparing two genome assemblies in terms of their structure. The GMASS score is based on the distribution pattern of the number and coverage of similar regions between a pair of assemblies. When evaluated on simulated assembly datasets, the new measure was able to show structural similarity between assemblies. Applying the GMASS score to assemblies in recently published benchmark datasets showed both the divergent performance of current assemblers and the score's ability to compare assemblies.
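
To make the two quantities named above concrete, the following is a minimal sketch of how the number and coverage of similar regions could be tallied for one assembly. It is not the GMASS score itself, whose formula is not given in this abstract. It assumes the similar regions are already available as half-open coordinate intervals on a single sequence, and the names merge_intervals and region_stats are hypothetical.

```python
from typing import List, Tuple

# Assumption: each similar region is a half-open interval [start, end)
# on one assembly, e.g. taken from a pairwise alignment.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def region_stats(intervals: List[Interval], assembly_length: int) -> Tuple[int, float]:
    """Return (number of merged similar regions, fraction of the assembly covered)."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    merged = merge_intervals(intervals)
    covered = sum(end - start for start, end in merged)
    return len(merged), covered / assembly_length


if __name__ == "__main__":
    # Illustrative coordinates only, not data from the paper.
    regions = [(0, 100), (50, 200), (300, 400)]
    count, coverage = region_stats(regions, assembly_length=1000)
    print(count, coverage)
```

Merging before summing keeps overlapping alignments from inflating coverage. How such counts and coverages are combined across a pair of assemblies into a single score is left to the paper.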

Conclusion

The GMASS score is a novel measure for representing structural similarity between two assemblies. It will contribute to the understanding of assembly output and to the development of de novo assemblers.

SUBMITTER: Kwon D 

PROVIDER: S-EPMC6423833 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

GMASS: a novel measure for genome assembly structural similarity.

Kwon Daehong, Lee Jongin, Kim Jaebum

BMC Bioinformatics, 2019-03-18 (issue 1)


Similar Datasets

| S-EPMC11225810 | biostudies-literature
| S-EPMC5016231 | biostudies-literature
| S-EPMC4983430 | biostudies-literature
2023-01-05 | PXD027791 | Pride
| S-EPMC8556919 | biostudies-literature
| S-EPMC7435890 | biostudies-literature
| S-EPMC2697648 | biostudies-literature
| S-EPMC4352269 | biostudies-literature
| S-EPMC3663825 | biostudies-other
| S-EPMC9913042 | biostudies-literature