Dataset Information

The State of Software for Evolutionary Biology.

ABSTRACT: With Next Generation Sequencing data being routinely used, evolutionary biology is transforming into a computational science. Thus, researchers have to rely on a growing number of increasingly complex software tools. All widely used core tools in the field have grown considerably in terms of the number of features as well as lines of code and, consequently, also with respect to software complexity. A topic that has received little attention is the software engineering quality of widely used core analysis tools. Software developers appear to rarely assess the quality of their code, and this can have negative consequences for end-users. To this end, we assessed the code quality of 16 highly cited and compute-intensive tools, mainly written in C/C++ (e.g., MrBayes, MAFFT, SweepFinder) and Java (BEAST), from the broader area of evolutionary biology that are routinely used in current data analysis pipelines. Because the software engineering quality of the tools we analyzed is rather unsatisfying, we provide a list of best practices for improving the quality of existing tools and list techniques that can be deployed for developing reliable, high-quality scientific software from scratch. Finally, we also discuss journal as well as science policy and, more importantly, funding issues that need to be addressed for improving software engineering quality as well as ensuring support for developing new and maintaining existing software. Our intention is to raise the awareness of the community regarding software engineering quality issues and to emphasize the substantial lack of funding for scientific software development.

SUBMITTER: Darriba D

PROVIDER: S-EPMC5913673 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature

Publications

The State of Software for Evolutionary Biology.

Darriba D, Flouri T, Stamatakis A

Molecular Biology and Evolution, 2018 May 1, issue 5

PMID: 29385525

Similar Datasets

Project description: We provide the first analysis and synthesis of the evolutionary and mechanistic bases for risk of endometriosis in humans, structured around Niko Tinbergen's four questions about phenotypes: phylogenetic history, development, mechanism and adaptive significance. Endometriosis, which is characterized by the proliferation of endometrial tissue outside of the uterus, has its phylogenetic roots in the evolution of three causally linked traits: (1) highly invasive placentation, (2) spontaneous rather than implantation-driven endometrial decidualization and (3) frequent extensive estrogen-driven endometrial proliferation and inflammation, followed by heavy menstrual bleeding. Endometriosis is potentiated by these traits and appears to be driven, proximately, by relatively low levels of prenatal and postnatal testosterone. Testosterone affects the developing hypothalamic-pituitary-ovarian (HPO) axis, and at low levels, it can result in an altered trajectory of reproductive and physiological phenotypes that in extreme cases can mediate the symptoms of endometriosis. Polycystic ovary syndrome, by contrast, is known from previous work to be caused primarily by high prenatal and postnatal testosterone, and it demonstrates a set of phenotypes opposite to those found in endometriosis. The hypothesis that endometriosis risk is driven by low prenatal testosterone, and involves extreme expression of some reproductive phenotypes, is supported by a suite of evidence from genetics, development, endocrinology, morphology and life history. The hypothesis also provides insights into why these two diametric, fitness-reducing disorders are maintained at such high frequencies in human populations. Finally, the hypotheses described and evaluated here lead to numerous testable predictions and have direct implications for the treatment and study of endometriosis.

Lay summary: Endometriosis is caused by endometrial tissue outside of the uterus. We explain why and how humans are vulnerable to this disease, and new perspectives on understanding and treating it. Endometriosis shows evidence of being caused in part by relatively low testosterone during fetal development, that 'programs' female reproductive development. By contrast, polycystic ovary syndrome is associated with relatively high testosterone in prenatal development. These two disorders can thus be seen as 'opposite' to one another in their major causes and correlates. Important new insights regarding diagnosis, study and treatment of endometriosis follow from these considerations.
