Dataset Information

Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data.

ABSTRACT: The current rise in the use of open lab notebook techniques means that an increasing number of scientists make chemical information freely and openly available to the entire community as a series of micropublications that are released shortly after the conclusion of each experiment. We propose that this trend be accompanied by a thorough examination of data sharing priorities. We argue that the most significant immediate beneficiary of open data is in fact chemical algorithms, which are capable of absorbing vast quantities of data and using it to present concise insights to working chemists, on a scale that could not be achieved by traditional publication methods. Making this goal practically achievable will require a paradigm shift in the way individual scientists translate their data into digital form, since most contemporary methods of data entry are designed for presentation to humans rather than consumption by machine learning algorithms. We discuss some of the complex issues involved in fixing current methods, as well as some of the immediate benefits that can be gained when open data is published correctly using unambiguous machine-readable formats. Graphical Abstract: Lab notebook entries must target both visualisation by scientists and use by machine learning algorithms.

SUBMITTER: Clark AM

PROVIDER: S-EPMC4369291 | biostudies-other | 2015

REPOSITORIES: biostudies-other

Publications

Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data.

Clark AM, Williams AJ, Ekins S

Journal of Cheminformatics, 2015-03-22

PMID: 25798198

Similar Datasets

Project description: Background: There is no uniform definition for cerebral microdialysis (CMD) probe location with respect to focal brain lesions, and the impact of CMD-probe location on measured molecule concentrations is unclear. Methods: We retrospectively analyzed data of 51 consecutive subarachnoid hemorrhage patients with CMD-monitoring between 2010 and 2016 included in a prospective observational cohort study. Microdialysis probe location was assessed on all brain computed tomography (CT) scans performed during CMD-monitoring and defined as perilesional in the presence of a focal hypodense or hyperdense lesion within a 1-cm radius of the gold tip of the CMD-probe, or otherwise as normal-appearing brain tissue. Results: Probe location was detected in normal-appearing brain tissue on 53/143 (37%) and in perilesional location on 90/143 (63%) CT scans. In the perilesional area, CMD-glucose levels were lower (p = 0.003), whereas CMD-lactate (p = 0.002), CMD-lactate-to-pyruvate-ratio (LPR; p < 0.001), CMD-glutamate (p = 0.002), and CMD-glycerol levels (p < 0.001) were higher. Neuroglucopenia (CMD-glucose < 0.7 mmol/l, p = 0.002), metabolic distress (p = 0.002), and mitochondrial dysfunction (p = 0.005) were more common in perilesional compared to normal-appearing brain tissue. Development of new lesions in the proximity of the CMD-probe (n = 13) was associated with a decrease in CMD-glucose levels, evidence of neuroglucopenia, metabolic distress, as well as increasing CMD-glutamate and CMD-glycerol levels. Neuroglucopenia was associated with poor outcome independent of probe location, whereas elevated CMD-lactate, CMD-LPR, CMD-glutamate, and CMD-glycerol levels were only predictive of poor outcome in normal-appearing brain tissue. Conclusions: Focal brain lesions significantly impact on concentrations of brain metabolites assessed by CMD.
With the exception of CMD-glucose, the prognostic value of CMD-derived parameters seems to be higher when assessed in normal-appearing brain tissue. CMD was sensitive in detecting the development of new focal lesions in the vicinity of the neuromonitoring probe. Probe location should be described in research reporting brain metabolic changes measured by CMD and integrated in statistical models.

Project description: Background: Coronary artery disease distribution along the vessel is a main determinant of FFR improvement after PCI. Distinguishing focal from diffuse disease by visual inspection of the coronary angiogram (CA) and FFR pullback (FFR-PB) is operator-dependent. Computer science may standardize interpretations of such curves. Methods: A virtual stenting algorithm (VSA) was developed to perform an automated FFR-PB curve analysis. A survey analysis of the evaluations of 39 vessels with intermediate disease on CA and a distal FFR <0.8, rated by 5 interventional cardiologists, was performed. Vessel disease distribution and PCI strategy were successively rated based on CA and distal FFR (CA); CA and FFR-PB curve (CA/FFR-PB); and CA and VSA (CA/VSA). Inter-rater reliability was assessed using Fleiss kappa and an agreement analysis of CA/VSA rating with both algorithmic and human evaluation (operator) was performed. We hypothesized that VSA would increase rater agreement in interpretation of epicardial disease distribution and subsequent evaluation of PCI eligibility. Results: Inter-rater reliability in vessel disease assessment by CA, CA/FFR-PB, and CA/VSA was, respectively, 0.32 (95% CI: 0.17-0.47), 0.38 (95% CI: 0.23-0.53), and 0.4 (95% CI: 0.25-0.55). The raters' overall agreement in vessel disease distribution and PCI eligibility was higher with the VSA than with the operator (respectively, 67 vs. 42%, and 80 vs. 70%, both p < 0.05). Compared to CA/FFR-PB, CA/VSA induced more reclassification toward a focal disease (92 vs. 56.2%, p < 0.01) with a trend toward more reclassification as eligible for PCI (70.6 vs. 33%, p = 0.06). Change in PCI strategy did not differ between CA/FFR-PB and CA/VSA (23.6 vs. 28.5%, p = 0.38). Conclusions: VSA is a new program to facilitate and standardize the analysis of FFR pullback curves.
When expert reviewers integrate VSA data, their assessments are less variable, which might help to standardize PCI eligibility and strategy evaluations. Clinical Trial Registration: https://www.clinicaltrials.gov/ct2/show/NCT03824600.
