Project description:
Background: Policy makers, clinicians and researchers are demonstrating increasing interest in using data linked from multiple sources to support measurement of clinical performance and patient health outcomes. However, the utility of data linkage may be compromised by sub-optimal or incomplete linkage, leading to systematic bias. In this study, we synthesize the evidence identifying participant or population characteristics that can influence the validity and completeness of data linkage and may be associated with systematic bias in reported outcomes.
Methods: A narrative review, using structured search methods, was undertaken. The keywords "data linkage" and the MeSH term "medical record linkage" were applied to the Medline, EMBASE and CINAHL databases for the period 1991 to 2007. Abstract inclusion criteria were: the article attempted an empirical evaluation of methodological issues relating to data linkage and reported on patient characteristics; the study design included analysis of matched versus unmatched records; and the report was in English. Included articles were grouped thematically according to the patient characteristics that were compared between matched and unmatched records.
Results: The search identified 1810 articles, of which 33 (1.8%) met the inclusion criteria. There was marked heterogeneity in study methods and in the factors investigated. Characteristics that were unevenly distributed between matched and unmatched records were: age (72% of studies), sex (50% of studies), race (64% of studies), geographical/hospital site (93% of studies), socio-economic status (82% of studies) and health status (72% of studies).
Conclusion: A number of relevant patient or population factors may be associated with incomplete data linkage, resulting in systematic bias in reported clinical outcomes. Readers should consider these factors when interpreting the reported results of data linkage studies.
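The matched-versus-unmatched design shared by the included studies can be illustrated with a simple contingency-table test; the sketch below uses invented counts, not data from the review, and only shows the shape of the analysis:

```python
# Test whether a characteristic (here, sex) is evenly distributed
# between records that linked and records that failed to link.
# Counts are invented for illustration.
from scipy.stats import chi2_contingency

#         female  male
table = [[4200, 3900],   # matched (successfully linked) records
         [310,   510]]   # unmatched records

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p={p:.3g}")
# A small p-value suggests linkage failure is not random with respect
# to sex, i.e. a potential source of systematic bias in linked outcomes.
```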
Project description: A human cell is a precisely regulated system that relies on the complex interaction of molecules. Structural insights into the cellular machinery at the atomic level allow us to understand the underlying regulatory mechanisms and provide us with a roadmap for the development of novel drugs to fight diseases. Facilitated by recent technological breakthroughs, the Nobel prize-winning technique electron cryomicroscopy (cryo-EM) has become a versatile and extremely powerful tool that routinely solves three-dimensional protein structures at near-atomic resolution. Consequently, it has become a focus of attention for structure-based drug design. In this review, we describe the basics of cryo-EM and highlight its growing role in biomedical research. Furthermore, we discuss the latest developments as well as future perspectives.
Project description:
Background: The use of real-world data has become increasingly popular, also in the field of infectious disease (ID), particularly since the COVID-19 pandemic emerged. While much useful data for research is being collected, these data are generally stored across different sources. Privacy concerns limit the possibility of storing the data centrally, thereby also limiting the possibility of fully leveraging the potential power of combined data. Federated learning (FL) has been suggested as a way to overcome privacy issues by making it possible to perform research on data from various sources without those data leaving local servers. In this review, we discuss existing applications of FL in ID research, as well as the most relevant opportunities and challenges of this method.
Methods: References for this review were identified through searches of MEDLINE/PubMed, Google Scholar, Embase and Scopus until July 2023. We searched for studies using FL in different applications related to ID.
Results: Thirty references were included and divided into four sub-topics: disease screening, prediction of clinical outcomes, infection epidemiology, and vaccine research. Most research was related to COVID-19. In all studies, FL achieved good accuracy when predicting diseases and outcomes, also in comparison to non-federated methods. However, most studies did not make use of real-world federated data, but rather showed the potential of FL by using data that were manually partitioned.
Conclusions: FL is a promising methodology which allows using data from several sources, potentially generating stronger and more generalisable results. However, further exploration of FL application possibilities in ID research is needed.
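The privacy-preserving mechanism at the heart of FL can be illustrated with federated averaging (FedAvg), one of the standard FL algorithms: each site trains on its own data and shares only model parameters, which a central server averages. Below is a minimal self-contained sketch with simulated sites; it is not code from any of the reviewed studies:

```python
# Minimal federated-averaging (FedAvg) sketch: each site fits a model on
# its own data and only the model parameters leave the site; a central
# server averages them. Sites and data here are simulated placeholders.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's logistic-regression update; raw X, y never leave the site."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)       # gradient step
    return w

rng = np.random.default_rng(0)
sites = [(rng.normal(size=(200, 5)), rng.integers(0, 2, 200)) for _ in range(3)]

global_w = np.zeros(5)
for _round in range(20):                       # communication rounds
    local_ws = [local_update(global_w, X, y) for X, y in sites]
    global_w = np.mean(local_ws, axis=0)       # server-side averaging
print(global_w)
```

Production systems typically weight the average by each site's sample size and add protections such as secure aggregation.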
Project description:
Objective: The study sought to design, pilot, and evaluate a federated data completeness tracking system (CTX) for assessing completeness in research data extracted from electronic health record data across the Accessible Research Commons for Health (ARCH) Clinical Data Research Network.
Materials and methods: The CTX applies a systems-based approach to design workflow and technology for assessing completeness across distributed electronic health record data repositories participating in a queryable, federated network. The CTX invokes 2 positive feedback loops that utilize open source tools (DQe-c and Vue) to integrate technology and human actors in a system geared toward increasing capacity and taking action. A pilot implementation of the system involved 6 ARCH partner sites between January 2017 and May 2018.
Results: The ARCH CTX has enabled the network to monitor and, if needed, adjust its data management processes to maintain complete datasets for secondary use. The system allows the network and its partner sites to profile data completeness at both the network and partner-site levels. Interactive visualizations presenting the current state of completeness in the context of the entire network, as well as changes in completeness across time, were valued among the CTX user base.
Discussion: Distributed clinical data networks are complex systems. Top-down approaches that rely solely on technology to report data completeness may be necessary but not sufficient for improving the completeness (and quality) of data in large-scale clinical data networks. Improving and maintaining complete (high-quality) data in such complex environments entails sociotechnical systems that exploit technology and empower human actors to engage in the process of high-quality data curation.
Conclusions: The CTX has increased the network's capacity to rapidly identify data completeness issues and empowered ARCH partner sites to get involved in improving the completeness of the data in their respective repositories.
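Completeness profiling of the kind the CTX reports reduces, at its simplest, to per-field missingness rates computed locally at each site and then compared across the network. The sketch below is illustrative only; the table, column and site names are invented and do not reflect the ARCH schema or DQe-c's actual output:

```python
# Per-field completeness profile of the kind a federated tracker
# aggregates: each site computes it locally, only the summary travels.
# Column and site names are illustrative, not the ARCH/DQe-c schema.
import pandas as pd

def completeness_profile(df: pd.DataFrame, site: str) -> pd.DataFrame:
    """Fraction of non-missing values per column for one site."""
    prof = df.notna().mean().rename("completeness").to_frame().reset_index()
    prof = prof.rename(columns={"index": "field"})
    prof["site"] = site
    return prof

site_a = pd.DataFrame({"birth_date": ["1980-01-02", None, "1975-07-30"],
                       "sex": ["F", "M", None]})
site_b = pd.DataFrame({"birth_date": ["1990-03-14", "1961-11-09"],
                       "sex": ["M", "M"]})

network = pd.concat([completeness_profile(site_a, "A"),
                     completeness_profile(site_b, "B")])
print(network.pivot(index="field", columns="site", values="completeness"))
```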
Project description: Since the first great oxygenation event, photosynthetic microorganisms have continuously shaped the Earth's atmosphere. Studying the biological mechanisms involved in the interaction of microalgae and cyanobacteria with the Earth's atmosphere requires the monitoring of gas exchange. Membrane inlet mass spectrometry (MIMS) was developed in the early 1960s to study gas exchange mechanisms of photosynthetic cells. It has since played an important role in investigating various cellular processes that involve gaseous compounds (O2, CO2, NO, or H2) and in characterizing enzymatic activities in vitro or in vivo. With the development of affordable mass spectrometers, MIMS is gaining wide popularity and is now used by an increasing number of laboratories. However, its use still demands a solid grasp of the underlying theory and of practical considerations. Here, we provide a practical guide describing the current technical basis of a MIMS setup and the general principles of data processing. We further review how MIMS can be used to study various aspects of algal research and discuss how MIMS will be useful in addressing future scientific challenges.
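A recurring data-processing step in MIMS experiments is converting a calibrated ion-current trace into a gas exchange rate, which reduces to a linear fit over a measurement window. The sketch below is schematic: the one-point air-saturation calibration, the m/z assignment, and all numbers are assumptions for illustration, not values from the guide:

```python
# Schematic MIMS processing: calibrate the m/z = 32 ion current against
# a known O2 concentration, then estimate the O2 exchange rate as the
# slope of concentration versus time. All values are invented.
import numpy as np

t = np.linspace(0, 300, 60)                  # time (s)
signal = 2.0e-11 + 1.5e-14 * t               # mock raw ion current (A)

air_sat_signal = 2.1e-11                     # current in air-saturated medium (A)
air_sat_o2 = 250.0                           # O2 at air saturation (µM)
conc = signal * air_sat_o2 / air_sat_signal  # calibrated concentration (µM)

slope, _ = np.polyfit(t, conc, 1)            # µM per second
print(f"net O2 evolution rate: {slope * 60:.2f} µM/min")
# Real analyses also correct for gas consumption by the inlet membrane.
```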
Project description:
Background: DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which is available in public databases such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data using gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures in a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to help researchers utilize public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating co-expression networks from a publicly available database.
Results: GEM-TREND, a web tool for searching gene expression data, allows users to query GEO with gene-expression signatures or gene expression ratio data and retrieve gene expression data by comparing gene-expression patterns between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern-matching approach of Lamb et al. (Science 2006), with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from GEO and with in-house microarray data. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories.
Conclusion: GEM-TREND was developed to retrieve gene expression data by comparing a query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.
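The rank-based matching that GEM-TREND builds on can be outlined as follows: rank all genes in a reference profile by expression change, then score how strongly the query's up-regulated genes cluster near the top and its down-regulated genes near the bottom. The sketch below is a simplified Kolmogorov-Smirnov-style score in the spirit of Lamb et al., not GEM-TREND's exact implementation:

```python
# Simplified connectivity score in the spirit of Lamb et al. (2006):
# rank genes in a reference profile, then measure whether the query's
# up-genes cluster near the top and down-genes near the bottom.
# Illustrative reimplementation, not GEM-TREND's actual code.
import numpy as np

def ks_score(tag_positions, n_genes):
    """Signed Kolmogorov-Smirnov-style enrichment of tag ranks."""
    pos = np.sort(np.asarray(tag_positions))   # ranks, 1 = most up-regulated
    t = len(pos)
    a = np.max(np.arange(1, t + 1) / t - pos / n_genes)
    b = np.max(pos / n_genes - (np.arange(1, t + 1) - 1) / t)
    return a if a > b else -b

def connectivity(ranked_genes, up_tags, down_tags):
    rank = {g: i + 1 for i, g in enumerate(ranked_genes)}
    n = len(ranked_genes)
    ks_up = ks_score([rank[g] for g in up_tags], n)
    ks_down = ks_score([rank[g] for g in down_tags], n)
    # Score is zero when both tag sets fall on the same side of the profile.
    return 0.0 if np.sign(ks_up) == np.sign(ks_down) else ks_up - ks_down

# Toy reference profile ordered from most up- to most down-regulated.
ref = [f"g{i}" for i in range(1, 101)]
print(connectivity(ref, up_tags=["g2", "g5", "g9"], down_tags=["g95", "g98"]))
```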
Project description:
Background: Commonly, several traits are assessed in agronomic experiments to better understand the factors under study. However, it is also common to see that even when several traits are available, researchers opt for the easiest route, applying univariate analyses and post-hoc tests for mean comparison to each trait, which raises the hypothesis that the benefits of a multi-trait analysis framework may not have been fully exploited in this area.
Results: In this paper, we extended the theoretical foundations of the multi-trait genotype-ideotype distance index (MGIDI) to analyze multivariate data from either simple experiments (e.g., a one-way layout with few treatments and traits) or complex experiments (e.g., with a factorial treatment structure). We proposed an optional weighting process that makes treatments that stand out in traits with higher weights more likely to be ranked first. Its application is illustrated using (1) simulated data and (2) real data from a strawberry experiment that aims to select better factor combinations (namely, cultivar, transplant origin, and substrate mixture) based on the desired performance of 22 phenological, productive, physiological, and qualitative traits. Our results show that most of the strawberry traits are influenced by cultivar, transplant origin and cultivation substrate, as well as by the interaction between cultivar and transplant origin. The MGIDI ranked the Albion cultivar originating from Imported transplants and the Camarosa cultivar originating from National transplants as the better factor combinations. The substrates with burned rice husk as the main component (70%) showed satisfactory physical properties, providing higher water use efficiency. The strengths-and-weaknesses view provided by the MGIDI revealed that the search for an ideal treatment should direct efforts toward increasing the fruit production of Albion transplants of Imported origin. On the other hand, this treatment has strengths related to productive precocity, total soluble solids, and flesh firmness.
Conclusions: Overall, this study opens the door to the use of the MGIDI beyond the plant breeding context, providing a unique, practical, robust, and easy-to-handle multi-trait framework to analyze multivariate data. There is an exciting possibility for this to open up new avenues of research, mainly because using the MGIDI in future studies will dramatically reduce the number of tables/figures needed, serving as a powerful tool to guide researchers toward better treatment recommendations.
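The distance idea behind the index can be sketched as follows. The published MGIDI rescales each trait to 0-100 (100 = desired direction), performs factor analysis on the rescaled traits, and measures each treatment's Euclidean distance to the ideotype that scores 100 on every trait; the authors' implementation is available in the R package metan. The simplified Python sketch below skips the factor-analysis step and works directly on rescaled, optionally weighted traits, so it illustrates the geometry rather than reproducing the index:

```python
# Simplified genotype-ideotype distance in the spirit of the MGIDI.
# The published index applies factor analysis to the rescaled traits;
# this sketch works on the rescaled traits directly.
import numpy as np

def rescale(X, goal):
    """Rescale each trait to 0-100 so that 100 is always the desired direction.
    goal[j] = +1 if higher values of trait j are better, -1 if lower are better."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    R = 100 * (X - lo) / (hi - lo)
    return np.where(np.asarray(goal) > 0, R, 100 - R)

def gidi(X, goal, weights=None):
    """Distance of each treatment to the ideotype (100 on every rescaled trait).
    Smaller = closer to ideal; weights emphasize chosen traits."""
    R = rescale(np.asarray(X, float), goal)
    w = np.ones(R.shape[1]) if weights is None else np.asarray(weights, float)
    return np.sqrt(((w * (R - 100)) ** 2).sum(axis=1))

# Toy data: 4 treatments x 3 traits (yield up, firmness up, disease score down).
X = [[30, 8, 2], [45, 6, 5], [40, 9, 1], [25, 7, 4]]
d = gidi(X, goal=[+1, +1, -1], weights=[2, 1, 1])  # extra weight on yield
print(np.argsort(d))  # treatments ranked from closest to farthest from ideal
```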
Project description: Among the many benefits of the Human Genome Project are new and powerful tools such as the genome-wide hybridization devices referred to as microarrays. Initially designed to measure gene transcription levels, microarray technologies are now used for comparing other genome features among individuals and their tissues and cells. The results provide valuable information on disease subcategories, disease prognosis, and treatment outcome. Likewise, they reveal differences in genetic makeup, regulatory mechanisms, and subtle variations, and move us closer to the era of personalized medicine. To explain this powerful tool, its versatility, and how dramatically it is changing the molecular approach to biomedical and clinical research, this review describes the technology, its applications, a didactic step-by-step review of a typical microarray protocol, and a real experiment. Finally, it calls the attention of the medical community to the importance of integrating multidisciplinary teams to take advantage of this technology and its expanding applications, which, on a single slide, reveal our genetic inheritance and destiny.
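One step that a didactic microarray protocol typically walks through, turning raw channel intensities into a list of differentially expressed genes, can be sketched as follows; the data are simulated and the thresholds are arbitrary choices, not values from the review:

```python
# Toy sketch of a basic microarray analysis step: log2 expression ratios
# (e.g., disease vs. control channels) followed by a per-gene t-test
# across replicate arrays. Values are simulated, not from the review.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_genes, n_reps = 1000, 4
control = rng.lognormal(mean=7, sigma=1, size=(n_genes, n_reps))
disease = control * rng.lognormal(mean=0, sigma=0.2, size=(n_genes, n_reps))
disease[:20] *= 4                                 # spike in 20 up-regulated genes

log_ratio = np.log2(disease) - np.log2(control)   # per-array log2 fold change
t, p = stats.ttest_1samp(log_ratio, 0.0, axis=1)  # does the ratio differ from 0?
hits = np.where((p < 0.001) & (np.abs(log_ratio.mean(axis=1)) > 1))[0]
print(f"{len(hits)} candidate differentially expressed genes")
```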
Project description: Protein biomarkers offer major benefits for the diagnosis and monitoring of disease processes. Recent advances in protein mass spectrometry make it feasible to use this very sensitive technology to detect and quantify proteins in blood. To explore the potential of blood biomarkers, we conducted a thorough review to evaluate the reliability of data in the literature and to determine the spectrum of proteins reported to exist in blood, with the goal of creating a Federated Database of Blood Proteins (FDBP). A unique feature of our approach is the use of a SQL database for all of the peptide data; the power of the SQL database, combined with standard informatic algorithms such as BLAST and the Statistical Analysis System (SAS), allowed rapid annotation and analysis of the database without the need to create special programs to manage the data. Our mathematical analysis and review show that, in addition to the usual secreted proteins found in blood, there are many reports of intracellular proteins, with good agreement on transcription factors and DNA remodelling factors as well as on cellular receptors and their signal transduction enzymes. Overall, we have catalogued about 12,130 proteins identified by at least one unique peptide, of which 3,858 have 3 or more peptide correlations. The FDBP, with annotations, should facilitate testing blood for specific disease biomarkers.
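The SQL-centred analysis described, a single peptide-evidence table queried with standard SQL rather than bespoke programs, might look like the following minimal sqlite sketch; the schema, accessions and data are invented for illustration and are not the FDBP's own:

```python
# Minimal sketch of a peptide-evidence table and the kind of SQL query
# that yields counts such as "proteins with 3 or more unique peptides".
# Schema and data are invented, not the FDBP's actual design.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE peptide_hit (
                   protein_acc  TEXT,   -- protein accession
                   peptide_seq  TEXT,   -- identified peptide sequence
                   source_study TEXT)""")
con.executemany("INSERT INTO peptide_hit VALUES (?, ?, ?)", [
    ("P02768", "LVNEVTEFAK", "study1"), ("P02768", "YLYEIAR", "study1"),
    ("P02768", "AEFAEVSK", "study2"),   ("Q9Y6K9", "TLSDYNIQK", "study1"),
])

rows = con.execute("""SELECT protein_acc, COUNT(DISTINCT peptide_seq) AS n_pep
                      FROM peptide_hit
                      GROUP BY protein_acc
                      HAVING COUNT(DISTINCT peptide_seq) >= 3""").fetchall()
print(rows)   # proteins supported by >= 3 unique peptides
```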
Project description:
Background: New approaches and tools were needed to support the strategic planning, implementation and management of a Program launched by the Brazilian Government to fund research, development and capacity building on neglected tropical diseases, with a strong focus on the North, Northeast and Center-West regions of the country, where these diseases are prevalent.
Methodology/principal findings: Based on demographic, epidemiological and burden of disease data, seven diseases were selected by the Ministry of Health as targets of the initiative. Publications on these diseases by Brazilian researchers were retrieved from international databases, then analyzed and processed with text-mining tools in order to standardize author and institution names and addresses. Co-authorship networks based on these publications were assembled, visualized and analyzed with social network analysis software packages. Network visualization and analysis generated new information, allowing better design and strategic planning of the Program and enabling decision makers to characterize network components by area of work, identify institutions and authors playing major roles as central hubs or located at critical network cut-points, and readily detect authors or institutions participating in large international scientific collaboration networks.
Conclusions/significance: Traditional criteria used to monitor and evaluate research proposals or R&D programs, such as researchers' productivity and the impact factor of scientific publications, are of limited value when addressing research areas of low productivity or involving institutions from endemic regions where human resources are limited. Network analysis was found to generate new and valuable information relevant to the strategic planning, implementation and monitoring of the Program. It afforded the funding agencies a more proactive role with respect to public health and equity goals and scientific capacity-building objectives, and enabled a more consistent engagement of institutions and authors from endemic regions, based on innovative criteria and parameters anchored in objective scientific data.
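The co-authorship analysis described maps directly onto standard graph operations: authors are nodes, co-authoring a paper creates edges, hubs are high-degree nodes, and cut-points are articulation points. A small sketch with invented author names:

```python
# Build a co-authorship network from per-paper author lists, then find
# hubs (high degree) and cut-points (articulation points). Author lists
# are invented placeholders, not data from the Program's databases.
from itertools import combinations
import networkx as nx

papers = [["Silva", "Souza", "Lima"],
          ["Silva", "Pereira"],
          ["Pereira", "Costa", "Oliveira"],
          ["Costa", "Oliveira"]]

G = nx.Graph()
for authors in papers:
    for a, b in combinations(authors, 2):   # every co-author pair is an edge
        G.add_edge(a, b)

hubs = sorted(G.degree, key=lambda x: -x[1])[:3]
print("top hubs:", hubs)
print("cut-points:", list(nx.articulation_points(G)))
# Removing a cut-point author disconnects part of the collaboration network.
```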