Project description:Kidney diseases are among the major health burdens worldwide and are linked to high economic costs, mortality, and morbidity. The importance of collecting large quantities of health-related data from human cohorts, what scholars refer to as "big data", has been increasingly recognized, with the establishment of large cohorts and the use of electronic health records (EHRs) in nephrology and transplantation. These data are valuable and can be utilized by researchers to advance knowledge in the field. Furthermore, progress in big data is stimulating the flourishing of artificial intelligence (AI), an excellent tool for handling and processing large amounts of data, which may be applied to extract more information on the effectiveness of treatments for kidney-related complications and to enable more precise phenotyping and outcome prediction. In this article, we discuss the advances and challenges in big data, EHRs, and AI, with particular emphasis on their use in nephrology and transplantation.
Project description:Technological advances in big data (large amounts of highly varied data from many different sources that may be processed rapidly), data science, and artificial intelligence can improve health-system functions and promote personalized care and the public good. However, these technologies will not replace the fundamental components of the health system, such as ethical leadership and governance, nor remove the need for a robust ethical and regulatory environment. In this paper, we discuss what a robust ethical and regulatory environment might look like for big data analytics in health insurance, and describe examples of safeguards and participatory mechanisms that should be established. First, a clear and effective data governance framework is critical. Legal standards need to be enacted, and insurers should be encouraged and given incentives to adopt a human-centred approach in the design and use of big data analytics and artificial intelligence. Second, a clear and accountable process is necessary to explain what information can be used and how. Third, people whose data may be used should be empowered through active involvement in determining how their personal data are managed and governed. Fourth, insurers and governance bodies, including regulators and policy-makers, need to work together to ensure that the big data analytics based on artificial intelligence that are developed are transparent and accurate. Unless an enabling ethical environment is in place, the use of such analytics will likely contribute to the proliferation of unconnected data systems, worsen existing inequalities, and erode trustworthiness and trust.
Project description:SARS-CoV-2 is a novel coronavirus, responsible for the COVID-19 pandemic declared by the World Health Organization. Thanks to the latest advances in molecular and computational techniques and in information and communication technologies (ICTs), artificial intelligence (AI) and Big Data can help handle the huge, unprecedented amount of data derived from public health surveillance, real-time monitoring of epidemic outbreaks, trend nowcasting/forecasting, regular situation briefings and updates from governmental institutions and organizations, and health-facility utilization information. The present review aims to provide an overview of the potential applications of AI and Big Data in the global effort to manage the pandemic.
Project description:Artificial intelligence (AI) is expected to support clinical judgement in medicine. We constructed a new predictive model for diabetic kidney disease (DKD) using AI, processing natural language and longitudinal data with big-data machine learning, based on the electronic medical records (EMR) of 64,059 diabetes patients. The AI extracted raw features from the previous 6 months as the reference period and selected 24 factors, using a convolutional autoencoder to find time-series patterns related to DKD aggravation within 6 months. The AI then constructed the predictive model from 3,073 features, including the time-series data, using logistic regression analysis, and predicted DKD aggravation with 71% accuracy. Furthermore, the group with DKD aggravation had a significantly higher incidence of hemodialysis over 10 years than the non-aggravation group (N = 2,900). This new AI-based predictive model can detect progression of DKD and may contribute to more effective and accurate interventions to reduce hemodialysis.
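A minimal sketch may make the final modelling stage concrete: a logistic regression fitted over a wide feature matrix (static EMR features plus autoencoder-derived time-series pattern features) to predict 6-month DKD aggravation. The data, dimensions, and scikit-learn setup below are illustrative assumptions, not the authors' actual pipeline.

```python
# Illustrative sketch only: synthetic stand-ins for the 64,059-patient,
# 3,073-feature matrix described above. The final classifier is a plain
# L2-regularized logistic regression, as in the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_patients, n_features = 5_000, 300          # stand-ins for 64,059 x 3,073
X = rng.normal(size=(n_patients, n_features))
# Synthetic label loosely driven by a few features, for demonstration only.
y = (X[:, :5].sum(axis=1) + rng.normal(scale=2.0, size=n_patients) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X_tr, y_tr)
print(f"held-out accuracy: {accuracy_score(y_te, clf.predict(X_te)):.2f}")
```

Keeping the final stage as a linear model is a common design choice in clinical prediction: the heavy lifting (pattern extraction) happens upstream, while the classifier itself stays interpretable.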
Project description:Human subject experiments are performed to evaluate the influence of artificial intelligence (AI) process management on human design teams solving a complex engineering problem, and to compare it with the influence of human process management. Participants are grouped into teams of five individuals and asked to generate a drone fleet and plan routes to deliver parcels to a given customer market. The teams are placed under the guidance of either a human or an AI external process manager. Halfway through the experiment, the customer market is changed unexpectedly, requiring teams to adjust their strategy. During the experiment, participants can create, evaluate, and share their drone designs and delivery routes, and communicate with their team through a text chat tool, using a collaborative research platform called HyForm. The research platform collects step-by-step logs of the actions taken by, and communication among, participants in both the design-team roles and the process-manager role. This article presents the data sets collected for 171 participants assigned to 31 design teams: 15 teams under the guidance of an AI agent (five participants each) and 16 teams under the guidance of a human manager (six participants each, the manager included). These data sets can be used for data-driven design, behavioral analyses, sequence-based analyses, and natural language processing.
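As an example of the sequence-based analyses such logs support, the sketch below counts per-participant action-to-action transitions. The (participant, action) tuple format is a hypothetical schema for illustration, not HyForm's actual log structure.

```python
# Hypothetical sequence analysis on step-by-step logs like those described
# above. The log format here is an assumption, not HyForm's actual schema.
from collections import Counter, defaultdict

log = [  # (participant, action) in time order -- made-up sample
    ("p1", "create_design"), ("p1", "evaluate"), ("p2", "chat"),
    ("p1", "share_design"), ("p2", "create_route"), ("p2", "evaluate"),
]

# Count action -> action transitions, tracked separately per participant.
transitions = Counter()
last_action = defaultdict(lambda: None)
for participant, action in log:
    prev = last_action[participant]
    if prev is not None:
        transitions[(prev, action)] += 1
    last_action[participant] = action

for (a, b), n in transitions.most_common():
    print(f"{a} -> {b}: {n}")
```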
Project description:Objective: Modern healthcare systems face challenges in maintaining a stable and sufficient blood supply due to shortages. This study aimed to predict the monthly blood transfusion requirements of medical institutions using an artificial intelligence model based on national open big data related to transfusion. Methods: Data regarding blood types and components in Korea from January 2010 to December 2021 were obtained from the Health Insurance Review and Assessment Service and Statistics Korea; institutional transfusion data were collected from a single medical institution. Using the obtained information, predictive models were developed with eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and category boosting (CatBoost), and an ensemble model was created from these three models. Results: The prediction performance of XGBoost, LGBM, and CatBoost showed a mean absolute error ranging from 14.6657 for AB+ red blood cells (RBCs) to 84.0433 for A+ platelet concentrate (PC), and a root mean squared error ranging from 18.5374 for AB+ RBCs to 118.6245 for B+ PC. The error range was further improved by the ensemble models. The department requesting blood was the most influential feature affecting prediction performance across blood products and types; apart from the department, the influential features varied by product and blood type and included the number of RBC antibody screens, crossmatches, nationwide blood donations, and surgeries. Conclusion: Based on blood-related open big data, the developed blood-demand prediction algorithm can efficiently provide medical facilities with an appropriate volume of blood ahead of time.
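A hedged sketch of the ensemble idea described above: average the predictions of XGBoost, LightGBM, and CatBoost regressors trained on monthly transfusion counts, then report MAE and RMSE as in the Results. The features and data below are synthetic stand-ins; the actual predictors (requesting department, antibody screens, donations, surgeries, etc.) come from the national open data sources.

```python
# Simple averaging ensemble over three gradient-boosting regressors,
# evaluated with the same metrics used in the study (MAE and RMSE).
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(144, 8))               # 12 years of monthly records
y = 50 + 10 * X[:, 0] + rng.normal(scale=5, size=144)  # synthetic demand
X_tr, X_te, y_tr, y_te = X[:120], X[120:], y[:120], y[120:]

models = [
    XGBRegressor(n_estimators=200, random_state=0),
    LGBMRegressor(n_estimators=200, random_state=0),
    CatBoostRegressor(n_estimators=200, random_state=0, verbose=0),
]
preds = np.mean([m.fit(X_tr, y_tr).predict(X_te) for m in models], axis=0)
print("ensemble MAE :", mean_absolute_error(y_te, preds))
print("ensemble RMSE:", np.sqrt(mean_squared_error(y_te, preds)))
```

Averaging heterogeneous boosters is a common way to reduce variance relative to any single model, which is consistent with the improved error range the study reports for its ensemble.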
Project description:We are facing a major challenge in bridging the gap between identifying subtypes of asthma to understand causal mechanisms and translating this knowledge into personalized prevention and management strategies. In recent years, "big data" has been sold as a panacea for generating hypotheses and driving new frontiers of health care; the idea that the data must and will speak for themselves is fast becoming a new dogma. One of the dangers of ready accessibility of health care data and computational tools for data analysis is that the process of data mining can become uncoupled from the scientific process of clinical interpretation, understanding the provenance of the data, and external validation. Although advances in computational methods can be valuable for using unexpected structure in data to generate hypotheses, there remains a need for testing hypotheses and interpreting results with scientific rigor. We argue for combining data- and hypothesis-driven methods in a careful synergy, and the importance of carefully characterized birth and patient cohorts with genetic, phenotypic, biological, and molecular data in this process cannot be overemphasized. The main challenge on the road ahead is to harness bigger health care data in ways that produce meaningful clinical interpretation and to translate this into better diagnoses and properly personalized prevention and treatment plans. There is a pressing need for cross-disciplinary research with an integrative approach to data science, whereby basic scientists, clinicians, data analysts, and epidemiologists work together to understand the heterogeneity of asthma.
Project description:Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a surface response relating prediction R-sq. to sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23-0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values that ranged from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Much larger data sets will soon become available. Using the estimated surface response, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed.
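The plateau idea can be illustrated with a toy calculation. The sketch below fits a simple saturating curve, R-sq.(n) = plateau * n / (n + n_half), to hypothetical (sample size, prediction R-sq.) pairs and reads off the asymptote. This is not the authors' Bayesian surface-response method, and all numbers are made up for demonstration.

```python
# Toy illustration of a prediction R-squared plateau as sample size grows.
import numpy as np
from scipy.optimize import curve_fit

def r2_curve(n, plateau, n_half):
    """Monotone saturating response: R2 -> plateau as n -> infinity."""
    return plateau * n / (n + n_half)

n_obs = np.array([5_000, 10_000, 20_000, 40_000, 80_000], dtype=float)
r2_obs = np.array([0.08, 0.13, 0.18, 0.21, 0.24])   # hypothetical points

(plateau, n_half), _ = curve_fit(r2_curve, n_obs, r2_obs, p0=[0.3, 20_000])
print(f"estimated plateau R2 ~ {plateau:.2f} (half-saturation n ~ {n_half:,.0f})")
```

Under such a curve, prediction R-sq. keeps improving with more records but approaches a ceiling below the trait heritability, which mirrors the study's conclusion that the gap narrows without fully closing.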