Project description:Kidney diseases are among the major health burdens worldwide and are associated with high economic costs, mortality, and morbidity. The importance of collecting large quantities of health-related data from human cohorts, what scholars refer to as "big data", has been increasingly recognized, with the establishment of large cohorts and the adoption of electronic health records (EHRs) in nephrology and transplantation. These data are valuable and can be used by researchers to advance knowledge in the field. Furthermore, progress in big data is fueling the growth of artificial intelligence (AI), which is well suited to handling and processing large volumes of data and may reveal more about the effectiveness of treatments for kidney-related complications, enabling more precise phenotyping and outcome prediction. In this article, we discuss advances and challenges in big data, EHRs, and AI, with particular emphasis on their use in nephrology and transplantation.
Project description:Technological advances in big data (large amounts of highly varied data from many different sources that may be processed rapidly), data sciences and artificial intelligence can improve health-system functions and promote personalized care and public good. However, these technologies will not replace the fundamental components of the health system, such as ethical leadership and governance, or avoid the need for a robust ethical and regulatory environment. In this paper, we discuss what a robust ethical and regulatory environment might look like for big data analytics in health insurance, and describe examples of safeguards and participatory mechanisms that should be established. First, a clear and effective data governance framework is critical. Legal standards need to be enacted and insurers should be encouraged and given incentives to adopt a human-centred approach in the design and use of big data analytics and artificial intelligence. Second, a clear and accountable process is necessary to explain what information can be used and how it can be used. Third, people whose data may be used should be empowered through their active involvement in determining how their personal data may be managed and governed. Fourth, insurers and governance bodies, including regulators and policy-makers, need to work together to ensure that the big data analytics based on artificial intelligence that are developed are transparent and accurate. Unless an enabling ethical environment is in place, the use of such analytics will likely contribute to the proliferation of unconnected data systems, worsen existing inequalities, and erode trustworthiness and trust.
Project description:SARS-CoV-2 is a novel coronavirus responsible for the COVID-19 pandemic declared by the World Health Organization. Thanks to the latest advances in molecular and computational techniques and in information and communication technologies (ICTs), artificial intelligence (AI) and Big Data can help handle the huge, unprecedented amount of data derived from public health surveillance, real-time epidemic outbreak monitoring, trend now-casting/forecasting, regular situation briefings and updates from governmental institutions and bodies, and health facility utilization information. The present review provides an overview of the potential applications of AI and Big Data in the global effort to manage the pandemic.
Project description:Artificial intelligence (AI) is expected to support clinical judgement in medicine. We constructed a new predictive model for diabetic kidney disease (DKD) using AI, processing natural language and longitudinal data with big-data machine learning, based on the electronic medical records (EMR) of 64,059 diabetes patients. The AI extracted raw features from the previous 6 months as the reference period and selected 24 factors, using a convolutional autoencoder to find time-series patterns related to 6-month DKD aggravation. It then constructed the predictive model from 3,073 features, including the time-series data, using logistic regression, and predicted DKD aggravation with 71% accuracy. Furthermore, the group with DKD aggravation had a significantly higher incidence of hemodialysis than the non-aggravation group over 10 years (n = 2,900). The new AI-based predictive model can detect progression of DKD and may contribute to more effective and accurate interventions to reduce hemodialysis.
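The description above outlines a two-stage pipeline: a convolutional autoencoder compresses longitudinal (time-series) features, and a logistic regression model then predicts aggravation from the learned representation. The sketch below illustrates that general pattern on simulated data; it is not the authors' implementation, and all array shapes, layer sizes, and variable names are illustrative assumptions.

```python
# Minimal sketch of the two-stage idea described above (convolutional
# autoencoder for time-series feature extraction, then logistic regression).
# All shapes, hyperparameters, and names are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_patients, n_channels, n_timesteps = 1000, 24, 6   # e.g. 24 factors over 6 months
X = rng.normal(size=(n_patients, n_channels, n_timesteps)).astype("float32")
y = (X[:, 0, :].mean(axis=1) + 0.5 * rng.normal(size=n_patients) > 0).astype(int)

class ConvAutoencoder(nn.Module):
    """1D convolutional autoencoder; the bottleneck serves as the learned features."""
    def __init__(self, channels, latent=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * n_timesteps, latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32 * n_timesteps), nn.ReLU(),
            nn.Unflatten(1, (32, n_timesteps)),
            nn.Conv1d(32, channels, kernel_size=3, padding=1))
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = ConvAutoencoder(n_channels)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xt = torch.from_numpy(X)
for _ in range(50):                          # brief unsupervised training
    recon, _ = model(xt)
    loss = nn.functional.mse_loss(recon, xt)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                        # encoded time-series features
    _, Z = model(xt)
features = Z.numpy()

X_tr, X_te, y_tr, y_te = train_test_split(features, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("toy accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```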
Project description:Objective: Modern healthcare systems face challenges in maintaining a stable and sufficient blood supply due to shortages. This study aimed to predict the monthly blood transfusion requirements of medical institutions using an artificial intelligence model based on national open big data related to transfusion. Methods: Data regarding blood types and components in Korea from January 2010 to December 2021 were obtained from the Health Insurance Review and Assessment Service and Statistics Korea. The data were collected from a single medical institution. Using the obtained information, predictive models were developed, including eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and category boosting (CatBoost), and an ensemble model was created from these three models. Results: XGBoost, LGBM, and CatBoost achieved a mean absolute error ranging from 14.6657 for AB+ red blood cells (RBCs) to 84.0433 for A+ platelet concentrate (PC), and a root mean squared error ranging from 18.5374 for AB+ RBCs to 118.6245 for B+ PC. The error range was further improved by the ensemble models. The department requesting blood was the most influential parameter for transfusion prediction across blood products and types; beyond the department, the features that affected prediction performance varied by product and blood type, and included the number of RBC antibody screens, crossmatches, nationwide blood donations, and surgeries. Conclusion: Based on blood-related open big data, the developed blood-demand prediction algorithm can efficiently provide medical facilities with an appropriate volume of blood ahead of time.
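The ensemble described above combines three gradient-boosting regressors. A minimal sketch of that approach on simulated monthly counts is shown below; the feature names, hyperparameters, and simple prediction averaging are illustrative assumptions rather than the study's actual pipeline.

```python
# Minimal sketch of an XGBoost/LGBM/CatBoost ensemble for monthly demand
# regression, evaluated with MAE and RMSE. Feature names, hyperparameters,
# and the simple averaging scheme are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

rng = np.random.default_rng(42)
n = 600
X = np.column_stack([
    rng.integers(0, 20, n),      # e.g. requesting-department code
    rng.integers(0, 500, n),     # e.g. RBC antibody screens per month
    rng.integers(0, 800, n),     # e.g. crossmatches per month
    rng.integers(0, 3000, n),    # e.g. surgeries per month
]).astype(float)
y = 0.1 * X[:, 1] + 0.05 * X[:, 3] + rng.normal(0, 5, n)   # toy monthly demand

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "XGBoost": XGBRegressor(n_estimators=300, learning_rate=0.05),
    "LGBM": LGBMRegressor(n_estimators=300, learning_rate=0.05),
    "CatBoost": CatBoostRegressor(iterations=300, learning_rate=0.05, verbose=0),
}
preds = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    preds[name] = model.predict(X_te)
    print(name, "MAE:", mean_absolute_error(y_te, preds[name]))

# Simple ensemble: average the three models' predictions.
ensemble_pred = np.mean(list(preds.values()), axis=0)
print("Ensemble MAE :", mean_absolute_error(y_te, ensemble_pred))
print("Ensemble RMSE:", mean_squared_error(y_te, ensemble_pred) ** 0.5)
```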
Project description:We are facing a major challenge in bridging the gap between identifying subtypes of asthma to understand causal mechanisms and translating this knowledge into personalized prevention and management strategies. In recent years, "big data" has been sold as a panacea for generating hypotheses and driving new frontiers of health care; the idea that the data must and will speak for themselves is fast becoming a new dogma. One of the dangers of ready accessibility of health care data and computational tools for data analysis is that the process of data mining can become uncoupled from the scientific process of clinical interpretation, understanding the provenance of the data, and external validation. Although advances in computational methods can be valuable for using unexpected structure in data to generate hypotheses, there remains a need for testing hypotheses and interpreting results with scientific rigor. We argue for combining data- and hypothesis-driven methods in a careful synergy, and the importance of carefully characterized birth and patient cohorts with genetic, phenotypic, biological, and molecular data in this process cannot be overemphasized. The main challenge on the road ahead is to harness bigger health care data in ways that produce meaningful clinical interpretation and to translate this into better diagnoses and properly personalized prevention and treatment plans. There is a pressing need for cross-disciplinary research with an integrative approach to data science, whereby basic scientists, clinicians, data analysts, and epidemiologists work together to understand the heterogeneity of asthma.
Project description:Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a response surface relating prediction R-sq. to sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23-0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values ranging from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Soon, much larger data sets will become available. Using the estimated response surface, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed.
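The analysis above relates prediction R-squared to training sample size and the number of SNPs in the predictor. The sketch below illustrates that relationship on simulated genotype data using a ridge-regression genomic predictor; the simulation parameters and the use of ridge regression (rather than the Bayesian methods of the study) are assumptions made purely for illustration.

```python
# Illustrative sketch: how the prediction R-squared of a genomic predictor
# grows with training sample size, on simulated genotypes. Ridge regression
# stands in for the Bayesian whole-genome regression used in the study; all
# simulation parameters (number of SNPs, heritability) are assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
n_total, n_snps, h2 = 12000, 500, 0.5        # toy sample size, SNP count, heritability

G = rng.binomial(2, 0.3, size=(n_total, n_snps)).astype(float)   # genotypes coded 0/1/2
beta = rng.normal(0, np.sqrt(h2 / n_snps), size=n_snps)          # true SNP effects
g = G @ beta                                                      # genetic values
y = g + rng.normal(0, np.sqrt(np.var(g) * (1 - h2) / h2), size=n_total)

G_test, y_test = G[-2000:], y[-2000:]        # held-out testing set

for n_train in (1000, 2000, 4000, 8000):     # prediction R-sq. as a function of sample size
    model = Ridge(alpha=n_snps)              # shrinkage scaled with model size
    model.fit(G[:n_train], y[:n_train])
    r2 = r2_score(y_test, model.predict(G_test))
    print(f"n_train={n_train:5d}  prediction R-sq. = {r2:.3f}")
```

On this toy simulation the testing R-sq. rises with sample size and plateaus below the simulated heritability, mirroring the qualitative pattern the study reports.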
Project description:The discovery of targeted drugs relies heavily on the three-dimensional (3D) structures of target proteins. When the 3D structure of a protein target is unknown, it is very difficult to design the corresponding targeted drugs. Moreover, although the 3D structures of some proteins (the so-called undruggable targets) are known, targeted drugs for them are still lacking. As an increasing number of crystal and cryogenic electron microscopy structures are deposited in the Protein Data Bank, discovering targeted drugs becomes much more feasible, and it is also highly probable that previously undruggable targets can be turned into druggable ones once their hidden allosteric sites are identified. In this review, we focus on currently available advanced methods for discovering novel compounds that target proteins without known 3D structures, and on how to turn undruggable targets into druggable ones.
Project description:This research presents a reverse-engineering approach to discovering the patterns and evolutionary behavior of SARS-CoV-2 using AI and big data. We studied five viral families (Orthomyxoviridae, Retroviridae, Filoviridae, Flaviviridae, and Coronaviridae) that have emerged over the past one hundred years, in order to capture similarities, common characteristics, and evolutionary behavior relevant to predictions about SARS-CoV-2, and to show how reverse engineering with artificial intelligence (AI) and big data is efficient and opens wide horizons. The results show that SARS-CoV-2 shares its most active amino acids (S, L, and T) with the mentioned viral families, which affects how the corresponding proteins are built and function. We also devised a mathematical formula for calculating the evolutionary difference percentage between viruses based on their phylogenetic trees; it shows that SARS-CoV-2 has mutated and evolved rapidly relative to its time of emergence. AI is used to predict the next evolved instance of SARS-CoV-2 by treating the phylogenetic-tree data as a corpus and training a Long Short-Term Memory (LSTM) model. This paper demonstrates the evolved-instance prediction process on the ORF7a protein of SARS-CoV-2 as a first stage toward predicting the complete mutant virus. Finally, we focused on reducing the virus to its primary factors by reverse engineering with AI and big data, in order to understand viral similarities, patterns, and evolutionary behavior and to predict future viral mutations in a systematic and logical way.
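The prediction step described above trains an LSTM on sequence data to generate the next evolved instance of a protein. The sketch below shows the general idea as a next-amino-acid language model in PyTorch on a toy corpus; the placeholder sequences, vocabulary, and hyperparameters are illustrative assumptions and not the authors' ORF7a model or data.

```python
# Minimal sketch of a next-amino-acid LSTM language model, illustrating the
# kind of sequence model described above. The toy corpus, vocabulary, and
# hyperparameters are illustrative assumptions, not the authors' ORF7a model.
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
stoi = {a: i for i, a in enumerate(AMINO_ACIDS)}

# Toy "corpus" of short protein fragments (placeholders, not real ORF7a data).
corpus = ["MSLLTEVET", "MKIILFLSL", "MSDNGPQNQ"]
encoded = [torch.tensor([stoi[a] for a in seq]) for seq in corpus]

class NextResidueLSTM(nn.Module):
    """Predicts the next amino acid from the residues seen so far."""
    def __init__(self, vocab=20, embed=16, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)
    def forward(self, x):
        out, _ = self.lstm(self.embed(x))
        return self.head(out)

model = NextResidueLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):                      # tiny training loop
    for seq in encoded:
        x, target = seq[:-1].unsqueeze(0), seq[1:].unsqueeze(0)
        logits = model(x)
        loss = loss_fn(logits.reshape(-1, 20), target.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

# Generate a continuation from a seed fragment (greedy next-residue decoding).
seed = torch.tensor([[stoi[a] for a in "MSLL"]])
with torch.no_grad():
    for _ in range(5):
        next_idx = model(seed)[0, -1].argmax()
        seed = torch.cat([seed, next_idx.view(1, 1)], dim=1)
print("generated:", "".join(AMINO_ACIDS[i] for i in seed[0].tolist()))
```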