Clinical chemistry in higher dimensions: Machine-learning and enhanced prediction from routine clinical chemistry data

2016 ◽  
Vol 49 (16-17) ◽  
pp. 1213-1220 ◽  
Author(s):  
Alice Richardson ◽  
Ben M. Signor ◽  
Brett A. Lidbury ◽  
Tony Badrick
Diagnosis ◽  
2015 ◽  
Vol 2 (1) ◽  
pp. 41-51 ◽  
Author(s):  
Brett A. Lidbury ◽  
Alice M. Richardson ◽  
Tony Badrick

Abstract
Routine liver function tests (LFTs) are central to serum testing profiles, particularly in community medicine. However, there is concern about the redundancy of the information provided to requesting clinicians. Large quantities of clinical laboratory data and advances in computational knowledge discovery methods provide opportunities to re-examine the value of the individual routine laboratory results that combine to form LFT profiles.

The machine learning methods recursive partitioning (decision trees) and support vector machines (SVMs) were applied to aggregate clinical chemistry data that included elevated LFT profiles. Response categories for γ-glutamyl transferase (GGT) were established based on whether patient results were within or above the sex-specific reference interval. A single decision tree and SVMs were then applied to test the accuracy of GGT prediction from the highest-ranked predictors of GGT response, alkaline phosphatase (ALP) and alanine aminotransferase (ALT).

Through interrogating more than 20,000 individual cases comprising both sexes and all ages, decision trees predicted GGT category at 90% accuracy using only ALP and ALT, with an SVM prediction accuracy of 82.6% after 10-fold training and testing. Bilirubin, lactate dehydrogenase (LD) and albumin did not enhance prediction, or actively reduced accuracy. Comparison of abnormal (elevated) GGT categories also supported the primacy of ALP and ALT as screening markers, with serum urate and cholesterol also useful.

Machine-learning interrogation of massive clinical chemistry data sets demonstrated a strategy for addressing redundancy in routine LFT screening: ALT and ALP in tandem accurately predicted GGT elevation, suggesting that GGT can be removed from the routine LFT screening profile.
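A minimal sketch of the two approaches described above, assuming a pandas DataFrame df with hypothetical columns ALP, ALT and a binary GGT_elevated label derived from the sex-specific reference intervals (none of these names come from the paper):

```python
# Sketch only: df, its column names, and all hyperparameters are assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = df[["ALP", "ALT"]]   # the two highest-ranked predictors of GGT response
y = df["GGT_elevated"]   # within vs. above the sex-specific reference interval

# Single decision tree (recursive partitioning)
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
print("Tree accuracy:", cross_val_score(tree, X, y, cv=10).mean())

# Support vector machine, evaluated with the same 10-fold scheme
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("SVM accuracy:", cross_val_score(svm, X, y, cv=10).mean())
```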


2014 ◽  
Vol 2 (4) ◽  
pp. 63-70 ◽  
Author(s):  
Danyel Jennen ◽  
Jan Polman ◽  
Mark Bessem ◽  
Maarten Coonen ◽  
Joost van Delft ◽  
...  

2021 ◽  
Vol 10 (19) ◽  
pp. 4576 ◽  
Author(s):  
Dae Youp Shin ◽  
Bora Lee ◽  
Won Sang Yoo ◽  
Joo Won Park ◽  
Jung Keun Hyun

Diabetic sensorimotor polyneuropathy (DSPN) is a major complication in patients with diabetes mellitus (DM), and early detection or prediction of DSPN is important for preventing or managing neuropathic pain and foot ulcers. Our aim was to delineate whether machine learning techniques are more useful than traditional statistical methods for predicting DSPN in DM patients. Four hundred seventy DM patients were classified into four groups (normal, possible, probable, and confirmed) based on clinical and electrophysiological findings of suspected DSPN. Three ML methods, XGBoost (XGB), support vector machine (SVM), and random forest (RF), and their combinations were used for analysis. RF showed the best area under the receiver operating characteristic curve (AUC, 0.8250) for differentiating between the two sets of criteria, clinical findings (normal, possible, and probable groups) versus electrophysiological findings (confirmed group), and the result was superior to that of linear regression analysis (AUC = 0.6620). Average serum glucose, HbA1c reported in International Federation of Clinical Chemistry (IFCC) units, HbA1c (%), and albumin levels were identified as the four most important predictors of DSPN. In conclusion, machine learning techniques, especially RF, can predict DSPN in DM patients effectively, and electrophysiological analysis is important for identifying DSPN.
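An illustrative sketch, not the authors' code: a random forest against a regression baseline on the four reported predictors. Logistic regression stands in for the paper's linear regression so that both models yield class probabilities for an AUC; the feature matrix X and binary label y (1 = electrophysiologically confirmed DSPN) are assumed inputs.

```python
# Assumed: X has columns [glucose, HbA1c_IFCC, HbA1c, albumin]; y is binary.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print("RF AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("Baseline AUC:", roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]))

# Feature importances rank the candidate predictors of DSPN
print(dict(zip(["glucose", "HbA1c_IFCC", "HbA1c", "albumin"],
               rf.feature_importances_)))
```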


2007 ◽  
Vol 31 (2) ◽  
pp. 352-356 ◽  
Author(s):  
J. Todd Auman ◽  
Gary A. Boorman ◽  
Ralph E. Wilson ◽  
Gregory S. Travlos ◽  
Richard S. Paules

Clinical chemistry data are routinely generated as part of preclinical animal toxicity studies and human clinical studies. With large-scale studies involving hundreds or even thousands of samples in multiple treatment groups, it is currently difficult to interpret the resulting complex, high-density clinical chemistry data. Accordingly, we conducted this study to investigate methods for easy visualization of complex, high-density data. Clinical chemistry data were obtained from male rats, each treated with one of eight different acute hepatotoxicants from a large-scale toxicogenomics study. The raw data underwent a Z-score transformation comparing each individual animal's clinical chemistry values to those of reference controls from all eight studies, and were then visualized in a single graphic using a heat map. The utility of using a heat map to visualize high-density clinical chemistry data was explored by clustering changes in clinical chemistry values for >400 animals. A clear distinction was observed between animals displaying hepatotoxicity and those that did not. Additionally, while animals experiencing hepatotoxicity showed many similarities in the observed clinical chemistry alterations, distinct differences were noted in the heat map profiles of the different compounds. Using a heat map to visualize complex, high-density clinical chemistry data in a single graphic facilitates the identification of previously unrecognized trends. This method is simple to implement and maintains the biological integrity of the data. The value of this clinical chemistry data transformation and visualization will manifest itself through integration with other high-density data, such as genomics data, to study physiology at the systems level.
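A minimal sketch of the transform described above, assuming two pandas DataFrames, data and controls (rows = animals, columns = analytes), which are hypothetical names: each animal's panel is z-scored against the pooled reference controls and rendered as a clustered heat map.

```python
# Sketch only: `data` and `controls` are assumed DataFrames with matching columns.
import seaborn as sns

mu = controls.mean()
sigma = controls.std(ddof=1)
z = (data - mu) / sigma   # per-analyte z-scores relative to reference controls

# Cluster animals with similar alteration profiles into a single graphic
sns.clustermap(z, cmap="vlag", center=0, col_cluster=False)
```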


2021 ◽  
Author(s):  
Guilherme Ferreira da Silva ◽  
Marcos Vinicius Ferreira ◽  
Iago Sousa Lima Costa ◽  
Renato Borges Bernardes ◽  
Carlos Eduardo Miranda Mota ◽  
...  

Abstract
Mineral chemistry analysis is a valuable tool in several phases of mineralogy and mineral prospecting studies. This type of analysis can reveal relevant information, such as the concentration of the chemical element of interest in the analyzed phase and thus the prospectivity of an area for a given commodity. As a result, a considerable amount of data has been generated, especially with the use of electron probe micro-analyzers (EPMA), whether in research for academic purposes or in a typical prospecting campaign in the mineral industry. We have identified an efficiency gap in the manual processing and analysis of mineral chemistry data, and we envisage that this research niche could benefit from the versatility brought by machine learning algorithms. In this paper, we present Qmin, an application that increases the efficiency of the mineral chemistry data processing and analysis stages through automated routines. Our code benefits from a hierarchical structure of classifiers and regressors trained with a Random Forest algorithm on a filtered training database extracted from the GEOROC (Geochemistry of Rocks of the Oceans and Continents) repository, maintained by the Max Planck Institute for Chemistry. To test the robustness of our application, we applied a blind test with more than 11,000 mineral chemistry analyses compiled for diamond prospecting within the scope of the Diamante Brasil Project of the Geological Survey of Brazil. The blind test yielded a balanced classifier accuracy of ca. 99% for the minerals known to Qmin. We therefore highlight the potential of machine learning techniques to assist in the processing and analysis of mineral chemistry data.
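A hypothetical sketch of a Qmin-style step: a single flat Random Forest classifier (rather than the paper's hierarchical structure of classifiers and regressors) that assigns a mineral name from EPMA oxide wt% analyses. The oxide feature list and the training_table name are assumptions, not taken from the paper.

```python
# Sketch only: training_table and its columns are assumed, not Qmin's schema.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

oxides = ["SiO2", "TiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O"]
X = training_table[oxides]     # EPMA analyses, one row per analyzed spot
y = training_table["mineral"]  # labelled mineral phase

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Balanced accuracy matches the metric quoted for the blind test
print(balanced_accuracy_score(y_te, clf.predict(X_te)))
```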


Machine learning and big data models are widely used in software technologies. These systems benefit from compact data at processing time, yet the dimensionality of data grows steadily as technology advances, and any algorithm applied to high-dimensional data demands more processing time and storage resources. The curse of dimensionality refers to the problems that arise when working with data in higher dimensions that do not exist in lower dimensions. Our paper addresses the safety of information at low dimensionality; addressing this problem is equivalent to addressing the safety of the underlying hardware and software platform. A decision tree (DT) machine learning model is well suited to these dimensionality and clustering problems: the DT model reduced the volume of duplicate data, and clustering achieved an efficiency of 94.3% with a reduction ratio of 32.4%.
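A loose illustration only, since the abstract does not specify the pipeline: this sketch drops duplicate rows from a high-dimensional table, reports the resulting reduction ratio, and fits a decision tree to the deduplicated remainder. The DataFrame df and its "label" column are assumptions.

```python
# Sketch only: df and its "label" column are hypothetical stand-ins.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

deduped = df.drop_duplicates()            # remove duplicate records
reduction = 1 - len(deduped) / len(df)    # fraction of rows removed
print(f"Reduction ratio: {reduction:.1%}")

tree = DecisionTreeClassifier(max_depth=5, random_state=0)
tree.fit(deduped.drop(columns="label"), deduped["label"])
```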

