Clinical chemistry in higher dimensions: Machine-learning and enhanced prediction from routine clinical chemistry data

2016 ◽  
Vol 49 (16-17) ◽  
pp. 1213-1220 ◽  
Author(s):  
Alice Richardson ◽  
Ben M. Signor ◽  
Brett A. Lidbury ◽  
Tony Badrick
Diagnosis ◽  
2015 ◽  
Vol 2 (1) ◽  
pp. 41-51 ◽  
Author(s):  
Brett A. Lidbury ◽  
Alice M. Richardson ◽  
Tony Badrick

Abstract
Routine liver function tests (LFTs) are central to serum testing profiles, particularly in community medicine. However, there is concern about the redundancy of the information provided to requesting clinicians. Large quantities of clinical laboratory data and advances in computational knowledge discovery methods provide opportunities to re-examine the value of the individual routine laboratory results that combine to form LFT profiles.

The machine learning methods recursive partitioning (decision trees) and support vector machines (SVMs) were applied to aggregate clinical chemistry data that included elevated LFT profiles. Response categories for γ-glutamyl transferase (GGT) were established based on whether patient results were within or above the sex-specific reference interval. A single decision tree and SVMs were then applied to test the accuracy of GGT prediction from the highest-ranked predictors of GGT response, alkaline phosphatase (ALP) and alanine aminotransferase (ALT).

Through interrogating more than 20,000 individual cases comprising both sexes and all ages, decision trees predicted GGT category at 90% accuracy using only ALP and ALT, with an SVM prediction accuracy of 82.6% after 10-fold training and testing. Bilirubin, lactate dehydrogenase (LD) and albumin did not enhance prediction, or actively reduced accuracy. Comparison of abnormal (elevated) GGT categories also supported the primacy of ALP and ALT as screening markers, with serum urate and cholesterol also useful.

Machine-learning interrogation of massive clinical chemistry data sets demonstrated a strategy for addressing redundancy in routine LFT screening: ALT and ALP in tandem accurately predicted GGT elevation, suggesting that GGT can be removed from the routine LFT screening profile.
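A minimal sketch of the two approaches described above, assuming a pandas DataFrame df with hypothetical columns ALP, ALT and a binary GGT_elevated label derived from the sex-specific reference intervals (none of these names come from the paper):

```python
# Sketch only: df, its column names, and all hyperparameters are assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = df[["ALP", "ALT"]]   # the two highest-ranked predictors of GGT response
y = df["GGT_elevated"]   # within vs. above the sex-specific reference interval

# Single decision tree (recursive partitioning)
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
print("Tree accuracy:", cross_val_score(tree, X, y, cv=10).mean())

# Support vector machine, evaluated with the same 10-fold scheme
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("SVM accuracy:", cross_val_score(svm, X, y, cv=10).mean())
```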


2014 ◽  
Vol 2 (4) ◽  
pp. 63-70 ◽  
Author(s):  
Danyel Jennen ◽  
Jan Polman ◽  
Mark Bessem ◽  
Maarten Coonen ◽  
Joost van Delft ◽  
...  

2021 ◽  
Vol 10 (19) ◽  
pp. 4576 ◽  
Author(s):  
Dae Youp Shin ◽  
Bora Lee ◽  
Won Sang Yoo ◽  
Joo Won Park ◽  
Jung Keun Hyun

Diabetic sensorimotor polyneuropathy (DSPN) is a major complication in patients with diabetes mellitus (DM), and early detection or prediction of DSPN is important for preventing or managing neuropathic pain and foot ulcers. Our aim was to delineate whether machine learning techniques are more useful than traditional statistical methods for predicting DSPN in DM patients. Four hundred seventy DM patients were classified into four groups (normal, possible, probable, and confirmed) based on clinical and electrophysiological findings of suspected DSPN. Three ML methods, XGBoost (XGB), support vector machine (SVM), and random forest (RF), and their combinations were used for analysis. RF showed the best area under the receiver operating characteristic curve (AUC, 0.8250) for differentiating between the two sets of criteria, clinical findings (normal, possible, and probable groups) versus electrophysiological findings (confirmed group), and the result was superior to that of linear regression analysis (AUC = 0.6620). Average serum glucose, HbA1c reported in International Federation of Clinical Chemistry (IFCC) units, HbA1c (%), and albumin levels were identified as the four most important predictors of DSPN. In conclusion, machine learning techniques, especially RF, can predict DSPN in DM patients effectively, and electrophysiological analysis is important for identifying DSPN.
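An illustrative sketch, not the authors' code: a random forest against a regression baseline on the four reported predictors. Logistic regression stands in for the paper's linear regression so that both models yield class probabilities for an AUC; the feature matrix X and binary label y (1 = electrophysiologically confirmed DSPN) are assumed inputs.

```python
# Assumed: X has columns [glucose, HbA1c_IFCC, HbA1c, albumin]; y is binary.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print("RF AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("Baseline AUC:", roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]))

# Feature importances rank the candidate predictors of DSPN
print(dict(zip(["glucose", "HbA1c_IFCC", "HbA1c", "albumin"],
               rf.feature_importances_)))
```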


2007 ◽  
Vol 31 (2) ◽  
pp. 352-356 ◽  
Author(s):  
J. Todd Auman ◽  
Gary A. Boorman ◽  
Ralph E. Wilson ◽  
Gregory S. Travlos ◽  
Richard S. Paules

Clinical chemistry data are routinely generated as part of preclinical animal toxicity studies and human clinical studies. With large-scale studies involving hundreds or even thousands of samples in multiple treatment groups, it is currently difficult to interpret the resulting complex, high-density clinical chemistry data. Accordingly, we conducted this study to investigate methods for easy visualization of complex, high-density data. Clinical chemistry data were obtained from male rats, each treated with one of eight different acute hepatotoxicants from a large-scale toxicogenomics study. The raw data underwent a Z-score transformation comparing each individual animal's clinical chemistry values to those of reference controls from all eight studies, and were then visualized in a single graphic using a heat map. The utility of using a heat map to visualize high-density clinical chemistry data was explored by clustering changes in clinical chemistry values for >400 animals. A clear distinction was observed between animals displaying hepatotoxicity and those that did not. Additionally, while animals experiencing hepatotoxicity showed many similarities in the observed clinical chemistry alterations, distinct differences were noted in the heat map profiles of the different compounds. Using a heat map to visualize complex, high-density clinical chemistry data in a single graphic facilitates the identification of previously unrecognized trends. This method is simple to implement and maintains the biological integrity of the data. The value of this clinical chemistry data transformation and visualization will manifest itself through integration with other high-density data, such as genomics data, to study physiology at the systems level.
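A minimal sketch of the transform described above, assuming two pandas DataFrames, data and controls (rows = animals, columns = analytes), which are hypothetical names: each animal's panel is z-scored against the pooled reference controls and rendered as a clustered heat map.

```python
# Sketch only: `data` and `controls` are assumed DataFrames with matching columns.
import seaborn as sns

mu = controls.mean()
sigma = controls.std(ddof=1)
z = (data - mu) / sigma   # per-analyte z-scores relative to reference controls

# Cluster animals with similar alteration profiles into a single graphic
sns.clustermap(z, cmap="vlag", center=0, col_cluster=False)
```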


2021 ◽  
Author(s):  
Guilherme Ferreira da Silva ◽  
Marcos Vinicius Ferreira ◽  
Iago Sousa Lima Costa ◽  
Renato Borges Bernardes ◽  
Carlos Eduardo Miranda Mota ◽  
...  

Abstract
Mineral chemistry analysis is a valuable tool in several phases of mineralogy and mineral prospecting studies. This type of analysis can reveal relevant information, such as the concentration of the chemical element of interest in the analyzed phase and thus the prospectivity of an area for a given commodity. As a result, a considerable amount of data has been generated, especially with the use of electron probe micro-analyzers (EPMA), whether in research for academic purposes or in a typical prospecting campaign in the mineral industry. We have identified an efficiency gap in the manual processing and analysis of mineral chemistry data, and we envisage that this research niche could benefit from the versatility brought by machine learning algorithms. In this paper, we present Qmin, an application that increases the efficiency of the mineral chemistry data processing and analysis stages through automated routines. Our code benefits from a hierarchical structure of classifiers and regressors trained with a Random Forest algorithm on a filtered training database extracted from the GEOROC (Geochemistry of Rocks of the Oceans and Continents) repository, maintained by the Max Planck Institute for Chemistry. To test the robustness of our application, we applied a blind test with more than 11,000 mineral chemistry analyses compiled for diamond prospecting within the scope of the Diamante Brasil Project of the Geological Survey of Brazil. The blind test yielded a balanced classifier accuracy of ca. 99% for the minerals known to Qmin. We therefore highlight the potential of machine learning techniques to assist in the processing and analysis of mineral chemistry data.
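A hypothetical sketch of a Qmin-style step: a single flat Random Forest classifier (rather than the paper's hierarchical structure of classifiers and regressors) that assigns a mineral name from EPMA oxide wt% analyses. The oxide feature list and the training_table name are assumptions, not taken from the paper.

```python
# Sketch only: training_table and its columns are assumed, not Qmin's schema.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

oxides = ["SiO2", "TiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O"]
X = training_table[oxides]     # EPMA analyses, one row per analyzed spot
y = training_table["mineral"]  # labelled mineral phase

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Balanced accuracy matches the metric quoted for the blind test
print(balanced_accuracy_score(y_te, clf.predict(X_te)))
```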


Machine learning and big data models are widely used in software technologies. These systems benefit from compact data at processing time, yet the dimensionality of data grows steadily as technology advances, and any algorithm applied to high-dimensional data demands more processing time and storage resources. The curse of dimensionality refers to the problems that arise when working with data in higher dimensions that do not exist in lower dimensions. Our paper addresses the safety of information at low dimensionality; addressing this problem is equivalent to addressing the safety of the underlying hardware and software platform. A decision tree (DT) machine learning model is well suited to these dimensionality and clustering problems: the DT model reduced the volume of duplicate data, and clustering achieved an efficiency of 94.3% with a reduction ratio of 32.4%.
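A loose illustration only, since the abstract does not specify the pipeline: this sketch drops duplicate rows from a high-dimensional table, reports the resulting reduction ratio, and fits a decision tree to the deduplicated remainder. The DataFrame df and its "label" column are assumptions.

```python
# Sketch only: df and its "label" column are hypothetical stand-ins.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

deduped = df.drop_duplicates()            # remove duplicate records
reduction = 1 - len(deduped) / len(df)    # fraction of rows removed
print(f"Reduction ratio: {reduction:.1%}")

tree = DecisionTreeClassifier(max_depth=5, random_state=0)
tree.fit(deduped.drop(columns="label"), deduped["label"])
```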

