Using Machine Learning to Aid the Interpretation of Urine Steroid Profiles

Abstract

BACKGROUND: Urine steroid profiles are used in clinical practice for the diagnosis and monitoring of disorders of steroidogenesis and adrenal pathologies. Machine learning (ML) algorithms are powerful computational tools used extensively for the recognition of patterns in large data sets. Here, we investigated the utility of various ML algorithms for the automated biochemical interpretation of urine steroid profiles to support current clinical practices.

METHODS: Data from 4619 urine steroid profiles processed between June 2012 and October 2016 were retrospectively collected. Of these, 1314 profiles were used to train and test various ML classifiers' abilities to differentiate between “No significant abnormality” and “?Abnormal” profiles. Further classifiers were trained and tested for their ability to predict the specific biochemical interpretation of the profiles.

RESULTS: The best performing binary classifier could predict the interpretation of “No significant abnormality” and “?Abnormal” profiles with a mean area under the ROC curve of 0.955 (95% CI, 0.949–0.961). In addition, the best performing multiclass classifier could predict the individual abnormal profile interpretation with a mean balanced accuracy of 0.873 (0.865–0.880).

CONCLUSIONS: Here we have described the application of ML algorithms to the automated interpretation of urine steroid profiles. This provides a proof-of-concept application of ML algorithms to complex clinical laboratory data that has the potential to improve laboratory efficiency in a setting of limited staff resources.
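
To illustrate the two metrics reported in this abstract, the following is a minimal sketch in plain Python of how an area under the ROC curve and a balanced accuracy can be computed. It is not the authors' pipeline: the function names, the toy labels, and the toy scores are assumptions made only for this example.

```python
from typing import Hashable, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Mean of the per-class recalls, so every class weighs the same."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy data (hypothetical): 1 = "?Abnormal", 0 = "No significant abnormality".
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))

# A classifier that always answers "normal" looks fine on plain accuracy
# (3 of 4 correct) but its balanced accuracy exposes the missed class.
print(balanced_accuracy(["normal", "normal", "normal", "abnormal"],
                        ["normal", "normal", "normal", "normal"]))
```

Balanced accuracy is a natural choice for a multiclass problem with uneven class sizes, because a rare interpretation contributes as much to the score as a common one.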

Posaconazole-Induced Hypertension Due to Inhibition of 11β-Hydroxylase and 11β-Hydroxysteroid Dehydrogenase 2

Journal of the Endocrine Society ◽

10.1210/js.2019-00189 ◽

2019 ◽

Vol 3 (7) ◽

pp. 1361-1366 ◽

Cited By ~ 12

Author(s):

George R Thompson ◽

Katharina R Beck ◽

Melanie Patt ◽

Denise V Kratschmar ◽

Alex Odermatt

Keyword(s):

Clinical Laboratory ◽

Hydroxysteroid Dehydrogenase ◽

Underlying Mechanism ◽

Comprehensive Analysis ◽

Hydroxylase Activity ◽

Interindividual Differences ◽

Induced Hypertension ◽

Steroid Profiles ◽

The Individual ◽

Urine Steroid

Abstract We describe two cases of hypertension and hypokalemia due to mineralocorticoid excess caused by posaconazole treatment of coccidioidomycosis and rhinocerebral mucormycosis infections, respectively. Clinical laboratory evaluations, including a comprehensive analysis of blood and urine steroid profiles, revealed low renin and aldosterone. They indicated that the underlying mechanism was primarily a block of 11β-hydroxylase activity in patient 1, whereas patient 2 displayed weaker 11β-hydroxylase inhibition but more pronounced 11β-hydroxysteroid dehydrogenase 2 inhibition. The results show that both previously suggested mechanisms must be considered and emphasize significant interindividual differences in the contribution of each enzyme to the observed mineralocorticoid excess phenotype. The mineralocorticoid symptoms of patient 1 resolved after posaconazole therapy was replaced by isavuconazole, and posaconazole dosage de-escalation ameliorated the effects in patient 2. By providing a thorough analysis of the patients’ blood and urine steroid metabolites, this report adds further evidence for two individually pronounced mechanisms of posaconazole-induced hypertension and hypokalemia. The elucidation of the factors responsible for the individual phenotype warrants further research.

Generation of geometric interpolations of building types with deep variational autoencoders

Design Science ◽

10.1017/dsj.2020.31 ◽

2020 ◽

Vol 6 ◽

Author(s):

Jaime de Miguel Rodríguez ◽

Maria Eugenia Villafañe ◽

Luka Piškorec ◽

Fernando Sancho Caparrini

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Large Data ◽

Learning Model ◽

Large Data Sets ◽

Data Sets ◽

Connectivity Map ◽

Data Set ◽

3D Objects ◽

Machine Learning Model

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
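
The idea of reconstructing "interpolated locations" in a learnt latent space can be sketched as follows. This is only an illustration under stated assumptions: it uses plain linear interpolation between two latent vectors, the function name `interpolate_latents` is hypothetical, and the trained VAE encoder and decoder that would produce and consume these vectors are not shown. The abstract does not state which interpolation scheme the authors used.

```python
from typing import List


def interpolate_latents(z_a: List[float], z_b: List[float], steps: int) -> List[List[float]]:
    """Return `steps` evenly spaced points on the straight line from z_a to z_b.

    The first point equals z_a and the last equals z_b. In a VAE setting
    each returned vector would be passed to the decoder to obtain one
    intermediate ("hybrid") geometry.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight, from 0.0 to 1.0
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Hypothetical 2-D latent codes for two building types.
z_type_a = [0.0, 2.0]
z_type_b = [1.0, 0.0]
for z in interpolate_latents(z_type_a, z_type_b, steps=3):
    print(z)
```

Whether a decoded midpoint is a coherent geometry depends on how well the latent space is organised, which is consistent with the abstract's remark that generating interpolated hybrids is a challenging task.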

Machine Learning Improves the Precision and Robustness of High-Content Screens

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057111414878 ◽

2011 ◽

Vol 16 (9) ◽

pp. 1059-1067 ◽

Cited By ~ 44

Author(s):

Peter Horvath ◽

Thomas Wild ◽

Ulrike Kutay ◽

Gabor Csucs

Keyword(s):

Machine Learning ◽

User Interaction ◽

Texture Features ◽

Large Data ◽

Complete Analysis ◽

Data Sets ◽

Reporter Protein ◽

Fluorescent Intensity ◽

Analysis Workflow

Imaging-based high-content screens often rely on single cell-based evaluation of phenotypes in large data sets of microscopic images. Traditionally, these screens are analyzed by extracting a few image-related parameters and using their ratios (linear single or multiparametric separation) to classify the cells into various phenotypic classes. In this study, the authors show how machine learning–based classification of individual cells outperforms those classical ratio-based techniques. Using fluorescent intensity and morphological and texture features, they evaluated how the performance of data analysis increases with increasing feature numbers. Their findings are based on a case study involving an siRNA screen monitoring nucleoplasmic and nucleolar accumulation of a fluorescently tagged reporter protein. For the analysis, they developed a complete analysis workflow incorporating image segmentation, feature extraction, cell classification, hit detection, and visualization of the results. For the classification task, the authors have established a new graphical framework, the Advanced Cell Classifier, which provides a very accurate high-content screen analysis with minimal user interaction, offering access to a variety of advanced machine learning methods.

Deep Learning Approaches for Sentiment Analysis Challenges and Future Issues

10.4018/978-1-7998-8161-2.ch003 ◽

2022 ◽

pp. 27-50

Author(s):

Rajalaxmi Prabhu B. ◽

Seema S.

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Model Building ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets ◽

Learning Approaches ◽

Learning Techniques ◽

Important Challenge

Large volumes of user-generated data are now available from major platforms, blogs, websites, and review sites. These data are usually unstructured. Automatically analyzing sentiment in these data is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and much research has been undertaken on machine learning approaches to sentiment analysis. Machine learning depends mainly on the data used for model building, and hence suitable feature extraction techniques also need to be applied. In this chapter, several deep learning approaches, their challenges, and future issues are addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze deep learning techniques for predicting sentiments and to convey the importance of several approaches for mining opinions and determining sentiment polarity.

Precision-Recall versus Accuracy and the Role of Large Data Sets

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014039 ◽

2019 ◽

Vol 33 ◽

pp. 4039-4048 ◽

Cited By ~ 8

Author(s):

Brendan Juba ◽

Hai S. Le

Keyword(s):

Machine Learning ◽

Class Imbalance ◽

Imbalanced Data ◽

Large Data ◽

Constant Factor ◽

Data Sets ◽

Data Set ◽

Small Constant ◽

Classifier Performance ◽

Necessary And Sufficient

Practitioners of data mining and machine learning have long observed that the imbalance of classes in a data set negatively impacts the quality of classifiers trained on that data. Numerous techniques for coping with such imbalances have been proposed, but nearly all lack any theoretical grounding. By contrast, the standard theoretical analysis of machine learning admits no dependence on the imbalance of classes at all. The basic theorems of statistical learning establish the number of examples needed to estimate the accuracy of a classifier as a function of its complexity (VC-dimension) and the confidence desired; the class imbalance does not enter these formulas anywhere. In this work, we consider the measures of classifier performance in terms of precision and recall, a measure that is widely suggested as more appropriate to the classification of imbalanced data. We observe that whenever the precision is moderately large, the worse of the precision and recall is within a small constant factor of the accuracy weighted by the class imbalance. A corollary of this observation is that a larger number of examples is necessary and sufficient to address class imbalance, a finding we also illustrate empirically.
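
The contrast between accuracy and precision/recall under class imbalance can be made concrete with a small worked example. The sketch below is not taken from the paper: the `Confusion` container, the convention of returning 0.0 when a ratio is undefined, and the counts themselves are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary classifier's outcomes on a labelled data set."""
    tp: int  # positives predicted positive
    fp: int  # negatives predicted positive
    fn: int  # positives predicted negative
    tn: int  # negatives predicted negative


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: Confusion) -> float:
    predicted_pos = c.tp + c.fp
    # Assumed convention: report 0.0 when nothing was predicted positive.
    return c.tp / predicted_pos if predicted_pos else 0.0


def recall(c: Confusion) -> float:
    actual_pos = c.tp + c.fn
    # Assumed convention: report 0.0 when there are no actual positives.
    return c.tp / actual_pos if actual_pos else 0.0


# Hypothetical imbalanced data set: 10 positives among 1000 examples.
c = Confusion(tp=2, fp=3, fn=8, tn=987)
print(f"accuracy  = {accuracy(c):.3f}")
print(f"precision = {precision(c):.3f}")
print(f"recall    = {recall(c):.3f}")
```

With these made-up counts the classifier is right on 989 of 1000 examples, yet it finds only 2 of the 10 positives and only 2 of its 5 positive predictions are correct, which is why precision and recall are often preferred for imbalanced data.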

Gene sequences, collaboration and analysis of large data sets

Australian Systematic Botany ◽

10.1071/sb97010 ◽

1998 ◽

Vol 11 (2) ◽

pp. 215 ◽

Cited By ~ 36

Author(s):

Mark W. Chase ◽

Antony V. Cox

Keyword(s):

Dna Sequences ◽

Information Transfer ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Pool Data ◽

Present Evidence ◽

The Individual ◽

Technological Improvements ◽

Taxonomic System

DNA sequences are well suited to international collaboration due to the universality of their simple nature, which makes them easy to exchange and understand. In addition, the output of modern automated sequencers is electronic, making the raw data themselves easy to exchange electronically. Software programs have also been significantly improved, and researchers tend to focus on standardised ones, which contributes to increased ease of communication. The major problem confronting modern systematics is that large analyses, made possible by all the technological improvements in DNA sequencing and the ability of widely separated researchers to pool data, have been viewed as untenable, simply due to their size. We present evidence here that such large analyses are not as impractical as has been thought, particularly those that are combined analyses of multiple genes. When there is increased signal, as there is in many combined analyses, starting trees generally are much closer to the ultimate shortest trees than any of the individual analyses. Combined with increased ease of analysis, the large angiosperm matrices are providing congruent ideas about relationships, and this makes possible the initiation of the re-classification process, which should also utilise the capacity for rapid information transfer by electronic media. The first truly synthetic and phylogenetic angiosperm classification is within reach, and it should ideally involve all interested systematists in its production, making it also the first broadly collaborative taxonomic system.

Implementation of Supervised Learning towards Optimizing Queries in Database Systems

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3531.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 1182-1187

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Student Loans ◽

Large Data ◽

Database Systems ◽

Large Data Sets ◽

Data Sets ◽

Human Intervention ◽

Huge Data ◽

Future Direction

Machine learning is a technology that uses accumulated data to support better decisions in future applications. It is also the scientific study of algorithms that perform a specific task efficiently without using explicit instructions. It may also be viewed as a subset of artificial intelligence, in that it is linked with the ability to learn automatically and improve from experience without being explicitly programmed. Its primary intention is to allow computers to learn automatically and produce more accurate results in order to identify profitable opportunities. Combining machine learning with AI and cognitive technologies can make it even more effective in processing large volumes of data without human intervention or assistance and in adjusting actions accordingly. It may enable the analysis of very large volumes of data. It may also be linked to algorithm-driven study aimed at improving task performance. In such scenarios, the techniques can be applied to evaluate and make predictions on large data sets. The paper concerns the mechanism of supervised learning in database systems, which would be self-driven as well as secure. An example of an organization dealing with student loans is also presented. The paper ends with a discussion, future directions, and a conclusion.

Predicting Fault Slip via Transfer Learning

10.21203/rs.3.rs-700852/v1 ◽

2021 ◽

Author(s):

Kun Wang ◽

Christopher Johnson ◽

Kane Bennett ◽

Paul Johnson

Keyword(s):

Machine Learning ◽

Numerical Simulations ◽

Transfer Learning ◽

Laboratory Experiments ◽

Laboratory Data ◽

Fault Slip ◽

Geophysical Data ◽

Training Data ◽

Data Sets ◽

Earthquake Cycle

Abstract Data-driven machine learning for predicting instantaneous and future fault slip in laboratory experiments has recently progressed markedly due to large training data sets. In Earth, however, earthquake interevent times range from tens to hundreds of years, and geophysical data typically exist for only a portion of an earthquake cycle. Sparse data presents a serious challenge to training machine learning models. Here we describe a transfer learning approach using numerical simulations to train a convolutional encoder-decoder that predicts fault-slip behavior in laboratory experiments. The model learns a mapping between acoustic emission histories and fault slip from numerical simulations, and generalizes to produce accurate results using laboratory data. Notably, slip predictions improve markedly when the model trained on simulation data is used and its latent space is trained on a portion of a single laboratory earthquake cycle. The transfer learning results elucidate the potential of using models trained on numerical simulations and fine-tuned with small geophysical data sets for applications to faults in Earth.

Machine learning in diachronic corpus phonology: mining verse data to infer trajectories in English phonotactics

Papers in Historical Phonology ◽

10.2218/pihph.3.2018.2878 ◽

2018 ◽

Vol 3 ◽

Author(s):

Andreas Baumann

Keyword(s):

Machine Learning ◽

Middle English ◽

Large Data ◽

Large Data Sets ◽

Machine Learning Techniques ◽

Data Sets ◽

Powerful Method ◽

K Nearest Neighbors ◽

Learning Techniques ◽

Standard Techniques

Machine learning is a powerful method when working with large data sets such as diachronic corpora. However, as opposed to standard techniques from inferential statistics like regression modeling, machine learning is less commonly used among phonological corpus linguists. This paper discusses three different machine learning techniques (K nearest neighbors classifiers; Naïve Bayes classifiers; artificial neural networks) and how they can be applied to diachronic corpus data to address specific phonological questions. To illustrate the methodology, I investigate Middle English schwa deletion and when and how it potentially triggered reduction of final /mb/ clusters in English.
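
Of the three techniques named in this abstract, K nearest neighbors is the simplest to show in a few lines. The sketch below is a generic illustration in plain Python, not the author's implementation: the function name `knn_predict`, the two numeric features, and the class labels are all hypothetical toy values.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def knn_predict(train: List[Tuple[Sequence[float], str]],
                query: Sequence[float],
                k: int = 3) -> str:
    """Classify `query` by majority vote among its k nearest training points.

    Distance is Euclidean. If two classes receive the same number of
    votes, the one encountered first among the nearest neighbours wins.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical features and labels, for illustration only).
toy_train = [
    ((0.0, 0.0), "schwa_kept"),
    ((0.0, 1.0), "schwa_kept"),
    ((5.0, 5.0), "schwa_deleted"),
    ((6.0, 5.0), "schwa_deleted"),
    ((5.0, 6.0), "schwa_deleted"),
]
print(knn_predict(toy_train, (0.2, 0.1), k=3))
print(knn_predict(toy_train, (5.5, 5.5), k=3))
```

The classifier has no training phase beyond storing the examples, so its behaviour depends entirely on the choice of features, the distance measure, and k.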

A system for analyzing large data sets using machine learning algorithms

Bulletin of Kharkov National Automobile and Highway University ◽

10.30977/bul.2219-5548.2021.94.0.142 ◽

2021 ◽

pp. 142

Author(s):

Sergey Pronin ◽

Mykhailo Miroshnichenko

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets
