Determining the Optimal Number of MEG Trials: A Machine Learning and Speech Decoding Perspective

Author(s):  
Debadatta Dash ◽  
Paul Ferrari ◽  
Saleem Malik ◽  
Albert Montillo ◽  
Joseph A. Maldjian ◽  
...  
2021 ◽  
Author(s):  
Ruijie Huang ◽  
Chenji Wei ◽  
Baohua Wang ◽  
Baozhu Li ◽  
Jian Yang ◽  
...  

Abstract Compared with conventional reservoirs, the development efficiency of carbonate reservoirs is lower because of their strong heterogeneity and complicated reservoir structure. Accurately and quantitatively analyzing development performance is critical to understanding the challenges faced and to proposing optimization plans that improve recovery. In this study, we develop a workflow to evaluate similarities and differences in well performance based on machine learning methods. A comprehensive machine learning evaluation approach for well performance is established by combining Principal Component Analysis (PCA) with K-Means clustering. The multidimensional dataset used for the analysis consists of over 15 years of dynamic surveillance data from producers and static geological parameters of the formation, such as oil/water/gas production, gas-oil ratio (GOR), water cut (WC), porosity, permeability, thickness, and depth. The approach divides the multidimensional data into several clusters via PCA and K-Means and quantitatively evaluates well performance based on the clustering results. It successfully visualizes (dis)similarities among the dynamic and static data of a heterogeneous carbonate reservoir; the optimal number of clusters for the 27-dimensional data is four. This method provides a systematic framework for visually and quantitatively analyzing and evaluating the development performance of production wells, allowing reservoir engineers to efficiently propose targeted optimization measures based on the analysis results. This paper offers a reference case for well performance clustering, quantitative analysis, and the proposal of optimization plans that will help engineers make better decisions in similar situations.
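The PCA-plus-K-Means workflow described above can be sketched in a few lines of scikit-learn. This is a minimal illustration on synthetic stand-in data (the real study used 27 dynamic and static well attributes); the array shapes and cluster count of four follow the abstract, everything else is assumed.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical stand-in for the 27-dimensional well dataset
# (production rates, GOR, water cut, porosity, permeability, ...)
X = rng.normal(size=(120, 27))

# Standardize each attribute, then project onto a few principal components
reducer = make_pipeline(StandardScaler(), PCA(n_components=3, random_state=0))
X_reduced = reducer.fit_transform(X)

# Cluster the reduced representation; the study found 4 clusters optimal
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)
```

Wells sharing a label can then be compared against the cluster centroids to characterize each performance group.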


2021 ◽  
Vol 3 (1) ◽  
pp. 49-55
Author(s):  
R. O. Tkachenko ◽  
I. V. Izonіn ◽  
V. M. Danylyk ◽  
V. Yu. Mykhalevych ◽  
...  

Improving prediction accuracy with artificial intelligence tools is an important task in various industries, in economics, and in medicine. Ensemble learning is one possible way to solve this task. In particular, stacking models built from different machine learning methods, or from different parts of an existing data set, demonstrate high prediction accuracy. However, the need to properly select ensemble members, their optimal parameters, etc., entails large time costs for constructing such models. This paper proposes a slightly different approach to building a simple but effective ensemble method. The authors developed a new stacking model of nonlinear SGTM neural-like structures, based on using only one type of ANN as the element base of the ensemble and the same training sample for all ensemble members. This approach provides a number of advantages over procedures for building ensembles from different machine learning methods, at least in terms of selecting the optimal parameters for each of them. In our case, a tuple of random hyperparameters for each individual member serves as the basis of the ensemble. That is, each combined SGTM neural-like structure with an additional RBF layer, as a separate ensemble member, is trained using different, randomly selected values of the RBF centers and centers of mass. This provides the necessary variety among ensemble elements. Experimental studies of the effectiveness of the developed ensemble were conducted using a real data set; the task is to predict health insurance costs from a number of independent attributes. The optimal number of ensemble members, which provides the highest prediction accuracy, was determined experimentally, and the results of the developed ensemble were compared with existing methods of this class. The developed ensemble achieves the highest prediction accuracy with a satisfactory training time.
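The core idea — an ensemble of identical model types, trained on the same sample, differing only in randomly drawn RBF centers — can be illustrated without the SGTM structures themselves (which are not in standard libraries). The sketch below substitutes ridge regression on Gaussian RBF features for the SGTM members; the data, center counts, and gamma are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(42)

def rbf_features(X, centers, gamma=1.0):
    # Gaussian RBF activations of each sample w.r.t. the given centers
    return np.exp(-gamma * pairwise_distances(X, centers) ** 2)

# Synthetic regression task standing in for the insurance-cost data
X = rng.uniform(-1, 1, size=(200, 6))
y = np.sin(X).sum(axis=1) + 0.1 * rng.normal(size=200)

# Every member shares the model type and the full training sample;
# only its randomly selected RBF centers differ.
members = []
for _ in range(10):
    centers = X[rng.choice(len(X), size=20, replace=False)]
    model = Ridge(alpha=1e-3).fit(rbf_features(X, centers), y)
    members.append((model, centers))

def predict(X_new):
    # Ensemble output: average of the member predictions
    return np.mean([m.predict(rbf_features(X_new, c)) for m, c in members], axis=0)

pred = predict(X)
```

Sweeping the number of members and keeping the count with the lowest validation error mirrors the paper's experimental selection of the optimal ensemble size.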


2021 ◽  
Vol 9 (1) ◽  
pp. e001889
Author(s):  
Rodrigo M Carrillo-Larco ◽  
Manuel Castillo-Cara ◽  
Cecilia Anza-Ramirez ◽  
Antonio Bernabé-Ortiz

Introduction: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC).
Research design and methods: We analyzed 13 population-based national surveys in nine countries (n=8361). We used k-means to develop a clustering model; predictors were age, sex, body mass index (BMI), waist circumference (WC), systolic/diastolic blood pressure (SBP/DBP), and T2DM family history. The training data set included all surveys, and the clusters were then predicted in each country-year data set. We used Euclidean distance, elbow and silhouette plots to select the optimal number of clusters, and described each cluster according to the underlying predictors (means and proportions).
Results: The optimal number of clusters was 4. Cluster 0 grouped more men and those with the highest mean SBP/DBP. Cluster 1 had the highest mean BMI and WC, as well as the largest proportion of T2DM family history. We observed the smallest values of all predictors in cluster 2. Cluster 3 had the highest mean age. When we reflected the four clusters in each country-year data set, a different distribution was observed. For example, cluster 3 was the most frequent in the training data set, and so it was in 7 out of 13 other country-year data sets.
Conclusions: Using unsupervised machine learning algorithms, it was possible to cluster people with T2DM from the general population in LAC; the clusters showed unique profiles that could be used to identify the underlying characteristics of the T2DM population in LAC.
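Selecting k with elbow (inertia) and silhouette criteria, as the methods section describes, is straightforward in scikit-learn. The sketch below uses synthetic blobs in place of the pooled survey predictors; the seven-feature dimensionality matches the predictor list in the abstract, but the data and candidate range are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the pooled predictors (age, sex, BMI, WC, SBP, DBP,
# family history); 4 well-separated groups mimic the study's finding
X, _ = make_blobs(n_samples=500, centers=4, n_features=7, random_state=0)

# For each candidate k, record inertia (for the elbow plot) and silhouette
scores = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    scores[k] = (km.inertia_, silhouette_score(X, km.labels_))

# Pick the k with the best silhouette; the elbow plot offers a visual check
best_k = max(scores, key=lambda k: scores[k][1])
```

In practice one would plot `scores[k][0]` against k to confirm the elbow agrees with the silhouette choice before predicting cluster membership in each country-year data set.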


2021 ◽  
Vol 15 ◽  
Author(s):  
Cirelle K. Rosenblatt ◽  
Alexandra Harriss ◽  
Aliya-Nur Babul ◽  
Samuel A. Rosenblatt

Background: Concussion subtypes are typically organized into commonly affected symptom areas or a combination of affected systems, an approach that may be flawed by bias in conceptualization or the inherent limitations of interdisciplinary expertise.
Objective: The purpose of this study was to determine whether a bottom-up, unsupervised machine learning approach could more accurately support concussion subtyping.
Methods: Initial patient intake data as well as objective outcome measures, including the Patient-Reported Outcomes Measurement Information System (PROMIS), Dizziness Handicap Inventory (DHI), Pain Catastrophizing Scale (PCS), and Immediate Post-Concussion Assessment and Cognitive Testing Tool (ImPACT), were retrospectively extracted from the Advance Concussion Clinic's database. A correlation matrix and principal component analysis (PCA) were used to reduce the dimensionality of the dataset. Scikit-learn's agglomerative clustering algorithm was then applied, and the optimal number of clusters within the patient database was determined. Between-group comparisons among the formed clusters were performed using a Mann-Whitney U test.
Results: Two hundred seventy-five patients within the clinic's database were analyzed. Five distinct clusters emerged from the data when maximizing the silhouette score (0.36) and minimizing the Davies-Bouldin score (0.83). The derived concussion subtypes demonstrated clinically distinct profiles, with statistically significant differences (p < 0.05) between all five clusters.
Conclusion: This machine learning approach enabled the identification and characterization of five distinct concussion subtypes, best understood according to levels of complexity, ranging from Extremely Complex to Minimally Complex. Understanding concussion in terms of complexity, with the aid of artificial intelligence, could provide a more accurate concussion classification or subtyping approach, one that better reflects the true heterogeneity and complex system disruptions associated with mild traumatic brain injury.
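The pipeline described — PCA for dimensionality reduction, agglomerative clustering, and cluster-count selection by maximizing silhouette while minimizing Davies-Bouldin — can be sketched as below. Synthetic blobs stand in for the intake and outcome measures; the patient count of 275 and five clusters follow the abstract, while the feature and component counts are assumptions.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Synthetic stand-in for the intake data and outcome measures
# (PROMIS, DHI, PCS, ImPACT subscores)
X, _ = make_blobs(n_samples=275, centers=5, n_features=12, random_state=1)
X_pca = PCA(n_components=5, random_state=1).fit_transform(X)

# Evaluate candidate cluster counts on both indices
results = {}
for k in range(2, 9):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X_pca)
    results[k] = (silhouette_score(X_pca, labels),
                  davies_bouldin_score(X_pca, labels))

# Choose the k maximizing silhouette (and sanity-check Davies-Bouldin)
best_k = max(results, key=lambda k: results[k][0])
```

The chosen clusters would then be compared pairwise (e.g., with Mann-Whitney U tests, as in the study) to confirm they are clinically distinct.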


Diagnostics ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 2288
Author(s):  
Kaixiang Su ◽  
Jiao Wu ◽  
Dongxiao Gu ◽  
Shanlin Yang ◽  
Shuyuan Deng ◽  
...  

Increasingly, machine learning methods have been applied to aid diagnosis, with good results. However, some complex models can confuse physicians because they are difficult to understand, while data differences across diagnostic tasks and institutions can cause model performance to fluctuate. To address this challenge, we combined the Deep Ensemble Model (DEM) and the tree-structured Parzen Estimator (TPE) and propose an adaptive deep ensemble learning method (TPE-DEM) for dynamically evolving diagnostic task scenarios. Unlike previous research that focuses on achieving better performance with a fixed-structure model, our proposed model uses TPE to efficiently aggregate simple models that are more easily understood by physicians and that require less training data. In addition, our proposed model can choose the optimal number of layers and the type and number of base learners to achieve the best performance in different diagnostic task scenarios, based on the data distribution and characteristics of the current diagnostic task. We tested our model on one dataset constructed with a partner hospital and on five UCI public datasets with different characteristics and volumes, covering various diagnostic tasks. The performance evaluation results show that our proposed model outperforms the other baseline models on the different datasets. Our study provides a novel approach to building simple and understandable machine learning models for tasks with variable datasets and feature sets, and the findings have important implications for the application of machine learning models in computer-aided diagnosis.
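The structural search the abstract describes — choosing which base learners to include, and how many — can be illustrated with a small sketch. Note the stand-in: the sketch uses plain random search over candidate ensemble structures rather than TPE itself (TPE implementations are available in libraries such as Optuna and Hyperopt), and a public scikit-learn dataset in place of the hospital data; all parameter choices are illustrative.

```python
import random
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
random.seed(0)

# Pool of simple, physician-interpretable base learners
base = {"lr": lambda: LogisticRegression(max_iter=5000),
        "tree": lambda: DecisionTreeClassifier(max_depth=3),
        "nb": GaussianNB}

best = (0.0, None)
for _ in range(10):
    # Sample a candidate structure: which learner types, and how many
    kinds = random.sample(list(base), k=random.randint(1, 3))
    clf = VotingClassifier([(k, base[k]()) for k in kinds])
    score = cross_val_score(clf, X, y, cv=3).mean()
    if score > best[0]:
        best = (score, kinds)
```

TPE would replace the uniform sampling with a model of which structures historically scored well, converging faster on the same kind of search space.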


2020 ◽  
pp. 135481662097695
Author(s):  
Jian-Wu Bi ◽  
Tian-Yu Han ◽  
Hui Li

This study explores how to select the optimal number of lagged inputs (NLIs) in international tourism demand forecasting. Using international tourist arrivals at 10 European countries, the performances of eight machine learning models are evaluated with different NLIs. The results show that: (1) as the NLIs increase, the error of most machine learning models first decreases rapidly and then stabilizes (or fluctuates around a certain value) once the NLIs reach a certain cutoff point; the cutoff point is related to 12 and its multiples, and this trend is not affected by the size of the test set; (2) for nonlinear and ensemble models, it is better to select one cycle of the data as the NLIs, while for linear models, multiple cycles are a better choice; (3) significantly different prediction results are obtained by different categories of models when the optimal NLIs are used.
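Turning a monthly arrivals series into a supervised learning problem with a chosen number of lagged inputs works as sketched below. The series here is a synthetic annual (period-12) cycle standing in for real arrivals data; the helper name and lag count are illustrative.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds the n_lags values preceding y."""
    X = np.column_stack([series[i:len(series) - n_lags + i]
                         for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic monthly arrivals with a period-12 seasonal cycle
t = np.arange(120)
arrivals = 100 + 10 * np.sin(2 * np.pi * t / 12)

# One full cycle of lagged inputs, matching the study's finding that the
# error cutoff point is related to 12 and its multiples
X, y = make_lagged(arrivals, n_lags=12)
```

Any of the eight evaluated models could then be fit on `(X, y)`, sweeping `n_lags` to locate the cutoff point empirically.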


2021 ◽  
Vol 10 (5) ◽  
pp. 972
Author(s):  
Jeeyae Choi ◽  
Hee-Tae Jung ◽  
Anastasiya Ferrell ◽  
Seoyoon Woo ◽  
Linda Haddad

Despite the harmful effects on health, e-cigarette and hookah smoking among youth in the U.S. has increased. Developing tailored e-cigarette and hookah cessation programs for youth is therefore imperative. The aim of this study was to identify predictor variables, such as social, mental, and environmental determinants, that cause nicotine addiction in youth e-cigarette or hookah users, and to build nicotine addiction prediction models using machine learning algorithms. A total of 6511 participants were identified as ever having used e-cigarettes or hookah from the National Youth Tobacco Survey (2019) datasets. Prediction models were built with Random Forest combined with ReliefF and with the Least Absolute Shrinkage and Selection Operator (LASSO). ReliefF identified important predictor variables, and the Davies-Bouldin clustering evaluation index selected the optimal number of predictors for Random Forest. A total of 193 predictor variables were included in the final analysis. The performance of the prediction models was measured by Root Mean Square Error (RMSE) and a confusion matrix, and the results suggested high prediction performance. The identified predictor variables were aligned with previous research. The novel predictors found, such as 'witnessed e-cigarette use in their household' and 'perception of their tobacco use', could be used in public awareness campaigns or targeted e-cigarette and hookah youth education, and by policymakers.
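The rank-then-train pattern in this study (ReliefF scoring followed by Random Forest on the top-ranked predictors) can be sketched as follows. ReliefF itself lives in third-party packages (e.g., skrebate), so this sketch substitutes scikit-learn's mutual information ranking as the filter step; the synthetic data and cutoff of 15 predictors are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the survey predictor matrix
X, y = make_classification(n_samples=600, n_features=50,
                           n_informative=8, random_state=0)

# Filter step: rank predictors (mutual information here, in place of ReliefF)
scores = mutual_info_classif(X, y, random_state=0)
top = np.argsort(scores)[::-1][:15]  # keep the 15 highest-ranked predictors

# Model step: Random Forest on the selected predictors
rf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(rf, X[:, top], y, cv=3).mean()
```

In the study, the number of retained predictors was itself tuned (via the Davies-Bouldin index) rather than fixed as it is here.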


2020 ◽  
Vol 7 (Supplement_1) ◽  
pp. S78-S79
Author(s):  
Joshua C Herigon ◽  
Jonathan Hatoun ◽  
Louis Vernacchio

Abstract
Background: Antibiotics are the most commonly prescribed drugs for children, with estimates that 30%-50% of outpatient antibiotic prescriptions are inappropriate. Most analyses of outpatient antibiotic prescribing practices do not examine patterns within individual clinicians' prescribing. We sought to derive unique phenotypes of outpatient antibiotic prescribing practices using an unsupervised machine learning clustering algorithm.
Methods: We extracted diagnosis and prescribing data on all problem-focused visits with a physician or nurse practitioner between 6/11/2018 and 12/11/2018 for a state-wide association of pediatric practices across Massachusetts. Clinicians with fewer than 100 encounters were excluded. The proportion of encounters resulting in an antibiotic prescription was calculated. Proportions were stratified by diagnosis: otitis media (OM), pharyngitis, pneumonia (PNA), sinusitis, skin & soft tissue infection (SSTI), and urinary tract infection (UTI). We then applied consensus k-means clustering, a form of unsupervised machine learning, across all included clinicians to create clusters (or phenotypes) based on their prescribing rates for these six conditions. A scree plot was used to determine the optimal number of clusters.
Results: A total of 431 clinicians at 77 practices with 234,288 problem-focused visits were included (Table 1). Overall, 42,441 visits (18%) resulted in an antibiotic prescription. Individual clinician prescribing proportions ranged from 5% of visits up to 44%. The optimal number of clusters was determined to be four (designated alpha, beta, gamma, delta). Antibiotic prescribing rates were similar for each phenotype across OM, pharyngitis, and pneumonia but differed substantially for sinusitis, SSTI, and UTI (Figure 1). The beta phenotype had the highest median rates of prescribing across all conditions, while the delta phenotype had the lowest median prescribing rates except for UTI.
Table 1. Patient demographics and clinician characteristics.
Figure 1. Novel phenotypes of antibiotic prescribing practices across six common conditions.
Conclusion: Antibiotic prescribing varies by both condition and individual clinician. Clustering algorithms can be used to derive phenotypic antibiotic prescribing practices. Antimicrobial stewardship efforts may have a higher impact if tailored by antibiotic prescribing phenotype.
Disclosures: All Authors: No reported disclosures
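The clustering setup here — clinicians as rows, per-condition prescribing rates as columns, a scree plot (inertia vs. k) to choose the cluster count, and a consensus step over repeated k-means runs — can be sketched as below. The prescribing matrix is random stand-in data; the 431-clinician and six-condition shape follows the abstract, and the consensus construction is one common variant, not necessarily the authors' exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical clinician-by-condition matrix of prescribing proportions
# (columns: OM, pharyngitis, PNA, sinusitis, SSTI, UTI)
rates = rng.uniform(0, 0.5, size=(431, 6))

# Scree data: within-cluster sum of squares (inertia) per candidate k
inertia = {k: KMeans(n_clusters=k, n_init=10, random_state=0)
              .fit(rates).inertia_
           for k in range(1, 9)}

# Consensus step: how often each pair of clinicians is co-assigned
# across random k-means restarts at the chosen k
consensus = np.zeros((431, 431))
for seed in range(20):
    labels = KMeans(n_clusters=4, n_init=1,
                    random_state=seed).fit_predict(rates)
    consensus += (labels[:, None] == labels[None, :])
consensus /= 20
```

Plotting `inertia` against k gives the scree plot; stable blocks in the consensus matrix indicate robust phenotypes.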


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6594
Author(s):  
Anish Prasad ◽  
Carl Mofjeld ◽  
Yang Peng

With the advancement of machine learning, a growing number of mobile users rely on machine learning inference for making time-sensitive and safety-critical decisions. Therefore, the demand for high-quality and low-latency inference services at the network edge has become the key to modern intelligent society. This paper proposes a novel solution that jointly provisions machine learning models and dispatches inference requests to reduce inference latency on edge nodes. Existing solutions either direct inference requests to the nearest edge node to save network latency or balance edge nodes’ workload by reducing queuing and computing time. The proposed solution provisions each edge node with the optimal number and type of inference instances under a holistic consideration of networking, computing, and memory resources. Mobile users can thus be directed to utilize inference services on the edge nodes that offer minimal serving latency. The proposed solution has been implemented using TensorFlow Serving and Kubernetes on an edge cluster. Through simulation and testbed experiments under various system settings, the evaluation results showed that the joint strategy could consistently achieve lower latency than simply searching for the best edge node to serve inference requests.
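The dispatch side of the trade-off described above — the nearest node minimizes network latency but may carry a long queue — reduces to choosing the node with the smallest total serving latency. This is a toy sketch of that decision; the node names and latency figures are invented for illustration, and the real system also provisions instance counts and types per node.

```python
from dataclasses import dataclass

@dataclass
class EdgeNode:
    name: str
    network_ms: float   # network latency from the user to this node
    queue_ms: float     # expected queuing delay given current load
    compute_ms: float   # inference time of the provisioned instance

def serving_latency(node: EdgeNode) -> float:
    # Total latency a request would experience at this node
    return node.network_ms + node.queue_ms + node.compute_ms

# Hypothetical cluster: the nearest node is not the fastest overall
nodes = [EdgeNode("near", network_ms=2, queue_ms=40, compute_ms=15),
         EdgeNode("mid",  network_ms=8, queue_ms=5,  compute_ms=15),
         EdgeNode("far",  network_ms=20, queue_ms=1, compute_ms=15)]

best = min(nodes, key=serving_latency)
```

Here the nearest-node policy would pick "near" (57 ms total), while the joint view picks "mid" (28 ms), illustrating why the paper's holistic strategy beats nearest-node dispatch.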

