machine learning method
Recently Published Documents


TOTAL DOCUMENTS

934
(FIVE YEARS 643)

H-INDEX

26
(FIVE YEARS 11)

2022 ◽  
Vol 10 (4) ◽  
pp. 617-623
Author(s):  
Silvia Elsa Suryana ◽  
Budi Warsito ◽  
Suparti Suparti

Telemarketing is a form of marketing conducted by telephone. Banks can use telemarketing to offer products such as term deposits. One of the most important strategies for successful telemarketing is selecting potential customers so that campaigns are effective. Machine learning can be used to predict telemarketing success. Gradient boosting is a machine learning method built on decision trees. It combines many classification trees, each of which improves on the previous ones. An optimal classification result depends on optimal hyperparameters. Hyperopt is a Python library that tunes hyperparameters effectively using Bayesian optimization: it uses prior distributions over the hyperparameters to search for optimal values. The data in this study include 20 independent variables and a binary dependent variable with the classes ‘yes’ and ‘no’. The study showed that gradient boosting reached a classification accuracy of up to 90.39%, a precision of 94.91%, and an AUC of 0.939. These values indicate that gradient boosting can predict both the ‘yes’ and ‘no’ classes relatively accurately.
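The idea of trees that successively improve on earlier ones can be illustrated with a minimal pure-Python sketch. It assumes one-feature decision stumps, squared-error loss on 0/1 labels, and a toy dataset; none of these come from the study, and the function names are hypothetical. Hyperparameters such as n_trees and learning_rate are the kind of values a tuner like Hyperopt would search over, but tuning is not shown here.

```python
def fit_stump(xs, residuals):
    """Pick the single threshold split that minimises squared error on the residuals."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so both sides of the split are non-empty (needs >= 2 distinct values).
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean


def predict_stump(stump, x):
    t, lmean, rmean = stump
    return lmean if x <= t else rmean


def fit_gradient_boosting(xs, ys, n_trees=50, learning_rate=0.3):
    """Fit stumps one after another, each on the residuals left by the previous ones."""
    base = sum(ys) / len(ys)  # initial prediction: the mean label
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data (assumption): one feature, label switches from 0 to 1 above x = 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_gradient_boosting(xs, ys)
labels = [1 if predict(model, x) >= 0.5 else 0 for x in xs]
print(labels)
```

A production classifier would typically use a log-loss objective and deeper trees; the sketch only shows the additive, residual-fitting structure.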


2022 ◽  
Author(s):  
Henry Han ◽  
Tianyu Zhang ◽  
Mary Lauren Benton ◽  
Chun Li ◽  
Juan Wang ◽  
...  

Single-cell RNA sequencing (scRNA-seq) technologies enable the study of gene expression in individual cells and reveal the diversity within cell populations. To measure cell-to-cell similarity based on transcription and gene expression, many dimension reduction methods are employed to retrieve low-dimensional embeddings of the input scRNA-seq data for clustering. However, these methods lack explainability and may not perform well with scRNA-seq data because they are often migrated from other fields and not customized for high-dimensional sparse scRNA-seq data. In this study, we propose an explainable t-SNE: cell-driven t-SNE (c-TSNE), which fuses the cell differences reflected by biologically meaningful distance metrics for input scRNA-seq data. Our study shows that the proposed method not only enhances the interpretation of the original t-SNE visualization for scRNA-seq data but also demonstrates favorable single-cell segregation performance on benchmark datasets compared to state-of-the-art peers. The robustness analysis shows that the proposed cell-driven t-SNE is robust to dropout and noise in dimension reduction and clustering. It provides a novel and practical way to investigate the interpretability of t-SNE in scRNA-seq data analysis. Contrary to the general assumption that the explainability of a machine learning method must be traded off against learning efficiency, the proposed explainable t-SNE improves both clustering efficiency and explainability in scRNA-seq analysis. More importantly, our work suggests that the widely used t-SNE can easily be misused in existing scRNA-seq analysis, because its default Euclidean distance can introduce biases or meaningless results when evaluating cell differences in high-dimensional sparse scRNA-seq data. To the best of our knowledge, it is the first explainable t-SNE proposed for scRNA-seq analysis, and it will inspire the development of other explainable machine learning methods in the field.
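One commonly cited way Euclidean distance can mislead on sparse count data is its sensitivity to overall magnitude. The sketch below is an illustration under assumptions: the three "cells" are made-up count vectors, and cosine distance is used only as an example alternative; the abstract does not say which metrics c-TSNE uses.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus cosine similarity; ignores overall magnitude (e.g. sequencing depth)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical sparse count vectors over 8 genes (not real data).
cell_a = [0, 0, 4, 0, 2, 0, 0, 6]
cell_b = [0, 0, 12, 0, 6, 0, 0, 18]  # same profile as cell_a at 3x the depth
cell_c = [3, 0, 0, 5, 0, 0, 4, 0]    # expresses entirely different genes

print("Euclidean a-b:", euclidean(cell_a, cell_b))
print("Euclidean a-c:", euclidean(cell_a, cell_c))
print("Cosine    a-b:", cosine_distance(cell_a, cell_b))
print("Cosine    a-c:", cosine_distance(cell_a, cell_c))
```

In this toy case Euclidean distance places cell_a nearer to the unrelated cell_c than to cell_b, while cosine distance treats cell_a and cell_b as essentially identical. A pairwise matrix built from such a metric could, in principle, be supplied to a t-SNE implementation that accepts precomputed distances.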


PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262131
Author(s):  
Adil Aslam Mir ◽  
Kimberlee Jane Kearfott ◽  
Fatih Vehbi Çelebi ◽  
Muhammad Rafique

A new methodology, imputation by feature importance (IBFI), is studied that can be applied to any machine learning method to efficiently fill in any missing or irregularly sampled data. It applies to data missing completely at random (MCAR), missing not at random (MNAR), and missing at random (MAR). IBFI utilizes the feature importance and iteratively imputes missing values using any base learning algorithm. For this work, IBFI is tested on soil radon gas concentration (SRGC) data. XGBoost is used as the learning algorithm and missing data are simulated using R for different missingness scenarios. IBFI is based on the physically meaningful assumption that SRGC depends upon environmental parameters such as temperature and relative humidity. This assumption leads to a model obtained from the complete multivariate series where the controls are available by taking the attribute of interest as a response variable. IBFI is tested against other frequently used imputation methods, namely mean, median, mode, predictive mean matching (PMM), and hot-deck procedures. The performance of the different imputation methods was assessed using root mean squared error (RMSE), mean squared log error (MSLE), mean absolute percentage error (MAPE), percent bias (PB), and mean squared error (MSE) statistics. The imputation process requires more attention when multiple variables are missing in different samples, resulting in challenges to machine learning methods because some controls are missing. IBFI appears to have an advantage in such circumstances. For testing IBFI, Radon Time Series Data (RTS) has been used and data was collected from 1st March 2017 to the 11th of May 2018, including 4 seismic activities that have taken place during the data collection time.
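The error statistics named above can be written out as a short sketch. The definitions follow common usage and are assumptions here: percent bias in particular has more than one convention, and the abstract does not state which form the authors used. The observed and imputed values are made up for illustration.

```python
import math


def mse(observed, imputed):
    """Mean squared error."""
    return sum((p - o) ** 2 for o, p in zip(observed, imputed)) / len(observed)


def rmse(observed, imputed):
    """Root mean squared error."""
    return math.sqrt(mse(observed, imputed))


def msle(observed, imputed):
    """Mean squared log error, using log(1 + x); values must be greater than -1."""
    return sum((math.log1p(p) - math.log1p(o)) ** 2
               for o, p in zip(observed, imputed)) / len(observed)


def mape(observed, imputed):
    """Mean absolute percentage error, in percent; observed values must be non-zero."""
    return 100.0 * sum(abs((o - p) / o)
                       for o, p in zip(observed, imputed)) / len(observed)


def percent_bias(observed, imputed):
    """Percent bias as 100 * sum(imputed - observed) / sum(observed) (one common form)."""
    return 100.0 * sum(p - o for o, p in zip(observed, imputed)) / sum(observed)


# Hypothetical held-out values and their imputations (not from the study).
observed = [10.0, 20.0, 30.0, 40.0]
imputed = [12.0, 18.0, 33.0, 39.0]

for name, fn in [("MSE", mse), ("RMSE", rmse), ("MSLE", msle),
                 ("MAPE", mape), ("PB", percent_bias)]:
    print(name, round(fn(observed, imputed), 4))
```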


2022 ◽  
Author(s):  
Wanxin Li ◽  
Lila Kari ◽  
Yaoliang Yu ◽  
Laura A Hug

We propose MT-MAG, a novel machine learning-based taxonomic assignment tool for hierarchically structured local classification of metagenome-assembled genomes (MAGs). MT-MAG is capable of classifying large and diverse real metagenomic datasets, having analyzed for this study a total of 240 Gbp of data in the training set and 7 Gbp of data in the test set. MT-MAG is, to the best of our knowledge, the first machine learning method for taxonomic assignment of metagenomic data that offers a "partial classification" option. MT-MAG outputs complete or partial classification paths, along with interpretable numerical confidences for its classifications, at all taxonomic ranks. MT-MAG is able to completely classify 48% more sequences than DeepMicrobes to the Species level (the only comparable taxonomic rank for DeepMicrobes), and it outperforms DeepMicrobes by an average of 33% in weighted accuracy and by 89% in constrained accuracy.
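A "partial classification" option can be pictured as walking down the taxonomy and stopping at the first rank where confidence is too low. The sketch below is only an illustration of that general idea, not MT-MAG's actual procedure: the function name, the fixed threshold, and the confidence values are all hypothetical.

```python
def classify_path(rank_predictions, threshold=0.8):
    """Return the longest leading part of the path whose confidences meet the threshold.

    rank_predictions: list of (rank, taxon, confidence), ordered from the
    highest taxonomic rank downwards. The threshold is an assumed value.
    """
    path = []
    for rank, taxon, confidence in rank_predictions:
        if confidence < threshold:
            break  # stop here: report a partial path instead of guessing further
        path.append((rank, taxon, confidence))
    complete = len(path) == len(rank_predictions)
    return path, complete


# Hypothetical per-rank predictions for one genome (made-up confidences).
predictions = [
    ("Domain", "Bacteria", 0.99),
    ("Phylum", "Proteobacteria", 0.93),
    ("Class", "Gammaproteobacteria", 0.85),
    ("Order", "Enterobacterales", 0.62),
    ("Family", "Enterobacteriaceae", 0.58),
]

path, complete = classify_path(predictions)
for rank, taxon, confidence in path:
    print(f"{rank}: {taxon} (confidence {confidence:.2f})")
print("complete path:", complete)
```

With these made-up numbers the path stops at Class, so the genome is reported as partially classified rather than forced to a low-confidence Order or Family.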


Mathematics ◽  
2022 ◽  
Vol 10 (2) ◽  
pp. 214
Author(s):  
Javier Bilbao ◽  
Eugenio Bravo ◽  
Olatz García ◽  
Carolina Rebollar ◽  
Concepción Varela

This article deals with the optimization of the operation of hybrid microgrids. Both the problem of controlling the management of load sharing between the different generators and energy storage and possible solutions for the integration of the microgrid into the electricity market will be discussed. Solar and wind energy as well as hybrid storage with hydrogen, as renewable sources, will be considered, which allows management of the energy balance on different time scales. The Machine Learning method of Decision Trees, combined with ensemble methods, will also be introduced to study the optimization of microgrids. The conclusions obtained indicate that the development of suitable controllers can facilitate a competitive participation of renewable energies and the integration of microgrids in the electricity system.


2022 ◽  
Author(s):  
Paul Krueger ◽  
Frederick Callaway ◽  
Sayan Gul ◽  
Tom Griffiths ◽  
Falk Lieder

For computationally limited agents such as humans, perfectly rational decision-making is almost always out of reach. Instead, people may rely on computationally frugal heuristics that usually yield good outcomes. Although previous research has identified many such heuristics, discovering good heuristics and predicting when they will be used remains challenging. Here, we present a machine learning method that identifies the best heuristics to use in any given situation. To demonstrate the generalizability and accuracy of our method, we compare the strategies it discovers against those used by people across a wide range of multi-alternative risky choice environments in a behavioral experiment that is an order of magnitude larger than any previous experiments of its type. Our method rediscovered known heuristics, identifying them as rational strategies for specific environments, and discovered novel heuristics that had been previously overlooked. Our results show that people adapt their decision strategies to the structure of the environment and generally make good use of their limited cognitive resources, although they tend to collect too little information and their strategy choices do not always fully exploit the structure of the environment.


Author(s):  
Liuchang Xu ◽  
Jie Wang ◽  
Dayu Xu ◽  
Liang Xu

Consumer financial fraud has become a serious problem because it often causes victims to suffer economic, physical, mental, social, and legal harm. Identifying which individuals are more likely to be scammed may mitigate the threat posed by consumer financial fraud. Based on a two-stage conceptual framework, this study integrated various individual factors in a nationwide survey (36,202 participants) to construct fraud exposure recognition (FER) and fraud victimhood recognition (FVR) models by utilizing a machine learning method. The FER model performed well (f1 = 0.727), and model interpretation indicated that migration status, financial status, urbanicity, and age have good predictive effects on fraud exposure in the Chinese context, whereas the FVR model shows a low predictive effect (f1 = 0.565), reminding us to consider more psychological factors in future work. This research provides an important reference for the analysis of individual differences among people vulnerable to consumer fraud.
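The f1 values reported above are F1 scores, the harmonic mean of precision and recall. A minimal sketch of the calculation follows, using made-up confusion counts that are not taken from the survey.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 correctly flagged cases, 2 false alarms, 2 misses.
score = f1_score(true_positives=6, false_positives=2, false_negatives=2)
print(round(score, 3))
```

Because F1 ignores true negatives, it is often preferred to plain accuracy when the positive class is comparatively rare.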

