scholarly journals Automated Retraining of Machine Learning Models

Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it gets obsolete over time due to presence of new input data being generated every second. In order to keep our predictions accurate we need to find a way to keep our models up to date. Our research work involves finding a mechanism which can retrain the model with new data automatically. This research also involves exploring the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of those experiments conducted using the AutoML methods like TPOT. This helped us in finding an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the actual working of a ML model but only require the outputs of ML processes

2021 ◽  
Vol 26 (1) ◽  
pp. 83-88
Author(s):  
Arjun Singh Saud ◽  
Subarna Shakya

Nowadays stock price prediction is an active area of research among machine learning researchers. One of the main problems with machine learning models is overfitting. Regularization techniques are widely used approaches to avoid over-fitted models. L2 regularization is one of the most popular and widely used regularization techniques. Regularization hyperparameter (ʎ) is one key parameter to be optimized for a well-generalized machine learning model. Hyperparameters can’t be learned by machine learning models during the learning process. We need to find their optimal value through experiments. This research work analyzed the L2 regularization hyperparameter used with a gated recurrent unit (GRU) network for stock price prediction. We experimented with five stocks from the Nepal Stock Exchange (NEPSE) and observed that stock price can be predicted with lower mean squared errors (MSEs) when the value of ʎ was around 0.0005. Therefore, this research paper recommended using ʎ=0.0005 with L2 regularization for stock price prediction.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Javier De Velasco Oriol ◽  
Edgar E. Vallejo ◽  
Karol Estrada ◽  
José Gerardo Taméz Peña ◽  
The Alzheimer’s Disease Neuroimaging Initiative

Abstract Background Late-Onset Alzheimer’s Disease (LOAD) is a leading form of dementia. There is no effective cure for LOAD, leaving the treatment efforts to depend on preventive cognitive therapies, which stand to benefit from the timely estimation of the risk of developing the disease. Fortunately, a growing number of Machine Learning methods that are well positioned to address this challenge are becoming available. Results We conducted systematic comparisons of representative Machine Learning models for predicting LOAD from genetic variation data provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort. Our experimental results demonstrate that the classification performance of the best models tested yielded ∼72% of area under the ROC curve. Conclusions Machine learning models are promising alternatives for estimating the genetic risk of LOAD. Systematic machine learning model selection also provides the opportunity to identify new genetic markers potentially associated with the disease.


2020 ◽  
Author(s):  
Shreya Reddy ◽  
Lisa Ewen ◽  
Pankti Patel ◽  
Prerak Patel ◽  
Ankit Kundal ◽  
...  

<p>As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has dictated that machine learning methods are accurate and the gold standard of bot identification on social media. Unfortunately, machine learning models do not come without their negative aspects such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we are proposing a blockchain framework for bot identification. At the current time, it is unknown how this method will perform, but it serves to prove the existence of an overwhelming gap of research under this area.<i></i></p>


2018 ◽  
Vol 211 ◽  
pp. 17009
Author(s):  
Natalia Espinoza Sepulveda ◽  
Jyoti Sinha

The development of technologies for the maintenance industry has taken an important role to meet the demanding challenges. One of the important challenges is to predict the defects, if any, in machines as early as possible to manage the machines downtime. The vibration-based condition monitoring (VCM) is well-known for this purpose but requires the human experience and expertise. The machine learning models using the intelligent systems and pattern recognition seem to be the future avenue for machine fault detection without the human expertise. Several such studies are published in the literature. This paper is also on the machine learning model for the different machine faults classification and detection. Here the time domain and frequency domain features derived from the measured machine vibration data are used separated in the development of the machine learning models using the artificial neutral network method. The effectiveness of both the time and frequency domain features based models are compared when they are applied to an experimental rig. The paper presents the proposed machine learning models and their performance in terms of the observations and results.


2019 ◽  
pp. 29-43
Author(s):  
Anastasiya A. Korepanova ◽  
◽  
Valerii D. Oliseenko ◽  
Maxim V. Abramov ◽  
Alexander L. Tulupyev ◽  
...  

The article describes the approach to solving the problem of comparing user profiles of different social networks and identifying those that belong to one person. An appropriate method is proposed based on a comparison of the social environment and the values of account profile attributes in two different social networks. The results of applying various machine learning models to solving this problem are compared. The novelty of the approach lies in the proposed new combination of various methods and application to new social networks. The practical significance of the study is to automate the process of determining the ownership of profiles in various social networks to one user. These results can be applied in the task of constructing a meta-profile of a user of an information system for the subsequent construction of a profile of his vulnerabilities, as well as in other studies devoted to social networks.


2022 ◽  
pp. 181-194
Author(s):  
Bala Krishna Priya G. ◽  
Jabeen Sultana ◽  
Usha Rani M.

Mining Telugu news data and categorizing based on public sentiments is quite important since a lot of fake news emerged with rise of social media. Identifying whether news text is positive, negative, or neutral and later classifying the data in which areas they fall like business, editorial, entertainment, nation, and sports is included throughout this research work. This research work proposes an efficient model by adopting machine learning classifiers to perform classification on Telugu news data. The results obtained by various machine-learning models are compared, and an efficient model is found, and it is observed that the proposed model outperformed with reference to accuracy, precision, recall, and F1-score.


2016 ◽  
Vol 7 (2) ◽  
pp. 43-71 ◽  
Author(s):  
Sangeeta Lal ◽  
Neetu Sardana ◽  
Ashish Sureka

Logging is an important yet tough decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code construct. The prediction performances of these models are limited due to the class-imbalance problem since the number of logged code constructs is small as compared to non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performances of J48, RF, and SVM classifiers for catch-blocks and if-blocks logged code constructs prediction on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, LogIm model improves the performance of baseline classifiers, J48, RF, and SVM, by 7.38%, 9.24%, and 4.6% for catch-blocks, and 12.11%, 14.95%, and 19.13% for if-blocks logging prediction.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Jacob Schreiber ◽  
Ritambhara Singh ◽  
Jeffrey Bilmes ◽  
William Stafford Noble

AbstractMachine learning models that predict genomic activity are most useful when they make accurate predictions across cell types. Here, we show that when the training and test sets contain the same genomic loci, the resulting model may falsely appear to perform well by effectively memorizing the average activity associated with each locus across the training cell types. We demonstrate this phenomenon in the context of predicting gene expression and chromatin domain boundaries, and we suggest methods to diagnose and avoid the pitfall. We anticipate that, as more data becomes available, future projects will increasingly risk suffering from this issue.


Significance It required arguably the single largest computational effort for a machine learning model to date, and is it capable of producing text at times indistinguishable from the work of a human author. This has generated considerable excitement about potentially transformative business applications -- and concerns about the system's weaknesses and possible misuse. Impacts Stereotypes and biases in machine learning models will become increasingly problematic as they are adopted by businesses and governments. The use of flawed AI tools that result in embarrassing failures risk cuts to public funding for AI research. Academia and industry face pressure to advance research into explainable AI, but progress is slow.


Sign in / Sign up

Export Citation Format

Share Document