Framework of Intelligent System for Machine Learning Algorithm Selection in Social Sciences

2022 ◽  
pp. 21-28
Author(s):  
Dijana Oreški

The ability to generate data has never been as great as it is today, when roughly three quintillion bytes of data are generated daily. In the field of machine learning, a large number of algorithms have been developed for intelligent data analysis and for solving predictive and descriptive problems in different domains. These algorithms perform differently on different problems: an algorithm that works well on one dataset may work poorly on another, because each dataset differs in its local and global characteristics. It is therefore imperative to understand the intrinsic behavior of algorithms on different types of datasets and to choose the right algorithm for the problem at hand. To address this problem, this paper contributes to the field of meta-learning by proposing a framework for identifying the specific characteristics of datasets in two domains of the social sciences, education and business, and by developing meta-models based on ranking algorithms, calculating rank correlations, building a multi-criteria model, a two-component index, and prediction with machine learning algorithms. Each meta-model serves as the basis for a version of the intelligent system. Applying the framework involves a comparative analysis of a large number of machine learning algorithms on a large number of datasets from the social sciences.
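The rank-correlation meta-model mentioned above can be sketched briefly: rank the candidate algorithms by accuracy on each dataset, then measure how well the two rankings agree. The accuracies below are invented for illustration; the paper does not report these numbers.

```python
# Sketch of the rank-correlation meta-model step: rank candidate
# algorithms by accuracy on two datasets, then measure ranking
# agreement with Spearman's rho. Accuracy values are made up.

def ranks(scores):
    """Rank values in descending order (1 = best); no tie handling."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    r = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    """Spearman correlation via the rank-difference formula."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical accuracies of four algorithms on two domains.
acc_education = [0.81, 0.74, 0.69, 0.77]
acc_business  = [0.78, 0.71, 0.73, 0.75]

rho = spearman_rho(acc_education, acc_business)
print(round(rho, 2))  # → 0.8
```

A high rho suggests the algorithm ranking transfers between the two domains; a low rho signals that dataset characteristics, not just the algorithm, drive performance.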

2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Lin Lin ◽  
Xiufang Liang

The online English teaching system places certain requirements on intelligent scoring, and the most difficult stage of intelligent scoring in English testing is scoring English compositions with an intelligent model. To improve the intelligence of English composition scoring, this study builds on machine learning algorithms, combining them with intelligent image recognition technology, and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, to verify whether the proposed algorithm model meets the requirements of the task, that is, to verify the feasibility of the algorithm, the performance of the model is analyzed through designed experiments. Moreover, the basic conditions for composition scoring are fed into the model as constraints. The results show that the proposed algorithm has practical value and can be applied to English assessment and online homework evaluation systems.
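Pipelines of this shape (MSER region extraction followed by CNN filtering) commonly insert a cheap geometric pre-filter between the two stages to discard obviously non-character regions. The sketch below shows that step only; the thresholds are illustrative assumptions, not values from the paper.

```python
# A lightweight geometric pre-filter for character candidate regions,
# of the kind often placed between MSER extraction and a CNN-based
# pseudo-character filter. Thresholds are illustrative assumptions.

def plausible_character(box, img_w, img_h,
                        min_aspect=0.1, max_aspect=2.5,
                        min_area_frac=1e-4, max_area_frac=0.2):
    """box = (x, y, w, h). Reject regions whose shape or size make
    them unlikely to be a single character."""
    x, y, w, h = box
    if w == 0 or h == 0:
        return False
    aspect = w / h
    area_frac = (w * h) / (img_w * img_h)
    return (min_aspect <= aspect <= max_aspect
            and min_area_frac <= area_frac <= max_area_frac)

candidates = [(10, 10, 12, 20),   # character-like
              (0, 0, 600, 5),     # a long rule line, rejected
              (40, 15, 9, 18)]    # character-like
kept = [b for b in candidates if plausible_character(b, 640, 480)]
print(len(kept))  # → 2
```

The CNN filter then only has to score the surviving candidates, which is where most of the pipeline's cost lies.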


2020 ◽  
Vol 17 (9) ◽  
pp. 4294-4298
Author(s):  
B. R. Sunil Kumar ◽  
B. S. Siddhartha ◽  
S. N. Shwetha ◽  
K. Arpitha

This paper applies distinct machine learning algorithms and explores their features. The primary advantage of machine learning is that an algorithm can learn automatically what to do with information. This paper presents the concept of machine learning and its algorithms, which can be used for applications such as health care, sentiment analysis, and many more. Programmers are sometimes unsure which algorithm to apply to their application. This paper offers guidance on selecting an algorithm on the basis of how accurately it fits. Based on the collected data, an algorithm can be selected by weighing its pros and cons. Using the data set, a base model is developed, trained, and tested; the trained model is then ready for prediction and can be deployed where feasible.
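The develop-train-test-select loop described above can be sketched in a few lines: fit each candidate model, score it on held-out data, and keep the best. The two stand-in models (a majority-class baseline and a hand-rolled 1-nearest-neighbour) and the toy data are assumptions for illustration, not the paper's setup.

```python
# Minimal sketch of "select the algorithm that fits best": train two
# simple models on toy data and keep the one with higher held-out
# accuracy. Models and data are illustrative stand-ins.
import math

def majority_fit(X, y):
    label = max(set(y), key=y.count)
    return lambda x: label

def one_nn_fit(X, y):
    def predict(x):
        i = min(range(len(X)), key=lambda j: math.dist(x, X[j]))
        return y[i]
    return predict

def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

X_train = [(0, 0), (0, 1), (5, 5), (5, 6)]
y_train = ["a", "a", "b", "b"]
X_test  = [(0.2, 0.1), (5.1, 5.2)]
y_test  = ["a", "b"]

fits = {"majority": majority_fit, "1-nn": one_nn_fit}
scores = {name: accuracy(f(X_train, y_train), X_test, y_test)
          for name, f in fits.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # → 1-nn 1.0
```

Real selection would of course use cross-validation and the pros and cons the paper discusses (training cost, interpretability, data size), not a single split.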


2021 ◽  
Author(s):  
Marc Raphael ◽  
Michael Robitaille ◽  
Jeff Byers ◽  
Joseph Christodoulides

Abstract Machine learning algorithms hold the promise of greatly improving live cell image analysis by (1) analyzing far more imagery than can be achieved by more traditional manual approaches and (2) eliminating the subjective nature of researchers and diagnosticians selecting the cells or cell features to be included in the analyzed data set. Currently, however, even the most sophisticated model-based or machine learning algorithms require user supervision, meaning the subjectivity problem is not removed but rather incorporated into the algorithm’s initial training steps and then repeatedly applied to the imagery. To address this roadblock, we have developed a self-supervised machine learning algorithm that recursively trains itself directly from the live cell imagery data, thus providing objective segmentation and quantification. The approach incorporates an optical flow algorithm component to self-label cell and background pixels for training, followed by the extraction of additional feature vectors for the automated generation of a cell/background classification model. Because it is self-trained, the software has no user-adjustable parameters and does not require curated training imagery. The algorithm was applied to automatically segment cells from their background for a variety of cell types and five commonly used imaging modalities: fluorescence, phase contrast, differential interference contrast (DIC), transmitted light, and interference reflection microscopy (IRM). The approach is broadly applicable in that it enables completely automated cell segmentation for long-term live cell phenotyping applications, regardless of the input imagery’s optical modality, magnification or cell type.
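The self-labeling idea can be shown in miniature: pixels that move between frames are provisionally labeled "cell", static pixels "background", and those labels then train the classifier. The authors use an optical flow algorithm; plain frame differencing is substituted here only to keep the sketch self-contained, and the frames and threshold are invented.

```python
# The self-supervision step in miniature: motion between frames
# provisionally labels pixels as cell (1) or background (0).
# A real implementation uses optical flow; simple frame differencing
# stands in for it here.

def self_label(frame_a, frame_b, threshold=10):
    """frame_a, frame_b: 2-D lists of grayscale intensities.
    Returns a 2-D list of labels: 1 = moving/cell, 0 = background."""
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

frame_t0 = [[10, 10, 200], [10, 10, 200]]
frame_t1 = [[10, 200, 10], [10, 10, 200]]  # bright blob shifted in row 0
labels = self_label(frame_t0, frame_t1)
print(labels)  # → [[0, 1, 1], [0, 0, 0]]
```

The key property, reflected even in this toy, is that no human-chosen training labels enter the loop: the imagery labels itself.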



Machine learning is a branch of artificial intelligence that provides algorithms which can learn from data and improve with experience, without human intervention. Nowadays many machine learning algorithms play a vital role in data analytics, and such algorithms can be applied to the recent COVID pandemic situation across the globe. Machine learning algorithms are classified into three groups based on the type of learning process: supervised learning, unsupervised learning, and reinforcement learning. Considering the medical observations on COVID across the globe, the analysis is carried out under the supervised learning process. The data set is acquired from a reliable source, processed, and fed into classification algorithms. Since learning is carried out from known input data and expected output data, the data is labeled and classified based on those labels. In the proposed work, three different algorithms are applied to the COVID-19 dataset, compared for their efficiency, and an algorithm selection decision is made.
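The comparison loop implied above is standard: split the labeled data into folds, fit each classifier on the training folds, and score it on the held-out fold. The sketch below shows that evaluation scaffolding with a trivial stand-in classifier; the abstract does not name its three algorithms, so none are reproduced here.

```python
# Sketch of the evaluation loop for comparing supervised classifiers
# on one labeled dataset: build k folds, score each classifier on
# each held-out fold. The classifier is a stand-in callable.

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(fit, X, y, k=3):
    folds = k_fold_indices(len(X), k)
    accs = []
    for held_out in folds:
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(model(X[i]) == y[i] for i in held_out)
        accs.append(correct / len(held_out))
    return sum(accs) / k

# Trivial stand-in classifier: always predicts the majority label.
def majority_fit(X, y):
    label = max(set(y), key=y.count)
    return lambda x: label

X = [[v] for v in range(9)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1]
print(round(cross_val_accuracy(majority_fit, X, y, k=3), 2))  # → 0.78
```

Running the same loop over each of the three candidate algorithms and comparing the mean accuracies is exactly the "compared for their efficiency" step.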


2021 ◽  
Author(s):  
Omar Alfarisi ◽  
Zeyar Aung ◽  
Mohamed Sassi

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, this paper describes the optimal choice among the best-performing algorithms. We built a synthetic data set and performed supervised machine learning runs for five different algorithms. For heterogeneity, we identified Random Forest, among others, as the best algorithm.


2021 ◽  
Author(s):  
Aayushi Rathore ◽  
Anu Saini ◽  
Navjot Kaur ◽  
Aparna Singh ◽  
Ojasvi Dutta ◽  
...  

Abstract Sepsis is a severe infectious disease with high mortality. It occurs when chemicals released into the bloodstream to fight an infection trigger inflammation throughout the body, which can cause a cascade of changes that damage multiple organ systems, leading them to fail and even resulting in death. To reduce the possibility of sepsis or infection, antiseptics are used; the process is known as antisepsis. Antiseptic peptides (ASPs) show properties similar to anti-gram-negative peptides, anti-gram-positive peptides, and many more. Machine learning algorithms are useful for screening and identifying therapeutic peptides, providing an initial filter and building confidence before time-consuming and laborious experimental approaches are used. In this study, various machine learning algorithms, such as Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbour (KNN), and Logistic Regression (LR), were evaluated for the prediction of ASPs. Moreover, the characteristic physicochemical features of ASPs were explored for use in machine learning. Both manual and automatic feature selection methodologies were employed to achieve the best performance of the machine learning algorithms. Five-fold cross-validation and independent data set validation showed RF to be the best model for the prediction of ASPs. Our RF model showed an accuracy of 97% and a Matthew’s Correlation Coefficient (MCC) of 0.93, indicating a robust and good model. To our knowledge, this is the first attempt to build a machine learning classifier for the prediction of ASPs.
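MCC, the headline metric above, is computed directly from the 2x2 confusion matrix and, unlike accuracy, penalizes class-imbalanced guessing. The counts below are illustrative, not the study's actual confusion matrix.

```python
# Matthew's Correlation Coefficient from a binary confusion matrix.
# The counts are illustrative, chosen to mirror ~97% accuracy.
import math

def mcc(tp, tn, fp, fn):
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Hypothetical counts on a balanced test set of 200 peptides.
print(round(mcc(tp=97, tn=97, fp=3, fn=3), 2))  # → 0.94
```

An MCC near 1 requires good performance on both classes, which is why it is reported alongside accuracy for classifiers like this one.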


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
J.-M Gregoire ◽  
N Subramanian ◽  
D Papazian ◽  
H Bersini

Abstract Background Forecasting atrial fibrillation (AF) a few minutes before its onset has been studied, mainly based on heart rate variability parameters derived from 24-hour ECG Holter monitoring. However, these studies have shown conflicting, non-clinically applicable results. Nowadays, machine learning algorithms have proven their ability to anticipate events. Therefore, forecasting AF before its onset should be (re)assessed using machine learning techniques. A reliable forecast could improve the results of preventive pacing in patients with cardiac electronic implanted devices (CEIDs). Purpose To forecast an oncoming AF episode in individual patients using machine learning techniques, and to evaluate whether the onset of an AF episode can be forecasted over longer time frames. Methods The totality of the raw data of a data set of 10,484 ECG Holter recordings was retrospectively analyzed and all AF episodes were annotated. The onset of each AF episode was determined with a precision of 5 msec. We only took AF events into consideration if they lasted longer than 30 seconds. Of all patients in the dataset, 140 presented paroxysmal AF (286 recorded AF episodes). We used only RR intervals to predict the presence of AF. We developed two different types of machine learning algorithms with different computational power requirements: a “dynamic” deep and recurrent neural net (RNN) and a “static” decision tree with AdaBoost (boosting trees), more suitable for embedded devices. These algorithms were trained on one set of patients (around 90%) and tested on the remaining set of patients (around 10%). Results The performance figures are summarized in the table. Both algorithms can be tuned to increase their specificity (at a loss of sensitivity) or vice versa, depending on the objective. 
Performance of the forecasting algorithms:

  RR-intervals before an AF event    Boosting trees AUC    RNN AUC
  30–1                               97.1%                 98.77%
  60–31                              97.5%                 99.1%
  90–61                              96.9%                 99.1%
  120–91                             98.2%                 98.9%

  (AUC = area under the ROC curve.)

Conclusion Based upon this retrospective study, we show that AF can be forecasted at the individual level with high predictive power using machine learning algorithms, with little drop-off in predictive value over the studied distances (1–120 RR intervals before a potential AF episode). We believe that embedding our new algorithm(s) in CEIDs could open the way to innovative therapies that significantly decrease AF burden in selected implanted patients.
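AUC, the metric tabulated above, has a direct probabilistic reading: the chance that a randomly chosen positive (pre-AF) window receives a higher score than a randomly chosen negative one. The sketch below computes it in that Mann-Whitney form; the scores are made up, not the study's model outputs.

```python
# AUC computed as the probability that a random positive (pre-AF)
# window outscores a random negative one (Mann-Whitney form).
# Scores are invented for illustration.

def auc(pos_scores, neg_scores):
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

pre_af   = [0.9, 0.8, 0.75, 0.6]   # model scores before AF onset
baseline = [0.4, 0.3, 0.6, 0.2]    # scores far from any AF episode
print(auc(pre_af, baseline))  # → 0.96875
```

This rank-based form makes clear why AUC is threshold-free, matching the abstract's point that each algorithm can be tuned toward specificity or sensitivity without changing its AUC.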


Author(s):  
John Yearwood ◽  
Adil Bagirov ◽  
Andrei V. Kelarev

The applications of machine learning algorithms to the analysis of data sets of DNA sequences are very important. The present chapter is devoted to the experimental investigation of applications of several machine learning algorithms for the analysis of a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single copy region and inverted repeat A of the chloroplast genome in Eucalyptus collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as a measure of similarity. The alignment scores do not satisfy the properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors’ experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set, and the authors’ novel k-committees algorithm produced the most accurate results for classification. Two new examples of synthetic data sets demonstrate that the k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.
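The similarity measure at issue, a local alignment score between two DNA sequences, can be sketched in Smith-Waterman form. The scoring parameters below are common textbook defaults, not the ones used in the chapter; the point of the sketch is that such scores are not distances, so metric-based learners cannot be applied directly.

```python
# Local alignment score between two sequences (Smith-Waterman style),
# the kind of similarity measure used in place of a metric.
# Scoring parameters are textbook defaults, not the chapter's.

def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best

print(local_alignment_score("GATTACA", "GCATGCU"))  # → 5
```

Note that a sequence's score against itself grows with its length (e.g. "ACGT" scores 8 against itself), so the score behaves nothing like a Minkowski distance, which is exactly why the chapter investigates clustering and classification methods that work from pairwise similarities alone.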


Nature is an important phenomenon of the physical world, encompassing plants, trees, animals, humans, and several other organisms. Among them, plants are the most important organisms in the world: they produce their own food and are notable components of the food chain, and they serve nature and its organisms in countless ways. It is therefore necessary to protect nature efficiently so as to maintain the food chain. Technological development has brought great advances in the field of agriculture. This paper deals with the analysis of various machine learning algorithms applied to a plant data set. Samples of the collected data set are used to train the algorithms, and the results are evaluated to determine the better-performing machine learning algorithm.

