Computational analysis of the digital footprint using machine learning and artificial intelligence

2021 ◽  
Vol 2094 (3) ◽  
pp. 032003
Author(s):  
V D Munister ◽  
A L Zolkin ◽  
V N Malikov ◽  
O V Kosnikova ◽  
I A Poskryakov

Abstract The article discusses the step-by-step formalization of a software model of a system for accounting for and forecasting the effectiveness of employees of an IT enterprise. The issue of controlling social interaction is considered. A relational model from game theory, based on a specific data model template, is proposed. A generalization is proposed in the form of a categorical apparatus of metrics that directly or indirectly affect the assessment of work efficiency, downtime, or incorrect use of a working device (computer). An architectural model of the application is proposed and substantiated, and a model of a system for working with metrics is defined. The implementation of the necessary analysis tools is described as a combination of various machine learning algorithms used in systems with binary classification based on decision making by the operator (the object of analysis). The article summarizes the economic effect of implementing this approach in employee control systems.

Diagnostics ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1263
Author(s):  
Samy Ammari ◽  
Raoul Sallé de Chou ◽  
Tarek Assi ◽  
Mehdi Touat ◽  
Emilie Chouzenoux ◽  
...  

Anti-angiogenic therapy with bevacizumab is a widely used therapeutic option for recurrent glioblastoma (GBM). Nevertheless, the therapeutic response remains highly heterogeneous among GBM patients with discordant outcomes. Recent data have shown that radiomics, an advanced recent imaging analysis method, can help to predict both prognosis and therapy in a multitude of solid tumours. The objective of this study was to identify novel biomarkers, extracted from MRI and clinical data, which could predict overall survival (OS) and progression-free survival (PFS) in GBM patients treated with bevacizumab using machine-learning algorithms. In a cohort of 194 recurrent GBM patients (age range 18–80), radiomics data from pre-treatment T2 FLAIR and gadolinium-injected MRI images along with clinical features were analysed. Binary classification models for OS at 9, 12, and 15 months were evaluated. Our classification models successfully stratified the OS. The AUCs were equal to 0.78, 0.85, and 0.76 on the test sets (0.79, 0.82, and 0.87 on the training sets) for the 9-, 12-, and 15-month endpoints, respectively. Regressions yielded a C-index of 0.64 (0.74) for OS and 0.57 (0.69) for PFS. These results suggest that radiomics could assist in the elaboration of a predictive model for treatment selection in recurrent GBM patients.
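The AUC values reported above measure how well a binary classifier ranks patients who reached an endpoint above those who did not. The following is a minimal sketch of that metric, not the authors' pipeline: it computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    # A pair counts 1 if the positive outranks the negative, 0.5 on a tie.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not taken from the study).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would be preferable for larger datasets.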


2018 ◽  
Vol 7 (2.28) ◽  
pp. 306
Author(s):  
Manu Kohli

For business enterprises, supplier evaluation is a mission-critical process. In ERP (Enterprise Resource Planning) applications such as SAP, supplier evaluation is performed by configuring a linear score model; however, this approach has had limited success. The author therefore proposes a two-stage supplier evaluation model that integrates data from the SAP application with ML algorithms. In the first stage, a data extraction algorithm is applied to the SAP application to build a data model comprising relevant features. In the second stage, each instance in the data model is classified on a rank of 1 to 6, based on supplier performance measurements such as on-time, on-quality, and as-promised-quantity features. Various machine learning algorithms are then applied to a training sample with a multi-class classification objective, so that the algorithms learn the supplier ranking classification. Encouraging results were observed: the learning algorithms decision tree (DT) and Support Vector Machine (SVM) achieved more than 98 percent accuracy on test data sets. The proposed supplier evaluation model can therefore be generalised to any other information management system that manages the Procure to Pay process, not only SAP.


2017 ◽  
Vol 7 (1) ◽  
Author(s):  
Jiamei Liu ◽  
Cheng Xu ◽  
Weifeng Yang ◽  
Yayun Shu ◽  
Weiwei Zheng ◽  
...  

Abstract Binary classification is a widely used formulation for supporting decisions on various biomedical big data questions, such as clinical drug trials comparing treated participants with controls, and genome-wide association studies (GWASs) comparing participants with and without a phenotype. A machine learning model is trained for this purpose by optimizing its power to discriminate samples from the two groups. However, most classification algorithms tend to generate a single locally optimal solution according to the input dataset and the mathematical presumptions about the dataset. Here we demonstrate, from the aspects of both disease classification and feature selection, that multiple different solutions may have similar classification performance. Existing machine learning algorithms may therefore have ignored a school of fish by catching only one good one. Since most existing machine learning algorithms generate a solution by optimizing a mathematical goal, understanding the biological mechanisms behind the investigated classification question may require considering both the generated solution and the ignored ones.


2018 ◽  
Vol 25 (7) ◽  
pp. 855-861 ◽  
Author(s):  
Halil Kilicoglu ◽  
Graciela Rosemblat ◽  
Mario Malički ◽  
Gerben ter Riet

Abstract Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency. Methods: To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM). Results: Annotators had good agreement in labeling limitation sentences (Krippendorff’s α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]). Conclusions: The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.


Author(s):  
Z. Neili ◽  
M. Fezari ◽  
A. Redjati

Breath sound (BS) signals acquired from the human respiratory system with an electronic stethoscope provide prominent information that helps doctors diagnose and classify pulmonary diseases. Unfortunately, BS signals, like other biological signals, are non-stationary because of variations in lung volume, which makes them difficult to analyze and to classify into several diseases. In this study, we compare the ability of the extreme learning machine (ELM) and k-nearest neighbour (k-NN) machine learning algorithms to classify adventitious and normal breath sounds. To do so, empirical mode decomposition (EMD), a method rarely used in breath sound analysis, was used to analyze the BS signals. After the EMD decomposition of the signals into Intrinsic Mode Functions (IMFs), the Hjorth descriptor (Activity) and Permutation Entropy (PE) features were extracted from each IMF and combined for the classification stage. The study found that the combination of features (Activity and PE) yielded accuracies of 90.71% and 95% using ELM and k-NN, respectively, in binary classification (normal and abnormal breath sounds), and 83.57% and 86.42% in multiclass classification (five classes).
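The two features named above have standard definitions: Hjorth activity is the variance of a signal, and permutation entropy is the Shannon entropy of the distribution of ordinal patterns in short sliding windows. The sketch below illustrates both and concatenates them per IMF. It assumes the IMFs have already been produced by an EMD routine (not shown), and the function names, the embedding order of 3, and the delay of 1 are illustrative choices that the abstract does not specify.

```python
import math
from collections import Counter


def hjorth_activity(signal):
    """Hjorth activity: the (population) variance of the signal."""
    n = len(signal)
    mean = sum(signal) / n
    return sum((x - mean) ** 2 for x in signal) / n


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy in bits, from ordinal patterns of length `order`."""
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal is too short for the chosen order and delay")

    counts = Counter()
    for i in range(n_windows):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1

    entropy = -sum(
        (c / n_windows) * math.log2(c / n_windows) for c in counts.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


def imf_features(imfs, order=3):
    """Concatenate [activity, permutation entropy] for each IMF."""
    features = []
    for imf in imfs:
        features.append(hjorth_activity(imf))
        features.append(permutation_entropy(imf, order=order))
    return features
```

The resulting feature vector would then be passed to a classifier such as k-NN; that step is omitted here because the abstract gives no classifier settings.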


2021 ◽  
Author(s):  
Mihai Niculita

Machine learning algorithms are increasingly used in geosciences for the detection or susceptibility modeling of certain landforms or processes. The increased availability of high-resolution data and the growing number of available machine learning algorithms open up the possibility of creating datasets for training models for the automatic detection of specific landforms. In this study, we tested the use of LiDAR DEMs for creating a dataset of labeled images representing shallow single-event landslides, in order to use them for the detection of other events. The R implementation of the keras high-level neural networks API was used to build and test the proposed approach. A 5 m LiDAR DEM was cut into tiles of 25 by 25 pixels; the tiles that overlaid shallow single-event landslides were labeled accordingly, while tiles that did not contain landslides were randomly selected and labeled as non-landslides. The binary classification approach was tested with elevation images in 255 grey levels and shading images in 255 grey levels, with the shading approach giving better results. The presented case study shows the possibility of using machine learning for landslide detection on high-resolution DEMs.
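The tiling step described above can be sketched as follows. This is an illustrative Python sketch, not the study's R code: it cuts a raster, held as a list of rows, into non-overlapping 25 by 25 tiles and rescales each tile to integer grey levels. The function names, the choice to drop incomplete edge tiles, and the per-tile min-max scaling are assumptions, since the abstract does not state how edges or scaling were handled.

```python
def tile_grid(grid, tile_size=25):
    """Split a 2D grid (list of equal-length rows) into square tiles.

    Returns a list of ((row_offset, col_offset), tile) pairs.
    Incomplete tiles at the right and bottom edges are dropped (assumption).
    """
    rows, cols = len(grid), len(grid[0])
    tiles = []
    for r in range(0, rows - tile_size + 1, tile_size):
        for c in range(0, cols - tile_size + 1, tile_size):
            tile = [row[c:c + tile_size] for row in grid[r:r + tile_size]]
            tiles.append(((r, c), tile))
    return tiles


def to_grey_levels(tile, levels=255):
    """Min-max scale one tile to integers in the range 0 .. levels - 1."""
    flat = [v for row in tile for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A perfectly flat tile carries no contrast; map it to all zeros.
        return [[0 for _ in row] for row in tile]
    return [[round((v - lo) / span * (levels - 1)) for v in row] for row in tile]


# Synthetic 50 x 75 "DEM" (hypothetical values) yields 2 x 3 = 6 tiles.
dem = [[r * 75 + c for c in range(75)] for r in range(50)]
tiles = tile_grid(dem)
grey = to_grey_levels(tiles[0][1])
```

Each grey-level tile, paired with a landslide or non-landslide label, would form one training image for the binary classifier.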


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4519
Author(s):  
Livia Petrescu ◽  
Cătălin Petrescu ◽  
Ana Oprea ◽  
Oana Mitruț ◽  
Gabriela Moise ◽  
...  

This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We mapped between the discrete and dimensional emotional information using the participants’ ratings and extracted a substantial set of 40 types of features from the physiological data. These features were the input to various machine learning algorithms (Decision Trees, k-Nearest Neighbors, Support Vector Machine and artificial neural networks), accompanied by dimensionality reduction, feature selection and tuning of the most relevant hyperparameters to boost classification accuracy. Our methodology addressed several issues: resolving dataset imbalance through data augmentation, reducing overfitting, computing various metrics in order to obtain the most reliable classification scores, and applying the Local Interpretable Model-Agnostic Explanations method to interpret and explain predictions in a human-understandable manner. The results show that fear can be predicted very well (accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and Support Vector Machine) by extracting the most relevant features from the physiological data and by searching for the parameters that maximize the machine learning algorithms’ classification scores.


2017 ◽  
Author(s):  
ZhiMin Xiao ◽  
Steve Higgins

Data analysis usually aims to identify a particular signal, such as an intervention effect. Conventional analyses often assume a specific data generation process, which suggests a theoretical model that best fits the data. Machine learning techniques do not make such an assumption. In fact, they encourage multiple models to compete on the same data. Applying logistic regression and machine learning algorithms to real and simulated datasets with different features of noise and signal, we demonstrate that no single model dominates others under all circumstances. By showing when different models shine or struggle, we argue it is both possible and important to conduct comparative analyses.

