Intelligent programming support system: machine learning feat. fast development of secure programs

Author(s):  
N.E. Romanov ◽  
K.E. Izrailov ◽  
V.V. Pokussov

The article addresses the field of software development. The scientific contradiction under consideration is that, on the one hand, a programmer's manual labor is necessary in this area and, on the other hand, the human factor negatively affects the security of the resulting code. To resolve this contradiction, the authors propose using machine learning, which is traditionally applied to classification, regression, anomaly detection, clustering, generalization, and association search. It is shown that most publications on this approach address particular cases and do not cover the full spectrum of possibilities. Various ways of automating the programming process by solving these machine learning problems are considered and substantiated. The demand for a system that combines such methods is noted, and the authors' own definition of it is introduced for the first time: «Intelligent Programming Support System – a computer automated system based on artificial intelligence technologies, the purpose of which is to help developers of program code in the interests of reducing and simplifying manual labor, as well as increasing the security of the final product». A comparative analysis of machine-learning-based automation methods is given according to 8 criteria that such an intelligent system must meet. Directions for further research are indicated.

2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Lin Lin ◽  
Xiufang Liang

An online English teaching system places certain requirements on its intelligent scoring component, and the most difficult stage of intelligent scoring in an English test is scoring the composition with an intelligent model. To improve the intelligence of English composition scoring, this study combines machine learning algorithms with intelligent image recognition technology and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, to verify whether the proposed model meets the requirements of the group text, that is, to verify the feasibility of the algorithm, the performance of the model is analyzed through designed experiments. Moreover, the basic conditions for composition scoring are input into the model as constraints. The results show that the proposed algorithm has a certain practical effect and can be applied to English assessment systems and to online homework evaluation systems.


2020 ◽  
pp. 1-17
Author(s):  
Francisco Javier Balea-Fernandez ◽  
Beatriz Martinez-Vega ◽  
Samuel Ortega ◽  
Himar Fabelo ◽  
Raquel Leon ◽  
...  

Background: Sociodemographic data indicate a progressive increase in life expectancy and in the prevalence of Alzheimer’s disease (AD). AD has emerged as one of the greatest public health problems. Its etiology is twofold: non-modifiable factors on the one hand and modifiable factors on the other. Objective: This study aims to develop a processing framework based on machine learning (ML) and optimization algorithms to study sociodemographic, clinical, and analytical variables, selecting the best combination among them for accurate discrimination between controls and subjects with major neurocognitive disorder (MNCD). Methods: This research is based on an observational-analytical design. Two research groups were established: an MNCD group (n = 46) and a control group (n = 38). ML and optimization algorithms were employed to automatically diagnose MNCD. Results: Twelve out of 37 variables were identified in the validation set as the most relevant for MNCD diagnosis. A sensitivity of 100% and a specificity of 71% were achieved using a Random Forest classifier. Conclusion: ML is a potential tool for automatic prediction of MNCD that can be applied to relatively small preclinical and clinical data sets. These results can be interpreted as supporting the influence of the environment on the development of AD.
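The two metrics reported in the Results can be made concrete with a small sketch. It assumes binary labels (1 for MNCD, 0 for control); the function name and the label vectors are hypothetical illustrations, not the study's code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = MNCD, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for illustration only (not the study's data).
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1]
print(sensitivity_specificity(y_true, y_pred))  # (1.0, 0.5)
```

A sensitivity of 1.0 means no true case was missed; specificity below 1.0 means some controls were flagged as cases.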


2021 ◽  
Vol 22 (S3) ◽  
Author(s):  
Junyi Li ◽  
Huinian Li ◽  
Xiao Ye ◽  
Li Zhang ◽  
Qingzhe Xu ◽  
...  

Background: The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as more and more evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of biomedical big data, in addition to the prediction of lncRNAs by biological experimental methods, many computational methods based on machine learning have been proposed to make better use of the sequence resources of lncRNAs. Results: We developed an lncRNA prediction method by integrating information-entropy-based features and machine learning algorithms. We calculate generalized topological entropy and generate 6 novel features for lncRNA sequences. Using these 6 features together with other features such as open reading frame, we apply support vector machine, XGBoost, and random forest algorithms to distinguish human lncRNAs. We compare our method with one that uses more k-mer features, and the results show that our method achieves a higher area under the curve, up to 99.7905%. Conclusions: We develop an accurate and efficient method that uses novel information entropy features to analyze and classify lncRNAs. Our method can also be extended to research on other functional elements in DNA sequences.
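To illustrate what an entropy-based sequence feature looks like, here is a minimal sketch that computes the plain Shannon entropy of a sequence's k-mer distribution. This is a simpler stand-in chosen for illustration, not the generalized topological entropy used in the paper; the function name and example sequences are assumptions.

```python
from collections import Counter
from math import log2


def kmer_shannon_entropy(seq, k):
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq.

    Illustrative stand-in only: this is NOT the generalized topological
    entropy described in the abstract.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return sum(-(c / total) * log2(c / total) for c in counts.values())


# Hypothetical toy sequences: a uniform base distribution gives the maximum for k = 1.
print(kmer_shannon_entropy("ACGT", 1))  # 2.0
print(kmer_shannon_entropy("AAAA", 1))  # 0.0
```

A feature like this would typically be computed for several values of k and passed to a classifier together with other sequence features.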


2021 ◽  
Vol 13 (3) ◽  
pp. 408
Author(s):  
Charles Nickmilder ◽  
Anthony Tedde ◽  
Isabelle Dufrasne ◽  
Françoise Lessire ◽  
Bernard Tychon ◽  
...  

Accurate information about the available standing biomass on pastures is critical for the adequate management of grazing and its promotion to farmers. In this paper, machine learning models are developed to predict available biomass expressed as compressed sward height (CSH) from readily accessible meteorological, optical (Sentinel-2) and radar satellite data (Sentinel-1). This study assumed that combining heterogeneous data sources, data transformations and machine learning methods would improve the robustness and the accuracy of the developed models. A total of 72,795 records of CSH with a spatial positioning, collected in 2018 and 2019, were used and aggregated according to a pixel-like pattern. The resulting dataset was split into a training one with 11,625 pixellated records and an independent validation one with 4952 pixellated records. The models were trained with a 19-fold cross-validation. A wide range of performances was observed (with mean root mean square error (RMSE) of cross-validation ranging from 22.84 mm of CSH to infinite-like values), and the four best-performing models were a cubist, a glmnet, a neural network and a random forest. These models had an RMSE of independent validation lower than 20 mm of CSH at the pixel-level. To simulate the behavior of the model in a decision support system, performances at the paddock level were also studied. These were computed according to two scenarios: either the predictions were made at a sub-parcel level and then aggregated, or the data were aggregated at the parcel level and the predictions were made for these aggregated data. The results obtained in this study were more accurate than those found in the literature concerning pasture budgeting and grassland biomass evaluation. The training of the 124 models resulting from the described framework was part of the realization of a decision support system to help farmers in their daily decision making.
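The RMSE metric and the two paddock-level scenarios described above can be sketched as follows. The sketch assumes a single scalar feature per pixel and a hypothetical nonlinear `toy_model`; neither reflects the study's actual features or models.

```python
from math import sqrt
from statistics import mean


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    return sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def predict_then_aggregate(model, pixel_features):
    """Scenario 1: predict at sub-parcel (pixel) level, then average the predictions."""
    return mean(model(x) for x in pixel_features)


def aggregate_then_predict(model, pixel_features):
    """Scenario 2: average the inputs at parcel level, then predict once."""
    return model(mean(pixel_features))


def toy_model(x):
    """Hypothetical nonlinear model, used only to show that the scenarios can differ."""
    return x ** 2


pixels = [1.0, 3.0]  # hypothetical per-pixel feature values
print(predict_then_aggregate(toy_model, pixels))  # 5.0
print(aggregate_then_predict(toy_model, pixels))  # 4.0
```

For a linear model the two scenarios give the same answer. For nonlinear models, such as the neural network or random forest mentioned above, they generally do not, which is one reason to evaluate both.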


Author(s):  
Zuoshan Li

With the continuous progress of society, the country's level of science and technology has leapt forward, and research on new technologies across industries continues to deepen, greatly promoting their adoption. At the same time, as social pressure increases, more and more people seek mental relaxation, and appropriate leisure and entertainment activities have gradually become part of people’s lives. Film plays an irreplaceable role in leisure and entertainment. Against the background of the film industry's development toward intelligent production, this work uses machine learning technology to study film animation production and the analysis of film virtual assets. Based on Internet of Things technology, it also develops ways and methods for the visual expression of movies and introduces new expression modes to improve the expressive effect of the intelligent system. Finally, a comparison of various machine learning algorithms shows that the random forest algorithm gives more accurate intelligent expression results. The system is also applied to 3D animation production to observe the measurement error of 3D motion data and facial expression data.


2021 ◽  
Vol 11 (13) ◽  
pp. 6237
Author(s):  
Azharul Islam ◽  
KyungHi Chang

Unstructured data from the internet constitute large sources of information, which need to be formatted in a user-friendly way. This research develops a model that classifies unstructured data from data mining into labeled data, and builds an informational and decision-making support system (DMSS). We often have assortments of information collected by mining data from various sources, where the key challenge is to extract valuable information. We observe substantial classification accuracy enhancement for our datasets with both machine learning and deep learning algorithms. The highest classification accuracy (99% in training, 96% in testing) was achieved on a Covid corpus processed with a long short-term memory (LSTM) network. Furthermore, we conducted tests on large datasets relevant to the Disaster corpus, with an LSTM classification accuracy of 98%. In addition, random forest (RF), a machine learning algorithm, provides a reasonable 84% accuracy. This research’s main objective is to increase the application’s robustness by integrating intelligence into the developed DMSS, which provides insight into the user’s intent despite dealing with a noisy dataset. Our designed model selects on the F1 scores of the random forest and stochastic gradient descent (SGD) algorithms, where the RF method performs better, improving accuracy by 2 percentage points (from 81% to 83%) compared with a conventional method.
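Since the abstract selects between algorithms by F1 score, a short sketch of how F1 is computed for one class may help. The function name, labels, and predictions are hypothetical and are not drawn from the Covid or Disaster corpora.

```python
def f1_score(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only.
y_true = ["covid", "covid", "other", "other"]
y_pred = ["covid", "other", "covid", "other"]
print(f1_score(y_true, y_pred, positive="covid"))  # 0.5
```

Comparing two classifiers then amounts to computing this score for each on the same held-out data and keeping the higher one.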


2018 ◽  
Vol 25 (11) ◽  
pp. 1481-1487 ◽  
Author(s):  
Vivek Kumar Singh ◽  
Utkarsh Shrivastava ◽  
Lina Bouayad ◽  
Balaji Padmanabhan ◽  
Anna Ialynytchev ◽  
...  

Objective: To develop an approach, One-class-at-a-time, for triaging psychiatric patients using machine learning on textual patient records. The approach aims to automate the triaging process and reduce expert effort while providing high classification reliability. Materials and Methods: The One-class-at-a-time approach is a multistage cascading classification technique that achieves higher triage classification accuracy than traditional multiclass classifiers by 1) classifying one class at a time (or stage), and 2) identifying and applying the highest-accuracy classifier at each stage. The approach was evaluated using a unique dataset of 433 psychiatric patient records with a triage class label provided by the “I2B2 challenge,” a recent competition in the medical informatics community. Results: The One-class-at-a-time cascading classifier outperformed state-of-the-art classification techniques with an overall classification accuracy of 77% among 4 classes, exceeding the accuracies of existing multiclass classifiers. The approach also enabled highly accurate classification of individual classes: severe and mild with 85% accuracy, moderate with 64% accuracy, and absent with 60% accuracy. Discussion: The triaging of psychiatric cases is a challenging problem due to the lack of clear guidelines and protocols. Our work presents a machine learning approach that uses psychiatric records to triage patients based on the severity of their condition. Conclusion: The One-class-at-a-time cascading classifier can be used as a decision aid to reduce the triaging effort of physicians and nurses, while providing a unique opportunity to involve experts at each stage to reduce false positives and further improve the system’s accuracy.
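The cascading idea described in Materials and Methods can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the class order, the keyword-based candidate classifiers, the toy records, and the choice to drop already-handled classes from later stages are all hypothetical.

```python
def accuracy(clf, records, labels, target):
    """Share of records where clf's yes/no answer matches 'label == target'."""
    if not records:
        return 0.0
    hits = sum(1 for r, y in zip(records, labels) if clf(r) == (y == target))
    return hits / len(records)


def build_cascade(class_order, candidates, records, labels):
    """Pick, per stage, the candidate with the highest binary accuracy.

    Assumption: records of a class handled at an earlier stage are removed
    before later stages are evaluated. The last class is the fallback.
    """
    stages = []
    remaining = list(zip(records, labels))
    for target in class_order[:-1]:
        recs = [r for r, _ in remaining]
        labs = [y for _, y in remaining]
        best = max(candidates[target],
                   key=lambda clf: accuracy(clf, recs, labs, target))
        stages.append((target, best))
        remaining = [(r, y) for r, y in remaining if y != target]
    return stages, class_order[-1]


def cascade_predict(stages, default, record):
    """Walk the stages in order; the first stage that says yes decides the class."""
    for target, clf in stages:
        if clf(record):
            return target
    return default


# Hypothetical toy data and keyword rules standing in for trained classifiers.
class_order = ["severe", "moderate", "absent"]
candidates = {
    "severe": [lambda t: "suicidal" in t, lambda t: len(t) > 1000],
    "moderate": [lambda t: "anxiety" in t, lambda t: "the" in t],
}
records = ["suicidal ideation reported", "mild anxiety noted", "no symptoms"]
labels = ["severe", "moderate", "absent"]

stages, default = build_cascade(class_order, candidates, records, labels)
print(cascade_predict(stages, default, "patient reports anxiety"))  # moderate
```

Because each stage makes a single yes/no decision, an expert can review or override one stage without touching the others, which is the kind of involvement the Conclusion mentions.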

