Identification of Processor’s Architecture of Executable Code Based on Machine Learning. Part 1. Frequency Byte Model

2020 ◽  
Vol 6 (1) ◽  
pp. 77-85 ◽  
Author(s):  
M. Buinevich ◽  
K. Izrailov

This article presents the results of a study of a machine-learning method for identifying the processor architecture of executable code. The first part of the series reviews existing solutions for identifying machine code and formulates the assumption underlying a new method. The authors examine the features of machine code instructions and build a frequency-byte model of the code. A scheme for identifying the processor architecture is proposed on the basis of this model. Frequency signatures are given for the following top 10 processor architectures: amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x.
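One way to picture a frequency-byte model is as a 256-element vector of relative byte frequencies, compared against a stored signature per architecture. The sketch below is only an illustration under stated assumptions: the function names (`byte_frequencies`, `build_signature`, `identify`), the averaging of sample vectors, and the nearest-signature rule with Euclidean distance are all hypothetical and stand in for whatever machine-learning classifier the article actually uses. The input byte strings are toy data, not real machine code.

```python
from collections import Counter
import math


def byte_frequencies(code: bytes) -> list:
    """Relative frequency of each of the 256 possible byte values."""
    if not code:
        return [0.0] * 256
    counts = Counter(code)
    return [counts[b] / len(code) for b in range(256)]


def build_signature(samples: list) -> list:
    """Average the frequency vectors of several samples (hypothetical helper)."""
    vectors = [byte_frequencies(s) for s in samples]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def identify(code: bytes, signatures: dict) -> str:
    """Return the label whose signature is closest in Euclidean distance.

    This nearest-signature rule is an assumption made for illustration,
    not the classifier described in the article.
    """
    freq = byte_frequencies(code)
    return min(signatures, key=lambda name: math.dist(freq, signatures[name]))


# Toy byte strings, not real machine code; the labels are placeholders.
signatures = {
    "arch_a": build_signature([b"\x00\x01\x00\x01", b"\x00\x00\x01"]),
    "arch_b": build_signature([b"\xff\xfe\xff", b"\xfe\xfe\xff\xff"]),
}
print(identify(b"\x01\x00\x00\x01\x00", signatures))
```

In practice the signatures would presumably be built from the code sections of many real executables per architecture, which this sketch does not attempt.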

2020 ◽  
Vol 6 (2) ◽  
pp. 104-112 ◽  
Author(s):  
M. Buinevich ◽  
K. Izrailov

This article presents the results of a study of a machine-learning method for identifying the processor architecture of executable code. In the second part of the series, a three-stage scheme of the method and the corresponding software tool are synthesized. The functional and information layers of the tool's architecture, as well as its operating modes, are described. Basic testing of the tool is carried out and its results are given. The efficiency of the proposed method and tool is substantiated using the identification of files containing machine code of various architectures as an example.


2020 ◽  
Vol 6 (3) ◽  
pp. 48-57
Author(s):  
M. Buinevich ◽  
K. Izrailov

The article presents the results of testing the authors' machine-learning method for identifying the processor architecture of executable code. The third and final part of the series determines its quality indicators: accuracy, completeness and F-measure for the executable files of the Debian build. The applicability limits of the method are investigated under four conditions: absence of the file header, different sizes of machine code, partial destruction of the code, and the presence of instructions from several architectures. The identified disadvantages of the proposed method and ways to eliminate them are presented, along with directions for its further development.
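The quality indicators named above can be illustrated with a small per-class calculation. This sketch assumes that "accuracy" and "completeness" correspond to what is usually called precision and recall, which the abstract does not confirm, and that the F-measure is their harmonic mean (F1). The function name `per_class_metrics` and the labelled data are hypothetical.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treating it as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels for four files; not results from the article.
y_true = ["amd64", "amd64", "mips", "mips"]
y_pred = ["amd64", "mips", "mips", "mips"]
print(per_class_metrics(y_true, y_pred, "mips"))
```

Computing the metrics per architecture, rather than as one overall figure, makes it visible when one class absorbs the misclassified files of another.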


Author(s):  
Yu Shao ◽  
Xinyue Wang ◽  
Wenjie Song ◽  
Sobia Ilyas ◽  
Haibo Guo ◽  
...  

With the aging of the population in modern society, falls and fall-induced injuries among elderly people have become a major public health problem. This study proposes a classification framework that uses floor vibrations to detect fall events and to distinguish different fall postures. A scaled 3D-printed model with twelve fully adjustable joints that can simulate human body movement was built to generate human fall data. The mass proportions of the human body were carefully studied and reflected in the model. Object-drop and human-fall tests were carried out, and the vibration signatures generated in the floor were recorded for analysis. Machine learning algorithms, including the K-means algorithm and the K-nearest neighbor algorithm, were used in the classification process. Three classifiers (human walking versus human fall, human fall versus object drop, and human falls from different postures) were developed in this study. Results showed that the three proposed classifiers can achieve accuracies of 100, 85, and 91%. This paper develops a framework that uses floor vibration to build a machine-learning-based pattern recognition system for detecting human falls.
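As a rough illustration of the K-nearest neighbor step only, the sketch below classifies a vibration event by majority vote among its closest training examples. The two features (a normalized peak amplitude and a normalized duration), the value `k=3`, and all data points are assumptions made up for the example; the abstract does not say which features were extracted from the floor vibration signals.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to the query.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (peak amplitude, duration) features; not data from the study.
train = [
    ((0.10, 0.20), "walk"),
    ((0.20, 0.10), "walk"),
    ((0.15, 0.15), "walk"),
    ((0.90, 0.80), "fall"),
    ((0.80, 0.90), "fall"),
    ((0.85, 0.85), "fall"),
]
print(knn_predict(train, (0.88, 0.90)))
```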


Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, causing economic and environmental loss. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, these results focused on specific perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters such as powerlines, terrain, and vegetation from different perspectives, which improved the spatial and temporal accuracy of wildfire risk prediction, including fire ignition. The combined model uses the spatial and temporal parameters as a single combined dataset to train and predict the fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiment shows that the combined model produced better accuracy than the ensemble of random forest models on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved a cutting-edge accuracy of 92% in predicting wildfire risk, including ignition, by using regional spatial and temporal data along with standard data parameters in Northern California.


2021 ◽  
Vol 13 (14) ◽  
pp. 2848
Author(s):  
Hao Sun ◽  
Qian Xu

Obtaining large-scale, long-term, and spatially continuous soil moisture (SM) data is crucial for climate change research, hydrology, water resource management, and related fields. ESA CCI SM is such a large-scale, long-term SM product (covering more than 40 years to date). However, it contains data gaps, especially over China, owing to limitations in the remote sensing of SM such as complex topography, human-induced radio frequency interference (RFI), and vegetation disturbances. These gaps prevent the CCI SM data from achieving spatial continuity, which motivates the study of gap-filling methods. To develop suitable methods for filling the gaps of CCI SM over the whole of China, we compared typical Machine Learning (ML) methods, including Random Forest (RF), Feedforward Neural Network (FNN), and Generalized Linear Model (GLM), with a geostatistical method, Ordinary Kriging (OK). More than 30 years of passive–active combined CCI SM from 1982 to 2018 and other biophysical variables such as Normalized Difference Vegetation Index (NDVI), precipitation, air temperature, Digital Elevation Model (DEM), soil type, and in situ SM from the International Soil Moisture Network (ISMN) were used. Results indicated that: 1) data gaps in CCI SM are frequent in China and are found not only in cold seasons and areas but also in warm seasons and areas. The ratio of gap pixels to total pixels can exceed 80%, and its average is around 40%. 2) ML methods can fill all the gaps of CCI SM. Among the ML methods, RF performed best in fitting the relationship between CCI SM and biophysical variables. 3) Over simulated gap areas, RF performed comparably to OK, and both greatly outperformed the FNN and GLM methods. 4) Over in situ SM networks, RF performed better than the OK method. 5) We also explored various strategies for gap-filling CCI SM. Results demonstrated that the strategy of constructing a monthly model, with one RF simulating monthly average SM and another RF simulating monthly SM disturbance, achieved the best performance. This strategy, combined with an ML method such as RF, is suggested for filling the gaps of CCI SM in China.


Author(s):  
V. I. Khirkhasova ◽  

The paper deals with the modification of cement composite and concrete with nanocellulose at low and high density. The author presents the results of a study of the influence of nanocellulose on the hardening of the cement composite and on the physical and mechanical properties of heavy concrete. The influence of the additive on the rheological and strength characteristics of concrete is revealed. A new method is proposed to improve the material performance.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Haoran Zhu ◽  
Lei Lei

Purpose: Previous research on the automatic extraction of research topics mostly used rule-based or topic modeling methods, which were limited by their restricted rule sets, poor interpretability and heavy dependence on human judgment. This study aims to address these issues by proposing a new method that integrates machine learning models with linguistic features for the identification of research topics. Design/methodology/approach: First, dependency relations were used to extract noun phrases from research article texts. Second, the extracted noun phrases were classified into topics and non-topics via machine learning models and linguistic and bibliometric features. Lastly, a trend analysis was performed to identify hot research topics, i.e. topics with increasing popularity. Findings: The new method was tested on a large dataset of COVID-19 research articles and achieved satisfactory results in terms of F-measure, accuracy and AUC values. Hot topics of COVID-19 research were also detected based on the classification results. Originality/value: This study demonstrates that information retrieval methods can help researchers gain a better understanding of the latest trends in both COVID-19 and other research areas. The findings are significant to both researchers and policymakers.
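The trend-analysis step, identifying topics with increasing popularity, could be sketched as fitting a straight line to each topic's counts over consecutive time periods and keeping topics with a positive slope. This is an assumption about how such an analysis might be done, not the procedure reported in the study; the function names, the least-squares slope criterion, and the counts are all hypothetical.

```python
def trend_slope(counts):
    """Least-squares slope of counts against their period index 0, 1, 2, ..."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def hot_topics(topic_counts):
    """Topics whose counts trend upward, in alphabetical order."""
    return sorted(t for t, c in topic_counts.items() if trend_slope(c) > 0)


# Made-up counts per period for placeholder topics.
topic_counts = {
    "topic_a": [1, 2, 4, 8],
    "topic_b": [9, 7, 4, 1],
    "topic_c": [3, 3, 3, 3],
}
print(hot_topics(topic_counts))
```

A slope threshold of zero is the simplest possible cut-off; a real analysis would likely also test whether the trend is statistically significant.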

