Machine learning-based software classification scheme for efficient program similarity analysis

Computing good specifications and invariants is key to effective and efficient program verification. In this talk, I will describe our experiences in using machine learning techniques (Bayesian inference, SVMs) for computing specifications and invariants useful for program verification. The first project, Merlin, uses Bayesian inference to automatically infer security specifications of programs. A novel feature of Merlin is that it can infer specifications even when the code under analysis gives rise to conflicting constraints, a situation that typically occurs when there are bugs. We have used Merlin to infer security specifications of 10 large business-critical web applications. Furthermore, we show that these specifications can be used to detect new information flow security vulnerabilities in these applications. In the second project, Interpol, we show how interpolants can be viewed as classifiers in supervised machine learning. This view has several advantages. First, we are able to use off-the-shelf classification techniques, in particular support vector machines (SVMs), for interpolation. Second, we show that SVMs can find relevant predicates for a number of benchmarks. Since classification algorithms are predictive, the interpolants computed via classification are likely to be relevant predicates or invariants. Finally, the machine learning view also enables us to handle superficial non-linearities. Even if the underlying problem structure is linear, the symbolic constraints can give the impression that we are solving a non-linear problem. Since learning algorithms try to mine the underlying structure directly, we can discover the linear structure for such problems. We demonstrate the feasibility of Interpol via experiments over benchmarks from various papers on program verification.
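The idea of treating an interpolant as a classifier can be illustrated with a small sketch. It is not the Interpol implementation: to stay within the Python standard library it substitutes a simple perceptron for the SVM named in the abstract, and the sample program states (GOOD, BAD) and the function names are hypothetical. The learned half-space separates states observed on correct runs from states that violate the property, so it serves as a candidate linear predicate.

```python
# Minimal sketch: learn a linear predicate that separates "good" program
# states from "bad" ones. A perceptron stands in for the SVM mentioned in
# the abstract; all data and names here are illustrative assumptions.

def train_perceptron(good, bad, max_epochs=1000):
    """Return (w, b) such that w.x + b > 0 on good states and < 0 on bad ones."""
    samples = [(p, 1) for p in good] + [(p, -1) for p in bad]
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in samples:
            # A sample is misclassified when it lies on the wrong side
            # of (or exactly on) the current separating line.
            if label * (w[0] * x + w[1] * y + b) <= 0:
                w[0] += label * x
                w[1] += label * y
                b += label
                mistakes += 1
        if mistakes == 0:
            return w, b
    raise ValueError("no linear separator found within max_epochs")


def classify(w, b, point):
    """Return 1 if the point satisfies the learned predicate, else -1."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else -1


# Hypothetical states (x, y): the good ones satisfy y = x + 1,
# the bad ones satisfy y < x.
GOOD = [(0, 1), (1, 2), (2, 3), (3, 4)]
BAD = [(1, 0), (2, 1), (3, 1), (4, 2)]

w, b = train_perceptron(GOOD, BAD)
print(f"candidate predicate: {w[0]}*x + {w[1]}*y + {b} > 0")
```

A max-margin method such as an SVM would pick a separator that sits as far as possible from both sets of states, which is one reason the abstract expects the resulting predicates to generalize; the perceptron above only guarantees some separator for linearly separable data.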


Computational timbre and tonal system similarity analysis of the music of Northern Myanmar-based Kachin compared to Xinjiang-based Uyghur ethnic groups

10.31235/osf.io/83hpr ◽

2021 ◽

Author(s):

Rolf Bader ◽

Michael Blaß ◽

Jonas Franke

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Standard Deviation ◽

Ethnic Groups ◽

Western China ◽

Similarity Analysis ◽

Self Organizing Map ◽

Musical Piece ◽

Sound Recordings ◽

Spectral Centroid

The music of the Kachin ethnic group of Northern Myanmar is compared to the Uyghur music of Xinjiang in western China, using timbre and pitch feature extraction and machine learning. Although the two regions are separated by Tibet, the muqam tradition of Xinjiang might be found in Kachin music, as suggested by myths of Kachin origin and by linguistic similarities, e.g., the Kachin term 'makan' for a musical piece. Extractions were performed using the apollon and COMSAR (Computational Music and Sound Archiving) frameworks, on which the Ethnographic Sound Recordings Archive (ESRA) is based, using ethnographic recordings from ESRA alongside additional pieces. In terms of pitch, tonal systems were compared using a Kohonen self-organizing map (SOM), which clearly clusters Kachin and Uyghur musical pieces. This is mainly caused by the Xinjiang muqam music showing just fifths and fourths, while Kachin pieces tend to have a higher fifth and fourth, next to other dissimilarities. Also, the timbre features of spectral centroid and spectral sharpness standard deviation clearly distinguish Uyghur from Kachin pieces, with Uyghur music showing much larger deviations. Although more features, such as rhythm or melody, will be compared in the future, these already strong findings might introduce an alternative methodology for comparing ethnic groups beyond traditional linguistic definitions.
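As a rough illustration of the timbre feature mentioned above, the sketch below computes the spectral centroid of short audio frames and the standard deviation of that centroid across frames. It does not use the apollon or COMSAR APIs; the function names, the naive DFT, and the synthetic sine frames are assumptions made only to keep the example self-contained.

```python
import cmath
import math
import statistics


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative frequency bins, via a naive DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


def centroid_std(frames, sample_rate):
    """Population standard deviation of the centroid across frames."""
    return statistics.pstdev([spectral_centroid(f, sample_rate) for f in frames])


def sine_frame(freq, n=64, sample_rate=64):
    """Synthetic test frame: a pure sine at `freq` Hz (illustrative only)."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


steady = [sine_frame(8)] * 3               # same centroid in every frame
varying = [sine_frame(4), sine_frame(16)]  # centroid moves between frames
print(centroid_std(steady, 64), centroid_std(varying, 64))
```

A piece whose brightness stays constant yields a centroid deviation near zero, while a piece whose brightness shifts between frames yields a larger one, which is the kind of contrast the abstract reports between the two repertoires.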


A machine-learning phase classification scheme for anomaly detection in signals with periodic characteristics

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-019-0619-3 ◽

2019 ◽

Vol 2019 (1) ◽

Author(s):

Lia Ahrens ◽

Julian Ahrens ◽

Hans D. Schotten

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Classification Scheme ◽

Learning Phase ◽

Phase Classification


Machine Learning-Based Jamming Classification Scheme for Real GPS L1 C/A Signal

The Journal of Korean Institute of Communications and Information Sciences ◽

10.7840/kics.2021.46.11.1804 ◽

2021 ◽

Vol 46 (11) ◽

pp. 1804-1806

Author(s):

Seungsoo Yoo ◽

Cheon Sig Sin ◽

Sun Yong Kim

Keyword(s):

Machine Learning ◽

Classification Scheme


Developing a Novel Machine Learning-Based Classification Scheme for Predicting SPCs in Breast Cancer Survivors

Frontiers in Genetics ◽

10.3389/fgene.2019.00848 ◽

2019 ◽

Vol 10 ◽

Cited By ~ 5

Author(s):

Chi-Chang Chang ◽

Ssu-Han Chen

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Survivors ◽

Breast Cancer Survivors ◽

Classification Scheme


Benthic Habitat Mapping Model and Cross Validation Using Machine-Learning Classification Algorithms

Remote Sensing ◽

10.3390/rs11111279 ◽

2019 ◽

Vol 11 (11) ◽

pp. 1279 ◽

Cited By ~ 8

Author(s):

Pramaditya Wicaksono ◽

Prama Ardha Aryaguna ◽

Wahyu Lazuardi

Keyword(s):

Machine Learning ◽

Classification Scheme ◽

Learning Algorithm ◽

Classification Tree ◽

Habitat Mapping ◽

Support Vector ◽

Benthic Habitat ◽

Classification Algorithms ◽

Machine Learning Classification ◽

Mapping Model

This research aimed to develop a benthic habitat mapping model using machine-learning classification algorithms and to test the applicability of the model in different areas. We integrated in situ benthic habitat data and image processing of a WorldView-2 (WV2) image to parameterise the machine-learning algorithms, namely Random Forest (RF), Classification Tree Analysis (CTA), and Support Vector Machine (SVM). The classification inputs were sunglint-free bands, water-column-corrected bands, Principal Component (PC) bands, bathymetry, and the slope of the underwater topography. Kemujan Island was used in developing the model, while Karimunjawa, Menjangan Besar, and Menjangan Kecil Islands served as test areas. The results indicated that RF was more accurate than the other classification algorithms, based on the statistics and the spatial distribution of benthic habitats. The maximum accuracy of RF was 94.17% (4 classes) and 88.54% (14 classes). The accuracies from RF, CTA, and SVM were consistent across different input bands for each classification scheme. Applying the RF model to classify benthic habitats in other areas showed that a more general classification scheme is preferable, as it avoids several issues arising from benthic habitat variations. The results also established the possibility of mapping benthic habitats without the use of training areas.
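The relationship between the number of classes and the reported accuracy can be sketched with a confusion matrix. The example below is not taken from the paper: the matrix values, the class grouping, and the function names are made up for illustration. It computes overall accuracy and shows how merging detailed classes into a more general scheme removes confusions that occur between closely related classes.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def merge_classes(matrix, groups):
    """Collapse a confusion matrix by summing the rows and columns in each group."""
    return [
        [sum(matrix[r][c] for r in rows for c in cols) for cols in groups]
        for rows in groups
    ]


# Hypothetical 4-class confusion matrix (rows: reference, columns: predicted).
detailed = [
    [20, 5, 0, 0],
    [4, 21, 0, 0],
    [0, 1, 22, 2],
    [0, 0, 3, 22],
]

# Hypothetical general scheme: classes 0 and 1 merged, classes 2 and 3 merged.
general = merge_classes(detailed, [[0, 1], [2, 3]])

print(overall_accuracy(detailed), overall_accuracy(general))
```

In this made-up case most errors fall between classes that belong to the same general group, so the merged scheme scores higher, consistent in spirit with the abstract's recommendation to prefer the more general scheme when transferring the model to other areas.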
