semantic code

2022 ◽  
Vol 31 (2) ◽  
pp. 1-34
Author(s):  
Patrick Keller ◽  
Abdoul Kader Kaboré ◽  
Laura Plein ◽  
Jacques Klein ◽  
Yves Le Traon ◽  
...  

Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is to produce code embeddings that capture as much of a program's semantics as possible. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM ("What You See Is What It Means") approach, in which visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction in source code and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). We show with experiments on BigCloneBench (Java) and Open Judge (C) that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show with data from NVD and SARD that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representation or classification algorithm, and discuss the promises and limitations of this research direction.
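
The idea of treating source code as an image can be illustrated with a toy sketch. The abstract does not say how WySiWiM renders code, so the function below is only an assumption for illustration: it maps source text to a fixed-size grid of pixel intensities using character codes. The name `code_to_grid` and the grid dimensions are hypothetical, and the step of feeding the image to a pre-trained network is omitted.

```python
from typing import List


def code_to_grid(source: str, height: int = 8, width: int = 16) -> List[List[int]]:
    """Map source text to a height x width grid of intensities in 0..255.

    Toy stand-in for a visual rendering of code (an assumption, not the
    rendering used by WySiWiM): each character becomes one pixel.
    """
    lines = source.splitlines()[:height]
    grid = []
    for row in range(height):
        text = lines[row] if row < len(lines) else ""
        # Expand tabs, then crop or pad so that every row has the same width.
        text = text.expandtabs(4)[:width].ljust(width)
        # Whitespace is background (0); other characters use their code point.
        grid.append([0 if ch == " " else min(ord(ch), 255) for ch in text])
    return grid


snippet = "int x;\n  return x;"
image = code_to_grid(snippet, height=4, width=8)
```

Indentation and line length survive in such a grid as visible layout, which is the kind of visual pattern the abstract alludes to.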


2021 ◽  
Vol 16 (2) ◽  
pp. 117-123
Author(s):  
Hasbullah Hasbullah ◽  
I Wayan Mudra ◽  
I Wayan Swandi

This study aims to analyze the meaning of the aesthetic code in the animated film "Si Uma." The film merits investigation because it shows the beauty of Balinese culture and has a unique character shape. The visualization of Balinese culture in this animated film carries a meaningful message about life in the universe, which is conveyed in the form of codes. The research question is: what is the meaning of the Balinese aesthetic code represented in the animated film "Si Uma"? The research method is qualitative, with data collected through observation, interviews, and documentation. Data sources were determined using the snowball sampling technique and consisted of interviews with Ida Bagus Surya Manuba, I Nyoman Suci Rasika, and Gede Pasek Putra Adnyana Yasa. Data analysis was carried out through data reduction, presentation, and conclusion drawing, based on Ferdinand de Saussure's semiotic theory and Roland Barthes' postmodern aesthetic code. The results indicate that the aesthetic code in the animated film "Si Uma" takes two forms: the semantic code is associated with the sign of a black-and-white patterned fabric (poleng), which carries the meaning of balance of life in the universe; the cultural code is associated with a headband (udeng), which carries the meaning of concentrating one's thoughts or views in every activity in Bali. In conclusion, the meaning of the aesthetic code in the animated film "Si Uma" can be seen from two perspectives: the semantic code, in the form of poleng cloth, is ideological and carries the meaning of balance in life in the universe; the cultural code, in the form of a headband (udeng), is connotative and signifies focused attention or thought, aesthetics, and cultural identity in worship and daily activities.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Guang Zhang ◽  
Yanwei Ren ◽  
Xiaoming Xi ◽  
Delin Li ◽  
Jie Guo ◽  
...  

Purpose: This study proposes a novel Local Reference Semantic Code (LRSC) network for automatic breast ultrasound image classification with few labeled data. Methods: In the proposed network, a local structure extractor is first developed to learn the local reference, which describes common local characteristics of tumors. A two-stage hierarchical encoder is then developed to encode the local structures of a lesion into a high-level semantic code. Based on the learned semantic code, a self-matching layer is proposed for the final classification. Results: In the experiment, the proposed method outperformed traditional classification methods; AUC (Area Under Curve), ACC (Accuracy), Sen (Sensitivity), Spec (Specificity), PPV (Positive Predictive Value), and NPV (Negative Predictive Value) were 0.9540, 0.9776, 0.9629, 0.93, 0.9774, and 0.9090, respectively. In addition, the proposed method improved matching speed. Conclusions: The LRSC network is proposed for breast ultrasound image classification with few labeled data. In the proposed network, a two-stage hierarchical encoder is introduced to learn a high-level semantic code. The learned code contains more effective high-level classification information and is simpler, leading to better generalization ability.
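
For readers less familiar with the reported metrics, the sketch below shows how ACC, Sen, Spec, PPV, and NPV follow from the four cells of a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the study; AUC is omitted because it requires ranked prediction scores rather than counts.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion counts."""
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # share of all cases classified correctly
        "Sen": tp / (tp + fn),                   # share of true positives that were found
        "Spec": tn / (tn + fp),                  # share of true negatives that were found
        "PPV": tp / (tp + fp),                   # share of positive predictions that are right
        "NPV": tn / (tn + fn),                   # share of negative predictions that are right
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
```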


2021 ◽  
Vol 42 (6) ◽  
pp. 99-110
Author(s):  
L. V. Matveeva ◽  
T. Ya. Anikeeva ◽  
Yu. V. Mochalova ◽  
O. B. Stepanova ◽  
...  

The study involved 98 young people from Moscow, 116 from Perm, and 104 from Tyumen: a total of 318 respondents aged 18 to 25, of whom 44% were young men and 56% young women. The study's 49 bipolar psychosemantic scales describe the image of a country in terms of its strength and authority in the international arena, its activity in transformations, and various aspects of assessment. Respondents evaluated the images of "Russia-country", "Future Russia", "USA", and "China". The categorical structures of young people's social representations of a country's image, identified in all three regions, share three substantively comparable factors: 1) "Welfare, progressiveness of the country", 2) "Level of social distance", 3) "Civilizational attribution". The value-semantic component in the structure of young people's ideas fixes the cultural codes of Russian civilization: "beauty", "generosity", "mercy", "kindness", "spirituality", "morality".


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shengzi Sun ◽  
Binghui Guo ◽  
Zhilong Mi ◽  
Zhiming Zheng

Cross-modal retrieval has become a popular topic, since multi-modal data is heterogeneous and the similarities between different forms of information are worthy of attention. Traditional single-modal methods reconstruct the original information and fail to consider the semantic similarity between different data. In this work, a cross-modal semantic autoencoder with embedding consensus (CSAEC) is proposed, which maps the original data to a low-dimensional shared space in order to retain semantic information. To account for the similarity between modalities, an autoencoder is used to associate the feature projection with the semantic code vector. In addition, regularization and sparsity constraints are applied to the low-dimensional matrices to balance reconstruction errors. The high-dimensional data is transformed into a semantic code vector. Different models are constrained by parameters to achieve denoising. Experiments on four multi-modal data sets show that the query results are improved and effective cross-modal retrieval is achieved. Further, CSAEC can also be applied to computer- and network-related fields such as deep learning and subspace learning. The model overcomes obstacles in traditional methods by using deep learning to convert multi-modal data into an abstract representation, which yields better accuracy and better recognition results.
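
Once items from different modalities are mapped into a shared space, retrieval reduces to ranking by vector similarity. The sketch below shows only that final step, using cosine similarity as an assumed similarity measure; it does not implement the CSAEC autoencoder, and the code vectors and item names are made up for illustration.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_code: List[float], candidates: Dict[str, List[float]], top_k: int = 2) -> List[str]:
    """Return the names of the top_k candidates most similar to the query."""
    ranked = sorted(candidates, key=lambda name: cosine(query_code, candidates[name]), reverse=True)
    return ranked[:top_k]


# Hypothetical semantic code vectors in a shared 3-dimensional space.
image_codes = {
    "img_cat": [0.9, 0.1, 0.0],
    "img_car": [0.0, 0.2, 0.9],
    "img_dog": [0.7, 0.3, 0.1],
}
text_code = [1.0, 0.2, 0.0]  # code vector of a text query
best_matches = retrieve(text_code, image_codes)
```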


Queue ◽  
2021 ◽  
Vol 19 (4) ◽  
pp. 42-67
Author(s):  
Timothy Clem ◽  
Patrick Thomson

The Semantic Code team at GitHub builds and operates a suite of technologies that power symbolic code navigation on github.com. We learned that scale is about adoption, user behavior, incremental improvement, and utility. Static analysis in particular is difficult to scale with respect to human behavior; we often think of complex analysis tools working to find potentially problematic patterns in code and then trying to convince the humans to fix them. Our approach took a different tack: use basic analysis techniques to quickly put information that augments our ability to understand programs in front of everyone reading code on GitHub with zero configuration required and almost immediate availability after code changes.
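
As a rough illustration of what "basic analysis techniques" for symbolic navigation can look like, the sketch below lists the definitions in a piece of Python source together with their line numbers, using the standard library `ast` module. This is an assumption chosen for brevity and is not a description of GitHub's implementation; the function name `list_definitions` is hypothetical.

```python
import ast
from typing import List, Tuple


def list_definitions(source: str) -> List[Tuple[str, str, int]]:
    """Return (kind, name, line) for every function and class defined in source."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            found.append(("class", node.name, node.lineno))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(found, key=lambda item: item[2])


example = "class A:\n    def f(self):\n        pass\n\ndef g():\n    pass\n"
symbols = list_definitions(example)
```

A table like `symbols` is enough to support a simple "jump to definition" lookup by name, with no project configuration required.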


Author(s):  
Rimma M. Khaninova ◽  

Introduction. In the genre system of Kalmyk poetry, the literary fable appeared in the 1930s. When it came to master the genre, Kalmyk poets mainly focused on the traditions of Russian fable of the 19th–20th centuries, primarily on I. A. Krylov’s works which they eagerly translated. The Kalmyk authors were the least likely to rely on traditions of Eastern literature — whether Indian, Tibetan, or Oirat Mongolian — since those sources written in Tibetan, Classical Mongolian and Clear Script (Kalm. todo bichiq) were virtually unavailable to them, and not all poets had knowledge of the scripts. National folklore, including myths, animal tales, household tales, aphoristic poetry (proverbs, sayings, riddles), to a certain extent contributed to the creation of plots and motifs, a gallery of images ― people and the animal world ― in the Kalmyk literary fable. The appeal to the fable was determined by the tasks of cultural construction in Kalmykia, the satirical possibilities of the genre designed to scourge social vices and human shortcomings, contribute to the correction of morals, facilitate education of a person in the new society. Attention to the fable in 20th-century Kalmyk poetry was not that universal and constant, by the end of the century it was no longer in demand and never revived further. The Kalmyk literary fable has been little studied so far, with the exception of several recent articles by R. M. Khaninova, which determines the relevance of this study. Goals. The article aims to study zoopoetics of text of the animalistic fable in Kalmyk poetry of the past century through examples of selected works by Khasyr Syan-Belgin, Muutl Erdniev, Garya Shalburov, Basang Dordzhiev, Timofey Bembeev, and Mikhail Khoninov. Methods. The work employs a number of research methods, such as the historical literary, comparative, and descriptive ones. Results. 
The animalistic fable is not the leading form in the general genre system of Kalmyk poetry of the past century, including in comparison with fables with human characters. It usually features characters from the steppe fauna whose figurative characteristics are manifested in Kalmyk folklore. The social satire and political orientation of the fables are made topical by modern reality, the current international situation, and current events. The paper reveals a relationship between the animal fable and both Kalmyk folklore and the Russian fable tradition. Most of the fables have not yet been translated into Russian. Conclusions. In terms of national versification patterns, the study of the Kalmyk poetic animal fable has identified such synthetic forms as fable-fairy tale, fable-proverb, and fable-dream. The genre definition is not always specified by the authors; a moral usually concludes each quatrain-structured narrative. Genre scenes, monologues, and dialogues contribute to an in-depth reading of the context, symbolism of images, and semantic code.


2021 ◽  
Vol 15 (5) ◽  
pp. 1-21
Author(s):  
Xiang Ling ◽  
Lingfei Wu ◽  
Saizhuo Wang ◽  
Gaoning Pan ◽  
Tengfei Ma ◽  
...  

Code retrieval is the task of finding, in a large corpus of source code repositories, the code snippet that best matches a natural language query. Recent work mainly uses natural language processing techniques to process both query texts (i.e., human natural language) and code snippets (i.e., machine programming language), but neglects the deep structured features of query texts and source code, both of which contain rich semantic information. In this article, we propose an end-to-end deep graph matching and searching (DGMS) model based on graph neural networks for the task of semantic code retrieval. To this end, we first represent both natural language query texts and programming language code snippets as unified graph-structured data, and then use the proposed graph matching and searching model to retrieve the best-matching code snippet. In particular, DGMS not only captures more structural information for individual query texts or code snippets, but also learns the fine-grained similarity between them through cross-attention-based semantic matching operations. We evaluate the proposed DGMS model on two public code retrieval datasets with two representative programming languages (i.e., Java and Python). Experimental results demonstrate that DGMS outperforms state-of-the-art baseline models by a large margin on both datasets. Moreover, our extensive ablation studies systematically investigate and illustrate the impact of each part of DGMS.
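
The abstract does not detail the cross-attention operation, so the following is only a generic sketch of dot-product cross-attention, written in plain Python under the assumption that query-graph nodes and code-graph nodes are already embedded as vectors of the same dimension. For each query node, it computes softmax weights over the code nodes and returns the weighted average of the code node vectors. All names and values are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: List[List[float]], keys: List[List[float]]) -> List[List[float]]:
    """For each query vector, return the attention-weighted mix of key vectors."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        # Dot-product score between this query node and every code node.
        weights = softmax([sum(a * b for a, b in zip(q, k)) for k in keys])
        attended.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)])
    return attended


# Hypothetical node embeddings: one query node, two code nodes.
query_nodes = [[1.0, 0.0]]
code_nodes = [[10.0, 0.0], [0.0, 10.0]]
context = cross_attention(query_nodes, code_nodes)
```

Here the query node aligns with the first code node, so `context[0]` ends up very close to that node's vector; comparing each node with its attended counterpart is one way a fine-grained similarity could then be scored.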


Author(s):  
Priti Oli ◽  
Rabin Banjade ◽  
Lasang Jimba Tamang ◽  
Vasile Rus

In this paper, we present an automated method to assess the quality of Jupyter notebooks. The quality of notebooks is assessed in terms of reproducibility and executability. Specifically, we automatically extract a number of expert-defined features for each notebook, perform a feature selection step, and then train supervised binary classifiers to predict whether a notebook is reproducible and whether it is executable. We also experimented with semantic code embeddings to capture the notebooks' semantics. We evaluated these methods on a dataset of 306,539 notebooks and achieved an F1 score of 0.87 for reproducibility and 0.96 for executability (using expert-defined features), and an F1 score of 0.81 for reproducibility and 0.78 for executability (using code embeddings). Our results suggest that semantic code embeddings can be used to determine the reproducibility and executability of Jupyter notebooks with good performance and, since they can be derived automatically, they have the advantage of requiring no expert involvement to define features.
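
The abstract does not list the expert-defined features, so the sketch below uses a few hypothetical ones to show what automatic feature extraction from a notebook file might look like. It relies only on the notebook JSON layout (a top-level `cells` list whose entries carry `cell_type` and, for code cells, `execution_count`); the feature names are assumptions for illustration.

```python
import json
from typing import Dict


def notebook_features(notebook_json: str) -> Dict[str, float]:
    """Extract a few simple, hypothetical quality features from notebook JSON."""
    cells = json.loads(notebook_json).get("cells", [])
    code_cells = [c for c in cells if c.get("cell_type") == "code"]
    # Execution counts of the code cells that were actually run.
    counts = [c["execution_count"] for c in code_cells if c.get("execution_count") is not None]
    return {
        "n_code_cells": len(code_cells),
        "n_markdown_cells": sum(1 for c in cells if c.get("cell_type") == "markdown"),
        "executed_fraction": len(counts) / len(code_cells) if code_cells else 0.0,
        # 1.0 if cells were executed top to bottom, 0.0 otherwise.
        "ran_in_order": float(counts == sorted(counts)),
    }


# A tiny hand-written notebook, for illustration only.
example_nb = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "execution_count": 2, "source": "a = 1"},
        {"cell_type": "code", "execution_count": 1, "source": "b = 2"},
        {"cell_type": "code", "execution_count": None, "source": ""},
    ]
})
features = notebook_features(example_nb)
```

Feature dictionaries of this kind could then be passed to any supervised binary classifier.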

