A Proposed Model for Source Code Reuse Detection in Computer Programs

PERANGKAT LUNAK KOMPUTER (Computer Software)

10.31219/osf.io/tjbfr ◽

2020 ◽

Author(s):

Cut Nabilah Damni

Keyword(s):

Programming Languages ◽

Programming Language ◽

Operating Systems ◽

Source Code ◽

Computer Software ◽

Computer Programs ◽

Application Systems ◽

Executable Programs

Abstract: Computer software is a collection of instructions (programs or procedures) that carries out work automatically by processing the input (data) it is given (Yahfizham, 2019: 19). Most computer software is written by programmers using a programming language. Programmers write commands in a programming language much as people use ordinary language in conversation. These commands are called source code. Another program, called a compiler, takes the source code and translates it into a language the computer understands; the result is called an executable program (EXE). Fundamentally, a computer's software consists of operating systems, application systems, and programming languages.

Semi-automating small-scale source code reuse via structural correspondence

Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering - SIGSOFT '08/FSE-16 ◽

10.1145/1453101.1453130 ◽

2008 ◽

Cited By ~ 19

Author(s):

Rylan Cottrell ◽

Robert J. Walker ◽

Jörg Denzinger

Keyword(s):

Source Code ◽

Small Scale ◽

Code Reuse ◽

Structural Correspondence

Pragmatic source code reuse via execution record and replay

Journal of Software Evolution and Process ◽

10.1002/smr.1790 ◽

2016 ◽

Vol 28 (8) ◽

pp. 642-664 ◽

Cited By ~ 4

Author(s):

Ameer Armaly ◽

Collin McMillan

Keyword(s):

Source Code ◽

Code Reuse ◽

Record And Replay

Analyzing source code identifiers for code reuse using NLP techniques and WordNet

2017 Moratuwa Engineering Research Conference (MERCon) ◽

10.1109/mercon.2017.7980465 ◽

2017 ◽

Cited By ~ 4

Author(s):

P. Pirapuraj ◽

Indika Perera

Keyword(s):

Source Code ◽

Code Reuse

End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/214 ◽

2017 ◽

Cited By ~ 4

Author(s):

Min-je Choi ◽

Sehun Jeong ◽

Hakjoo Oh ◽

Jaegul Choo

Keyword(s):

Programming Languages ◽

Program Analysis ◽

Question Answering ◽

Source Code ◽

Source Codes ◽

Program Language ◽

Proposed Model ◽

Challenging Tasks ◽

Data Driven Approach ◽

End To End

Detecting buffer overruns in source code is one of the most common yet challenging tasks in program analysis. Current approaches based on rigid rules and handcrafted features are limited in flexibility and robustness because of the diverse bug patterns and characteristics found in sophisticated real-world software. In this paper, we propose a novel, data-driven approach that is completely end-to-end and requires no hand-crafted features, and is thus free from programming-language-specific structural limitations. In particular, our approach leverages a recently proposed neural network model called the memory network, which has shown state-of-the-art performance mainly in question-answering tasks. Our experimental results on source code samples demonstrate that the proposed model is capable of accurately detecting different types of buffer overruns. We also present in-depth analyses of how a memory network can learn to understand the semantics of programming languages solely from raw source code, such as tracing variables of interest, identifying numerical values, and performing quantitative comparisons on them.
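To make the task concrete, the following is a minimal sketch of the three steps the abstract names (tracing variables, identifying numerical values, comparing them), written as a hand-coded check. It is not the paper's memory network; it is the kind of rule-based baseline the paper contrasts itself with. The three-statement mini-language and the function name `find_overruns` are assumptions invented for this illustration.

```python
# Hypothetical mini-language, invented for illustration only:
#   "<name> = <value>"      assigns an integer to a variable
#   "alloc <buf> <size>"    allocates a buffer with <size> cells
#   "write <buf> <index>"   writes to cell <index> of <buf>
# <value>, <size> and <index> are integer literals or variable names.

def find_overruns(program: str) -> list[int]:
    """Return the 1-based line numbers of writes that fall outside their buffer."""
    values: dict[str, int] = {}  # traced integer variables
    sizes: dict[str, int] = {}   # allocated buffer sizes
    overruns: list[int] = []

    def resolve(token: str) -> int:
        # A token is either a known variable name or an integer literal.
        return values[token] if token in values else int(token)

    for lineno, line in enumerate(program.strip().splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue
        if len(parts) == 3 and parts[1] == "=":
            values[parts[0]] = resolve(parts[2])      # trace the variable
        elif parts[0] == "alloc":
            sizes[parts[1]] = resolve(parts[2])       # record the buffer size
        elif parts[0] == "write":
            index = resolve(parts[2])
            if not 0 <= index < sizes[parts[1]]:      # quantitative comparison
                overruns.append(lineno)
    return overruns


sample = """
n = 8
alloc buf n
i = 7
write buf i
i = 8
write buf i
"""
print(find_overruns(sample))  # [6]
```

The second write is flagged because `i` has been reassigned to 8 while `buf` holds only 8 cells (valid indices 0 to 7). A learned model would have to recover this same chain of reasoning from the raw text alone.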

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/514 ◽

2017 ◽

Cited By ~ 8

Author(s):

Xiaodong Gu ◽

Hongyu Zhang ◽

Dongmei Zhang ◽

Sunghun Kim

Keyword(s):

Sequence Learning ◽

Large Scale ◽

Intelligent System ◽

State Of The Art ◽

Source Code ◽

Computer Programs ◽

Experimental Results ◽

Multiple Devices ◽

Application Programming ◽

Programming Interfaces

Computer programs written in one language often need to be ported to other languages to support multiple devices and environments. When programs use language-specific APIs (Application Programming Interfaces), it is very challenging to migrate these APIs to the corresponding APIs written in other languages. Existing approaches mine API mappings from projects that have corresponding versions in two languages. They rely on the sparse availability of bilingual projects and thus produce a limited number of API mappings. In this paper, we propose an intelligent system called DeepAM for automatically mining API mappings from a large-scale code corpus without bilingual projects. The key component of DeepAM is based on the multi-modal sequence-to-sequence learning architecture, which aims to learn joint semantic representations of bilingual API sequences from big source code data. Experimental results indicate that DeepAM significantly increases both the accuracy and the number of API mappings compared with state-of-the-art approaches.

Source Code Comprehension and Appropriation by Novice Programmers: Understanding Novice Programmers’ Perception about Source Code Reuse

Journal of Interactive Systems ◽

10.5753/jis.2019.556 ◽

2019 ◽

Vol 10 ◽

pp. 96

Author(s):

Luana Müller ◽

Milene Selbach Silveira ◽

Clarisse S. de Souza

Keyword(s):

Software Development ◽

Source Code ◽

Ongoing Research ◽

Code Reuse ◽

Semiotic Approach ◽

Software Development Practices

Software development practices rely extensively on reusing source code written by other programmers. One of the recurring questions about this practice is how much programmers, acting as users of somebody else's code, really understand the source code that they inject into their programs. The question is even more important for novices, who are trying to learn what programming is and how it should be practiced on a larger scale. In this paper we present the results of ongoing research that uses a semiotic approach to investigate how novice programmers reuse source code, and how, through messages inscribed in the source code of the programs they write or reuse, they communicate, implicitly or explicitly, what such source code "means" to them and others. We carried out three studies with novice programmers, and the results suggest that source code reuse may impact what programmers take their source code to mean.

A SEMI-AUTOMATED PROCESS FOR OPEN SOURCE CODE REUSE

Proceedings of the Fifth International Conference on Evaluation of Novel Approaches to Software Engineering ◽

10.5220/0002999401790185 ◽

2010 ◽

Keyword(s):

Open Source ◽

Source Code ◽

Code Reuse ◽

Open Source Code ◽

Automated Process

Source Code Assessment and Classification Based on Estimated Error Probability Using Attentive LSTM Language Model and Its Application in Programming Education

Applied Sciences ◽

10.3390/app10082973 ◽

2020 ◽

Vol 10 (8) ◽

pp. 2973 ◽

Cited By ~ 2

Author(s):

Md. Mostafizer Rahman ◽

Yutaka Watanobe ◽

Keita Nakamura

Keyword(s):

Error Detection ◽

Error Probability ◽

Short Term Memory ◽

Language Model ◽

Source Code ◽

Attention Mechanism ◽

Error Assessment ◽

Programming Education ◽

Proposed Model ◽

Estimated Error

The rate of software development has increased dramatically. Conventional compilers cannot assess and detect all source code errors, and logic errors in particular are difficult to detect with them, so software may contain errors that negatively affect end-users. A method that utilizes artificial intelligence for assessing and detecting errors and classifying source code as correct (error-free) or incorrect is thus required. Here, we propose a sequential language model that uses an attention-mechanism-based long short-term memory (LSTM) neural network to assess and classify source code based on the estimated error probability. The attention mechanism enhances the accuracy of the proposed language model for error assessment and classification. We trained the proposed model using correct source code and then evaluated its performance. The experimental results show that the proposed model has logic and syntax error detection accuracies of 92.2% and 94.8%, respectively, outperforming state-of-the-art models. We also applied the proposed model to the classification of source code with logic and syntax errors. The average precision, recall, and F-measure values for such classification are much better than those of benchmark models. Finally, our proposed model can be effective in programming education and software engineering by improving code writing, debugging, error correction, and reasoning.
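The idea of training a language model on correct code and then classifying new code by estimated error probability can be sketched with a much simpler stand-in. The example below swaps the paper's attentive LSTM for a plain bigram count model, so it illustrates only the classification scheme, not the authors' architecture. The function names, the toy corpus, and the 0.9 threshold are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: list[list[str]]) -> dict[str, Counter]:
    """Count token-to-next-token transitions in known-correct token sequences."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for tokens in corpus:
        # "<s>" is a start-of-sequence marker preceding the first token.
        for prev, nxt in zip(["<s>"] + tokens, tokens):
            counts[prev][nxt] += 1
    return counts


def error_probabilities(counts: dict[str, Counter], tokens: list[str]) -> list[float]:
    """Estimate, for each token, 1 - P(token | previous token)."""
    errors = []
    for prev, nxt in zip(["<s>"] + tokens, tokens):
        total = sum(counts[prev].values()) if prev in counts else 0
        p = counts[prev][nxt] / total if total else 0.0
        errors.append(1.0 - p)
    return errors


def classify(counts: dict[str, Counter], tokens: list[str], threshold: float = 0.9) -> str:
    """Label a sequence 'incorrect' if any token's estimated error probability
    exceeds the (assumed) threshold, otherwise 'correct'."""
    errors = error_probabilities(counts, tokens)
    return "incorrect" if any(e > threshold for e in errors) else "correct"


# Toy training corpus of correct, pre-tokenized statements (hypothetical data).
corpus = [
    ["x", "=", "1", ";"],
    ["y", "=", "2", ";"],
    ["x", "=", "y", ";"],
]
model = train_bigram(corpus)

print(classify(model, ["x", "=", "1", ";"]))  # correct
print(classify(model, ["x", "1", "=", ";"]))  # incorrect
```

The second sequence is rejected because the transition from `x` to `1` never occurs in the training data, giving that token an estimated error probability of 1.0. A bigram model sees only one token of context; a recurrent model with attention conditions on a much longer history, which is what makes it plausible for logic errors as well as syntax errors.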

Jurnal PERANGKAT LUNAK KOMPUTER Ayubi Simatupang PMM-1

10.31219/osf.io/jfvg5 ◽

2020 ◽

Author(s):

S Mukhtar Ayubi Simatupang

Keyword(s):

Programming Languages ◽

Programming Language ◽

Source Code ◽

Computer Software ◽

Computer Programs ◽

Computer Hardware ◽

Executable Programs

Abstract: Computer software has different properties from computer hardware. Whereas hardware can be seen and touched, the software on a computer can only be seen, not felt or touched. More precisely, software cannot be touched and has no visible physical form, yet we can operate it. Despite having no physical form, software is very useful: it is what allows a computer to carry out commands. Most computer software is written by programmers using a programming language. Programmers write commands in a programming language much as people use ordinary language in conversation. These commands are called source code. Another program, called a compiler, takes the source code and translates it into a language the computer understands; the result is called an executable program (EXE).
Keywords: Software, Programmer
