Opportunities and Challenges in Code Search Tools

2022 · Vol 54 (9) · pp. 1-40
Author(s):  
Chao Liu
Xin Xia
David Lo
Cuiyun Gao
Xiaohu Yang
...

Code search is a core software engineering task. Effective code search tools can help developers substantially improve the efficiency and effectiveness of software development. In recent years, many code search studies have leveraged different techniques, such as deep learning and information retrieval, to retrieve the expected code from large-scale codebases. However, a comprehensive comparative summary of existing code search approaches has been lacking. To understand the research trends in existing code search studies, we systematically reviewed 81 relevant studies. We investigated the publication trends of code search studies; analyzed the key components used to build code search tools, such as the codebase, query, and modeling technique; and classified existing tools according to the seven search tasks they support. Based on our findings, we identified a set of outstanding challenges in existing studies and outlined a research roadmap for future code search research.
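As a rough illustration of the information-retrieval side of the design space this survey covers, here is a minimal sketch of TF-IDF-based code search over a toy codebase; the snippets, tokenizer, and cosine scoring are illustrative assumptions, not any specific surveyed tool.

```python
# Minimal sketch of an IR-style code search tool: rank snippets in a toy
# codebase by a TF-IDF score against a natural-language query. All names
# and snippets here are illustrative assumptions.
import math
import re
from collections import Counter

CODEBASE = {
    "read_file": "def read_file(path): return open(path).read()",
    "write_file": "def write_file(path, data): open(path, 'w').write(data)",
    "http_get": "def http_get(url): import urllib.request; "
                "return urllib.request.urlopen(url).read()",
}

def tokenize(text):
    # Split identifiers and keywords into lowercase word tokens.
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def tf_idf_vectors(docs):
    n = len(docs)
    # Document frequency: how many snippets contain each token.
    df = Counter(tok for doc in docs.values() for tok in set(tokenize(doc)))
    vecs = {}
    for name, doc in docs.items():
        tf = Counter(tokenize(doc))
        vecs[name] = {t: c * math.log(n / df[t]) for t, c in tf.items()}
    return vecs

def search(query, docs, top_k=2):
    vecs = tf_idf_vectors(docs)
    q = Counter(tokenize(query))
    def score(vec):
        dot = sum(q[t] * w for t, w in vec.items())
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        return dot / norm
    return sorted(docs, key=lambda name: score(vecs[name]), reverse=True)[:top_k]

print(search("read data from a file", CODEBASE))  # ranks read_file first
```

Deep-learning approaches would typically replace the hand-built TF-IDF vectors above with learned embeddings of the query and the code.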

10.28945/3379 · 2009
Author(s):  
Lakshmi Narasimhan ◽  
Prapanna Parthasarathy ◽  
Manik Lal Das

Component-Based Software Engineering (CBSE) has shown significant prospects for the rapid production of large software systems with enhanced quality; it emphasizes decomposing engineered systems into functional or logical components with well-defined interfaces used for communication across the components. In this paper, a series of metrics proposed by various researchers is analyzed, evaluated, and benchmarked using several large-scale, publicly available software systems. A systematic analysis of the values of the various metrics has been carried out, and several key inferences have been drawn from them, including inferences on the complexity, reusability, testability, modularity, and stability of the underlying components. These inferences are argued to be beneficial for CBSE-based software development, integration, and maintenance.
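As a hedged illustration of one metric family such benchmarks typically include, the sketch below computes fan-in/fan-out coupling over a hypothetical component dependency graph; the components and the interpretation in the comments are assumptions for illustration, not the paper's benchmarked metric definitions.

```python
# Minimal sketch of a component coupling metric: fan-in and fan-out over a
# hypothetical dependency graph. The graph is an illustrative assumption.
from collections import defaultdict

# component -> components it depends on (calls through their interfaces)
DEPENDS_ON = {
    "ui": {"auth", "orders"},
    "orders": {"auth", "db"},
    "auth": {"db"},
    "db": set(),
}

def coupling_metrics(graph):
    fan_in = defaultdict(int)
    for src, targets in graph.items():
        for dst in targets:
            fan_in[dst] += 1
    # Fan-out is the size of a component's dependency set. High fan-in
    # suggests a widely reused (but change-sensitive) component; high
    # fan-out suggests a more complex, harder-to-test one.
    return {c: {"fan_out": len(deps), "fan_in": fan_in[c]}
            for c, deps in graph.items()}

for comp, m in coupling_metrics(DEPENDS_ON).items():
    print(f"{comp}: fan_out={m['fan_out']} fan_in={m['fan_in']}")
```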


Author(s):  
Menaga D.
Revathi S.

Multimedia applications are a significant and growing research area because of advances in software engineering, storage devices, networks, and display technology. To satisfy users' multimedia information needs, it is essential to build efficient applications for multimedia information processing, access, and analysis that support tasks such as retrieval, recommendation, search, classification, and clustering. Deep learning is an emerging technique in multimedia information processing that addresses the problems of both conventional and more recent approaches. The main aim of this chapter is to show how multimedia-related problems can be solved through deep learning. The deep learning revolution is discussed along with its characteristics and features, and its major applications in different fields are explained. After a discussion of multimedia information retrieval, that is, the ability to retrieve objects of any multimedia type, the chapter analyzes the retrieval problem itself.
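As a hedged sketch of how deep learning typically supports the retrieval task described here, the example below ranks multimedia items by cosine similarity between embedding vectors; the toy embeddings stand in for the output of a trained encoder and are purely illustrative.

```python
# Minimal sketch of deep-learning-based multimedia retrieval: items are
# represented by embedding vectors (in practice produced by a trained CNN
# or similar encoder) and retrieval is nearest-neighbour search by cosine
# similarity. The toy embeddings below are illustrative assumptions.
import numpy as np

ITEM_EMBEDDINGS = {            # item_id -> embedding vector
    "beach.jpg":  np.array([0.9, 0.1, 0.0]),
    "forest.jpg": np.array([0.1, 0.9, 0.2]),
    "city.mp4":   np.array([0.0, 0.2, 0.9]),
}

def retrieve(query_vec, items, top_k=2):
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(items, key=lambda k: cosine(query_vec, items[k]),
                    reverse=True)
    return ranked[:top_k]

# A query embedding computed from the user's example image or text query.
print(retrieve(np.array([0.8, 0.2, 0.1]), ITEM_EMBEDDINGS))
```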


2020 · Vol 54 (1) · pp. 1-2
Author(s):  
Joel M. Mackenzie

As both the availability of internet access and the prominence of smart devices continue to increase, data is being generated at a rate faster than ever before. This massive increase in data production brings many challenges, including efficiency concerns for the storage and retrieval of such large-scale data. However, users have grown to expect the sub-second response times that are common in most modern search engines, creating a problem: how can such large amounts of data continue to be served efficiently enough to satisfy end users?

This dissertation investigates several issues regarding tail latency in large-scale information retrieval systems. Tail latency corresponds to the high-percentile latency observed from a system; in the case of search, this latency typically corresponds to how long it takes for a query to be processed. Keeping tail latency as low as possible translates to a good experience for all users, as tail latency is directly related to the worst-case latency and hence the worst possible user experience. The key idea in targeting tail latency is to move from questions such as "what is the median latency of our search engine?" to questions that more accurately capture user experience, such as "how many queries take more than 200 ms to return answers?" or "what is the worst-case latency that a user may be subject to, and how often might it occur?"

While various strategies exist for efficiently processing queries over large textual corpora, prior research has focused almost entirely on improvements to the average processing time or cost of search systems. As a first contribution, we examine some state-of-the-art retrieval algorithms for two popular index organizations and discuss the trade-offs between them, paying special attention to the notion of tail latency. This research uncovers a number of observations that are subsequently leveraged for improved search efficiency and effectiveness. We then propose and solve a new problem, which involves processing a number of related query variations together, known as multi-queries, to yield higher-quality search results. We experiment with a number of algorithmic approaches to efficiently process these multi-queries, and report on the cost, efficiency, and effectiveness trade-offs present with each. Finally, we examine how predictive models can be used to improve the tail latency and end-to-end cost of a commonly used multi-stage retrieval architecture without impacting result effectiveness. By combining ideas from numerous areas of information retrieval, we propose a prediction framework that can be used for training and evaluating several efficiency/effectiveness trade-off parameters, resulting in improved trade-offs between cost, result quality, and tail latency.
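The shift from median-centred to tail-centred questions can be made concrete with a small sketch; the simulated latency distribution below is an illustrative assumption, not data from the dissertation.

```python
# Minimal sketch of tail-latency measurement: report high percentiles and
# the fraction of slow queries instead of only the median. Latencies are
# simulated; in practice they would come from a search engine's query log.
import random

random.seed(0)
# Simulated per-query latencies in ms: mostly fast, occasionally slow.
latencies = [random.expovariate(1 / 40) for _ in range(10_000)]

def percentile(values, p):
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
    return ordered[idx]

print(f"median (P50): {percentile(latencies, 50):.1f} ms")
print(f"tail   (P99): {percentile(latencies, 99):.1f} ms")
over_200 = sum(lat > 200 for lat in latencies)
print(f"queries over 200 ms: {over_200} "
      f"({100 * over_200 / len(latencies):.2f}%)")
```

On this synthetic workload the median sits near 28 ms while the 99th percentile is several times higher, which is exactly the gap the tail-latency framing makes visible.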


Author(s):  
Zeyar Aung
Khine Khine Nyunt

In this chapter, the authors discuss two important trends in modern software engineering (SE): the utilization of knowledge management (KM) and of information retrieval (IR). Software engineering is a discipline in which knowledge and experience, acquired over many years, play a fundamental role. For software development organizations, the main assets are not manufacturing plants, buildings, and machines, but the knowledge held by their employees. Software engineering has long recognized the need for managing knowledge, and the SE community could learn much from the KM community. The authors introduce the fundamental concepts of KM theory and practice, and mainly discuss the aspects of knowledge management that are valuable to software development organizations and how a KM system for such an organization can be implemented. In addition to knowledge management, information retrieval also plays a crucial role in SE. IR is the study of how to efficiently and effectively retrieve a required piece of information from a large corpus of storage entities such as documents. As software development organizations grow larger and have to deal with larger numbers (possibly millions) of documents of various types, IR becomes an essential tool for retrieving any piece of information that a software developer wants within a short time. IR can be used both as a general-purpose tool to improve developer productivity and as an enabler for a KM system.
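As a minimal sketch of the IR building block described above, assuming a toy document corpus, the code below constructs an inverted index so that a conjunctive query over many documents reduces to fast set intersection.

```python
# Minimal sketch of an inverted index: map each term to the set of
# documents containing it, then answer AND-queries by set intersection.
# The documents are illustrative assumptions.
from collections import defaultdict

DOCS = {
    "design.md":  "payment service design and retry policy",
    "runbook.md": "how to restart the payment service",
    "faq.md":     "frequently asked questions about billing",
}

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def query(index, terms):
    # Conjunctive (AND) query: documents containing every query term.
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

index = build_index(DOCS)
print(query(index, ["payment", "service"]))  # {'design.md', 'runbook.md'}
```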


2015 · Vol 21 (6) · pp. 2324-2365
Author(s):  
Michael Unterkalmsteiner
Tony Gorschek
Robert Feldt
Niklas Lavesson


2020
Author(s):  
Anusha Ampavathi
Vijaya Saradhi T

Big data approaches are broadly helpful to the healthcare and biomedical sectors for disease prediction. For minor symptoms, it is not always possible to consult a doctor at the hospital; big data can instead supply essential information about diseases on the basis of the patient's symptoms. For many medical organizations, disease prediction is important for making the best feasible health care decisions. The conventional medical care model, by contrast, works on structured input and requires more accurate and consistent prediction. This paper develops multi-disease prediction using an improved deep learning concept. Datasets for "Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson's disease, and Alzheimer's disease" are gathered from the benchmark UCI repository for the experiments. The proposed model involves three phases: (a) data normalization, (b) weighted normalized feature extraction, and (c) prediction. First, the dataset is normalized so that every attribute lies within a common range. Next, weighted feature extraction is performed, in which each attribute value is multiplied by a weight function to accentuate large-scale deviations. The weight function is optimized using a combination of two meta-heuristic algorithms, the Jaya Algorithm-based Multi-Verse Optimization (JA-MVO) algorithm. The optimally extracted features are then fed to hybrid deep learning algorithms, the Deep Belief Network (DBN) and the Recurrent Neural Network (RNN). As a modification to this hybrid architecture, the weights of both the DBN and the RNN are optimized using the same hybrid optimization algorithm. Finally, a comparative evaluation of the proposed prediction model against existing models certifies its effectiveness through various performance measures.
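The first two phases can be sketched concretely; the toy records below are illustrative, and the fixed weight vector stands in for the JA-MVO-optimized weight function described in the paper.

```python
# Minimal sketch of the paper's first two phases: min-max normalization
# followed by weighted feature extraction (each attribute multiplied by a
# weight). In the paper the weights come from the hybrid JA-MVO optimizer;
# here they are fixed illustrative values.
import numpy as np

# Toy patient records: rows are patients, columns are attributes.
X = np.array([[120.0, 80.0, 6.1],
              [140.0, 95.0, 7.8],
              [110.0, 70.0, 5.4]])

def min_max_normalize(data):
    lo, hi = data.min(axis=0), data.max(axis=0)
    return (data - lo) / (hi - lo)     # every attribute scaled into [0, 1]

weights = np.array([0.8, 1.2, 1.5])    # stand-in for JA-MVO-optimized weights
features = min_max_normalize(X) * weights

print(features)   # weighted normalized features fed to the DBN/RNN hybrid
```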


2017 · Vol 14 (9) · pp. 1513-1517
Author(s):  
Rodrigo F. Berriel
Andre Teixeira Lopes
Alberto F. de Souza
Thiago Oliveira-Santos

Author(s):  
Mathieu Turgeon-Pelchat ◽  
Samuel Foucher ◽  
Yacine Bouroubi
