Graph-Augmented Code Summarization in Computational Notebooks

Author(s):  
April Wang ◽  
Dakuo Wang ◽  
Xuye Liu ◽  
Lingfei Wu

Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code and neglect the creation of documentation in a notebook. In this work, we present a human-centered automation system, Themisto, that supports users in easily creating documentation via three approaches: 1) a GNN-augmented code documentation generation algorithm, developed and reported in a previous paper, which generates documentation for a given piece of source code; 2) a query-based approach that retrieves online API documentation as the summary for certain types of source code; and 3) a user prompt approach that motivates users to write documentation for use cases where automation does not work well.
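As a rough, self-contained illustration of the three-way strategy described above (not the authors' implementation), the Python sketch below dispatches a single notebook cell to a toy version of each approach; all names, the length threshold, and the API lookup table are invented for the example, and the learned generator is stubbed out.

```python
# Hypothetical sketch of a three-way documentation strategy for one notebook
# cell; every name here is illustrative, not Themisto's actual API. The
# paper's generator is a GNN-augmented model, which we stub out below.

API_DOCS = {  # toy stand-in for retrieved online API documentation
    "read_csv": "Read a comma-separated values (CSV) file into a DataFrame.",
    "fit": "Fit the model to the training data.",
}

def detect_api_call(code):
    """Return the first known API call mentioned in the cell, if any."""
    for name in API_DOCS:
        if "." + name + "(" in code:
            return name
    return None

def document_cell(code):
    """Suggest documentation for a code cell via the three approaches."""
    api_call = detect_api_call(code)
    if api_call is not None:
        # 1) Query-based: reuse the retrieved API documentation.
        return API_DOCS[api_call]
    if len(code.strip()) > 40:
        # 2) Model-based: a learned generator would run here; stubbed.
        return "Auto-generated summary of: " + code.strip().splitlines()[0]
    # 3) User prompt: nudge the author where automation works poorly.
    return "TODO: please describe what this short cell does."

print(document_cell("df = pd.read_csv('train.csv')"))  # query-based path
```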

2022 ◽  
Vol 29 (2) ◽  
pp. 1-33
Author(s):  
April Yi Wang ◽  
Dakuo Wang ◽  
Jaimie Drozdal ◽  
Michael Muller ◽  
Soya Park ◽  
...  

Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants’ satisfaction with their computational notebook.


2021 ◽  
Author(s):  
Eduardo Marmitt ◽  
Helmo Alan Batista de Araújo ◽  
Mariângela Mendes Recco ◽  
Matheus Lorenzato Braga

The ExpeRT Platform is a system created to assist in the development of pedagogical experiments. After tests conducted in classrooms, the teacher who created the ExpeRT Platform pointed out deficits in its Data Viewer System (DVS). This work enhances the ExpeRT Platform by using the Java language to modify the DVS source code and resolve the issues pointed out, resulting in an update to the seventh version. In addition, a web page was created to serve as the platform's portal and make the system available for download.


Author(s):  
Raquel Fialho de Queiroz Lafetá ◽  
Thiago Fialho de Queiroz Lafetá ◽  
Marcelo de Almeida Maia

A substantial effort is generally required to understand the APIs of application frameworks. High-quality API documentation may alleviate this effort, but producing such documentation still poses a major challenge for modern frameworks. To facilitate the production of framework instantiation documentation, we hypothesize that the framework code itself and the code of existing instantiations provide useful information. However, given the size and complexity of existing code, automated approaches are required to assist documentation production. Our goal is to assess an automated approach for constructing relevant documentation for framework instantiation based on source code analysis of the framework itself and of existing instantiations; documentation is considered relevant if it compares favorably with traditional framework documentation in terms of time spent and correctness during instantiation activities, information usefulness, complexity of the activity, navigation, satisfaction, information localization, and clarity. The proposed approach generates documentation in a cookbook style, where the recipes are programming activities that use the necessary API elements, driven by the framework features. We performed an empirical study, consisting of three experiments with 44 human subjects executing real framework instantiations, comparing the use of the proposed cookbooks to traditional manual framework documentation (the baseline). Our empirical assessment shows that the generated cookbooks performed better than, or at least without significant difference from, the traditional documentation, evidencing the effectiveness of the approach.
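To make the cookbook idea concrete, here is a minimal sketch, under assumed names, of how existing instantiation code might be mined for recipe steps: it parses a toy instantiation with Python's ast module and records which framework classes are subclassed and which methods are overridden. The paper's approach analyzes real framework and instantiation code; this is only a toy of the same flavor.

```python
# Mine a toy framework instantiation for cookbook-style recipe steps.
# The class and method names are invented for illustration.
import ast

INSTANTIATION = """
class OrderHandler(FrameworkHandler):
    def on_start(self):
        self.register("orders")
    def on_message(self, msg):
        self.ack(msg)
"""

recipe_steps = []
for node in ast.walk(ast.parse(INSTANTIATION)):
    if isinstance(node, ast.ClassDef):
        # Record which framework classes the instantiation extends.
        bases = [ast.unparse(b) for b in node.bases]
        recipe_steps.append(f"Subclass {', '.join(bases)} as {node.name}")
        for item in node.body:
            if isinstance(item, ast.FunctionDef):
                # Record which hook methods the instantiation overrides.
                recipe_steps.append(f"Override {item.name}() in {node.name}")

# Emit the mined steps as a numbered cookbook recipe.
for i, step in enumerate(recipe_steps, 1):
    print(f"{i}. {step}")
```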


2021 ◽  
Vol 3 ◽  
Author(s):  
David Rozas ◽  
Nigel Gilbert ◽  
Paul Hodkinson ◽  
Samer Hassan

Peer production communities are based on the collaboration of communities of people, mediated by the Internet, typically to create digital commons, as in Wikipedia or free software. The contribution activities around the creation of such commons (e.g., source code, articles, or documentation) have been widely explored. However, other types of contribution whose focus is directed toward the community have remained significantly less visible (e.g., the organization of events or mentoring). This work challenges the notion of contribution in peer production through an in-depth qualitative study of a prominent “code-centric” example: the case of the free software project Drupal. Involving the collaboration of more than a million participants, the Drupal project supports nearly 2% of websites worldwide. This research (1) offers empirical evidence of the perception of “community-oriented” activities as contributions, and (2) analyzes their lack of visibility in the digital platforms of collaboration. Therefore, through the exploration of a complex and “code-centric” case, this study aims to broaden our understanding of the notion of contribution in peer production communities, incorporating new kinds of contributions customarily left invisible.


Author(s):  
Ulrike Schultze ◽  
Anita D. Bhappu

Co-production, which is the generation of value through the direct involvement of customers in the creation of a service context and in the design, delivery, and marketing of goods and services that they themselves consume, implies customer-firm collaboration. The nature of this collaboration, however, is highly dependent on the organization’s service design, which increasingly includes Internet technology, as well as customer communities. Whereas dyadic co-production implies a single customer’s involvement with a firm, community-based co-production implies multiple customers simultaneously engaged in value-adding activities with a firm. In order to build a theoretical understanding of these modes of customer collaboration and to explore the role and implications of Internet technologies within them, we develop a contingency theory of customer co-production designs. We then use cases of Internet-based services to highlight the benefits and challenges of relying on Internet technology to implement customer co-production.


2020 ◽  
Vol 10 (20) ◽  
pp. 7253
Author(s):  
Tong Li ◽  
Shiheng Wang ◽  
David Lillis ◽  
Zhen Yang

Maintaining traceability links in software systems is a crucial task for software management and development. Unfortunately, dealing with traceability links is typically treated as an afterthought due to time pressure. Some studies attempt to automate this task using information retrieval-based methods, but they concentrate only on calculating the textual similarity between software artifacts and do not take the properties of those artifacts into account. In this paper, we propose a novel traceability link recovery approach that comprehensively measures the similarity between use cases and source code by exploiting their particular properties. To this end, we leverage and combine machine learning and logical reasoning techniques. On the one hand, our method extracts features based on the semantics of the use cases and source code and uses a classification algorithm to train a classifier. On the other hand, we utilize the relationships between artifacts and define a series of rules to recover traceability links. In particular, we leverage not only the source code’s structural information but also the interrelationships between use cases. We have conducted a series of experiments on multiple datasets to evaluate our approach against existing approaches; the results show that our approach is substantially better than other methods.
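For a flavor of how textual similarity and a structural rule can be combined, the sketch below scores use cases against source files with TF-IDF cosine similarity (via scikit-learn) and propagates links along a toy call graph. The data, threshold, and single rule are illustrative assumptions; the paper's approach additionally trains a classifier over semantic features.

```python
# Toy IR-based traceability recovery: textual similarity plus one
# structural rule (a linked file also suggests its callees).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

use_cases = {
    "UC1": "customer logs in with username and password",
    "UC2": "customer exports the monthly report as pdf",
}
source_files = {
    "LoginController.java": "class LoginController authenticate username password session",
    "ReportExporter.java": "class ReportExporter export pdf monthly report",
    "PdfWriter.java": "class PdfWriter write pdf bytes",
}
# Structural information: caller -> callees, as if extracted from the code.
calls = {"ReportExporter.java": ["PdfWriter.java"]}

vec = TfidfVectorizer()
matrix = vec.fit_transform(list(use_cases.values()) + list(source_files.values()))
uc_vecs, src_vecs = matrix[: len(use_cases)], matrix[len(use_cases):]
sims = cosine_similarity(uc_vecs, src_vecs)

links = set()
for i, uc in enumerate(use_cases):
    for j, src in enumerate(source_files):
        if sims[i, j] > 0.2:                    # textual-similarity threshold
            links.add((uc, src))
            for callee in calls.get(src, []):   # structural rule
                links.add((uc, callee))

print(sorted(links))
```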


2006 ◽  
Vol 79 (11) ◽  
pp. 1588-1598 ◽  
Author(s):  
Lu Zhang ◽  
Tao Qin ◽  
Zhiying Zhou ◽  
Dan Hao ◽  
Jiasu Sun

2019 ◽  
Vol 8 (2) ◽  
pp. 5888-5895

Natural language processing on software systems usually involves high-dimensional, noisy, and irrelevant features, which lead to inaccurate and poor contextual similarity between a project’s source code and its API documentation. Most traditional source code analysis models do not address finding and extracting the features relevant for contextual similarity. As the size of the project source code and its related API documentation grows, these models struggle to incorporate the contextual similarity between the source code and the API documentation for code analysis. One of the best solutions to this problem is finding the essential features using the source code dependency graph. In this paper, the dependency graph is used to compute the contextual similarity between source code metrics and API documents. A novel contextual similarity measure is used to find the relationship between the project source code metrics and the API documents. The proposed model is evaluated on different project source codes and API documents in terms of pre-processing, contextual similarity, and runtime. Experimental results show that the proposed model has high computational efficiency compared to existing models on large datasets.
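One way to read the dependency-graph idea is as follows: weight each code term by how central its defining module is in the dependency graph, then compare the weighted code terms against an API document. The sketch below does exactly that, using PageRank centrality as a stand-in relevance weight; every name, the toy graph, and the weighting choice are assumptions for illustration, not the paper's actual measure.

```python
# Centrality-weighted term similarity between code and an API document.
# The module graph, terms, and weighting scheme are all invented.
import math
import networkx as nx

# Toy module dependency graph (edges point at dependencies).
g = nx.DiGraph([("app", "parser"), ("app", "db"), ("parser", "tokenizer")])
centrality = nx.pagerank(g)  # centrality as a proxy for term relevance

# Terms extracted from each module's identifiers.
module_terms = {
    "app": ["load", "config"],
    "parser": ["parse", "token"],
    "tokenizer": ["token", "split"],
    "db": ["query", "connect"],
}
api_doc_terms = ["parse", "token", "stream"]

# Build a centrality-weighted term vector for the source code.
code_vec = {}
for module, terms in module_terms.items():
    for t in terms:
        code_vec[t] = code_vec.get(t, 0.0) + centrality[module]
doc_vec = {t: 1.0 for t in api_doc_terms}

# Cosine similarity between the two sparse term vectors.
dot = sum(code_vec.get(t, 0.0) * w for t, w in doc_vec.items())
norm = math.sqrt(sum(v * v for v in code_vec.values())) * math.sqrt(len(doc_vec))
print(f"contextual similarity = {dot / norm:.3f}")
```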

