BERT_SE: A Pre-Trained Language Representation Model for Software Engineering

2021 ◽  
Author(s):  
Eliane Maria De Bortoli Fávero ◽  
Dalcimar Casanova

Natural Language Processing (NLP) has become highly relevant in several areas. In software engineering (SE), NLP applications are based on the classification of similar texts (e.g., software requirements) and are applied to tasks such as estimating software effort and selecting human resources. Classifying software requirements is a complex task, given the informality and complexity inherent in the texts produced during the software development process. Pre-trained embedding models are a viable alternative given the low volume and poor quality of labeled textual data in software engineering. Although there is much research on the application of word embeddings in several areas, to date no known studies have explored their use in creating a model specific to the SE domain. This article therefore proposes a contextualized embedding model, called BERT_SE, which allows the recognition of specific and relevant terms in the context of SE. BERT_SE was assessed on the software requirements classification task, showing an average improvement of 13% over the BERT_base model made available by the authors of BERT. The code and pre-trained models are available at https://github.com/elianedb.

Proceedings ◽  
2021 ◽  
Vol 77 (1) ◽  
pp. 17
Author(s):  
Andrea Giussani

In the last decade, advances in statistical modeling and computer science have boosted the production of machine-generated content in different fields: from language to image generation, the quality of the generated outputs is remarkably high, sometimes better than that of content produced by a human being. Modern technological advances such as OpenAI’s GPT-2 (and recently GPT-3) permit automated systems to dramatically alter reality with synthetic outputs, so that humans are not able to distinguish the real copy from its counterfeit. An example is given by an article entirely written by GPT-2, but many other examples exist. In the field of computer vision, Nvidia’s Generative Adversarial Network, commonly known as StyleGAN (Karras et al. 2018), has become the de facto reference point for the production of a huge amount of fake human face portraits; additionally, recent algorithms were developed to create both musical scores and mathematical formulas. This presentation aims to introduce participants to the state-of-the-art results in this field: we will cover both GANs and language modeling with recent applications. The novelty here is that we apply a transformer-based machine learning technique, namely RoBERTa (Liu et al. 2019), to the detection of human-produced versus machine-produced text in the context of fake news detection. RoBERTa is a recent algorithm based on the well-known Bidirectional Encoder Representations from Transformers algorithm, known as BERT (Devlin et al. 2018); this is a bidirectional transformer for natural language processing, developed by Google and pre-trained on a huge amount of unlabeled textual data to learn embeddings. We will then use these representations as the input to our classifier to detect real vs. machine-produced text. The application is demonstrated in the presentation.
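The last step described above, feeding fixed text representations into a classifier, can be sketched as follows. This is a minimal illustration only: the two-dimensional vectors are hypothetical stand-ins for RoBERTa embeddings (which would come from a pre-trained model and have hundreds of dimensions), and plain logistic regression is an assumed choice of classifier, not necessarily the one used in the presentation.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with batch gradient descent (zero initialization)."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x) -> int:
    """Return 1 (machine-produced) or 0 (human-produced)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical toy "embeddings": label 0 = human-produced, 1 = machine-produced.
toy_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1],
                  [0.1, 0.9], [0.2, 0.8], [0.1, 0.7]]
toy_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(toy_embeddings, toy_labels)
```

In a real pipeline the embeddings would be extracted once from the pre-trained encoder and only the small classifier on top would be trained on the labeled examples.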


Author(s):  
Andrej Zgank ◽  
Izidor Mlakar ◽  
Uros Berglez ◽  
Danilo Zimsek ◽  
Matej Borko ◽  
...  

The chapter presents an overview of human-computer interfaces, which are a crucial element of an ambient intelligence solution. The focus is on embodied conversational agents, which are needed to communicate with users in the most natural way. Different input and output modalities, together with the supporting methods for processing the captured information (e.g., automatic speech recognition, gesture recognition, natural language processing, dialog processing, and text-to-speech synthesis), play a crucial role in providing a high quality of experience to the user. As an example, the use of an embodied conversational agent in the e-Health domain is proposed.


Author(s):  
Luis Costa ◽  
Neil Loughran ◽  
Roy Grønmo

Model-driven software engineering (MDE) rests on the basic assumption that developing software systems from high-level abstractions, together with generating low-level implementation code, can improve the quality of the systems while reducing costs and improving time to market. This chapter provides an overview of MDE, including state-of-the-art approaches, standards, resources, and tools that support different aspects of model-driven software engineering: language development, modeling services, and real-time applications. The chapter concludes with a reflection on the main challenges faced by projects using the current MDE technologies, pointing out some promising directions for future developments.


2021 ◽  
Author(s):  
Paolo Tirotta ◽  
Stefano Lodi

Transfer learning through large pre-trained models has changed the landscape of current applications in natural language processing (NLP). Recently Optimus, a variational autoencoder (VAE) which combines two pre-trained models, BERT and GPT-2, has been released, and its combination with generative adversarial networks (GANs) has been shown to produce novel, yet very human-looking text. The Optimus and GANs combination avoids the troublesome application of GANs to the discrete domain of text, and prevents the exposure bias of standard maximum likelihood methods. We combine the training of GANs in the latent space, with the finetuning of the decoder of Optimus for single word generation. This approach lets us model both the high-level features of the sentences, and the low-level word-by-word generation. We finetune using reinforcement learning (RL) by exploiting the structure of GPT-2 and by adding entropy-based intrinsically motivated rewards to balance between quality and diversity. We benchmark the results of the VAE-GAN model, and show the improvements brought by our RL finetuning on three widely used datasets for text generation, with results that greatly surpass the current state-of-the-art for the quality of the generated texts.
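The idea of adding an entropy-based intrinsic reward to trade off quality against diversity can be illustrated with a small sketch. The additive form below, the weight `beta`, and the function names are assumptions made for illustration; the abstract does not give the exact reward used by the authors.

```python
import math

def entropy(probs) -> float:
    """Shannon entropy (in nats) of a discrete next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def shaped_reward(quality_reward: float, next_token_probs, beta: float = 0.1) -> float:
    """Extrinsic quality reward plus an entropy bonus (assumed additive form).

    A larger beta favors diverse (high-entropy) token distributions;
    beta = 0 recovers the pure quality reward.
    """
    return quality_reward + beta * entropy(next_token_probs)

# A peaked distribution earns almost no bonus, a uniform one the maximum.
peaked = shaped_reward(1.0, [0.97, 0.01, 0.01, 0.01])
uniform = shaped_reward(1.0, [0.25, 0.25, 0.25, 0.25])
```

With the same quality reward, the uniform distribution scores higher than the peaked one, which is the pressure toward diversity that the entropy term is meant to supply.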


2018 ◽  
Vol 1 (80) ◽  
Author(s):  
Audrius Gocentas ◽  
Anatoli Landõr ◽  
Aleksandras Kriščiūnas

Research background and hypothesis. Replete schedule of competitions and intense training are features of contemporary team sports. Athletes, especially the most involved ones, may not have enough time to recover. As a consequence, aggregated fatigue can manifest in some undesirable form and affect athlete’s performance and health. Research aim. The aim of this study was to evaluate the changes in heart rate recovery (HRR) and investigate possible relations with sport-specific measures of efficacy in professional basketball players during competition season. Research methods. Eight male high-level basketball players (mean ± SD, body mass, 97.3 ± 11.33 kg; height 2.02 ± 0.067 m, and age 23 ± 3.12 years) were investigated. The same basketball specific exercise was replicated several times from September till April during the practice sessions in order to assess the personal trends of HRR. Heart rate monitoring was performed using POLAR TEAM SYSTEM. Investigated athletes were ranked retrospectively according to the total amount of minutes played and the coefficients of efficacy. Research results. There were significant differences in the trends of HRR between the investigated players. The most effective players showed decreasing trends of HRR in all cases of ranking. Discussion and conclusions. Research findings have shown that the quality of heart rate recovery differs between basketball players of the same team and could be associated with sport-specific efficacy and competition playing time. Keywords: adaptation, autonomic control, monitoring training.
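A personal HRR trend of the kind described in the methods can be sketched as a least-squares slope over repeated sessions. The definition of HRR used here (heart rate at the end of exercise minus heart rate after a fixed rest interval) is a common convention and an assumption; the abstract does not state which definition or trend estimator the study used, and the sample values are invented for illustration.

```python
def heart_rate_recovery(hr_end_of_exercise: float, hr_after_rest: float) -> float:
    """HRR as the drop in beats per minute over a fixed rest interval (assumed definition)."""
    return hr_end_of_exercise - hr_after_rest

def linear_trend(values) -> float:
    """Least-squares slope of values against session index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator

# Hypothetical (end-of-exercise, after-rest) heart rates for one player across four sessions.
sessions = [(180, 150), (181, 153), (179, 153), (180, 156)]
hrr_values = [heart_rate_recovery(end, rest) for end, rest in sessions]
slope = linear_trend(hrr_values)  # negative slope = decreasing HRR over the season
```

A negative slope corresponds to the decreasing HRR trend that the abstract reports for the most effective players.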


2013 ◽  
Vol 11 (1) ◽  
pp. 8-13
Author(s):  
V. Behar ◽  
V. Bogdanova

In this paper, a set of nonlinear edge-preserving filters is proposed as a pre-processing stage to improve the quality of hyperspectral images before object detection. The capability of each nonlinear filter to improve images corrupted by spatially and spectrally correlated Gaussian noise is evaluated in terms of the average Improvement factor in the Peak Signal to Noise Ratio (IPSNR), estimated at the filter output. The simulation results demonstrate that this pre-processing procedure is efficient only when the spatial and spectral correlation coefficients of the noise do not exceed 0.6.
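The evaluation metric can be illustrated with a small sketch. It assumes that IPSNR is the PSNR at the filter output minus the PSNR at the filter input, both measured in dB against the clean signal; the abstract does not spell out the formula. A one-dimensional median filter stands in for the paper's edge-preserving filters, and the toy signal is invented for illustration.

```python
import math
from statistics import median

def psnr(reference, estimate, peak: float) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def median_filter(signal, radius: int = 1):
    """Sliding-window median with edge replication; preserves step edges."""
    n = len(signal)
    filtered = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        filtered.append(median(window))
    return filtered

def ipsnr(clean, noisy, filtered, peak: float) -> float:
    """Improvement in PSNR (dB) from filter input to filter output (assumed definition)."""
    return psnr(clean, filtered, peak) - psnr(clean, noisy, peak)

# Toy step edge with a few corrupted samples (not Gaussian noise; illustration only).
clean = [0, 0, 0, 0, 10, 10, 10, 10]
noisy = [1, 5, 0, 0, 10, 10, 7, 10]
filtered = median_filter(noisy)
gain_db = ipsnr(clean, noisy, filtered, peak=10)
```

A positive `gain_db` means the filter reduced the error relative to the clean signal; in the toy case the step edge at index 4 is left intact, which is the property that motivates edge-preserving filters.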


2019 ◽  
Vol 28 (10) ◽  
pp. 106-117
Author(s):  
R. M. Asadullin

The continuous modernization of the education system makes the problems of the quality of teacher training increasingly relevant. Moreover, the measures taken to improve the system of teacher education are largely confined to the introduction of new organizational and managerial mechanisms and practically do not affect the internal content and technological structure of the teacher training process. Modern pedagogical universities are constantly looking for innovative models of training teachers that will be able to solve non-standard social and professional tasks. However, recent studies in this area do not fully take into account the nature of pedagogical activity and conditions of its formation. Thus, the need arises for a special study of the processes and means of updating the content and technologies of teacher training in order to control the level of students’ professional competencies development, as required by educational and professional standards. This means the creation of a special educational system in a pedagogical university, which can provide a harmonious and synchronous mastering by future specialists of both subject knowledge and methods of pedagogical activity. The article provides a theoretical study aimed at identifying key patterns of designing a new content for teacher education, the basis of which is the formation of a future teacher as a subject of his own professional activity. The author describes the experience of using a subject-oriented model of education, implemented at Bashkir State Pedagogical University n.a. M. Akmulla. The effectiveness of this model is confirmed by the high level of students’ mastery of designing methods and constructing the educational process, as well as their positive experience in the implementation of educational activities.

