Energy and Policy Considerations for Modern Deep Learning Research

2020 ◽  
Vol 34 (09) ◽  
pp. 13693-13696
Author(s):  
Emma Strubell ◽  
Ananya Ganesh ◽  
Andrew McCallum

The field of artificial intelligence has experienced a dramatic methodological shift towards large neural networks trained on plentiful data. This shift has been fueled by recent advances in hardware and techniques enabling remarkable levels of computation, resulting in impressive advances in AI across many applications. However, the massive computation required to obtain these exciting results is costly both financially, due to the price of specialized hardware and electricity or cloud compute time, and to the environment, as a result of non-renewable energy used to fuel modern tensor processing hardware. In a paper published this year at ACL, we brought this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training and tuning neural network models for NLP (Strubell, Ganesh, and McCallum 2019). In this extended abstract, we briefly summarize our findings in NLP, incorporating updated estimates and broader information from recent related publications, and provide actionable recommendations to reduce costs and improve equity in the machine learning and artificial intelligence community.
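As a rough illustration of the kind of accounting this line of work performs, the footprint of a training run can be estimated from average power draw, runtime, datacenter PUE, and grid carbon intensity. The sketch below is a back-of-envelope calculation with made-up numbers, not figures from the paper.

```python
# Illustrative back-of-envelope training-cost estimate in the spirit of
# Strubell et al. (2019). All numeric values below are hypothetical
# assumptions chosen for the example, not figures from the paper.

def training_footprint(avg_power_w, hours, pue=1.58,
                       co2_kg_per_kwh=0.4, usd_per_kwh=0.12):
    """Estimate energy (kWh), CO2 (kg) and electricity cost (USD)."""
    # PUE scales IT-equipment power up to total facility power.
    kwh = avg_power_w / 1000.0 * hours * pue
    return {"kwh": kwh,
            "co2_kg": kwh * co2_kg_per_kwh,
            "usd": kwh * usd_per_kwh}

# Hypothetical run: eight 300 W accelerators for 72 hours.
est = training_footprint(avg_power_w=8 * 300, hours=72)
```

Varying the carbon-intensity parameter shows why the energy mix of the region hosting the hardware matters as much as the raw kWh figure.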

2020 ◽  
Vol 1 (9) ◽  
pp. 140-148
Author(s):  
Oleksandra Tsyra ◽  
Nataliia Punchenko ◽  
Oleksii Fraze-Frazenko

The article analyzes the main aspects of creating virtual assistants, a class of intelligent computer programs belonging to artificial intelligence (AI) systems. The principal task of such AI is to enable effective communication between intelligent robotic systems (including unmanned vehicles) and humans. This capability rests on deep learning (machine translation, speech recognition, processing of complex natural-language texts, computer vision, driving automation, etc.). The machine learning subsystem can be characterized by neural network models that mimic the brain. A neural network model learns from large data sets and thereby acquires certain “skills”, but how it applies them remains opaque to engineers, which has become one of the most important problems for many deep learning applications: such a model is formal and offers no insight into the logic of its actions. This raises the question of whether the level of trust in systems based on machine learning can be increased. Machine learning algorithms are complex mathematical descriptions and procedures with a growing impact on people's lives, and as decisions are increasingly determined by algorithms, those decisions become less transparent and understandable. Against this background, the paper considers the technological components and algorithms of virtual digital assistants, carries out information modeling based on a conceptual model of the interaction of a virtual assistant with a database, and analyzes the scope and further development of the IT sphere.


10.14311/1121 ◽  
2009 ◽  
Vol 49 (2) ◽  
Author(s):  
M. Chvalina

This article analyses the existing possibilities for using standard statistical methods and artificial intelligence methods for short-term forecasting and simulation of demand in the field of telecommunications. The most widespread methods are based on time series analysis, while approaches based on artificial intelligence methods, including neural networks, are currently booming. Several separate approaches are applied to demand modelling in telecommunications, and the results of these models are compared with actual recorded values. The quality of the neural network models is then examined.
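A minimal sketch of one classical short-term forecaster of the kind such comparisons typically include is simple exponential smoothing. The demand series and smoothing factor below are invented for illustration and are not taken from the article.

```python
# Simple exponential smoothing: a standard statistical baseline for
# short-term demand forecasting. Series and alpha are illustrative.

def exp_smooth_forecast(series, alpha=0.3):
    """Return the one-step-ahead forecast via simple exponential smoothing."""
    level = series[0]
    for x in series[1:]:
        # New level = weighted blend of the latest observation and old level.
        level = alpha * x + (1 - alpha) * level
    return level

demand = [100, 104, 101, 108, 110, 107]   # hypothetical monthly demand
forecast = exp_smooth_forecast(demand, alpha=0.3)
```

A neural model would be evaluated against exactly this kind of baseline on held-out periods of the demand series.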


2020 ◽  
pp. 1-22 ◽  
Author(s):  
D. Sykes ◽  
A. Grivas ◽  
C. Grover ◽  
R. Tobin ◽  
C. Sudlow ◽  
...  

Abstract: Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, accurately distinguishing negated from non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated not to be present or only hypothesised, meaning a disease can be mentioned in a report without being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline called EdIE-R (Edinburgh Information Extraction for Radiology reports, https://www.ltg.ed.ac.uk/software/edie-r/) developed to process brain imaging reports, as well as two machine learning approaches: one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009, Journal of Biomedical Informatics 42(5), 839–851), a Python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017, NegBio: A high-performance tool for negation and uncertainty detection in radiology reports, arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the test set of the dataset from which our models were developed and the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods.
EdIE-R-Neg scored highest on F1 score, particularly on the Tayside test set, from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap between the machine learning models and EdIE-R-Neg on the Tayside test set was reduced by adding development Tayside data to the ESS training set, demonstrating the adaptability of the neural network models.
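To make the rule-based family of approaches concrete, here is a toy NegEx-style trigger-and-scope detector. The trigger list and window size are illustrative assumptions, not the actual EdIE-R-Neg or pyConText rules.

```python
import re

# Toy negation detector in the NegEx tradition: a clinical term is treated
# as negated if a trigger phrase appears within a few tokens before it.
# Triggers and window size are invented for this sketch.

NEG_TRIGGERS = re.compile(r"\b(no|not|without|denies|negative for)\b", re.I)

def is_negated(sentence, term, window=5):
    """Return True if a negation trigger occurs within `window` tokens before `term`."""
    tokens = sentence.lower().split()
    if term.lower() not in tokens:
        return False
    idx = tokens.index(term.lower())
    scope = " ".join(tokens[max(0, idx - window):idx])
    return bool(NEG_TRIGGERS.search(scope))

# "no evidence of acute infarct" -> "infarct" is a negated mention
```

Real systems add termination cues, hypothetical markers and syntactic scopes; the machine learning models in the study instead learn these cues from labelled report data.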


2019 ◽  
Author(s):  
Emmanuel L.C. de los Santos

ABSTRACT: Significant progress has been made in the past few years on the computational identification of biosynthetic gene clusters (BGCs) that encode ribosomally synthesized and post-translationally modified peptides (RiPPs). This is done by identifying both RiPP tailoring enzymes (RTEs) and RiPP precursor peptides (PPs). However, the identification of PPs, particularly for novel RiPP classes, remains challenging. To address this, machine learning has been used to accurately identify PP sequences. Current machine learning tools have limitations, however: they are specific to the RiPP class they are trained on, and they are context-dependent, requiring information about the genetic environment surrounding putative PP sequences. NeuRiPP overcomes these limitations by leveraging the rich dataset of high-confidence putative PP sequences from existing programs, along with experimentally verified PPs from RiPP databases. NeuRiPP uses neural network models suitable for peptide classification, with weights trained on PP datasets. It is able to identify known PP sequences as well as sequences that are likely PPs. When tested on existing RiPP BGC datasets, NeuRiPP identifies PP sequences in significantly more putative RiPP clusters than current tools, while maintaining the same HMM hit accuracy. Finally, NeuRiPP was able to successfully identify PP sequences from novel RiPP classes that were only recently characterized experimentally, highlighting its utility in complementing existing bioinformatics tools.
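One common way to feed a peptide sequence to a neural classifier of this kind is one-hot encoding over the amino-acid alphabet with padding to a fixed length. The sketch below is a generic illustration of that idea; the alphabet handling and padding length are assumptions, not NeuRiPP's actual encoding scheme.

```python
# Generic one-hot encoding of a peptide into a fixed-size matrix suitable
# as neural network input. Max length and padding policy are illustrative.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"        # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_encode(peptide, max_len=120):
    """One-hot encode a peptide, padded/truncated to max_len positions."""
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(peptide[:max_len]):
        matrix[pos][AA_INDEX[aa]] = 1       # mark the residue at this position
    return matrix

encoded = one_hot_encode("MKTAY")           # a made-up 5-residue fragment
```

Fixed-size encodings like this let one model score candidate precursor peptides regardless of the genomic context they came from, which is the context-independence the abstract emphasises.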


2019 ◽  
Author(s):  
J. Christopher D. Terry ◽  
Helen E. Roy ◽  
Tom A. August

Abstract: The accurate identification of species in images submitted by citizen scientists is currently a bottleneck for many data uses. Machine learning tools offer the potential to provide rapid, objective and scalable species identification for the benefit of many aspects of ecological science. Currently, most approaches only make use of image pixel data for classification. However, an experienced naturalist would also use a wide variety of contextual information such as the location and date of recording.

Here, we examine the automated identification of ladybird (Coccinellidae) records from the British Isles submitted to the UK Ladybird Survey, a volunteer-led mass participation recording scheme. Each image is associated with metadata: a date, location and recorder ID, which can be cross-referenced with other data sources to determine local weather at the time of recording, habitat types and the experience of the observer. We built multi-input neural network models that synthesise metadata and images to identify records to species level.

We show that machine learning models can effectively harness contextual information to improve the interpretation of images. Against an image-only baseline of 48.2%, we observe a 9.1 percentage-point improvement in top-1 accuracy with a multi-input model, compared to only a 3.6% increase when using an ensemble of image and metadata models. This suggests that contextual data is being used to interpret an image, beyond just providing a prior expectation. We show that our neural network models appear to be utilising similar pieces of evidence as human naturalists to make identifications.

Metadata is a key tool for human naturalists. We show it can also be harnessed by computer vision systems. Contextualisation offers considerable extra information, particularly for challenging species, even within small and relatively homogeneous areas such as the British Isles. Although complex relationships between disparate sources of information can be profitably interpreted by simple neural network architectures, there is likely considerable room for further progress. Contextualising images has the potential to lead to a step change in the accuracy of automated identification tools, with considerable benefits for large-scale verification of submitted records.
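The multi-input idea can be sketched structurally: an image branch and a metadata branch are each embedded, concatenated, and passed through a shared classification head. The toy below uses randomly initialised pure-Python layers purely to show the wiring; the layer sizes, features and weights are illustrative assumptions, not the study's architecture.

```python
import math, random

# Structural sketch of a multi-input classifier fusing image features with
# record metadata before the final layer. All sizes/weights are illustrative.

random.seed(0)

def dense(x, n_out):
    """A randomly initialised dense layer with ReLU (illustration only)."""
    w = [[random.uniform(-0.1, 0.1) for _ in x] for _ in range(n_out)]
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def multi_input_predict(image_features, metadata, n_species=5):
    img_branch = dense(image_features, 8)   # e.g. a CNN embedding of the photo
    meta_branch = dense(metadata, 4)        # e.g. week-of-year, latitude, recorder experience
    fused = img_branch + meta_branch        # concatenate the two branches
    return softmax(dense(fused, n_species))

probs = multi_input_predict([0.2] * 16, [0.5, 0.1, 0.9])
```

Because the branches are fused before the classification head, the network can learn interactions between context and image content, rather than merely reweighting an image-only prediction as an ensemble would.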


2018 ◽  
Vol 224 ◽  
pp. 02086
Author(s):  
Pavel Sorokin ◽  
Alexey Mishin ◽  
Vitaliy Antsev ◽  
Alexey Red’kin

The article is devoted to ensuring the stability of tower cranes against overturning. The development stages of devices for ensuring tower crane safety are examined and their shortcomings revealed. A system consisting of subsystems and drives is proposed, and their interaction is presented. The article focuses on a subsystem based on artificial intelligence methods: neural network models for forecasting wind parameters are developed, and the quality of their work is estimated. Ways of developing the topic further are suggested.


Author(s):  
Rajesh Sai K. ◽  
Veneela Adapa ◽  
Hari Kishan Kondaveeti

Almost unnoticed, artificial intelligence (AI) has become an inevitable part of our lives. In this chapter, the authors discuss how neural networks, a sub-field of AI, have changed the way we analyse things. The advent of neural networks, their inspiration from the human brain, and simplified models of the biological neuron are discussed. A detailed overview of various neural network models, their strengths, limitations, applications, and challenges is then presented.


2021 ◽  
Author(s):  
V.Y. Ilichev ◽  
I.V. Chukhraev

The article considers one of the areas of application of modern and promising computer technology: machine learning. This direction is based on creating models consisting of neural networks and training them deeply. There is currently a need to generate new, previously non-existing samples of objects of various types; most often such objects are text files or images. To achieve high-quality results, a generation method based on the adversarial interplay of two neural networks (a generator and a discriminator) was developed. This class of neural network models is distinguished by the complexity of its topology, since the structure of the neural layers must be organized correctly to achieve maximum accuracy and minimal error. The described program is written in Python using special libraries that extend the set of available commands: Keras for neural networks (the main library), Os for integration with the operating system, Matplotlib for plotting graphs, Numpy for working with data arrays, and others. A description is given of the type and features of each neural layer, as well as the use of the library functions for connecting layers, inputting initial data, and compiling and training the resulting model. The article then considers the procedure for outputting the generator and discriminator errors and the accuracy achieved by the model as a function of the number of training cycles (epochs). Based on the results, conclusions are drawn and recommendations made for using and developing the considered methodology for creating and training generative adversarial networks. The study demonstrates how comparatively simple and accessible yet effective means, the general-purpose Python language with the Keras library, can be used to create and train a complex neural network model.
In effect, it shows that this method achieves high-quality machine learning results previously attainable only with specialized software systems for working with neural networks.
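The adversarial alternation the article describes can be caricatured without any deep learning library: a "generator" parameter is nudged to fool a "discriminator", which is simultaneously nudged to separate real from generated samples. The toy below is a deliberately simplified 1-D stand-in for the article's Keras networks, showing only the training-loop structure; every model detail is an invented assumption.

```python
import random

# Toy 1-D caricature of the generator/discriminator training loop.
# Not a faithful GAN: models are single parameters, updates are heuristic
# nudges rather than gradient steps on a real GAN loss.

random.seed(1)
REAL_MEAN = 4.0        # "real data" is noise centred on 4 (illustrative)
g_shift = 0.0          # generator parameter: shifts noise toward the data
d_threshold = 0.0      # discriminator parameter: decision boundary

def generate():
    return random.gauss(0, 1) + g_shift

def discriminate(x):
    return x > d_threshold          # "looks real" if above the boundary

for step in range(200):
    real = random.gauss(REAL_MEAN, 1)
    fake = generate()
    # Discriminator step: move the boundary toward the real/fake midpoint.
    d_threshold += 0.05 * ((real + fake) / 2 - d_threshold)
    # Generator step: if a fresh sample fails to fool the discriminator,
    # shift the generated distribution toward the real data.
    if not discriminate(generate()):
        g_shift += 0.05

# After the loop, generated samples should sit much closer to the real data.
```

In the article's actual setup, both players are multi-layer Keras networks and the per-epoch generator/discriminator losses are what gets plotted; the alternating-update structure is the same.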


Author(s):  
Zhoujing Zhang ◽  
Di Xu ◽  
Ozioma Akakuru ◽  
Wenjing Xu ◽  
Yewei Zhang

The diagnosis of papillary thyroid carcinoma has long been a challenging issue of concern, and a definite diagnosis before the operation is very important and meaningful. In this study, we used an artificial intelligence algorithm instead of medical statistics to analyze the genetic fingerprint from gene chip results to identify papillary thyroid carcinoma. We trained 20 artificial neural network models with differentially expressed genes and other important genes related to the cell metabolic cycle as the lists of input features, and applied them to the diagnosis of papillary thyroid cancer in an independent validation data set. The results showed that when we used the DEG and all-gene lists as input features, the models achieved the best diagnostic performance, with AUCs of 98.97% and 99.37% respectively and an accuracy of 96% in both cases. This study reveals that the proposed artificial neural network models constructed with genetic fingerprints can predict papillary thyroid carcinoma. Such models can support clinicians in making more accurate clinical diagnoses, and at the same time provide a novel idea for the application of artificial intelligence in clinical medicine.
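At its core, scoring a gene-expression "fingerprint" means mapping a vector of expression levels for a fixed gene panel to a probability. The sketch below shows that mapping with a single logistic unit; the gene names, weights, and bias are made-up assumptions, not the study's 20 trained networks.

```python
import math

# Illustrative scoring of a gene-expression fingerprint with one logistic
# unit (the building block of the study's neural networks). Gene panel,
# weights and bias are hypothetical.

GENE_WEIGHTS = {"GENE_A": 1.2, "GENE_B": -0.8, "GENE_C": 0.5}
BIAS = -0.3

def predict_carcinoma_prob(expression):
    """Logistic output over a fixed gene panel (illustration only)."""
    z = BIAS + sum(GENE_WEIGHTS[g] * expression.get(g, 0.0)
                   for g in GENE_WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid squashes the score to (0, 1)

p = predict_carcinoma_prob({"GENE_A": 2.0, "GENE_B": 0.5, "GENE_C": 1.0})
```

The study's models stack such units into hidden layers and learn the weights from labelled gene-chip data; the input-feature lists (DEGs, metabolic-cycle genes, all genes) determine which expression values enter the vector.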

