A survey of 25 years of evaluation

2019 ◽  
Vol 25 (06) ◽  
pp. 753-767
Author(s):  
Kenneth Ward Church ◽  
Joel Hestness

Evaluation was not a thing when the first author was a graduate student in the late 1970s. There was an Artificial Intelligence (AI) boom then, but that boom was quickly followed by a bust and a long AI Winter. Charles Wayne restarted funding in the mid-1980s by emphasizing evaluation. No other sort of program could have been funded at the time, at least in America. His program was so successful that these days, shared tasks and leaderboards have become commonplace in speech and language (and in Vision and Machine Learning). It is hard to remember that evaluation was a tough sell 25 years ago. That said, we may be a bit too satisfied with the current state of the art. This paper surveys considerations from other fields, such as reliability and validity from psychology and generalization from systems. There has been a trend for publications to report better and better numbers, but what do these numbers mean? Sometimes the numbers are too good to be true, and sometimes the truth is better than the numbers. It is one thing for an evaluation to fail to find a difference between man and machine, and quite another thing to pass the Turing Test. As Feynman said, “the first principle is that you must not fool yourself–and you are the easiest person to fool.”
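To make the notion of reliability borrowed from psychology concrete, the sketch below computes Cohen's kappa, the standard chance-corrected measure of inter-annotator agreement often used when building evaluation data; the two annotation sequences are invented for illustration and are not from the paper.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled independently at their own rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of ten utterances.
a = ["pos", "pos", "neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos"]
b = ["pos", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "neg", "pos"]
print(f"kappa = {cohens_kappa(a, b):.2f}")
```

A low kappa signals that the "gold" labels themselves are unreliable, which puts an upper bound on how much any leaderboard number computed against them can mean.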

Author(s):  
Ramjee Prasad ◽  
Purva Choudhary

Artificial Intelligence (AI) as a technology has existed for less than a century. In spite of this, it has made great strides. The rapid progress in this field has aroused the curiosity of many technologists around the globe, and many companies across various domains are keen to explore its potential. For a field that has achieved so much in such a short time, it is imperative that people who aim to work in Artificial Intelligence study its origins, recent developments, and future possibilities of expansion to gain better insight into the field. This paper encapsulates the notable progress made in Artificial Intelligence, from its conceptualization to its current state and future possibilities across various fields. It covers concepts such as the Turing machine, the Turing test, historical developments in Artificial Intelligence, expert systems, big data, robotics, current developments in Artificial Intelligence across various fields, and future possibilities of exploration.


Author(s):  
William B. Rouse

This book discusses the use of models and interactive visualizations to explore designs of systems and policies in determining whether such designs would be effective. Executives and senior managers are very interested in what “data analytics” can do for them and, quite recently, what the prospects are for artificial intelligence and machine learning. They want to understand and then invest wisely. They are reasonably skeptical, having experienced overselling and under-delivery. They ask about reasonable and realistic expectations. Their concern is with the futurity of decisions they are currently entertaining. They cannot fully address this concern empirically. Thus, they need some way to make predictions. The problem is that one rarely can predict exactly what will happen, only what might happen. To overcome this limitation, executives can be provided predictions of possible futures and the conditions under which each scenario is likely to emerge. Models can help them to understand these possible futures. Most executives find such candor refreshing, perhaps even liberating. Their job becomes one of imagining and designing a portfolio of possible futures, assisted by interactive computational models. Understanding and managing uncertainty is central to their job. Indeed, doing this better than competitors is a hallmark of success. This book is intended to help them understand what fundamentally needs to be done, why it needs to be done, and how to do it. The hope is that readers will discuss this book and develop a “shared mental model” of computational modeling in the process, which will greatly enhance their chances of success.
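A minimal sketch of the kind of scenario exploration described above, assuming a toy growth model with invented parameters: rather than a single forecast, a Monte Carlo simulation yields a distribution of possible futures that can be summarized as pessimistic, median, and optimistic scenarios.

```python
import random

def simulate_outcomes(n_scenarios=10_000, horizon_years=5,
                      mean_growth=0.05, volatility=0.15, seed=0):
    """Simulate a distribution of possible futures rather than a single forecast."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_scenarios):
        value = 1.0
        for _ in range(horizon_years):
            value *= 1.0 + rng.gauss(mean_growth, volatility)
        outcomes.append(value)
    outcomes.sort()
    return {
        "pessimistic (5th pct)": outcomes[int(0.05 * n_scenarios)],
        "median": outcomes[n_scenarios // 2],
        "optimistic (95th pct)": outcomes[int(0.95 * n_scenarios)],
    }

for scenario, value in simulate_outcomes().items():
    print(f"{scenario}: {value:.2f}x initial value")
```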


2021 ◽  
Vol 54 (6) ◽  
pp. 1-35
Author(s):  
Ninareh Mehrabi ◽  
Fred Morstatter ◽  
Nripsuta Saxena ◽  
Kristina Lerman ◽  
Aram Galstyan

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in the design and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently, work in traditional machine learning and deep learning has begun to address such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of bias that can affect AI applications. We then created a taxonomy of the fairness definitions that machine learning researchers have proposed in order to avoid existing bias in AI systems. In addition, we examined different domains and subdomains in AI, showing what researchers have observed with regard to unfair outcomes in state-of-the-art methods and the ways they have tried to address them. There are still many future directions and solutions that could mitigate the problem of bias in AI systems. We hope that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.
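As a concrete instance of the kind of fairness definition such a taxonomy covers, the sketch below computes the demographic parity gap, the difference in positive-prediction rates across groups; the predictions and group labels are made up for illustration and are not from the survey.

```python
def demographic_parity_gap(predictions, groups, positive=1):
    """Largest difference in positive-prediction rates across the groups present."""
    rates = {}
    for g in set(groups):
        members = [p for p, gi in zip(predictions, groups) if gi == g]
        rates[g] = sum(p == positive for p in members) / len(members)
    values = sorted(rates.values())
    return rates, values[-1] - values[0]

# Hypothetical binary decisions for applicants from groups "a" and "b".
preds  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]
rates, gap = demographic_parity_gap(preds, groups)
print(rates, f"gap = {gap:.2f}")
```

A gap near zero satisfies demographic parity; other definitions in the taxonomy (equalized odds, calibration, individual fairness) condition on outcomes or similarity and can conflict with it.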


2021 ◽  
Author(s):  
Kai Guo ◽  
Zhenze Yang ◽  
Chi-Hua Yu ◽  
Markus J. Buehler

This review revisits the state of the art of research efforts on the design of mechanical materials using machine learning.


Algorithms ◽  
2019 ◽  
Vol 12 (5) ◽  
pp. 99 ◽  
Author(s):  
Kleopatra Pirpinia ◽  
Peter A. N. Bosman ◽  
Jan-Jakob Sonke ◽  
Marcel van Herk ◽  
Tanja Alderliesten

Current state-of-the-art medical deformable image registration (DIR) methods optimize a weighted sum of key objectives of interest. Having a pre-determined weight combination that leads to high-quality results for any instance of a specific DIR problem (i.e., a class solution) would facilitate clinical application of DIR. However, such a combination can vary widely from instance to instance and is currently often determined manually. A multi-objective optimization approach to DIR removes the need for manual tuning, providing a set of high-quality trade-off solutions. Here, we investigate machine learning of a multi-objective class solution, i.e., not a single weight combination but a set thereof, that, when used on any instance of a specific DIR problem, approximates such a set of trade-off solutions. To this end, we employed a multi-objective evolutionary algorithm to learn sets of weight combinations for three breast DIR problems of increasing difficulty: 10 prone-prone cases, 4 prone-supine cases with limited deformations, and 6 prone-supine cases with larger deformations and image artefacts. Clinically acceptable results were obtained for the first two problems. Therefore, for DIR problems with limited deformations, a multi-objective class solution can be machine learned and used to straightforwardly compute multiple high-quality DIR outcomes, potentially leading to more efficient use of DIR in clinical practice.
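A minimal sketch of the weighted-sum formulation that the multi-objective class solution generalizes: a scalar registration cost is built from weighted objectives, and a class solution is a small set of weight combinations applied to every new instance rather than one hand-tuned combination. The placeholder objectives and weights below are illustrative, not those used in the paper.

```python
import numpy as np

def registration_cost(params, weights, objectives):
    """Weighted sum of registration objectives (e.g., similarity, smoothness)."""
    return sum(w * f(params) for w, f in zip(weights, objectives))

# Placeholder objectives: image dissimilarity and deformation smoothness.
dissimilarity = lambda p: float(np.sum((p - 1.0) ** 2))
smoothness    = lambda p: float(np.sum(np.diff(p) ** 2))
objectives = [dissimilarity, smoothness]

# A "class solution" in the multi-objective sense: several weight combinations
# spanning the trade-off front, reused on every new instance of the problem.
class_solution = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]

params = np.linspace(0.0, 2.0, 5)   # stand-in for transformation parameters
for weights in class_solution:
    print(weights, registration_cost(params, weights, objectives))
```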


2011 ◽  
Vol 130-134 ◽  
pp. 2047-2050 ◽  
Author(s):  
Hong Chun Qu ◽  
Xie Bin Ding

SVM (Support Vector Machine) is an artificial intelligence methodology based on the structural risk minimization principle; it generalizes better than traditional machine learning methods and is powerful at learning from limited samples. To address the scarcity of engine fault samples, FLS-SVM, an improved SVM, is applied. Ten common engine faults are trained and recognized in the paper. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of FLS-SVM is better than that of LS-SVM.
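A hedged sketch of the kind of pipeline the abstract describes, substituting scikit-learn's standard SVC for the paper's FLS-SVM (which has no off-the-shelf implementation); the synthetic feature vectors below merely stand in for the simulated influence-coefficient data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for simulated engine data: 10 fault classes,
# each a cluster of 8-dimensional "influence coefficient" deviations.
n_faults, per_class, n_features = 10, 40, 8
centers = rng.normal(0.0, 1.0, size=(n_faults, n_features))
X = np.vstack([c + 0.2 * rng.normal(size=(per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_faults), per_class)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf", C=10.0, gamma="scale")  # standard SVM, not FLS-SVM
clf.fit(X_train, y_train)
print(f"diagnostic accuracy on held-out samples: {clf.score(X_test, y_test):.2f}")
```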


2021 ◽  
Author(s):  
Richard Büssow ◽  
Bruno Hain ◽  
Ismael Al Nuaimi

Objective and Scope: Analysis of operational plant data needs experts to interpret detected anomalies, which are defined as unusual operating points. The next step on the digital transformation journey is to provide actionable insights into the data. Prescriptive Maintenance defines in advance which kind of detailed maintenance and spare parts will be required. This paper details the requirements for improving these predictions for rotating equipment and shows the potential to integrate the outcome into an operational workflow.

Methods, Procedures, Process: First-principle or physics-based modelling provides additional insights into the data, since the results are directly interpretable. However, such approaches are typically assumed to be expensive to build and not scalable. A successful strategy is to identify and focus on the relevant equipment and to model it with a hybrid model that combines first-principle physics and machine learning. The model is trained with a machine learning approach on historic or current plant data to predict conditions that have not occurred before. The better the Artificial Intelligence is trained, the better the prediction will be.

Results, Observations, Conclusions: The general aim when operating a plant is the actual use of operational data for process and maintenance optimization through advanced analytics. Typically, a data-driven central oversight function supports operations and maintenance staff. A major lesson learned is that the results of a rather simple statistical approach to anomaly detection fall short of expectations and are too labor intensive. It is a widespread misconception that being able to deal with big data is sufficient to achieve good prediction quality for Prescriptive Maintenance. What big data companies normally lack is domain knowledge, especially on plant-critical rotating equipment. Without domain knowledge, the relevant inputs to the model will have shortcomings, and so will its predictions. This paper gives an example of a refinery where the described hybrid model has been used.

Novel and Additive Information: First-principle models are typically expensive to build and not scalable. This hybrid-model approach, which combines first-principle physics-based models with artificial intelligence and integrates them into an operational workflow, shows a new way forward.
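One common way to build the hybrid model described above is to let the physics predict the bulk of the behaviour and train a machine learning regressor only on the residual. The sketch below illustrates this pattern with an invented pump-power formula and synthetic data; it is not the refinery model from the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Hypothetical first-principles model: ideal pump power from flow and head.
def physics_power(flow, head, efficiency=0.75, rho=1000.0, g=9.81):
    return rho * g * flow * head / efficiency

# Synthetic plant history: measured power deviates from the ideal model
# because of wear, fouling, and sensor effects the physics does not capture.
flow = rng.uniform(0.05, 0.20, 500)            # m^3/s
head = rng.uniform(20.0, 60.0, 500)            # m
measured = physics_power(flow, head) * (1.0 + 0.05 * np.sin(20 * flow)) \
           + rng.normal(0.0, 200.0, 500)

X = np.column_stack([flow, head])
residual = measured - physics_power(flow, head)

# Machine learning corrects only the residual the physics model misses.
ml = GradientBoostingRegressor().fit(X, residual)
hybrid_prediction = physics_power(flow, head) + ml.predict(X)
print(f"mean absolute error: {np.mean(np.abs(hybrid_prediction - measured)):.1f} W")
```

Because the physics carries the interpretable part, the learned residual can be monitored directly: a drift in the residual is itself an anomaly signal tied to the specific equipment being modelled.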


2018 ◽  
Vol 186 ◽  
pp. 09004
Author(s):  
André Schaaff ◽  
Marc Wenger

The work environment has evolved deeply in recent decades with the generalisation of IT in terms of hardware, online resources, and software. Librarians do not escape this movement, and their working environment is becoming essentially digital (databases, online publications, wikis, specialised software, etc.). With the Big Data era, new tools will become available, implementing artificial intelligence, text mining, machine learning, etc. Most of these technologies already exist, but they will become widespread and strongly impact our ways of working. The development of "business"-oriented social networks will also have an increasing influence. In this context, it is interesting to reflect on how the work environment of librarians will evolve. Maintaining interest in the daily work is fundamental, and over-automation is not desirable: it is imperative that the work remain human-driven. We review state-of-the-art technologies that impact librarians' work and initiate a discussion about how to integrate them while preserving their expertise.


Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 98 ◽  
Author(s):  
Tariq Ahmad ◽  
Allan Ramsay ◽  
Hanady Ahmed

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, and the current state-of-the-art solutions use deep neural networks (DNNs), so it seems likely that such general-purpose machine learning algorithms will provide the most effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a panacea and that paying attention to the nature of the data you are trying to learn from can be more important than trying out ever more powerful general-purpose machine learning algorithms.
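A hedged sketch of the general idea, assuming an invented toy lexicon: sentiment terms carry per-class weights, a document's score for each class is the sum of the weights of its terms, and a separately tuned threshold per class turns scores into multi-label decisions. The paper's actual lexicon construction and threshold optimisation are more involved than this.

```python
from collections import defaultdict

# Hypothetical probability-weighted lexicon: term -> {emotion class: weight}.
lexicon = {
    "delighted": {"joy": 0.9},
    "furious":   {"anger": 0.8},
    "worried":   {"fear": 0.7, "sadness": 0.3},
    "grateful":  {"joy": 0.6},
}

def score_document(tokens, lexicon):
    """Sum the lexicon weights of the tokens for each class."""
    scores = defaultdict(float)
    for tok in tokens:
        for cls, w in lexicon.get(tok, {}).items():
            scores[cls] += w
    return scores

def classify(tokens, lexicon, thresholds):
    """Assign every class whose score clears its (separately tuned) threshold."""
    scores = score_document(tokens, lexicon)
    return sorted(cls for cls, s in scores.items() if s >= thresholds.get(cls, 0.5))

thresholds = {"joy": 0.5, "anger": 0.5, "fear": 0.5, "sadness": 0.5}
print(classify("i am delighted and grateful but a bit worried".split(), lexicon, thresholds))
```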

