Data management challenges for artificial intelligence in plant and agricultural research

Artificial Intelligence (AI) is increasingly used within plant science, yet it is far from being routinely and effectively implemented in this domain. Particularly relevant to the development of novel food and agricultural technologies is the development of validated, meaningful and usable ways to integrate, compare and visualise large, multi-dimensional datasets from different sources and scientific approaches. After a brief summary of the reasons for the interest in data science and AI within plant science, the paper identifies and discusses eight key challenges in data management that must be addressed to further unlock the potential of AI in crop and agronomic research, and particularly the application of Machine Learning (ML), which holds much promise for this domain.

A Survey on Bias and Fairness in Machine Learning

ACM Computing Surveys ◽

10.1145/3457607 ◽

2021 ◽

Vol 54 (6) ◽

pp. 1-35

Author(s):

Ninareh Mehrabi ◽

Fred Morstatter ◽

Nripsuta Saxena ◽

Kristina Lerman ◽

Aram Galstyan

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Deep Learning ◽

Real World ◽

State Of The Art ◽

Future Directions ◽

Discriminatory Behavior ◽

Real World Applications ◽

Near Future ◽

Different Sources

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in the design and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently, some work has been developed in traditional machine learning and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.
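
To make the idea of a fairness definition concrete, the sketch below computes one commonly used group-fairness measure, the demographic parity difference (the gap in positive-prediction rates between groups). It is an illustration only and is not taken from the survey: the function name, the toy predictions and the group labels "a" and "b" are all hypothetical.

```python
from collections import defaultdict


def demographic_parity_difference(predictions, groups):
    """Return (gap, rates): the largest gap in positive-prediction rate
    between any two groups, plus the per-group rates.

    predictions: iterable of 0/1 model outputs (hypothetical toy data below).
    groups: iterable of group labels, aligned with predictions.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred == 1:
            positives[group] += 1
    rates = {g: positives[g] / totals[g] for g in totals}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Hypothetical toy data: eight decisions split across two groups.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_difference(preds, groups)
print(rates, gap)
```

A gap of zero would mean both groups receive positive predictions at the same rate; this is only one of several competing notions of fairness, and different notions can conflict with one another.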

Data Science: Big Data, Machine Learning, and Artificial Intelligence

Journal of the American College of Radiology ◽

10.1016/j.jacr.2018.01.029 ◽

2018 ◽

Vol 15 (3) ◽

pp. 497-498 ◽

Cited By ~ 16

Author(s):

Ruth C. Carlos ◽

Charles E. Kahn ◽

Safwan Halabi

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Big Data ◽

Data Science

Artificial Intelligence, Machine Learning, and Data Science Technologies

10.1201/9781003153405 ◽

2021 ◽

Author(s):

Neeraj Mohan ◽

Ruchi Singla ◽

Priyanka Kaushal ◽

Seifedine Kadry

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Data Science

Artificial Intelligence, Machine Learning and Data Science as Iterations of Business Automation for Small Businesses

Management of Data in AI Age ◽

10.46679/isbn978819484834904 ◽

2020 ◽

pp. 87-94

Author(s):

Pooja Sharma

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Multinational Corporations ◽

Small Businesses ◽

Data Science ◽

Small Data ◽

Decision Systems ◽

Business Operations ◽

Business Automation ◽

Machine Learning Tool

Artificial intelligence and machine learning, the two iterations of automation, are based on data, small or large. The larger the data, the more effective an AI or machine learning tool will be; the opposite holds for small data. With a larger pool of data, large businesses and multinational corporations have been effectively building, developing and adopting refined AI and machine learning based decision systems. This chapter explores whether small businesses with only small data in hand are well placed to use and adopt AI and machine learning based tools for their day-to-day business operations.

Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence

Information ◽

10.3390/info11040193 ◽

2020 ◽

Vol 11 (4) ◽

pp. 193 ◽

Cited By ~ 7

Author(s):

Sebastian Raschka ◽

Joshua Patterson ◽

Corey Nolet

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Data Science ◽

Gpu Computing ◽

Graphics Processing Unit ◽

General Purpose ◽

Processing Unit ◽

The Core ◽

Critical Components ◽

High Level

Smarter applications are making better use of the insights gleaned from data, having an impact on every industry and research discipline. At the core of this revolution lie the tools and the methods that are driving it, from processing the massive piles of data generated each day to learning from and taking useful action. Deep neural networks, along with advancements in classical machine learning and scalable general-purpose graphics processing unit (GPU) computing, have become critical components of artificial intelligence, enabling many of these astounding breakthroughs and lowering the barrier to adoption. Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs. This survey offers insight into the field of machine learning with Python, taking a tour through important topics to identify some of the core hardware and software paradigms that have enabled it. We cover widely used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward.

Special Issue on Machine Learning, Data Science, and Artificial Intelligence in Plasma Research

IEEE Transactions on Plasma Science ◽

10.1109/tps.2019.2961571 ◽

2020 ◽

Vol 48 (1) ◽

pp. 1-2 ◽

Cited By ~ 5

Author(s):

Zhehui Wang ◽

J. Luc Peterson ◽

Cristina Rea ◽

David Humphreys

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Data Science ◽

Special Issue ◽

Plasma Research ◽

Learning Data

Critical Digital Humanities

10.5622/illinois/9780252042270.001.0001 ◽

2019 ◽

Author(s):

James E. Dobson

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Computational Methods ◽

Digital Humanities ◽

Data Science ◽

Computational Thinking ◽

Major Question ◽

Computational Tools ◽

Multiple Dimensions ◽

Selection For

This book seeks to develop an answer to the major question arising from the adoption of sophisticated data-science approaches within humanities research: are existing humanities methods compatible with computational thinking? Data-based and algorithmically powered methods present both new opportunities and new complications for humanists. This book takes as its founding assumption that the exploration and investigation of texts and data with sophisticated computational tools can serve the interpretative goals of humanists. At the same time, it assumes that these approaches cannot and will not obsolete other existing interpretive frameworks. Research involving computational methods, the book argues, should be subject to humanistic modes that deal with questions of power and infrastructure directed toward the field’s assumptions and practices. Arguing for a methodologically and ideologically self-aware critical digital humanities, the author contextualizes the digital humanities within the larger neo-liberalizing shifts of the contemporary university in order to resituate the field within a theoretically informed tradition of humanistic inquiry. Bringing the resources of critical theory to bear on computational methods enables humanists to construct an array of compelling and possible humanistic interpretations from multiple dimensions—from the ideological biases informing many commonly used algorithms to the complications of a historicist text mining, from examining the range of feature selection for sentiment analysis to the fantasies of human subjectless analysis activated by machine learning and artificial intelligence.

Experiencing ProvLake to Manage the Data Lineage of AI Workflows

10.5753/sbsi.2020.13144 ◽

2020 ◽

Author(s):

Leonardo Guerreiro Azevedo ◽

Renan Souza ◽

Raphael Melo Thiago ◽

Elton Soares ◽

Marcio Moreno

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Data Management ◽

Oil And Gas ◽

Core Concept ◽

Data Lineage ◽

Oil And Gas Exploration ◽

Provenance Data ◽

Management Techniques ◽

Artificial Intelligence Systems

Machine Learning (ML) is a core concept behind Artificial Intelligence systems, which are driven by data and generate ML models. These models are used for decision making, and it is crucial to be able to trust their outputs, e.g., by understanding the process that derives them. One way to explain the derivation of ML models is by tracking the whole ML lifecycle, generating its data lineage, which may be accomplished by provenance data management techniques. In this work, we present the use of the ProvLake tool for ML provenance data management in the ML lifecycle for Well Top Picking, an essential process in Oil and Gas exploration. We show how ProvLake supported the validation of ML models, the understanding of whether the ML models generalize respecting the domain characteristics, and their derivation.
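
The general idea of tracking data lineage across lifecycle steps can be sketched as follows. This is a minimal illustration under stated assumptions and does not use or represent ProvLake's API: the LineageLog class, the fingerprint helper, the step names and the toy data are all hypothetical.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable SHA-256 hash of a JSON-serialisable object (hypothetical helper)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class LineageLog:
    """Hypothetical in-memory provenance log: one record per executed step."""

    def __init__(self):
        self.records = []

    def run(self, step_name, func, inputs, params=None):
        """Execute func(inputs, **params) and record what went in and out."""
        params = params or {}
        output = func(inputs, **params)
        self.records.append({
            "step": step_name,
            "params": params,
            "input_hash": fingerprint(inputs),
            "output_hash": fingerprint(output),
        })
        return output

    def trace(self, output):
        """Walk backwards from an output to the ordered steps that produced it."""
        chain = []
        target = fingerprint(output)
        for record in reversed(self.records):
            if record["output_hash"] == target:
                chain.append(record["step"])
                target = record["input_hash"]
        return list(reversed(chain))


def drop_missing(values):
    # Toy cleaning step: remove missing measurements.
    return [v for v in values if v is not None]


def scale(values, factor=1.0):
    # Toy preprocessing step: divide every value by a constant.
    return [v / factor for v in values]


log = LineageLog()
raw = [2.0, None, 4.0, 6.0]  # hypothetical raw measurements
cleaned = log.run("clean", drop_missing, raw)
scaled = log.run("scale", scale, cleaned, {"factor": 2.0})
print(log.trace(scaled))
```

Hashing inputs and outputs lets each record be linked to the step that fed it without storing the data itself; a production provenance system would typically also capture timestamps, code versions and the executing environment.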

Unlocking the Power of Artificial Intelligence and Big Data in Medicine

Journal of Medical Internet Research ◽

10.2196/16607 ◽

2019 ◽

Vol 21 (11) ◽

pp. e16607 ◽

Cited By ~ 8

Author(s):

Christian Lovis

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Big Data ◽

Decision Support ◽

Data Science ◽

Global Environment ◽

Data Driven ◽

Cross Linking ◽

Science And Society ◽

Slowing Down

Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economical, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together.

Approaches to Capacity Building for Machine Learning and Artificial Intelligence Applications in Health

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.941 ◽

2018 ◽

Vol 3 (4) ◽

Author(s):

P. Alison Paprica ◽

Frank Sullivan ◽

Yin Aphinyanaphongs ◽

Garth Gibson

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Training Program ◽

Capacity Building ◽

Clinical Training ◽

Large Scale ◽

Data Science ◽

Training Scheme ◽

Health Strategy ◽

Clinical Training Program

Many health systems and research institutes are interested in supplementing their traditional analyses of linked data with machine learning (ML) and other artificial intelligence (AI) methods and tools. However, the availability of individuals who have the required skills to develop and/or implement ML/AI is a constraint, as there is high demand for ML/AI talent in many sectors. The three organizations presenting are all actively involved in training and capacity building for ML/AI broadly, and each has a focus on, and/or discrete initiatives for, particular trainees.
P. Alison Paprica, Vector Institute for artificial intelligence, Institute for Clinical Evaluative Sciences, University of Toronto, Canada. Alison is VP, Health Strategy and Partnerships at Vector, responsible for health strategy and also playing a lead role in “1000AIMs” – a Vector-led initiative in support of the Province of Ontario’s $30 million investment to increase the number of AI-related master’s program graduates to 1,000 per year within five years.
Frank Sullivan, University of St Andrews, Scotland. Frank is a family physician and an associate director of HDRUK@Scotland. Health Data Research UK (https://hdruk.ac.uk/) has recently provided funding to six sites across the UK to address challenging healthcare issues through use of data science. A Doctoral Training Scheme in AI for 50 PhD students has also been announced. Each site works in close partnership with National Health Service bodies and the public to translate research findings into benefits for patients and populations.
Yin Aphinyanaphongs – INTREPID NYU clinical training program for incoming clinical fellows. Yin is the Director of the Clinical Informatics Training Program at NYU Langone Health. He is deeply interested in the intersection of computer science and health care and, as a physician and a scientist, he has a unique perspective on how to train medical professionals for a data-driven world. One version of this teaching process is demonstrated in the INTREPID clinical training program. Yin teaches clinicians to work with large-scale data within the R environment and generate hypotheses and insights.
The session will begin with three brief presentations followed by a facilitated session where all participants share their insights about the essential skills and competencies required for different kinds of ML/AI applications and contributions. Live polling and voting will be used at the end of the session to capture participants’ views on the key learnings and take-away points. The intended outputs and outcomes of the session are: participants will have a better understanding of the skills and competencies required for individuals to contribute to AI applications in health in various ways; participants will gain knowledge about different options for capacity building, from targeted enhancement of the skills of clinical fellows, to producing large numbers of applied master’s graduates, to doctoral-level training; and, after the session, the co-leads will work together to create a resource that summarizes the learnings from the session and make it public (through publication in a peer-reviewed journal and/or through the IPDLN website).
