Automated Machine Learning for Business

Model Data

After preparing your dataset, you should be quite familiar with the business problem, the subject matter, and the content of the dataset. This section is about modeling data: using data to train algorithms to create models that can be used to predict future events or understand past events. The section shows where data modeling fits in the overall machine learning pipeline. Traditionally, we store real-world data in one or more databases or files. This data is extracted, and features and a target (T) are created and submitted to the “Model Data” stage (the topic of this section). Following the completion of this stage, the model produced is examined (Section V) and placed into production. With the model in the production system, current data generated from the real-world environment is fed into the system. In the example case of a diabetes patient, we enter a new patient’s electronic health record information into the system, and a database lookup retrieves additional data for feature creation.

Acquire and Integrate Data

Automated Machine Learning for Business ◽ 10.1093/oso/9780190941659.003.0003 ◽ 2021 ◽ pp. 47-94

Author(s): Kai R. Larsen ◽ Daniel S. Becker

Keyword(s): Data Integration ◽ Data Reduction ◽ Regular Expressions ◽ Data Summarization ◽ Different Sources ◽ Integrate Data

Access to additional and relevant data will lead to better predictions from algorithms until we reach the point where more observations (cases) no longer help to detect the signal, that is, the feature(s) or conditions that inform the target. In addition to obtaining more observations, we can also look for additional features of interest that we do not currently have, at which point it will invariably be necessary to integrate data from different sources. This section introduces this process of data integration, starting with two methods, “joins” (to access more features) and “unions” (to access more observations), and continuing on to cover regular expressions, data summarization, crosstabs, data reduction and splitting, and data wrangling in all its flavors.
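The distinction between joins and unions can be illustrated with a minimal sketch in plain Python. The tables, column names (`patient_id`, `age`, `hba1c`), and helper functions below are hypothetical and chosen only for illustration; they are not taken from the book. A union stacks tables that share the same columns to gain rows, while a join looks up matching records by key to gain columns.

```python
# Hypothetical toy data: patient records from two clinics plus a lab-results lookup.
clinic_a = [
    {"patient_id": 1, "age": 54},
    {"patient_id": 2, "age": 61},
]
clinic_b = [
    {"patient_id": 3, "age": 47},
]
lab_results = {
    1: {"hba1c": 7.2},
    3: {"hba1c": 6.1},
}


def union(*tables):
    """Stack tables with the same columns to gain more observations (rows)."""
    return [dict(row) for table in tables for row in table]


def left_join(rows, lookup, key):
    """Attach columns from `lookup` to each row to gain more features.

    Rows without a match in `lookup` get None for the new columns.
    """
    new_columns = sorted({col for extra in lookup.values() for col in extra})
    joined = []
    for row in rows:
        extra = lookup.get(row[key], {})
        merged = dict(row)
        for col in new_columns:
            merged[col] = extra.get(col)
        joined.append(merged)
    return joined


patients = union(clinic_a, clinic_b)                           # 3 rows, 2 columns
dataset = left_join(patients, lab_results, key="patient_id")   # 3 rows, 3 columns

for row in dataset:
    print(row)
```

In this sketch, patient 2 has no lab result, so the joined row carries a missing value for `hba1c`. Missing values of this kind are a typical side effect of integrating sources that do not cover the same cases.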

Interpret and Communicate

Automated Machine Learning for Business ◽ 10.1093/oso/9780190941659.003.0005 ◽ 2021 ◽ pp. 189-218

Author(s): Kai R. Larsen ◽ Daniel S. Becker

Keyword(s): Machine Learning ◽ Learning Process ◽ Additional Data ◽ Hospital Readmissions ◽ Current Analysis ◽ Problem Context ◽ Time Of Admission ◽ Time Information

Although we have evaluated all the measures and selected the best model for this case, and much of the machine learning process has been clarified, our understanding of the problem context is still relatively immature. That is, while we have carefully specified the problem, we still do not fully understand what drives the target. Convincing management to support the implementation of the model typically includes explaining the answers to the “why,” “what,” “where,” and “when” questions embedded in the model. While the model may be the best overall model according to the selected measures, for the particular problem of hospital readmissions it is still not clear why the model predicts that some patients will be readmitted and that others will not. It also remains unknown what features drive these outcomes, where the patients who were readmitted come from, or whether this is relevant. In this case, time information (the “when”) is also unavailable, so that question cannot be examined, but it is easy to imagine that patients admitted in the middle of the night might have worse outcomes due to tired staff or lack of access to the best physicians. If we can convince management that the current analysis is useful, we can likely also make a case for the collection of additional data. The new data might include more information on past interactions with each patient, as well as date and time information to test the hypothesis about the effect of time of admission and whether the specific staff caring for a patient matters.

Defining Project Objectives

Automated Machine Learning for Business ◽ 10.1093/oso/9780190941659.003.0002 ◽ 2021 ◽ pp. 23-46

Author(s): Kai R. Larsen ◽ Daniel S. Becker

Keyword(s): Machine Learning ◽ Life Cycle ◽ Subject Matter ◽ Life Cycle Model ◽ Cycle Model ◽ Unit Of Analysis ◽ Success Criteria ◽ Subject Matter Expertise ◽ Project Objectives

This section covers the first steps of the Machine Learning Life Cycle Model: how to specify a business problem, acquire subject matter expertise, define the prediction target, define the unit of analysis, identify success criteria, evaluate risks, and finally, decide whether to continue a project. The focus is on who will use the model, whether management is supportive, whether the drivers of the model can be visualized, and how much value a model can produce.

Implement, Document, and Maintain

Automated Machine Learning for Business ◽ 10.1093/oso/9780190941659.003.0006 ◽ 2021 ◽ pp. 219-276

Author(s): Kai R. Larsen ◽ Daniel S. Becker

Keyword(s): Machine Learning ◽ Time Series ◽ Life Cycle ◽ Maintenance Phase ◽ Systems Development ◽ Information Systems Development ◽ Entire Process ◽ Machine Learning Model ◽ The Cost ◽ Time Aware

This section covers the final stage of the machine learning life cycle. Consider these the most important steps of the entire process. This is the point at which we have the greatest potential to help our organization reap the benefits of machine learning. In traditional information systems development, 60–80% of the cost of a system comes during the maintenance phase, so these steps are important. This section covers how to deploy, document, and maintain a machine learning model. A chapter covers the seven types of target leakage, followed by time-aware validation and time-series analysis.
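One common reading of time-aware validation is that the validation data should come strictly later in time than the training data, so that a model is never evaluated on cases older than those it learned from. The following is a minimal sketch under that assumption; the records, the `holdout_fraction` parameter, and the `time_aware_split` helper are hypothetical and are not the book's own procedure.

```python
from datetime import date

# Hypothetical records: (admission_date, feature_value, readmitted)
records = [
    (date(2020, 3, 1), 0.4, 0),
    (date(2020, 1, 15), 0.9, 1),
    (date(2020, 6, 10), 0.2, 0),
    (date(2020, 2, 20), 0.7, 1),
    (date(2020, 5, 5), 0.5, 0),
]


def time_aware_split(rows, holdout_fraction=0.4):
    """Sort rows by date and hold out the most recent ones for validation."""
    ordered = sorted(rows, key=lambda r: r[0])
    n_holdout = max(1, round(len(ordered) * holdout_fraction))
    return ordered[:-n_holdout], ordered[-n_holdout:]


train, validation = time_aware_split(records)

print("train:     ", [r[0].isoformat() for r in train])
print("validation:", [r[0].isoformat() for r in validation])
```

Unlike a random split, this ordering keeps information from later cases out of the training set, which is one way to reduce the risk of evaluating a model on data it could not have seen at prediction time.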

Why Use Automated Machine Learning?

Automated Machine Learning for Business ◽ 10.1093/oso/9780190941659.003.0001 ◽ 2021 ◽ pp. 1-22

Author(s): Kai R. Larsen ◽ Daniel S. Becker

Keyword(s): Machine Learning ◽ Human Resources ◽ Customer Service ◽ Resource Availability ◽ Ease Of Use ◽ Algorithm Selection ◽ College Dropout ◽ Exploratory Data ◽ Automated Machine Learning ◽ Marketing Operations

Machine learning is used in search, translation, detecting depression, predicting the likelihood of college dropout, finding lost children, and selling all kinds of products. While barely beyond its inception, the current machine learning revolution will affect people and organizations no less than the Industrial Revolution affected weavers and many other skilled laborers. Machine learning will automate hundreds of millions of jobs that were considered too complex for machines ever to take over even a decade ago, including driving, flying, painting, programming, and customer service, as well as many of the jobs previously reserved for humans in the fields of finance, marketing, operations, accounting, and human resources. This section explains how automated machine learning addresses exploratory data analysis, feature engineering, algorithm selection, hyperparameter tuning, and model diagnostics. The section covers the eight criteria considered essential for AutoML to have significant impact: accuracy, productivity, ease of use, understanding and learning, resource availability, process transparency, generalization, and recommended actions.


Published By Oxford University Press
