Machine Learning Fundamentals

2021 ◽  
Author(s):  
Hui Jiang

This lucid, accessible introduction to supervised machine learning presents core concepts in a focused and logical way that is easy for beginners to follow. The author assumes basic calculus, linear algebra, probability, and statistics but no prior exposure to machine learning. Coverage includes widely used traditional methods such as SVMs, boosted trees, HMMs, and LDAs, plus popular deep learning methods such as convolutional neural nets, attention, transformers, and GANs. Organized in a coherent presentation framework that emphasizes the big picture, the text introduces each method clearly and concisely “from scratch” based on the fundamentals. All methods and algorithms are described in a clean and consistent style, with a minimum of unnecessary detail. Numerous case studies and concrete examples demonstrate how the methods can be applied in a variety of contexts.

2018 ◽  
Vol 68 (1) ◽  
pp. 161-181 ◽  
Author(s):  
Dan Guest ◽  
Kyle Cranmer ◽  
Daniel Whiteson

Machine learning has played an important role in the analysis of high-energy physics data for decades. The emergence of deep learning in 2012 allowed for machine learning tools which could adeptly handle higher-dimensional and more complex problems than previously feasible. This review is aimed at the reader who is familiar with high-energy physics but not machine learning. The connections between machine learning and high-energy physics data analysis are explored, followed by an introduction to the core concepts of neural networks, examples of the key results demonstrating the power of deep learning for analysis of LHC data, and discussion of future prospects and concerns.


Author(s):  
V Umarani ◽  
A Julian ◽  
J Deepa

Sentiment analysis has gained a lot of attention from researchers in recent years because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyze text patterns quickly; the supervised machine learning approach is the most commonly used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods on a standard data set. The experimental results demonstrate the performance of the various classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross validation; they help in appreciating the novelty of the several deep learning techniques and give the user an overview for choosing the right technique for their application.
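As a rough illustration of the supervised pipeline surveyed here, the sketch below compares two of the listed classifiers on a toy corpus with scikit-learn and 5-fold cross validation; the example texts, labels, and pipeline choices are illustrative assumptions, not the authors' setup.

```python
# Minimal sketch (not the authors' code): comparing two of the supervised
# classifiers discussed above on a toy sentiment corpus with scikit-learn.
# The example texts and labels are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["great service", "terrible product", "loved it", "not worth it"] * 25
labels = [1, 0, 1, 0] * 25  # 1 = positive, 0 = negative

for name, clf in [("MultinomialNB", MultinomialNB()),
                  ("LogisticRegression", LogisticRegression(max_iter=1000))]:
    pipe = make_pipeline(TfidfVectorizer(), clf)
    scores = cross_val_score(pipe, texts, labels, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```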


2020 ◽  
Author(s):  
John T. Halloran ◽  
Gregor Urban ◽  
David Rocke ◽  
Pierre Baldi

Semi-supervised machine learning post-processors critically improve peptide identification in shotgun proteomics data. Such post-processors accept the peptide-spectrum matches (PSMs) and feature vectors resulting from a database search, train a machine learning classifier, and recalibrate PSMs using the trained parameters, often yielding significantly more identified peptides across q-value thresholds. However, current state-of-the-art post-processors rely on shallow machine learning methods, such as support vector machines. In contrast, the powerful training capabilities of deep learning models have displayed superior performance to shallow models in an ever-growing number of other fields. In this work, we show that deep models significantly improve the recalibration of PSMs compared to the most accurate and widely used post-processors, such as Percolator and PeptideProphet. Furthermore, we show that deep learning is able to adaptively analyze complex datasets and features for more accurate universal post-processing, leading to both improved Prosit analysis and markedly better recalibration of recently developed database-search functions.
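For readers unfamiliar with the post-processing step, the following is a minimal sketch of the general idea, assuming synthetic target/decoy PSM feature vectors and a small feed-forward network in scikit-learn; it is not the authors' deep models nor the Percolator or PeptideProphet implementations.

```python
# Minimal sketch (assumed, not the authors' pipeline): re-scoring
# peptide-spectrum matches (PSMs) with a small feed-forward network.
# Synthetic target/decoy feature vectors stand in for database-search output.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_feat = 12                                   # e.g. search score, mass error, ...
targets = rng.normal(0.5, 1.0, size=(1000, n_feat))
decoys = rng.normal(-0.5, 1.0, size=(1000, n_feat))
X = np.vstack([targets, decoys])
y = np.array([1] * 1000 + [0] * 1000)         # 1 = target, 0 = decoy

X = StandardScaler().fit_transform(X)
clf = MLPClassifier(hidden_layer_sizes=(64, 64, 32), max_iter=500).fit(X, y)

# The classifier's probability becomes the recalibrated PSM score used for
# re-ranking before q-value estimation (the q-value step is omitted here).
new_scores = clf.predict_proba(X)[:, 1]
print(new_scores[:5])
```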


2019 ◽  
Vol 38 (7) ◽  
pp. 526-533 ◽  
Author(s):  
York Zheng ◽  
Qie Zhang ◽  
Anar Yusifov ◽  
Yunzhi Shi

Recent advances in machine learning and its applications in various sectors are generating a new wave of experiments and solutions to geophysical problems in the oil and gas industry. We present two separate case studies in which supervised deep learning is used as an alternative to conventional techniques. The first case is an example of image classification applied to seismic interpretation. A convolutional neural network (CNN) is trained to pick faults automatically in 3D seismic volumes. Every sample in the input seismic image is classified as either a nonfault or a fault with a certain dip and azimuth, which are predicted simultaneously. The second case is an example of elastic model building that casts prestack seismic inversion as a machine learning regression problem. A CNN is trained to predict 1D velocity and density profiles from input seismic records. In both case studies, we demonstrate that CNN models trained on synthetic data can be used to make efficient and effective predictions on field data. While results from the first example show that high-quality fault picks can be predicted from migrated seismic images, we find the prestack seismic inversion case more challenging: constraining the subsurface geologic variations and carefully preconditioning the input seismic data are important for obtaining reasonably reliable results. This observation matches our experience with conventional workflows and methods, which likewise benefit from the improved signal-to-noise ratio after migration and stacking and face the inherent subsurface ambiguity that makes a unique parameter inversion difficult.
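The regression formulation in the second case study can be sketched roughly as below, assuming an illustrative 1D convolutional network in Keras trained on synthetic records and profiles; the actual network, data shapes, and training data in the paper differ.

```python
# Minimal sketch (assumed architecture, not the authors' network): a 1D CNN
# that regresses a velocity profile from an input seismic record, trained on
# synthetic data. All shapes and random data are illustrative placeholders.
import numpy as np
import tensorflow as tf

n_samples, n_times, n_traces, n_depth = 512, 256, 32, 100
records = np.random.randn(n_samples, n_times, n_traces).astype("float32")
profiles = np.random.randn(n_samples, n_depth).astype("float32")  # target vp profile

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(32, 7, activation="relu", padding="same",
                           input_shape=(n_times, n_traces)),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 5, activation="relu", padding="same"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(n_depth),            # one output value per depth sample
])
model.compile(optimizer="adam", loss="mse")
model.fit(records, profiles, epochs=2, batch_size=32, verbose=0)
```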


2019 ◽  
Vol 63 (11) ◽  
pp. 1658-1667
Author(s):  
M J Castro-Bleda ◽  
S España-Boquera ◽  
J Pastor-Pellicer ◽  
F Zamora-Martínez

This paper presents the ‘NoisyOffice’ database. It consists of images of printed text documents with noise mainly caused by everyday office use, such as coffee stains and footprints on documents or folded and wrinkled sheets with degraded printed text. This corpus is intended to train and evaluate supervised learning methods for the cleaning, binarization, and enhancement of noisy images of grayscale text documents. As an example, several image enhancement and binarization experiments using deep learning techniques are presented. Double-resolution images are also provided for testing super-resolution methods. The corpus is freely available at the UCI Machine Learning Repository. Finally, a challenge organized by Kaggle Inc. to denoise images using this database is described to show its suitability for benchmarking image processing systems.
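A minimal sketch of the kind of image-enhancement experiment described, assuming a small convolutional encoder-decoder in Keras trained on noisy/clean patch pairs; loading of the NoisyOffice images and the paper's exact models are not reproduced, and the random arrays below are placeholders.

```python
# Minimal sketch (assumed, not the paper's exact model): a small convolutional
# encoder-decoder that maps noisy grayscale text-document patches to cleaned
# ones, as could be trained on NoisyOffice image pairs. Patch size is illustrative.
import numpy as np
import tensorflow as tf

noisy = np.random.rand(64, 128, 128, 1).astype("float32")   # noisy input patches
clean = np.random.rand(64, 128, 128, 1).astype("float32")   # corresponding clean targets

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same",
                           input_shape=(128, 128, 1)),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.UpSampling2D(2),
    tf.keras.layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(noisy, clean, epochs=2, batch_size=16, verbose=0)
```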


Author(s):  
Christian Knaak ◽  
Moritz Kröger ◽  
Frederic Schulze ◽  
Peter Abels ◽  
Arnold Gillner

An effective process monitoring strategy is a requirement for meeting the challenges posed by increasingly complex products and manufacturing processes. To address these needs, this study investigates a comprehensive scheme based on classical machine learning methods, deep learning algorithms, and feature extraction and selection techniques. In a first step, a novel deep learning architecture based on convolutional neural networks (CNN) and gated recurrent units (GRU) is introduced to predict the local weld quality from mid-wave infrared (MWIR) and near-infrared (NIR) image data. The developed technology is used to discover critical welding defects, including lack of fusion (false friends), sagging, lack of penetration, and geometric deviations of the weld seam. Additional work is conducted to investigate the significance of various geometrical, statistical, and spatio-temporal features extracted from the keyhole and weld pool regions. Furthermore, the performance of the proposed deep learning architecture is compared to that of classical supervised machine learning algorithms, such as multi-layer perceptron (MLP), logistic regression (LogReg), support vector machines (SVM), decision trees (DT), random forest (RF), and k-nearest neighbors (kNN). Optimal hyperparameters for each algorithm are determined by an extensive grid search. Ultimately, the three best classification models are combined into an ensemble classifier that yields the highest detection rates and achieves the most robust estimation of welding defects among all classifiers studied, validated on previously unseen welding trials.
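A rough sketch of the CNN-GRU idea, assuming illustrative tensor shapes and a simple Keras implementation rather than the authors' architecture: a CNN extracts per-frame features from the MWIR/NIR image sequence and a GRU aggregates them over time into a weld-quality class.

```python
# Minimal sketch (assumed, not the authors' architecture): a CNN encodes each
# frame of an infrared image sequence and a GRU aggregates the per-frame
# features over time to classify local weld quality. Dimensions are illustrative.
import numpy as np
import tensorflow as tf

n_seq, n_frames, h, w, c, n_classes = 32, 16, 64, 64, 2, 4   # c = MWIR + NIR channels
frames = np.random.rand(n_seq, n_frames, h, w, c).astype("float32")
labels = np.random.randint(0, n_classes, size=n_seq)

cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(h, w, c)),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
])
model = tf.keras.Sequential([
    tf.keras.layers.TimeDistributed(cnn, input_shape=(n_frames, h, w, c)),
    tf.keras.layers.GRU(64),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(frames, labels, epochs=2, batch_size=8, verbose=0)
```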


2017 ◽  
Vol 1 (3) ◽  
pp. 83 ◽  
Author(s):  
Chandrasegar Thirumalai ◽  
Ravisankar Koppuravuri

In this paper, we use deep neural networks to predict bike-sharing usage from previous years' usage data. We choose deep neural networks because they can deliver higher accuracy: unlike many other machine learning techniques, they let us add many hidden layers to improve prediction accuracy, and the model can be trained in the way we want so that we achieve the desired results. Many AI experts now consider deep learning the best AI technique available, capable of producing remarkable results. We apply it to predict the bike-sharing usage of a rental company so that the company can make sound business decisions based on previous years' data.
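A minimal sketch of such a feed-forward regression model in Keras, assuming synthetic tabular features in place of the rental company's data; the feature count, layer sizes, and training settings are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' model): a feed-forward network with
# several hidden layers that regresses bike-rental counts from tabular features
# such as season, weekday, temperature, and humidity. Data below is synthetic.
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 8).astype("float32")      # 8 illustrative input features
y = np.random.rand(1000, 1).astype("float32")      # normalized rental counts

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(64, activation="relu"),  # further hidden layers can be
    tf.keras.layers.Dense(32, activation="relu"),  # stacked to increase capacity
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
```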


2017 ◽  
Author(s):  
Christoph Sommer ◽  
Rudolf Hoefler ◽  
Matthias Samwer ◽  
Daniel W. Gerlich

Supervised machine learning is a powerful and widely used method to analyze high-content screening data. Despite its accuracy, efficiency, and versatility, supervised machine learning has drawbacks, most notably its dependence on a priori knowledge of expected phenotypes and time-consuming classifier training. We provide a solution to these limitations with CellCognition Explorer, a generic novelty detection and deep learning framework. Application to several large-scale screening data sets on nuclear and mitotic cell morphologies demonstrates that CellCognition Explorer enables discovery of rare phenotypes without user training, which has broad implications for improved assay development in high-content screening.
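As a generic illustration of label-free novelty detection (not the CellCognition Explorer implementation), the sketch below fits a one-class model to control-cell feature vectors and flags deviating morphologies; the features, sample sizes, and parameters are assumptions.

```python
# Minimal sketch (not the CellCognition Explorer implementation): generic
# novelty detection on per-cell feature vectors, flagging morphologies that
# deviate from the negative-control population without any phenotype labels.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
control_cells = rng.normal(0, 1, size=(2000, 50))             # control-cell features
screen_cells = np.vstack([rng.normal(0, 1, size=(990, 50)),
                          rng.normal(4, 1, size=(10, 50))])    # 10 rare phenotypes

scaler = StandardScaler().fit(control_cells)
detector = OneClassSVM(nu=0.01, gamma="scale").fit(scaler.transform(control_cells))

flags = detector.predict(scaler.transform(screen_cells))       # -1 marks novel cells
print("candidate rare phenotypes:", int(np.sum(flags == -1)))
```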


SPE Journal ◽  
2020 ◽  
Vol 25 (05) ◽  
pp. 2778-2800 ◽  
Author(s):  
Harpreet Singh ◽  
Yongkoo Seol ◽  
Evgeniy M. Myshakin

The application of specialized machine learning (ML) in petroleum engineering and geoscience is increasingly gaining attention as a route to rapid and efficient substitutes for existing methods. Existing ML-based studies that use well logs have two inherent limitations. The first is that they start with one predefined combination of well logs, implicitly assuming that this combination will give the best predictive outcome, even though the accuracy obtained from different combinations of well logs can vary substantially. The second is that most studies apply unsupervised learning (UL) for classification problems, although it underperforms nearly all supervised learning (SL) algorithms by a substantial margin. In this context, this study investigates a variety of UL and SL algorithms applied to multiple well-log combinations (WLCs) to automate the traditional workflow of well-log processing and classification, including an optimization step to achieve the best output. The workflow begins by processing the measured well logs, which includes developing different combinations of measured well logs and their physics-motivated augmentations, followed by removal of potential outliers from the input WLCs. Reservoir lithology with four different rock types is investigated using eight UL and seven SL algorithms in two case studies, whose results are used to identify the optimal set of well logs and the ML algorithm that best matches the reservoir lithology to its ground truth. The workflow is demonstrated using two wells from two different reservoirs on the Alaska North Slope to distinguish four rock types along the well (brine-dominated sand, hydrate-dominated sand, shale, and others/mixed compositions). The results show that the automated workflow can recover the ground-truth lithology with up to 80% accuracy with UL and up to 90% accuracy with SL using six routine well logs [vp, vs, ρb, ϕneut, Rt, and gamma ray (GR)], a significant improvement over the less than 70% accuracy reported in the current state of the art.
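A schematic contrast of the UL and SL routes on a single well-log combination, assuming synthetic stand-ins for the six logs and scikit-learn's KMeans and random forest; the study's eight UL and seven SL algorithms and its optimization loop are not reproduced.

```python
# Minimal sketch (assumed, not the study's workflow): one unsupervised and one
# supervised classifier on a synthetic well-log combination (vp, vs, rho_b,
# phi_neut, Rt, GR) with four rock-type labels as ground truth.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 2000
logs = rng.normal(size=(n, 6))                 # stand-ins for the six well logs
rock_type = rng.integers(0, 4, size=n)         # brine sand, hydrate sand, shale, mixed

X = StandardScaler().fit_transform(logs)

clusters = KMeans(n_clusters=4, n_init=10).fit_predict(X)      # UL: labels unused
acc = cross_val_score(RandomForestClassifier(), X, rock_type, cv=5).mean()  # SL
print(f"supervised CV accuracy: {acc:.2f}")
```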

