Adaptation of Classical Machine Learning Algorithms to Big Data Context: Problems and Challenges: Case Study: Hidden Markov Models Under Spark

Hidden Markov models (HMMs) are machine learning algorithms that have been widely used and have proven efficient in many conventional applications. This paper proposes a modified posterior decoding algorithm that solves the HMM decoding problem using the MapReduce paradigm and Spark’s resilient distributed datasets (RDDs) for large-scale data processing. The objective of this work is to improve the performance of HMMs so that they can cope with big data challenges. The proposed algorithm greatly reduces time complexity and gives good results in terms of running time, speedup, and parallelization efficiency for large amounts of data, i.e., a large number of states and a large number of sequences.


The Predictability of Tree-based Machine Learning Algorithms in the Big Data Context

International Journal of Engineering ◽ 10.5829/ije.2021.34.01a.10 ◽ 2021 ◽ Vol 34 (1)

Keyword(s): Machine Learning ◽ Big Data ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Data Context


On the Scalability of Machine-Learning Algorithms for Breast Cancer Prediction in Big Data Context

IEEE Access ◽ 10.1109/access.2019.2927080 ◽ 2019 ◽ Vol 7 ◽ pp. 91535-91546 ◽ Cited By ~ 9

Author(s): Sara Alghunaim ◽ Heyam H. Al-Baity

Keyword(s): Breast Cancer ◽ Machine Learning ◽ Big Data ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Cancer Prediction ◽ Data Context


Machine Learning Algorithms for Short-Term Load Forecast in Residential Buildings Using Smart Meters, Sensors and Big Data Solutions

IEEE Access ◽ 10.1109/access.2019.2958383 ◽ 2019 ◽ Vol 7 ◽ pp. 177874-177889 ◽ Cited By ~ 10

Author(s): Simona-Vasilica Oprea ◽ Adela Bara

Keyword(s): Machine Learning ◽ Big Data ◽ Residential Buildings ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Short Term ◽ Smart Meters ◽ Load Forecast


Twitter Sentiment Analysis Using Machine Learning Algorithms: A Case Study

2020 International Conference on Advances in Computing, Communication & Materials (ICACCM) ◽ 10.1109/icaccm50413.2020.9213011 ◽ 2020

Author(s): Sheresh Zahoor ◽ Rajesh Rohilla

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Learning Algorithms ◽ Machine Learning Algorithms


Using hidden Markov models to find discrete targets in continuous sociophonetic data

Linguistics Vanguard ◽ 10.1515/lingvan-2020-0057 ◽ 2021 ◽ Vol 7 (1)

Author(s): Daniel Duncan

Keyword(s): Hidden Markov Models ◽ Markov Models ◽ Hidden Markov ◽ Abstract Representation ◽ Language Variation And Change ◽ Novel Approach ◽ Single Target ◽ The Individual ◽ Individual Speaker

Advances in sociophonetic research resulted in features once sorted into discrete bins now being measured continuously. This has implied a shift in what sociolinguists view as the abstract representation of the sociolinguistic variable. When measured discretely, variation is variation in selection: one variant is selected for production, and factors influencing language variation and change are influencing the frequency at which variants are selected. Measured continuously, variation is variation in execution: speakers have a single target for production, which they approximate with varying success. This paper suggests that both approaches can and should be considered in sociophonetic analysis. To that end, I offer the use of hidden Markov models (HMMs) as a novel approach to find speakers’ multiple targets within continuous data. Using the lot vowel among whites in Greater St. Louis as a case study, I compare 2-state and 1-state HMMs constructed at the individual speaker level. Ten of fifty-two speakers’ production is shown to involve the regular use of distinct fronted and backed variants of the vowel. This finding illustrates HMMs’ capacity to allow us to consider variation as both variant selection and execution, making them a useful tool in the analysis of sociophonetic data.
