Possibilistic Similarity Measures for Data Science and Machine Learning Applications

The continued advances in artificial intelligence and automation through machine learning applications, under the heading of data science, give reason for pause within the educator community as we consider how to position future human factors engineers to contribute meaningfully to these projects. Do the lessons we learned and now teach regarding automation based on previous generations of technology still apply? What level of data science (DS) and machine learning (ML) expertise does a human factors engineer need to have a relevant role in the design of future automation? How do we integrate these topics into a field that often has not emphasized quantitative skills? This panel discussion brings together human factors engineers and educators at different stages of their careers to consider how curricula are being adapted to include data science and machine learning, and what the future of human factors education may look like in the coming years.

Relative Hausdorff distance for network analysis

Applied Network Science ◽

10.1007/s41109-019-0198-0 ◽

2019 ◽

Vol 4 (1) ◽

Cited By ~ 1

Author(s):

Sinan G. Aksoy ◽

Kathleen E. Nowak ◽

Emilie Purvine ◽

Stephen J. Young

Keyword(s):

Machine Learning ◽

Network Analysis ◽

Similarity Measure ◽

Hausdorff Distance ◽

Data Science ◽

Edit Distance ◽

Similarity Measures ◽

Graph Edit Distance ◽

Computationally Intensive

Similarity measures are used extensively in machine learning and data science algorithms. The newly proposed graph Relative Hausdorff (RH) distance is a lightweight yet nuanced similarity measure for quantifying the closeness of two graphs. In this work we study the effectiveness of RH distance as a tool for detecting anomalies in time-evolving graph sequences. We apply RH to cyber data with given red team events, as well as to synthetically generated sequences of graphs with planted attacks. In our experiments, the performance of RH distance is at times comparable, and sometimes superior, to graph edit distance in detecting anomalous phenomena. Our results suggest that in appropriate contexts, RH distance has advantages over more computationally intensive similarity measures.
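
The workflow this abstract describes (score consecutive graph snapshots with a distance, then flag large jumps) can be sketched as follows. This is only an illustration under stated assumptions: the distance used here is a plain L1 distance between normalized degree histograms, a simple stand-in and not the RH distance, whose definition is not given in the abstract. The function names, the toy snapshots, and the threshold are all hypothetical.

```python
from collections import Counter


def degree_histogram(edges):
    """Return a normalized degree histogram {degree: fraction of nodes}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n = sum(counts.values())
    return {d: c / n for d, c in counts.items()} if n else {}


def histogram_distance(h1, h2):
    """L1 distance between two normalized degree histograms.

    Stand-in measure for illustration only; this is NOT the RH distance.
    """
    return sum(abs(h1.get(d, 0.0) - h2.get(d, 0.0)) for d in set(h1) | set(h2))


def flag_anomalies(snapshots, threshold):
    """Return indices t where snapshot t differs from snapshot t-1 by more than threshold."""
    hists = [degree_histogram(s) for s in snapshots]
    return [
        t
        for t in range(1, len(hists))
        if histogram_distance(hists[t - 1], hists[t]) > threshold
    ]


# Toy sequence (hypothetical data): a path graph that briefly turns into a star.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
snapshots = [path, path, star, path]

# The threshold is an arbitrary illustrative choice, not a recommended value.
print(flag_anomalies(snapshots, threshold=0.5))
```

On the toy sequence, both the change into the star and the change back out of it exceed the threshold, so a single planted event shows up as two flagged transitions. Any other graph distance, including RH or graph edit distance, could be substituted for `histogram_distance` without changing the surrounding loop.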

A real-time big data sentiment analysis for Iraqi tweets using Spark Streaming

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v9i4.1897 ◽

2020 ◽

Vol 9 (4) ◽

pp. 1411-1419

Author(s):

Nashwan Dheyaa Zaki ◽

Nada Yousif Hashim ◽

Yasmin Makki Mohialden ◽

Mostafa Abdulghafoor Mohammed ◽

Tole Sutikno ◽

...

Keyword(s):

Machine Learning ◽

Big Data ◽

Sentiment Analysis ◽

Real Time ◽

Data Science ◽

Opinion Mining ◽

Data Streaming ◽

Machine Learning Applications ◽

Learning Research

The scale of data streaming in social networks, such as Twitter, is increasing exponentially. Twitter is one of the most important and suitable big data sources for machine learning research in terms of analysis, prediction, knowledge extraction, and opinions. People use the Twitter platform daily to express their opinions, which are a fundamental factor influencing their behavior. In recent years, the flow of Iraqi dialect content has increased, especially on the Twitter platform. Sentiment analysis for different dialects and opinion mining have become a hot topic in data science research. In this paper, we attempt to develop a real-time analytic model for sentiment analysis and opinion mining of Iraqi tweets using Spark Streaming, and also to create a dataset for researchers in this field. The Twitter handle Bassam AlRawi is the case study here. The new method is better suited to current machine learning applications and fast online prediction.

On Development of Data Science and Machine Learning Applications in Databricks

Services – SERVICES 2019 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-23381-5_6 ◽

2019 ◽

pp. 78-91

Author(s):

Wenhao Ruan ◽

Yifan Chen ◽

Babak Forouraghi

Keyword(s):

Machine Learning ◽

Data Science ◽

Machine Learning Applications

Data Science on Industrial Data—Today’s Challenges in Brown Field Applications

Challenges ◽

10.3390/challe12010002 ◽

2021 ◽

Vol 12 (1) ◽

pp. 2

Author(s):

Tilman Klaeger ◽

Sebastian Gottschall ◽

Lukas Oehm

Keyword(s):

Machine Learning ◽

Data Collection ◽

Data Science ◽

Production Systems ◽

Ground Truth ◽

Data Sets ◽

Human Machine Interaction ◽

Automation Systems ◽

Machine Learning Applications ◽

Cyber Physical Production Systems

Much research is done on data analytics and machine learning for data coming from industrial processes. In practice, one finds many pitfalls that hinder the application of these modern technologies, especially in brownfield applications. With this paper, we want to show the state of the art and what to expect when working with stock machines in the field. The paper is a review of literature covering challenges for cyber-physical production systems (CPPS) in brownfield applications. This review is combined with our own experience and findings gained while setting up such systems in processing and packaging machines as well as in other areas. A major focus of this paper is data collection, which tends to be more cumbersome than most people might expect. In addition, data quality for machine learning applications becomes a challenge once one leaves the laboratory and its academic data sets. Topics here include missing ground truth and the lack of semantic description of the data. A last challenge covered is IT security and passing data through firewalls to allow for the cyber part in CPPS. All of these findings show that the potential of data-driven production systems depends strongly on data collection to build the proclaimed new automation systems with more flexibility, improved human–machine interaction, and better process stability, and thus less waste during manufacturing.

Machine learning applications for shock train diagnostics

AIAA Scitech 2021 Forum ◽

10.2514/6.2021-1878 ◽

2021 ◽

Author(s):

Jared Chin ◽

Mirko Gamba

Keyword(s):

Machine Learning ◽

Shock Train ◽

Machine Learning Applications

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.31232/osf.io/4pxq2 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Ferdinand Filip ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

State Of The Art ◽

Science Methods ◽

Learning Models ◽

Diverse Range ◽

Hybrid Machine ◽

Economics Research

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis was performed on novel data science methods in four individual classes: deep learning models, hybrid deep learning models, hybrid machine learning, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.
