Harnessing Machine Learning and Big Data Analytics for Real-World Applications: A Comprehensive Survey

2021 ◽  
pp. 734-747
Author(s):  
Soukaina Seddik ◽  
Hayat Routaib ◽  
Anass El Haddadi
2019 ◽  
Author(s):  
Peter Kieseberg ◽  
Lukas Daniel Klausner ◽  
Andreas Holzinger

In discussions on the General Data Protection Regulation (GDPR), anonymisation and deletion are frequently mentioned as suitable technical and organisational methods (TOMs) for privacy protection. The major problem of distortion in machine learning environments, along with related privacy issues, is rarely mentioned. The Big Data Analytics project addresses these issues.


Webology ◽  
2021 ◽  
Vol 18 (1) ◽  
pp. 104-120
Author(s):  
S. Josephine Isabella ◽  
Sujatha Srinivasan ◽  
G. Suseendran

In the big data era, learning from imbalanced data has become a continuing research topic alongside data mining and machine learning. In recent years, Big Data and Big Data Analytics have gained prominence because many real-time applications depend on data exploration. Machine learning offers a way to address the difficulties that arise when learning from imbalanced data. Many real-world applications must make predictions on highly imbalanced datasets in which the target variable itself is imbalanced. In most cases, the target values that matter most to the users of the solution are the ones with the fewest occurrences (for example, stock changes, fraud detection, network security, etc.). The growth in available data from networked systems such as security, internet transactions, financial operations, and surveillance by CCTV or other devices makes it critical to study the insufficient knowledge obtained from imbalanced data when supporting decision-making processes. Data imbalance remains a challenge for the research field. Recently, data-level and algorithm-level methods have been improved constantly, leading to new hybrid frameworks for solving this problem in classification. Classifying imbalanced data is a challenging task in big data analytics. This study concentrates on the imbalance that occurs in the data of most real-world applications, a difficulty that arises from the skewed nature of the data distribution. We analyse the data imbalance and propose a solution.
This paper concentrates mainly on finding a better solution to this kind of problem with a proposed framework that uses a hybrid ensemble classifier, with Binary Cross-Entropy as the loss function together with the Gradient Boost algorithm.
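To make the combination of binary cross-entropy and gradient boosting concrete, the following is a minimal sketch and not the authors' framework, whose details the abstract does not give. It assumes decision stumps as weak learners and a tiny hypothetical dataset with a 9:1 class ratio. The names `fit_stump`, `fit_gbm`, and `predict_proba` are illustrative only. Each boosting round fits a stump to the negative gradient of the binary cross-entropy with respect to the current log-odds, which is simply `y - sigmoid(F)`.

```python
import math

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))

def bce(y, p, eps=1e-12):
    """Mean binary cross-entropy between labels y and probabilities p."""
    total = 0.0
    for t, q in zip(y, p):
        total += t * math.log(max(q, eps)) + (1 - t) * math.log(max(1 - q, eps))
    return -total / len(y)

def fit_stump(X, r):
    """Least-squares regression stump: (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [ri for row, ri in zip(X, r) if row[j] <= thr]
            right = [ri for row, ri in zip(X, r) if row[j] > thr]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((ri - lv) ** 2 for ri in left)
                   + sum((ri - rv) ** 2 for ri in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
    if best is None:  # no feature varies: fall back to a constant stump
        mean = sum(r) / len(r)
        return (0, float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, row):
    j, thr, lv, rv = stump
    return lv if row[j] <= thr else rv

def fit_gbm(X, y, rounds=50, lr=0.3):
    """Gradient boosting on log-odds F with binary cross-entropy loss."""
    pos = sum(y) / len(y)
    f0 = math.log(pos / (1 - pos))  # start from the prior log-odds
    F = [f0] * len(y)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of BCE with respect to F is (y - sigmoid(F)).
        residual = [t - sigmoid(f) for t, f in zip(y, F)]
        stump = fit_stump(X, residual)
        stumps.append(stump)
        F = [f + lr * stump_predict(stump, row) for f, row in zip(F, X)]
    return f0, lr, stumps

def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(stump_predict(s, row) for s in stumps))

# Hypothetical imbalanced toy data: 18 negatives, 2 positives, one feature.
X = [[float(i)] for i in range(20)]
y = [1 if i >= 18 else 0 for i in range(20)]

model = fit_gbm(X, y)
baseline_loss = bce(y, [sum(y) / len(y)] * len(y))  # always predict the prior
fitted_loss = bce(y, [predict_proba(model, row) for row in X])
```

Starting from the prior log-odds means the first rounds already account for the class ratio. A fuller treatment of imbalance would typically add resampling or class weighting, which this sketch leaves out.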


Symmetry ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 485 ◽  
Author(s):  
Muhammad Ashfaq Khan ◽  
Md. Rezaul Karim ◽  
Yangwoo Kim

Every day we experience unprecedented data growth from numerous sources, which contribute to big data in terms of volume, velocity, and variability. These datasets in turn pose great challenges to analytics frameworks and computational resources, making it difficult to extract meaningful information in a timely manner. Developing an efficient big data analytics framework to meet these challenges is therefore an important research topic. To address them by exploiting non-linear relationships in very large and high-dimensional datasets, machine learning (ML) and deep learning (DL) algorithms are being used in analytics frameworks. Apache Spark is in use as one of the fastest big data processing engines and helps to solve iterative ML tasks through its distributed ML library, Spark MLlib. For real-world research problems, DL architectures such as Long Short-Term Memory (LSTM) are an effective approach to overcoming practical issues in conventional deep architectures, such as reduced accuracy, long-term sequence dependency, and vanishing and exploding gradients. In this paper, we propose an efficient analytics framework, technically a progressive machine learning technique that merges Spark-based linear models, Multilayer Perceptron (MLP) and LSTM in a two-stage cascade structure in order to enhance predictive accuracy. Our proposed architecture enables us to organize big data analytics in a scalable and efficient way. To show the effectiveness of our framework, we applied the cascading structure to two different real-life datasets to solve a multiclass and a binary classification problem, respectively. Experimental results show that our analytical framework outperforms state-of-the-art approaches with a high level of classification accuracy.
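The abstract does not say how the two stages are wired together, so the following is only a sketch of one common form of cascading: the first stage's predicted probability is appended to the original features before the second stage is trained. Everything here is an assumption made for illustration. The classes `LogisticModel` and `TwoStageCascade` and the toy data are hypothetical, the code does not use Spark MLlib, and a small logistic regression stands in for both stages where the paper uses Spark-based linear models, MLP and LSTM.

```python
import math

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))

class LogisticModel:
    """Tiny logistic regression trained by batch gradient descent.

    Stand-in learner only; any object with fit/predict_proba would do.
    """

    def __init__(self, epochs=500, lr=0.5):
        self.epochs, self.lr = epochs, lr
        self.w, self.b = [], 0.0

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            # Gradient of mean log loss: (p - y) times the input features.
            errs = [self.predict_proba(row) - t for row, t in zip(X, y)]
            for j in range(d):
                grad = sum(e * row[j] for e, row in zip(errs, X)) / n
                self.w[j] -= self.lr * grad
            self.b -= self.lr * sum(errs) / n
        return self

    def predict_proba(self, row):
        return sigmoid(self.b + sum(wj * xj for wj, xj in zip(self.w, row)))

class TwoStageCascade:
    """Stage 2 sees the original features plus stage 1's probability."""

    def __init__(self, stage1, stage2):
        self.stage1, self.stage2 = stage1, stage2

    def augment(self, row):
        return list(row) + [self.stage1.predict_proba(row)]

    def fit(self, X, y):
        self.stage1.fit(X, y)
        self.stage2.fit([self.augment(row) for row in X], y)
        return self

    def predict_proba(self, row):
        return self.stage2.predict_proba(self.augment(row))

    def predict(self, row):
        return 1 if self.predict_proba(row) >= 0.5 else 0

# Hypothetical, linearly separable toy data with two features.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
y = [0, 0, 0, 1, 1, 1]

cascade = TwoStageCascade(LogisticModel(), LogisticModel()).fit(X, y)
predictions = [cascade.predict(row) for row in X]
```

Because the cascade depends only on `fit` and `predict_proba`, either stage could be swapped for a stronger learner without changing the wiring. In a real pipeline the first stage's outputs would usually come from held-out folds, so the second stage does not train on overfitted scores.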


2020 ◽  
Vol 102 (913) ◽  
pp. 199-234
Author(s):  
Nema Milaninia

Advances in mobile phone technology and social media have created a world where the volume of information generated and shared is outpacing the ability of humans to review and use that data. Machine learning (ML) models and “big data” analytical tools have the power to ease that burden by making sense of this information and providing insights that might not otherwise exist. In the context of international criminal and human rights law, ML is being used for a variety of purposes, including to uncover mass graves in Mexico, find evidence of homes and schools destroyed in Darfur, detect fake videos and doctored evidence, predict the outcomes of judicial hearings at the European Court of Human Rights, and gather evidence of war crimes in Syria. ML models are also increasingly being incorporated by States into weapon systems in order to better enable targeting systems to distinguish between civilians, allied soldiers and enemy combatants or even inform decision-making for military attacks. The same technology, however, also comes with significant risks. ML models and big data analytics are highly susceptible to common human biases. As a result of these biases, ML models have the potential to reinforce and even accelerate existing racial, political or gender inequalities, and can also paint a misleading and distorted picture of the facts on the ground. This article discusses how common human biases can impact ML models and big data analytics, and examines what legal implications these biases can have under international criminal law and international humanitarian law.

