Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2993
Author(s):  
Ebtesam Alomari ◽  
Iyad Katib ◽  
Aiiad Albeshri ◽  
Tan Yigitcanlar ◽  
Rashid Mehmood

Digital societies could be characterized by their increasing desire to express themselves and interact with others. This is being realized through digital platforms such as social media, which have increasingly become convenient and inexpensive sensors compared to physical sensors in many sectors of smart societies. One such major sector is road transportation, which is the backbone of modern economies and costs 1.25 million deaths and 50 million human injuries globally each year. State-of-the-art work on big data-enabled social media analytics for transportation-related studies is limited. This paper brings a range of technologies together to detect road traffic-related events using big data and distributed machine learning. The most specific contribution of this research is an automatic labelling method for machine learning-based traffic-related event detection from Twitter data in the Arabic language. The proposed method has been implemented in a software tool called Iktishaf+ (an Arabic word meaning discovery) that detects traffic events automatically from Arabic-language tweets using distributed machine learning over Apache Spark. The tool is built from nine components and a range of technologies including Apache Spark, Parquet, and MongoDB. Iktishaf+ uses a light stemmer for the Arabic language developed by us. We also use a location extractor developed by us that allows us to extract and visualize spatio-temporal information about the detected events. The data used in this work comprises 33.5 million tweets collected from Saudi Arabia using the Twitter API. Using support vector machine, naïve Bayes, and logistic regression classifiers, we detect and validate several real events in Saudi Arabia without prior knowledge, including a fire in Jeddah, rains in Makkah, and an accident in Riyadh. The findings show the effectiveness of Twitter in detecting important events with no prior knowledge about them.
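The abstract does not describe how the automatic labelling works, so the following is only a minimal sketch of one common approach, keyword-lexicon weak labelling, and not the authors' method. The lexicon, the function names `auto_label` and `build_training_set`, and the whitespace tokenization are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical lexicon (event type -> trigger words); illustrative only,
# not taken from the Iktishaf+ paper.
EVENT_LEXICON: Dict[str, Set[str]] = {
    "accident": {"حادث", "تصادم"},
    "fire": {"حريق"},
    "rain": {"مطر", "أمطار"},
}


def auto_label(tweet: str,
               lexicon: Dict[str, Set[str]] = EVENT_LEXICON) -> Optional[str]:
    """Return the event type with the most trigger-word hits, or None."""
    tokens = tweet.split()  # assumption: plain whitespace tokenization
    best_label, best_hits = None, 0
    for label, triggers in lexicon.items():
        hits = sum(1 for tok in tokens if tok in triggers)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label


def build_training_set(tweets: List[str]) -> List[Tuple[str, str]]:
    """Keep only tweets that received a label; these become training pairs."""
    pairs = []
    for tweet in tweets:
        label = auto_label(tweet)
        if label is not None:
            pairs.append((tweet, label))
    return pairs
```

The labelled pairs could then be fed to any supervised classifier. Exact token matching is brittle for Arabic, where words carry prefixes and suffixes, which is presumably one reason a tool of this kind would stem tokens before matching.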

Author(s):  
Muhammad Junaid ◽  
Shiraz Ali Wagan ◽  
Nawab Muhammad Faseeh Qureshi ◽  
Choon Sung Nam ◽  
Dong Ryeol Shin

Author(s):  
Kağan Okatan

All these types of analytics have long answered business questions through the principal methods of investigating data warehouses. Data mining and business intelligence systems in particular help decision makers reach the information they want. Many existing systems are trying to keep up with a phenomenon that has changed the rules of the game in recent years: the undeniable attraction of 'big data'. In particular, evaluating the big data generated by social media is among the most current issues in business analytics, and it demonstrates the importance of integrating machine learning into business analytics. This section introduces the prominent machine learning algorithms that are increasingly used for business analytics and emphasizes their application areas.


2020 ◽  
pp. 214-244
Author(s):  
Prithish Banerjee ◽  
Mark Vere Culp ◽  
Kenneth Joseph Ryan ◽  
George Michailidis

This chapter presents some popular graph-based semi-supervised approaches. These techniques apply to classification and regression problems and can be extended to big data problems using recently developed anchor graph enhancements. The background necessary for understanding this chapter includes linear algebra and optimization. No prior knowledge of machine learning methods is necessary. The techniques are also demonstrated empirically on real benchmark data sets.
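As a rough illustration of the graph-based semi-supervised idea, the sketch below runs iterative label propagation with clamped labelled nodes on a small weighted graph. It is a generic textbook variant, not necessarily one of the chapter's methods, and the function name `propagate_labels` and its inputs are assumptions.

```python
def propagate_labels(weights, labels, iterations=100):
    """Spread labels over a graph by repeated neighbour averaging.

    weights: symmetric n x n list of lists of non-negative edge weights.
    labels:  list with 1.0 or 0.0 for labelled nodes, None for unlabelled.
    Returns one score in [0, 1] per node.
    """
    n = len(labels)
    # Unlabelled nodes start at a neutral score of 0.5.
    scores = [0.5 if y is None else float(y) for y in labels]
    for _ in range(iterations):
        new_scores = scores[:]
        for i in range(n):
            if labels[i] is not None:
                continue  # labelled nodes stay clamped to their label
            degree = sum(weights[i])
            if degree > 0:
                # Weighted average of the neighbours' current scores.
                new_scores[i] = sum(
                    weights[i][j] * scores[j] for j in range(n)
                ) / degree
        scores = new_scores
    return scores


# Chain graph 0 - 1 - 2 - 3 with the two end nodes labelled.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
chain_scores = propagate_labels(chain, [1.0, None, None, 0.0])
```

On this chain the unlabelled nodes settle at scores that interpolate between the two labelled ends, so the node nearer the positive end scores higher.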


2021 ◽  
Vol 12 ◽  
Author(s):  
Muhammad Usman Tariq ◽  
Muhammad Babar ◽  
Marc Poulin ◽  
Akmal Saeed Khattak ◽  
Mohammad Dahman Alshehri ◽  
...  

Intelligent big data analysis is an evolving pattern in the age of big data science and artificial intelligence (AI). Analysis of organized data has been very successful, but analyzing human behavior using social media data is challenging. Social media data is vast and unstructured and can include likes, comments, tweets, shares, and views. Analyzing it has become a challenging task for companies such as Dailymotion that have billions of daily users and vast numbers of comments, likes, and views. Social media data is created in significant amounts and at a tremendous pace, and this high volume must be stored, sorted, processed, and carefully studied to support decisions. This article proposes an architecture that uses a big data analytics mechanism to process huge social media datasets efficiently and logically. The proposed architecture is composed of three layers. The main objective of the project is to demonstrate Apache Spark parallel processing and distributed framework technologies together with other storage and processing mechanisms. Social media data generated from Dailymotion is used in this article to demonstrate the benefits of this architecture. The project used the Dailymotion application programming interface (API), which provides functions for fetching and viewing information. An API key is generated to fetch public channel data in the form of text files. The Hive storage mechanism is used with Apache Spark for efficient data processing. The effectiveness of the proposed architecture is also highlighted.


The main objective of this paper is to analyze reviews of e-commerce products in social media big data. It gives online shoppers useful results about product quality and gives businesses decision-making insight into the products customers most like and buy. The analysis covers all features or opinion words available in tweets, such as capitalized words, sequences of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunctions, and negation words. Existing work has considered only two or three features when performing sentiment analysis with machine learning and natural language processing (NLP). In the proposed work, the familiar machine learning classification models Multinomial Naïve Bayes, Support Vector Machine, Decision Tree, and Random Forest are used for sentiment classification. The sentiment classification serves as a decision support system for both customers and businesses.
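To make the feature list concrete, here is a minimal sketch of how a few of the named features (capitalized words, repeated letters, exclamations, negations, intensifiers) might be counted before classification. The word lists, the feature names, and the function `extract_features` are illustrative assumptions, not the paper's implementation.

```python
import re

# Illustrative word lists; a real system would use much larger lexicons.
NEGATIONS = {"not", "no", "never"}
INTENSIFIERS = {"very", "really", "so"}


def extract_features(text):
    """Count a handful of surface sentiment cues in a tweet or review."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lowered = [t.lower() for t in tokens]
    return {
        # Words written entirely in capitals, e.g. "SO".
        "all_caps_words": sum(1 for t in tokens if len(t) > 1 and t.isupper()),
        # Words with a character repeated three or more times, e.g. "goooood".
        "elongated_words": sum(
            1 for t in tokens if re.search(r"(.)\1{2,}", t)
        ),
        "exclamations": text.count("!"),
        "negations": sum(1 for t in lowered if t in NEGATIONS),
        "intensifiers": sum(1 for t in lowered if t in INTENSIFIERS),
    }


features = extract_features("This phone is SO goooood, really not bad!!")
```

The resulting dictionary can be turned into a numeric vector and passed to any of the classifiers named above.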


Author(s):  
Karimah Mohammad Qutah ◽  
Safar A. Alsaleem ◽  
Abdullah Ahmed Najmi ◽  
Muteb Bawwah Zabbani

Aim: To assess mothers' knowledge and attitudes regarding self-expressed milk in Jazan, Saudi Arabia. Methodology: An observational, cross-sectional study was conducted in the Obstetric Department (Well Baby and Immunization Clinics) of King Fahd Central Hospital (KFCH), Jazan, Saudi Arabia, and in six randomly selected PHCCs in Jazan from December 2016 to March 2017. Pregnant women who had previously delivered babies and postpartum women in the obstetric departments and obstetric outpatient clinic, together with mothers in the well baby and immunization clinics in these areas, were included in the study. Stratified multistage sampling was used. The sample size of N = 499 Saudi mothers was calculated with a survey system at a 95% confidence level. The questionnaire was self-administered (in Arabic). All data were processed with the Statistical Package for the Social Sciences (SPSS) version 19. The Shapiro-Wilk test was applied. The Kruskal-Wallis test was used to assess the association of knowledge and practice levels with demographic variables that have more than two categories. The Mann-Whitney test and Spearman correlation were also used. Results: A total of 499 mothers participated, aged 30 ± 7 years, with a mean number of children of 2.98 ± 2. Of these, 73.5% had heard about self-expressed breast milk, and 236 of them practiced it. Both the level of knowledge and the accuracy of practice were inadequate. Around one third of mothers had heard about it from social media. More than a third of the women practiced it because of work-related issues. Higher educational level was associated with higher knowledge (p < 0.001). Age and number of children had no statistically significant effect on knowledge level (p = 0.417 and 0.285). Working mothers had a higher knowledge level than housewives and students (p < 0.001), and nurses, especially those who had received breastfeeding training, had a higher knowledge level than physicians, followed by teachers (p < 0.001). Mothers who obtained their knowledge from breastfeeding courses had the highest knowledge level, followed by those who obtained it from medical staff other than physicians, social media and internet websites, physicians, mothers, and lastly friends (p < 0.001). Mothers with more accurate practice were more knowledgeable than mothers with less accurate practice (p < 0.001). Conclusion: Mothers' knowledge and practice regarding self-expressed breast milk need to be improved to give babies a better chance of exclusive breastfeeding. Breastfeeding courses for mothers give better results in terms of the accuracy of mothers' knowledge and practice of expressed breast milk.

