Detecting Drinking-Related Contents on Social Media by Classifying Heterogeneous Data Types

Author(s):  
Omar ElTayeby ◽  
Todd Eaglin ◽  
Malak Abdullah ◽  
David Burlinson ◽  
Wenwen Dou ◽  
...  
2018 ◽  
Vol 25 (4) ◽  
pp. 1756-1767 ◽  

Binge drinking is a severe health problem faced by many US colleges and universities. College students often post drinking-related text and images on social media, portraying their alcohol use as socially desirable. In this project, we investigated the feasibility of mining the heterogeneous data (e.g. text, images, and videos) on Facebook to identify drinking-related content. We manually annotated 4266 posts made between 21 October 2011 and 3 November 2014 in the “I’m Shmacked” group on Facebook, of which 511 were drinking-related. Our machine learning models show that, by combining heterogeneous data types, we were able to identify drinking-related posts with an F1-score of 0.81. Prediction models built on text data were more reliable than those built on image and video data for predicting drinking-related content. As the first step of our efforts in this direction, this feasibility study showed promise toward unleashing the potential of mining social media to identify students who binge drink.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Gianluca Solazzo ◽  
Ylenia Maruccia ◽  
Gianluca Lorenzo ◽  
Valentina Ndou ◽  
Pasquale Del Vecchio ◽  
...  

Purpose: This paper aims to highlight how the exploitation of big social data (BSD) and analytics may help destination management organisations (DMOs) to understand tourist behaviours and destination experiences and images. Textual and visual content gathered from two different sources, Flickr and Twitter, is used to perform different analytics tasks that generate insights on tourist behaviour and the affective aspects of the destination image. Design/methodology/approach: This work adopts a multimodal approach to BSD and analytics, considering multiple BSD sources and different analytics techniques on heterogeneous data types, to obtain complementary results on the Salento region (Italy) case study. Findings: Results show that the generated insights allow DMOs to discover unknown clusters of points of interest, identify trends and seasonal patterns of tourist demand, monitor topics and sentiment and identify attractive places. DMOs can exploit these insights to address their needs in terms of decision support for the management and development of the destination, the enhancement of destination attractiveness, the shaping of new marketing and communication strategies and the planning of tourist demand within the destination. Originality/value: The originality of this work is in the use of BSD and analytics techniques to give DMOs specific insights on a destination in a deep and wide fashion. Collected data are used with a multimodal analytic approach to build tourist characteristics, images, attitudes and preferred destination attributes, which represent for DMOs a unique means for problem-solving, decision-making, innovation and prediction.


Data Mining ◽  
2013 ◽  
pp. 816-836
Author(s):  
Farid Bourennani ◽  
Shahryar Rahnamayan

Nowadays, many universities, research centers, and companies worldwide share their own data electronically. Naturally, these data are of heterogeneous types such as text, numerical data, multimedia, and others. From the user's side, these data should be accessible in a uniform manner, which implies a unified approach for representing and processing data. Furthermore, unified processing of the heterogeneous data types can lead to richer semantic results. In this chapter, we present a unified pre-processing approach that leads to the generation of richer semantics from qualitative and quantitative data.


Author(s):  
José Antonio Seoane Fernández ◽  
Mónica Miguélez Rico

Large worldwide projects like the Human Genome Project, which in 2003 successfully concluded the sequencing of the human genome, and the recently terminated HapMap Project, have opened new perspectives in the study of complex multigene illnesses: they have provided us with new information to tackle the complex mechanisms and relationships between genes and environmental factors that generate complex illnesses (Lopez, 2004; Dominguez, 2006). Thanks to these new genomic and proteomic data, it becomes increasingly possible to develop new medicines and therapies, establish early diagnoses, and even discover new solutions for old problems. These tasks, however, inevitably require the analysis, filtering, and comparison of a large amount of data generated in a laboratory with an enormous amount of data stored in public databases, such as those of the NCBI and the EBI. Computer science equips biomedicine with an environment that simplifies our understanding of the biological processes that take place at each and every organizational level of living matter (molecular level, genetic level, cell, tissue, organ, individual, and population) and the intrinsic relationships between them. Bioinformatics can be described as the application of computational methods to biological discoveries (Baldi, 1998). It is a multidisciplinary area that includes computer science, biology, chemistry, mathematics, and statistics. The three main tasks of bioinformatics are the following: develop algorithms and mathematical models to test the relationships between the members of large biological datasets, analyze and interpret heterogeneous data types, and implement tools that allow the storage, retrieval, and management of large amounts of biological data.


2020 ◽  
Vol 12 (10) ◽  
pp. 4246 ◽  
Author(s):  
David Pastor-Escuredo ◽  
Yolanda Torres ◽  
María Martínez-Torres ◽  
Pedro J. Zufiria

Natural disasters affect hundreds of millions of people worldwide every year. The impact assessment of a disaster is key to improving the response and mitigating how a natural hazard turns into a social disaster. An actionable quantification of impact must be integratively multi-dimensional. We propose a rapid impact assessment framework that comprises detailed geographical and temporal landmarks as well as the potential socio-economic magnitude of the disaster based on heterogeneous data sources: environmental sensor data, social media, remote sensing, digital topography, and mobile phone data. As the dynamics of floods vary greatly depending on their causes, the framework may support different phases of decision-making during the disaster management cycle. To evaluate its usability and scope, we explored four flooding cases with variable conditions. The results show that social media proxies provide a robust identification with daily granularity even when rainfall detectors fail. The detection also provides information on the magnitude of the flood, which is potentially useful for planning. Network analysis was applied to the social media data to extract patterns of social effects after the flood. This analysis showed significant variability in the obtained proxies, which encourages the scaling of schemes to comparatively characterize patterns across many floods with different contexts and cultural factors. This framework is presented as a module of a larger data-driven system designed to be the basis for responsive and more resilient systems in urban and rural areas. The impact-driven approach presented may facilitate public–private collaboration and data sharing by providing real-time evidence with aggregated data to support requests for private data with higher granularity, the lack of which is currently the most important limitation in implementing fully data-driven systems for disaster response by both local and international actors.


Author(s):  
Cyril Alias ◽  
Udo Salewski ◽  
Viviana Elizabeth Ortiz Ruiz ◽  
Frank Eduardo Alarcón Olalla ◽  
José do Egypto Neirão Reymão ◽  
...  

With global megatrends like automation and digitization changing societies, economies, and ultimately businesses, a shift is underway, disrupting current business plans and entire industries. Business actors have accordingly developed an instinctive fear of economic decline and realized the necessity of taking adequate measures to keep up with the times. Increasingly, organizations find themselves in an evolve-or-die race, with their success depending on their capability of recognizing the requirements for serving a specific market and adopting those requirements accurately into their own structure. In the transportation and logistics sector, emerging technological and information challenges are reflected in fierce competition from within and outside. In particular, processes and supporting information systems are put to the test when technological innovations start to spread among an increasing number of actors and promise higher performance or lower cost. As to warehousing, technological innovation continuously finds its way into the premises of the heterogeneous warehouse operators, leading to modifications and process improvements. Such innovation can be on the side of the hardware equipment or in the form of new software solutions. In particular, the fourth industrial revolution is globally underway. The same applies to Future Internet technologies, a European term for innovative software technologies and the research upon them. On the one hand, new hardware solutions using robotics, cyber-physical systems and sensors, and advanced materials are constantly put to widespread use. On the other, software solutions based on intensified digitization, including new and more heterogeneous sources of information, higher volumes of data, and increasing processing speed, are also becoming an integral part of popular information systems for warehouses, particularly warehouse management systems.
With a rapidly and dynamically changing environment and new legal and business requirements for warehouse processes and supporting information systems, new performance levels in terms of quality and cost of service are to be obtained. For this purpose, new expectations of the functionality of warehouse management systems need to be derived. While introducing wholly new solutions is one option, retrofitting and adapting existing systems to the new requirements is another. Warehouse management systems will need to deal with more types of data from new and heterogeneous data sources. They will also need to connect to innovative machines and represent their respective operating principles. In both scenarios, systems need to satisfy the demand for new features in order to remain capable of processing information and acting on it and, thereby, to optimize logistics processes in real time. By taking a closer look at an industrial use case of a warehouse management system, opportunities for incorporating such new requirements are presented as the system adapts to new data types, increased processing speed, and new machines and equipment used in the warehouse. Ultimately, the present paper proves the adaptability of existing warehouse management systems to the requirements of the new digital world and presents viable methods for adopting the necessary renovation processes.


2018 ◽  
Vol 7 (4.38) ◽  
pp. 939
Author(s):  
Nur Atiqah Sia Abdullah ◽  
Hamizah Binti Anuar

Facebook and Twitter are the most popular social media platforms among netizens. People now express their opinions, perceptions, and emotions more aggressively through social media platforms. These massive data provide great value for data analysts seeking to understand patterns and emotions related to a certain issue. Mining the data requires techniques and time; therefore, data visualization has become a popular way of representing these types of information. This paper aims to review data visualization studies that involved data from social media postings. Past literature used node-link diagrams, node-link trees, directed graphs, line graphs, heatmaps, and stream graphs to represent the data collected from social media platforms. An analysis comparing the social media data types, representations, and data visualization techniques is carried out based on the previous studies. This paper critically discusses the comparison and provides a suggestion for the suitability of data visualization based on the type of social media data in hand.


2019 ◽  
Vol 35 (1) ◽  
pp. 25-48 ◽  
Author(s):  
Cristina Alaimo ◽  
Jannis Kallinikos ◽  
Erika Valderrama

The growing business expansion of social media platforms is changing their identity and transforming the practices of networking, data and content sharing with which social media have been commonly associated. We empirically investigate these shifts in the context of TripAdvisor and its evolution since its establishment. We trace the mutations of the platform along three stages that we identify as search engine, social media platform and end-to-end service ecosystem. Our findings reveal the underlying patterns of data types, technological functionalities and actor configurations that punctuate the business expansion of TripAdvisor and lead to the formation of its service ecosystem. We contribute to the understanding of the current trajectory in which social media find themselves as well as to the literature on platforms and ecosystems. We point out the importance of services that develop as commercially viable and constantly updatable data bundles out of diverse and dynamic data types. Such services are essential to the making of the complementarities that are claimed to underlie ecosystem formation.

