Techniques for Sampling Online Text-Based Data Sets

2013 ◽

pp. 95-114 ◽

Cited By ~ 8

Author(s):

Lynne M. Webb ◽

Yuanxin Wang

Keyword(s):

Big Data ◽

Social Networking ◽

Adaptive Sampling ◽

Online Gaming ◽

Online Media ◽

Sampling Techniques ◽

Data Sets ◽

Online Data ◽

Social Networking Websites ◽

Report Analysis

The chapter reviews traditional sampling techniques and suggests adaptations relevant to big data studies of text downloaded from online media such as email messages, online gaming, blogs, micro-blogs (e.g., Twitter), and social networking websites (e.g., Facebook). The authors review methods of probability, purposeful, and adaptive sampling of online data. They illustrate the use of these sampling techniques via published studies that report analysis of online text.

Tuning Active Sampling Techniques for Evolutionary Learner from Big Data Sets: Review and Discussion

2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld) ◽

10.1109/uic-atc-scalcom-cbdcom-iop-smartworld.2016.0184 ◽

2016 ◽

Author(s):

Sana Ben Hamida ◽

Marta Rukoz

Keyword(s):

Big Data ◽

Sampling Techniques ◽

Data Sets ◽

Active Sampling

Familiarity with Big Data, Privacy Concerns, and Self-Disclosure Accuracy in Social Networking Websites: An APCO Model

Communications of the Association for Information Systems ◽

10.17705/1cais.04104 ◽

2017 ◽

Vol 41 ◽

pp. 62-96 ◽

Cited By ~ 3

Author(s):

Tawfiq Alashoor ◽

Sehee Han ◽

Rhoda C. Joseph

Keyword(s):

Big Data ◽

Social Networking ◽

Data Privacy ◽

Self Disclosure ◽

Privacy Concerns ◽

Social Networking Websites ◽

Big Data Privacy

A survey on text mining in social networks

The Knowledge Engineering Review ◽

10.1017/s0269888914000277 ◽

2015 ◽

Vol 30 (2) ◽

pp. 157-170 ◽

Cited By ~ 42

Author(s):

Rizwana Irfan ◽

Christine K. King ◽

Daniel Grages ◽

Sam Ewen ◽

Samee U. Khan ◽

...

Keyword(s):

Text Mining ◽

Social Networking ◽

Social Networking Sites ◽

Data Sets ◽

Text Documents ◽

Network Applications ◽

Social Networking Websites ◽

The Social ◽

Information Patterns ◽

Basic Approaches

Abstract: In this survey, we review different text mining techniques for discovering textual patterns in social networking sites. Social network applications, such as chat, comments, and discussion boards, create opportunities for interaction among people, leading to mutual learning and the sharing of valuable knowledge. Data in social networking websites is inherently unstructured and fuzzy. In everyday conversation, people pay little attention to spelling and accurate grammatical construction, which may lead to different types of ambiguity, such as lexical, syntactic, and semantic. Therefore, analyzing and extracting information patterns from such data sets is more complex. Several surveys have been conducted to analyze different methods for information extraction. Most of these surveys emphasize the application of text mining techniques to unstructured data sets that reside in the form of text documents, but do not specifically target the data sets in social networking websites. This survey attempts to provide a thorough understanding of different text mining techniques as well as the application of these techniques in social networking websites. It investigates recent advances in the field of text analysis and covers two basic approaches to text mining, classification and clustering, that are widely used to explore the unstructured text available on the Web.

Psychoinformatics: a theoretical approach on information science and psychology

Journal of Management and Science ◽

10.26524/jms.2020.2.2 ◽

2020 ◽

Vol 10 (2) ◽

pp. 7-10

Author(s):

Deepti Pandey

Keyword(s):

Big Data ◽

Social Networking ◽

Information Science ◽

Large Data ◽

Large Data Sets ◽

Future Research ◽

Data Sets ◽

Scientific Methods ◽

New Challenges ◽

Insight Into

This article provides insight into an emerging research discipline called Psychoinformatics. In the context of Psychoinformatics, we emphasize the cooperation between the disciplines of Psychology and Information Science, which handles large data sets derived from heavily used devices such as smartphones or from online social networking, in order to highlight psychological qualities including both personality and mood. New challenges await psychologists in view of the resulting “Big Data” sets, because classic psychological methods will only in part be able to analyze data derived from ubiquitous mobile devices and other everyday technologies. Consequently, psychologists must enrich their scientific methods through the inclusion of methods from informatics. Furthermore, we emphasize that data derived from Psychoinformatics can be combined in a meaningful way with data from human neuroscience. We close the article with some observations on areas for future research and problems that require consideration within this new discipline.

Understanding intellectual capital disclosure in online media Big Data

Meditari Accountancy Research ◽

10.1108/medar-03-2018-0302 ◽

2018 ◽

Vol 26 (3) ◽

pp. 499-530 ◽

Cited By ~ 6

Author(s):

Valentina Ndou ◽

Giustina Secundo ◽

John Dumay ◽

Elvin Gjevori

Keyword(s):

Social Media ◽

Big Data ◽

Intellectual Capital ◽

Collective Intelligence ◽

Online Media ◽

Online Data ◽

Content Type ◽

Future Goals ◽

Research Stream ◽

Media Channels

Purpose: Intellectual capital disclosure (ICD) in universities is gaining increasing attention, especially through the adoption of innovative technologies. Online media, as a relevant source of Big Data, is shifting ICD. The purpose of this paper is to explore how Big Data generated through online media, such as websites and platforms like Facebook, can be used as rich sources of data and viable disclosure channels for ICD in a university. Design/methodology/approach: This is an exploratory case study, following the methodology in Yin (2014), that examines how online media data contributes to closing the ICD gap. The IC disclosed through different online media channels by a private university in Albania is analysed using Secundo et al.’s (2016) collective intelligence framework. The online data sources include the university’s website, Facebook page, periodic reports and statements outlining future goals. Findings: What the authors discover in this research is that IC is an important part of how universities operate, and IC is communicated through social media, although unintentionally. However, this only serves to highlight the importance of IC, and if researchers want to discover IC and understand how it works in an organisation, they need to include social media as a prime resource for developing that understanding. Research limitations/implications: Most importantly, the findings add to a growing consensus that ICD researchers, and researchers in other management and accounting disciplines, who traditionally rely on annual corporate social responsibility and other periodic reports, need to change their medium of analysis because these reports can no longer be relied on to understand IC and its impact on an organisation. Originality/value: Online media tools and the advent of Big Data have created new opportunities for universities to disclose their IC information to stakeholders in a timely manner and to gain relevant insights into their impact on society. The originality of the paper resides in the contribution of Big Data to the ICD research stream.

Methods of Researching Online Communities

Thick Big Data ◽

10.1093/oso/9780198839705.003.0003 ◽

2020 ◽

pp. 23-112

Author(s):

Dariusz Jemielniak

Keyword(s):

Big Data ◽

Narrative Analysis ◽

Cultural Production ◽

Quantitative Methods ◽

Methodological Approach ◽

Digital Culture ◽

Data Sets ◽

Online Data ◽

Digital Communities ◽

Ethnographic Analysis

The chapter presents the idea of Thick Big Data, a methodological approach combining big data sets with thick, ethnographic analysis. It presents different quantitative methods, including Google Correlate, social network analysis (SNA), online polls, culturomics, and data scraping, as well as easy tools to start working with online data. It describes the key differences in performing qualitative studies online, focusing on the example of digital ethnography. It also offers help with using case studies of digital communities. It gives specific guidance on conducting interviews online, and describes how to perform narrative analysis of digital culture. It concludes by describing methods of studying online cultural production, and discusses the notions of remix culture, memes, and trolling.

The Impact Of Social Networking Websites On The System Of University Values Among Students Of Al Ain University Of Science And Technology In The U.A.E.

10.35516/0103-047-001-035 ◽

2020 ◽

pp. 205

Author(s):

محمد سلمان فياض الخزاعلة

Keyword(s):

Social Networking ◽

Science And Technology ◽

Social Networking Websites ◽

Al Ain ◽

The Impact

Construction of 3-D Terrain Models from BIG Data Sets

10.21236/ada607383 ◽

2014 ◽

Author(s):

Pankaj K. Agarwal ◽

Thomas Moelhave

Keyword(s):

Big Data ◽

Data Sets ◽

Terrain Models

Big Data Security Challenges and Solution of Distributed Computing in Hadoop Environment: A Security Framework

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190822095422 ◽

2020 ◽

Vol 13 (4) ◽

pp. 790-797

Author(s):

Gurjit Singh Bhathal ◽

Amardeep Singh Dhiman

Keyword(s):

Big Data ◽

Data Security ◽

Data Sets ◽

Security Framework ◽

Hadoop Distributed File System ◽

Current Scenario ◽

Hadoop Cluster ◽

Ciphertext Policy ◽

In Transit ◽

Hadoop Framework

Background: In the current internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner. It is argued that the Hadoop framework is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the user and the Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses the Kerberos network authentication protocol for authorisation and authentication and to validate the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit. A user encrypts a file with their own set of attributes and stores it on the Hadoop Distributed File System. Only intended users can decrypt that file with matching parameters. Results: The proposed algorithm was implemented with data sets of different sizes. The data was processed with and without encryption. The results show little difference in processing time. Performance was affected in the range of 0.8% to 3.1%, which also includes the impact of other factors, such as system configuration, the number of parallel jobs running, and the virtual environment. Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment. The solution is experimentally shown to have little effect on the performance of the system for data sets of different sizes.
