Detecting Local Communities within a Large Scale Social Network Using MapReduce

2014 ◽  
Vol 10 (1) ◽  
pp. 57-76 ◽  
Author(s):  
Hongjun Yin ◽  
Jing Li ◽  
Yue Niu

Partitioning social networks has become an important task. One objective is to identify communities of interest to target for marketing and advertising. The bottleneck in detecting these communities is the sheer scale of the social network. Previous methods did not address the problem effectively because they operated on the whole network. Social networks exhibit strong locality, so a local algorithm that finds a community of interest is needed. In this paper, we develop a local partitioning algorithm, named Personalized PageRank Partitioning, to identify the community. We compute the conductance of the social network using a Personalized PageRank vector and the Markov chain stationary distribution of the network, and then sweep over the conductance values to find the smallest cut. The efficiency of the cut can reach. To handle larger social networks, we design and implement the algorithm on the MapReduce programming framework. Finally, we run experiments on several real social network data sets and compare our method with others. The experimental results show that our algorithm is feasible and very effective.
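The abstract gives no implementation details, so the following is only a minimal single-machine sketch of the general idea (Personalized PageRank followed by a conductance sweep), not the authors' MapReduce algorithm. The function names, the teleport probability `alpha`, the iteration count, and the toy graph are all assumptions made for illustration. Dividing each score by the node degree stands in for normalizing by the stationary distribution, which is proportional to degree for a random walk on an undirected graph.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for PageRank that teleports back to `seed`.

    `adj` maps each node to a list of neighbours (undirected, no
    isolated nodes, no self-loops). `alpha` is an assumed value.
    """
    p = {v: 0.0 for v in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += alpha  # teleport mass returns to the seed
        for u, nbrs in adj.items():
            share = (1.0 - alpha) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p


def sweep_cut(adj, p):
    """Scan prefixes of nodes ordered by p[v] / deg(v) and return the
    prefix with the lowest conductance, together with that value."""
    deg = {v: len(adj[v]) for v in adj}
    total_vol = sum(deg.values())
    order = sorted(adj, key=lambda v: p[v] / deg[v], reverse=True)

    in_set, vol, cut = set(), 0, 0
    best_cond, best_set = float("inf"), set()
    for v in order:
        in_set.add(v)
        vol += deg[v]
        # Edges to nodes already inside stop being cut edges;
        # edges to nodes outside become cut edges.
        for w in adj[v]:
            cut += -1 if w in in_set else 1
        denom = min(vol, total_vol - vol)
        if denom == 0:
            continue  # the whole graph is not a valid cut
        cond = cut / denom
        if cond < best_cond:
            best_cond, best_set = cond, set(in_set)
    return best_set, best_cond


# Toy graph (hypothetical): two triangles joined by the edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
scores = personalized_pagerank(graph, seed=0)
community, conductance = sweep_cut(graph, scores)
print(sorted(community), conductance)
```

Because the sweep only needs the ranked scores, the expensive part is the PageRank iteration; in this sketch that step is a plain loop over the adjacency lists.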

2019 ◽  
Vol 92 (2) ◽  
pp. 105-123 ◽  
Author(s):  
Isabel J. Raabe ◽  
Zsófia Boda ◽  
Christoph Stadtfeld

Individuals’ favorite subjects in school can predetermine their educational and occupational careers. If girls develop weaker preferences for science, technology, engineering, and math (STEM), it can contribute to macrolevel gender inequalities in income and status. Relying on large-scale panel data on adolescents from Sweden (218 classrooms, 4,998 students), we observe a widening gender gap in preferring STEM subjects within a year (girls, 19 to 15 percent; boys, 21 to 20 percent). By applying newly developed random-coefficient multilevel stochastic actor-oriented models on social network data (27,428 friendships), we investigate how social context contributes to those changes. We find strong evidence that students adjust their preferences to those of their friends (friend influence). Moreover, girls tend to retain their STEM preferences when other girls in their classroom also like STEM (peer exposure). We conclude that these mechanisms amplify preexisting preferences and thereby contribute to the observed dramatic widening of the STEM gender gap.


Methodology ◽  
2006 ◽  
Vol 2 (1) ◽  
pp. 42-47 ◽  
Author(s):  
Bonne J. H. Zijlstra ◽  
Marijtje A. J. van Duijn ◽  
Tom A. B. Snijders

The p2 model is a random effects model with covariates for the analysis of binary directed social network data coming from a single observation of a social network. Here, a multilevel variant of the p2 model is proposed for the case of multiple observations of social networks, for example, in a sample of schools. The multilevel p2 model defines an identical p2 model for each independent observation of the social network, where parameters are allowed to vary across the multiple networks. The multilevel p2 model is estimated with a Bayesian Markov Chain Monte Carlo (MCMC) algorithm that was implemented in free software for the statistical analysis of complete social network data, called StOCNET. The new model is illustrated with a study on the practical support received by Dutch high school pupils of different ethnic backgrounds.


2019 ◽  
pp. 81-93
Author(s):  
Iliya L. Musabirov ◽  

The article describes an approach to using data visualization in various educational analytics tools when building university courses. In addition to the analysis of educational behavior, socio-psychological approaches, including the theory of expectations and social values, and the social network approach are considered separately as prospects for analysis. An example of designing training analytics with modern data analysis and visualization tools is analyzed.


E-Marketing ◽  
2012 ◽  
pp. 185-197
Author(s):  
Przemyslaw Kazienko ◽  
Piotr Doskocz ◽  
Tomasz Kajdanowicz

The chapter describes a method for performing a classification task without any demographic features, based only on social network data. This kind of collective classification makes it possible to identify potential customers from the services used or products purchased by current customers (i.e., the classes they belong to) as well as from the social relationships between known and potential customers. As a result, a personalized offer can be prepared for new clients. This innovative marketing method can boost targeted marketing campaigns.
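The abstract does not say which collective classification procedure the chapter uses. As a rough illustration of classifying from network structure alone, the sketch below repeatedly assigns each unlabeled node the majority class among its already-labeled neighbours. The function name, the round limit, the alphabetical tie-breaking rule, and the sample data are all hypothetical.

```python
from collections import Counter


def collective_classify(adj, known, rounds=10):
    """Iterative neighbour-majority labelling.

    `adj` maps each node to a list of neighbours; `known` maps the
    already-classified customers to their class. Known labels are
    never overwritten. Ties are broken alphabetically (an arbitrary
    choice made only to keep the result deterministic).
    """
    labels = dict(known)
    for _ in range(rounds):
        updates = {}
        for v in adj:
            if v in known:
                continue
            votes = Counter(labels[u] for u in adj[v] if u in labels)
            if votes:
                top = max(votes.values())
                updates[v] = min(c for c, n in votes.items() if n == top)
        if all(labels.get(v) == c for v, c in updates.items()):
            break  # no label changed in this round
        labels.update(updates)
    return labels


# Hypothetical data: a, b and e are current customers with known classes;
# c and d are potential customers known only through their relationships.
friends = {
    "a": ["c"], "b": ["c"], "c": ["a", "b", "d"],
    "d": ["c", "e"], "e": ["d"],
}
current = {"a": "premium", "b": "premium", "e": "basic"}
predicted = collective_classify(friends, current)
print(predicted["c"], predicted["d"])
```

The predicted class of a potential customer could then drive which offer is prepared for them, which is the use case the abstract describes.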


Author(s):  
Anatoliy Gruzd

The chapter presents a new web-based system called ICTA (http://netlytic.org) for automated analysis and visualization of online conversations in virtual communities. ICTA is designed to help researchers and other interested parties derive wisdom from large datasets. The system does this by offering a set of text mining techniques coupled with useful visualizations. The first part of the chapter describes ICTA’s infrastructure and user interface. The second part discusses two social network discovery procedures used by ICTA with a particular focus on a novel content-based method called name networks. The main advantage of this method is that it can be used to transform even unstructured Internet data into social network data. With the social network data available it is much easier to analyze, and make judgments about, social connections in a virtual community.


2020 ◽  
Vol 32 (7) ◽  
pp. 1393-1404
Author(s):  
Wenhe Liu ◽  
Dong Gong ◽  
Mingkui Tan ◽  
Javen Qinfeng Shi ◽  
Yi Yang ◽  
...  

Author(s):  
Sanur Sharma ◽  
Vishal Bhatnagar

In recent times, there has been a tremendous increase in the number of social networking sites and their users. Given the amount of information posted on public forums, it has become essential for service providers to protect the privacy of individuals. Anonymization has gained popularity as a technique for securing social network data, but implementing it effectively is challenging. In this chapter, the authors present a conceptual framework for securing social network data effectively by using data mining techniques to perform in-depth social network analysis before carrying out the actual anonymization process. In the first step, the framework defines the role of community analysis in social networks and its various features and temporal metrics. In the next step, the authors propose applying data mining techniques that can deal with the dynamic nature of social networks and discover important attributes of the network. Finally, the authors map their security requirements to their findings on the network properties, which provides an appropriate basis for selecting and applying an anonymization technique to protect the privacy of social network data.

