A novel semi-supervised self-training method based on resampling for Twitter fake account identification

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ziming Zeng ◽  
Tingting Li ◽  
Shouqiang Sun ◽  
Jingjing Sun ◽  
Jie Yin

Purpose Twitter fake accounts are bot accounts created by third-party organizations to influence public opinion, spread commercial propaganda or impersonate others. Effective identification of bot accounts helps the public judge disseminated information accurately. However, in practice, manually labeling Twitter accounts is expensive and inefficient, and the labeled data are usually class-imbalanced. To this end, the authors propose a novel framework to solve these problems. Design/methodology/approach In the proposed framework, the authors introduce semi-supervised self-training and apply it to a real Twitter account data set from Kaggle. Specifically, the authors first train the classifier on the initial small amount of labeled account data, then use the trained classifier to automatically label large-scale unlabeled account data. Next, they iteratively select high-confidence instances from the unlabeled data to expand the labeled data. Finally, an expanded Twitter account training set is obtained. Notably, the resampling technique is integrated into the self-training process, and the classes are balanced at the initial stage of the self-training iteration. Findings The proposed framework effectively improves labeling efficiency and reduces the influence of class imbalance. It shows excellent identification results on six different base classifiers, especially when the initial set of labeled Twitter accounts is small. Originality/value This paper provides novel insights into identifying Twitter fake accounts. First, the authors take the lead in introducing a self-training method to automatically label Twitter accounts in a semi-supervised setting. Second, the resampling technique is integrated into the self-training process to effectively reduce the influence of class imbalance on identification performance.
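
The self-training loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the nearest-centroid base classifier, the distance-margin confidence score, the 0.8 threshold, random oversampling as the resampling step, and all function names are assumptions made for the example. The abstract's "balanced at the initial stage of the self-training iteration" is read here as rebalancing before each retraining step, which is also an assumption. The sketch expects exactly two classes.

```python
import math
import random


def group_by_class(X, y):
    """Group feature rows by their class label."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return groups


def oversample(X, y, rng):
    """Random oversampling: duplicate minority-class rows until classes are equal in size."""
    groups = group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    Xb, yb = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            Xb.append(row)
            yb.append(label)
    return Xb, yb


def fit(X, y):
    """Toy base classifier (hypothetical stand-in): one centroid per class."""
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in group_by_class(X, y).items()}


def predict(model, x):
    """Return (label, confidence); confidence is a distance margin in [0.5, 1]."""
    ranked = sorted((math.dist(x, c), label) for label, c in model.items())
    (d0, label), (d1, _) = ranked[0], ranked[1]
    conf = d1 / (d0 + d1) if d0 + d1 > 0 else 0.5
    return label, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_iter=10, seed=0):
    """Grow the labeled set with high-confidence pseudo-labels from the unlabeled pool."""
    rng = random.Random(seed)
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        model = fit(*oversample(X_lab, y_lab, rng))  # rebalance, then retrain
        remaining, added = [], 0
        for x in pool:
            label, conf = predict(model, x)
            if conf >= threshold:          # confident: move into the labeled set
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:                          # not confident: keep for a later round
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    return fit(*oversample(X_lab, y_lab, rng)), X_lab, y_lab


# Illustrative data only: label 0 = genuine, label 1 = fake.
labeled_X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0]]
labeled_y = [0, 0, 0, 1]
unlabeled_X = [[0.1, 0.1], [4.8, 5.2], [5.1, 4.9], [2.5, 2.5]]
model, X_out, y_out = self_train(labeled_X, labeled_y, unlabeled_X)
print(len(X_out), y_out)
```

A nearest-centroid classifier is only weakly sensitive to class imbalance, so the oversampling step matters far more with the kinds of base classifiers the abstract alludes to; the toy classifier is used here only to keep the sketch dependency-free.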

2020 ◽  
Vol 47 (3) ◽  
pp. 547-560 ◽  
Author(s):  
Darush Yazdanfar ◽  
Peter Öhman

Purpose The purpose of this study is to empirically investigate determinants of financial distress among small and medium-sized enterprises (SMEs) during the global financial crisis and post-crisis periods. Design/methodology/approach Several statistical methods, including multiple binary logistic regression, were used to analyse a longitudinal cross-sectional panel data set of 3,865 Swedish SMEs operating in five industries over the 2008–2015 period. Findings The results suggest that financial distress is influenced by macroeconomic conditions (i.e. the global financial crisis) and, in particular, by various firm-specific characteristics (i.e. performance, financial leverage and financial distress in previous year). However, firm size and industry affiliation have no significant relationship with financial distress. Research limitations Due to data availability, this study is limited to a sample of Swedish SMEs in five industries covering eight years. Further research could examine the generalizability of these findings by investigating other firms operating in other industries and other countries. Originality/value This study is the first to examine determinants of financial distress among SMEs operating in Sweden using data from a large-scale longitudinal cross-sectional database.
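
For orientation, the general form of a binary logistic regression of the kind named in this abstract is given below. The notation is an assumption made for illustration and is not the authors' specification: $D_{i,t}$ stands for a distress indicator for firm $i$ in year $t$, and $\mathbf{x}_{i,t}$ collects the predictors the abstract mentions (a crisis-period indicator, performance, financial leverage, prior-year distress $D_{i,t-1}$, firm size and industry affiliation).

```latex
\Pr(D_{i,t} = 1 \mid \mathbf{x}_{i,t})
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{i,t}\right)\right)},
\qquad
\ln\frac{\Pr(D_{i,t} = 1 \mid \mathbf{x}_{i,t})}{1 - \Pr(D_{i,t} = 1 \mid \mathbf{x}_{i,t})}
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{i,t}
```

Under this form, a coefficient that is statistically indistinguishable from zero (as reported for firm size and industry affiliation) means the corresponding predictor does not shift the log-odds of distress.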


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Lam Hoang Viet Le ◽  
Toan Luu Duc Huynh ◽  
Bryan S. Weber ◽  
Bao Khac Quoc Nguyen

Purpose This paper aims to identify the disproportionate impacts of the COVID-19 pandemic on labor markets. Design/methodology/approach The authors conduct a large-scale survey of 16,000 firms from 82 industries in Ho Chi Minh City, Vietnam, and analyze the data set by using different machine-learning methods. Findings First, job loss and reduction in state-owned enterprises have been significantly larger than in other types of organizations. Second, employees of foreign direct investment enterprises suffer a significantly lower labor income than those of other groups. Third, the adverse effects of the COVID-19 pandemic on the labor market are heterogeneous across industries and geographies. Finally, firms with high revenue in 2019 are more likely to adopt preventive measures, including the reduction of labor forces. The authors also find a significant correlation between firms' revenue and labor reduction, as both traditional econometrics and machine-learning techniques suggest. Originality/value This study has two main policy implications. First, although government support through taxes has been provided, the authors highlight evidence that there may be some additional benefit from targeting firms that have characteristics associated with layoffs or other negative labor responses. Second, the authors provide information that shows which firm characteristics are associated with particular labor market responses such as layoffs, which may help target stimulus packages. Although the COVID-19 pandemic affects most industries and occupations, heterogeneous firm responses suggest that there could be several varieties of targeted policies, targeting firms that are likely to reduce labor forces or firms likely to face reduced revenue. In this paper, the authors outline several industries and firm characteristics that appear to be more directly associated with reduced employee counts or other negative labor responses, which may lead to more cost-effective stimulus.


Significance Although large-scale social protest in Bahrain has been cowed over the ten years since the ‘Arab uprisings’, small-scale demonstrations recur, reflecting a base level of discontent. Mobilising issues include economic pressures, limited political representation (especially of the Shia majority) and, most recently, ties with Israel. Impacts Despite protests, Israel’s and Bahrain’s respective ambassadors will keep up high-profile activity and statements. The authorities are likely to exaggerate the role of Iranian interference in order to deepen the Sunni-Shia divide. If Riyadh manages to extricate itself from the Yemen war, that could partly reduce the pressure on Manama.


2017 ◽  
Vol 22 (6) ◽  
pp. 486-505 ◽  
Author(s):  
Benjamin Tukamuhabwa ◽  
Mark Stevenson ◽  
Jerry Busby

Purpose In the few prior empirical studies on supply chain resilience (SCRES), the focus has been on the developed world. Yet, organisations in developing countries constitute a significant part of global supply chains and have also experienced the disastrous effects of supply chain failures. The purpose of this paper is therefore to empirically investigate SCRES in a developing country context and to show that this also provides theoretical insights into the nature of what is meant by resilience. Design/methodology/approach Using a case study approach, a supply network of 20 manufacturing firms in Uganda is analysed based on a total of 45 interviews. Findings The perceived threats to SCRES in this context are mainly small-scale, chronic disruptive events rather than the discrete, large-scale catastrophic events typically emphasised in the literature. The data reveal how threats of disruption, resilience strategies and outcomes are inter-related in complex, coupled and non-linear ways. These interrelationships are explained by the political, cultural and territorial embeddedness of the supply network in a developing country. Further, this embeddedness contributes to the phenomenon of supply chain risk migration, whereby an attempt to mitigate one threat produces another threat and/or shifts the threat to another point in the supply network. Practical implications Managers should be aware, for example, of potential risk migration from one threat to another when crafting strategies to build SCRES. Equally, the potential for risk migration across the supply network means managers should look at the supply chain holistically because actors along the chain are so interconnected. Originality/value The paper goes beyond the extant literature by highlighting how SCRES is not only about responding to specific, isolated threats but about the continuous management of risk migration.
It demonstrates that resilience requires both an understanding of the interconnectedness of threats, strategies and outcomes and an understanding of the embeddedness of the supply network. Finally, this study’s focus on the context of a developing country reveals that resilience should be equally concerned both with smaller in scale, chronic disruptions and with occasional, large-scale catastrophic events.


2016 ◽  
Vol 40 (7) ◽  
pp. 867-881 ◽  
Author(s):  
Dingguo Yu ◽  
Nan Chen ◽  
Xu Ran

Purpose With the development and application of mobile internet access, social media represented by Weibo, WeChat, etc. has become the main channel for information release and sharing. High-impact users in social networks are key factors stimulating the large-scale propagation of information within social networks. User influence is usually related to the user’s attention rate, activity level, and message content. The paper aims to discuss these issues. Design/methodology/approach In this paper, the authors focus on Sina Weibo users. Centering on users’ behavior and interactive information, they formulate a weighted interactive information network model and then present a novel computational model for Weibo user influence that combines multiple indexes, such as the user’s attention rate, activity level and message content influence. The model incorporates the time dimension and, by calculating users’ attribute influence and interactive influence, comprehensively measures the influence of Sina Weibo users. Findings Compared with other models, the model reflects the dynamics and timeliness of user influence more accurately. Extensive experiments are conducted on a real-world data set, and the results validate the performance of the approach and demonstrate that it captures dynamics and timeliness effectively. Due to the similarity in platform architecture and user behavior between Sina Weibo and Twitter, the calculation model is also applicable to Twitter. Originality/value This paper presents a novel computational model for Weibo user influence that combines multiple indexes, such as the user’s attention rate, activity level and message content influence.


2018 ◽  
Vol 857 ◽  
pp. 907-936 ◽  
Author(s):  
A. Cimarelli ◽  
A. Leonforte ◽  
D. Angeli

The separating and reattaching flows and the wake of a finite rectangular plate are studied by means of direct numerical simulation data. The large amount of information provided by the numerical approach is exploited here to address the multi-scale features of the flow and to assess the self-sustaining mechanisms that form the basis of the main unsteadinesses of the flows. We first analyse the statistically dominant flow structures by means of three-dimensional spatial correlation functions. The developed flow is found to be statistically dominated by quasi-streamwise vortices and streamwise velocity streaks as a result of flow motions induced by hairpin-like structures. On the other hand, the reverse flow within the separated region is found to be characterized by spanwise vortices. We then study the spectral properties of the flow. Given the strongly inhomogeneous nature of the flow, the spectral analysis has been conducted along two selected streamtraces of the mean velocity field. This approach allows us to study the spectral evolution of the flow along its paths. Two well-separated characteristic scales are identified in the near-wall reverse flow and in the leading-edge shear layer. The first is recognized to represent trains of small-scale structures triggering the leading-edge shear layer, whereas the second is found to be related to a very large-scale phenomenon that embraces the entire flow field. A picture of the self-sustaining mechanisms of the flow is then derived. It is shown that very-large-scale fluctuations of the pressure field alternate between promoting and suppressing the reverse flow within the separation region. Driven by these large-scale dynamics, packages of small-scale motions trigger the leading-edge shear layers, which in turn created them, alternating in the top and bottom sides of the rectangular plate with a relatively long period of inversion, thus closing the self-sustaining cycle.


2015 ◽  
Vol 34 (9) ◽  
pp. 1094-1112 ◽  
Author(s):  
Darren Lee-Ross

Purpose – The purpose of this paper is to permit further understanding of entrepreneurial personality characteristics of need for achievement, locus of control, innovation, risk-taking and competitive aggression by comparing the self-employed with waged and salaried workers and the general population. Design/methodology/approach – A logistic regression equation was used on the “World Values Survey (WVS)” data set to test the relationship between entrepreneurship and personality characteristics by estimating the probability of an event occurring directly. Findings – This research replicated and extended the earlier work of Beugelsdijk and Noorderhaven (2005). Using two reference groups for comparison, entrepreneurs are different in terms of their psychological characteristics. Specifically, these are need for achievement and locus of control; these were the strongest characteristics. Competitive aggression and risk-taking were moderate in this respect with innovation finding least support. Research limitations/implications – In terms of limitations, the present study does not account for environmental enablers or mitigation of starting and sustaining businesses. Also how do the national media, society and culture regard entrepreneurship? Moreover, is there only one model of entrepreneurship or several? For example, amongst indigenous societies, entrepreneurship is more of a collective rather than an individual pursuit where culture and heritage preservation are more important than purely profit generation. Similarly, no account is taken of the differences (nuanced or otherwise) between entrepreneurial personality characteristics in factor vs opportunity/innovation-driven economies. 
Practical implications – The self-employed in this study were different to both comparison groups which is important information for government policy formation at all levels in terms of targeted business/career education, infrastructure, funding, opportunity creation and incubator programmes. Furthermore, rudimentary community and university diagnostics could be formulated around these entrepreneurial characteristics to identify potential entrepreneurs for a “career” of self-employment or placement within large firms as “intrapreneurs” to improve productivity and economic growth. Originality/value – This study is the first to use the WVS for scrutiny of entrepreneurial personality traits. It expands and augments earlier work in the field which used the smaller “European Values Survey” by including many more questions pertaining to entrepreneurial personality characteristics adding additional robustness to the outcomes.


2016 ◽  
Vol 36 (11/12) ◽  
pp. 774-791 ◽  
Author(s):  
Pavol Frič ◽  
Martin Vávra

Purpose The purpose of this paper is to answer the following question: what is the relationship between member activism performed through civil society organizations (CSOs) and individualized freelance activism (in the form of online activism, everyday making, political consumerism or checkbook activism) independent of an organizational framework? Is it a relationship of mutual competition or support? Design/methodology/approach The analysis is carried out on data from a 2009 questionnaire-based survey on volunteering, representative of the adult Czech population. The data set allowed the authors to relate member activism to freelance activism and, in the case of member activism, to distinguish the type of organization and the level of its professionalization. Findings The dominant pattern the authors identified in the data is mutual support of both types of volunteering, documented by a significant overlap of these forms of public engagement. The overlap is most striking for active members of new advocacy NGOs and weakest for traditional clubs. Regression analysis shows that on an individual level “mixed activism” (compared with “pure freelance activism”) is linked with higher education and higher confidence in civic organizations. Originality/value The civil practice of individualized freelance activism has been described and analysed by various authors as an activity of specific types of activist, but no research has yet examined as wide a range of freelance activism types as this analysis. The authors contrast them with the member (collective, organized) form of civic activism and also take into account the influence of professionalization and type of CSOs.


2018 ◽  
Vol 78 (5) ◽  
pp. 592-610 ◽  
Author(s):  
Abbas Ali Chandio ◽  
Yuansheng Jiang ◽  
Feng Wei ◽  
Xu Guangshun

Purpose The purpose of this paper is to evaluate the impact of short-term loan (STL) vs long-term loan (LTL) on wheat productivity of small farms in Sindh, Pakistan. Design/methodology/approach The econometric estimation is based on cross-sectional data collected in 2016 from 18 villages in three districts, i.e. Shikarpur, Sukkur and Shaheed Benazirabad, Sindh, Pakistan. The sample data set consists of 180 wheat farmers. The collected data were analyzed through different econometric techniques such as the Cobb–Douglas production function and the instrumental variables (two-stage least squares) approach. Findings This study reconfirmed that agricultural credit has a positive and highly significant effect on wheat productivity, while the short-term loan has a stronger effect on wheat productivity than the long-term loan. The reason behind this may be the significantly higher usage of agricultural inputs, such as seeds of improved variety and fertilizers, which can be transformed into wheat yield in the same year. However, the LTL users have significantly higher investments in land preparation, irrigation and plant protection, which may lead to higher wheat production in the coming years. Research limitations/implications In the present study, only those wheat farmers were considered who obtained agricultural loans from formal financial institutions like Zarai Taraqiati Bank Limited and Khushhali Bank. However, in the rural areas of Sindh, Pakistan, a considerable proportion of small-scale farmers take credit from informal financial channels. Therefore, future researchers should consider informal credit as well. Originality/value This is the first paper to examine the effects of agricultural credit on wheat productivity of small farms in Sindh, Pakistan. This paper will be an important addition to the emerging literature on the effects of credit.
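
As a reminder of the functional form named in this abstract, a Cobb–Douglas production function and its log-linear estimating equation can be written as below. The symbols are assumptions chosen for illustration, since the abstract does not give the specification: $Y_i$ is the wheat output of farm $i$, $X_{ij}$ are its inputs (which could include the credit amount), and $\varepsilon_i$ is an error term.

```latex
Y_i = A \prod_{j} X_{ij}^{\beta_j} \, e^{\varepsilon_i}
\quad\Longrightarrow\quad
\ln Y_i = \beta_0 + \sum_{j} \beta_j \ln X_{ij} + \varepsilon_i,
\qquad \beta_0 = \ln A
```

Each $\beta_j$ is then the output elasticity of input $j$. In a two-stage least squares setting, a regressor suspected of endogeneity (plausibly credit here) is first regressed on instruments, and its fitted values replace it in the equation above; which instruments the authors used is not stated in the abstract.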


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose Laser-based measurement techniques offer various advantages over conventional measurement techniques, such as being non-destructive, non-contact and fast, and supporting long measuring distances. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors to accomplish automatic measurement. To solve this problem, this paper aims to propose a novel method. Design/methodology/approach We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors, realizing automatic recognition in different backgrounds. Compared with two other deep learning algorithms, experiments show that the recognition rate of Mask RCNN is better, especially for small-scale targets. Based on this, an ellipse detection algorithm is introduced to obtain the ellipses of retroreflectors from recognized target areas. The center coordinates of retroreflectors in the camera coordinate system are obtained by using a mathematical method. Findings To verify the accuracy of this method, an experiment was carried out: the distance between two retroreflectors with a known distance of 1,000.109 mm was measured, with 2.596 mm root-mean-square error, meeting the requirements of the coarse location of retroreflectors. Research limitations/implications The research limitations/implications are as follows: (i) As the data set only has 200 pictures, although we have used some data augmentation methods such as rotating, mirroring and cropping, there is still room for improvement in the generalization ability of detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light. 
Originality/value The originality/value of the article lies in being able to obtain center coordinates of multiple retroreflectors automatically even in a cluttered background; being able to recognize retroreflectors with different sizes, especially for small targets; meeting the recognition requirement of multiple targets in a large field of view and obtaining 3D centers of targets by monocular model-based vision.
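
The accuracy check reported in this abstract (a root-mean-square error against a known separation of 1,000.109 mm) can be illustrated with a short sketch. The function names and the sample coordinates below are hypothetical and are not taken from the paper; only the reference distance comes from the abstract.

```python
import math


def center_distance(p, q):
    """Euclidean distance between two 3D center coordinates (same units as the inputs)."""
    return math.dist(p, q)


def rmse(measured, reference):
    """Root-mean-square error of repeated measurements against one known reference value."""
    return math.sqrt(sum((m - reference) ** 2 for m in measured) / len(measured))


REFERENCE_MM = 1000.109  # known separation stated in the abstract

# Hypothetical pairs of estimated retroreflector centers in the camera frame (mm).
center_pairs = [
    ((0.0, 0.0, 2000.0), (1001.5, 0.0, 2000.0)),
    ((10.0, 5.0, 2100.0), (1008.0, 5.0, 2100.0)),
    ((-3.0, 2.0, 1950.0), (999.0, 2.0, 1950.0)),
]
distances = [center_distance(p, q) for p, q in center_pairs]
print(distances, rmse(distances, REFERENCE_MM))
```

Comparing against a single calibrated distance, as sketched here, measures the error of the recovered separation rather than of each center individually, which fits the abstract's framing of the method as a coarse localization step.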

