Collaborative matrix factorization mechanism for group recommendation in big data-based library systems

2018 ◽  
Vol 36 (3) ◽  
pp. 458-481 ◽  
Author(s):  
Yezheng Liu ◽  
Lu Yang ◽  
Jianshan Sun ◽  
Yuanchun Jiang ◽  
Jinkun Wang

Purpose Academic groups are designed specifically for researchers. A group recommendation procedure is essential to support scholars’ research-based social activities. However, group recommendation methods are rarely applied in online libraries, and they often suffer from scalability problems in big data contexts. The purpose of this paper is to facilitate academic group activities in big data-based library systems by recommending satisfactory articles to academic groups. Design/methodology/approach The authors propose a collaborative matrix factorization (CoMF) mechanism and implement a parallelized CoMF under the Hadoop framework. Its rationale is to collaboratively decompose the researcher-article interaction matrix and the group-article interaction matrix. Furthermore, three extended models of CoMF are proposed. Findings Empirical studies on a CiteULike data set demonstrate that CoMF and its three variants outperform baseline algorithms in terms of accuracy and robustness. The scalability evaluation of parallelized CoMF shows its potential value in scholarly big data environments. Research limitations/implications The proposed methods fill the gap in group-article recommendation in the online library domain. They enrich group recommendation methods by considering the interaction effects between groups and members, and they are the first attempt to implement group recommendation methods in big data contexts. Practical implications The proposed methods can improve group activity effectiveness and information shareability in academic groups, which benefits membership retention and enhances the service quality of online library systems. Furthermore, the proposed methods are applicable to big data contexts and make library system services more efficient. Social implications The proposed methods have potential value for improving scientific collaboration and research innovation.
Originality/value The proposed CoMF method is a novel group recommendation method based on the collaborative decomposition of the researcher-article matrix and the group-article matrix. The process indirectly reflects the interaction between groups and members, which accords with actual library environments and provides interpretable recommendation results.
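The joint decomposition behind CoMF can be sketched in a few lines. The toy example below is illustrative only: the matrices, dimensions, learning rate and regularization are invented, and the update rules are a generic alternating gradient scheme rather than the authors' exact algorithm. What it shows is the key idea: the researcher-article matrix R and the group-article matrix P share a single article factor matrix V, so each decomposition informs the other.

```python
import numpy as np

# Toy collaborative matrix factorization: R (researchers x articles) and
# P (groups x articles) are decomposed jointly through a shared article
# factor matrix V. All sizes and hyperparameters here are invented.
rng = np.random.default_rng(0)
n_users, n_groups, n_items, k = 6, 3, 8, 4

R = (rng.random((n_users, n_items)) > 0.6).astype(float)   # researcher-article interactions
P = (rng.random((n_groups, n_items)) > 0.6).astype(float)  # group-article interactions

U = rng.normal(scale=0.1, size=(n_users, k))   # researcher factors
G = rng.normal(scale=0.1, size=(n_groups, k))  # group factors
V = rng.normal(scale=0.1, size=(n_items, k))   # shared article factors

lr, reg = 0.05, 0.01
for _ in range(200):
    Eu = R - U @ V.T            # residual of the researcher decomposition
    Eg = P - G @ V.T            # residual of the group decomposition
    U += lr * (Eu @ V - reg * U)
    G += lr * (Eg @ V - reg * G)
    V += lr * (Eu.T @ U + Eg.T @ G - reg * V)  # V is pulled by both residuals

group_scores = G @ V.T          # predicted group-article affinities
print(group_scores.shape)       # (3, 8)
```

In a real system the rows of `group_scores` would be ranked to recommend articles to each group; the paper's parallelized variant distributes updates of this kind under Hadoop.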

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Federica De Santis ◽  
Giuseppe D’Onza

Purpose This study aims to analyze the utilization of big data and data analytics (BDA) in financial auditing, focusing on the process of producing legitimacy around these techniques, the factors fostering or hindering that process and the actions auditors take to legitimate BDA inside and outside the audit community. Design/methodology/approach The analysis is based on semi-structured interviews with partners and senior managers of Italian audit companies. Findings The BDA legitimation process is more advanced in the audit professional environment than outside the audit community. The Big Four lead the BDA-driven audit innovation process, and BDA is used to complement traditional audit procedures. Outside the audit community, the limited digital maturity of audit clients, the lack of audit standards and the audit oversight authority’s negative view prevent the full legitimation of BDA. Practical implications This research highlights factors influencing the utilization of BDA to enhance audit quality. The results can thus be used to enhance audit strategy and to innovate audit practices by using BDA as a source of adequate audit evidence. Audit regulators and standard setters can also use the results to revise current auditing standards and guidance. Originality/value This study adds to the literature on digital transformation in auditing by analyzing the legitimation process of a new audit technique. The paper answers the call for more empirical studies on the utilization of BDA in financial auditing by analyzing the application of such techniques in an unexplored operational setting in which auditees are mainly medium-sized enterprises and family-run businesses.


2019 ◽  
Vol 33 (4) ◽  
pp. 369-379 ◽  
Author(s):  
Xia Liu

Purpose Social bots are prevalent on social media. Malicious bots can severely distort the true voices of customers. This paper aims to examine social bots in the context of big data of user-generated content. In particular, the author investigates the scope of information distortion for 24 brands across seven industries. Furthermore, the author studies the mechanisms that make social bots viral. Last, approaches to detecting and preventing malicious bots are recommended. Design/methodology/approach A Twitter data set of 29 million tweets was collected. Latent Dirichlet allocation and word clouds were used to visualize the unstructured big data of textual content. Sentiment analysis was used to automatically classify the 29 million tweets. A fixed-effects model was run on the final panel data. Findings The findings demonstrate that social bots significantly distort brand-related information across all industries and among all brands under study. Moreover, Twitter social bots are significantly more effective at spreading word of mouth. In addition, social bots use volume and emotion as their major mechanisms to influence and manipulate the spread of information about brands. Finally, the bot detection approaches are effective at identifying bots. Research limitations/implications As brand companies use social networks to monitor brand reputation and engage customers, it is critical for them to distinguish true consumer opinions from the fake ones artificially created by social bots. Originality/value This is the first big data examination of social bots in the context of brand-related user-generated content.
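Lexicon-based scoring is one common way to classify sentiment automatically at this scale. The sketch below is a deliberately minimal stand-in, with a tiny invented lexicon and no handling of negation or punctuation; it is not the classifier used in the study, only an illustration of the technique's shape.

```python
# Minimal lexicon-based sentiment scoring: count positive and negative
# words and take the sign of the difference. The lexicon is invented
# and far smaller than anything used in practice.
POSITIVE = {"love", "great", "amazing", "best"}
NEGATIVE = {"hate", "terrible", "worst", "broken"}

def sentiment(tweet: str) -> str:
    words = tweet.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this brand, best phone ever"))  # positive
print(sentiment("worst service, totally broken"))       # negative
```

Production systems replace the word lists with large curated lexicons or trained models, but the classify-then-aggregate pipeline over millions of tweets follows this pattern.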


2019 ◽  
Vol 46 (7) ◽  
pp. 1319-1331 ◽  
Author(s):  
Simplice Asongu ◽  
Nicholas M. Odhiambo

Purpose The purpose of this paper is to examine the relationship between tourism and social media from a cross-section of 138 countries with data for the year 2012. Design/methodology/approach The empirical evidence is based on Ordinary Least Squares, Negative Binomial and Quantile Regressions. Findings Two main findings are established. First, there is a positive relationship between Facebook penetration and the number of tourist arrivals. Second, Facebook penetration is more relevant in promoting tourist arrivals in countries where initial levels of tourist arrivals are highest and lowest. The established positive relationship can be elucidated from four principal angles: the transformation of travel research, the rise in social sharing, improvements in customer service and the reshaping of travel agencies. Originality/value This study explores a new data set on social media. There are very few empirical studies on the relevance of social media in development outcomes.
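Quantile regression, which lets the study compare the Facebook-tourism link at the high and low ends of the arrivals distribution, differs from OLS in minimizing the pinball (check) loss rather than squared error. A minimal illustration with invented data: minimizing the pinball loss over a constant recovers the corresponding sample quantile.

```python
import numpy as np

# Pinball (check) loss: tau-weighted absolute error. Minimizing it over
# a constant predictor yields the tau-th sample quantile, which is the
# idea quantile regression generalizes to covariates.
def pinball(y, pred, tau):
    e = y - pred
    return np.mean(np.maximum(tau * e, (tau - 1) * e))

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])

# Scan candidate constants; the minimizer approximates the 0.9 quantile.
grid = np.linspace(0.0, 10.0, 1001)
best = grid[np.argmin([pinball(y, c, 0.9) for c in grid])]
print(round(float(best), 1))  # 9.0
```

With covariates, libraries such as statsmodels minimize this same loss over a linear model, one fit per quantile, which is how effects at the top and bottom of the distribution are estimated separately.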


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Jiali Zheng ◽  
Han Qiao ◽  
Xiumei Zhu ◽  
Shouyang Wang

Purpose This study aims to explore the role of equity investment in knowledge-driven business model innovation (BMI) in the context of open modes, according to evidence from China’s primary market. Design/methodology/approach Based on a database of China’s private market and a data set of news clouds, a statistical approach is applied to explore and explain whether equity investment promotes knowledge-driven BMI. A machine learning method is also used to test and predict the performance of such open innovation. Findings The results of logistic regression show that the explanatory variables are significant, providing evidence that knowledge management (KM) promotes BMI through equity investment. By further using a back-propagation neural network, the classification learning algorithm estimates the probability of BMI, which can be regarded as a score quantifying the performance of knowledge-driven BMI. Research limitations/implications The quality of secondhand big data is not ideal, and future empirical studies should use first-hand survey data. Practical implications This study provides new insights into the link between KM and BMI by highlighting the important roles of external investments in open modes. Social implications From the perspective of investment, the findings of this study suggest the importance for stakeholders of sharing knowledge and strategies for entrepreneurs to manage innovation. Originality/value The concepts and indicators related to business models are currently difficult to quantify, while this study provides feasible and practical methods to estimate knowledge-driven BMI with secondhand data from the primary market. The mechanism by which knowledge and innovation are bridged by the experience of investors is introduced and analyzed.
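The logistic-regression step can be sketched as follows. Everything below is synthetic: the three features, the coefficients and the training loop are invented for illustration and are not the study's model or data.

```python
import numpy as np

# Toy logistic regression fit by batch gradient descent on the log-loss.
# Features and labels are synthetic stand-ins for investment-related
# predictors of a binary BMI outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))            # hypothetical standardized predictors
true_w = np.array([1.5, -2.0, 0.5])      # invented "true" effect sizes
y = (X @ true_w + rng.normal(scale=0.3, size=200) > 0).astype(float)

w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
    w += 0.1 * X.T @ (y - p) / len(y)    # gradient step on mean log-loss

pred = (1.0 / (1.0 + np.exp(-(X @ w))) > 0.5).astype(float)
acc = float((pred == y).mean())
print(round(acc, 2))                     # training accuracy
```

The fitted probabilities play the same role as the paper's BMI score: a continuous estimate of how likely an observation is to be a positive case. A back-propagation network extends this by stacking such units in layers.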


Author(s):  
Nick Kelly ◽  
Maximiliano Montenegro ◽  
Carlos Gonzalez ◽  
Paula Clasing ◽  
Augusto Sandoval ◽  
...  

Purpose The purpose of this paper is to demonstrate the utility of combining event-centred and variable-centred approaches when analysing big data for higher education institutions. It uses a large, university-wide data set to demonstrate the methodology for this analysis through a case study. It presents empirical findings about relationships between student behaviours in a learning management system (LMS) and student learning outcomes, and further explores these findings using process modelling techniques. Design/methodology/approach The paper describes a two-year study in a Chilean university, using big data from an LMS and from the central university database of student results and demographics. Descriptive statistics of LMS use in different years present an overall picture of student use of the system. Process mining is described as an event-centred approach that gives a deeper level of understanding of these findings. Findings The study found evidence to support the idea that instructors do not strongly influence student use of an LMS. It replicates existing studies to show that higher-performing students use an LMS differently from lower-performing students. It shows the value of combining variable- and event-centred approaches to learning analytics. Research limitations/implications The study is limited by its institutional context, its two-year time frame and its exploratory mode of investigation as a case study. Practical implications The paper is useful for institutions developing a methodology for using big data from an LMS with event-centred approaches. Originality/value The paper is valuable in replicating and extending recent studies that use event-centred approaches to the analysis of learning data. The study is on a larger scale than existing studies (using a university-wide data set) and in a novel context (Latin America), and it provides a clear description of how and why the methodology should inform institutional approaches.
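The event-centred (process mining) side of such a methodology starts from an event log of student actions. One minimal building block is the directly-follows graph, which counts how often one LMS action immediately follows another within a session; the log below is invented for illustration and is not the study's data.

```python
from collections import Counter

# Each trace is one hypothetical student session in the LMS.
log = [
    ["login", "view_forum", "download", "logout"],
    ["login", "download", "download", "logout"],
    ["login", "view_forum", "logout"],
]

# Directly-follows graph: count each adjacent pair of events across traces.
dfg = Counter()
for trace in log:
    for a, b in zip(trace, trace[1:]):
        dfg[(a, b)] += 1

print(dfg[("login", "view_forum")])  # 2
```

Process-mining tools build their discovered models (e.g. via the heuristics or inductive miner) on top of exactly these pair counts, which is what allows behaviour patterns, rather than aggregate variables alone, to be compared across student groups.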


2017 ◽  
Vol 1 (2) ◽  
pp. 105-126 ◽  
Author(s):  
Xiu Susie Fang ◽  
Quan Z. Sheng ◽  
Xianzhi Wang ◽  
Anne H.H. Ngu ◽  
Yihong Zhang

Purpose This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase. Design/methodology/approach In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query streams, to augment the ontology of an existing KB (i.e. Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed. Findings Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension have also been discussed. Originality/value To revolutionize our modern society by using the wisdom of Big Data, considerable KBs have been constructed to feed massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research efforts have been devoted to both problems. However, the existing KBs are far from comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume that each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.
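Truth discovery generally iterates between estimating source trustworthiness and claim confidence. The sketch below is a simplified mutual-reinforcement loop in that spirit, not the paper's graph-based multi-valued algorithm; the sources and claimed values are invented.

```python
# Simplified truth discovery: a value is more credible when trusted
# sources assert it, and a source is more trusted when it asserts
# credible values. Sources and claims here are invented.
claims = {  # source -> values it asserts for one multi-valued predicate
    "src_a": {"python", "java"},
    "src_b": {"python"},
    "src_c": {"python", "cobol"},
}

trust = {s: 0.5 for s in claims}          # uniform initial trust
for _ in range(10):
    values = {v for vs in claims.values() for v in vs}
    # Claim confidence: total trust of the sources asserting the value.
    conf = {v: sum(t for s, t in trust.items() if v in claims[s]) for v in values}
    # Source trust: average confidence of the values it asserts.
    trust = {s: sum(conf[v] for v in vs) / len(vs) for s, vs in claims.items()}
    top = max(trust.values())
    trust = {s: t / top for s, t in trust.items()}  # normalize to avoid blow-up

print(max(conf, key=conf.get))  # python
```

The value asserted by all three sources ends up with the highest confidence. Graph-based variants, like the one the paper proposes, additionally model relations among values so that several values of a multi-valued predicate can be accepted as true together.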


2014 ◽  
Vol 5 (2) ◽  
pp. 209-232
Author(s):  
Olawale Oladipo Adejuwon

Purpose – In order to achieve a desirable level of market efficiency, regulators need to identify the strategic groups within an industry and understand the way the constituent groups relate to one another. The paper aims to discuss these issues. Design/methodology/approach – In the current study, factors that may lead to strategic group formation were developed and used as clustering variables in a k-means cluster statistical analysis to categorize the firms into strategic groups. The factors used are entry costs, timing of entry, technology type and scope of operations. In addition, the number and type of competitive actions employed by the firms in the industry were identified by structured content analysis of a public source. The competitive actions were used to examine the dynamics of the resulting groups within the context of competitive behavior, resource and scope commitments and corporate social responsibility (CSR) actions. In addition, χ² analysis was employed to ascertain the likelihood that actions of a firm will be responded to by firms from the same group or from outside the group. Findings – License fees were found to be the most significant clustering variable. The study also showed that groups with significantly higher license fees carried out considerably more competitive actions, had higher resource and scope commitments and executed more CSR actions. In addition, the study revealed significantly more competition within strategic groups than between groups. Research limitations/implications – The absence of financial records for firms in the sample necessitated the use of CSR activity as a measure of firm performance. Some empirical studies have shown strong links between CSR and firm performance. Practical implications – The study revealed high mobility barriers which prevent ease of movement of firms in the industry from one strategic group to the other.
Therefore, regulators who wish to promote competition must do so by identifying the strategic groups with significant market power and permitting entry, not by lowering entry barriers, but by allowing the entry of firms with proven resources similar to those of the firms in those groups and by stipulating similar commitments in entry conditions. The results also offer management practitioners insight into competitive behavior in the industry. Originality/value – The study utilized a unique data set (competitive actions of firms in the Nigerian telecommunications industry as reported in the media), contributing to empirical studies on competitive dynamics and to the strategic group literature.
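The grouping step is standard k-means over the four clustering variables (entry costs, timing of entry, technology type, scope of operations). The sketch below runs a plain Lloyd's-algorithm loop on two invented, well-separated clusters of firms; it illustrates the technique, not the study's data.

```python
import numpy as np

# Ten hypothetical firms described by four standardized clustering
# variables; the two blobs stand in for low- and high-entry-cost groups.
rng = np.random.default_rng(2)
firms = np.vstack([rng.normal(0.0, 0.3, (5, 4)),
                   rng.normal(3.0, 0.3, (5, 4))])

k = 2
centers = firms[[0, -1]].copy()          # seed with two far-apart firms
labels = np.zeros(len(firms), dtype=int)
for _ in range(20):                      # Lloyd's algorithm
    # Assign each firm to its nearest center, then recompute centers.
    dists = ((firms[:, None, :] - centers[None]) ** 2).sum(-1)
    labels = np.argmin(dists, axis=1)
    centers = np.array([firms[labels == j].mean(axis=0) for j in range(k)])

print(sorted(np.bincount(labels).tolist()))  # [5, 5]
```

With real data, k and the initialization matter much more than in this toy case; statistical packages typically rerun k-means from many random seeds and pick the lowest-inertia solution before interpreting the clusters as strategic groups.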


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Chi Kwok ◽  
Ngai Keung Chan

Purpose This study aims to develop an interdisciplinary political theory of data justice by connecting three major political theories of the public good with empirical studies about the functions of big data and offering normative principles for restricting and guiding the state’s data practices from a public good perspective. Design/methodology/approach Drawing on three major political theories of the public good – the market failure approach, the basic rights approach and the democratic approach – and critical data studies, this study synthesizes existing studies on the promises and perils of big data for public good purposes. The outcome is a conceptual paper that maps philosophical discussions about the conditions under which the state has a legitimate right to collect and use big data for public goods purposes. Findings This study argues that market failure, basic rights protection and deepening democracy can be normative grounds for justifying the state’s right to data collection and utilization, from the perspective of political theories of the public good. The state’s data practices, however, should be guided by three political principles, namely, the principle of transparency and accountability; the principle of fairness; and the principle of democratic legitimacy. The paper draws on empirical studies and practical examples to explicate these principles. Originality/value Bringing together normative political theory and critical data studies, this study contributes to a more philosophically rigorous understanding of how and why big data should be used for public good purposes while discussing the normative boundaries of such data practices.


2019 ◽  
Vol 53 (2) ◽  
pp. 217-229 ◽  
Author(s):  
Xiaomei Wei ◽  
Yaliang Zhang ◽  
Yu Huang ◽  
Yaping Fang

Purpose The traditional drug development process is costly, time-consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic and phenotypic data and all kinds of “omics” data brings opportunities for developing new computational drug repositioning methods based on big data. The paper aims to discuss this issue. Design/methodology/approach Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources toward drug repositioning. First, the network embedding (NE) algorithm is adopted to learn the latent feature representation of drugs from multiple biomedical resources. Furthermore, on the basis of the latent vectors of drugs from the NE module, a binary support vector machine classifier is trained to divide unknown drug–disease pairs into positive and negative instances. Finally, this model is validated on a well-established drug–disease association data set with tenfold cross-validation. Findings This model obtains an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to those of similar systems. The authors also analyze the performance of the model and validate its effect on predicting new indications of old drugs. Originality/value This study shows that the authors’ method is predictive, identifying novel drug–disease interactions for drug discovery. The new feature learning methods also contribute positively to heterogeneous data integration.
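The pipeline shape, learning low-dimensional drug representations and then scoring drug-disease pairs, can be sketched as follows. Truncated SVD stands in for the paper's network-embedding step and cosine similarity stands in for the trained SVM; the drug-feature matrix and all dimensions are invented.

```python
import numpy as np

# Hypothetical binary drug-feature matrix (e.g. drug-target or
# drug-side-effect indicators); values are random stand-ins.
rng = np.random.default_rng(3)
drug_features = (rng.random((10, 20)) > 0.5).astype(float)

# Truncated SVD as a simple embedding: keep the top-4 latent dimensions.
U, S, Vt = np.linalg.svd(drug_features, full_matrices=False)
embedding = U[:, :4] * S[:4]             # one 4-d latent vector per drug

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score a candidate pair by similarity to a drug with a known indication.
score = cosine(embedding[0], embedding[1])
print(embedding.shape)  # (10, 4)
```

In the paper's setting the latent vectors come from a network-embedding model over heterogeneous biomedical graphs and feed an SVM rather than a similarity score, but the two-stage structure (embed, then classify pairs) is the same.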


2019 ◽  
Vol 57 (8) ◽  
pp. 1980-1992 ◽  
Author(s):  
Gabriele Santoro ◽  
Fabio Fiano ◽  
Bernardo Bertoldi ◽  
Francesco Ciampi

Purpose The purpose of this paper is to shed light on how big data deployment transforms organizational practices, thereby generating potential benefits, in a specific industry: retail. Design/methodology/approach To achieve the paper’s goal, the authors have conducted several semi-structured interviews with marketing managers of four retailers in Italy, and researched secondary data to get a broader picture of big data deployment in the organizations. Findings Data analysis helped identify specific aspects related to big data deployment, data gathering methods, required competences and data sharing approaches. Originality/value Despite the growing interest in big data in various fields of research, there are still few empirical studies on big data deployment in organizations in the management field, and even fewer on specific sectors. This research provides evidence of specific areas of analysis concerning big data in the retail industry.

