An interactive query-based approach for summarizing scientific documents

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Farnoush Bayatmakou ◽  
Azadeh Mohebi ◽  
Abbas Ahmadi

Purpose Query-based summarization approaches might not be able to provide summaries compatible with the user’s information need, as they mostly rely on a limited source of information, usually represented as a single query by the user. This issue becomes even more challenging when dealing with scientific documents, as they contain more specific subject-related terms, while the user may not be able to express his/her specific information need in a query with limited terms. This study aims to propose an interactive multi-document text summarization approach that generates a summary more compatible with the user’s information need. This approach allows the user to interactively specify the composition of a multi-document summary. Design/methodology/approach This approach exploits the user’s opinion in two stages. The initial query is refined by user-selected keywords/keyphrases and complete sentences extracted from the set of retrieved documents. It is followed by a novel method for sentence expansion using a genetic algorithm, and by ranking the final set of sentences using the maximal marginal relevance method. For implementation, the Web of Science data set in the artificial intelligence (AI) category is used. Findings The proposed approach receives feedback from the user in terms of favorable keywords and sentences. This feedback ultimately improves the final summary. To assess the performance of the proposed system, 45 users who were graduate students in the field of AI were asked to fill out a questionnaire. The quality of the final summary was also evaluated in terms of the user’s perspective and information redundancy. The results show that the proposed approach leads to higher degrees of user satisfaction compared to approaches with no interaction or only one step of interaction. Originality/value The interactive summarization approach goes beyond the user’s initial query by including the user’s preferred keywords/keyphrases and sentences through a systematic interaction. Through these interactions, the system gives the user a clearer idea of the information he/she is looking for and consequently adjusts the final result to the ultimate information need. Such interaction allows the summarization system to achieve a comprehensive understanding of the user’s information needs while expanding context-based knowledge and guiding the user through his/her information journey.
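
The abstract above names maximal marginal relevance (MMR) as the method for ranking the final sentences. As a rough illustration only, the sketch below shows the standard greedy MMR selection over sentence vectors. The vector representation, the cosine similarity measure, the function names and the trade-off parameter `lam` are all assumptions made for this example; the abstract does not say how the authors configured the method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rank(query_vec, sentence_vecs, k, lam=0.7):
    """Greedily pick up to k sentence indices by maximal marginal relevance.

    lam (hypothetical default) weights relevance to the query against
    redundancy with the sentences that were already selected.
    """
    selected = []
    candidates = list(range(len(sentence_vecs)))
    while candidates and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in candidates:
            relevance = cosine(sentence_vecs[i], query_vec)
            # Redundancy is the highest similarity to any already chosen sentence.
            redundancy = max(
                (cosine(sentence_vecs[i], sentence_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam` close to 1 the ranking reduces to plain relevance ordering; lower values increasingly penalize sentences that repeat what has already been selected, which is the redundancy aspect the evaluation above refers to.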

2015 ◽  
Vol 67 (1) ◽  
pp. 2-26 ◽  
Author(s):  
Joachim Griesbaum ◽  
Nadine Mahrholz ◽  
Kim von Löwe Kiedrowski ◽  
Marc Rittberger

Purpose – The purpose of this paper is to get a first approximation of the usefulness of online forums with regard to information seeking and knowledge generation. Design/methodology/approach – This study captures the characteristics of knowledge generation by examining the pragmatics and types of information needs of posted questions and by investigating knowledge related characteristics of discussion posts as well as the success of communication. Three online forums were examined. The data set consists of 55 threads, containing 533 posts which were categorized manually by two researchers. Findings – Results show that questioners often ask for personal estimations. Information needs often aim for actionable insights or uncertainty reduction. With regard to answers, factual information is the dominant content type and has the highest knowledge value as it is the strongest predictor with regard to the generation of new knowledge. Opinions are also relevant, but in a rather subsequent and complementary way. Emotional aspects are scarcely observed. Overall, results indicate that knowledge creation predominantly follows a socio-cultural paradigm of knowledge exchange. Research limitations/implications – Although the investigation captures important aspects of knowledge building processes, the measurement of the forums’ knowledge value is still rather limited. Success is only partly measurable with the current scheme. The central coding category “new topical knowledge” is only of nominal value and therefore not able to compare different kinds of knowledge gains in the course of discussion. Originality/value – The investigation reaches out beyond studies that do not consider that the role and relevance of posts is dependent on the state of the discussion. Furthermore, the paper integrates two perspectives of knowledge value: the success of the questioner with regard to the expressed information need and the knowledge building value for communicants and readers.


2019 ◽  
Vol 31 (4) ◽  
pp. 478-495 ◽  
Author(s):  
Jan van Helden ◽  
Christoph Reichard

Purpose The purpose of this paper is to dismantle the complex issue of “use of accounting information (AI)” by pointing to different groups of information users, diverging interests and needs of these user groups and various influential factors on the usability and the actual use of AI. Design/methodology/approach This paper includes a literature review and conceptual reflections. Findings The review of recently published articles on the issue of “use of accounting information” presents a current picture of the academic debate on purposes of use, user types, needs of various user groups and factors influencing the usability and the actual use of AI. The subsequent conceptual reflections deal with user groups that have so far received less attention, with options to strengthen the user perspective in budgeting and financial reporting, with approaches for engaging users in the content of accounting documents, and with interrelations between user needs, usability and use intensity, including various antecedents of the different variables of the information-use issue. Research limitations/implications This paper presents promising routes for future research. Practical implications The paper emphasizes the importance of paying more attention to the specific information needs and the motivations of various stakeholder groups generally interested in using financial information. Originality/value The paper presents results of reviewing recent literature on the issue of “use of accounting information” and provides some insight into specific aspects of this issue.


2020 ◽  
Vol 38 (4) ◽  
pp. 821-842
Author(s):  
Haihua Chen ◽  
Yunhan Yang ◽  
Wei Lu ◽  
Jiangping Chen

Purpose Citation contexts have been found useful in many scenarios. However, existing context-based recommendations ignore the importance of diversity in reducing redundancy and thus cannot cover the broad range of user interests. To address this gap, the paper aims to propose a novel task that can recommend a set of diverse citation contexts extracted from a list of citing articles. This will assist users in understanding how other scholars have cited an article and deciding which articles they should cite in their own writing. Design/methodology/approach This research combines three semantic distance algorithms and three diversification re-ranking algorithms for the diversifying recommendation based on the CiteSeerX data set and then evaluates the generated citation context lists by applying a user case study on 30 articles. Findings Results show that a diversification strategy that combined “word2vec” and “Integer Linear Programming” leads to a better reading experience for participants than other diversification strategies, such as CiteSeerX using a list sorted by citation counts. Practical implications This diversifying recommendation task is valuable for developing better systems in information retrieval, automatic academic recommendations and summarization. Originality/value The originality of the research lies in the proposal of a novel task that can recommend a diversified context list describing how other scholars cited an article, thereby making citing decisions easier. A novel mixed approach is explored to generate the most efficient diversifying strategy. Besides, rather than traditional information retrieval evaluation, a user evaluation framework is introduced to reflect user information needs more objectively.


2019 ◽  
Vol 17 (5) ◽  
pp. 442-462 ◽  
Author(s):  
Brenda Groen ◽  
Theo van der Voordt ◽  
Bartele Hoekstra ◽  
Hester van Sprang

Purpose This paper aims to explore the relationship between satisfaction with buildings, facilities and services and perceived productivity support and to test whether the findings from a similar study by Batenburg and Van der Voordt (2008) are confirmed in a repeat study after 10 years with more recent data. Design/methodology/approach Data were retrieved from a database with data on user satisfaction and perceived productivity support. These data were collected through the work environment diagnostic tool WODI light. The data include responses from 25,947 respondents and 191 organisations that have been analysed by stepwise multiple-regression analyses. Findings In total, 38% of the variation in office employees’ satisfaction with support of productivity can be explained by employee satisfaction with facilities, the organisation, current work processes and personal- and job-related characteristics. The most important predictor of self-assessed support of productivity is employee satisfaction with facilities. In particular, psychological aspects, i.e. opportunities to concentrate and to communicate, privacy, level of openness, and functionality, comfort and diversity of the workplaces are very important. The findings confirm that employee satisfaction with facilities correlates significantly with perceived productivity support. Other factors that are not included in the data set, such as intrinsic motivation, labour circumstances and human resource management, may have an impact as well. Originality/value This research provides clear insight into the relation between employee satisfaction with facilities and the perceived support of productivity, based on survey data collected over almost 10 years in 191 organisations.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Archana Yashodip Chaudhari ◽  
Preeti Mulay

Purpose To reduce electricity consumption in our homes, a first step is to make the user aware of it. Reading a meter once a month is not enough; real-time meter reading is required. A smart electricity meter (SEM) is capable of providing a quick and exact meter reading in real time at regular intervals. SEMs generate a considerable amount of household electricity consumption data in an incremental manner. Such data contain embedded load patterns and hidden information from which consumer behavior can be extracted and learned. The load patterns extracted by data clustering should be updated because consumer behavior may change over time. The purpose of this study is to update the clustering results based on the old data rather than to re-cluster all of the data from scratch. Design/methodology/approach This paper proposes an incremental clustering with nearness factor (ICNF) algorithm to update load patterns without overall daily load curve clustering. Findings Extensive experiments are conducted on real-world SEM data from the Irish Social Science Data Archive (Ireland) data set. The results are evaluated by both accuracy measures and clustering validity indices, which indicate that the proposed method is useful for exploiting the enormous amount of smart meter data to understand customers’ electricity consumption behaviors. Originality/value ICNF can provide an efficient response for electricity consumption pattern analysis to end consumers via SEMs.
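
The idea of updating clusters as new load curves arrive, instead of re-clustering everything, can be illustrated with a generic threshold-based incremental scheme. The sketch below is not the ICNF algorithm: the abstract does not define the nearness factor, so plain Euclidean distance to the cluster centroid and a hypothetical `threshold` parameter stand in for it, and the function and key names are invented for this example.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def assign_incrementally(clusters, point, threshold):
    """Place one new point without re-clustering the points seen earlier.

    clusters is a list of dicts with keys "centroid" (list of floats) and
    "count" (int); it is updated in place. The point joins the closest
    cluster if its centroid lies within threshold, otherwise it opens a new
    cluster. Returns the index of the cluster that received the point.
    """
    best_idx, best_d = None, float("inf")
    for idx, cluster in enumerate(clusters):
        d = dist(cluster["centroid"], point)
        if d < best_d:
            best_idx, best_d = idx, d

    if best_idx is None or best_d > threshold:
        clusters.append({"centroid": list(point), "count": 1})
        return len(clusters) - 1

    cluster = clusters[best_idx]
    cluster["count"] += 1
    # Running-mean update: the centroid moves toward the new point by 1/count.
    cluster["centroid"] = [
        m + (x - m) / cluster["count"] for m, x in zip(cluster["centroid"], point)
    ]
    return best_idx
```

Each daily load curve would be passed in as a vector of readings. Because only the affected centroid and its count change, the cost of absorbing a new curve depends on the number of clusters rather than on the amount of historical data.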


2017 ◽  
Vol 23 (1) ◽  
pp. 108-129 ◽  
Author(s):  
Dilupa Nakandala ◽  
Premaratne Samaranayake ◽  
Henry Lau ◽  
Krishnamurthy Ramanathan

Purpose Despite much research on supply chain (SC) integration and the growing emphasis on recent information technology advancements as an enabler of improved performance, there has been limited research focussed specifically on information integration in supply chains (SCs). The purpose of this paper is to systematically review the literature on information integration in the fresh food supply chain (FFSC) from a holistic perspective. Design/methodology/approach The literature review is conducted by systematically collecting and analysing the recent literature to identify various participant entities of the FFSC information network and their specific information needs. Findings The information needs of FFSC entities are diverse, but the needs are common across multiple entities. Research limitations/implications This study only reviewed the FFSC-related literature; an extended study of the food industry may reveal a more comprehensive view. Practical implications These findings are useful for practitioners in understanding the participant entities in the information network and their information needs and for policymakers in formulating FFSC development initiatives. Originality/value The authors are not aware of another study that investigates the FFSC in a holistic approach, one that identifies the actors, their interactions and information needs.


2020 ◽  
Vol 54 (4) ◽  
pp. 529-549
Author(s):  
Arshey M. ◽  
Angel Viji K. S.

Purpose Phishing is a serious cybersecurity problem that is widely carried out through media such as e-mail and Short Message Service (SMS) to collect the personal information of individuals. The rapid growth of unsolicited and unwanted information needs to be addressed, which raises the need for effective anti-phishing methods. Design/methodology/approach The primary intention of this research is to design and develop an approach for preventing phishing by proposing an optimization algorithm. The proposed approach involves four steps, namely preprocessing, feature extraction, feature selection and classification, for dealing with phishing e-mails. Initially, the input data set is subjected to preprocessing, which removes stop words and applies stemming, and the preprocessed output is given to the feature extraction process. By extracting keyword frequency from the preprocessed data, the important words are selected as the features. Then, the feature selection process is carried out using the Bhattacharyya distance such that only the significant features that can aid the classification are selected. Using the selected features, the classification is done using the deep belief network (DBN) that is trained using the proposed fractional-earthworm optimization algorithm (EWA). The proposed fractional-EWA is designed by the integration of EWA and fractional calculus to determine the weights in the DBN optimally. Findings The accuracy of the methods, naive Bayes (NB), DBN, neural network (NN), EWA-DBN and fractional EWA-DBN is 0.5333, 0.5455, 0.5556, 0.5714 and 0.8571, respectively. The sensitivity of the methods, NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.4558, 0.5631, 0.7035, 0.7045 and 0.8182, respectively. Likewise, the specificity of the methods, NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.5052, 0.5631, 0.7028, 0.7040 and 0.8800, respectively. It is clear from the comparative table that the proposed method achieved the highest accuracy, sensitivity and specificity compared with the existing methods. Originality/value E-mail phishing detection is performed in this paper using optimization-based deep learning networks. E-mails include a number of unwanted messages that need to be detected in order to avoid storage issues. The strength of the method is that the inclusion of historical data in the detection process enhances the accuracy of detection.
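
The three measures reported in the findings above have standard definitions in terms of confusion-matrix counts. The sketch below shows those definitions, assuming phishing e-mails are treated as the positive class. The function name is invented for this example, and the counts in the usage note are made up for illustration; they are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive-class items that were detected / missed.
    tn/fp: negative-class items that were passed / wrongly flagged.
    Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + tn + fn
    positives = tp + fn
    negatives = tn + fp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / positives if positives else 0.0,  # true positive rate
        "specificity": tn / negatives if negatives else 0.0,  # true negative rate
    }
```

For instance, `classification_metrics(tp=40, fp=5, tn=45, fn=10)` uses hypothetical counts for 50 phishing and 50 legitimate messages. Reporting sensitivity and specificity alongside accuracy matters because a detector can score well on accuracy alone when one class dominates the data.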


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Qing Ke ◽  
Jia Tina Du ◽  
Lu Ji

Purpose The purpose of this paper is to understand how the contextual factors of health crisis information needs are different from a general health context and how these factors work together to shape human information needs. Design/methodology/approach This study collected the COVID-19-related questions posted on a Chinese social Q&A website for a period of 90 days since the pandemic outbreak in China. A qualitative thematic approach was applied to analyze the 1,681 valid questions using an open coding process. Findings A taxonomy of information need topics for a health crisis context that identifies 8 main categories and 33 subcategories was developed, from which four overarching themes were extracted. These include understanding, clarification and preparation; affective expression of worries and confidence; coping with a challenging situation and resuming normal life; and social roles in the pandemic. The authors discussed the differences between a health crisis and a normal health context shaping information needs. Finally, a conceptual framework was developed to illustrate the typology, nature and triggers of health crisis information needs. Research limitations/implications First, only the Baidu Zhidao platform was investigated, and caution is advised before assuming the generalizability of the results, as the questioners of Baidu Zhidao are not representative of the whole population. Furthermore, since at the time of writing COVID-19 is still an emerging and evolving situation (Centers for Disease Control and Prevention, 2020), the collected data included only a relatively small sample size compared to the post-pandemic period, and this might have an impact on the interpretation of the study’s findings. Practical implications The study’s taxonomy of information needs provides a reference for indexing and organizing related information during a disaster. Social implications The study helps authoritative organizations track and send information in social media and inform the right people in the right regions and settings about policies related to the pandemic (e.g., quarantine and traffic control policies in our study) when the next disaster emerges. Originality/value The taxonomy of information need topics for a health crisis context can be used to index and organize related information during a disaster and support many information agents to enhance their information service practices. It also deepens the understanding of the formation mechanism of information needs during a global health crisis.


2009 ◽  
Vol 58 (1) ◽  
pp. 17-27 ◽  
Author(s):  
Noorhidawati Abdullah ◽  
Forbes Gibb

Purpose The purpose of this paper is to present the third of three inter‐related experiments investigating the use and usability of e‐books in Higher Education based on experiments conducted at the University of Strathclyde. This study has looked in greater detail at user interactions with e‐books for reference purposes by focusing on searching and browsing tasks using three search tools: back‐of‐the‐book index (BoBI), table of contents (ToC) and full text search (FTS). Design/methodology/approach This study was carried out by subject‐specific users and using a between‐subjects approach. The target population was MSc and research students in the Department of Computer and Information Sciences, at the University of Strathclyde and involved a total of 45 responses. Findings The study found that a BoBI was more efficient compared to a ToC and FTS tool for finding information in an e‐book environment. A BoBI was found to perform the best for accurately finding relevant content in e‐books. The usability evaluation also found that a BoBI was more useful compared to a ToC for finding information in an e‐book environment. Research limitations/implications The study was focused only on the usability of e‐books, and in particular on retrieval performance, user satisfaction and preferences regarding BoBI, ToC and FTS, and not on other features such as the user interface. The e‐book usability evaluation was constrained in so far as the e‐books used were: non‐fiction; in the domain of information retrieval; e‐books that already had BoBIs with hyperlinks from the BoBI to the text; e‐books that had ToCs with hyperlinks; e‐books that had FTS tools; and e‐books that were available in PDF format. Practical implications The study is important in gaining a better understanding of the retrieval performance of three search tools (BoBI, ToC and FTS) for browsing for relevant, and searching for specific, information in e‐books. This will be of value for designing better e‐books and access systems. Originality/value The strengths and novelty of this study are the methodology that was used, the comprehensive inter‐comparison of tools, and the size of the population. The findings have supported empirically – through an assessment of the performances of BoBIs and ToCs – the need for an enhanced library catalogue system in order to improve users’ browsing and searching capabilities for relevant book content.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ridhima Mehta

Purpose This paper aims to evaluate the user satisfaction criterion for qualitative assessment of timeliness and efficacy of digital libraries based on the multivariate fuzzy logic technique. Design/methodology/approach In this paper, the performance of digital library services is evaluated using fuzzy logic modeling. This model based on fuzzy logic control is used to compute the dynamic response of users by using multiple independent variables. These parameters with inherent uncertainties in practical scenarios are characterized by fuzzy linguistic information. Findings Several parameters determining the user satisfaction metric in the deployment of a digital library exhibit implicit uncertainties which can be intelligently modeled by means of fuzzy control systems. Given the sample data set for the proposed fuzzy multi-attribute decision-making framework, the simulation results are used to compute various error performance measures in the estimation of the fuzzy output variables. Research limitations/implications The size of the considered sample data set is quite small. Scalable real-world data sets can be used to reinforce the statistical efficiency and accuracy of the proposed model. Moreover, other techniques such as evolutionary multi-objective optimization and the Markovian process can be implemented to explore the efficient correlation between different parameters influencing the users’ behavior and facilitate the general application of the proposed technique. Originality/value The paper applied a fuzzy design methodology in which several attributes related to the service of a digital library and the affiliated online resource provisions are used to assess their synchronous impact on user convenience in accessing and manipulating the library information. End-users’ satisfaction is crucial for quality-based valuation of compliance with the time limitations and proficiency of digital libraries.
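
Characterizing an uncertain parameter by fuzzy linguistic information usually means mapping a crisp input to membership degrees in labelled fuzzy sets. The sketch below illustrates that fuzzification step with triangular membership functions. The linguistic labels, the 0 to 10 input scale, the breakpoints and the function names are all hypothetical; the abstract does not specify which variables or membership functions the author used.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b.

    Assumes a < b < c; the degree is 0.0 outside (a, c) and 1.0 at b.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for one input variable on a 0-10 scale.
LINGUISTIC_SETS = {
    "low": (0.0, 2.0, 4.0),
    "medium": (2.0, 5.0, 8.0),
    "high": (6.0, 8.0, 10.0),
}


def fuzzify(x, sets=LINGUISTIC_SETS):
    """Return the membership degree of x in every linguistic term."""
    return {label: triangular(x, *abc) for label, abc in sets.items()}
```

A full fuzzy inference system would go on to combine such degrees across several inputs through a rule base and then defuzzify the result into a satisfaction score; this sketch covers only the first of those stages.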

