A recommendation system for scientific water data

Author(s):  
Zhaokun Xue ◽  
Alva Couch

Abstract: We describe a recommendation system for HydroShare, a platform for scientific water data sharing. We discuss the similarities, differences, and challenges involved in implementing recommendation systems for scientific water data sharing, and we analyze the behaviors that scientists exhibit in using HydroShare, as documented by users’ activity logs. Unlike users of entertainment systems, HydroShare users tend to be task-oriented: the set of tasks of interest can change over time, and older interests are sometimes no longer relevant. By validating recommendation approaches against user behavior as expressed in activity logs, we conclude that a combination of content-based filtering and latent Dirichlet allocation (LDA) topic modeling of user behavior (rather than LDA classification of dataset topics) provides a workable solution for HydroShare, and we compare this approach to existing recommendation methods.
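To make the idea of content-based filtering for task-oriented users concrete, here is a minimal sketch in Python. It covers only the content-based half of the combination and leaves out LDA. The exponential recency decay, the `half_life_days` parameter, the resource descriptions, and the activity log are all illustrative assumptions, not details taken from HydroShare or from the paper.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def user_profile(history, vectors, half_life_days=30.0):
    """Sum the vectors of accessed resources, down-weighting old accesses."""
    profile = Counter()
    for resource_id, age_days in history:
        decay = 0.5 ** (age_days / half_life_days)  # assumed decay, not from the paper
        for term, weight in vectors[resource_id].items():
            profile[term] += decay * weight
    return profile


def recommend(history, vectors, k=2):
    """Rank resources the user has not accessed by similarity to the profile."""
    seen = {resource_id for resource_id, _ in history}
    profile = user_profile(history, vectors)
    candidates = [i for i in range(len(vectors)) if i not in seen]
    return sorted(candidates, key=lambda i: cosine(profile, vectors[i]), reverse=True)[:k]


# Hypothetical resource descriptions and activity log: (resource index, age in days).
resources = [
    "streamflow gauge discharge river",
    "snow depth snowpack mountain",
    "river discharge flood model",
    "snowpack melt runoff mountain",
]
vectors = tf_idf_vectors([text.split() for text in resources])
history = [(1, 400.0), (0, 2.0)]  # an old snow task, a recent streamflow task
ranking = recommend(history, vectors)
```

With this log, the recent streamflow access dominates the profile, so the river-related resource is ranked ahead of the snow-related one. This is one simple way to let older interests fade.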

2018 ◽  
Vol 169 ◽  
pp. 01003 ◽  
Author(s):  
Hao Li ◽  
Huan Xia ◽  
Yan Kang ◽  
Mohammad Nashir Uddin

As a new interactive service technology, IPTV has been extensively studied in the field of TV program recommendation, but the sparsity of the user-program rating matrix and the cold-start problem are bottlenecks to accurate program recommendation. In this paper, a flexible combination of two recommendation strategies is proposed, which addresses the sparsity and cold-start problems as well as the issue of user interests changing over time. The paper implements a content-based filtering component and a collaborative filtering component according to the two combination strategies, which effectively solves the cold-start problem, the sparsity problem, and the problem of user interests changing over time. The experimental results show that, with optimal parameters, the combined recommendation system performs better than using either of the two combination strategies individually or using no combination strategy at all: the reduction in MAE is in the range [2.7%, 3%], and the increases in precision and recall are in the ranges [13.8%, 95.5%] and [0, 97.8%], respectively.
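The three evaluation measures named in this abstract have standard definitions, sketched below in Python. The sample ratings and program identifiers are made up for illustration. The `relative_reduction` helper assumes the reported percentage ranges are relative changes against a baseline, which the abstract does not state explicitly.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall(recommended, relevant):
    """Precision: share of recommendations that are relevant.
    Recall: share of relevant items that were recommended."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def relative_reduction(baseline, improved):
    """Relative drop of a metric against a baseline (assumed interpretation)."""
    return (baseline - improved) / baseline


# Hypothetical ratings and recommendation lists.
mae = mean_absolute_error([4, 3, 5, 2], [3.5, 3, 4, 2.5])
precision, recall = precision_recall(["a", "b", "c", "d"], {"b", "d", "e"})
drop = relative_reduction(0.80, 0.776)
```

Lower MAE is better, while higher precision and recall are better, which is why the abstract reports a reduction for the former and increases for the latter.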


2021 ◽  
pp. 1-22
Author(s):  
Karen Mossberger ◽  
Eric W. Welch ◽  
Yonghong Wu

Broadband internet use is often heralded for its transformative potential in a broad range of policy areas, but there is scarce evidence on whether this is so, and how it can be utilized most effectively by organizations and communities. While the attribution of change to programmatic efforts is a familiar challenge in evaluation research, broadband technologies present some particular issues for evaluation: the “black box” problem of understanding user behavior, the complexity of theorizing about the interaction between technology and policy-specific processes, and understanding change over time. How can we better address both the challenges and the opportunities for evaluating broadband initiatives? This chapter introduces the plan of the volume in the context of answering these questions.


Computation ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 30 ◽  
Author(s):  
Jose Aguilar ◽  
Camilo Salazar ◽  
Henry Velasco ◽  
Julian Monsalve-Pulido ◽  
Edwin Montoya

This paper analyses the capabilities of different techniques for building a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and description, in order to extract features from the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features generated by these methods are tested on three types of educational digital resources (scientific publications, learning objects, and patents), on a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. Unsupervised metrics, namely two similarity functions and the entropy, are used to determine the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies considerably with the type of content and the metric: a method that outperforms the others in some cases is outperformed by them in others.
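Of the four methods listed, BM25 is the simplest to show in a few lines. The sketch below scores tokenized documents against a query using a common BM25 variant with a smoothed, non-negative IDF. The parameter values `k1=1.5` and `b=0.75` are conventional defaults, and the sample titles are hypothetical stand-ins for LOM title fields; none of these come from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        score = 0.0
        for term in query:
            tf = counts.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


# Hypothetical resource titles.
titles = [
    "introduction to linear algebra",
    "linear regression with python",
    "patent for a water filter",
]
scores = bm25_scores(["linear", "algebra"], [title.split() for title in titles])
```

The first title matches both query terms and scores highest, the second matches one, and the third matches none and scores zero.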


Author(s):  
Mariia Andriienko ◽  
Viktoriia Davydiuk

The article specifies directions for, and features of, improving the classification of enterprise costs by element, with the aim of managing those costs successfully. The study draws on examples of both Ukrainian and Iraqi enterprises, since this classification differs slightly between them. It is clarified, however, that differences in the classification of costs by element may arise not only between countries but also from differing opinions among scientists. Production costs have been studied in various aspects by domestic and foreign scientists such as F. Butynets, V. Kozak, V. Lastovetsky, O. Moshkovskaya, M. Skrypnyk, O. Grishnova, A. Turilo, Y. Kravchuk, and others. It was found that the classification of costs by element has recently lost some popularity among Ukrainian economists. There is a fairly large number of criteria for classifying costs, which indicates how important information about this object, viewed from different angles, is for management purposes. It is specified that the main factors of production (activity), i.e. the monetary expression of the expenditure of these factors, should be taken as the basis for classifying costs by element. The need to change the classification of costs by element flexibly, depending on how the quantity of factors used and the cost structure of attracting them evolve, is substantiated. It is proposed to divide costs into fixed and variable within each item of the element classification. This logic of cost classification will clarify the cost structure and make it more convenient for management purposes (analysis, rationing, pricing, budgeting). It was found that the costs of the proposed elements will differ in whether fixed or variable components dominate. For further research in this direction, it is proposed to clarify the possibilities of further classifying costs within each element.
The generally approved forms of statistical reporting should change over time to describe more objectively what is happening at most enterprises in the country. However, these forms will always change more slowly than actual circumstances and changes at existing enterprises require.


Water ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 1980 ◽  
Author(s):  
Michelle H. Busch ◽  
Katie H. Costigan ◽  
Ken M. Fritz ◽  
Thibault Datry ◽  
Corey A. Krabbenhoft ◽  
...  

Rivers that cease to flow are globally prevalent. Although many epithets have been used for these rivers, a consensus on terminology has not yet been reached. Doing so would facilitate a marked increase in interdisciplinary interest as well as critical need for clear regulations. Here we reviewed literature from Web of Science database searches of 12 epithets to learn (Objective 1—O1) if epithet topics are consistent across Web of Science categories using latent Dirichlet allocation topic modeling. We also analyzed publication rates and topics over time to (O2) assess changes in epithet use. We compiled literature definitions to (O3) identify how epithets have been delineated and, lastly, suggest universal terms and definitions. We found a lack of consensus in epithet use between and among various fields. We also found that epithet usage has changed over time, as research focus has shifted from description to modeling. We conclude that multiple epithets are redundant. We offer specific definitions for three epithets (non-perennial, intermittent, and ephemeral) to guide consensus on epithet use. Limiting the number of epithets used in non-perennial river research can facilitate more effective communication among research fields and provide clear guidelines for writing regulatory documents.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Ahlam Fuad ◽  
Maha Al-Yahya

Mobile app stores provide an extremely rich source of information on app descriptions, characteristics, and usage, and analyzing these data provides insights and a deeper understanding of the nature of apps. However, manual analysis of this vast amount of information on mobile apps is not a simple and straightforward task; it is costly in terms of human effort and time. Computational methods such as topic modeling can provide an efficient and satisfactory approach to mobile app information analysis. Topic modeling is a type of statistical modeling technique for discovering abstract topics that occur in a set of documents. This study explores the relationship between features of Arabic apps and investigates how well the current predefined Google Play app categories represent the type and genre of Arabic mobile apps. Based on an analysis of textual app descriptions, we aim to design and develop a sustainable classification system using the Latent Dirichlet Allocation (LDA) method of topic modeling to cover the classification of Arabic apps in the Google Play app store. Our study supports the hypothesis that textual app descriptions are effective in suggesting new categories for Arabic mobile apps in the Google Play app store. The results also indicate that the current classification in the Google Play app store is not suitable for our case study, “Arabic apps,” and that it is not sustainable, as it cannot cover new app types, including Arabic apps. This study offers an important contribution to Arabic app analysis and design, to improve app search and exploration in several domains such as business, marketing, and technical development. Furthermore, it provides insights for the future of Arabic app research and provides guidance for the development of an Arabic app dashboard that will support users in selecting an app based on their specific needs.


2020 ◽  
Vol 10 (10) ◽  
pp. 3388
Author(s):  
Sung-Hwan Kim ◽  
Hwan-Gue Cho

Analyzing user behavior in online spaces is an important task. This paper is dedicated to analyzing the online community in terms of topics. We present a user–topic model based on the latent Dirichlet allocation (LDA), as an application of topic modeling in a domain other than textual data. This model substitutes the concept of word occurrence in the original LDA method with user participation. The proposed method deals with many problems regarding topic modeling and user analysis, which include: inclusion of dynamic topics, visualization of user interaction networks, and event detection. We collected datasets from four online communities with different characteristics, and conducted experiments to demonstrate the effectiveness of our method by revealing interesting findings covering numerous aspects.
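The substitution described here, user participation in place of word occurrence, can be sketched with a small collapsed Gibbs sampler for LDA. The sketch assumes that each discussion thread plays the role of a document and each post contributes one "token" equal to its author. That mapping, the sample threads, and the hyperparameters `alpha` and `beta` are illustrative assumptions, and the paper's actual model may differ.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over documents of integer token ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    theta = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
           for k in range(num_topics)]
    return theta, phi


# Hypothetical threads: each entry is the author of one post in that thread.
threads = [
    ["ana", "ben", "ana", "cy"],
    ["ben", "ana", "ben"],
    ["dee", "eli", "dee", "fay"],
    ["eli", "fay", "eli", "dee"],
    ["ana", "cy", "ben"],
]
users = sorted({user for thread in threads for user in thread})
user_id = {user: i for i, user in enumerate(users)}
docs = [[user_id[user] for user in thread] for thread in threads]
theta, phi = gibbs_lda(docs, num_topics=2, vocab_size=len(users))
```

Here `theta` holds a topic mixture per thread, and `phi` holds a distribution over users per topic, so each topic can be read as a group of users who tend to take part in the same threads.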


Author(s):  
Ligaj Pradhan ◽  
Chengcui Zhang ◽  
Steven Bethard

Intricate user behaviors can be understood by discovering user interests from their reviews. Topic modeling techniques have been extensively explored to discover latent user interests from user reviews. However, a topic extracted by topic modeling techniques can be a mixture of several quite different concepts and thus less interpretable. In this paper, the authors present a method that uses topic modeling techniques to discover a large number of topics and applies hierarchical clustering to generate a much smaller number of interpretable User-Concerns. These User-Concerns are further compared with topics generated by Latent Dirichlet Allocation (LDA) and the Pachinko Allocation Model (PAM) and shown to be more coherent and interpretable. The authors cut the linkage tree formed while performing the hierarchical clustering of the User-Concerns, at different levels, and generate a hierarchy of User-Concerns. They also discuss how collaborative-filtering-based recommendation systems can be enriched by infusing additional user-behavioral knowledge from such a hierarchy.
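The step of grouping many fine-grained topics into fewer concepts can be sketched as agglomerative clustering over topic-word distributions. The sketch below uses average linkage with cosine distance and stops at a requested number of clusters, which stands in for cutting the linkage tree at a given level. The linkage choice, the distance measure, and the toy topic vectors are assumptions for illustration, not the authors' configuration.

```python
import math


def cosine_distance(p, q):
    """1 - cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm if norm else 1.0


def agglomerate(vectors, num_clusters):
    """Average-linkage agglomerative clustering down to num_clusters groups."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                dist = sum(cosine_distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Hypothetical topic-word distributions over a four-word vocabulary.
topics = [
    [0.7, 0.3, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.1, 0.4, 0.5],
]
coarse = agglomerate(topics, num_clusters=2)
finer = agglomerate(topics, num_clusters=3)
```

Requesting different values of `num_clusters` yields coarser or finer groupings of the same topics, which is the sense in which cutting the tree at different levels produces a hierarchy.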


2016 ◽  
Author(s):  
Mallory Kidwell ◽  
Ljiljana B. Lazarevic ◽  
Erica Baranski ◽  
Tom Elis Hardwicke ◽  
Sarah Piechowski ◽  
...  

Beginning January 2014, Psychological Science gave authors the opportunity to signal open data and materials if they qualified for badges that accompanied published articles. Before badges, less than 3% of Psychological Science articles reported open data. After badges, 23% reported open data, with an accelerating trend; 39% reported open data in the first half of 2015, an increase of more than an order of magnitude from baseline. There was no change over time in the low rates of data sharing among comparison journals. Moreover, reporting openness does not guarantee openness. When badges were earned, reportedly available data were more likely to be actually available, correct, usable, and complete than when badges were not earned. Open materials also increased to a weaker degree, and there was more variability among comparison journals. Badges are simple, effective signals to promote open practices and improve preservation of data and materials by using independent repositories.

