Improving Time-Aware Recommendations in Open Source Packages

2019 ◽  
Vol 28 (06) ◽  
pp. 1960007
Author(s):  
Panagiotis Symeonidis ◽  
Ludovik Coba ◽  
Markus Zanker

Collaborative filtering techniques have been studied extensively during the last decade. Many open source packages (Apache Mahout, LensKit, MyMediaLite, rrecsys, etc.) have implemented them, but their top-N recommendation lists are typically based only on a highest-predicted-rating approach. However, exploiting frequencies in the user/item neighborhood to form the top-N recommendation lists has been shown to provide superior accuracy in offline simulations. In addition, most open source packages use a time-independent evaluation protocol to test the quality of recommendations, which may lead to misleading conclusions because it cannot adequately simulate real-life systems, which depend strongly on the time dimension. In this paper, we have therefore implemented a time-aware evaluation protocol in rrecsys, the open source recommendation package for the R language, and compare its performance across open source packages for reasons of replicability. Our experimental results clearly demonstrate that the most-frequent-items-in-neighborhood approach significantly outperforms the highest-predicted-rating approach on three public datasets. Moreover, the time-aware evaluation protocol is shown to be more adequate for capturing the lifetime effectiveness of recommender systems.
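The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the rrecsys implementation (which is written in R and may differ in detail); the function names, the `like_threshold` parameter, and the toy data are assumptions made for illustration. The first function ranks unseen items by how often they appear with a high rating among a user's neighbors, and the second splits interactions by timestamp so that the test set lies strictly after the training set.

```python
from collections import Counter


def top_n_most_frequent(target_user, neighbors, ratings, n=3, like_threshold=4.0):
    """Rank unseen items by how many neighbors rated them >= like_threshold.

    `like_threshold` is a hypothetical cut-off chosen for this sketch.
    """
    seen = set(ratings.get(target_user, {}))
    counts = Counter()
    for neighbor in neighbors:
        for item, rating in ratings.get(neighbor, {}).items():
            if item not in seen and rating >= like_threshold:
                counts[item] += 1
    # Sort by descending frequency, then by item id for a deterministic order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:n]]


def time_aware_split(events, split_time):
    """Train on interactions before split_time, test on those at or after it.

    Each event is a (user, item, rating, timestamp) tuple.
    """
    train = [e for e in events if e[3] < split_time]
    test = [e for e in events if e[3] >= split_time]
    return train, test


# Toy data (made up for illustration only).
ratings = {
    "u1": {"a": 5.0},
    "u2": {"a": 4.0, "b": 5.0, "c": 4.0},
    "u3": {"b": 4.0, "c": 2.0, "d": 5.0},
    "u4": {"b": 5.0, "d": 4.0},
}
recommended = top_n_most_frequent("u1", ["u2", "u3", "u4"], ratings, n=2)

events = [("u1", "a", 5.0, 10), ("u2", "b", 5.0, 20), ("u3", "d", 5.0, 30)]
train, test = time_aware_split(events, split_time=25)
```

In this toy example item "b" is liked by three neighbors and "d" by two, so they are recommended in that order regardless of the magnitude of any predicted rating.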

Author(s):  
Ying Sun ◽  
Xiao-Yuan Jing ◽  
Fei Wu ◽  
Xiwei Dong ◽  
Yanfei Sun ◽  
...  

The heterogeneous defect prediction (HDP) technique, which predicts defects in a target company using heterogeneous metric data from an external company, has received substantial research attention. However, existing HDP methods assume that the source data is labeled, and labeling data is expensive. Semi-supervised defect prediction can perform defect prediction with little labeled data. In this paper, we investigate a new problem: semi-supervised HDP (SHDP). To solve it, we propose a new approach named cost-sensitive kernel semi-supervised correlation analysis (CKSCA). It introduces a unified metric representation and canonical correlation analysis to make the data distributions of different company projects more similar. CKSCA also designs a cost-sensitive kernel semi-supervised discriminant analysis mechanism to utilize the limited labeled data and the sufficient real-life unlabeled data from different companies. In addition, we collect a large number of open-source projects from GitHub to construct a new large-scale unlabeled dataset called the GITHUB dataset. It contains 26,407 modules and is larger than any single public project dataset. It has been made publicly available online and can be extended continuously. Experiments on the GITHUB dataset and other public datasets indicate that unlabeled GITHUB data can help a prediction model improve its performance, and that CKSCA is effective and efficient for solving the SHDP problem.


2021 ◽  
pp. 295-303
Author(s):  
Ksenofon Krisafi ◽  
Jonida Vila

Nature is the origin of being. This is one of the reasons why images of nature are present on almost every web page. When searching and navigating the network, we are often like tourists, or rather virtual tourists, who explore the otherwise unreachable beauty of the moment. By their nature, human beings desire to advance the state of their evolution. Nowadays we experience the motion of everyday life through the mass use of Artificial Intelligence devices, which are influenced by rules created in a parallel dimension, the cyber-world. The cyber-world is a dimension where each of us becomes part of a cyber-society that forms and fosters opinions much faster, opinions which are afterward spread by word of mouth or through the news in real life. Aware of the multidimensional evolution of science, we can benefit from the opportunities it facilitates and at the same time have many more possibilities for reflecting our actions in a positive light.


2020 ◽  
Author(s):  
Andrew Fang ◽  
Jonathan Kia-Sheng Phua ◽  
Terrence Chiew ◽  
Daniel De-Liang Loh ◽  
Lincoln Ming Han Liow ◽  
...  

BACKGROUND During the Coronavirus Disease 2019 (COVID-19) outbreak, community care facilities (CCF) were set up as temporary out-of-hospital isolation facilities to contain the surge of cases in Singapore. Confined living spaces within CCFs posed an increased risk of communicable disease spread among residents. OBJECTIVE This inspired our healthcare team managing a CCF operation to design a low-cost communicable disease outbreak surveillance system (CDOSS). METHODS Our CDOSS was designed with the following considerations: (1) comprehensiveness, (2) efficiency through passive reconnoitering from electronic medical record (EMR) data, (3) ability to provide spatiotemporal insights, (4) low cost and (5) ease of use. We used Python to develop a lightweight application, the Python-based Communicable Disease Outbreak Surveillance System (PyDOSS), that was able to perform syndromic surveillance and fever monitoring. With minimal user actions, its data pipeline would generate daily control charts and geospatial heat maps of cases from raw EMR data and logged vital signs. PyDOSS was successfully implemented as part of our CCF workflow. We also simulated a gastroenteritis (GE) outbreak to test the effectiveness of the system. RESULTS PyDOSS was used throughout the entire duration of operation; the output was reviewed daily by senior management. No disease outbreaks were identified during our medical operation. In the simulated GE outbreak, PyDOSS was able to effectively detect an outbreak within 24 hours and provided information about cluster progression which could aid in contact tracing. The code for a stock version of PyDOSS has been made publicly available. CONCLUSIONS PyDOSS is an effective surveillance system which was successfully implemented in a real-life medical operation.
With the system developed using open-source technology and the code made freely available, it significantly reduces the cost of developing and operating CDOSS and may be useful for similar temporary medical operations, or in resource-limited settings.
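The daily control charts mentioned in this abstract can be illustrated with a small Python sketch. This is not PyDOSS code; the abstract does not say which control-chart rule the authors used, so the example assumes a simple Shewhart-style rule (flag a day whose case count exceeds the baseline mean plus three sample standard deviations). The function name and the counts are made up.

```python
from statistics import mean, stdev


def control_chart_flags(baseline_counts, new_counts, sigmas=3.0):
    """Flag daily counts above an upper control limit.

    The limit is baseline mean + `sigmas` * sample standard deviation,
    an assumed rule for this sketch rather than the one used by PyDOSS.
    """
    center = mean(baseline_counts)
    upper_limit = center + sigmas * stdev(baseline_counts)
    return upper_limit, [count > upper_limit for count in new_counts]


# Hypothetical daily counts of one syndrome (for example, gastroenteritis).
baseline = [2, 3, 1, 2, 4, 3, 2]
upper_limit, flags = control_chart_flags(baseline, [3, 5, 9])
```

With this baseline the upper limit falls between 5 and 6 cases per day, so only the last day (9 cases) is flagged for review.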


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5172
Author(s):  
Yuying Dong ◽  
Liejun Wang ◽  
Shuli Cheng ◽  
Yongming Li

Considerable research and surveys indicate that skin lesions are an early symptom of skin cancer, and segmentation of skin lesions remains an active research topic. In skin lesion segmentation tasks, augmenting dermatological datasets generates a large number of parameters, which limits the application of smart assisted medicine in real life. Hence, this paper proposes an effective feedback attention network (FAC-Net). The network is equipped with a feedback fusion block (FFB) and an attention mechanism block (AMB); by combining these two modules, we obtain richer and more specific feature maps without data enhancement. We ran numerous experiments on public datasets (ISIC2018, ISBI2017, ISBI2016) and used metrics such as the Jaccard index (JA) and Dice coefficient (DC) to evaluate the segmentation results. On the ISIC2018 dataset, we obtained a DC of 91.19% and a JA of 83.99%; compared with the base network, both of these main metrics improved by more than 1%. The metrics also improved on the other two datasets. The experiments demonstrate that, without any enhancement of the datasets, our lightweight model can achieve better segmentation performance than most deep learning architectures.
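For readers unfamiliar with the two metrics, the following Python sketch computes the Dice coefficient and Jaccard index for a pair of flattened binary masks. It uses the standard set-overlap definitions; the function name and the toy masks are assumptions, and the paper's evaluation pipeline (for example, how scores are averaged over images) is not described in the abstract.

```python
def dice_and_jaccard(predicted, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(predicted, truth) if p and t)
    pred_size = sum(1 for p in predicted if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty; treat this as perfect agreement.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


# Toy flattened masks (1 = lesion pixel), made up for illustration.
predicted = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(predicted, truth)
```

For a single pair of masks the two scores are linked by JA = DC / (2 - DC), which is why the Dice value is never lower than the Jaccard value.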


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0258512
Author(s):  
Phillip Oluwatobi Awodutire ◽  
Oluwafemi Samson Balogun ◽  
Akintayo Kehinde Olapade ◽  
Ethelbert Chinaka Nduka

In this work, a new family of distributions that extends the Beta transmuted family, called the Modified Beta Transmuted family of distributions, was obtained. The derived family has the Beta family of distributions and the Transmuted family of distributions as subfamilies. The modified beta transmuted Fréchet, modified beta transmuted exponential, modified beta transmuted Gompertz and modified beta transmuted Lindley distributions were obtained as special cases. Analytical expressions were studied for some statistical properties of the derived family, including the moments, moment generating function and order statistics. The parameters of the family were estimated using the maximum likelihood method. Using the exponential distribution as the baseline for the family, the resulting distribution (the modified beta transmuted exponential distribution) and its properties were studied. The modified beta transmuted exponential distribution was applied to a real lifetime dataset to assess its flexibility, and the results show a better fit than some competing models.
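The abstract does not state the form of the modified family, so the Python sketch below illustrates only the Transmuted subfamily it mentions, using the standard quadratic rank transmutation map F(x) = (1 + λ)G(x) − λG(x)², with |λ| ≤ 1, applied to an exponential baseline G. The function names and parameter values are assumptions chosen for illustration.

```python
import math


def exponential_cdf(x, rate):
    """Baseline exponential CDF G(x) = 1 - exp(-rate * x) for x > 0."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def transmuted_cdf(baseline_cdf_value, lam):
    """Quadratic rank transmutation: (1 + lam) * G - lam * G**2, |lam| <= 1."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    g = baseline_cdf_value
    return (1.0 + lam) * g - lam * g * g


# Evaluate the transmuted exponential CDF on a small grid.
grid = (0.0, 1.0, 2.0, 5.0, 50.0)
values = [transmuted_cdf(exponential_cdf(x, rate=0.5), lam=0.4) for x in grid]
```

Setting `lam=0` recovers the baseline distribution, which is how the transmuted family nests its baseline.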


2020 ◽  
pp. 1199-1212
Author(s):  
Syeda Erfana Zohora ◽  
A. M. Khan ◽  
Arvind K. Srivastava ◽  
Nhu Gia Nguyen ◽  
Nilanjan Dey

In the last few decades there has been a tremendous amount of research on synthetic emotional intelligence related to affective computing, which has advanced significantly from a technological point of view, moving from academic studies, systematic learning, and the development of knowledge and affective technology to an extensive area of real life time systems and their applications. The objective of this paper is to present a general overview of the area of emotional intelligence in affective computing. The overview of the state of the art in emotional intelligence comprises basic definitions and terminology and a study of the current technological scenario. The paper also proposes research activities, with a detailed study of ethical issues and challenges of importance to affective computing. Lastly, we present a broad range of applications, such as interactive learning emotional systems and the modeling of emotional agents with the intention of employing these agents in human-computer interaction as well as in education.


2020 ◽  
Vol 4 (s1) ◽  
pp. 63-63
Author(s):  
Jeffrey Robinson ◽  
Annica Wayman

OBJECTIVES/GOALS: Introduce students to programming and software development practices in the life sciences by analyzing standard clinical diagnostic bloodwork for differential immune responses. The course includes lectures and a semester project, with the goal of enhancing undergraduate students’ education to prepare them for careers in translational science. METHODS/STUDY POPULATION: The educational content was taught for the first time as a component of the newly developed course BTEC 330 “Software Applications in the Life Sciences” in UMBC’s Translational Life Science Technology (TLST) Bachelor’s degree program at the Universities at Shady Grove campus. Eleven students took the course. All were beginners with no programming background. Lectures provided background on the diagnostic components of the CBC, criteria for differential diagnosis in the clinical setting, and an introduction to hematology and flow cytometry, forming the underpinnings for interpretation of the CBC results. Weekly computer lab practical sessions provided training in the fundamentals of the R programming language, the RStudio integrated development environment (IDE), and the GitHub.com open-source software development platform. RESULTS/ANTICIPATED RESULTS: The graded assignment consisted of a coding project in which each student was assigned an individual parameter from the CBC results, for example relative lymphocyte count or hemoglobin readouts. Students each created their own R-language script using RStudio, with functional code which: 1) read in data from a file provided, 2) performed statistical testing, 3) wrote out statistical results as text and charts as image files, and 4) “diagnosed” individuals in the dataset as being inside or outside the clinical normal range for that parameter. Each student also registered their own GitHub account and published their open-source code.
Grading was performed on code functionality by downloading each student’s repository and running the code, with the instructor acting as an outside developer using the resource. DISCUSSION/SIGNIFICANCE OF IMPACT: In this curriculum, students with no background in programming learned to code a basic R-language script and use GitHub to automate interpretation of CBC results. With advanced automation now becoming commonplace in translational science, such course content can provide an introductory level of literacy in the development of clinical informatics software.
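The fourth step of the student assignment, flagging each individual as inside or outside a reference range, can be sketched as follows. The course itself used R; Python is used here only for illustration. The function name, the sample values, and the range limits are placeholders invented for this sketch and are not clinical reference values or data from the course.

```python
from statistics import mean


def classify_against_range(values, low, high):
    """Label each value 'inside' or 'outside' the closed range [low, high]."""
    return ["inside" if low <= v <= high else "outside" for v in values]


# Hypothetical hemoglobin readouts in g/dL; the limits below are placeholders.
hemoglobin_g_dl = [13.5, 11.2, 15.0, 18.1]
labels = classify_against_range(hemoglobin_g_dl, low=12.0, high=17.5)

# A small text summary, analogous to writing out statistical results.
summary = {"mean": mean(hemoglobin_g_dl), "n_outside": labels.count("outside")}
```

In a full script the values would be read from the provided data file and the summary written out alongside the charts.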


2020 ◽  
Vol 6 (1) ◽  
pp. 6-9
Author(s):  
Juraj Packa ◽  
Vladimir Kujan ◽  
Daniel Štrkula ◽  
Vladimír Šály ◽  
Milan Perný

Cable systems are an important part of photovoltaic power plants. The dielectric properties, reliability and durability of cables depend on the quality of production processes, operating conditions and degradation factors. The expected lifetime of cable systems is generally more than 20-30 years. Their failure-free operation and the long-term stability of their properties have a direct impact on the economic return of the investment. In our experience, tests in compliance with valid standards are not adequate to verify the real lifetime during operation. Photovoltaic cables intended for outdoor use, for the connection between solar panels and possibly between panels and the inverter, were chosen for our experiments. The changes in insulation resistance and breakdown voltage caused by some degradation factors, mainly water, are presented. This research was inspired by a real failure in operation.


2019 ◽  
Vol 10 (03) ◽  
pp. 534-542
Author(s):  
Monika Maya Wahi ◽  
Natasha Dukach

Background Health care-associated infections, specifically catheter-associated urinary tract infections (CAUTIs), can cause significant mortality and morbidity. However, the process of collecting CAUTI surveillance data, storing it, and visualizing the data to inform health policy has been fraught with challenges. Objectives No standard has been developed, so the objective of this article is to present a prototype solution for dashboarding public health surveillance data based on a real-life use-case for the purposes of enhancing clinical and policy-level decision-making. Methods The solution was developed in open source software R, which allows for the creation of dashboard applications using the integrated development environment developed for R called RStudio, and a package for R called Rshiny. How the surveillance system was designed, why R was chosen, how the dashboard was developed, and how the dashboard features were programmed and function will be described. Results The prototype dashboard includes multiple tabs for visualizing data, and allows the user to interact with the data by setting dynamic filters. Controls were used to facilitate the interaction between the user and application. Rshiny is reactive, in that when the user (e.g., clinician or policymaker) changes the parameters on the data, the application automatically updates the visualization as well as parameters available based on current filters. Conclusion The prototype dashboard has the potential to enhance clinical and policy-level decision-making because it facilitates interaction with the data that provides useful visualizations to provide such guidance.


Author(s):  
Clement Boateng Ampadu ◽  
Abdulzeid Yen Anafo

This paper introduces a new class of distributions called the generalized Ampadu-G (GA-G for short) family of distributions, and with a certain restriction on the parameter space, the family is shown to be a life-time distribution. The shape of the density function and hazard rate function of the GA-G family is described analytically. When G follows the Weibull distribution, the generalized Ampadu-Weibull (GA-W for short) is presented along with its hazard and survival function. Several sub-models of the GA-W family are presented. The transformation technique is applied to this new family of distributions, and we obtain the quantile function of the new family. Power series representations for the cumulative distribution function (CDF) and probability density function (PDF) are also obtained. The rth non-central moments, moment generating function, and Renyi entropy associated with the new family of distributions are derived. Characterization theorems based on two truncated moments and conditional expectation are also presented. A simulation study is also conducted, and we find that using the method of maximum likelihood to estimate model parameters is adequate. The GA-W family of distributions is shown to be practically significant in modeling real life data, and is shown to be superior to some non-trivial generalizations of the Weibull distribution. A further development concludes the paper.
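The abstract does not give the GA-G density, so the sketch below covers only the baseline Weibull distribution that the GA-W model builds on: its survival function, hazard rate, and the quantile function obtained by inverting the CDF. These are the standard two-parameter Weibull formulas; the function names and parameter values are assumptions for illustration.

```python
import math


def weibull_survival(x, shape, scale):
    """S(x) = exp(-(x / scale) ** shape) for x > 0."""
    return math.exp(-((x / scale) ** shape))


def weibull_hazard(x, shape, scale):
    """h(x) = (shape / scale) * (x / scale) ** (shape - 1) for x > 0."""
    return (shape / scale) * (x / scale) ** (shape - 1)


def weibull_quantile(p, shape, scale):
    """Invert F(x) = 1 - S(x): x = scale * (-ln(1 - p)) ** (1 / shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# With shape = 1 the Weibull reduces to the exponential, so the hazard is constant.
constant_hazard = weibull_hazard(2.0, shape=1.0, scale=4.0)

# The median is the 0.5 quantile; survival at the median should be one half.
median = weibull_quantile(0.5, shape=2.0, scale=3.0)
survival_at_median = weibull_survival(median, shape=2.0, scale=3.0)
```

A shape parameter above 1 gives an increasing hazard and one below 1 a decreasing hazard, which is the flexibility that generalizations of the Weibull aim to extend.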

