A standardized framework for testing the performance of sleep-tracking technology: step-by-step guidelines and open-source code

Abstract Sleep-tracking devices, particularly within the consumer sleep technology (CST) space, are increasingly used in both research and clinical settings, providing new opportunities for large-scale data collection in highly ecological conditions. Due to the fast pace of the CST industry combined with the lack of a standardized framework to evaluate the performance of sleep trackers, their accuracy and reliability in measuring sleep remains largely unknown. Here, we provide a step-by-step analytical framework for evaluating the performance of sleep trackers (including standard actigraphy), as compared with gold-standard polysomnography (PSG) or other reference methods. The analytical guidelines are based on recent recommendations for evaluating and using CST from our group and others (de Zambotti and colleagues; Depner and colleagues), and include raw data organization as well as critical analytical procedures, including discrepancy analysis, Bland–Altman plots, and epoch-by-epoch analysis. Analytical steps are accompanied by open-source R functions (depicted at https://sri-human-sleep.github.io/sleep-trackers-performance/AnalyticalPipeline_v1.0.0.html). In addition, an empirical sample dataset is used to describe and discuss the main outcomes of the proposed pipeline. The guidelines and the accompanying functions are aimed at standardizing the testing of CSTs performance, to not only increase the replicability of validation studies, but also to provide ready-to-use tools to researchers and clinicians. All in all, this work can help to increase the efficiency, interpretation, and quality of validation studies, and to improve the informed adoption of CST in research and clinical settings.

Download Full-text

BSO-MV: An Optimized Multiview Clustering Approach for Items Recommendation in Social Networks

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.70341 ◽

2021 ◽

Vol 27 (7) ◽

pp. 667-692

Author(s):

Lamia Berkani ◽

Lylia Betit ◽

Louiza Belarif

Keyword(s):

Social Networks ◽

Large Scale ◽

Data Sets ◽

Large Scale Data ◽

Recommendation Algorithms ◽

Clustering Approach ◽

Real World Datasets ◽

Multiview Clustering ◽

Improving Accuracy

Clustering-based approaches have been demonstrated to be efficient and scalable to large-scale data sets. However, clustering-based recommender systems suffer from relatively low accuracy and coverage. To address these issues, we propose in this article an optimized multiview clustering approach for the recommendation of items in social networks. First, the selection of the initial medoids is optimized using the Bees Swarm optimization algorithm (BSO) in order to generate better partitions (i.e. refining the quality of medoids according to the objective function). Then, the multiview clustering (MV) is applied, where users are iteratively clustered from the views of both rating patterns and social information (i.e. friendships and trust). Finally, a framework is proposed for testing the different alternatives, namely: (1) the standard recommendation algorithms; (2) the clustering-based and the optimized clustering-based recommendation algorithms using BSO; and (3) the MV and the optimized MV (BSO-MV) algorithms. Experimental results conducted on two real-world datasets demonstrate the effectiveness of the proposed BSO-MV algorithm in terms of improving accuracy, as it outperforms the existing related approaches and baselines.

Download Full-text

Vertical Integration Between Providers With Possible Cloud Migration

Advances in Computer and Electrical Engineering - Advanced Methodologies and Technologies in Network Architecture, Mobile Computing, and Data Analytics ◽

10.4018/978-1-5225-7598-6.ch020 ◽

2019 ◽

pp. 274-284

Author(s):

Aleksandra Kostic-Ljubisavljevic ◽

Branka Mikavica

Keyword(s):

Large Scale ◽

Virtual Machines ◽

Wholesale Price ◽

Rejection Rate ◽

Large Scale Data ◽

Cloud Migration ◽

Computational Resources ◽

Charging Strategy ◽

Scale Data

All vertically integrated participants in content provisioning process are influenced by bandwidth requirements. Provisioning of self-owned resources that satisfy peak bandwidth demand leads to network underutilization and it is cost ineffective. Under-provisioning leads to rejection of customers' requests. Vertically integrated providers need to consider cloud migration in order to minimize costs and improve quality of service and quality of experience of their customers. Cloud providers maintain large-scale data centers to offer storage and computational resources in the form of virtual machines instances. They offer different pricing plans: reservation, on-demand, and spot pricing. For obtaining optimal integration charging strategy, revenue sharing, cost sharing, wholesale price is applied frequently. The vertically integrated content provider's incentives for cloud migration can induce significant complexity in integration contracts, and consequently improvements in costs and requests' rejection rate.

Download Full-text

Vertical Integration Between Providers With Possible Cloud Migration

Encyclopedia of Information Science and Technology, Fourth Edition ◽

10.4018/978-1-5225-2255-3.ch100 ◽

2018 ◽

pp. 1164-1173 ◽

Cited By ~ 1

Author(s):

Aleksandra Kostic-Ljubisavljevic ◽

Branka Mikavica

Keyword(s):

Large Scale ◽

Virtual Machines ◽

Wholesale Price ◽

Rejection Rate ◽

Large Scale Data ◽

Cloud Migration ◽

Computational Resources ◽

Data Centres ◽

Charging Strategy

All vertically integrated participants in content provisioning process are influenced by bandwidth requirements. Provisioning of self-owned resources that satisfy peak bandwidth demand leads to network underutilization and it is cost ineffective. Under-provisioning leads to rejection of customers' requests. Vertically integrated providers need to consider cloud migration in order to minimize costs and improve Quality of Service and Quality of Experience of their customers. Cloud providers maintain large-scale data centres to offer storage and computational resources in the form of Virtual Machines instances. They offer different pricing plans: reservation, on-demand and spot pricing. For obtaining optimal integration charging strategy, Revenue Sharing, Cost Sharing, Wholesale Price is applied frequently. The vertically integrated content provider's incentives for cloud migration can induce significant complexity in integration contracts, and consequently improvements in costs and requests' rejection rate.

Download Full-text

Blog Snippets Based Drug Effects Extraction System Using Lexical and Grammatical Restrictions

International Journal of Multimedia Data Engineering and Management ◽

10.4018/ijmdem.2014040101 ◽

2014 ◽

Vol 5 (2) ◽

pp. 1-17 ◽

Cited By ~ 1

Author(s):

Shiho Kitajima ◽

Rafal Rzepka ◽

Kenji Araki

Keyword(s):

Quality Of Life ◽

Natural Language ◽

Large Scale ◽

Medical Information ◽

Drug Effects ◽

Large Scale Data ◽

Medical Documents ◽

Baseline System ◽

Scale Data

Obtaining medical information has a beneficial influence on patients' treatment and QOL (quality of life). The authors aim to make a system that helps patients to collect narrative information. Extracting information from data written by patients will allow the acquisition of information which is easy to understand and provides encouragement. Additionally, by using large-scale data, the system can be utilized for discovering unknown effects or patterns. As the first step, the purpose of this paper is to extract descriptions of the effects caused by taking drugs as a triplet of expressions from illness survival blogs' snippets. This paper proposes a method to extract the triplets using specific clue words and parsing the results in order to extract from blogs written in free natural language. Moreover, recall was improved by combining their proposed method and a baseline system, and precision was improved by filtering using dictionaries we created from existing medical documents.

Download Full-text

PENGELOMPOKAN PERSENTASE BUTA HURUF UMUR 15-44 MENURUT PROVINSI MENGGUNAKAN ALGORITMA K-MEANS

KLIK - KUMPULAN JURNAL ILMU KOMPUTER ◽

10.20527/klik.v7i3.329 ◽

2020 ◽

Vol 7 (3) ◽

pp. 230

Author(s):

Saifullah Saifullah ◽

Nani Hidayati

Keyword(s):

Data Mining ◽

Human Resources ◽

Data Clustering ◽

Large Scale ◽

Market Basket ◽

Large Scale Data ◽

The Government ◽

Large Scale Data Processing ◽

Scale Data

Data Mining is a method that is often needed in large-scale data processing, so data mining has important access to the fields of life including industry, finance, weather, science and technology. In data mining techniques there are methods that can be used, namely classification, clustering, regression, variable selection, and market basket analysis. Illiteracy is one of the factors that hinder the quality of human resources. One of the basic things that must be fulfilled to improve the quality of human resources is the eradication of illiteracy among the community. The purpose of this study is to determine the clustering of illiterate communities based on provinces in Indonesia. The results of the study are illiterate data clustering according to the age proportion of 15-44 namely 1 high group node, low group has 27 nodes, and medium group 6 nodes. The results of this study become input for the government to determine illiteracy eradication policies in Indonesia based on provinces.Kata Kunci: Illiterate, Data mining, K-Means ClusteringData Mining termasuk metode yang sering dibutuhkan dalam pengolahan data berskala besar, maka data mining mempunyai akses penting pada bidang kehidupan diantaranya yaitu bidang industri, bidang keuangan, cuaca, ilmu dan teknologi. Pada teknik data mining terdapat metode-metode yang dapat digunakan yaitu klasifikasi, clustering, regresi, seleksi variabel, dan market basket analisis. Buta huruf merupakan salah satu faktor yang menghambat kualitas sumber daya manusia. Salah satu hal mendasar yang harus dipenuhi untuk meningkatkan kualitas sumber daya manusia adalah pemberantasan buta huruf di kalangan masyarakat Adapun tujuan penelitian ini adalah menetukan clustering masyarakat buta huruf berdasarkan propinsi di Indonesia. Hasil dari penelitian adalah data clustering buta huruf menurut propisi umur 15-44 yaitu 1 node kelompok tinggi, kelompok rendah memiliki 27 node, dan kelompok sedang 6 node. Hasil penelitian ini menjadi bahan masukan kepada pemerintah untuk menentukan kebijakan pemberantasan buta huruf di Indonesia berdasarakn propinsi.Kata Kunci: Buta Huruf, Data mining, K-Means Clustering

Download Full-text

Obtaining High-Quality Label by Distinguishing between Easy and Hard Items in Crowdsourcing

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/413 ◽

2017 ◽

Cited By ~ 4

Author(s):

Wei Wang ◽

Xiang-Yu Guo ◽

Shao-Yuan Li ◽

Yuan Jiang ◽

Zhi-Hua Zhou

Keyword(s):

Large Scale ◽

Experimental Results ◽

Training Set ◽

High Quality ◽

Quality Label ◽

Large Scale Data ◽

Voluntary Workers ◽

Scale Data

Crowdsourcing systems make it possible to hire voluntary workers to label large-scale data by offering them small monetary payments. Usually, the taskmaster requires to collect high-quality labels, while the quality of labels obtained from the crowd may not satisfy this requirement. In this paper, we study the problem of obtaining high-quality labels from the crowd and present an approach of learning the difficulty of items in crowdsourcing, in which we construct a small training set of items with estimated difficulty and then learn a model to predict the difficulty of future items. With the predicted difficulty, we can distinguish between easy and hard items to obtain high-quality labels. For easy items, the quality of their labels inferred from the crowd could be high enough to satisfy the requirement; while for hard items, the crowd could not provide high-quality labels, it is better to choose a more knowledgable crowd or employ specialized workers to label them. The experimental results demonstrate that the proposed approach by learning to distinguish between easy and hard items can significantly improve the label quality.

Download Full-text

Localization of information on communication networks of an open-source online community

International Journal of Modern Physics C ◽

10.1142/s0129183117500917 ◽

2017 ◽

Vol 28 (07) ◽

pp. 1750091 ◽

Cited By ~ 1

Author(s):

Jianmei Yang ◽

Hui Li ◽

Hao Liao ◽

Zheng He ◽

Huijie Yang ◽

...

Keyword(s):

Open Source ◽

Communication Networks ◽

Online Community ◽

Large Scale ◽

Information Communication ◽

Community Based ◽

Large Scale Data ◽

Communication Ability ◽

Properties Of Information ◽

Data Collections

In order to study the information communication ability (i.e. information conductivity) of CodePlex C# community, an Open-Source Online Community (OSOC), we first construct the models of weighted communication networks in 11 periods for the community based on large-scale data collections. Then by using two ways of quantum mapping of complex networks, we analyze the localization properties of information on the maximum connected graphs (named as communication networks) of these weighted networks according to the idea of analyzing the localization properties of an electron on a large cluster. We draw the following conclusions. (1) CodePlex C# OSOC usually has information isolativity. (2) The community has some degree of information communication ability and the ability increases as time goes on. (3) The localization of information on the communication networks in any period induced by its structure is weaker than that induced by its structure together with the connection intensities between its nodes. (4) Our idea and methods can be used to analyze the information communication ability of other online communities.

Download Full-text

Describing and improving homoeopathy

British Homeopathic Journal ◽

10.1016/s0007-0785(05)80859-2 ◽

1994 ◽

Vol 83 (03) ◽

pp. 135-141 ◽

Cited By ~ 2

Author(s):

P. Fisher ◽

R. Van Haselen

Keyword(s):

Data Collection ◽

Large Scale ◽

Daily Practice ◽

Efficacy And Safety ◽

Modern Information Technology ◽

Large Scale Data ◽

Modern Information ◽

User Friendly ◽

Scale Data

AbstractLarge scale data collection combined with modern information technology is a powerful tool to evaluate the efficacy and safety of homoeopathy. It also has great potential to improve homoeopathic practice. Data collection has not been widely used in homoeopathy. This appears to be due to the clumsiness sof the methodology and the perception that it is of little value to daily practice. 3 protocols addressing different aspects of this issue are presented.- A proposal to establish common basic data collection methodology for homoeopaths throughout Europe.- A systematic survey of the results of homoeopathic treatment of patients with rheumatoid arthritis using quality of life and objective assessments.- Verification of a set of homoeopathic prescribing features for Rhus toxicodendron.These proposals are designed to be ‘user-friendly’ and to provide practical information relevant to daily homoeopathic practice.

Download Full-text

Evaluating the healthiness of chain-restaurant menu items using crowdsourcing: a new method

Public Health Nutrition ◽

10.1017/s1368980016001804 ◽

2016 ◽

Vol 20 (1) ◽

pp. 18-24 ◽

Cited By ~ 3

Author(s):

Lenard I Lesser ◽

Leslie Wu ◽

Timothy B Matthiessen ◽

Harold S Luft

Keyword(s):

Public Health ◽

Nutritional Quality ◽

Large Scale ◽

Registered Dietitian ◽

Food Items ◽

Large Scale Data ◽

Nutrient Profiling ◽

The Cost ◽

Scale Data

AbstractObjectiveTo develop a technology-based method for evaluating the nutritional quality of chain-restaurant menus to increase the efficiency and lower the cost of large-scale data analysis of food items.DesignUsing a Modified Nutrient Profiling Index (MNPI), we assessed chain-restaurant items from the MenuStat database with a process involving three steps: (i) testing ‘extreme’ scores; (ii) crowdsourcing to analyse fruit, nut and vegetable (FNV) amounts; and (iii) analysis of the ambiguous items by a registered dietitian.ResultsIn applying the approach to assess 22 422 foods, only 3566 could not be scored automatically based on MenuStat data and required further evaluation to determine healthiness. Items for which there was low agreement between trusted crowd workers, or where the FNV amount was estimated to be >40 %, were sent to a registered dietitian. Crowdsourcing was able to evaluate 3199, leaving only 367 to be reviewed by the registered dietitian. Overall, 7 % of items were categorized as healthy. The healthiest category was soups (26 % healthy), while desserts were the least healthy (2 % healthy).ConclusionsAn algorithm incorporating crowdsourcing and a dietitian can quickly and efficiently analyse restaurant menus, allowing public health researchers to analyse the healthiness of menu items.

Download Full-text

A study of quality prediction for large-scale open source software projects

Artificial Intelligence Research ◽

10.5430/air.v10n1p34 ◽

2021 ◽

Vol 10 (1) ◽

pp. 34

Author(s):

Shinji Akatsu ◽

Ayako Masuda ◽

Tsuyoshi Shida ◽

Kazuhiko Tsuda

Keyword(s):

Decision Making ◽

Open Source ◽

Open Source Software ◽

Large Scale ◽

Quality Analysis ◽

Quality Prediction ◽

Software Projects ◽

Resolution Rate ◽

The Status

Open source software (OSS) has seen remarkable progress in recent years. Moreover, OSS usage in corporate information systems has been increasing steadily; consequently, the overall impact of OSS on the society is increasing as well. While product quality of enterprise software is assured by the provider, the deliverables of an OSS are developed by the OSS developer community; therefore, their quality is not guaranteed. Thus, the objective of this study is to build an artificial-intelligence-based quality prediction model that corporate businesses could use for decision-making to determine whether a desired OSS should be adopted. We define the quality of an OSS as “the resolution rate of issues processed by OSS developers as well as the promptness and continuity of doing so.” We selected 44 large-scale OSS projects from GitHub for our quality analysis. First, we investigated the monthly changes in the status of issue creation and resolution for each project. It was found that there are three different patterns in the increase of issue creation, and three patterns in the relationship between the increase in issue creation and that of resolution. It was confirmed that there are multiple cases of each pattern that affect the final resolution rate. Next, we investigated the correlation between the final resolution rate and that for a relevant number of months after issue creation. We deduced that the correlation coefficient even between the resolution rate in the first month and the final rate exceeded 0.5. Based on these analysis results, we conclude that the issue resolution rate in the first month once an issue is created is applicable as knowledge for knowledge-based AI systems that can be used to assist in decision-making regarding OSS adoption in business projects.

Download Full-text