query patterns

Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4035
Author(s):  
Teresa Zawadzka ◽  
Tomasz Wierciński ◽  
Grzegorz Meller ◽  
Mateusz Rock ◽  
Robert Zwierzycki ◽  
...  

Data reusability is an important feature of current research in nearly every field of science. Modern research in Affective Computing often relies on datasets containing experiments-originated data such as biosignals, video clips, or images. Moreover, conducting experiments with a vast number of participants to build datasets for Affective Computing research is time-consuming and expensive. Therefore, it is extremely important to provide solutions allowing one to (re)use data from a variety of sources, which usually demands data integration. This paper presents the Graph Representation Integrating Signals for Emotion Recognition and Analysis (GRISERA) framework, which provides a persistent model for storing integrated signals and methods for its creation. To the best of our knowledge, this is the first approach in the Affective Computing field that addresses the problem of integrating data from multiple experiments, storing it in a consistent way, and providing query patterns for data retrieval. The proposed framework is based on the standardized graph model, which is known to be highly suitable for signal processing purposes. The validation proved that data from the well-known AMIGOS dataset can be stored in the GRISERA framework and later retrieved for training deep learning models. Furthermore, the second case study proved that it is possible to integrate signals from multiple sources (AMIGOS, ASCERTAIN, and DEAP) into GRISERA and retrieve them for further statistical analysis.


2021 ◽  
Vol 55 (1) ◽  
pp. 21-37
Author(s):  
Daniel Mawhirter ◽  
Sam Reinehr ◽  
Connor Holmes ◽  
Tongping Liu ◽  
Bo Wu

Subgraph matching is a fundamental task in many applications which identifies all the embeddings of a query pattern in an input graph. Compilation-based subgraph matching systems generate specialized implementations for the provided patterns and often substantially outperform other systems. However, the generated code causes significant computation redundancy and the compilation process incurs too much overhead to be used online, both due to the inherent symmetry in the structure of the query pattern. In this paper, we propose an optimizing query compiler, named GraphZero, to completely address these limitations through symmetry breaking based on group theory. GraphZero implements three novel techniques. First, its schedule explorer efficiently prunes the schedule space without missing any high-performance schedule. Second, it automatically generates and enforces a set of restrictions to eliminate computation redundancy. Third, it generalizes orientation, a surprisingly effective optimization that was only used for clique patterns, to apply to arbitrary patterns. Evaluation on multiple query patterns shows that GraphZero outperforms two state-of-the-art compilation and non-compilation based systems by up to 40X and 2654X, respectively.
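The redundancy caused by pattern symmetry can be illustrated on the simplest symmetric pattern, the triangle. The sketch below is not GraphZero's generated code; it is a minimal hand-written example, with hypothetical function names, showing how an ordering restriction on matched vertices (here a < b < c) removes the duplicate embeddings that an unrestricted search reports.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_naive(adj):
    """Enumerate every ordered embedding (a, b, c) of the triangle pattern.

    Each triangle is reported 6 times, once per automorphism of the pattern.
    """
    found = []
    for a in adj:
        for b in adj[a]:
            for c in adj[b]:
                if c != a and c in adj[a]:
                    found.append((a, b, c))
    return found


def triangles_restricted(adj):
    """Same search with the restriction a < b < c enforced during matching.

    The restriction breaks the symmetry, so each triangle is reported once.
    """
    found = []
    for a in adj:
        for b in adj[a]:
            if b <= a:
                continue
            for c in adj[b]:
                if c > b and c in adj[a]:
                    found.append((a, b, c))
    return found


# Hypothetical input graph: one triangle {0, 1, 2} plus a pendant edge.
demo_adj = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
naive = triangles_naive(demo_adj)
restricted = triangles_restricted(demo_adj)
```

On this toy graph the unrestricted search reports the single triangle six times, while the restricted search reports it once. The restriction also prunes work early, because the inner loops are skipped as soon as an ordering constraint fails.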


Author(s):  
Ali Davoudian ◽  
Liu Chen ◽  
Hongwei Tu ◽  
Mengchi Liu

Abstract: Streaming graph partitioning methods have recently gained attention due to their ability to scale to very large graphs with limited resources. However, many such methods do not consider workload and graph characteristics. This may degrade the performance of queries by increasing inter-node communication and computational load imbalance. Moreover, existing workload-aware methods cannot consistently provide good performance as they do not consider dynamic workloads that keep emerging in graph applications. We address these issues by proposing a novel workload-adaptive streaming partitioner named WASP that aims to achieve low-latency and high-throughput online graph queries. As each workload typically contains frequent query patterns, WASP exploits the existing workload to capture active vertices and edges which are frequently visited and traversed, respectively. This information is used to heuristically improve the quality of partitions either by avoiding the concentration of active vertices in a few partitions proportional to their visit frequencies or by reducing the probability of the cut of active edges proportional to their traversal frequencies. In order to assess the impact of WASP on a graph store and to show how easily the approach can be plugged on top of the system, we exploit it in a distributed graph-based RDF store. Our experiments over three synthetic and real-world graph datasets and the corresponding static and dynamic query workloads show that WASP achieves better query performance than state-of-the-art graph partitioners, especially in dynamic query workloads.


2021 ◽  
Vol 12 ◽  
Author(s):  
Hong-Zhang Yu ◽  
Tian Fu ◽  
Jia-Nan Zhou ◽  
Ping Ke ◽  
Yun-Xia Wang

Background: In China, we have seen dramatic increases in public concern over depression and mental health after the suicide of some famous persons. The objective of this study is to investigate the changes of search-engine query patterns to monitor this phenomenon based on the tragic suicide of a young Chinese pop star, Kimi Qiao. Methods: The daily search volume for depression was retrieved from both the Baidu Index (BDI) and the Sina MicroBlog Index (SMI). In addition, the daily BDI for suicide, schizophrenia, obsessive-compulsive disorder, common cold, stomach cancer, and liver cancer were collected for comparison. According to the time of Qiao's suicide, all data were divided into two periods (Period One from 1 September 2015 to 31 August 2016 and Period Two from 1 October 2016 to 30 September 2017). The paired t-test was used to compare the differences in search volumes between the two periods. The Pearson correlation analysis was used to estimate correlations between the BDI and SMI for depression. Results: The average BDI for depression, BDI for suicide, and SMI for depression in Period Two were significantly higher than in Period One (p < 0.05). There was a strong positive correlation between the BDI and SMI for depression (r = 0.97, p < 0.001). No significant difference in BDI for the other diseases was found between the two periods. Conclusions: The changes of search-engine query patterns indicated that the celebrity's suicide may have increased netizens' concern about depression in China. The study suggests publishing more practical knowledge and advice on depression through the Internet and social media, to improve the public's mental health literacy and help people to cope with their depressive symptoms appropriately.
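The paired comparison between the two periods can be sketched as follows. This is only an illustration of the paired t statistic using the Python standard library; the numbers are invented, the pairing of observations is an assumption (the abstract does not say how days were paired), and no p-value is computed here.

```python
import math
import statistics


def paired_t_statistic(period_one, period_two):
    """Return the paired t statistic for two equal-length samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d is the per-pair difference
    (period_two minus period_one) and stdev is the sample standard deviation.
    """
    if len(period_one) != len(period_two):
        raise ValueError("paired samples must have the same length")
    diffs = [after - before for before, after in zip(period_one, period_two)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff / (sd_diff / math.sqrt(n))


# Invented search-volume values, for illustration only.
volume_period_one = [1, 2, 3, 4]
volume_period_two = [2, 4, 5, 8]
t_value = paired_t_statistic(volume_period_one, volume_period_two)
```

A positive statistic indicates that the second sample is higher on average; the statistic would then be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value.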


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ceri Binding ◽  
Claudio Gnoli ◽  
Douglas Tudhope

Purpose: The Integrative Levels Classification (ILC) is a comprehensive “freely faceted” knowledge organization system not previously expressed as SKOS (Simple Knowledge Organization System). This paper reports and reflects on work converting the ILC to SKOS representation. Design/methodology/approach: The design of the ILC representation and the various steps in the conversion to SKOS are described and located within the context of previous work considering the representation of complex classification schemes in SKOS. Various issues and trade-offs emerging from the conversion are discussed. The conversion implementation employed the STELETO transformation tool. Findings: The ILC conversion captures some of the ILC facet structure by a limited extension beyond the SKOS standard. SPARQL examples illustrate how this extension could be used to create faceted, compound descriptors when indexing or cataloguing. Basic query patterns are provided that might underpin search systems. Possible routes for reducing complexity are discussed. Originality/value: Complex classification schemes, such as the ILC, have features which are not straightforward to represent in SKOS and which extend beyond the functionality of the SKOS standard. The ILC's facet indicators are modelled as rdf:Property sub-hierarchies that accompany the SKOS RDF statements. The ILC's top-level fundamental facet relationships are modelled by extensions of the associative relationship – specialised sub-properties of skos:related. An approach for representing faceted compound descriptions in ILC and other faceted classification schemes is proposed.


Author(s):  
Samita Bai ◽  
Shakeel A. Khoja

Link traversal strategies for querying Linked Data over the WWW can retrieve up-to-date results using a recursive URI lookup process in real time. The downside of this approach appears with query patterns whose subject is unbound (e.g., ?S rdf:type :Class). Such queries fail to start the traversal process because RDF pages are subject-centric in nature. Thus, zero-knowledge link traversal leads to empty query results for these queries. In this paper, the authors analyze a large corpus of real-world SPARQL query logs and identify the Most Frequent Predicates (MFPs) occurring in these queries. The knowledge of these MFPs helps in finding and indexing a limited number of triples from the original data set. Additionally, the authors propose a Hybrid Query Execution (HQE) approach that executes the queries over this index for initial data source selection, followed by a link traversal process to fetch complete results. The evaluation of HQE on the latest real data benchmarks reveals that it retrieves at least five times more results than the existing approaches.
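The role of a predicate index for unbound-subject patterns can be sketched with a small in-memory example. This is not the authors' HQE implementation; the triples, the prefix names, and the helper functions are hypothetical, and the sketch only shows how indexing triples for a few frequent predicates yields seed subjects from which a traversal could start.

```python
from collections import defaultdict

# Hypothetical data set of (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:acme", "rdf:type", "ex:Company"),
]


def build_predicate_index(triples, frequent_predicates):
    """Index only the triples whose predicate is in the frequent set."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate in frequent_predicates:
            index[predicate].append((subject, obj))
    return index


def seed_subjects(index, predicate, obj):
    """Answer the pattern (?s, predicate, obj) from the index.

    The returned subjects are candidate starting URIs for link traversal.
    """
    return sorted(s for s, o in index.get(predicate, []) if o == obj)


# Assume rdf:type was identified as a most frequent predicate.
mfp_index = build_predicate_index(TRIPLES, {"rdf:type"})
seeds = seed_subjects(mfp_index, "rdf:type", "ex:Person")
```

Here the pattern with an unbound subject resolves to the subjects ex:alice and ex:bob, which a traversal-based engine could then dereference to collect the remaining triples.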


10.2196/19483 ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. e19483
Author(s):  
Henry C Cousins ◽  
Clara C Cousins ◽  
Alon Harris ◽  
Louis R Pasquale

Background: Timely allocation of medical resources for coronavirus disease (COVID-19) requires early detection of regional outbreaks. Internet browsing data may predict case outbreaks in local populations that are yet to be confirmed. Objective: We investigated whether search-engine query patterns can help to predict COVID-19 case rates at the state and metropolitan area levels in the United States. Methods: We used regional confirmed case data from the New York Times and Google Trends results from 50 states and 166 county-based designated market areas (DMA). We identified search terms whose activity precedes and correlates with confirmed case rates at the national level. We used univariate regression to construct a composite explanatory variable based on best-fitting search queries offset by temporal lags. We measured the raw and z-transformed Pearson correlation and root-mean-square error (RMSE) of the explanatory variable with out-of-sample case rate data at the state and DMA levels. Results: Predictions were highly correlated with confirmed case rates at the state (mean r=0.69, 95% CI 0.51-0.81; median RMSE 1.27, IQR 1.48) and DMA levels (mean r=0.51, 95% CI 0.39-0.61; median RMSE 4.38, IQR 1.80), using search data available up to 10 days prior to confirmed case rates. They fit case-rate activity in 49 of 50 states and in 103 of 166 DMA at a significance level of .05. Conclusions: Identifiable patterns in search query activity may help to predict emerging regional outbreaks of COVID-19, although they remain vulnerable to stochastic changes in search intensity.
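The idea of offsetting a search series by a temporal lag before correlating it with case rates can be sketched as below. This is an illustrative simplification, not the study's pipeline: the series are invented, the function names are hypothetical, and the sketch selects a single lag by raw Pearson correlation rather than building a composite regression variable.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(search, cases, max_lag):
    """Return (lag, r) maximising the correlation of search[t] with cases[t + lag]."""
    best = None
    for lag in range(max_lag + 1):
        shifted_cases = cases[lag:]
        n = min(len(search), len(shifted_cases))
        r = pearson_r(search[:n], shifted_cases[:n])
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Invented series: the case curve repeats the search curve two steps later.
search_activity = [1, 3, 2, 5, 4, 6, 8, 7]
case_rate = [0, 0, 1, 3, 2, 5, 4, 6, 8, 7]
lag, r = best_lag(search_activity, case_rate, max_lag=3)
```

With these made-up series the best lag is two steps, because the case curve was constructed as a delayed copy of the search curve. Real data would be noisier, which is consistent with the abstract's caveat about stochastic changes in search intensity.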


2020 ◽  
Vol 13 (10) ◽  
pp. 1696-1708
Author(s):  
Robin Rehrmann ◽  
Carsten Binnig ◽  
Alexander Böhm ◽  
Kihong Kim ◽  
Wolfgang Lehner

OLTP applications are usually executed by a high number of clients in parallel and typically face high throughput demand as well as a constrained latency requirement for individual statements. Interestingly, OLTP workloads are often read-heavy and comprise similar query patterns, which provides a potential to share work among statements belonging to different transactions. Consequently, OLAP techniques for sharing work have recently started to be applied to OLTP workloads as well. In this paper, we present an approach for merging read statements within interactively submitted multi-statement transactions consisting of reads and writes. We first define a formal framework for merging transactions running under a given isolation level and provide insights into a prototypical implementation of merging within a commercial database system. In our experimental evaluation, we show that, depending on the isolation level, the load in the system and the read-share of the workload, an improvement of the transaction throughput by up to a factor of 2.5X is possible without compromising the transactional semantics.
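The basic idea of sharing work across similar read statements can be sketched with point lookups that differ only in their key. The example below is not the paper's framework: it ignores writes and isolation levels entirely, uses SQLite from the Python standard library as a stand-in database, and the table, column, and function names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a small, invented accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [(1, 100), (2, 250), (3, 75)],
    )
    return conn


def merged_point_reads(conn, requests):
    """Serve several 'SELECT ... WHERE id = ?' reads with one merged query.

    requests is a list of (client_id, key) pairs. The distinct keys are
    fetched with a single IN query and each row is routed back to its client.
    Clients asking for a missing key receive None.
    """
    keys = sorted({key for _, key in requests})
    if not keys:
        return {}
    placeholders = ",".join("?" for _ in keys)
    rows = conn.execute(
        f"SELECT id, balance FROM accounts WHERE id IN ({placeholders})", keys
    ).fetchall()
    by_key = {row[0]: row for row in rows}
    return {client: by_key.get(key) for client, key in requests}


demo_conn = make_demo_db()
results = merged_point_reads(
    demo_conn, [("client-a", 1), ("client-b", 3), ("client-c", 1), ("client-d", 9)]
)
```

Four client requests are answered with one statement here, and the two clients asking for the same key share a single fetched row. Preserving transactional semantics when reads are interleaved with writes is the hard part that the paper's formal framework addresses and that this sketch leaves out.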


2020 ◽  
Author(s):  
Henry C Cousins ◽  
Clara C Cousins ◽  
Alon Harris ◽  
Louis R Pasquale

BACKGROUND: Timely allocation of medical resources for coronavirus disease (COVID-19) requires early detection of regional outbreaks. Internet browsing data may predict case outbreaks in local populations that are yet to be confirmed. OBJECTIVE: We investigated whether search-engine query patterns can help to predict COVID-19 case rates at the state and metropolitan area levels in the United States. METHODS: We used regional confirmed case data from the New York Times and Google Trends results from 50 states and 166 county-based designated market areas (DMA). We identified search terms whose activity precedes and correlates with confirmed case rates at the national level. We used univariate regression to construct a composite explanatory variable based on best-fitting search queries offset by temporal lags. We measured the raw and z-transformed Pearson correlation and root-mean-square error (RMSE) of the explanatory variable with out-of-sample case rate data at the state and DMA levels. RESULTS: Predictions were highly correlated with confirmed case rates at the state (mean r=0.69, 95% CI 0.51-0.81; median RMSE 1.27, IQR 1.48) and DMA levels (mean r=0.51, 95% CI 0.39-0.61; median RMSE 4.38, IQR 1.80), using search data available up to 10 days prior to confirmed case rates. They fit case-rate activity in 49 of 50 states and in 103 of 166 DMA at a significance level of .05. CONCLUSIONS: Identifiable patterns in search query activity may help to predict emerging regional outbreaks of COVID-19, although they remain vulnerable to stochastic changes in search intensity.


10.29007/z8tp ◽  
2020 ◽  
Author(s):  
Izzat Alsmadi ◽  
Zaid Almubaid ◽  
Hisham Al-Mubaid

In recent years, people have become more dependent on the Internet as their main source of information about healthcare. A number of research projects in the past few decades examined and utilized internet data for information extraction in healthcare, including disease surveillance and monitoring. In this paper, we investigate the potential of internet data, such as internet search keywords and search query patterns, in the healthcare domain for disease monitoring and detection. Specifically, we investigate search keyword patterns for disease outbreak detection. Accurate and timely prediction and detection of disease outbreaks can have a big positive impact on the entire health care system. Our method utilizes machine learning to identify interesting patterns related to a target disease outbreak from search keyword logs. We conducted experiments on the flu, which is the most searched disease in the context of this problem. We showed examples of keywords that can be good predictors of flu outbreaks. Our method showed that the correlation between search queries and keyword trends is reliable enough to be used to predict the outbreak of the disease.

