multiple queries Latest Research Papers

In differentially private data analysis, a task usually requires multiple queries to complete, so the total privacy budget must be divided and a share allocated to each query. At present, however, budget allocation in differential privacy lacks efficient and general strategies, and most research adopts an average or exclusive allocation method. In this paper, we propose two series strategies for budget allocation: the geometric series and the Taylor series. We show the different characteristics of the two series and provide a calculation method for selecting the key parameters. To better reflect a user’s preference for noise during the allocation, we explore the relationship between sensitivity and noise in detail and, based on this, propose an optimization for the series strategies. Finally, to prevent collusion attacks and improve security, we provide three ideas for protecting the budget sequence. Both the theoretical analysis and the experimental results show that our methods can support more queries and achieve higher utility. The series allocation strategies are therefore flexible enough to meet the user’s needs and can be applied to differentially private algorithms to achieve high performance while maintaining security.
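
As a rough illustration of what a geometric-series allocation could look like, the sketch below gives the i-th query a share of the total budget that shrinks by a constant ratio, so the shares never sum to more than the total regardless of how many queries are issued. The function names, the `ratio` parameter, and the Laplace noise scale of sensitivity divided by budget are assumptions made for illustration; the abstract does not give the paper's exact parameterization.

```python
def geometric_budgets(total_epsilon, ratio, n_queries):
    """Return per-query budgets eps_i = total * (1 - ratio) * ratio**i.

    Assumed form, not taken from the paper: the infinite series sums to
    total_epsilon, so any finite prefix stays within the total budget.
    """
    if not 0 < ratio < 1:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return [total_epsilon * (1 - ratio) * ratio**i for i in range(n_queries)]


def laplace_scale(sensitivity, epsilon):
    """Scale of Laplace noise for a query with the given sensitivity and budget."""
    return sensitivity / epsilon


budgets = geometric_budgets(total_epsilon=1.0, ratio=0.5, n_queries=4)
scales = [laplace_scale(1.0, eps) for eps in budgets]
```

With a ratio of 0.5, each later query receives half the budget of the previous one, and therefore twice the noise scale. This is the trade-off a user would tune when choosing the ratio.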

Histogram Publication over Numerical Values under Local Differential Privacy

Wireless Communications and Mobile Computing ◽ 10.1155/2021/8886255 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Xu Zheng ◽ Ke Yan ◽ Jingyuan Duan ◽ Wenyi Tang ◽ Ling Tian

Keyword(s): Privacy Preservation ◽ Differential Privacy ◽ Frequency Estimation ◽ Mean Value ◽ Distributed Data ◽ Distributed Environment ◽ Real World Data ◽ Multiple Queries ◽ Optimal Resource ◽ Value Estimation

Local differential privacy has been considered the standard measure of privacy preservation in distributed data collection. Corresponding mechanisms have been designed for multiple types of tasks, such as frequency estimation for categorical values and mean value estimation for numerical values. However, histogram publication of numerical values, which contains abundant and crucial clues about the whole dataset, has not been thoroughly considered under this measure. Simply encoding data into different intervals for each query would soon exhaust the bandwidth and the privacy budgets, which is infeasible in real scenarios. Therefore, this paper proposes a highly efficient framework for differentially private histogram publication of numerical values in a distributed environment. The proposed algorithms can efficiently exploit the correlations among multiple queries and achieve optimal resource consumption. We also conduct extensive experiments on real-world data traces, and the results validate the improvement offered by the proposed algorithms.
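
For context, the naive baseline the abstract argues against (encoding each value into an interval and perturbing it locally for a single query) can be sketched with k-ary randomized response. This is a standard local differential privacy mechanism used here as an assumed stand-in, not the framework proposed in the paper. All function names are hypothetical.

```python
import math
import random


def bin_index(value, low, high, k):
    """Map a numerical value in [low, high] to one of k equal-width bins."""
    position = (value - low) / (high - low)
    return min(int(position * k), k - 1)  # value == high falls in the last bin


def keep_probability(epsilon, k):
    """Probability of reporting the true bin under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def perturb(true_bin, epsilon, k, rng):
    """Report the true bin with probability p, otherwise a uniform other bin."""
    if rng.random() < keep_probability(epsilon, k):
        return true_bin
    other = rng.randrange(k - 1)
    return other if other < true_bin else other + 1


def estimate_histogram(reports, epsilon, k):
    """Debias the observed bin frequencies into unbiased frequency estimates."""
    n = len(reports)
    p = keep_probability(epsilon, k)
    q = (1 - p) / (k - 1)  # probability of reporting any specific wrong bin
    counts = [0] * k
    for report in reports:
        counts[report] += 1
    return [(count / n - q) / (p - q) for count in counts]
```

Each user spends the full epsilon on one report here, so answering a second histogram query with different intervals would require a fresh report and additional budget. That is the cost the abstract describes as infeasible.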

Directing and Combining Multiple Queries for Exploratory Search by Visual Interactive Intent Modeling

10.1007/978-3-030-85613-7_34 ◽ 2021 ◽ pp. 514-535

Author(s): Jonathan Strahl ◽ Jaakko Peltonen ◽ Patrik Floréen

Keyword(s): Exploratory Search ◽ Multiple Queries ◽ Intent Modeling

ProQuest African American Heritage

The Charleston Advisor ◽ 10.5260/chara.22.3.39 ◽ 2021 ◽ Vol 22 (3) ◽ pp. 39-42

Author(s): Thomas J. Beck

Keyword(s): African American ◽ Family History ◽ School Size ◽ African American Family ◽ American Family ◽ Range Of Movement ◽ Africana Studies ◽ Multiple Queries ◽ American Heritage ◽ The U.S

African American Heritage is a database for African American family history research, provided by ProQuest. Here, the user has access to a wide variety of military, birth, marriage, cohabitation, death, and census records. Also included are records from the Freedman’s Bank and various registers of slaves and free(d) persons of color. The former was a bank chartered by the federal government to encourage and guide the economic development of African American communities in the period following the end of slavery in the U.S. The latter refers to records, maintained by a number of states prior to 1865, of slaves and free(d) persons of color. Also available to the user are contacts to a community of genealogy researchers, who can provide assistance and mentoring. The readability of the documents available here can vary. Some are too faded to read easily, even with magnification, and others are handwritten, which can make them difficult to interpret. Navigating, enlarging, and reducing documents can be done without difficulty, though the range of movement and magnification is somewhat limited. Documents can be browsed and/or searched for by title, author, publisher, date, subject, language (although, at present, English is the only language available), surname and personal name, and location. The search and browse options here are understandable and can produce useful results, though the number of results produced by any one query is usually not extensive, so multiple queries may be needed for any research project. Pricing for this database is determined by library or school size and the number of potential users, and consortia discounts are available (contact ProQuest for a specific price quote). Its licensing agreement is the same as those used for all ProQuest databases, and in its length and composition is quite average. The quality and quantity of content in this resource are not exceptional, but it will certainly be of use to those researching African American family history, and more generally Africana Studies, especially in the states indicated in this review.

DAF: An adaptive computing framework for multimedia data streams analysis

Intelligent Data Analysis ◽ 10.3233/ida-194640 ◽ 2020 ◽ Vol 24 (6) ◽ pp. 1441-1453

Author(s): Jun Li ◽ Chao Li ◽ Bin Tian ◽ Yanzhao Liu ◽ Chengxiang Si

Keyword(s): Large Scale ◽ Multimedia Data ◽ Analytic Hierarchy ◽ Optimal Sequence ◽ Multiple Queries ◽ Multimedia Stream ◽ Multimedia Streams ◽ Forecasting Method ◽ Significant Performance ◽ Hierarchy Process

We consider the problem of efficiently computing, filtering, or analyzing multimedia streams online. In this scenario, we register a large number of continuous analysis queries to filter pornographic stream items. Each query is a conjunction of filters. For instance, the query “does this image contain people basking on the beach?” can be resolved by applying the conjunction of water, people, sand, and sea filters successively to the stream item. However, the online evaluation of multimedia filters is very expensive. Fortunately, multiple filters are usually shared among many queries; in other words, each filter may occur in multiple queries. An open problem in such a filtering scenario is how to order the filters in an optimal sequence to achieve significant performance. Existing methods are based on a greedy strategy that orders the filters according to three factors (selectivity, popularity, cost). Although all these methods achieve good results, some problems have not yet been addressed. First, the selectivity factor is set empirically, so it cannot adapt to the multimedia stream. Second, the proportion relationships among the three factors (selectivity, cost, popularity) have not been thoroughly explored. Based on these observations, in this paper we propose a Dynamic-Analytic hierarchy process Framework (DAF), which uses a time-based compositional forecasting method, built on the idea of exponential smoothing, to handle the dynamics of the factors’ proportion relationships. Experiments on both synthetic and real-life multimedia streams demonstrate that the proposed framework (DAF) provides much greater adaptability in modeling how the factors’ proportion relationships change in a multimedia stream environment.
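
To make the ordering problem concrete, the following minimal sketch orders the filters of a single conjunctive query by the classic rank cost / (1 - pass rate), and updates each pass-rate estimate with simple exponential smoothing. It is not the DAF method: it assumes independent filters, ignores the popularity factor, and uses hypothetical names and numbers throughout.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    name: str
    cost: float        # assumed cost of evaluating the filter on one item
    pass_rate: float   # smoothed estimate of the fraction of items that pass


def smooth(previous, observed, alpha=0.3):
    """Exponential smoothing: blend the newest observation into the estimate."""
    return alpha * observed + (1 - alpha) * previous


def order_filters(filters):
    """Greedy order: cheap filters that reject many items go first."""
    return sorted(filters, key=lambda f: f.cost / max(1.0 - f.pass_rate, 1e-9))


def expected_cost(ordered):
    """Expected per-item cost when evaluation stops at the first failed filter."""
    total, reach = 0.0, 1.0
    for f in ordered:
        total += reach * f.cost   # only items that passed earlier filters arrive
        reach *= f.pass_rate
    return total


filters = [
    FilterStats("water", cost=4.0, pass_rate=0.5),
    FilterStats("people", cost=10.0, pass_rate=0.2),
    FilterStats("sand", cost=1.0, pass_rate=0.9),
]
ordered = order_filters(filters)
```

Because the pass rates are re-estimated as the stream evolves, the order can be recomputed periodically instead of being fixed empirically, which is the kind of adaptivity the abstract asks for.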

Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink

Complexity ◽ 10.1155/2020/6617149 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-25

Author(s): Xiao-Yan Gao ◽ Radhya Sahal ◽ Gui-Xiu Chen ◽ Mohammed H. Khafagy ◽ Fatma A. Omara

Keyword(s): Big Data ◽ Execution Time ◽ Large Scale ◽ Query Execution ◽ Multiple Queries ◽ Intermediate Data ◽ Large Scale Data ◽ Join Queries ◽ Multiquery Optimization ◽ Data Granularity

Multiway join queries incur high-cost I/O operations over large-scale data. Exploiting opportunities to share joins among multiple multiway joins can reduce query execution time and the volume of shuffled intermediate data. Although multiway join optimization has been carried out in MapReduce, platforms with different design principles (i.e., in-memory Big Data platforms, Flink) have not been considered. To bridge this gap, an end-to-end multiway join system over Flink, called the Join-MOTH system (J-MOTH), is proposed to exploit sharing data granularity, sharing join granularity, and sharing implicit sorts within multiple join queries. For sharing data, our previous work, the Multiquery Optimization using Tuple Size and Histogram (MOTH) system, was introduced to consider the granularity of data-sharing opportunities among multiple queries. For sharing sorts, our previous work, the Sort-Based Optimizer for Big Data Multiquery (SOOM), was introduced to consider the implicit sorts among join queries. For sharing joins, additional modules have been tailored to the J-MOTH optimizer to optimize shared work by exploiting shared pipelined multiway joins among multiple multiway join queries. The experimental evaluation demonstrates that the J-MOTH system outperforms the naive and the state-of-the-art techniques by 44% in query execution time using TPC-H queries. The proposed J-MOTH system also reduces the maximal intermediate data size by 30% on average over Hadoop-like infrastructures.
