Sparsifying to optimize over multiple information sources: an augmented Gaussian process based algorithm

Author(s):  
Antonio Candelieri ◽  
Francesco Archetti

Abstract: Optimizing a black-box, expensive, and multi-extremal function, given multiple approximations, is a challenging task known as multi-information source optimization (MISO), where each source has a different cost and the level of approximation (aka fidelity) of each source can change over the search space. While most current approaches fuse the Gaussian processes (GPs) modelling each source, we propose to use GP sparsification to select only “reliable” function evaluations performed over all the sources. These selected evaluations are used to create an augmented Gaussian process (AGP), so named because the evaluations on the most expensive source are augmented with the reliable evaluations over less expensive sources. A new acquisition function, based on a confidence bound, is also proposed; it accounts for both the cost of the next source to query and the location-dependent approximation of that source. This approximation is estimated through a model discrepancy measure and the prediction uncertainty of the GPs. MISO-AGP and its MISO-fused GP counterpart are compared on two test problems and on hyperparameter optimization of a machine learning classifier on a large dataset.
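
The abstract does not give the acquisition function itself, so the following is only a minimal sketch of how a confidence-bound score could combine query cost and location-dependent discrepancy. It assumes a minimisation problem; the formula, the function names (`cost_aware_acquisition`, `choose_query`), the dictionary keys, and the default `beta` are all illustrative assumptions, not the authors' definition.

```python
import math


def cost_aware_acquisition(mu_agp, sigma_agp, mu_source, cost, best_so_far, beta=4.0):
    """Score one (location, source) pair for a minimisation problem.

    Assumed form (not taken from the paper): the optimistic improvement of
    the AGP lower confidence bound over the best value seen so far, divided
    by the query cost inflated by the AGP-vs-source model discrepancy.
    """
    lower_bound = mu_agp - math.sqrt(beta) * sigma_agp
    improvement = best_so_far - lower_bound
    discrepancy = abs(mu_agp - mu_source)
    return improvement / (cost * (1.0 + discrepancy))


def choose_query(candidates, best_so_far, beta=4.0):
    """Return the candidate dict with the highest score.

    Each candidate is a dict with the hypothetical keys
    x, source, mu_agp, sigma_agp, mu_source, cost.
    """
    return max(
        candidates,
        key=lambda c: cost_aware_acquisition(
            c["mu_agp"], c["sigma_agp"], c["mu_source"], c["cost"], best_so_far, beta
        ),
    )
```

Under this assumed form, a cheap source is preferred at a location unless its model disagrees strongly with the AGP there, which is one way to read "location-dependent approximation" in the abstract.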

2021 ◽  
Author(s):  
Antonio Candelieri ◽  
Riccardo Perego ◽  
Francesco Archetti

Abstract: Searching for accurate machine and deep learning models is a computationally expensive and highly energy-intensive process. A strategy that has recently been gaining importance for drastically reducing computation time and energy consumption is to exploit different information sources with different computational costs and different “fidelity,” typically smaller portions of a large dataset. This multi-source optimization strategy fits into the scheme of Gaussian Process-based Bayesian Optimization. An Augmented Gaussian Process method exploiting multiple information sources (namely, AGP-MISO) is proposed. The Augmented Gaussian Process is trained using only “reliable” information among the available sources. A novel acquisition function is defined according to the Augmented Gaussian Process. Computational results are reported for the optimization of the hyperparameters of a Support Vector Machine (SVM) classifier using two sources: a large dataset (the most expensive one) and a smaller portion of it. The method is also compared with a traditional Bayesian Optimization approach that optimizes the SVM hyperparameters on the large dataset only.
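
The idea of training only on “reliable” information can be sketched as a simple filter. The abstract does not state the reliability rule, so the one below is an assumption: a cheap-source evaluation is kept when it lies within `m` predictive standard deviations of the expensive-source model. The names `select_reliable`, `augment`, `predict_expensive`, and the threshold `m` are hypothetical.

```python
def select_reliable(cheap_evals, predict_expensive, m=1.0):
    """Keep cheap-source evaluations (x, y) that agree with the expensive-source model.

    predict_expensive(x) must return (mean, std) from a model fitted on the
    expensive source only. The test |mean - y| < m * std is an assumed rule.
    """
    reliable = []
    for x, y in cheap_evals:
        mean, std = predict_expensive(x)
        if abs(mean - y) < m * std:
            reliable.append((x, y))
    return reliable


def augment(expensive_evals, cheap_evals, predict_expensive, m=1.0):
    """Build the training set: all expensive evaluations plus the reliable cheap ones."""
    return list(expensive_evals) + select_reliable(cheap_evals, predict_expensive, m)
```

In the two-source SVM setting described above, `expensive_evals` would hold results on the full dataset and `cheap_evals` results on the smaller portion; the augmented set is what a single GP would then be fitted on.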


2021 ◽  
Author(s):  
Bo Shen ◽  
Raghav Gnanasambandam ◽  
Rongxuan Wang ◽  
Zhenyu Kong

In many scientific and engineering applications, Bayesian optimization (BO) is a powerful tool for tasks such as hyperparameter tuning of machine learning models and materials design and discovery. BO guides the choice of experiments sequentially to find a good combination of design points in as few experiments as possible. It can be formulated as the problem of optimizing a “black-box” function. Unlike single-task Bayesian optimization, multi-task Bayesian optimization is a general method for efficiently optimizing multiple different but correlated “black-box” functions. Previous multi-task Bayesian optimization algorithms query a point to be evaluated for all tasks in each round of search, which is inefficient. When different tasks are correlated, it is not necessary to evaluate all tasks at a given query point. Therefore, the objective of this work is to develop an algorithm for multi-task Bayesian optimization with automatic task selection, so that only one task evaluation is needed per query round. Specifically, a new algorithm, multi-task Gaussian process upper confidence bound (MT-GPUCB), is proposed to achieve this objective. MT-GPUCB is a two-step algorithm: the first step chooses which query point to evaluate, and the second step automatically selects the most informative task to evaluate. Under the bandit setting, a theoretical analysis shows that the proposed MT-GPUCB is no-regret under some mild conditions. The proposed algorithm is verified experimentally on a range of synthetic functions as well as real-world problems. The results clearly show the advantages of the query strategy for both design point and task.
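
The two-step structure can be illustrated with a small sketch. The abstract does not specify either criterion, so both are assumptions here: step 1 picks the candidate with the largest upper confidence bound summed over tasks, and step 2 picks the task with the largest posterior standard deviation at that candidate. The function name `select_query_and_task` and the `posteriors` layout are hypothetical.

```python
import math


def select_query_and_task(posteriors, beta=4.0):
    """Two-step selection: first a query point, then a single task to evaluate.

    posteriors[task][i] = (mean, std) of that task's GP at candidate i.
    Returns (candidate_index, task).
    """
    tasks = list(posteriors)
    n_candidates = len(posteriors[tasks[0]])
    root_beta = math.sqrt(beta)

    def summed_ucb(i):
        # Assumed step-1 criterion: UCB summed over all tasks.
        return sum(posteriors[t][i][0] + root_beta * posteriors[t][i][1] for t in tasks)

    best_i = max(range(n_candidates), key=summed_ucb)
    # Assumed step-2 criterion: the task that is most uncertain at the chosen point.
    best_task = max(tasks, key=lambda t: posteriors[t][best_i][1])
    return best_i, best_task
```

Only one task is evaluated per round, which is the property the abstract emphasises; the specific scores used to get there may differ from those in MT-GPUCB.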


2019 ◽  
Vol 66 ◽  
pp. 151-196 ◽  
Author(s):  
Kirthevasan Kandasamy ◽  
Gautam Dasarathy ◽  
Junier Oliva ◽  
Jeff Schneider ◽  
Barnabás Póczos

In many scientific and engineering applications, we are tasked with the maximisation of an expensive to evaluate black box function f. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to f may be obtainable. For example, the expensive real world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate low function value regions cheaply and use the expensive evaluations of f in a small but promising region and speedily identify the optimum. We formalise this task as a multi-fidelity bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour and achieves better bounds on the regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
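
A single round of a multi-fidelity upper-confidence-bound rule can be sketched as follows. This is written in the spirit of the description above, not as the authors' implementation: the bias bounds `zetas`, the uncertainty thresholds `gammas`, the default `beta`, and the function name `mf_gp_ucb_step` are assumptions for illustration.

```python
import math


def mf_gp_ucb_step(posteriors, zetas, gammas, beta=4.0):
    """Pick the next query point and the fidelity at which to evaluate it.

    posteriors[m][i] = (mean, std) of fidelity m at candidate i; the last
    fidelity is the expensive target function.
    zetas[m]: assumed bound on how far fidelity m can be from the target
    (0 for the target itself).
    gammas[m]: uncertainty threshold for each fidelity below the target.
    """
    n_fidelities = len(posteriors)
    n_candidates = len(posteriors[0])
    root_beta = math.sqrt(beta)

    def combined_ucb(i):
        # Each fidelity gives an upper bound on the target; keep the tightest.
        return min(
            posteriors[m][i][0] + root_beta * posteriors[m][i][1] + zetas[m]
            for m in range(n_fidelities)
        )

    x = max(range(n_candidates), key=combined_ucb)
    # Use the cheapest fidelity that is still uncertain at x; otherwise
    # fall through to the expensive target.
    for m in range(n_fidelities - 1):
        if root_beta * posteriors[m][x][1] >= gammas[m]:
            return x, m
    return x, n_fidelities - 1
```

With this rule, cheap approximations are queried while they are still informative at the chosen point, and the expensive function is reserved for regions the cheap ones have already narrowed down, matching the behaviour the abstract describes.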


Author(s):  
Katharina Kreffter ◽  
Simon Götz ◽  
Stefanie Lisak-Wahl ◽  
Thuy Ha Nguyen ◽  
Nico Dragano ◽  
...  

Abstract Aim: Practicing physicians have a special position as disseminators of community-based prevention for children. However, it is unclear to what extent physicians inform parents about programs. The study investigated: To what extent do physicians disseminate information about community-based prevention for children aged 0–7? Do differences exist by families’ socioeconomic position (SEP) and immigrant background? Subject and methods: We conducted a retrospective cohort study in a German school entrance examination. Parents were invited to participate in a survey on community-based prevention that asked about their awareness of programs and their information sources. SEP was measured by parental education, immigrant background by country of birth. For nine service types, we counted how often parents named physicians and other professional groups as information sources. To estimate social differences, we calculated adjusted odds ratios (OR) with 95% confidence intervals (CI). Results: Survey participants included 6480 parents (response 65.49%). Compared to other information sources, physicians were mentioned less frequently. For example, regarding language therapy, 31.2% of parents were informed by healthcare/social services, and 4.4% by physicians. Lower educated parents were less frequently informed by physicians about counseling services (OR 0.58; 95% CI 0.46–0.73) compared to higher educated parents. Parents with immigrant background were informed less often about parenting skills courses (OR 0.79; 95% CI 0.70–0.90) compared to parents without immigrant background, but more often about language therapy (OR 1.47; 95% CI 1.13–1.91). No further social differences were observed. Conclusion: The role of physicians as disseminators of community-based prevention could be expanded. They should promote parenting skills courses in a socially sensitive way.


2021 ◽  
pp. 096100062199280
Author(s):  
Nafiz Zaman Shuva

This study explores the employment-related information seeking behaviour of Bangladeshi immigrants in Canada. Using a mixed-methods approach, the study conducted semi-structured interviews with 60 Bangladeshi immigrants in Ontario, Canada, and obtained 205 survey responses. The study highlights the centrality of employment-related settlement among Bangladeshi immigrants in Ontario and reports that many immigrants are unable to utilize their education and skills after arrival in Canada. The results show that Bangladeshi immigrants utilize various information sources for their employment in Canada, including friends and professional colleagues, online searches, and settlement agencies. Although Bangladeshi immigrants utilized a large array of information sources to meet their employment-related information needs, many interview participants emphasized that the employment-related benefits they received were due to their access to friends and professional colleagues in Canada. The survey results echoed the interview findings. The cross-tabulation results on post-arrival information sources and occupational status, as well as on first job information sources and occupational status in Canada, show a significant association between the use of the information source “friends and professional colleagues in Canada” and immigrants’ occupational status. The study highlights the benefits of professional colleagues among immigrants in employment-related settlement contexts. It also reports the challenges many immigrant professionals face in employment-related settlement because they lack access to professional friends and colleagues in Canada.
The author urges the Federal Government of Canada, provincial governments, and settlement agencies working with newcomers to offer services that would connect highly skilled immigrants with their professional networks in Canada, in order to get proper guidance related to obtaining a professional job or alternative career. The author calls for further studies on employment-related information seeking by immigrants to better understand the role information plays in their settlement in a new country.


2012 ◽  
Vol 22 (2) ◽  
pp. 125-131 ◽  
Author(s):  
Niko Jelušić ◽  
Mario Anžek ◽  
Božidar Ivanković

Advanced automatic traffic control systems and various other ITS (Intelligent Transport Systems) applications and services rely on real-time information from the traffic system. This paper presents an overview of the general functions of different information sources that provide real-time information which is, or could be, used in ITS. The objective is to formally define the quality of information sources suitable for ITS based on formal models of the traffic system and information sources. The definition of quality encompasses these essential factors: traffic system information that exists or may be requested, user requirements, and attributes that describe the information sources. This provides a framework and guidelines for evaluating information sources that account for the relevant factors influencing their selection for specific ITS applications. KEY WORDS: information source, information source quality, Intelligent Transport Systems (ITS), automatic traffic control


1990 ◽  
Vol 12 (1) ◽  
pp. 56-65 ◽  
Author(s):  
Vicki Ebbeck

This study examined the sources of information used by adult exercisers to judge performance. Of particular interest was the investigation of gender differences. Subjects, 271 adults (174 males, 97 females) who were enrolled in a university weight training program, completed a questionnaire designed to evaluate the importance of 12 information sources in judging weight training performance: instructor feedback, student feedback, student comparison, changes noticed outside the gym, personal attraction toward the activity, degree of perceived effort exerted in the workout, performance in workout, feedback from others not in the class, goal setting, muscle development, workout improvement over time, and ease in learning new skills. Results revealed a significant discriminant function analysis for gender, with six information sources entering the stepwise procedure: goal setting, student feedback, learning, effort, improvement, and changes noticed outside the gym differentiated the gender groups. Males relied more than females on student feedback as an information source to judge performance. Alternatively, females used effort, goal setting, improvement, and learning as information sources more than males.


Author(s):  
George H. Cheng ◽  
Adel Younis ◽  
Kambiz Haji Hajikolaei ◽  
G. Gary Wang

Mode Pursuing Sampling (MPS) was developed as a global optimization algorithm for optimization problems involving expensive black box functions. MPS has been found to be effective and efficient for problems of low dimensionality, i.e., the number of design variables is less than ten. A previous conference publication integrated the concept of trust regions into the MPS framework to create a new algorithm, TRMPS, which dramatically improved performance and efficiency for high dimensional problems. However, although TRMPS performed better than MPS, it was unproven against other established algorithms such as GA. This paper introduces an improved algorithm, TRMPS2, which incorporates guided sampling and low function value criterion to further improve algorithm performance for high dimensional problems. TRMPS2 is benchmarked against MPS and GA using a suite of test problems. The results show that TRMPS2 performs better than MPS and GA on average for high dimensional, expensive, and black box (HEB) problems.

