Towards large scale automated algorithm design by integrating modular benchmarking frameworks

Author(s):  
Amine Aziz-Alaoui ◽  
Carola Doerr ◽  
Johann Dreo

Author(s):
T. M. O. Franzen ◽  
N. Hurley-Walker ◽  
S. V. White ◽  
P. J. Hancock ◽  
N. Seymour ◽  
...  

Abstract We present the South Galactic Pole (SGP) data release from the GaLactic and Extragalactic All-sky Murchison Widefield Array (GLEAM) survey. These data combine both years of GLEAM observations at 72–231 MHz conducted with the Murchison Widefield Array (MWA) and cover an area of 5 113 $\mathrm{deg}^{2}$ centred on the SGP at $20^{\mathrm{h}} 40^{\mathrm{m}} < \mathrm{RA} < 05^{\mathrm{h}} 04^{\mathrm{m}}$ and $-48^{\circ} < \mathrm{Dec} < -2^{\circ}$. At 216 MHz, the typical rms noise is ${\approx}5$ mJy beam$^{-1}$ and the angular resolution ${\approx}2$ arcmin. The source catalogue contains a total of 108 851 components above $5\sigma$, of which 77% have measured spectral indices between 72 and 231 MHz. Improvements to the data reduction in this release include the use of the GLEAM Extragalactic catalogue as a sky model to calibrate the data, a more efficient and automated algorithm to deconvolve the snapshot images, and a more accurate primary beam model to correct the flux scale. This data release enables more sensitive large-scale studies of extragalactic source populations as well as spectral variability studies on a one-year timescale.
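The catalogue's spectral indices follow the standard power-law convention $S(\nu) \propto \nu^{\alpha}$. A minimal sketch of how such an index can be fitted from sub-band flux densities is shown below; the frequencies and flux densities are illustrative values, not entries from the GLEAM catalogue.

```python
import numpy as np

def spectral_index(freqs_mhz, fluxes_jy):
    """Fit a power-law spectral index alpha, where S(nu) ~ nu**alpha,
    by linear regression in log-log space over sub-band flux densities."""
    alpha, _log_s0 = np.polyfit(np.log10(freqs_mhz), np.log10(fluxes_jy), 1)
    return alpha

# Hypothetical sub-band measurements for a single source (not catalogue values).
freqs = np.array([76.0, 107.0, 151.0, 200.0, 227.0])   # MHz
fluxes = np.array([1.20, 0.95, 0.78, 0.66, 0.61])      # Jy
print(f"alpha ~ {spectral_index(freqs, fluxes):.2f}")   # negative for a typical steep-spectrum source
```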


2018 ◽  
Author(s):  
Eric A. Kaiser ◽  
Aleksandra Igdalova ◽  
Geoffrey K. Aguirre ◽  
Brett Cucchiara

Abstract Objective: To identify migraineurs and headache-free individuals with an online questionnaire and automated analysis algorithm. Methods: We created a branching-logic, web-based questionnaire, the Penn Online Evaluation of Migraine (POEM), to obtain a standardized headache history from a previously studied cohort. Responses were analyzed with an automated algorithm to assign subjects to one of several categories based on ICHD-3 (beta) criteria. Following a pre-registered protocol, this result was compared to prior diagnostic classification by a neurologist following a direct interview. Results: Of 118 subjects contacted, 90 (76%) completed the questionnaire; of these, 31 were headache-free, 29 had migraine without aura (MwoA), and 30 had migraine with aura (MwA). Mean age was 41 ± 6 years and 76% were female. There were no significant demographic differences between groups. The median time to complete the questionnaire was 2.5 minutes. Sensitivity of the POEM tool was 42%, 59%, and 70%, and specificity was 100%, 84%, and 94% for headache-free, MwoA, and MwA, respectively. Sensitivity and specificity of the POEM tool for migraine overall (with or without aura) were 83% and 90%, respectively. Conclusions: The POEM web-based questionnaire, and associated analysis routines, identifies headache-free and migraine subjects with good specificity. It may be useful for classifying subjects for large-scale research studies. Trial Registration: https://osf.io/sq9ef
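The sensitivity and specificity figures above are per-category, one-vs-rest quantities comparing the algorithm's assignment against the neurologist's classification. A minimal sketch of that calculation is below; the labels and example data are hypothetical, not the study's records.

```python
def sensitivity_specificity(reference, predicted, label):
    """One-vs-rest sensitivity and specificity for a single class,
    comparing algorithm output against a reference classification."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == label and p == label for r, p in pairs)
    fn = sum(r == label and p != label for r, p in pairs)
    tn = sum(r != label and p != label for r, p in pairs)
    fp = sum(r != label and p == label for r, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels; the study's categories were headache-free, MwoA, and MwA.
ref  = ["MwA", "MwoA", "headache-free", "MwA", "MwoA"]
pred = ["MwA", "headache-free", "headache-free", "MwA", "MwoA"]
for label in ("headache-free", "MwoA", "MwA"):
    sens, spec = sensitivity_specificity(ref, pred, label)
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```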


2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Danwen Bao ◽  
Jiayu Gu ◽  
Junhua Jia

This paper establishes a bilevel planning model with one master and multiple slaves to solve traffic evacuation problems. The minimum evacuation network saturation and the shortest evacuation time are used as the objective functions for the upper- and lower-level models, respectively. The optimality conditions of this model are also analyzed. An improved particle swarm optimization (PSO) method is proposed, which introduces an electromagnetism-like mechanism to solve the bilevel model and enhance its convergence efficiency. A case study is carried out using the Nanjing Olympic Sports Center. The results indicate that, for large-scale activities, the average evacuation time of the classic model is shorter but the road saturation distribution is more uneven, so the overall evacuation efficiency of the network is not high. For induced emergencies, the evacuation time of the bilevel planning model is shorter. When the audience arrival rate increases from 50% to 100%, the evacuation time is shortened by 22% to 35%, indicating that the bilevel planning model optimizes evacuation more effectively than the classic model. Therefore, the model and algorithm presented in this paper can provide a theoretical basis for traffic evacuation decision making for large-scale activities.
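For context, the sketch below shows the standard PSO update (each particle is pulled toward its personal best and the swarm's global best) on a toy objective. The paper's improved variant adds an electromagnetism-like mechanism on top of this basic update, which is not reproduced here; the toy objective is only a stand-in for the lower-level evacuation-time problem.

```python
import numpy as np

def pso(objective, dim, n_particles=30, n_iters=100, w=0.7, c1=1.5, c2=1.5, bounds=(-10, 10)):
    """Standard PSO: velocities blend inertia, attraction to each particle's
    personal best, and attraction to the swarm's global best."""
    rng = np.random.default_rng(0)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    pbest, pbest_val = x.copy(), np.apply_along_axis(objective, 1, x)
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.apply_along_axis(objective, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Toy quadratic stand-in for the lower-level evacuation-time objective.
best_x, best_val = pso(lambda z: np.sum(z**2), dim=3)
print(best_x, best_val)
```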


2011 ◽  
Vol 301-303 ◽  
pp. 1133-1138 ◽  
Author(s):  
Yan Xiang Fu ◽  
Wei Zhong Zhao ◽  
Hui Fang Ma

Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation, and pattern classification. The ever-growing volume of information produced by advances in technology makes clustering very large-scale data a challenging task. To deal with this problem, researchers are increasingly trying to design efficient parallel clustering algorithms. In this paper, we propose a parallel DBSCAN clustering algorithm based on Hadoop, a simple yet powerful parallel programming platform. The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
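A common way to parallelize DBSCAN, and the general idea behind MapReduce-style implementations, is to partition the data spatially, cluster each partition locally, and then merge clusters that cross partition boundaries. The sketch below illustrates only the local-clustering (map) step on a single machine, using a process pool and scikit-learn's DBSCAN; it is not the paper's Hadoop code, and a real implementation would add overlap regions of width eps to support the merge step.

```python
# Illustrative partition-then-cluster sketch, not the paper's Hadoop implementation.
from concurrent.futures import ProcessPoolExecutor

import numpy as np
from sklearn.cluster import DBSCAN


def cluster_partition(points, eps=0.3, min_samples=5):
    """Local DBSCAN on one spatial partition (the 'map' step)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    return points, labels


def partition_by_x(points, n_parts=4):
    """Split points into vertical strips; a real implementation would add
    overlap regions of width eps so boundary clusters can be merged later."""
    edges = np.quantile(points[:, 0], np.linspace(0, 1, n_parts + 1))
    return [points[(points[:, 0] >= lo) & (points[:, 0] <= hi)]
            for lo, hi in zip(edges[:-1], edges[1:])]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(c, 0.1, (200, 2)) for c in (0.0, 1.0, 2.0)])
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(cluster_partition, partition_by_x(data)))
    for i, (_, labels) in enumerate(results):
        print(f"partition {i}: {len(set(labels) - {-1})} local clusters")
```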


2013 ◽  
Vol 303-306 ◽  
pp. 2165-2169
Author(s):  
Zheng Meng ◽  
Ying Lin ◽  
Yan Kang ◽  
Qian Yu

With the development of computer technology, multi-core programming is becoming a hot issue. Based on directed acyclic graphs, this paper defines a number of executable operations and establishes a parallel programming pattern. Using vertices to represent tasks and edges to represent communication between vertices, this parallel programming pattern lets programmers easily identify the available concurrency and expose it for use in algorithm design. The proposed pattern can be used for large-scale static data batch processing in multi-core environments and brings considerable convenience when dealing with complex issues.
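A minimal sketch of the pattern described above: tasks are vertices, dependency edges say which tasks must finish first, and a task is handed to a worker pool as soon as all of its predecessors complete (Kahn-style scheduling). The task names and example graph are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_dag(tasks, edges, max_workers=4):
    """tasks: {name: callable}; edges: iterable of (u, v) meaning u must finish before v."""
    deps = {name: set() for name in tasks}
    children = {name: set() for name in tasks}
    for u, v in edges:
        deps[v].add(u)
        children[u].add(v)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Start every task with no unmet dependencies.
        running = {pool.submit(tasks[n]): n for n in tasks if not deps[n]}
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                finished = running.pop(fut)
                fut.result()  # propagate exceptions from the task
                for child in children[finished]:
                    deps[child].discard(finished)
                    if not deps[child]:
                        running[pool.submit(tasks[child])] = child


# Hypothetical example: B and C can run concurrently once A is done; D waits for both.
tasks = {n: (lambda n=n: print(f"task {n}")) for n in "ABCD"}
run_dag(tasks, [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])
```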


2020 ◽  
Vol 43 (2-3) ◽  
pp. 31-47 ◽  
Author(s):  
Stefan Bilbao ◽  
James Perry ◽  
Paul Graham ◽  
Alan Gray ◽  
Kostas Kavoussanakis ◽  
...  

Sound synthesis using physical modeling, emulating systems of a complexity approaching and even exceeding that of real-world acoustic musical instruments, is becoming possible, thanks to recent theoretical developments in musical acoustics and algorithm design. Severe practical difficulties remain, both at the level of the raw computational resources required, and at the level of user control. An approach to the first difficulty is through the use of large-scale parallelization, and results for a variety of physical modeling systems are presented here. Any progress with regard to the second difficulty requires, necessarily, the experience and advice of professional musicians. A basic interface to a parallelized large-scale physical modeling synthesis system is presented here, accompanied by first-hand descriptions of the working methods of five composers, each of whom generated complete multichannel pieces using the system.
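One standard physical modeling technique in this area is finite-difference time-domain simulation. As a toy illustration only, and not the article's system, the sketch below simulates an ideal lossless string with the simplest explicit scheme; the systems the article describes are far larger, which is what motivates large-scale parallelization.

```python
# Toy 1D ideal-string simulation (leapfrog scheme); purely illustrative.
import numpy as np

SR = 44100                 # audio sample rate (Hz)
c, L = 200.0, 1.0          # wave speed (m/s) and string length (m)
k = 1.0 / SR               # time step
h = c * k                  # grid spacing at the stability limit (Courant number = 1)
N = int(L / h)             # number of grid intervals

u_prev = np.zeros(N + 1)
u = np.zeros(N + 1)
u[N // 3] = 1.0            # crude "pluck" initial condition

out = np.zeros(SR)         # one second of output, read at a single grid point
lam2 = (c * k / h) ** 2    # squared Courant number
for n in range(SR):
    u_next = np.zeros(N + 1)
    u_next[1:-1] = 2 * u[1:-1] - u_prev[1:-1] + lam2 * (u[2:] - 2 * u[1:-1] + u[:-2])
    u_prev, u = u, u_next  # fixed (zero) boundary conditions at both ends
    out[n] = u[N // 2]
print("peak output amplitude:", np.abs(out).max())
```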


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Mohammad HamediRad ◽  
Ran Chao ◽  
Scott Weisberg ◽  
Jiazhang Lian ◽  
Saurabh Sinha ◽  
...  

Abstract Large-scale data acquisition and analysis are often required for successful implementation of the design, build, test, and learn (DBTL) cycle in biosystems design. However, this has long been hindered by experimental cost, variability, biases, and missed insights from traditional analysis methods. Here, we report the application of an integrated robotic system coupled with machine learning algorithms to fully automate the DBTL process for biosystems design. As proof of concept, we have demonstrated its capacity by optimizing the lycopene biosynthetic pathway. This fully automated robotic platform, BioAutomata, evaluates less than 1% of possible variants while outperforming random screening by 77%. A paired predictive model and Bayesian algorithm select experiments, which are performed by the Illinois Biological Foundry for Advanced Biomanufacturing (iBioFAB). BioAutomata excels at black-box optimization problems, where experiments are expensive and noisy and success does not depend on extensive prior knowledge of biological mechanisms.
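The experiment-selection step described above is, generically, a Bayesian optimization loop: a surrogate model is fit to the experiments run so far and an acquisition function picks the next experiment. The sketch below shows that generic loop with a Gaussian-process surrogate and expected improvement; the toy "yield" function and parameter encoding are hypothetical, and this is not BioAutomata's implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor


def expected_improvement(gp, candidates, best_y):
    """Expected improvement (maximization) over a fixed candidate set."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y) / sigma
    return (mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z)


def measure_yield(x):  # hypothetical stand-in for running a robotic experiment
    return float(np.exp(-8 * np.sum((x - 0.3) ** 2)))


rng = np.random.default_rng(1)
candidates = rng.random((500, 3))          # e.g. encoded pathway design parameters (assumed)
X = list(rng.random((5, 3)))               # a few initial random experiments
y = [measure_yield(x) for x in X]

for _ in range(20):                        # design-build-test-learn iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(np.array(X), y)
    x_next = candidates[np.argmax(expected_improvement(gp, candidates, max(y)))]
    X.append(x_next)
    y.append(measure_yield(x_next))

print("best yield found:", max(y))
```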


Stroke ◽  
2012 ◽  
Vol 43 (suppl_1) ◽  
Author(s):  
Bradford B Worrall ◽  
Alejandro Rabinstein ◽  
Dale M Gamble ◽  
Kevin M Barrett ◽  
Shaneela Malik ◽  
...  

Background: The Stroke Genetics Network (SiGN), funded by the NINDS, aims to identify genetic risk factors in ischemic stroke using genome-wide association studies (GWAS). High-quality phenotyping is crucial to the successful application of GWAS. As a heterogeneous disorder, stroke poses specific challenges. The Trial of Org 10172 in Acute Stroke Treatment (TOAST) classification is broadly used, but its validity is challenged, especially when performed by multiple investigators with differing interpretations of the system. The Causative Classification System for Ischemic Stroke (CCS) is a new, web-based, computerized algorithm that integrates clinical, diagnostic, and etiologic stroke characteristics in an evidence-based manner ( ccs.mgh.harvard.edu ) to generate subtypes. Methods: In planning the SiGN proposal, a sample of 20 coded charts was collected from a subset of participating studies to assess the feasibility of central adjudication and comparability to study-specific TOAST. Two central adjudicators reviewed all records and generated TOAST and CCS subtypes. These were compared to the study-specific TOAST subtype and the CCS phenotype generated for SiGN by locally trained adjudicators. CCS data are now available for 7134 included cases using both a 5-category and a 7-category system as defined in the table. Results: All 4 phenotypes were available for 115 ischemic stroke cases from 6 studies in SiGN. Basic demographics were 54% women, 63% white, and a median age between 65 and 74 years. Table 1 provides the agreement between the various subtypes. Table 2 describes the types of disagreement. Conclusions: Central adjudication with only two adjudicators and curated medical records yielded more consistent subtyping independent of the phenotyping system. The agreement for TOAST was higher than rates published by independent groups (∼0.50). In contrast, the agreement for CCS was lower than previously published (0.85-0.95). Site adjudicators' familiarity with TOAST and inexperience with CCS may contribute. Although CCS is an automated algorithm and has a number of user-friendly features, our findings suggest that a formal training and certification process before starting to use CCS may be worthwhile to achieve optimal benefit from the system.
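Agreement figures like those quoted (∼0.50 and 0.85-0.95) are typically reported as Cohen's kappa for two raters. A minimal sketch of that statistic is below, with hypothetical subtype labels; the study's own analysis may use a different measure.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical subtype assignments for the same cases by two adjudicators.
central = ["cardioembolic", "large artery", "small vessel", "cardioembolic", "undetermined"]
site    = ["cardioembolic", "large artery", "large artery", "cardioembolic", "undetermined"]
print(f"kappa = {cohens_kappa(central, site):.2f}")
```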

