Accelerating Neural Architecture Search via Proxy Data

Despite the increasing interest in neural architecture search (NAS), the significant computational cost of NAS is a hindrance to researchers. Hence, we propose to reduce the cost of NAS using proxy data, i.e., a representative subset of the target data, without sacrificing search performance. Even though data selection has been used across various fields, our evaluation of existing selection methods for NAS algorithms offered by NAS-Bench-1shot1 reveals that they are not always appropriate for NAS and a new selection method is necessary. By analyzing proxy data constructed using various selection methods through data entropy, we propose a novel proxy data selection method tailored for NAS. To empirically demonstrate the effectiveness, we conduct thorough experiments across diverse datasets, search spaces, and NAS algorithms. Consequently, NAS algorithms with the proposed selection discover architectures that are competitive with those obtained using the entire dataset. It significantly reduces the search cost: executing DARTS with the proposed selection requires only 40 minutes on CIFAR-10 and 7.5 hours on ImageNet with a single GPU. Additionally, when the architecture searched on ImageNet using the proposed selection is inversely transferred to CIFAR-10, a state-of-the-art test error of 2.4% is yielded. Our code is available at https://github.com/nabk89/NAS-with-Proxy-data.

Download Full-text

Statistical and Machine Learning Link Selection Methods for Brain Functional Networks: Review and Comparison

Brain Sciences ◽

10.3390/brainsci11060735 ◽

2021 ◽

Vol 11 (6) ◽

pp. 735

Author(s):

Ilinka Ivanoska ◽

Kire Trivodaliev ◽

Slobodan Kalajdziski ◽

Massimiliano Zanin

Keyword(s):

Machine Learning ◽

Computational Cost ◽

Learning Task ◽

Functional Networks ◽

Large Set ◽

Selection Methods ◽

Machine Learning Methods ◽

Brain Functional Networks ◽

The Cost ◽

The Brain

Network-based representations have introduced a revolution in neuroscience, expanding the understanding of the brain from the activity of individual regions to the interactions between them. This augmented network view comes at the cost of high dimensionality, which hinders both our capacity of deciphering the main mechanisms behind pathologies, and the significance of any statistical and/or machine learning task used in processing this data. A link selection method, allowing to remove irrelevant connections in a given scenario, is an obvious solution that provides improved utilization of these network representations. In this contribution we review a large set of statistical and machine learning link selection methods and evaluate them on real brain functional networks. Results indicate that most methods perform in a qualitatively similar way, with NBS (Network Based Statistics) winning in terms of quantity of retained information, AnovaNet in terms of stability and ExT (Extra Trees) in terms of lower computational cost. While machine learning methods are conceptually more complex than statistical ones, they do not yield a clear advantage. At the same time, the high heterogeneity in the set of links retained by each method suggests that they are offering complementary views to the data. The implications of these results in neuroscience tasks are finally discussed.

Download Full-text

The Effectiveness of the Fused Weighted Filter Feature Selection Method to Improve Software Fault Prediction

Journal of Communications Technology Electronics and Computer Science ◽

10.22385/jctecs.v8i0.96 ◽

2016 ◽

Vol 8 ◽

pp. 5 ◽

Cited By ~ 1

Author(s):

Fatemeh Alighardashi ◽

Mohammad Ali Zare Chahooki

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Machine Learning Algorithms ◽

Fault Prediction ◽

Filter Method ◽

Selection Methods ◽

Software Projects ◽

Software Fault Prediction ◽

Software Fault

Improving the software product quality before releasing by periodic tests is one of the most expensive activities in software projects. Due to limited resources to modules test in software projects, it is important to identify fault-prone modules and use the test sources for fault prediction in these modules. Software fault predictors based on machine learning algorithms, are effective tools for identifying fault-prone modules. Extensive studies are being done in this field to find the connection between features of software modules, and their fault-prone. Some of features in predictive algorithms are ineffective and reduce the accuracy of prediction process. So, feature selection methods to increase performance of prediction models in fault-prone modules are widely used. In this study, we proposed a feature selection method for effective selection of features, by using combination of filter feature selection methods. In the proposed filter method, the combination of several filter feature selection methods presented as fused weighed filter method. Then, the proposed method caused convergence rate of feature selection as well as the accuracy improvement. The obtained results on NASA and PROMISE with ten datasets, indicates the effectiveness of proposed method in improvement of accuracy and convergence of software fault prediction.

Download Full-text

A fuzzy gaussian rank aggregation ensemble feature selection method for microarray data

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-190134 ◽

2021 ◽

Vol 24 (4) ◽

pp. 289-301

Author(s):

B. Venkatesh ◽

J. Anuradha

Keyword(s):

Feature Selection ◽

Microarray Data ◽

Classification Accuracy ◽

Performance Metrics ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Binary Particle Swarm Optimization ◽

Selection Methods

In Microarray Data, it is complicated to achieve more classification accuracy due to the presence of high dimensions, irrelevant and noisy data. And also It had more gene expression data and fewer samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features need to extract, this can be achieved by applying the feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, filter and wrapper phase in filter phase ensemble technique is used for aggregating the feature ranks of the Relief, minimum redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses the Fuzzy Gaussian membership function ordering for aggregating the ranks. In wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used for selecting the optimal features, and the RBF Kernel-based Support Vector Machine (SVM) classifier is used as an evaluator. The performance of the proposed model are compared with state of art feature selection methods using five benchmark datasets. For evaluation various performance metrics such as Accuracy, Recall, Precision, and F1-Score are used. Furthermore, the experimental results show that the performance of the proposed method outperforms the other feature selection methods.

Download Full-text

Perfusion index in healthy newborns during critical congenital heart disease screening at 24 hours: retrospective observational study from the USA

BMJ Open ◽

10.1136/bmjopen-2017-017580 ◽

2017 ◽

Vol 7 (12) ◽

pp. e017580 ◽

Cited By ~ 5

Author(s):

Priya Jegatheesan ◽

Matthew Nudelman ◽

Keshav Goel ◽

Dongli Song ◽

Balaji Govindaswami

Keyword(s):

Congenital Heart Disease ◽

Heart Disease ◽

Observational Study ◽

Oxygen Saturation ◽

Congenital Heart ◽

Selection Method ◽

Perfusion Index ◽

Data Selection ◽

Retrospective Observational Study ◽

Critical Congenital Heart Disease

ObjectiveTo describe the distribution of perfusion index (PI) in asymptomatic newborns at 24 hours of life when screening for critical congenital heart disease (CCHD) using an automated data selection method.DesignThis is a retrospective observational study.SettingNewborn nursery in a California public hospital with ~3500 deliveries annually.MethodsWe developed an automated programme to select the PI values from CCHD screens. Included were term and late preterm infants who were screened for CCHD from November 2013 to January 2014 and from May 2015 to July 2015. PI measurements were downloaded every 2 s from the pulse oximeter and median PI were calculated for each oxygen saturation screen in our cohort.ResultsWe included data from 2768 oxygen saturation screens. Each screen had a median of 29 data points (IQR 17 to 49). The median PI in our study cohort was 1.8 (95% CI 1.8 to 1.9) with IQR 1.2 to 2.7. The median preductal PI was significantly higher than the median postductal (1.9 vs 1.8, p=0.03) although this difference may not be clinically significant.ConclusionUsing an automated data selection method, the median PI in asymptomatic newborns at 24 hours of life is 1.8 with a narrow IQR of 1.2 to 2.7. This automated data selection method may improve accuracy and precision compared with manual data collection method. Further studies are needed to establish external validity of this automated data selection method and its clinical application for CCHD screening.

Download Full-text

Optimization of machining parameters during Drilling by Taguchi based Design of Experiments and Validation by Neural Network

Brazilian Journal of Operations & Production Management ◽

10.14488/bjopm.2018.v15.n2.a11 ◽

2018 ◽

Vol 15 (2) ◽

pp. 294-301

Author(s):

Reddy Sreenivasulu ◽

Chalamalasetti SrinivasaRao

Keyword(s):

Neural Network ◽

Design Of Experiments ◽

Computational Cost ◽

Machining Parameters ◽

Burr Size ◽

Taguchi Design Of Experiments ◽

The Neural Network ◽

Input Parameters ◽

Hole Making ◽

The Cost

Drilling is a hole making process on machine components at the time of assembly work, which are identify everywhere. In precise applications, quality and accuracy play a wide role. Nowadays’ industries suffer due to the cost incurred during deburring, especially in precise assemblies such as aerospace/aircraft body structures, marine works and automobile industries. Burrs produced during drilling causes dimensional errors, jamming of parts and misalignment. Therefore, deburring operation after drilling is often required. Now, reducing burr size is a serious topic. In this study experiments are conducted by choosing various input parameters selected from previous researchers. The effect of alteration of drill geometry on thrust force and burr size of drilled hole was investigated by the Taguchi design of experiments and found an optimum combination of the most significant input parameters from ANOVA to get optimum reduction in terms of burr size by design expert software. Drill thrust influences more on burr size. The clearance angle of the drill bit causes variation in thrust. The burr height is observed in this study. These output results are compared with the neural network software @easy NN plus. Finally, it is concluded that by increasing the number of nodes the computational cost increases and the error in nueral network decreases. Good agreement was shown between the predictive model results and the experimental responses.

Download Full-text

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽

10.1145/3491046 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-34

Author(s):

Bingqian Lu ◽

Jianyi Yang ◽

Weiwen Jiang ◽

Yiyu Shi ◽

Shaolei Ren

Keyword(s):

State Of The Art ◽

Autonomous Driving ◽

Pareto Optimal ◽

Video Content ◽

Fast Evaluation ◽

Video Content Analysis ◽

Search Spaces ◽

Neural Architecture ◽

Real World Applications ◽

Prohibitive Cost

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been commonly used in state of the art, this is a very time-consuming process, lacking scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity --- the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.

Download Full-text

Inference of Genetic Networks From Time-Series and Static Gene Expression Data: Combining a Random-Forest-Based Inference Method With Feature Selection Methods

Frontiers in Genetics ◽

10.3389/fgene.2020.595912 ◽

2020 ◽

Vol 11 ◽

Author(s):

Shuhei Kimura ◽

Ryo Fukutomi ◽

Masato Tokuhisa ◽

Mariko Okada

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Random Forest ◽

Gene Expression Data ◽

Computational Cost ◽

Expression Data ◽

Selection Methods ◽

Inference Method ◽

Combined Application ◽

Inference Methods

Several researchers have focused on random-forest-based inference methods because of their excellent performance. Some of these inference methods also have a useful ability to analyze both time-series and static gene expression data. However, they are only of use in ranking all of the candidate regulations by assigning them confidence values. None have been capable of detecting the regulations that actually affect a gene of interest. In this study, we propose a method to remove unpromising candidate regulations by combining the random-forest-based inference method with a series of feature selection methods. In addition to detecting unpromising regulations, our proposed method uses outputs from the feature selection methods to adjust the confidence values of all of the candidate regulations that have been computed by the random-forest-based inference method. Numerical experiments showed that the combined application with the feature selection methods improved the performance of the random-forest-based inference method on 99 of the 100 trials performed on the artificial problems. However, the improvement tends to be small, since our combined method succeeded in removing only 19% of the candidate regulations at most. The combined application with the feature selection methods moreover makes the computational cost higher. While a bigger improvement at a lower computational cost would be ideal, we see no impediments to our investigation, given that our aim is to extract as much useful information as possible from a limited amount of gene expression data.

Download Full-text

The Application of Computer Image Analysis Based on Textural Features for the Identification of Barley Kernels Infected with Fungi of the Genus Fusarium

Agricultural Engineering ◽

10.1515/agriceng-2018-0026 ◽

2018 ◽

Vol 22 (3) ◽

pp. 49-56 ◽

Cited By ~ 1

Author(s):

Ewa Ropelewska

Keyword(s):

Image Analysis ◽

Selection Method ◽

Color Model ◽

Textural Features ◽

Discriminative Power ◽

Selection Methods ◽

Computer Image Analysis ◽

Genus Fusarium ◽

Color Models ◽

Lab Color Model

AbstractThe aim of this study was to develop discrimination models based on textural features for the identification of barley kernels infected with fungi of the genus Fusarium and healthy kernels. Infected barley kernels with altered shape and discoloration and healthy barley kernels were scanned. Textures were computed using MaZda software. The kernels were classified as infected and healthy with the use of the WEKA application. In the case of RGB, Lab and XYZ color models, the classification accuracies based on 10 selected textures with the highest discriminative power ranged from 95 to 100%. The lowest result (95%) was noted in XYZ color model and Multi Class Classifier for the textures selected using the Ranker method and the OneR attribute evaluator. Selected classifiers were characterized by 100% accuracy in the case of all color models and selection methods. The highest number of 100% results was obtained for the Lab color model with Naive Bayes, LDA, IBk, Multi Class Classifier and J48 classifiers in the Best First selection method with the CFS subset evaluator.

Download Full-text

Using systematic sampling selection for Monte Carlo solutions of Feynman-Kac equations

Advances in Applied Probability ◽

10.1239/aap/1214950212 ◽

2008 ◽

Vol 40 (2) ◽

pp. 454-472 ◽

Cited By ~ 1

Author(s):

Ivan Gentil ◽

Bruno Rémillard

Keyword(s):

Monte Carlo ◽

Markov Chains ◽

Selection Method ◽

Systematic Sampling ◽

Selection Methods ◽

Convergence Properties ◽

Main Motivation ◽

Empirical Basis ◽

Convergence Results ◽

Selection For

While the convergence properties of many sampling selection methods can be proven, there is one particular sampling selection method introduced in Baker (1987), closely related to ‘systematic sampling’ in statistics, that has been exclusively treated on an empirical basis. The main motivation of the paper is to start to study formally its convergence properties, since in practice it is by far the fastest selection method available. We will show that convergence results for the systematic sampling selection method are related to properties of peculiar Markov chains.

Download Full-text

Effect of selection methods on seed potato quality in the following season

IOP Conference Series Earth and Environmental Science ◽

10.1088/1755-1315/922/1/012015 ◽

2021 ◽

Vol 922 (1) ◽

pp. 012015

Author(s):

N Gunadi ◽

A Pronk ◽

A A Kartasih ◽

L Prabaningrum ◽

T K Moekasan ◽

...

Keyword(s):

Seed Potato ◽

Block Design ◽

Growing Season ◽

Selection Method ◽

Selection Methods ◽

Plant Selection ◽

West Java ◽

Potato Quality ◽

Randomized Complete Block Design

Abstract Most potato farmers in Indonesia select the small tubers at harvest for planting in the following season, the so-called farmers’ practice (FP). This propagation method is cheap, but the small tubers may come from less healthy plants, which increases the build-up of diseases with accelerated yield decreases over the seasons. Alternatively, farmers may identify healthy plants within the growing season and select those for propagation, the so-called positive plant selection method (PPSM). An experiment was carried out to evaluate the effects of PPSM compared to FP on yields in the following season in the two main potato growing areas of West Java, i.e., Pangalengan and Garut. Generations G2 and G3 of cv. Granola and one generation of the imported cv. Atlantic were used. Selected seeds using PPSM and FP were planted in the second season in a randomized complete block design. Results show that yields of seeds selected through PPSM were significantly higher compared to seeds selected through FP, over both locations, on average, 7.4, 5.5 and 1.2 ton ha−1 for Granola G2 and G3, and the Atlantic, respectively. These yield increases represent an increase in the gross revenue of 30.8 to 51.8 million IDR ha−1 for Granola and 1.9 to 7.8 million IDR ha−1 for Atlantic at a farm gate price of 7,000 and 6,500 IDR kg−1, respectively. This study confirms that PPSM is superior to FP and improves the quality of the farms saved seeds.

Download Full-text