Stochastic Second-Order Method for Large-Scale Nonconvex Sparse Learning Models

Author(s):  
Hongchang Gao ◽  
Heng Huang

Sparse learning models have shown promising performance in high-dimensional machine learning applications. The main challenge of sparse learning models is how to optimize them efficiently. Most existing methods solve this problem by relaxing it to a convex problem, which incurs large estimation bias. The sparse learning model with a nonconvex constraint has therefore attracted much attention due to its better performance, but it is difficult to optimize because of the non-convexity. In this paper, we propose a linearly convergent stochastic second-order method to optimize this nonconvex problem for large-scale datasets. The proposed method incorporates second-order information to improve the convergence speed. Theoretical analysis shows that our proposed method enjoys a linear convergence rate and is guaranteed to converge to the underlying true model parameter. Experimental results verify the efficiency and correctness of our proposed method.
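The abstract does not reproduce the paper's exact algorithm; the sketch below only illustrates the general recipe it describes, combining a stochastic (mini-batch) Newton-type step with hard thresholding to enforce the nonconvex sparsity constraint, on a least-squares toy problem. The data, the `hard_threshold` helper, and all parameters are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: stochastic second-order (subsampled Newton) steps under
# a nonconvex l0 sparsity constraint, for sparse least squares.
rng = np.random.default_rng(0)
n, d, k = 1000, 50, 5                      # samples, dimension, sparsity level
w_true = np.zeros(d); w_true[:k] = rng.normal(size=k)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries, zero the rest (l0 projection)."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

w = np.zeros(d)
for _ in range(100):
    batch = rng.choice(n, size=100, replace=False)    # mini-batch sampling
    Xb, yb = X[batch], y[batch]
    g = Xb.T @ (Xb @ w - yb) / len(batch)             # stochastic gradient
    H = Xb.T @ Xb / len(batch) + 1e-3 * np.eye(d)     # subsampled Hessian (damped)
    w = hard_threshold(w - np.linalg.solve(H, g), k)  # Newton step + projection
print(np.linalg.norm(w - w_true))                     # should be small (noise level)
```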

Author(s):  
Jaya Pratha Sebastiyar ◽  
Martin Sahayaraj Joseph

Distributed joint congestion control and routing optimization has received a significant amount of attention recently. To date, however, most existing schemes follow a key idea called the back-pressure algorithm. Despite having many salient features, the first-order subgradient nature of back-pressure-based schemes results in slow convergence and poor delay performance. To overcome these limitations, this study makes a first attempt at developing a second-order joint congestion control and routing optimization framework that offers utility optimality, queue stability, fast convergence, and low delay. The contributions of this work are threefold: we propose a new second-order joint congestion control and routing framework based on a primal-dual interior-point approach, establish the utility optimality and queue stability of the proposed second-order method, and show how to implement it in a distributed fashion.
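As a rough illustration of the interior-point idea (not the paper's distributed primal-dual scheme), the sketch below applies damped Newton steps to a log-barrier formulation of a toy network utility maximization problem: two flows with log utilities share one capacitated link. The instance, barrier weight, and step rule are all assumptions for illustration.

```python
import numpy as np

# Toy network utility maximization: maximize sum(log x) for two flows on
# one link of capacity c, with a log-barrier enforcing the capacity limit.
c = 1.0                       # link capacity
mu = 0.01                     # barrier weight (smaller -> closer to optimum)
x = np.array([0.1, 0.1])      # initial rates, strictly feasible

def grad(x):
    # gradient of sum(log x) + mu * log(c - sum x)
    return 1.0 / x - mu / (c - x.sum())

def hess(x):
    # Hessian: diagonal utility curvature plus barrier curvature
    slack = c - x.sum()
    return -np.diag(1.0 / x**2) - (mu / slack**2) * np.ones((2, 2))

def feasible(x):
    return np.all(x > 0) and x.sum() < c

for _ in range(50):
    step = np.linalg.solve(hess(x), -grad(x))   # Newton (second-order) direction
    t = 1.0
    while not feasible(x + t * step):           # backtrack to stay interior
        t *= 0.5
    x = x + t * step

print(x)  # both rates approach the fair share c / (2 + mu) ~ 0.4975
```

The second-order direction accounts for the curvature of both the utilities and the barrier, which is what buys the fast convergence that first-order back-pressure schemes lack.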


2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for the treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large body of structure-activity relationships that has not yet been exploited for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26,318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different designs, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, they were implemented as a freely accessible and easy-to-use web application.
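The abstract does not specify which fingerprint-model combination performed best; the sketch below shows one plausible pipeline of this general kind, using RDKit Morgan fingerprints with a scikit-learn random forest on a tiny hypothetical dataset. The SMILES strings and labels are placeholders, not the paper's data.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score

# Placeholder data: SMILES and binary activity labels for one target
# (the paper's dataset covers 55 epigenetic targets).
smiles = ["CCO", "c1ccccc1O", "CC(=O)Nc1ccc(O)cc1", "CCN(CC)CC"]
labels = [0, 1, 1, 0]

def fingerprint(smi, n_bits=2048):
    """Morgan fingerprint (radius 2) as a numpy bit vector."""
    mol = Chem.MolFromSmiles(smi)
    return np.array(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits))

X = np.array([fingerprint(s) for s in smiles])
y = np.array(labels)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(precision_score(y_te, clf.predict(X_te), zero_division=0))
```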


Author(s):  
YongAn LI

Background: Symbolic nodal analysis plays a pivotal role in very large scale integration (VLSI) design. Methods: In this work, based on the terminal relations for the pathological elements and the voltage differencing inverting buffered amplifier (VDIBA), twelve alternative pathological models for the VDIBA are presented. Moreover, the proposed models are applied to a VDIBA-based second-order filter and oscillator so as to simplify the circuit analysis. Results: The results show that the behavioral models for the VDIBA are systematic, effective, and powerful in symbolic nodal circuit analysis.
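The VDIBA pathological models themselves are not given in the abstract; the SymPy sketch below only conveys the flavor of symbolic nodal analysis, deriving the transfer function of a plain RC low-pass stage from its node equation. The circuit is a stand-in, not one of the paper's examples.

```python
import sympy as sp

# Symbolic nodal analysis of an RC low-pass filter: write KCL at the
# output node and solve symbolically for the transfer function.
s, R, C, Vin, V1 = sp.symbols('s R C V_in V_1')

# KCL at the output node: (V1 - Vin)/R + s*C*V1 = 0
kcl = sp.Eq((V1 - Vin) / R + s * C * V1, 0)
transfer = sp.simplify(sp.solve(kcl, V1)[0] / Vin)
print(transfer)   # 1/(C*R*s + 1): first-order low-pass response
```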


Author(s):  
Mark Endrei ◽  
Chao Jin ◽  
Minh Ngoc Dinh ◽  
David Abramson ◽  
Heidi Poxon ◽  
...  

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload, and their effect on performance and energy efficiency, are typically difficult for application users to assess and control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that require only a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH). We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
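The abstract does not detail the models themselves; the sketch below only illustrates the underlying notion of a Pareto-optimal trade-off, filtering hypothetical (energy, runtime) measurements down to the non-dominated configurations. The numbers and the `pareto_front` helper are illustrative assumptions.

```python
import numpy as np

# Hypothetical measurements: (energy_J, runtime_s) per candidate
# configuration; lower is better on both objectives.
points = np.array([
    [100.0, 50.0],
    [ 90.0, 60.0],
    [105.0, 52.0],   # dominated by (100, 50) on both objectives
    [120.0, 45.0],
    [110.0, 48.0],
])

def pareto_front(pts):
    """Return the points not dominated on (energy, runtime)."""
    front = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            front.append(p)
    return np.array(front)

print(pareto_front(points))   # the four non-dominated trade-off options
```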


2021 ◽  
Vol 502 (3) ◽  
pp. 3976-3992
Author(s):  
Mónica Hernández-Sánchez ◽  
Francisco-Shu Kitaura ◽  
Metin Ata ◽  
Claudio Dalla Vecchia

ABSTRACT We investigate higher order symplectic integration strategies within Bayesian cosmic density field reconstruction methods. In particular, we study the fourth-order discretization of the Hamiltonian equations of motion (EoM). This is achieved by recursively applying the basic second-order leap-frog scheme (considering a single evaluation of the EoM) in a combination of an even number of forward time integration steps with a single intermediate backward step. This largely reduces the number of evaluations and random gradient computations required in the usual second-order case for high-dimensional problems. We restrict this study to the lognormal-Poisson model, applied to a full-volume halo catalogue in real space on a cubical mesh of 1250 h⁻¹ Mpc side and 256³ cells. Hence, we neglect selection effects, redshift space distortions, and displacements. We note that those observational and cosmic evolution effects can be accounted for in subsequent Gibbs-sampling steps within the COSMIC BIRTH algorithm. We find that going from the usual second order to fourth order in the leap-frog scheme shortens the burn-in phase by a factor of at least ∼30. This implies that 75–90 independent samples are obtained while the fastest second-order method converges. After convergence, the correlation lengths indicate an improvement factor of about 3.0, i.e. about three times fewer gradient computations, for meshes of 256³ cells. In the considered cosmological scenario, the traditional leap-frog scheme turns out to outperform higher order integration schemes only for lower dimensional problems, e.g. meshes with 64³ cells. This gain in computational efficiency can help to move towards a full Bayesian analysis of the cosmological large-scale structure for upcoming galaxy surveys.
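The paper's exact composition is not spelled out in the abstract; the sketch below shows the standard Yoshida-style construction of a fourth-order integrator from second-order leap-frog steps, with two forward steps around one intermediate backward step, applied to a harmonic-oscillator toy Hamiltonian rather than the cosmological EoM.

```python
import numpy as np

# Second-order leap-frog (kick-drift-kick) for H = p^2/2 + q^2/2,
# a unit-frequency harmonic oscillator standing in for the HMC dynamics.
def leapfrog(q, p, h):
    p = p - 0.5 * h * q      # half kick  (dV/dq = q)
    q = q + h * p            # drift
    p = p - 0.5 * h * q      # half kick
    return q, p

# Yoshida triple-jump coefficients: two forward steps (w1 > 0) around one
# backward step (w0 < 0) compose into a fourth-order symplectic scheme.
w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
w0 = -2.0 ** (1.0 / 3.0) / (2.0 - 2.0 ** (1.0 / 3.0))

def yoshida4(q, p, h):
    q, p = leapfrog(q, p, w1 * h)
    q, p = leapfrog(q, p, w0 * h)   # intermediate backward step
    q, p = leapfrog(q, p, w1 * h)
    return q, p

# Integrate over one full period; a fourth-order scheme returns very
# close to the initial state (1, 0).
q, p = 1.0, 0.0
for _ in range(100):
    q, p = yoshida4(q, p, 2 * np.pi / 100)
print(q, p)
```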


2021 ◽  
Vol 28 (1) ◽  
pp. e100251
Author(s):  
Ian Scott ◽  
Stacey Carter ◽  
Enrico Coiera

Machine learning algorithms are being used to screen and diagnose disease, prognosticate, and predict therapeutic responses. Hundreds of new algorithms are being developed, but whether they improve clinical decision making and patient outcomes remains uncertain. If clinicians are to use algorithms, they need to be reassured that key issues relating to their validity, utility, feasibility, safety, and ethical use have been addressed. We propose a checklist of 10 questions that clinicians can ask of those advocating for the use of a particular algorithm, but which do not expect clinicians, as non-experts, to demonstrate mastery over what can be highly complex statistical and computational concepts. The questions are: (1) What is the purpose and context of the algorithm? (2) How good were the data used to train the algorithm? (3) Were there sufficient data to train the algorithm? (4) How well does the algorithm perform? (5) Is the algorithm transferable to new clinical settings? (6) Are the outputs of the algorithm clinically intelligible? (7) How will this algorithm fit into and complement current workflows? (8) Has use of the algorithm been shown to improve patient care and outcomes? (9) Could the algorithm cause patient harm? and (10) Does use of the algorithm raise ethical, legal or social concerns? We provide examples where an algorithm may raise concerns and apply the checklist to a recent review of diagnostic imaging applications. This checklist aims to assist clinicians in assessing algorithm readiness for routine care and in identifying situations where further refinement and evaluation are required prior to large-scale use.

