Private Stochastic Non-convex Optimization with Improved Utility Rates

Author(s):  
Qiuchen Zhang ◽  
Jing Ma ◽  
Jian Lou ◽  
Li Xiong

We study differentially private (DP) stochastic nonconvex optimization with a focus on its understudied utility measures, namely the expected excess empirical and population risks. While the excess risks are extensively studied for convex optimization, they are rarely studied for nonconvex optimization, especially the expected excess population risk. For the convex case, recent studies show that private optimization can achieve the same order of excess population risk as nonprivate optimization under certain conditions. Whether such an ideal excess population risk is achievable in the nonconvex case remains an open question. In this paper, we make progress towards an affirmative answer to this open problem: under certain conditions (i.e., well-conditioned nonconvexity), DP nonconvex optimization can indeed achieve the same excess population risk as the nonprivate algorithm in most common parameter regimes. We achieve these improved utility rates over existing results by designing and analyzing a stagewise DP-SGD with early momentum algorithm, which attains bounds on both the excess empirical risk and the excess population risk while guaranteeing differential privacy. Our analysis also yields the first known excess empirical and population risk bounds for DP-SGD with momentum. Experiments with both shallow and deep neural networks on simple and complex real datasets, respectively, corroborate the theoretical results.
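
To make the algorithmic idea concrete, here is a minimal sketch of a stagewise DP-SGD loop with early momentum, in the spirit of the abstract. The stage schedule, step-size decay, `early_frac` cutoff, and all parameter names are illustrative assumptions, not the authors' exact algorithm, and the privacy accounting needed to choose `sigma` is omitted.

```python
import numpy as np

def stagewise_dp_sgd_early_momentum(grad_fn, theta, data, stages=3,
                                    iters_per_stage=100, lr0=0.1, clip=1.0,
                                    sigma=1.0, beta=0.9, early_frac=0.3, seed=0):
    """Illustrative sketch only: stagewise DP-SGD in which the momentum
    term is applied only during the early iterations of each stage."""
    rng = np.random.default_rng(seed)
    lr = lr0
    for _ in range(stages):
        v = np.zeros_like(theta)                  # momentum reset at each stage
        cutoff = int(early_frac * iters_per_stage)
        for t in range(iters_per_stage):
            x = data[rng.integers(len(data))]     # sample one example
            g = grad_fn(theta, x)
            g = g / max(1.0, np.linalg.norm(g) / clip)       # clip gradient norm
            g = g + rng.normal(0.0, sigma * clip, g.shape)   # Gaussian noise (DP)
            if t < cutoff:                        # "early momentum" phase
                v = beta * v + g
                theta = theta - lr * v
            else:                                 # plain noisy SGD afterwards
                theta = theta - lr * g
        lr *= 0.5   # geometrically decaying stagewise step size (assumption)
    return theta
```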

2021 ◽  
Author(s):  
Tianyi Liu ◽  
Zhehui Chen ◽  
Enlu Zhou ◽  
Tuo Zhao

The momentum stochastic gradient descent (MSGD) algorithm has been widely applied to nonconvex optimization problems in machine learning (e.g., training deep neural networks, variational Bayesian inference). Despite its empirical success, the convergence properties of MSGD are still not well understood theoretically. To fill this gap, we analyze the algorithmic behavior of MSGD via diffusion approximations for nonconvex optimization problems with strict saddle points and isolated local optima. Our study shows that momentum helps escape saddle points but hurts convergence within the neighborhood of optima (unless the step size or the momentum parameter is annealed). This theoretical discovery partially corroborates the empirical success of MSGD in training deep neural networks.
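
For reference, the discrete heavy-ball recursion usually meant by MSGD is the two-line update below; the paper analyzes its continuous-time diffusion approximation rather than this recursion directly.

```python
import numpy as np

def msgd_step(theta, v, stoch_grad, lr=0.01, momentum=0.9):
    """One heavy-ball MSGD step: the velocity v accumulates past stochastic
    gradients, which helps escape strict saddles but causes oscillation
    near optima unless lr or momentum is annealed."""
    v = momentum * v - lr * stoch_grad
    return theta + v, v
```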


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Albert Cheu ◽  
Adam Smith ◽  
Jonathan Ullman

Local differential privacy is a widely studied restriction on distributed algorithms that collect aggregates about sensitive user data, and is now deployed in several large systems. We initiate a systematic study of a fundamental limitation of locally differentially private protocols: they are highly vulnerable to adversarial manipulation. While any algorithm can be manipulated by adversaries who lie about their inputs, we show that any noninteractive locally differentially private protocol can be manipulated to a much greater extent: when the privacy level is high, or the domain size is large, a small fraction of users in the protocol can completely obscure the distribution of the honest users' input. We also construct protocols that are optimally robust to manipulation for a variety of common tasks in local differential privacy. Finally, we give simple experiments validating our theoretical results, and demonstrating that protocols that are optimal without manipulation can have dramatically different levels of robustness to manipulation. Our results suggest caution when deploying local differential privacy and reinforce the importance of efficient cryptographic techniques for the distributed emulation of centrally differentially private mechanisms.
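
A toy illustration of the vulnerability (standard binary randomized response, not the authors' protocols): at high privacy (small epsilon), each honest report carries little signal, so debiasing amplifies any adversarial shift, and here a 5% coalition moves the estimated mean by roughly 10 points.

```python
import numpy as np

def randomized_response(bits, eps, rng):
    """Each user reports truthfully with probability e^eps / (e^eps + 1)."""
    p = np.exp(eps) / (np.exp(eps) + 1.0)
    keep = rng.random(len(bits)) < p
    return np.where(keep, bits, 1 - bits)

def estimate_mean(reports, eps):
    """Debias the aggregate; the 1/(2p - 1) factor amplifies manipulation."""
    p = np.exp(eps) / (np.exp(eps) + 1.0)
    return (reports.mean() - (1.0 - p)) / (2.0 * p - 1.0)

rng = np.random.default_rng(0)
n, eps, frac_bad = 10_000, 0.5, 0.05
bits = (rng.random(n) < 0.3).astype(int)     # honest data, true mean ~0.3
reports = randomized_response(bits, eps, rng)
reports[: int(frac_bad * n)] = 1             # coalition always reports 1
print(estimate_mean(reports, eps))           # ~0.41 instead of ~0.30
```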


Author(s):  
Victoria Edwards ◽  
Paulo Rezeck ◽  
Luiz Chaimowicz ◽  
M. Ani Hsieh

The division of labor amongst a heterogeneous swarm of robots increases the range and sophistication of the tasks the swarm can accomplish. To execute a task efficiently, the swarm of robots must start from some organization. Over the past decade, segregation of robotic swarms has grown as a field of research, drawing inspiration from natural phenomena such as cellular segregation. A variety of approaches have been undertaken to devise control methods that organize a heterogeneous swarm of robots. In this work, we present a convex optimization approach to segregating a heterogeneous swarm into a set of homogeneous collectives. We present theoretical results showing that our approach is guaranteed to achieve complete segregation, and we validate our strategy in simulation and experiments.
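
The abstract does not spell out the program, so the snippet below is only a minimal illustration of casting a segregation-style allocation as convex optimization: the LP relaxation of assigning robots to same-type goal slots is a convex program whose optimum is integral (by total unimodularity), so a linear assignment solver recovers it exactly. The positions, types, and penalty are made-up toy data.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
robots = rng.random((6, 2))                        # robot positions in the plane
types = np.array([0, 0, 0, 1, 1, 1])               # two robot types
goals = np.vstack([np.tile([0.1, 0.1], (3, 1)),    # 3 slots in type-0 region
                   np.tile([0.9, 0.9], (3, 1))])   # 3 slots in type-1 region
slot_type = np.array([0, 0, 0, 1, 1, 1])

cost = np.linalg.norm(robots[:, None] - goals[None, :], axis=-1)
cost += 1e6 * (types[:, None] != slot_type[None, :])   # forbid cross-type slots

rows, cols = linear_sum_assignment(cost)   # solves the convex LP relaxation exactly
print(list(zip(rows.tolist(), cols.tolist())))
```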


2015 ◽  
Vol 92 (2) ◽  
pp. 290-301 ◽  
Author(s):  
M. WEIGT ◽  
I. ZARAKAS

It is an open question whether every derivation of a Fréchet GB$^{*}$-algebra $A[\tau]$ is continuous. We give an affirmative answer for the case where $A[\tau]$ is a smooth Fréchet nuclear GB$^{*}$-algebra. Motivated by this result, we give examples of smooth Fréchet nuclear GB$^{*}$-algebras which are not pro-C$^{*}$-algebras.
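
For readers outside the area, the object in question is the standard one (general background, not this paper's notation):

```latex
% A derivation of a topological algebra A[\tau] is a linear map
% d : A -> A satisfying the Leibniz rule
\[
  d(ab) = d(a)\,b + a\,d(b) \qquad \text{for all } a, b \in A,
\]
% and the open problem asks whether every such d on a Fréchet
% GB^*-algebra is automatically continuous for the topology \tau.
```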


2019 ◽  
Vol 18 (09) ◽  
pp. 1950167 ◽  
Author(s):  
M. Chacron ◽  
T.-K. Lee

Let $D$ be a noncommutative division ring with center $F$ which is algebraic, that is, $D$ is an algebraic algebra over the field $F$. Let $\sigma$ be an antiautomorphism of $D$ satisfying the condition (i) [Formula: see text] for all $x \in D$, where the exponents involved are positive integers depending on $x$. If, further, $\sigma$ has finite order, it was shown in [M. Chacron, Antiautomorphisms with quasi-generalised Engel condition, J. Algebra Appl. 17(8) (2018) 1850145] that $\sigma$ is commuting, that is, $\sigma(x)x = x\sigma(x)$ for all $x \in D$. The same paper posed the question of whether the finite-order requirement on $\sigma$ can be dropped. We provide here an affirmative answer to that question. The second major result of this paper concerns a not necessarily algebraic division ring $D$ with an antiautomorphism $\sigma$ satisfying the stronger condition (ii) [Formula: see text] for all $x \in D$, where the exponents involved are fixed positive integers. It was shown in [T.-K. Lee, Anti-automorphisms satisfying an Engel condition, Comm. Algebra 45(9) (2017) 4030–4036] that if, further, $\sigma$ has finite order, then $\sigma$ is commuting. We show here that, again, the finite-order assumption on $\sigma$ can be lifted, thus answering in the affirmative the open question (Question 2.11 in the same paper).
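
The conditions elided above are Engel-type identities. As general background (not necessarily the paper's exact hypotheses), the iterated commutators involved are built as follows:

```latex
% Iterated (Engel) commutators in a ring D: for a, b in D set
\[
  [a, b]_1 = ab - ba, \qquad [a, b]_{k+1} = \bigl[\,[a, b]_k ,\, b\,\bigr],
\]
% and an Engel condition on an antiautomorphism \sigma asserts that an
% iterated commutator of \sigma(x) and (a power of) x vanishes for all x,
% with the exponents depending on x in case (i) and fixed in case (ii).
```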


2019 ◽  
Vol 31 (12) ◽  
pp. 2293-2323 ◽  
Author(s):  
Kenji Kawaguchi ◽  
Jiaoyang Huang ◽  
Leslie Pack Kaelbling

For nonconvex optimization in machine learning, this article proves that every local minimum achieves the globally optimal value of the perturbable gradient basis model at any differentiable point. As a result, nonconvex machine learning is theoretically as supported as convex machine learning with a handcrafted basis in terms of the loss at differentiable local minima, except in the case when a preference is given to the handcrafted basis over the perturbable gradient basis. The proofs of these results are derived under mild assumptions. Accordingly, the proven results are directly applicable to many machine learning models, including practical deep neural networks, without any modification of practical methods. Furthermore, as special cases of our general results, this article improves or complements several state-of-the-art theoretical results on deep neural networks, deep residual networks, and overparameterized deep neural networks with a unified proof technique and novel geometric insights. A special case of our results also contributes to the theoretical foundation of representation learning.
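
As a rough gloss of the benchmark in this comparison (my paraphrase; the paper's precise definition of the perturbable gradient basis model should be consulted):

```latex
% For a model f(x; \theta) with loss \ell, the gradient basis at a point
% \theta^* spans the features \partial f(x; \theta^*) / \partial \theta_i,
% and the claim compares the loss at a differentiable local minimum with
\[
  \min_{w} \; \frac{1}{n} \sum_{j=1}^{n}
  \ell\Bigl( \sum_{i} w_i \,\partial_{\theta_i} f(x_j; \theta^{*}),\; y_j \Bigr),
\]
% i.e., the best loss of a linear model over those gradient features
% (with "perturbable" referring to small perturbations of \theta^*).
```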


2014 ◽  
Vol 1046 ◽  
pp. 403-406 ◽  
Author(s):  
Yun Feng Gao ◽  
Ning Xu

Building on existing theoretical results, this paper studies the realization of combined homotopy methods for optimization problems over a specific class of nonconvex constrained regions. For this class of constrained regions, we give a method for constructing the quasi-normal, prove that the chosen mappings on the constraint gradients are positively independent, and show that the feasible region of the SLM satisfies the quasi-normal cone condition. We then construct the combined homotopy equation under the quasi-normal cone condition, and numerical examples and data processing yield favorable results.
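
For orientation, the combined homotopy map commonly used in this line of work (for min f(x) subject to g(x) <= 0) has the form below; this is the standard convex-programming version, which the paper adapts to its nonconvex regions through the quasi-normal cone condition.

```latex
% Combined homotopy with interior starting point w^0 = (x^0, y^0),
% w = (x, y), homotopy parameter t in (0, 1]:
\[
  H(w, w^{0}, t) =
  \begin{pmatrix}
    (1-t)\bigl(\nabla f(x) + \nabla g(x)\, y\bigr) + t\,(x - x^{0}) \\
    Y g(x) - t\, Y^{0} g(x^{0})
  \end{pmatrix} = 0,
  \qquad Y = \operatorname{diag}(y).
\]
% Numerically tracing the zero path of H from t = 1 down to t = 0
% leads to a KKT point of the original problem.
```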


1996 ◽  
Vol 38 (2) ◽  
pp. 171-176
Author(s):  
Silvana Franciosi ◽  
Francesco de Giovanni ◽  
Yaroslav P. Sysak

A famous theorem of Kegel and Wielandt states that every finite group which is the product of two nilpotent subgroups is soluble (see [1], Theorem 2.4.3). On the other hand, it is an open question whether an arbitrary group factorized by two nilpotent subgroups satisfies some solubility condition, and only a few partial results are known on this subject. In particular, Kegel [6] obtained an affirmative answer in the case of linear groups, and in the same article he also proved that every locally finite group which is the product of two locally nilpotent FC-subgroups is locally soluble. Recall that a group G is said to be an FC-group if every element of G has only finitely many conjugates. Moreover, Kazarin [5] showed that if the locally finite group G = AB is factorized by an abelian subgroup A and a locally nilpotent subgroup B, then G is locally soluble. The aim of this article is to prove the following extension of the Kegel–Wielandt theorem to locally finite products of hypercentral groups.


2021 ◽  
Article 002029402110293
Author(s):  
Wei Zhu ◽  
Haibao Tian

This paper studies the distributed convex optimization problem, where the global utility function is the sum of local cost functions associated with individual agents. Using only local information, a novel continuous-time distributed algorithm based on a proportional-integral-derivative (PID) control strategy is proposed. Under the assumptions that the global utility function is strictly convex and that the local utility functions have locally Lipschitz gradients, exponential convergence of the proposed algorithm is established over an undirected, connected graph among the agents. Finally, numerical simulations illustrate the effectiveness of the theoretical results.
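
The paper's exact PID law and gains are not given in the abstract; the following Euler-discretized toy run shows the general shape of such dynamics, using standard proportional-integral consensus optimization plus a small derivative-style term on the local gradient (the kD term and all gains are assumptions).

```python
import numpy as np

# Each agent i holds f_i(x) = 0.5 * a_i * (x - b_i)^2, so the global
# minimizer of sum_i f_i is (a * b).sum() / a.sum() = 1.05 here.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, -2.0, 0.5, 3.0])
L = np.array([[ 2., -1.,  0., -1.],   # Laplacian of a connected 4-cycle
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [-1.,  0., -1.,  2.]])

h, kP, kI, kD = 0.01, 1.0, 1.0, 0.1
x = np.zeros(4)                        # agents' estimates
v = np.zeros(4)                        # integral of disagreement
prev_grad = a * (x - b)
for _ in range(20_000):
    grad = a * (x - b)
    dx = -grad - kP * (L @ x) - kI * v - kD * (grad - prev_grad) / h
    v += h * (L @ x)                   # integral action on disagreement
    prev_grad = grad
    x += h * dx

print(x, (a * b).sum() / a.sum())      # all entries approach 1.05
```

At equilibrium, the integral state forces consensus (L @ x = 0) while the conserved zero sum of v forces the summed gradient to vanish, which is exactly the global optimality condition.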

