Quasi-Equivalence of Width and Depth of Neural Networks
Abstract
While classic studies proved that wide networks allow universal approximation, recent research and the successes of deep learning demonstrate the power of network depth. Motivated by this symmetry, we investigate whether the design of artificial neural networks should have a directional preference, and how the width and depth of a network interact. We address this fundamental question by establishing a quasi-equivalence between the width and depth of ReLU networks. Specifically, we formulate a transformation from an arbitrary ReLU network to a wide network and to a deep network, for either regression or classification, such that each implements essentially the same capability as the original network. That is, a deep regression/classification ReLU network has a wide equivalent, and vice versa, subject to an arbitrarily small error. Interestingly, the quasi-equivalence between wide and deep classification ReLU networks is a data-driven version of De Morgan's law.
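
For readers unfamiliar with the logical identity invoked above: De Morgan's law states that a disjunction of propositions equals the negation of the conjunction of their negations. The Python sketch below only checks that Boolean identity exhaustively; it is not the network construction of this paper. The function names `wide_or`, `deep_not_and_not`, and `de_morgan_holds` are hypothetical, and the "wide" and "deep" labels are an informal analogy (propositions combined side by side versus folded in one after another).

```python
from itertools import product


def wide_or(bits):
    # "Wide" view (analogy only): all propositions are available side by side
    # and are combined in a single disjunction.
    return any(bits)


def deep_not_and_not(bits):
    # "Deep" view (analogy only): the negated propositions are folded in
    # sequentially by conjunction, and the final result is negated.
    acc = True
    for b in bits:
        acc = acc and (not b)
    return not acc


def de_morgan_holds(n):
    # Exhaustively compare both forms on all 2**n truth assignments.
    return all(
        wide_or(bits) == deep_not_and_not(bits)
        for bits in product([False, True], repeat=n)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, de_morgan_holds(n))
```

The two functions agree on every truth assignment, which is the classical identity; the data-driven counterpart claimed in the abstract concerns ReLU networks and holds up to an arbitrarily small error.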