Multi-Layer Perceptron (MLP) neural networks have been used extensively for classification tasks. Typically, the MLP network is trained explicitly to produce the correct classification as its output. For speech recognition, however, several investigators have recently experimented with an indirect approach: a separate MLP predictive network is trained for each class of data, and classification is accomplished by determining which predictive network best models samples of unknown speech. Results from this approach have been mixed. In this report, we compare the direct and indirect approaches to classification from a more fundamental perspective. We show how recent advances in nonlinear dimensionality reduction can be incorporated into the indirect approach, and we show how the two approaches can be integrated in a novel MLP framework. We further show how these new MLP networks can be usefully viewed as generalizations of Learning Vector Quantization (LVQ) and of subspace methods of pattern recognition. Lastly, we show that applying these ideas to the classification of temporal trajectories can substantially improve performance on simple tasks.
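The indirect approach described above can be sketched concretely. The following is a minimal illustration, not the authors' actual system: one small one-hidden-layer network per class is trained to predict the next sample of a trajectory from the current one, and an unknown trajectory is assigned to the class whose network accumulates the lowest squared prediction error. The synthetic two-class data (trajectories generated by x[t+1] ≈ c·x[t] with c = ±0.9), the network sizes, and the learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_trajectory(coef, length=30):
    """Synthetic 1-D trajectory: x[t+1] = coef * x[t] + small noise."""
    x = np.empty(length)
    x[0] = rng.normal()
    for t in range(length - 1):
        x[t + 1] = coef * x[t] + 0.05 * rng.normal()
    return x

class Predictor:
    """One-hidden-layer MLP that predicts x[t+1] from x[t]."""
    def __init__(self, hidden=8, lr=0.05):
        self.W1 = 0.5 * rng.normal(size=(hidden, 1))
        self.b1 = np.zeros(hidden)
        self.W2 = 0.5 * rng.normal(size=(1, hidden))
        self.b2 = np.zeros(1)
        self.lr = lr

    def forward(self, x):
        h = np.tanh(self.W1 @ x + self.b1)       # x has shape (1,)
        return self.W2 @ h + self.b2, h

    def train_step(self, x, y):
        yhat, h = self.forward(x)
        err = yhat - y                           # squared-error gradient
        dh = (self.W2.T @ err) * (1 - h ** 2)    # backprop through tanh
        self.W2 -= self.lr * np.outer(err, h)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(dh, x)
        self.b1 -= self.lr * dh

    def error(self, traj):
        """Accumulated squared prediction error over a whole trajectory."""
        return sum(float((self.forward(np.array([traj[t]]))[0]
                          - traj[t + 1]) ** 2)
                   for t in range(len(traj) - 1))

# One predictive network per class; classification picks the best model.
coefs = {0: 0.9, 1: -0.9}
models = {c: Predictor() for c in coefs}
for c, coef in coefs.items():
    for _ in range(200):                         # training trajectories
        traj = make_trajectory(coef)
        for t in range(len(traj) - 1):
            models[c].train_step(np.array([traj[t]]),
                                 np.array([traj[t + 1]]))

def classify(traj):
    return min(models, key=lambda c: models[c].error(traj))

test = [(c, make_trajectory(coef))
        for c, coef in coefs.items() for _ in range(20)]
accuracy = np.mean([classify(traj) == c for c, traj in test])
```

Note that no network ever sees a class label as a training target; class information enters only through which data each predictor is trained on, which is exactly the contrast with the direct approach that the report examines.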