The World Health Organization estimates that 50 million people are currently living with dementia worldwide, and this figure will almost triple by 2050. Current pharmacological treatments are only symptomatic, and no drug or other therapy can slow or halt the neurodegenerative process underlying dementia. Early detection of cognitive decline is therefore of the utmost importance, so that preventive interventions can be delivered in time. Recent research has shown that speech alterations may be one of the earliest signs of cognitive decline, observable well before other cognitive deficits become manifest. In this article, we propose a fully automated method that classifies subjects' audio recordings according to the stage of the pathology. In particular, we trained a specific type of artificial neural network, called an autoencoder, on spectrograms, i.e., visual representations of the subjects' audio signals. Moreover, we used a data augmentation approach to mitigate the need for large amounts of annotated training data, which represents one of the major obstacles in deep learning. We evaluated the proposed method on a dataset of 288 audio files from 96 subjects: 48 healthy controls and 48 cognitively impaired participants. The proposed method obtained good classification results compared to state-of-the-art neuropsychological screening tests and, with an accuracy of 90.57%, outperformed methods based on manual transcription and annotation of speech.
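To make the pipeline concrete, the following is a minimal sketch of the two preprocessing steps the abstract describes: converting an audio signal into a log-magnitude spectrogram and generating an augmented copy of the signal. This is an illustrative NumPy/SciPy example only; the sample rate, FFT window size, noise-based augmentation, and function names are assumptions, not the paper's actual implementation.

```python
import numpy as np
from scipy.signal import spectrogram

def audio_to_spectrogram(signal, sample_rate=16000, nperseg=512):
    """Return a log-magnitude spectrogram of a 1-D audio signal.

    Parameter values here are illustrative, not those of the paper.
    """
    freqs, times, sxx = spectrogram(signal, fs=sample_rate, nperseg=nperseg)
    return np.log(sxx + 1e-10)  # log scale stabilizes the dynamic range

def augment_with_noise(signal, snr_db=20.0, rng=None):
    """One simple augmentation: add Gaussian noise at a target SNR (dB)."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Example: one second of synthetic audio and one augmented variant,
# each mapped to the spectrogram representation fed to the autoencoder.
rng = np.random.default_rng(0)
audio = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
spec = audio_to_spectrogram(audio)
spec_aug = audio_to_spectrogram(augment_with_noise(audio, rng=rng))
```

In practice, each augmented signal yields an additional spectrogram with the same shape as the original, multiplying the number of training examples without requiring new annotated recordings.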