Deep neural networks have achieved high performance in image classification, image generation, voice recognition, natural language processing, etc.; however, they still have confronted several open challenges that need to be solved such as incremental learning problem, overfitting in neural networks, hyperparameter optimization, lack of flexibility and multitasking, etc. In this paper, we focus on the incremental learning problem which is related with machine learning methodologies that continuously train an existing model with additional knowledge. To the best of our knowledge, a simple and direct solution to solve this challenge is to retrain the entire neural network after adding the new labels in the output layer. Besides that, transfer learning can be applied only if the domain of the new labels is related to the domain of the labels that have already been trained in the neural network. In this paper, we propose a novel network architecture, namely Brick Assembly Network (BAN), which allows a trained network to assemble (or dismantle) a new label to (or from) a trained neural network without retraining the entire network. In BAN, we train labels with a sub-network (i.e., a simple neural network) individually and then we assemble the converged sub-networks that have trained for a single label together to form a full neural network. For each label to be trained in a sub-network of BAN, we introduce a new loss function that minimizes the loss of the network with only one class data. Applying one loss function for each class label is unique and different from standard neural network architectures (e.g., AlexNet, ResNet, InceptionV3, etc.) which use the values of a loss function from multiple labels to minimize the error of the network. The difference of between the loss functions of previous approaches and the one we have introduced is that we compute a loss values from node values of penultimate layer (we named it as a characteristic layer) instead of the output layer where the computation of the loss values occurs between true labels and predicted labels. From the experiment results on several benchmark datasets, we evaluate that BAN shows a strong capability of adding (and removing) a new label to a trained network compared with a standard neural network and other previous work.