Building extraction from high-resolution remote sensing images is of great significance for urban planning, population statistics, and economic forecasting. However, automatic building extraction from such images remains challenging. On the one hand, extracted buildings are often partially missing or incomplete because hue and texture vary within a single building, especially when the building is large. On the other hand, the extracted footprints of buildings with complex shapes are often inaccurate. To address these problems, we propose a new deep learning network, termed Building Residual Refine Network (BRRNet), for accurate and complete building extraction. BRRNet consists of two parts: a prediction module and a residual refinement module. The prediction module, based on an encoder–decoder structure, introduces atrous convolutions with different dilation rates to extract more global features by gradually increasing the receptive field during feature extraction. The residual refinement module takes the preliminary building extraction result output by the prediction module as its input and further refines the residual between that result and the ground truth, thus improving the accuracy of building extraction. In addition, we use Dice loss as the loss function during training, which effectively alleviates the problem of data imbalance and further improves the accuracy of building extraction. Experimental results on the Massachusetts Building Dataset show that our method outperforms five other state-of-the-art methods in terms of the integrity of buildings and the accuracy of complex building footprints.
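To make the data-imbalance point concrete, the following is a minimal sketch of the commonly used soft Dice formulation, 1 - 2|P ∩ G| / (|P| + |G|), in plain Python. The function name `dice_loss`, the smoothing term `eps`, and the toy masks are illustrative assumptions and are not taken from the BRRNet training code; the exact variant used there may differ.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss over flat sequences.

    pred   : predicted building probabilities in [0, 1], one per pixel
    target : ground-truth labels, 1 for building and 0 for background
    eps    : small smoothing constant (an assumption) that avoids 0/0
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


# Toy imbalanced mask: only 2 of 10 pixels belong to a building.
target = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Predicting "background everywhere" is right on 8 of 10 pixels,
# yet its Dice loss is close to 1 because no building pixel is found.
all_background = [0.0] * 10

# A perfect prediction has a Dice loss of (approximately) 0.
perfect = [float(t) for t in target]

print(dice_loss(all_background, target))
print(dice_loss(perfect, target))
```

Because the loss depends only on the overlap with the building class, the large number of correctly classified background pixels does not mask a failure to detect buildings, which is the sense in which Dice loss eases class imbalance.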