Nowadays, object detection methods based on deep learning are applied more and more to the interpretation of optical remote sensing images. However, the complex background and the wide range of object sizes in remote sensing images increase the difficulty of object detection. In this paper, we improve the detection performance by combining the attention information, and generate adaptive anchor boxes based on the attention map. Specifically, the attention mechanism is introduced into the proposed method to enhance the features of the object regions while reducing the influence of the background. The generated attention map is then used to obtain diverse and adaptable anchor boxes using the guided anchoring method. The generated anchor boxes can match better with the scene and the objects, compared with the traditional proposal boxes. Finally, the modulated feature adaptation module is applied to transform the feature maps to adapt to the diverse anchor boxes. Comprehensive evaluations on the DIOR dataset demonstrate the superiority of the proposed method over the state-of-the-art methods, such as RetinaNet, FCOS and CornerNet. The mean average precision of the proposed method is 4.5% higher than the feature pyramid network. In addition, the ablation experiments are also implemented to further analyze the respective influence of different blocks on the performance improvement.