With the rapid development of drones, many problems have arisen, such as invasion of privacy and endangering security. Inspired by biology, in order to achieve effective detection and robust tracking of small targets such as unmanned aerial vehicles, a binocular vision detection system is designed. The system is composed of long focus and wide-angle dual cameras, servo pan tilt, and dual processors for detecting and identifying targets. In view of the shortcomings of spatio-temporal context target tracking algorithm that cannot adapt to scale transformation and easy to track failure in complex scenes, the scale filter and loss criterion are introduced to make an improvement. Qualitative and quantitative experiments show that the designed system can adapt to the scale changes and partial occlusion conditions in the detection, and meets the real-time requirements. The hardware system and algorithm both have reference value for the application of anti-unmanned aerial vehicle systems.