Motion estimation (ME) accounts for most of the computational complexity in any video coding standard. The diamond search (DS) algorithm is widely used as a fast search technique for motion estimation. In this paper, a novel architecture for the diamond search technique is proposed that handles memory addressing efficiently and reduces hardware complexity. The proposed architecture meets the speed requirements of real-time video processing without compromising area. Implemented in Verilog HDL, targeted at Virtex-5 technology and synthesized using Xilinx ISE Design Suite 12.4, the design has a critical path delay of 3.25 ns and an equivalent area of 3.5 K gate equivalents. Operating at the corresponding frequency of 308 MHz, the proposed design can process 128 CIF frames per second. The proposed architecture can therefore be incorporated in a video codec targeted at commercial devices such as smartphones, camcorders and video-conferencing systems.
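
For readers unfamiliar with the diamond search named above, the following is a minimal software sketch of the generic algorithm: a large diamond pattern is re-centred on its best match until the centre itself wins, and a small diamond pattern then refines the result. It is only an illustrative reference model and says nothing about the proposed hardware architecture or its memory addressing. The function names, the sum-of-absolute-differences (SAD) cost, the 16 x 16 block size and the search range of ±7 pixels are assumptions made for the example, not values taken from the paper.

```python
# Generic diamond search block matching (software sketch, not the paper's hardware design).

# Large diamond search pattern (centre plus 8 points) and small pattern (centre plus 4).
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]


def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` at the same position displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def diamond_search(cur, ref, bx, by, n=16, search_range=7):
    """Return the motion vector (dx, dy) found by diamond search.

    `cur` and `ref` are frames given as lists of rows of pixel values.
    `n` and `search_range` are assumed defaults for illustration.
    """
    height, width = len(ref), len(ref[0])

    def valid(dx, dy):
        # Candidate must stay inside the search window and inside the frame.
        return (abs(dx) <= search_range and abs(dy) <= search_range
                and 0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height)

    def best_of(pattern, cx, cy):
        # The centre is listed first, so ties are resolved in its favour.
        candidates = [(cx + px, cy + py) for px, py in pattern
                      if valid(cx + px, cy + py)]
        return min(candidates, key=lambda mv: sad(cur, ref, bx, by, mv[0], mv[1], n))

    cx, cy = 0, 0
    while True:
        nx, ny = best_of(LDSP, cx, cy)
        if (nx, ny) == (cx, cy):  # minimum at the centre: stop the coarse search
            break
        cx, cy = nx, ny
    return best_of(SDSP, cx, cy)  # final refinement with the small pattern
```

Because each re-centring step moves only to a strictly lower SAD, the coarse loop always terminates. The number of candidate blocks examined is typically far smaller than in an exhaustive search of the same window, which is what makes the technique attractive for real-time implementations.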