Integrating functional components such as level-two (L2) cache memory, high-speed I/O interfaces, and the memory controller onto the die has enhanced microprocessor performance. In this architecture, certain functional units dissipate a significant fraction of the total power, while others dissipate little or none. This highly nonuniform power distribution produces a large temperature gradient with localized hot spots, which may have detrimental effects on computer performance, product reliability, and yield. Relocating functional units may reduce the junction temperature, but it can degrade performance by as much as 30%. In this paper, a multi-objective optimization is performed to minimize the junction temperature without significantly altering computer performance. The analysis was performed for the 90 nm Pentium IV Northwood architecture operating at a 3 GHz clock speed. Each functional unit on the die has a specific role, so units with similar roles were grouped together, dividing the actual Pentium IV die into four groups: front end, execution cores, bus and L2, and out-of-order engine. Repositioning constraints were determined using circuit delay models of the major functional units in a micro-architectural simulator. Depending on the scenario, relocating functional units results in anything from virtually no performance loss (a loss below 2% is considered minimal and is reported as 0%) to as much as 30% performance loss. In the results, the minimum and maximum temperatures were 56.6°C and 62.2°C, respectively, a ΔT of 5.6°C. This ΔT corresponds to a thermal design power of 60.2 W. For microprocessors with a higher thermal design power (115 W) and a higher clock speed, a larger ΔT can be realized. Based on this paper's analysis, the optimized scenario resulted in a junction temperature of 56.6°C at the cost of a 14% performance loss.
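As a rough illustration of the arithmetic behind the reported temperature range, the short sketch below computes ΔT from the two temperatures stated above and then extrapolates it to a 115 W part. The extrapolation assumes that ΔT scales linearly with thermal design power (a fixed thermal resistance and an unchanged power map), which is a simplifying assumption made here and not a result of the paper. The function name `scale_delta_t` is hypothetical.

```python
# Temperatures and powers taken from the abstract.
T_MIN_C = 56.6        # lowest junction temperature among scenarios, deg C
T_MAX_C = 62.2        # highest junction temperature among scenarios, deg C
TDP_STUDIED_W = 60.2  # thermal design power of the studied processor, W
TDP_HIGHER_W = 115.0  # higher thermal design power mentioned in the text, W

# Temperature reduction achievable by relocating functional units.
delta_t_reported = T_MAX_C - T_MIN_C  # about 5.6 deg C


def scale_delta_t(delta_t_c: float, tdp_from_w: float, tdp_to_w: float) -> float:
    """Extrapolate a temperature difference to another power level.

    ASSUMPTION (not from the paper): the temperature difference is
    proportional to total power, i.e. thermal resistance is constant
    and the relative power distribution on the die does not change.
    """
    return delta_t_c * (tdp_to_w / tdp_from_w)


if __name__ == "__main__":
    scaled = scale_delta_t(delta_t_reported, TDP_STUDIED_W, TDP_HIGHER_W)
    print(f"Reported delta T at {TDP_STUDIED_W} W: {delta_t_reported:.1f} C")
    print(f"Linear estimate at {TDP_HIGHER_W} W: {scaled:.1f} C")
```

Under this linear assumption the estimate comes to roughly 10.7°C at 115 W. A real design at higher power and clock speed would have a different power map and cooling solution, so the figure only indicates the direction and approximate magnitude of the effect.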