Optimum design of power distribution system via clock ... - IEEE Xplore

Optimum Design of Power Distribution System via Clock Modulation

Roger Weekly, Sungjun Chun, Anand Haridass, Colm O’Reilly, James Jordan, Frank O’Connell
IBM, 11400 Burnet Road, Austin, TX 78758
Tel: 512-838-6468, Fax: 512-838-1717
Email: {weekly, sungjun, anandh, colmor, jdjordan, oconnell}@us.ibm.com

Abstract

This paper presents a method for extracting the current excitations that a microprocessor (µP) can present to its power distribution system (PDS) as a function of frequency. The method uses a clock modulation technique to measure the impedance seen by the µP.

I. Introduction

In modern computer systems, one of the most critical elements in successfully designing a PDS is appropriately defining the decoupling capacitors to be used at the different stages (i.e., chip/module/card) to provide low-impedance paths to the chips. Traditionally, an impedance target has been set based on the assumption that current transients from the µP can exist at any time scale, i.e., on the order of msec, µsec, and nsec. Consequently, the decoupling capacitors have been specified to meet a flat impedance target over the corresponding frequency bands, from DC to harmonics of the clock frequency. However, as cost becomes a driving factor of the design, understanding the impact of each stage of decoupling capacitors on µP performance is becoming more significant.

In [1], [2], an empirical approach was used to validate the benefit of the on-module capacitors and on-chip capacitance. Although this approach demonstrated the gross sensitivity of the maximum operating frequency of a processor to the inclusion of decoupling capacitors at certain stages, it has not been useful as a guide to identifying the optimum quantity and value of capacitance to include.

In this paper, an attempt is made to understand the current profiles from the µP. First, the clock modulation method, which can measure the impedance seen by the µP, is introduced. The measurement method has been applied to an IBM POWER4® chip with dual cores, and the results have been correlated with simulations. Then this measured impedance has been used in conjunction with measured voltages to extract the current excitations possible out of the µP under certain test code conditions. By characterizing a µP with a plethora of test code, an impedance criterion, not necessarily flat over the frequency band, may be defined to optimize a PDS decoupling design.

II. Clock Modulation Method

The central idea of the clock modulation method is to provide one or more means in a set of electronics to gate (via the ‘Gating signal’ in Figure 1 below) the switching activity of that set of electronics (i.e., ‘Clock’ in Figure 1 below). This gating switches the electronics from a minimal level of activity to a higher level of activity, such that the current draw from the voltage source powering those circuits changes significantly between the gated and non-gated states.

(Figure 1 block labels: Voltage, Current, CLKg, Gating signal (Modulation Source), Circuits, Ground)

Figure 1. Clock Gating Mechanism

A modulation source is then used to control the gating function to the set of electronics, such that the current drain from the voltage source at the circuits is excited with a binary pattern, with complete control over the pattern, period, and modulation frequency by the modulation source. For the simple case, Figure 2 shows a 'Clock' source waveform and a 'Modulation' waveform corresponding to a circuit as represented in Figure 1. It also shows the resulting 'CLKg' waveform driving into the circuits. For CMOS circuits, the current draw from the voltage source would look something like the 'Current' waveform shown. Depending upon the power distribution impedance, the voltage-to-ground potential at the circuits could look like the 'Voltage' waveform shown in Figure 2.

0-7803-8128-9/03/$17.00 © 2003 IEEE

Figure 2. Clock gated waveforms (Clock, Modulation, CLKg, Current, and Voltage versus time)
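The gating behavior described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes a gating pattern that enables the clock for a fixed fraction of each modulation period, and it models the supply current as one of two DC levels. The function names and the current values (5 A gated, 25 A running) are hypothetical and chosen only for illustration.

```python
def modulation_frequency(f_clock_hz, gate_period_cycles):
    """Modulation frequency when one gating period spans a whole number of clock cycles."""
    return f_clock_hz / gate_period_cycles


def gating_pattern(n_cycles, gate_period_cycles, duty=0.5):
    """Return one entry per clock cycle: 1 if the clock is passed (CLKg toggles), 0 if gated."""
    on_cycles = int(gate_period_cycles * duty)
    return [1 if (i % gate_period_cycles) < on_cycles else 0 for i in range(n_cycles)]


def supply_current(gate, i_gated_a, i_running_a):
    """Two-level (binary) current draw: assumed constant within each gating state."""
    return [i_running_a if g else i_gated_a for g in gate]


# Hypothetical example: gate a clock with a period of 8 cycles at 50% duty.
gate = gating_pattern(n_cycles=16, gate_period_cycles=8)
current = supply_current(gate, i_gated_a=5.0, i_running_a=25.0)
```

Changing `gate_period_cycles` sweeps the modulation frequency, which is how a binary current excitation can be placed at a chosen frequency.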

In many logic systems designed today, the clock to the circuits is multi-phase. In this more complex case, there may be advantages to gating only some of the clock phases. Figure 3 below illustrates this case when there are two clock phases. The result is that the modulation is no longer 100%, but is a ratio of the power dissipated during one clock phase compared to both phases active. In some systems, allowing a background power dissipation is more realistic to production operating conditions, as it will result in a damping effect for any package resonances that may be excited.

(Figure 3 trace labels: Clock, Phase 1 Modulation, CLKg, Voltage)

fundamental frequency content out of the excitation, which is a binary pattern. The power of the high and low states during the modulation has been determined by measuring the DC currents under the two cases where the clocks are and are not gated. This delta in DC current has been used to translate the measured AC voltage to the impedance as a function of frequency. While this paper focuses on impedance magnitude, the phase of the impedance can also be determined if so desired, since the gating is in sync with the current excitation.

Figure 4 below shows the impedance seen at the FPU and L2CNTL from 100 kHz to 200 MHz. Both curves in Figure 4 clearly show the effects of the decoupling capacitors at each stage. The dips in the impedance below 30 MHz are due to the on-card capacitors, and the dip between 30 MHz and 100 MHz is due to the on-module capacitors. On the POWER4® chip, a larger portion of the on-chip capacitance was added to the L2CNTL area than to the FPU area. This is also captured in the curves, which show lower impedance for the L2CNTL than for the FPU above 100 MHz.

Figure 4. Measured Z(f) at FPU and L2CNTL. FPU (solid line), L2CNTL (dashed line).

Figure 5. Measured Z(f) vs. simulated Z(f) at FPU. Measurement (solid line), simulation (dashed line).

A modeling technique from [3], which can model the power distribution with planes accurately, has been used to simulate the impedance to validate the measurements. Figure 5 above shows the correlation between the measurement and simulation at FPU, showing excellent agreement.
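One plausible reading of the step that translates the DC current delta and the measured AC voltage into an impedance is sketched below. It is a sketch under stated assumptions, not the authors' procedure: it assumes the gated current is an ideal 50% duty-cycle square wave, whose fundamental has a peak amplitude of 2ΔI/π, and it assumes the voltage at the modulation frequency is also expressed as a peak amplitude. The function names and the numeric values are hypothetical.

```python
import math


def fundamental_current_amplitude(i_gated_a, i_running_a):
    """Peak amplitude of the fundamental of a 50% duty square wave
    that switches between the gated and running DC current levels."""
    delta_i = i_running_a - i_gated_a
    return 2.0 * delta_i / math.pi


def impedance_magnitude(v_fund_peak_v, i_gated_a, i_running_a):
    """|Z| at the modulation frequency: measured voltage fundamental
    divided by the current fundamental (both peak amplitudes)."""
    return v_fund_peak_v / fundamental_current_amplitude(i_gated_a, i_running_a)


# Hypothetical sweep: (modulation frequency in Hz, measured voltage peak in V).
measurements = [(1.0e6, 0.020), (10.0e6, 0.012), (100.0e6, 0.030)]
z_of_f = [(f, impedance_magnitude(v, i_gated_a=5.0, i_running_a=25.0))
          for f, v in measurements]
```

If the instrument reports RMS values, both the voltage and the current fundamental would need to be converted consistently before dividing.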

IV. Extraction of Current(f) Excitations from µP

After obtaining the impedance as seen by the chip, the chip currents under production workloads can be determined by measuring the voltage under these conditions and dividing by the measured impedance. Worst-case currents at different frequencies can be measured by peak-detecting voltage measurements via a spectrum analyzer while running code on the system and dividing by the impedance.

To simulate a system workload exhibiting a wide spectrum of transient and steady-state electrical responses, a set of 127 different computational Fortran kernels was created, from simple kernels such as a data copy (b(i) = a(i), for i = 1 to n) to more complex loops that mimic various equation solvers. The kernels were written such that the size of their data arrays could be easily changed from 100 to 300000, thus changing the cache reuse patterns for different runs of the same kernel. For example, the copy loop with a data array size of n = 100 will contain all of its data within the L1 data cache of the µP, but at some larger n, the two data arrays will no longer fit within the L1 data cache and the loop will have to reload its data from the L2 cache. This change in the cache behavior will cause µP activity to occur in a lower frequency band.

Elaborating on the example, Figure 6 and Figure 7 show the maximum voltage observed under the situation described above at the FPU and L2CNTL sense points, respectively. Thus, the magnitude of the current from the µP as a function of frequency can be extracted as,

\[ I(f) = \frac{V(f)}{Z(f)} \]

where V(f) and Z(f) are the magnitude values at the µP.
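The division of the measured voltage magnitude by the measured impedance magnitude can be sketched point by point over frequency. This is a minimal illustration that assumes both quantities were sampled at the same frequency points; the function name and the sample values are hypothetical and are not data from the paper.

```python
def extract_current(v_mag, z_mag):
    """Return |I(f)| = |V(f)| / |Z(f)| for matching frequency points."""
    if len(v_mag) != len(z_mag):
        raise ValueError("V(f) and Z(f) must share the same frequency grid")
    return [v / z for v, z in zip(v_mag, z_mag)]


# Hypothetical magnitudes at three frequency points (volts and ohms).
v_mag = [0.010, 0.004, 0.006]
z_mag = [0.002, 0.001, 0.003]
i_mag = extract_current(v_mag, z_mag)  # amperes at each frequency point
```

If the voltage and impedance were measured on different frequency grids, one of them would need to be interpolated onto the other before the division.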

Figure 8 and Figure 9 show this extracted current profile at the FPU and L2CNTL, respectively. Note that the current excitations from the µP are not distributed uniformly, as shown in Figure 8 and Figure 9.


Figure 6. Measured V(f) at FPU

Figure 7. Measured V(f) at L2CNTL

Figure 8. Extracted I(f) at FPU

Figure 9. Extracted I(f) at L2CNTL

Instead, there are specific frequency bands where a lot of µP activity occurs, and there are some bands where there is very little activity. From this, impedance criteria can be optimized to vary with frequency to minimize cost while still resulting in an effective design of the PDS.
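One way a frequency-dependent impedance criterion might be derived from the extracted current is sketched below. The paper does not give a formula for this step, so this is an assumption: it applies the common target-impedance relation, dividing an allowed noise voltage by the current magnitude in each band. The function name, the noise budget, and the upper cap are all hypothetical.

```python
def impedance_target(i_mag, v_budget, z_cap=None):
    """Per-band target |Z| = allowed noise voltage / extracted |I(f)|.

    Bands with little or no current get a relaxed target, optionally
    limited by z_cap so the target never becomes unbounded.
    """
    targets = []
    for i in i_mag:
        z = v_budget / i if i > 0 else float("inf")
        if z_cap is not None:
            z = min(z, z_cap)
        targets.append(z)
    return targets


# Hypothetical extracted currents (A) in three bands and a 50 mV noise budget.
z_target = impedance_target([10.0, 2.0, 0.0], v_budget=0.05, z_cap=0.1)
```

Bands with high current demand a low impedance, while bands with little activity can tolerate a higher one, which is where decoupling cost could be reduced.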

V. Conclusion

In this paper, a method to extract the current profiles from the µP was introduced. The clock modulation method was introduced to measure the impedance seen by the µP, and the result was correlated with simulation to ensure the validity of the measurements. The maximum voltage on the µP was measured under a realistic system workload exhibiting a wide spectrum of transient and steady-state electrical responses. Based on the measured impedance and voltage data, the current that the µP can present to the PDS was extracted. This current data showed different µP activity as a function of frequency. Comprehensive workload analysis could justify an impedance criterion that is not flat, but targets the decoupling to those frequency bands where a µP shows activity.

VI. Acknowledgements

We would like to acknowledge Matthew Rankin for all the invaluable discussions and Marty Hipolito for all the help in procuring equipment.

References

[1] Tawfik Rahal-Arabi, Greg Taylor, Matthew Ma, and Clair Webb, “Design & Validation of the Pentium® III and Pentium® 4 Processors Power Delivery,” Symposium on VLSI Circuits Digest of Technical Papers, pp. 220-223, June 2002.
[2] Tawfik Rahal-Arabi, Greg Taylor, Matthew Ma, Jeff Jones, and Clair Webb, “Design and Validation of the Core and IOs Decoupling of the Pentium® III and Pentium® 4 Processors,” Proceedings of the 11th Topical Meeting on Electrical Performance of Electronic Packaging, pp. 249-252, October 2002.
[3] Sungjun Chun, Madhavan Swaminathan, Larry D. Smith, Jegganathan Srinivasan, Zhang Jin, and Mahadevan K. Iyer, “Modeling of Simultaneous Switching Noise in High Speed Systems,” IEEE Transactions on Advanced Packaging, Vol. 24, No. 2, pp. 132-142, May 2001.
