

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 53, NO. 1, FEBRUARY 2006

A High-Resolution Time-to-Digital Converter Implemented in Field-Programmable-Gate-Arrays

Jian Song, Qi An, and Shubin Liu

Abstract—A high-resolution time-to-digital converter (TDC) implemented in a general purpose field-programmable-gate-array (FPGA) is presented. Dedicated carry lines of an FPGA are used as delay cells to perform time interpolation within the system clock period and to realize the fine time measurement. Two Gray-code counters, working on in-phase and out-of-phase system clocks respectively, are designed to get the stable value of the coarse time measurement. The fine time code and the coarse time counter value, along with the channel identifier, are then written into a first-in first-out (FIFO) buffer. Tests have been done to verify the performance of the TDC. The resolution after calibration can reach 50 ps.

Index Terms—Field programmable gate arrays (FPGAs), time-to-digital converter (TDC), time measurement, code density test.

Manuscript received September 1, 2005; revised November 4, 2005. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 10405023 and the Graduate Innovation Foundation of USTC under Grant KD2004011. Jian Song, Qi An, and Shubin Liu are with the Fast Electronics Laboratory, Department of Modern Physics, University of Science and Technology of China (USTC), Hefei 230026, China (e-mail: [email protected]). Digital Object Identifier 10.1109/TNS.2006.869820

I. INTRODUCTION

High-resolution time measurement is essential in many applications [1]–[3], and particularly in high-energy physics experiments [4]–[7]. The efficiency of particle identification using the time-of-flight (TOF) technique is directly related to the precision of time measurement. For example, in the BESIII project, the total time resolution of the TOF counter should be better than 85 ps, and the total time resolution of the main drift chamber should not be more than 500 ps [8].

There are many approaches to improving the resolution of time measurements, such as time interpolation, time stretching, the vernier method, etc. [9]–[11]. Among them, time interpolation is a straightforward way to get high resolution and a long dynamic range. Time interpolation requires delay elements, which may be delay cells of a delay locked loop (DLL), gate delay cells of an application-specific-integrated-circuit (ASIC), R-C delay lines of an ASIC, basic resources of a field-programmable-gate-array (FPGA), dedicated resources of an FPGA, etc. All of these delay elements are either resources of an ASIC [1]–[7], [11]–[14] or resources of an FPGA [15]–[22].

Taking cost, development time, and flexibility into consideration, it is desirable to use an FPGA to implement a time-to-digital converter (TDC). Much work has been done in this field [15]–[22]. In 1995, an FPGA-based approach was proposed by Kalisz et al. [15]. They made use of the difference between a latch delay and a buffer delay of QuickLogic’s FPGA and achieved a time resolution of 100 ps [16]. In 2002, Andaloussi et al. reported a novel TDC architecture based on a two-dimensional time delay matrix [17]. The architecture was implemented in an XCV3000 Virtex FPGA from Xilinx. The resulting TDC allowed for time measurements over a 21 ms range with a 150 ps resolution. In the same year, Fries et al. realized a TDC in an FPGA using a 192 MHz quadrature clock [18]. This implementation achieved a resolution of better than 1.4 ns. In 2003, a TDC was implemented in an ACEX 1K FPGA from Altera by Wu et al. [19]. This TDC used cascade chains of the FPGA and offered a time resolution of about 400 ps. In the same year, Zielinski et al. implemented a high-resolution time interval measuring system in a single FPGA device [20]. The bin size of the system was 500 ps. In 2004, a TDC with 75 ps single shot resolution was implemented in an FPGA from Xilinx by Xie et al. [21]. The TDC was based on a counter and a two step cascading delay line. In the same year, Szymanowski et al. implemented a high-resolution TDC with two stage interpolators in a QL12X16B from QuickLogic [22]. The TDC had 200 ps resolution and standard measurement uncertainty below 140 ps.

Some previous work on time measurement has been done in our laboratory [23]–[28]. In 2002, a TDC was implemented in a Virtex-II FPGA from Xilinx by using twelve digital clock managers (DCMs) for phase shifting. In 2003, a prototype board for the TOF system of the BESIII project, with a resolution reaching 17 ps, was designed using a high-performance time-to-digital converter designed by CERN/EP-MIC [13]. Now we are designing the final boards for the TOF system of the BESIII project. In 2004, a TDC was implemented for these in a general purpose FPGA device by using dedicated carry lines in the FPGA to perform time interpolation.

II. ARCHITECTURE

In the general purpose FPGAs from Altera and Xilinx, there are many dedicated carry lines, which connect adjacent basic logic elements. These dedicated carry lines are normally used to form dedicated carry chains to implement arithmetic functions such as fast adders, counters, and comparators. The delay of each carry line is short and can be considered fixed for a particular physical technology, rail voltage, and temperature range. Using these carry lines as delay cells, a high-resolution TDC can be implemented in an FPGA.

To verify our idea of time interpolation within one clock period using dedicated carry lines, a simplified TDC based on counter and time interpolation methods was implemented in an FPGA. The block diagram of the simplified TDC is shown in Fig. 1.

0018-9499/$20.00 © 2006 IEEE


Fig. 1. Block diagram of the time-to-digital converter implemented in a single FPGA device.

Fig. 2. Diagram of the carry chain of a multibit adder.

Fig. 3. Coarse time counter. (a) Architecture of the coarse time counter. (b) Timing of the coarse time counter.

A. Fine Time Measurement

One of the simplest forms used to combine the dedicated carry lines into a carry chain is a multibit adder. The Boolean equations of each adder cell are:

$$\mathrm{Sum} = A \oplus B \oplus C_i, \qquad C_o = A \cdot B + (A \oplus B) \cdot C_i \tag{1}$$

where $A$ and $B$ are inputs for the adder, $C_i$ is a carry-in bit, $C_o$ is a carry-out bit, and Sum is a sum bit. The diagram of the carry chain using a multibit adder is shown in Fig. 2. The delay time of the whole chain must be longer than one period of the system clock.

We set all A to logic one and all B except the least significant bit (LSB) to logic zero. The LSB of B is the hit signal. If there is no hit signal, all Sum would be logic one. When there is a hit signal, each bit of the sum, from the LSB to the most significant bit (MSB), will change to logic zero step by step. The changed bits indicate the elapsed time of the hit signal passing along the carry chain. At the next rising edge of the system clock the sum bits will be latched. This is the fine time measurement in a thermometer code. While trying to latch the adder’s output bits at the rising edge of the system clock, we use dual synchronizers to reduce the probability of metastability [29]–[31].

There are many kinds of conversion schemes to convert a thermometer code to a natural binary code for the fine time measurement. The binary-search encoder is chosen for its simplicity and easy implementation [32].

In the fine time measurement, it is very important to keep a uniform delay from each sum bit to the input of its corresponding register. A basic logic element that contains a look-up table (LUT) and a programmable register is used to form the 1-bit adder which generates the sum bit and the register for latching the sum bit. In addition, constraints must be set in the design tool and sometimes logic cells must be placed manually.

B. Coarse Time Counter

A synchronous counter is designed to realize the coarse time measurement. The counter may change its state while the hit arrives. To avoid ambiguous states, two Gray-code counters running at the system clock rate (one in phase and another out of phase) are used as shown in Fig. 3 [11], [13], [33]. Depending on the phase of the system clock at the arrival moment of the hit signal, one of the two counters’ outputs is selected and encoded with a binary code as the coarse time measurement code. If the hit arrives in the first half of the system clock period, the output of the “in phase” counter is selected. Otherwise the output of the “out of phase” counter is selected. The result of the fine time measurement reflects the phase of the system clock. Using the result of the fine time measurement, a stable coarse time count value is always obtained.

C. Read-Out Buffer

The result of the complete time measurement (the fine time and the coarse time measurement) is written into a first-in first-out (FIFO) buffer along with a channel identifier. The total time measurement can be expressed as follows:

$$T = T_c - T_f = T_{clk} \times N_c - \tau \times N_f \tag{2}$$

where $T$ is the result of the complete time measurement, $T_c$ is the result of the coarse time measurement, $T_f$ is the result of the fine time measurement, $T_{clk}$ is the period of the system clock, $N_c$ is the coarse time measurement code, $\tau$ is the bin size (the LSB value) of the fine time measurement, and $N_f$ is the fine time measurement code.
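As a rough illustration of Sections II-A to II-C, the following Python sketch models the latched sum bits as a thermometer code, converts that code to a bin count with a binary search, decodes a Gray-coded coarse count, and combines the two codes into a time stamp. The function names, the LSB-first bit ordering, and the sign convention (fine time subtracted from coarse time, because the chain is latched at the clock edge that follows the hit) are assumptions made for this sketch, not details taken from the authors' firmware.

```python
from typing import List


def latched_sum_bits(n_cells: int, n_passed: int) -> List[int]:
    """Model of the latched adder outputs, LSB first (hypothetical helper).

    With all A = 1 and B = 0 except the hit bit, every sum bit idles at 1
    and drops to 0 once the carry has passed through its cell.
    """
    return [0] * n_passed + [1] * (n_cells - n_passed)


def thermometer_to_binary(bits: List[int]) -> int:
    """Binary search for the 0 -> 1 boundary of a clean thermometer code.

    Returns the number of cells the hit has passed. Assumes no bubbles
    (isolated wrong bits) in the code.
    """
    lo, hi = 0, len(bits)
    while lo < hi:
        mid = (lo + hi) // 2
        if bits[mid] == 0:
            lo = mid + 1   # boundary lies above mid
        else:
            hi = mid       # boundary is at mid or below
    return lo


def gray_to_binary(gray: int) -> int:
    """Decode a Gray-coded counter value to natural binary."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def total_time_ps(n_coarse: int, n_fine: int, t_clk_ps: float, tau_ps: float) -> float:
    """Time stamp under the assumed convention T = Tclk * Nc - tau * Nf."""
    return t_clk_ps * n_coarse - tau_ps * n_fine


if __name__ == "__main__":
    n_fine = thermometer_to_binary(latched_sum_bits(64, 23))
    n_coarse = gray_to_binary(0b110)
    print(n_fine, n_coarse, total_time_ps(n_coarse, n_fine, 10000.0, 100.0))
```

A binary search touches only about log2(N) bits of the code, which is one reason it maps to compact logic, but it relies on the thermometer code being free of bubbles.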

III. TEST RESULTS

The delay time of a dedicated carry line differs among FPGAs of different companies, series, capacities, and speed grades. An EP1K50TC144-1 ACEX 1K FPGA device from Altera [34] and an XC2V4000-6BF957 Virtex-II FPGA device from Xilinx [35] were selected to implement the TDC design. A test board based on a Versa Module Eurocard (VME) platform was designed to test the performance of the TDCs. The test was made at an ambient temperature of around 20 °C and with nominal supply voltages.

We used the typical statistical test method based on a large number of measurements [11], [36]. Characterization of the differential and integral nonlinearities was performed using the statistical code density test [11]–[13], [36]. The measurement of cable delay was used to evaluate the overall resolution [3], [13], [16]. The maximum standard deviation value of different cable delay measurements is the minimum measurable time interval resolution of the TDC.

Active carry lines are lines for which the total delay time until that line is less than or equal to one clock period. The numbers of active carry lines at different clock rates were used to estimate the propagation delay time per delay cell, i.e., the bin size of the fine time measurement. For instance, the number of active delay cells is $N_1$ when the clock period is $T_1$, and the number of active delay cells is $N_2$ when the clock period is $T_2$. Then, the bin size would be $(T_1 - T_2)/(N_1 - N_2)$.

The coarse time counters run circularly at the system clock and, without external synchronization, the coarse time counter value is a relative value, not an absolute value. Furthermore, the differential linearity of the coarse counter can reach 0.05% in the dynamic range according to the test. So tests are focused on the fine time measurements.

A. Performance of TDC Implemented in FPGA of Altera

A 6-channel TDC was designed in an Altera ACEX 1K FPGA. The number of active carry lines is 69 at 128 MHz clock frequency, and the number of active carry lines is 44 at 200 MHz clock frequency. So the bin size is (7.8125 ns − 5 ns)/(69 − 44) = 112.5 ps.

The clock frequency in the performance tests is 136 MHz. The linearity of one of the channels is shown in Fig. 4(a) and (b) from a data set of more than 500 000 random hits. Other channels’ performances are similar. The time interval resolution from a data set of more than 50 000 measurement times is shown in Fig. 4(c). The obtained differential nonlinearity (DNL) is between … and … bins, and the obtained integral nonlinearity (INL) is in the range of … bins. The time interval resolution is 129.4 ps. So the resolution of the TDC is $129.4/\sqrt{2} = 91.5$ ps.

From Fig. 4(a) and (b), a periodical phenomenon of INL and DNL can be noticed. It can be explained by some features in the architecture of this FPGA [34]. In Altera ACEX 1K FPGAs, a single logic array block (LAB) is automatically used to implement a carry chain of up to eight logic elements (LE). For longer chains, several LABs are automatically linked. Also, linking every other LAB in a row to form a long carry chain as the default setting of the compiler can enhance fitting. That is to say, a carry chain longer than one LAB skips either from even-numbered LAB to even-numbered LAB, or from odd-numbered LAB to odd-numbered LAB. The delay time of the carry line between LABs is longer than the one between logic elements, which leads to the nonuniformity of the delay cells. This feature is inherent to the architecture, and a calibration scheme to compensate for this systematic nonuniform delay could be envisaged.

Fig. 4. Performance of the fine time measurement of the TDC implemented in the FPGA of Altera. (a) Differential nonlinearity. (b) Integral nonlinearity. (c) Time interval resolution.

B. Performance of TDC Implemented in FPGA of Xilinx

A 32-channel TDC was designed in a Xilinx Virtex-II FPGA. Here, two carry lines are regarded as one delay cell due to the characteristics of the FPGA structure. The number of active delay cells is 144 at 96 MHz clock frequency, and the number of active delay cells is 138 at 100 MHz clock frequency. So the bin size is 69.5 ps.

The clock frequency in the performance tests is 96 MHz. The linearity of one of the channels is shown in Fig. 5(a) and (b) from a data set of more than 2 000 000 random hits. Other channels’ performances are similar. The time interval resolution from a data set of more than 50 000 measurement times is shown in Fig. 5(c). The obtained DNL is between … and … bins, and the obtained INL is in the range of … bins. The time interval resolution is 93.1 ps. So the resolution of the TDC is $93.1/\sqrt{2} = 65.8$ ps.

The nonlinearity is mainly caused by the architecture of the Xilinx FPGA, more exactly, the different distribution delays of the clock [35]. The clock distribution of a XC2V4000-6 is shown in Fig. 6. The clock signal is first transferred from a clock pin to a buffer in the center of the FPGA, and then fanned out by the buffer to local nodes which provide the clock to nearby slices. The delay from the buffer to each node is not uniform, and the load on each local clock tree net is different. So there may be a significant difference of the distribution delay time between neighboring slices whose clocks are from different nodes. For example, there is a 39 ps distribution delay time difference between the 47th slice (1.474 ns) and the 48th slice (1.435 ns). This is inevitable when using a high-density FPGA, yet the linearity error caused by the distribution delay is systematic and can be partly compensated for in calibration.

A low-density FPGA could also implement a TDC if we only consider the implementation. But the length of the carry chain, which is restricted by the low density, will demand an increase of the system clock frequency and therefore cause other difficulties in system design.

IV. DISCUSSIONS

A. Temperature and Voltage

The devices are commercial and operate in a temperature range between 0 °C and 85 °C. The performance of the TDCs was examined over the temperature range from 10 °C to 30 °C. The performance almost doesn’t vary with temperature. Also, the performance doesn’t vary within a …% range of the normal supply voltage.
It is expected that the performance of the TDCs will change little with temperatures and supply voltages inside the operating ranges [5], [15], [16], [19], [21]. Further tests will be carried out later.

B. Calibration and Resolution

Many effects could influence the performance of the TDCs. Some of them are systematic and inherent to the architecture of the device being used. As pointed out before, we can partly compensate for them and perform calibration to obtain higher time resolutions by using the results of the statistical code density test [9], [11]–[13], [37].

The time interval resolution after calibration of the Altera ACEX 1K device is 91.9 ps. So the resolution of the TDC is $91.9/\sqrt{2} = 65.0$ ps. The time interval resolution after calibration of the Xilinx Virtex-II device is 65.3 ps. So the resolution of the TDC is $65.3/\sqrt{2} = 46.2$ ps. The calibration constants can be considered stable at temperatures from 10 °C to 30 °C and …% supply voltage variations.

Better resolutions would be obtained if comparator devices or other means had been used to get a higher slew rate and lower jitter of the input signals [29], [37], [38].
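To make the idea of calibrating from a code density test more concrete, here is a minimal Python sketch. It assumes hits that are uniformly distributed over the clock period, so that the number of hits collected by each fine-time code is proportional to the width of that bin. The function names and the bin-center lookup table are illustrative choices for this sketch; the paper does not spell out the exact calibration procedure the authors used.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def code_density(codes: Sequence[int], n_bins: int) -> Tuple[List[int], List[float], List[float]]:
    """Histogram, DNL and INL (both in bins) from random-hit fine-time codes.

    n_bins should be the number of active bins, i.e. those that fit
    inside one clock period.
    """
    counts = Counter(codes)
    hist = [counts.get(code, 0) for code in range(n_bins)]
    total = sum(hist)
    if total == 0:
        raise ValueError("no hits recorded")
    mean = total / n_bins
    # DNL: relative deviation of each bin width from the average width.
    dnl = [h / mean - 1.0 for h in hist]
    # INL: running sum of the DNL.
    inl, acc = [], 0.0
    for d in dnl:
        acc += d
        inl.append(acc)
    return hist, dnl, inl


def calibration_table_ps(hist: Sequence[int], t_clk_ps: float) -> List[float]:
    """Map each code to the center of its measured bin (hypothetical scheme).

    Each bin width is estimated as t_clk * count / total, and the widths
    are accumulated along the chain.
    """
    total = sum(hist)
    table, edge = [], 0.0
    for h in hist:
        width = t_clk_ps * h / total
        table.append(edge + width / 2.0)
        edge += width
    return table


if __name__ == "__main__":
    hist, dnl, inl = code_density([0, 0, 1, 2, 2, 2, 3, 3], n_bins=4)
    print(hist, dnl, inl)
    print(calibration_table_ps(hist, t_clk_ps=8000.0))
```

In use, the table would be filled once from a large random-hit data set and then applied to every raw fine-time code in place of the uniform estimate of bin size times code.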

Fig. 5. Performance of the fine time measurement of the TDC implemented in the FPGA of Xilinx. (a) Differential nonlinearity. (b) Integral nonlinearity. (c) Time interval resolution.
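The bin sizes and single-channel resolutions quoted in the text can be checked with a few lines of arithmetic. The sketch below assumes two things that are inferred rather than stated outright: the bin size is the change in clock period divided by the change in the number of active delay cells, and a time interval is the difference of two independent time stamps of equal precision, so the single-channel resolution is the interval resolution divided by the square root of two. The second assumption reproduces the 65.0 ps and 46.2 ps figures given in the Conclusion. Function names are hypothetical.

```python
import math


def bin_size_ps(n1: int, f1_mhz: float, n2: int, f2_mhz: float) -> float:
    """Bin size from active-cell counts at two clock frequencies."""
    t1_ps = 1e6 / f1_mhz   # clock period in ps
    t2_ps = 1e6 / f2_mhz
    return (t1_ps - t2_ps) / (n1 - n2)


def single_channel_resolution_ps(interval_rms_ps: float) -> float:
    """Resolution of one time stamp, assuming two equal independent errors."""
    return interval_rms_ps / math.sqrt(2.0)


if __name__ == "__main__":
    # Altera ACEX 1K: 69 cells at 128 MHz, 44 cells at 200 MHz.
    print(bin_size_ps(69, 128.0, 44, 200.0))
    # Xilinx Virtex-II: 144 cells at 96 MHz, 138 cells at 100 MHz.
    print(bin_size_ps(144, 96.0, 138, 100.0))
    # Interval resolutions after calibration.
    print(single_channel_resolution_ps(91.9), single_channel_resolution_ps(65.3))
```

With the Xilinx cell counts this estimate gives roughly 69.4 ps, slightly below the quoted 69.5 ps; the small gap presumably comes from rounding in the reported figures.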

C. Dead Time

The architecture of a TDC implemented in an FPGA can be adapted to meet various dead time goals under multihit situations. The dead time can be reduced to one period of the system clock by using pipeline techniques. The dead time of the Altera ACEX 1K device is about 5 ns, and the dead time of the Xilinx Virtex-II device is about 10 ns.

Fig. 6. Clock distribution of a XC2V4000-6.

D. Characteristics

Because of the programmable characteristic of an FPGA, a TDC implemented in an FPGA has flexible characteristics. Here, a simplified TDC is designed to verify the idea of time interpolation using dedicated carry lines. In fact, more functions can be integrated into the TDC, such as a trigger function. The length of the counters can be adjusted to satisfy the dynamic range required by an application. The appropriate number of channels in a TDC can be chosen according to the requirements of each application and is limited only by the capacity of the selected FPGA. Current FPGAs also offer a wide variety of interface options.

V. CONCLUSION

A high-resolution TDC implemented in an FPGA was designed to verify our idea of time interpolation by using dedicated carry lines. The resolution of the TDC implemented in an EP1K50TC144-1 device is 65.0 ps, and that of the TDC implemented in a XC2V4000-6BF957 device is 46.2 ps. With the continued development of semiconductor technology, the delay time of a dedicated carry line will shorten, so the resolution of the TDC will improve. Because of the performance and excellent characteristics of this high-resolution TDC, it can be used in many applications.

ACKNOWLEDGMENT

The authors would like to thank Jinyuan Wu of the Fermi National Accelerator Laboratory and Hongfang Chen and Yanfang Wang of the Department of Modern Physics of USTC for helpful discussions regarding this work.

REFERENCES

[1] B. K. Swann, B. J. Blalock, L. G. Clonts, D. M. Binkley, J. M. Rochelle, E. Breeding, and K. M. Baldwin, “A 100 ps time-resolution CMOS time-to-digital converter for positron emission tomography imaging applications,” IEEE J. Solid-State Circuits, vol. 39, no. 11, pp. 1839–1852, Nov. 2004.
[2] K. Karadamoglou, N. P. Paschalidis, E. Sarris, N. Stamatopoulos, G. Kottaras, and V. Paschalidis, “An 11-bit high-resolution and adjustable-range CMOS time-to-digital converter for space science instruments,” IEEE J. Solid-State Circuits, vol. 39, no. 1, pp. 214–222, Jan. 2004.
[3] K. Maatta and J. Kostamovaara, “A high-precision time-to-digital converter for pulsed time-of-flight laser radar applications,” IEEE Trans. Instrum. Meas., vol. 47, no. 2, pp. 521–536, Apr. 1998.

[4] H. Matsumoto, O. Sasaki, K. Anraku, and M. Nozaki, “Low power high resolution TDC with fast data conversion for balloon-borne experiments,” IEEE Trans. Nucl. Sci., vol. 43, pp. 2195–2198, Aug. 1996.
[5] F. Bigongiari, R. Roncella, R. Saletti, and P. Terreni, “A 250-ps time resolution CMOS multihit time-to-digital converter for nuclear physics experiments,” IEEE Trans. Nucl. Sci., vol. 46, no. 4, pp. 73–77, Apr. 1999.
[6] S. Minutoli and E. Robutti, “A 96-channel, 500 ps resolution TDC board for the BaBar experiment at SLAC,” IEEE Trans. Nucl. Sci., vol. 47, no. 2, pp. 147–150, Apr. 2000.
[7] R. Arcidiacono et al., “A new drift chamber TDC readout for the high intensity program of the NA48 experiment,” Nucl. Instrum. Methods, vol. A518, pp. 493–494, Feb. 2004.
[8] “Preliminary design report of BEPCII—BESIII detector,” in Institute of High Energy Physics of Chinese Academy of Sciences: BESIII International Collaboration, 2004, pp. 99–145.
[9] J. Kalisz, “Review of methods for time interval measurements with picosecond resolution,” Metrologia, vol. 41, pp. 17–32, 2004.
[10] Q. An, Review of methods and techniques of precise time interval measurements for high energy physics experiments, in Nuclear Techniques, to be published.
[11] “Dept. Elect. Eng. Comput.,” Ph.D. dissertation, Tech. Univ. Lisbon, Lisbon, Portugal, Oct. 2000.
[12] M. Mota and J. Christiansen, “A high-resolution time interpolator based on a delay locked loop and an R-C delay line,” IEEE J. Solid-State Circuits, vol. 34, no. 10, pp. 1360–1366, Oct. 1999.
[13] J. Christiansen, “High performance time-to-digital converter (version 2.2),” in CERN Digital Microelectronics Group, Mar. 2004. CERN.
[14] P. Dudek, S. Szczepanski, and J. V. Hatfield, “A high-resolution CMOS time-to-digital converter utilizing a vernier delay line,” IEEE J. Solid-State Circuits, vol. 35, no. 2, pp. 240–247, Feb. 2000.
[15] J. Kalisz, R. Szplet, and A. Poniecki, “Field programmable gate array based time-to-digital converter with 200-ps resolution,” IEEE Trans. Instrum. Meas., vol. 46, no. 1, pp. 51–55, Feb. 1997.
[16] R. Szplet, J. Kalisz, and R. Szymanowski, “Interpolating time counter with 100 ps resolution on a single FPGA device,” IEEE Trans. Instrum. Meas., vol. 49, no. 4, pp. 879–883, Aug. 2000.
[17] M. S. Andaloussi, M. Boukadoum, and E.-M. Aboulhamid, “A novel time-to-digital converter with 150 ps time resolution and 2.5 ns pulse-pair resolution,” in 14th Int. Conf. Microelectron., 2002, pp. 123–126.
[18] M. D. Fries and J. J. Williams, “High-precision TDC in an FPGA using a 192-MHz quadrature clock,” in Proc. IEEE Conf. Rec. NSS., vol. 1, 2002, pp. 580–584.
[19] J. Wu, Z. Shi, and I. Y. Wang, “Firmware-only implementation of time-to-digital converter in field programmable gate array,” in Proc. IEEE Conf. Rec. NSS., vol. 1, 2003, pp. 177–181.
[20] M. Zielinski, D. Chaberski, M. Kowalski, R. Frankowski, and S. Grzelak, “High-resolution time-interval measuring system implemented in single FPGA device,” Measurement, vol. 35, pp. 311–317, 2004.
[21] D. K. Xie, Q. C. Zhang, G. S. Qi, and D. Y. Xu, “Cascading delay line time-to-digital converter with 75 ps resolution and a reduced number of delay cells,” Rev. Sci. Instrum., vol. 76, 2005. 014 701.
[22] R. Szymanowski and J. Kalisz, “Field programmable gate array time counter with two-stage interpolation,” Rev. Sci. Instrum., vol. 76, 2005. 045 104.
[23] H. Chen, Q. An, P. Zhang, and Y. Wang, “High precision short time-interval measurement system,” Nucl. Elect. Dete. Technol., vol. 19, pp. 51–54, Jan. 1999.
[24] J. Liu and Q. An, “Analysis of the factors that influence the time resolution of TOF in BESIII detector,” High Energy Phys. Nucl. Phys., vol. 28, pp. 1170–1175, Nov. 2004.


[25] J. Song, Q. An, S. Liu, C. Ye, N. Zhang, and J. Li, “Design of precision time measure system based on virtual instrumentation,” in Proc. Laser Interactional Matter, 2005, pp. 1–5.
[26] W. Shen, S. Liu, J. Liu, and Q. An, “Research on a high precision synchronous time stretcher in BESIII,” J. Univ. Sci. Technol. China, vol. 35, Dec. 2005.
[27] J. Song, Q. An, and S. Liu, “High-precision time interval measure instrument based on PCI,” J. Elect. Meas. Instrum., vol. 20, Jun. 2006.
[28] S. Liu, J. Song, and Q. An, “Test of data driven TDC application in high energy physics experiments,” in Nucl. Techniques, unpublished.
[29] J. U. Horstmann, H. W. Eichel, and R. L. Coates, “Metastability behavior of CMOS ASIC flip-flops in theory and test,” IEEE J. Solid-State Circuits, vol. 24, no. 2, pp. 146–157, Feb. 1989.
[30] C. Foley, “Characterizing metastability,” in Proc. 2nd IEEE Symp. Adv. Res. Asynchronous Circuits Syst., vol. 18–21, 1996, pp. 175–184.
[31] R. Szymanowski, “Metastability effects in a two-stage time interpolator,” Metrology Meas. Syst., vol. x, pp. 319–329, 2003.
[32] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed. New York: McGraw-Hill, 2001, pt. 12.
[33] C. Ljuslin, J. Christiansen, A. Marchioro, and O. Klingsheim, “An integrated 16-channel CMOS time to digital converter,” IEEE Trans. Nucl. Sci., vol. 41, no. 4, pp. 1104–1108, Aug. 1994.
[34] ACEX 1K Programmable Logic Device Family Data Sheet. San Jose, CA: Altera Corp., May 2003.
[35] Virtex-II Platform FPGAs: Complete Data Sheet. San Jose, CA: Xilinx Inc., Jun. 2004.
[36] J. Doernberg, H.-S. Lee, and D. A. Hodges, “Full-speed testing of A/D converters,” IEEE J. Solid-State Circuits, vol. SSC-19, no. 6, pp. 820–827, Dec. 1984.
[37] A. Mantyniemi, T. Rahkonen, and J. Kostamovaara, “A nonlinearity-corrected CMOS time digitizer IC with 20 ps single-shot precision,” in Proc. IEEE Int. Symp. Circuits Syst., 2002, pp. I-513–I-516.
[38] J. Kalisz, M. Pawlowski, and R. Pelka, “A simple, precise, and low jitter delay/gate generator,” Rev. Sci. Instrum., vol. 74, pp. 3507–3509, 2003.