# DESIGN AND IMPLEMENTATION OF CMOS TIME-TO-DIGITAL CONVERTER ASIC FOR INO EXPERIMENT

By

POOJA SAXENA

### ENROLMENT NO. ENGG01201004001

CONSTITUENT INSTITUTION BARC, MUMBAI

A THESIS SUBMITTED TO THE BOARD OF STUDIES IN ENGINEERING SCIENCES

In Partial Fulfillment of Requirements for the degree of

### DOCTOR OF PHILOSOPHY

OF

### Homi Bhabha National Institute, Mumbai



July 25, 2016

# Homi Bhabha National Institute

Recommendations of the Viva Voce Board

As members of the Viva Voce Board, we certify that we have read the dissertation prepared by *Pooja Saxena* entitled '**Design and Implementation of CMOS Time-to-Digital Converter ASIC for INO experiment**' and recommend that it may be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.

| Board member                                      | Date |
|---------------------------------------------------|------|
| **Chairman** – Dr. Archana Sharma                 |      |
| **Guide/Convener** – Dr. Vivek Datar              |      |
| **Co-Guide** – Prof. Naba Mondal                  |      |
| **Technology Adviser** – Dr. V. B. Chandratre     |      |
| **Member** – Dr. Vaibhav Hanamant Patankar        |      |
| **Member** – Dr. B. Dikshit                       |      |
| **Member** – Dr. Vinod Gopika                     |      |
| **Examiner** – .....                              |      |

Final approval and acceptance for this dissertation is contingent upon the candidate's submission of the final copies of the dissertation to HBNI.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it may be accepted as fulfilling the dissertation requirement.

Date:

Place:

Guide – Dr. Vivek Datar

Co-Guide – Dr. Naba Mondal

### STATEMENT BY THE AUTHOR

This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at Homi Bhabha National Institute (HBNI) and is deposited in the Library to be made available to borrowers under rules of the HBNI.

Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the Competent Authority of HBNI when in his or her judgment, the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

Mumbai July 25, 2016

Pooja Saxena

#### DECLARATION

I hereby declare that the thesis entitled 'Design and Implementation of CMOS Time-to-Digital Converter ASIC for INO experiment', submitted to Homi Bhabha National Institute (HBNI), Mumbai, India, for the award of Doctor of Philosophy in Engineering Science, is the record of work carried out by me during the period from March 2010 to August 2015 under the guidance of Dr. V. B. Chandratre, Head, Microelectronics Section, ED, BARC, Mumbai; Dr. V. M. Datar, TIFR, Mumbai; and Prof. Naba Mondal, TIFR, Mumbai. The work is original and has not been submitted earlier, in whole or in part, for a degree/diploma at this or any other Institute/University of higher learning.

Mumbai July 25, 2016

Pooja Saxena

# LIST OF PUBLICATIONS ARISING FROM THESIS

### **Peer Reviewed Research Papers**

1. K. Hari Prasad, Menka Sukhwani, Pooja Saxena, V. B. Chandratre, 'A Four Channel Time-to-digital Converter ASIC with in-built Calibration and SPI Interface', *Nuclear Instruments and Methods in Physics Research A*, vol. 737, pp. 117-121, 2014.
2. **Pooja Saxena**, Sudheer K. M, V. B. Chandratre, 'Design of Novel Current balanced Voltage controlled delay element', *International Journal of VLSI design & Communication Systems (VLSICS)*, vol. 5, no. 3, June 2014.
3. **Pooja Saxena**, K. Hari Prasad, V. B. Chandratre, 'A Current Balanced Logic buffer based Time-to-Digital Converter with improved resolution', *Global Journal of Researches in Engineering: F Electrical and Electronics Engineering*, vol. 15, issue 4, 2015.
4. **Pooja Saxena**, K. Hari Prasad, V. B. Chandratre, 'Design of Multi-hit TDC using CBL delay element', *International Journal of Electronics and Electrical Engineering*, vol. 5, no. 1, July 2016 (in press).

### **Conference Proceedings**

1. Pooja Saxena, K. Hari Prasad, V. B. Chandratre, 'Comparative Analysis of tapped delay line architectures used in time stamping', 60<sup>th</sup> DAE-BRNS National Symposium on Nuclear Physics, December 7-11, 2015, Sri Sathya Sai Institute of Higher Learning, Prasanthi Nilayam.
2. Pooja Saxena, K. Hari Prasad, V. B. Chandratre, 'Implementation of Multihit Time-to-Digital Converter using Tapped Delay Line', *DAE-BRNS National Symposium on Nuclear Instrumentation*, November 19-21, 2013, Training School, BARC, Anushakti Nagar, Mumbai.
3. K. Hari Prasad, Menka Sukhwani, Pooja Saxena, V. B. Chandratre, 'A CMOS standard cell based TDC', Proceedings of International Conference on VLSI, Communication, Advanced Devices, Signals & Systems and Networking (VCASAN), vol. 258, pp. 87-93, July 17-19, 2013, B.N.M. Institute of Technology, Bangalore.
4. K. Hari Prasad, V. B. Chandratre, Pooja Saxena, C. K. Pithawa, 'FPGA based Time-to-Digital Converter', *Proceedings of DAE Symposium on Nuclear Physics*, vol. 56, December 26-30, 2011, Andhra University, Visakhapatnam.
5. Bheesette Satyanarayana, Sudeshna Dasgupta, Sonal Dhuldhaj, Naba Mondal, Nagaraj Panyam, Shobha Rao, Deepak Samuel, Mandar Saraf, Ravindra Shinde, Suresh Upadhya, Vinay Chandratre, Veena Salodia, Pooja Saxena, Menka Tewani, Satyajit Saha, Yogendra Viyogi, 'Electronics and data acquisition systems for RPC based INO ICAL detector', Proceedings of XI Workshop on Resistive Plate Chambers and Related Detectors, PoS (RPC2012), February 5-10, 2012, Laboratori Nazionali di Frascati, Italy.

# ACKNOWLEDGMENT

First of all, I am thankful to God for giving me strength and perseverance throughout my research period. It was His gracious wish to accommodate me among the cordial, helpful and understanding people in Mumbai during my research journey.

I would like to express my sincere gratitude to my guides Dr. V. M. Datar, TIFR, Mumbai and Prof. Naba Mondal, TIFR, Mumbai for their kind advice and help. I am equally thankful to my technical adviser, Dr. V. B. Chandratre, Head, Microelectronics Section, ED, BARC, Mumbai for his persistent support and meticulous supervision. It was because of his constant encouragement, consistent guidance and accurate direction that I was able to publish papers on different platforms and complete this work. His mentorship played a significant role in the completion of this journey, as he was always determined to redirect my focus towards the work despite difficulties on my personal front.

I am indebted to my doctoral committee Chairman Dr. Kallol Roy, who stood as a pillar of support at all times during the course of this work. His enthusiasm, dedication towards research and supportive nature towards students immensely inspired me throughout my research journey. I am also thankful to my doctoral committee members Dr. Vaibhav Hanamant Patankar, Dr. B. Satyanarayana, Dr. Archana Sharma and Dr. Vinod Gopika for their eagerness to understand my work during the annual progress reviews. Their open queries on my work and the related discussions always helped me improve my technical understanding of my research topic.

This work has been accomplished owing to the valuable contributions of Mr. Hari Prasad Kolla in the design, development and testing of the TDCs. I am also very thankful to Ms. Menka Sukhwani and Mr. Sudheer K. Mohammad for their help with the literature survey and the preparation of publications through to the completion of the thesis. Their useful suggestions and detailed technical and academic discussions have been a boon throughout this work. I am also thankful to Mr. Sourav Mukhopadhyay, Mr. Shiv Kumar, Ms. Veena Sahlodhiya and Ms. Megha Thomas, who were with me as good and helpful friends during this journey.

A big thanks to Ms. Kulkerni, Warden of Amrapali hostel, who provided me with a secure and academic environment in the hostel during this journey. Last but not least, it is because of the blessings, support, love and encouragement of my parents Dr. Hari Om Saxena and Ms. Lata Saxena and my brother Amit that I am able to reach this '*milestone*'. I am also very much obliged to my husband Abhishek, who patiently supported me during every step of my research.

# Abstract

This thesis starts with a literature survey of existing time interval measurement techniques, along with their merits and limitations, which paved the way to design innovative TDCs to the specifications provided by the INO group. In the first endeavor, a low power 4-channel Vernier TDC ASIC with SPI based readout and an in-built digital calibrator is designed and fabricated in AMS 0.35 $\mu$m CMOS technology. The developed ASIC is functionally characterized, and the achieved resolution (LSB) is 127 ps over a 1.4 $\mu$s range. It consumes less power (43 mW at 3.3 V) than an earlier reported design (150 mW) in the same technology. This ASIC was subsequently interfaced with the RPC detector FE electronics, providing a timing resolution (~2.6 ns) similar to that of the commercial HPTDC.

The Vernier TDC ASIC design is further enhanced with multi-hit capability and four operating modes (time interval, common start, common stop, and calibration), so that it can serve the multi-hit requirements of the INO experiment as well as other HEP experiments. This is achieved by conceptualizing a novel architecture for a Vernier multi-hit TDC and by analyzing the scope of the Vernier technique for negative time interval measurement. This 8-channel multi-hit Vernier TDC ASIC is designed for a resolution of 100 ps over a user-selectable dynamic range of 10 $\mu$s, 20 $\mu$s, 30 $\mu$s, or 60 $\mu$s. The minimum measurable pulse width of a multi-hit signal is ~1 ns.

Moreover, for high event rate HEP experiments, a Flash technique based multi-hit TDC is designed in 0.35 $\mu$m CMOS technology, using a novel current balanced logic (CBL) delay element with the aim of improved resolution (<200 ps). This four-channel TDC operates in time interval and common stop modes, with a resolution of 150 ps over a selectable dynamic range of 10 $\mu$s to 40 $\mu$s in steps of 10 $\mu$s, and a multi-hit pulse width measurement of ~1 ns. The impact of PVT variations on the resolution is circumvented with a CBL delay element based delay lock loop. The scope of the multi-hit Vernier and Flash TDC ASIC designs in this thesis is limited to performance validation through simulation results. The work carried out is discussed in eight chapters.

**Chapter-1** gives a brief history of the neutrino, a fundamental particle, along with its place in the Standard Model of particle physics. The major experiments that have been carried out to study this particle are reviewed. To address the open questions pertaining to the neutrino and to study its oscillation parameters more precisely, the physics potential of the iron calorimeter (ICAL) detector to be used in the proposed India based Neutrino Observatory (INO) experiment is discussed. The requirement for time interval measurement in this experiment, and the resulting specifications, are also discussed.

**Chapter-2** reviews the various time interval measurement techniques used for the design and implementation of TDCs, highlighting their merits, demerits, and limitations. In addition, the scope of TDCs in other (medical, industrial, and consumer) applications is included. The chapter concludes with a chronological account of these techniques since the 1960s, presented as a timeline flow diagram.

**Chapter-3** discusses the suitability of CMOS technology for meeting the TDC specifications provided by the INO group. It then discusses the various factors in CMOS technology that affect the resolution and precision of a TDC. Further, the need for a delay lock loop (DLL) to compensate for CMOS process variations in TDC development is explained through its operating principle. The key design blocks of the DLL are also described. The chapter concludes with the CMOS ASIC design steps followed for the TDC developments.

**Chapter-4** presents the low power design and implementation of a 4-channel CMOS standard cell based Vernier TDC ASIC with SPI. The key design blocks, such as the ring oscillators, the leading edge phase detector and the in-built digital calibrator, which meet the low power and area requirements, are described. The layout methods used to avoid noise coupling and to achieve the requisite TDC performance are also included. The chapter ends with the performance validation of this Vernier TDC ASIC design using simulation results.

**Chapter-5** presents the requirement for a multi-hit TDC. To meet it, a novel architecture for an 8-channel multi-hit TDC with pulse width measurement of ~1 ns using the Vernier technique is conceptualized and designed. The proposed ASIC is designed to work in different modes (trigger and calibration) with optional (serial or parallel) readout to enhance its usability. The chapter ends with the performance validation of this Vernier multi-hit TDC design using simulation results.

**Chapter-6** presents a 4-channel multi-hit TDC ASIC based on time stamping (Flash technique), using a novel current balanced logic (CBL) based delay element with improved performance in terms of resolution (~150 ps) and pulse width measurement (~1 ns) in 0.35 $\mu$m CMOS technology. This prototype design is of academic interest, intended for future performance comparison with the Vernier TDC in terms of measurement rate and linearity. The performance validation of this CBL based time stamper using simulation results concludes the chapter.

**Chapter-7** presents the design and implementation of an analog delay lock loop (DLL) using the novel CBL delay element. The objective of this design is to circumvent the impact of CMOS process induced variation on the resolution of the time stamping based TDC discussed in chapter-6. The mathematical analysis of key design blocks such as the CBL delay element and the bias circuit is also included. The performance validation of the DLL across various process corners and operating conditions concludes the chapter.

**Chapter-8** presents the performance of the developed 4-channel Vernier ASIC with the help of test results. It also presents the salient features of the ASIC along with the achieved specifications, which adequately match those required in the HEP INO experiment.

The work is finally summarized and discussed, along with the future scope, in **chapter-9**. The answers to the questions asked by the referees are discussed in Appendix-B. The publications produced as the outcome of this work, which include peer-reviewed research articles and conference proceedings, are listed under 'List of Publications Arising from Thesis'.

# Contents

## I Introduction

- 1 Study of Neutrino: INO Experiment
  - 1.1 The Neutrino
  - 1.2 Neutrino in standard model (SM)
  - 1.3 Neutrino mass oscillation experiments
  - 1.4 India based Neutrino Observatory (INO) experiment
    - 1.4.1 Iron calorimeter (ICAL) detector
    - 1.4.2 Need of time-to-digital converter (TDC) in INO

## II Literature Survey

- 2 Review of Time Interval Measurement Methods and Techniques
  - 2.1 Introduction and applications of TDC
    - 2.1.1 High energy physics (HEP) & nuclear experiments
    - 2.1.2 TOF laser range finder
    - 2.1.3 Ultrasonic imaging and thickness measurement of metal layers
    - 2.1.4 Ultrasonic density meter
    - 2.1.5 Positron emission tomography (PET) medical imaging
    - 2.1.6 Analog to digital converter (ADC)
    - 2.1.7 Frequency synthesis in RF communications systems
    - 2.1.8 In test & measurement instrumentation
  - 2.2 Performance parameters of TDC
    - 2.2.1 Resolution (LSB)
    - 2.2.2 Precision (RMS resolution)
    - 2.2.3 Accuracy
    - 2.2.4 Dynamic range
    - 2.2.5 Dead time
  - 2.3 Methods of time interval measurement
    - 2.3.1 Time interval measurement
    - 2.3.2 Time stamping or Tagging
  - 2.4 TDC implementation techniques
    - 2.4.1 Analog time interval measurement techniques
    - 2.4.2 Digital time interval measurement techniques
  - 2.5 Summary
- 3 TDC Design Aspects in CMOS Technology
  - 3.1 Introduction of CMOS Technology
    - 3.1.1 Structure of MOS device
  - 3.2 MOS operating regions and logic styles
  - 3.3 Impact of various parameters on speed of CMOS inverter
    - 3.3.1 Capacitive load effect on propagation delay
    - 3.3.2 Temperature effect on speed of MOSFET
    - 3.3.3 Threshold voltage effect on inverter delay
  - 3.4 Jitter sources in time interval measurement
    - 3.4.1 Types of noise in MOS transistor
    - 3.4.2 Substrate and power supply noise
  - 3.5 Interconnect noise
    - 3.5.1 Capacitive coupling
    - 3.5.2 Inductive coupling
  - 3.6 CMOS process variation
    - 3.6.1 Impact of process variation on performance of TDC
  - 3.7 Aspect of delay lock loop (DLL) in TDC
  - 3.8 Description of DLL design blocks
    - 3.8.1 Voltage controlled delay element
    - 3.8.2 Phase detector
    - 3.8.3 Charge pump and filter capacitor
  - 3.9 Full custom ASIC design flow

## III Research Work

- 4 CMOS Standard Cell based Vernier TDC
  - 4.1 Introduction
  - 4.2 Architecture of TDC ASIC
    - 4.2.1 Working of TDC channel
    - 4.2.2 Front-end design aspects
  - 4.3 Description of design blocks
    - 4.3.1 Ring oscillator
    - 4.3.2 Leading edge phase detector
    - 4.3.3 Calibration block
    - 4.3.4 SPI based read-out logic
  - 4.4 Layout design aspects
    - 4.4.1 Matching of oscillator channels
    - 4.4.2 Substrate coupling noise
    - 4.4.3 Supply rail routing
    - 4.4.4 Crosstalk
    - 4.4.5 Reduction of clock skew for coarse counter
  - 4.5 Simulation results
    - 4.5.1 Time period of oscillators and resolution across design process corners
    - 4.5.2 Calibration of time periods of reference start and stop oscillators
    - 4.5.3 Functional verification of Vernier TDC channel
    - 4.5.4 Frequency stability plot over number of cycles for slow and fast oscillators
    - 4.5.5 Channel-to-channel variation across process corners
    - 4.5.6 Output versus input time interval characteristic
  - 4.6 Summary
- 5 8-channel Multi-hit TDC using Vernier Technique
  - 5.1 Architecture of multi-hit TDC ASIC
    - 5.1.1 Normal mode (mode-0)
    - 5.1.2 Common start mode (mode-1)
    - 5.1.3 Common stop mode (mode-2)
    - 5.1.4 Calibration mode (mode-3)
  - 5.2 Vernier multi-hit TDC channel
  - 5.3 Stopping scheme of start oscillator in four operating modes
  - 5.4 Design of calibration block
    - 5.4.1 LSB calibration scheme
  - 5.5 TDC channel interface with memory controller and read-out interface
    - 5.5.1 Read-out logic interface
  - 5.6 Layout design aspects
  - 5.7 Simulation results
    - 5.7.1 Variation in time periods of ring oscillators and resolution of time interval measurement across five design process corners
    - 5.7.2 Impact of temperature over time period of clocks and LSB of time interval measurement
    - 5.7.3 Variation in resolution across stop (transition) channels due to local mismatch
    - 5.7.4 Time period and resolution (LSB) calibration
    - 5.7.5 Jitter analysis for ring oscillators
    - 5.7.6 Output versus input time interval characteristics
  - 5.8 Negative time interval measurement in Vernier technique: suitability of Vernier multi-hit TDC in INO experiment
  - 5.9 Simulation result
  - 5.10 Summary
- 6 Design of TDC using CBL delay element based tapped delay line
  - 6.1 Architecture of TDC ASIC
  - 6.2 Design aspects of time stamping
    - 6.2.1 Coarse time measurement
    - 6.2.2 Fine time measurement
  - 6.3 Timing critical paths
  - 6.4 TDL unit delay calibration
  - 6.5 Layout design aspects
    - 6.5.1 Matching aspect
    - 6.5.2 Skew reduction in the latch signal
  - 6.6 Simulation results
    - 6.6.1 Verification of post layout delay characteristic of CBL delay element across process corners
    - 6.6.2 Calculation of theoretical RMS error
    - 6.6.3 Test for resolution (LSB) adjustability
    - 6.6.4 Input versus output time interval measurement characteristics
  - 6.7 Summary
- 7 Design and Implementation of CBL Delay Element based DLL
  - 7.1 Introduction
  - 7.2 Architecture of DLL
    - 7.2.1 Description of design blocks
  - 7.3 Initialization and lock range
  - 7.4 Analytical equation for loop dynamics
  - 7.5 Floor plan and layout of DLL
  - 7.6 Performance evaluation
    - 7.6.1 Characterization across process corners
    - 7.6.2 Static phase error
    - 7.6.3 Lock time
  - 7.7 Summary
| 8 | | Characterization and Testing of Vernier TDC ASIC | 222 |
|                                | 8.1                                                    | Experimental results                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 222                                                                                                                                                                                                            |
| | | 8.1.1 Test for functionality of time interval measurement circuit and readout logic | 223 |
|                                |                                                        | 8.1.2 Test for resolution by calibrating $T_{oscst}$ and $T_{oscsp}$                                                                                                                                                                                                                                                                                                                                                                                                                        | 224                                                                                                                                                                                                            |
|                                |                                                        | 8.1.3 Test for precision                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 224                                                                                                                                                                                                            |
|                                |                                                        | 8.1.4 Test for linearity                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 226                                                                                                                                                                                                            |
|                                | 8.2                                                    | Highlights of standard cell based Vernier TDC ASIC                                                                                                                                                                                                                                                                                                                                                                                                                                          | 227                                                                                                                                                                                                            |
|                                | 8.3                                                    | Achieved specifications                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 227                                                                                                                                                                                                            |
|                                | 8.4                                                    | Summary                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 228                                                                                                                                                                                                            |
|                                |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                |
| <b>IV</b> | | <b>Conclusions</b> | <b>229</b> |
| 9 | | Summary and Future Scope | 230 |
| | 9.1 | The Work | 230 |
| | 9.2 | Future scope | 235 |
| <b>Appendices</b> | | | <b>237</b> |
| A | | | 238 |
| | A.1 | Neutrino flavor oscillation | 238 |
| | A.2 | Sources of neutrino particle | |
| | | A.2.1 Supernovae | |
| | | A.2.2 Solar neutrinos | |
| | | A.2.3 Geologically produced neutrinos | |
| | | A.2.4 Nuclear reactors | |
| | | A.2.5 Atmospheric neutrinos | |
| | | A.2.6 Conventional beams (particle accelerators) | |
| | | A.2.7 | |

| | A.3 | Structure and Physics of RPC | 244 |
|---|---|---|---|
| | | A.3.1 Operating modes of RPC | 245 |
| | | A.3.2 Parameters of RPC | 247 |
| | | A.3.3 Introduction to multi-gap RPC | 249 |
| | A.4 | Types of RPC | 250 |
| B | | | 251 |
| | B. | Answers to Questions Asked by the Referees (Dr. Jayanta Mukherjee, IIT Bombay, and Dr. Yasuo Aria, KEK, Japan) | 251 |
| Bibliography | | | 253 |

# **List of Figures**

| 1.1  | Particle spectrum in standard model                                   | 4  |
|------|-----------------------------------------------------------------------|----|
| 1.2  | Constitutional details of ICAL along with RPC. These figures are adopted from Ref. [46, 47, 49] | 15 |
| 1.3  | Schematic representation of use of TDC in INO experiment              | 18 |
| 2.1  | Timing diagram of single-hit and multi-hit event time stamping        | 27 |
| 2.2  | Classification of time interval measurement techniques                | 28 |
| 2.3  | Schematic diagram of (a) TAC method (b) 8-channel TAC based analog memory | 30 |
| 2.4  | Differential TAC (a) schematic diagram (b) timing diagram | 31 |
| 2.5  | Dual slope techniques using Wilkinson principle (a) with single capacitor and current source (b) with separate capacitors and current sources | 33 |
| 2.6  | Dual slope using clock synchronized discharging of capacitor          | 34 |
| 2.7  | Vernier charging method                                               | 34 |
| 2.8  | Timing diagram of TAC conversion chain                                | 36 |
| 2.9  | Time interval measurement technique using TAC with TDL                | 37 |
| 2.10 | (a) Timing diagram of direct counting method (b) circuit used in synchronized gating (c) time interval measurement using startable ring oscillator | 38 |
| 2.11 | Nutt's interpolation technique (a) timing diagram (b) block diagram | 40 |
| 2.12 | Use of multi-phases of ring oscillator with calibration in Nutt's interpolation technique | 41 |
| 2.13 | Timing diagram of reference clock asynchronous time interpolation technique | 42 |
| 2.14 | Time interval measurement using tapped delay line | 43 |
| 2.15 | Pseudo differential delay line (a) block diagram (b) timing diagram | 46 |
| 2.16 | Input interchangeable tapped delay line (a) block diagram (b) timing diagram | 47 |

| 2.17 | Time stamping using TDL (a) block diagram (b) timing diagram                   | 49 |
|------|--------------------------------------------------------------------------------|----|
| 2.18 | Schematic diagram of digital memory based time interval measurement | 51 |
| 2.19 | Differential delay line technique (a) block diagram including (LSB) stabilization using DLL (b) timing diagram | 54 |
| 2.20 | 2D differential delay line technique for TDC | 55 |
| 2.21 | Cyclic differential delay line implemented using buffers (a) schematic diagram (b) timing diagram | 56 |
| 2.22 | Cyclic differential delay line implemented using odd number of NAND gates | 58 |
| 2.23 | Parallel delay line with load capacitor scaling based delay element (a) schematic diagram (b) delay vs. scaling factor characteristic | 59 |
| 2.24 | Timing diagram for hierarchical technique                                      | 60 |
| 2.25 | Block diagram of (a) dual loop dual DLL (b) nested DLL scheme                  | 62 |
| 2.26 | Cyclic DLL based coarse interpolator (a) block diagram (b) timing diagram | 63 |
| 2.27 | Timing diagram of 2-D fine interpolator in three stage interpolation | 64 |
| 2.28 | Timing diagram of pulse shrinking delay line technique | 66 |
| 2.29 | Cyclic pulse shrinking delay line (a) block diagram (b) schematic diagram with temperature compensation | 67 |
| 2.30 | (a) Block diagram of ADLL based time interval measurement (b) edge representation of delayed clocks for 2-D matrix ($4 \times 5$) of delay elements in ADLL | 70 |
| 2.31 | Time stamping using DLL and RC delay line with sub-gate delay resolution | 71 |
| 2.32 | Block diagram of time amplifier using DLL                                      | 73 |
| 2.33 | Vernier technique (a) block diagram (b) timing diagram                         | 75 |
| 2.34 | (a) PFD based phase detector (b) PFD implemented using dynamic logic to reduce its resetting time | 76 |
| 2.35 | (a) Edge representation of start oscillator coincidence with reference clock in dual Vernier technique (b) timing diagram of dual Vernier technique | 77 |
| 2.36 | Time interval measurement using successive approximation technique | 79 |
| 2.37 | Chronological order of TDC techniques | 84 |
| 3.1  | CMOS technology with history and road-map source. This figure is adopted from Ref. [151] | 85 |

| 3.2  | (a) MOS as capacitor (b) NMOS driven by the gate and drain voltage (c) cross sectional view of NMOS device | 86 |
|------|------|-----|
| 3.3  | Operating regions of typical NMOS transistor with size $10\mu m/0.35\mu m$ | 88 |
| 3.4  | Simulated typical delay versus temperature characteristic of buffer in $0.35\mu m$ CMOS process | 91 |
| 3.5  | Noise in supply voltage due to transient current in CMOS logic. This figure is adopted from Ref. [178] | 96 |
| 3.6  | Average number of dopant atoms versus CMOS process nodes. This figure is adopted from Ref. [178] | 100 |
| 3.7  | Standard deviation in the delay for a chain of buffers due to local mismatch | 101 |
| 3.8  | Effect of global variations on the gate delay in $0.35\mu m$ CMOS process at 27 °C and 3.3 V using Monte Carlo simulator | 102 |
| 3.9  | Block diagram of DLL                                                       | 103 |
| 3.10 | Full swing delay elements (a) CSI with delay in rising edge transition (b) CSI with delay in both edge transitions (c) CSI realized by NMOS transistors with delay in both edge transitions (d) RC loaded inverter | 105 |
| 3.11 | Pre-layout delay versus control voltage characteristics of full swing delay elements | 105 |
| 3.12 | Differential delay elements (a) diode connected loads (b) linear loads     | 107 |
| 3.13 | PFD based phase detector (a) schematic diagram (b) timing diagram          | 108 |
| 3.14 | XOR logic based PD (a) schematic diagram (b) timing diagram. This          |     |
|      | figure is adopted from Ref.[158]                                           | 109 |
| 3.15 | Schematic diagram of true single phase clock based PD                      | 110 |
| 3.16 | Timing diagram of TSPC based PD (a) when delayed clock lags reference (b) when delayed clock leads reference clock | 110 |
| 3.17 | Three single ended charge pump configurations (a) drain switching          |     |
|      | (b) source switching (c) gate switching (d) suppression of skew in         |     |
|      | UP and DN signals by CMOS resistor                                         | 111 |
| 3.18 | Schematic diagram of differential charge pump                              | 112 |
| 4.1  | Block diagram of Vernier four channel TDC ASIC                             | 118 |
| 4.2  | Vernier ring oscillator method (a) block diagram (b) timing diagram        | 119 |
| 4.3  | Schematic representation of ring oscillators                               | 121 |
| 4.4  | Schematic diagram of leading edge phase detector                           | 122 |
| 4.5  | Timing diagram for time period calibration of oscillator clocks            | 122 |

| No.  | Caption |
|------|---------|
| 4.6  | Block diagram for time period calibration of oscillator clocks |
| 4.7  | Timing diagram of four modes of data transfer through SPI |
| 4.8  | Block diagram of SPI based read-out logic |
| 4.9  | Timing diagram of data transfer through SPI |
| 4.10 | Data format of 24-bit SPI register |
| 4.11 | Layout representation of two channels of TDC showing ring oscillators with phase detector |
| 4.12 | Layout representation of 4-channel Vernier TDC prototype ASIC |
| 4.13 | (a) Waveform representing phase coincidence of ring oscillators for time interval measurement (b) variation in time periods over number of cycles |
| 4.14 | Waveform representing the SPI output over MISO line corresponding to four TDC channels |
| 4.15 | Output versus input time interval characteristic |
| 5.1  | Block diagram of multi-hit TDC ASIC |
| 5.2  | Timing diagram of normal operating mode |
| 5.3  | Timing diagram in common start operating mode |
| 5.4  | Timing diagram in common stop operating mode |
| 5.5  | Block diagram of multi-hit TDC using Vernier technique |
| 5.6  | Timing diagram of multi-hit TDC using Vernier technique |
| 5.7  | Timing diagram of calibration window generation |
| 5.8  | Timing diagram representing first two consecutive phase coincidence |
| 5.9  | Schematic diagram of fine calibration window generator |
| 5.10 | Block diagram of FIFO based memory with interface and read-out logic |
| 5.11 | Timing diagram of data transfer from TDC channels to FIFO based memory |
| 5.12 | Snapshot of layout of TDC channel using first approach |
| 5.13 | Snapshot of layout of TDC ASIC |
| 5.14 | (a) Variation in time period of oscillator over temperature (b) LSB variation over temperature |
| 5.15 | LSB variation due to local mismatches over 73 runs of Monte-Carlo simulation |
| 5.16 | Period jitter in start oscillator clock (stclk) |
| 5.17 | Variation in (a) start oscillator clock time period (b) stop oscillator clock time period (c) LSB (difference in time periods), over number of oscillation cycles |
| 5.18 | (a) Applied pattern for multi-hit signal with respect to trigger (b) output versus input time interval over 40 ns range on typical corner |
| 5.19 | Simulated DNL and INL plots for linearity of occurrence time of transition-1 measurement |
| 5.20 | Output versus input time interval over 40 ns range on (a) WP corner (b) WS corner |
| 5.21 | Simulated DNL and INL plots for linearity of occurrence time of transition-1 measurement |
| 5.22 | Edge representation of spclk and stclk rising edge coincidence in negative time interval |
| 5.23 | Negative time interval measurement using Vernier technique |
| 5.24 | Applied test pattern for time interval (a) common start (b) common stop |
| 5.25 | Waveform representing the phase coincidence for transition-1 in common stop mode |
| 5.26 | Output versus input time intervals for applied test patterns in common start and stop mode |
| 6.1  | TDC ASIC (a) block diagram (b) timing diagram |
| 6.2  | Timing diagram for generation of read command to transfer the TDC data to memory |
| 6.3  | (a) Generation of syn_tran signal on safe clock edge to sample and latch the counter status (b) schematic diagram of dual edge synchronizer (c) timing diagram of dual edge synchronizer |
| 6.4  | First architecture of TDL: delayed clock edges sample the hit (a) schematic diagram (b) timing diagram |
| 6.5  | Second architecture of TDL: delayed clock edges sample the hit (a) schematic diagram (b) timing diagram |
| 6.6  | Repetition in logic '1' to '0' transition in fine register code for bin size greater than the value of $T_{ref}/N$ |
| 6.7  | Flow diagram of logic one-to-zero transition detector for 74 bits |
| 6.8  | Schematic diagram of buffer tree used to avoid loading |
| 6.9  | Buffer loading on TDL to avoid its delay variation among the taps |

| 6.10 | Block diagram of CBL delay element based DLL with start control circuit | 180 |
|------|----------------------------------------------------------------------------|-----|
| 6.11 | Layout diagram of CBL delay element (refer Fig.6.5(a) for schematic        |     |
|      | diagram)                                                                   | 182 |
| 6.12 | Zoomed view of layout of TDL showing adjacent placement of buffer          |     |
|      | and CBL delay element                                                      | 183 |
| 6.13 | Layout representation of TDL with 8-channels of 74-bit fine registers      | 183 |
| 6.14 | Scheme of buffer tree placement for 74-bit fine register                   | 183 |
| 6.15 | Layout diagram of 25-bit latch                                             | 184 |
| 6.16 | Delay versus control voltage characteristic of CBL delay element           | 184 |
| 6.17 | (a) Variation in delay on the taps of TDL due to process mismatch          |     |
|      | (b) accumulated non-linearity error                                        | 186 |
| 6.18 | Tapped delay variation due to systematic variation in layout draw-         |     |
|      | ing, noise coupling through parasitics and device component noise .        | 187 |
| 6.19 | (a) Standard deviation in delay for 1000 cycles of clock on the taps of    |     |
|      | TDL (b) standard deviation in accumulated delay along the length           |     |
|      | of TDL due to ripples in the control voltage provided by DLL               | 188 |
| 6.20 | (a) Applied test pattern (b) output versus input time intervals on         |     |
|      | typical corner                                                             | 189 |
| 6.21 | INL and DNL error on typical corner                                        | 190 |
| 6.22 | Plot between relative time interval versus applied time interval on        |     |
|      | (a) WP corner (b) WS corner                                                | 190 |
| 6.23 | INL and DNL error on (a) WP corner (b) WS corner                           | 191 |
| 6.24 | Plot between relative time interval versus applied time interval for       |     |
|      | transition 1 over 40 $\mu$ s range                                         | 191 |
| 7.1  | Block diagram of DLL                                                       | 195 |
| 7.2  | Current balanced logic circuit                                             | 196 |
| 7.3  | MCBL delay element (a) schematic diagram (b) timing diagram                | 197 |
| 7.4  | Schematic diagram of current starved inverter                              | 198 |
| 7.5  | $V_{OL}$ versus $V_{ctrl}$ (Control Voltage)                               | 201 |
| 7.6  | Delay versus control voltage characteristic for CBL delay element          | 203 |
| 7.7  | Schematic diagram of bias circuit                                          | 203 |
| 7.8  | Bias voltage $V_{bias}$ versus control voltage $(V_{ctrl})$ characteristic | 207 |
| 7.9  | Modified PFD (a) schematic diagram (b) timing diagram                      | 208 |
| 7.10 | DC characteristics of modified PFD                                         | 209 |
| 7.11 | Missing edge issue in DLL                                                  | 211 |

| 7.12 | Schematic diagram of start control circuit                                        | 211 |
|------|-----------------------------------------------------------------------------------|-----|
| 7.13 | (a) Control model of DLL (b) simplified control model                             | 213 |
| 7.14 | Snapshot of designed layout of DLL                                                | 215 |
| 7.15 | Plot of CBL delay versus control voltage on typical corner (27 °C and 3.3 V) | 216 |
| 7.16 | Delayed clocks on typical corner @ 100 MHz                                        | 216 |
| 7.17 | Control voltage profile of DLL (a) without bias circuit (b) with bias             |     |
|      | circuit                                                                           | 217 |
| 7.18 | Phase detector output signal before and after locking of DLL                      | 218 |
| 7.19 | Delay versus control voltage characteristic of CBL Delay element                  |     |
|      | across design process corners                                                     | 219 |
| 7.20 | Profile of control voltage across design process corners                          | 219 |
| 7.21 | DLL phase alignment on typical process @ 100 MHz after achieve-                   |     |
|      | ment of locking                                                                   | 220 |
| 8.1  | Test board for testing the TDC ASIC                                               | 222 |
| 8.2  | Test setup for testing the 4-channel TDC ASIC                                     | 223 |
| 8.3  | Slow and fast ring oscillator clocks and eoc signal on oscilloscope               | 223 |
| 8.4  | Serial output of SPI over MISO on each falling edge of micro-controller           |     |
|      | clock                                                                             | 224 |
| 8.5  | (a) Standard deviation characterizing the precision of TDC with FPGA              |     |
|      | based inputs (b) precision of TDC for inputs derived from cable (c)               |     |
|      | RMS resolution (2.6 ns) for $1m \times 1m$ RPC tested with Vernier TDC            |     |
|      | ASIC                                                                              | 225 |
| 8.6  | Precision of TDC over applied time intervals                                      | 226 |
| 8.7  | (a) Linearity plot of TDC (b) DNL plot for time interval (step size=5             |     |
|      | ns) from 0 to 1.25 $\mu$ s                                                        | 226 |
| A.1  | Neutrino sources with cross section versus energy. The figure is adopted from Ref.[55] | 240 |
| A.2  | Solar neutrino flux predicted by SSM. The figure is adopted from Ref.[56] | 241 |
| A.3  | Structure of single gap RPC. This Figure is adopted from Ref. [49]                | 244 |
| A.4  | Functionality of single gap RPC. This Figure is adopted from Ref. [48]            | 245 |
| A.5  | Efficiency versus high voltage characteristics of RPC. This figure is             |     |
|      | adopted from Ref.[47]                                                             | 249 |
| A.6  | Structure of multi-gap RPC with five gaps. This figure is adopted                 |     |
|      | from Ref.[49]                                                                     | 250 |

# List of Tables

| No. | Caption |
|-----|---------|
| 1.1 | Types of fundamental forces and interaction properties |
| 1.2 | Specifications of ICAL detector |
| 2.1 | Performance comparison among hierarchical architectures |
| 2.2 | Notation for position of delay elements in ADLL |
| 2.3 | Sequence of uniformly delayed clocks with respect to position (x) of delay element |
| 2.4 | Summary of TDC techniques |
| 4.1 | Time-periods of ring oscillator across process corners for four channels (Ch1, Ch2, Ch3, Ch4) |
| 4.2 | Counts of four channels of TDC across corners for applied time interval of 50 ns. OT stands for output time interval |
| 5.1 | Design specifications of Vernier multi-hit TDC |
| 5.2 | Description of operating modes |
| 5.3 | Variation of time period and LSB across process corners |
| 5.4 | Calibrated results for time period of start oscillator |
| 5.5 | Calibrated resolution (LSB\*) using stretching factor (S) and calculated resolution (LSB) for time interval measurement |
| 6.1 | Design specifications of multi-hit TDC |
| 6.2 | Tuning of CBL delay (resolution) using reference voltage |
| 7.1 | Process parameters and their values |
| 7.2 | Operating regions of MOS transistors in bias circuit |
| 7.3 | Achieved performance specifications of CBL delay lock loop |
| 8.1 | Achieved specifications of Vernier TDC ASIC |
| 9.1 | Performance comparison of Vernier TDC |
| 9.2 | Performance comparison of CBL delay element |
| 9.3 | Performance comparison (\*TH code: Thermometer code, \*\*CSI: Current Starved Inverter). \*\*\*Comparison has been made in terms of architecture of basic blocks and design specification using post layout (av-extracted) simulation results |
| A.1 | Nuclear reactions in sun producing neutrinos |
| A.2 | Sources of neutrino in atmosphere and accelerator experiments |
| A.3 | Comparison between operating regions of RPC |

# List of Abbreviations

- ADC . . . . . Analog To Digital Converter
- ADLL . . . . . Array of Delay Lock Loop
- AOI . . . . . AND OR Invert
- ASIC . . . . . Application Specific Integrated Circuit
- BNL . . . . . Brookhaven National Laboratory
- BiCMOS . . . . . Bipolar-CMOS
- BSIM . . . . . Berkeley Short-Channel IGFET Model
- CAD . . . . . Computer Aided Design
- CBL . . . . . Current Balanced Logic
- CC . . . . . Charged Current
- CDR . . . . . Clock and Data Recovery
- CDF . . . . . Collider Detector at Fermilab
- CERN . . . . . European Organization for Nuclear Research
- CHOOZ . . . . . A long baseline neutrino oscillation experiment in Chooz, France
- CIF . . . . . Caltech Intermediate Form
- CL . . . . . Confidence Limit
- CLCC . . . . . Ceramic Leadless Chip Carrier
- CML . . . . . Current Mode Logic
- CMOS . . . . . Complementary Metal Oxide Semiconductor
- CP . . . . . Charge Pump
- CPL . . . . . Complementary Pass Transistor Logic
- CSI . . . . . Current Starved Inverter
- CSL . . . . . Current Steering Logic
- DAQ . . . . . Data Acquisition
- DCO . . . . . Digitally Controlled Oscillator
- DCVSL . . . . . Differential Cascode Voltage Switch Logic
- DLL . . . . . Delay Lock Loop

- DIL . . . . . Dual-in-Line Package
- DNL . . . . . Differential Non Linearity
- DN . . . . . Down signal, Output of Phase Detector
- DR . . . . . Dynamic Range
- DRC . . . . . Design Rule Check
- ECL . . . . . Emitter Coupled Logic
- EOC . . . . . End Of Conversion
- ES . . . . . Elastic Scattering
- FF . . . . . Flip Flop
- FIFO . . . . . First-In-First-Out
- FPGA . . . . . Field Programmable Gate Array
- GALLEX . . . . . Gallium Experiment
- GDS . . . . . Graphic Database System
- HDL . . . . . Hardware Description Language
- HEP . . . . . High Energy Physics
- HMC . . . . . Hybrid Micro Circuit
- HP . . . . . Hewlett Packard
- HPTDC . . . . . High Performance Time-to-Digital Converter
- ICAL . . . . . Iron Calorimeter
- IC . . . . . Integrated Circuit
- INV . . . . . Inverter
- INO . . . . . India based Neutrino Observatory
- INL . . . . . Integral Non-linearity
- I/O . . . . . Input/Output
- Kamiokande . . . . . Kamioka Nucleon Decay Experiment
- KamLAND . . . . . Kamioka Liquid Scintillator Anti-Neutrino Detector
- KCL . . . . . Kirchhoff's Current Law
- KEK . . . . . High Energy Accelerator Research Organization
- K2K . . . . . KEK to Super-K

- LBL . . . . . Long Base Line
- LER . . . . . Line Edge Roughness
- LHC . . . . . Large Hadron Collider
- LIDAR . . . . . Light Detection and Ranging
- LOR . . . . . Line of Response
- LSB . . . . . Least Significant Bit
- LUT . . . . . Look-up Table
- LVS . . . . . Layout versus Schematic
- MC . . . . . Monte Carlo
- MCML . . . . . MOS Current Mode Logic
- MINOS . . . . . Main Injector Neutrino Oscillation Search
- MISO . . . . . Master in Slave out
- MOS . . . . . Metal Oxide Semiconductor
- MOSFET . . . . . Metal Oxide Semiconductor Field Effect Transistor
- MOSI . . . . . Master out Slave in
- MUX . . . . . Multiplexer
- NC . . . . . Neutral Current
- NMOS . . . . . N-channel MOSFET
- NuMI . . . . . Neutrino in Main Injector
- PCB . . . . . Printed Circuit Board
- PET . . . . . Positron Emission Tomography
- PD . . . . . Phase Detector
- PDK . . . . . Process Design Kit
- PFD . . . . . Phase Frequency Detector
- PGA . . . . . Pin Grid Array
- PLL . . . . . Phase Lock Loop
- PLCC . . . . . Plastic Leaded Chip Carrier
- P&R . . . . . Placement and Routing
- PMOS . . . . . P-channel MOSFET

- PMT . . . . . Photo Multiplier Tube
- PSDL . . . . . Pulse Shrinking Delay Line
- PVT . . . . . Process, Voltage, Temperature
- RAM . . . . . Random Access Memory
- R&D . . . . . Research and Development
- RENO . . . . . Reactor Experiment for Neutrino Oscillations
- RC . . . . . Resistive-Capacitive
- RF . . . . . Radio Frequency
- RPC . . . . . Resistive Plate Chamber
- RTL . . . . . Register Transfer Language
- RDF . . . . . Random Dopant Fluctuation
- RMS . . . . . Root Mean Square
- SAR ADC . . . . . Successive Approximation ADC
- SAFF . . . . . Sense Amplifier based Flip-Flop
- SAGE . . . . . Soviet American Gallium Experiment
- SCK . . . . . Master clock in SPI Protocol
- SDF . . . . . Standard Delay Format
- SLAC . . . . . Stanford Linear Accelerator Center
- SNO . . . . . Sudbury Neutrino Observatory
- SNP . . . . . Solar Neutrino Problem
- SPI . . . . . Serial Peripheral Interface
- SPICE . . . . . Simulation Program with Integrated Circuit Emphasis
- SM . . . . . Standard Model
- SSEL . . . . . Slave Select
- SSM . . . . . Standard Solar Model
- TA . . . . . Time Amplifier
- TAC . . . . . Time to Amplitude Converter
- TACC . . . . . Time to Amplitude Conversion Chain
- TDC . . . . . Time-to-Digital Converter

- TDL . . . . . Tapped Delay Line
- TMC . . . . . Time Memory Chip
- TOF . . . . . Time of Flight
- TOFMS . . . . . Time of Flight Mass Spectrometry
- TOT . . . . . Time Over Threshold
- TSPC . . . . . True Single Phase Clock
- TYP . . . . . Typical, a CMOS process design corner
- UP . . . . . UP signal, Output of Phase Detector
- VDL . . . . . Vernier Delay Line
- VCDL . . . . . Voltage Controlled Delay Line
- WO . . . . . Worst case One, a CMOS process design corner
- WP . . . . . Worst case Power, a CMOS process design corner
- WS . . . . . Worst case Slow, a CMOS process design corner
- WZ . . . . . Worst case Zero, a CMOS process design corner

# Part I

Introduction

# Chapter 1

# **Study of Neutrino: INO Experiment**

#### **1.1 The Neutrino**

Neutrino physics dates back to the early years of the twentieth century, when in 1914 James Chadwick demonstrated the continuous energy spectrum of nuclear beta ($\beta$) decay [1]. This perplexed the scientific community, as a continuous energy spectrum contradicted the discrete spectra of the other radioactive (alpha and gamma) decays and thereby appeared to violate the law of energy conservation. There were two possible ways to explain the observed continuous spectrum: (i) energy conservation does not hold in the nucleus, as speculated by Niels Bohr [2], or (ii) an unobserved neutral particle that carries the missing energy is emitted together with the beta particle. The second viewpoint was postulated by Pauli in December 1930 in a public letter [3], in a desperate attempt to save the law of energy conservation in the nuclear $\beta$-decay process. He called this unknown particle the 'neutron' and assumed it to be a constituent of the nucleus with a mass smaller than 0.01 of the proton mass. In 1932, James Chadwick discovered a new particle with a mass nearly equal to that of the proton and named it the 'neutron' [4]. However, because of its heavy mass, this particle did not match the particle predicted by Pauli.

The next fundamental contribution to the development of the neutrino idea was made by E. Fermi in 1934 [5]. Fermi built the first theory of the $\beta$-decay of nuclei [6, 7], based on Pauli's assumption that an unobserved neutral particle is emitted. To name the particle predicted by Pauli, Fermi coined the term 'neutrino' (from Italian, 'little neutral one') and assumed that an electron-neutrino pair is produced in $\beta$-decay by the conversion of a neutron into a proton.

Fermi's theory of nuclear beta decay was remarkably successful. However, experimental observation of the neutrino seemed impossible because of its weakly interacting nature, as calculated by Hans Bethe and Rudolf Peierls in 1934 [8]. According to their prediction, the neutrino interaction cross section with matter ($\approx 10^{-43}\,\mathrm{cm}^2$) is much smaller than typical electromagnetic cross sections ($10^{-27}\,\mathrm{cm}^2$). It was therefore concluded that there was practically no feasible way of observing neutrinos. However, in 1956, a pioneering experiment [9][10] headed by Frederick Reines and Clyde Cowan demonstrated the existence of the neutrino through the process of inverse $\beta$-decay, shown in equation (1.1). Here, an electron-type anti-neutrino created in a nuclear reactor is captured by a proton, giving rise to a positron and a neutron.

Inverse beta decay:
$$\bar{\nu}_e + {}_1H^1 \rightarrow {}_{+1}\beta^0 + {}_0n^1 \tag{1.1}$$
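To give a feel for the cross section quoted above, the following minimal sketch estimates the mean free path $\lambda = 1/(n\sigma)$ of a neutrino. It assumes, purely for illustration, a water target with roughly Avogadro's number of nucleons per gram and treats $10^{-43}\,\mathrm{cm}^2$ as a per-nucleon cross section; the function and constant names are hypothetical and not part of the thesis.

```python
# Rough mean free path of a neutrino in matter: lambda = 1 / (n * sigma).
# Assumptions (illustrative only): water target, ~N_A nucleons per gram,
# and the 1e-43 cm^2 figure treated as a per-nucleon cross section.

NUCLEONS_PER_GRAM = 6.022e23      # approximately Avogadro's number
SIGMA_NU_CM2 = 1e-43              # cross section quoted in the text, in cm^2
WATER_DENSITY_G_CM3 = 1.0         # assumed target density
CM_PER_LIGHT_YEAR = 9.461e17


def mean_free_path_cm(sigma_cm2: float, density_g_cm3: float) -> float:
    """Return the mean free path in cm for a given cross section and density."""
    number_density = density_g_cm3 * NUCLEONS_PER_GRAM  # nucleons per cm^3
    return 1.0 / (number_density * sigma_cm2)


path_cm = mean_free_path_cm(SIGMA_NU_CM2, WATER_DENSITY_G_CM3)
path_ly = path_cm / CM_PER_LIGHT_YEAR
print(f"mean free path: {path_cm:.2e} cm (about {path_ly:.0f} light years)")
```

Under these assumptions the mean free path comes out of order $10^{19}$ cm, i.e. light years of water, which illustrates why direct observation was long considered impractical.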

Beyond the electron-type neutrino $\nu_e$, the interesting question for physicists in the 1960s was whether the neutrinos produced in the decay of charged pions ($\pi^{\pm}$) are identical to those produced in $\beta$-decay. This problem was solved experimentally in 1962 by Leon Lederman et al., who were the first to detect interactions of the muon neutrino ($\nu_{\mu}$) [11]. They performed this experiment with accelerator neutrinos and studied interactions of the type $\nu_{\mu} + N \longrightarrow \mu^{-} + X$ or $\nu_{\mu} + N \longrightarrow e^{-} + X$. Only interactions of the first type were observed, demonstrating that electron and muon neutrinos are two different particles. Subsequently, in 2000, the first evidence of a third type of neutrino, the tau neutrino ($\nu_{\tau}$), was found by the DONUT collaboration [12]. These discoveries led to the interpretation that there exist three flavors of neutrinos in nature, called electron ($\nu_e$), muon ($\nu_{\mu}$) and tau ($\nu_{\tau}$), with their respective anti-particles. The LEP experiment [13, 14, 15] showed that there is no further generation of neutrinos (with mass less than about 45 GeV) apart from these three types. Since then, after decades of painstaking experimental and theoretical work, neutrinos have become an essential part of the quantum description of the fundamental particles and forces, namely the Standard Model (SM) of particle physics.



#### 1.2 Neutrino in standard model (SM)

Figure 1.1: Particle spectrum in standard model

The SM, a theoretical model developed in the 1970s [16, 17], describes the most fundamental particles of matter and their interactions with each other. These particles are defined as point-like, without internal structure or excited states. They are divided into fermions (half-integral spin) and bosons (integral spin), as shown in Fig. 1.1. The fermions are the building blocks of matter and are further subclassified into six leptons [{electron, electron neutrino}, {muon, muon neutrino}, {tau, tau neutrino}] and six quarks [{up, down}, {charm, strange}, {top, bottom}].

Of the six leptons, three (electron, muon and tau) are negatively charged. The other three are neutral and are known as neutrinos. The electron is stable and contributes to the formation of atoms. The muon and tau are more massive and less stable than the electron, and decay to lighter particles.

Quarks have a tendency to clump together to form colorless particles called hadrons. Hadrons are further sub-classified into mesons and baryons according to their quark content: a meson is a combination of one quark and one anti-quark, while a baryon is a combination of three quarks. The up (u) and down (d) quarks contribute to the formation of baryons such as the proton (uud) and the neutron (udd).

The gluons (8 in total), photon, Z and $W^{\pm}$ are the SM bosons. They are the mediators of the fundamental forces ('strong', 'weak' and 'electromagnetic') through which matter particles interact with each other. Each fundamental force has its own corresponding boson: the strong force is carried by the gluon, the electromagnetic force by the photon, and the weak force by the W and Z bosons. These forces act over different ranges and have different strengths, as given in Table 1.1. The weak and strong forces are effective only over a short range and dominate only at the level of subatomic particles.

The strong interaction is responsible for binding quarks to form baryons (proton and neutron) and mesons (pions and kaons). As this force is about 100 times stronger than the electromagnetic force exerted between two protons, it also helps to form the nucleus by binding the protons together with the neutrons.

The weak force is responsible for both radioactive decay and nuclear fusion of subatomic particles. Neutrinos interact with the medium through the weak nuclear force. The electromagnetic force is responsible for binding electrons in atoms. The charged leptons also interact with the medium electromagnetically.

The fundamental force of gravity remains beyond the grasp of the SM to date. The corresponding force carrier, if it exists, is named the graviton. Although all matter interacts through gravity, its effect is negligible in the microscopic domain because of its extremely low strength.

| Property           | Weak (electroweak)   | Electromagnetic (electroweak) | Strong          |
|--------------------|----------------------|-------------------------------|-----------------|
| Acts on            | Flavor               | Electric charge               | Color charge    |
| Affected particles | Quarks, leptons      | All charged particles         | Quarks, gluons  |
| Exchange particles | $W^+$, $W^-$, $Z$    | $\gamma$                      | Gluons          |
| Range              | $\approx 10^{-3}$ fm | Infinite                      | $\approx 1$ fm  |
| Relative strength  | $10^{-5}$            | $10^{-2}$                     | 1               |
| Example            | $\beta$ decay        | Atomic binding                | Nuclear binding |

Table 1.1: Types of fundamental forces and interaction properties

The fermions, as well as the W and Z gauge bosons, acquire mass through interactions with the Higgs field. The associated particle, referred to as the Higgs boson, is predicted to be a massive scalar particle. The existence of all the other particles has been proved experimentally, and on 4 July 2012 the results from the LHC experiments at CERN [18] strongly indicated the discovery of a Higgs-like boson.

The Standard Model predicts neutrinos to be massless, electrically neutral and to have a spin of $\hbar/2$, where $\hbar$ is Planck's constant divided by $2\pi$. Also, neutrinos and anti-neutrinos are considered to have left-handed and right-handed helicity respectively. The interactions of neutrinos with matter are described by weak interactions, which proceed through the exchange of bosons. The neutral current (NC) weak interaction involves the exchange of a Z boson, and the charged current (CC) interaction involves the exchange of a $W^{\pm}$ boson. Here, neutrinos emit $W^+$ and anti-neutrinos emit $W^-$. In the NC interaction, the neutrino transfers some of its energy and momentum to a target particle. If the target particle is charged and sufficiently light (e.g. an electron), it can be accelerated to a relativistic speed and consequently emit Cherenkov radiation as a signature of neutrino detection. If, on the other hand, the target is neutral, the detection signature is produced by the radiation emitted by secondary charged particles.

In the CC interaction, the neutrino transforms into its respective partner charged lepton (electron or muon or tau) with the characteristic change in target particle. However, if the neutrino does not have sufficient energy to create its heavier partner's mass, the CC interaction is not feasible. Solar, atmospheric, reactor and accelerator neutrinos have enough energy to undergo the CC interaction with the target.

The SM advocates lepton flavor conservation [19], in which a conserved number (electron number, muon number or tau number) is assigned to each lepton family. For instance, the electron and its corresponding neutrino have electron number +1, the positron and the corresponding anti-neutrino have electron number -1, and all other particles have electron number 0. Muon number and tau number apply analogously to the other two lepton families. The lepton number is conserved when a massive lepton decays into lighter ones, which implies that neutrinos cannot undergo flavor transformation. This leads to the rejection of neutrino mass (flavor) oscillation [Appendix A.1], in which the neutrino *'changes from one flavor to other while passing through the medium'*. This phenomenon was proposed by B. Pontecorvo [20, 21] in the late 1950s. It occurs if the neutrino has mass and its flavor eigenstates ($\nu_e$, $\nu_\mu$ and $\nu_\tau$), which participate in the weak interactions, are mixtures of mass eigenstates ($\nu_1$, $\nu_2$, $\nu_3$) with different masses ($m_1$, $m_2$, $m_3$). The mixing is parameterized by the mixing angles ($\theta_{12}$, $\theta_{23}$ and $\theta_{13}$), which define the extent of the difference between the flavor and mass eigenstates. For instance, for $\theta_{ij} = 0$ both sets of eigenstates are identical, which leads to no flavor oscillation, whereas $\theta_{ij} = \pi/4$ gives the maximum probability of flavor oscillation.

Further, for flavor oscillation to occur the masses of the eigenstates must differ, so at least one of them must be non-zero. The oscillation is therefore governed by the mass-squared difference $\Delta m_{ij}^2 = m_i^2 - m_j^2 = \delta_{ij}$ rather than by the absolute masses.
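The dependence on the mixing angle and on the mass-squared difference can be illustrated with the standard two-flavor vacuum oscillation formula, $P = \sin^2 2\theta \, \sin^2(1.267\, \delta\, L/E)$, with $\delta$ in eV$^2$, $L$ in km and $E$ in GeV. This expression is the usual textbook form and is not derived in this chapter; the energy of 1 GeV and the function name used below are illustrative assumptions. The sketch evaluates the probability for $\theta = 0$ and $\theta = \pi/4$ at the baseline of the first oscillation maximum.

```python
import math

def oscillation_probability(theta, delta_m2_ev2, length_km, energy_gev):
    """Two-flavor vacuum oscillation probability (standard textbook form)."""
    phase = 1.267 * delta_m2_ev2 * length_km / energy_gev
    return math.sin(2.0 * theta) ** 2 * math.sin(phase) ** 2

delta_m2 = 2.2e-3  # eV^2, atmospheric mass-squared difference quoted in Section 1.3
energy = 1.0       # GeV, illustrative assumption

# Baseline at which the oscillation phase reaches pi/2 (first maximum).
length = (math.pi / 2.0) * energy / (1.267 * delta_m2)

p_no_mixing = oscillation_probability(0.0, delta_m2, length, energy)
p_maximal = oscillation_probability(math.pi / 4.0, delta_m2, length, energy)

print(f"baseline of first maximum: {length:.0f} km")
print(f"theta = 0    -> P = {p_no_mixing:.3f}")
print(f"theta = pi/4 -> P = {p_maximal:.3f}")
```

With $\theta = 0$ the probability vanishes for any baseline, while $\theta = \pi/4$ gives complete flavor conversion at the first maximum. Setting `delta_m2` to zero also gives zero probability, which reflects the requirement of distinct masses.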

Neutrino mass and oscillation, supported by the experimental observations discussed in the following section, turned out to be a fundamental issue in neutrino physics and thereby paved the way to search for new physics beyond the Standard Model.

### **1.3** Neutrino mass oscillation experiments

A series of experiments aimed at studying neutrino oscillations has been conducted using various neutrino sources [A.2] with their corresponding energy ranges. The 1956 reactor neutrino experiment by Reines and Cowan found evidence for the electron anti-neutrino. Subsequently, in the 1960s, Raymond Davis first detected and counted solar neutrinos at an energy threshold of 5.8 MeV in the Homestake experiment [22]. This experiment used a radiochemical detection technique, which involves the separation of <sup>37</sup>*Ar* from $C_2Cl_4$ containing <sup>37</sup>*Cl*, to measure neutrinos produced by nuclear reactions and $\beta$ decays in the hot core of the Sun. A solar electron neutrino $\nu_e$ interacts with <sup>37</sup>*Cl* through the reaction given in equation (1.2) and produces a radioactive <sup>37</sup>*Ar* atom, which decays back to <sup>37</sup>*Cl* via electron capture with the emission of X-rays and Auger electrons. The <sup>37</sup>*Ar* atoms were extracted from the tank and counted using proportional counters, where 2.82 keV Auger electrons indicated the presence of <sup>37</sup>*Ar* atoms. The experiment observed only about a third of the neutrino flux predicted by the standard solar model (SSM) [23].

$$\nu_e + {}^{37}Cl \to e^- + {}^{37}Ar \tag{1.2}$$

To explain this shortfall in the number of $\nu_e$, in 1999, gallium radiochemical detectors with a lower energy threshold of 233 keV were deployed in the GALLEX experiment in Gran Sasso, Italy [24, 25] and the SAGE experiment in Baksan, Russia [26, 27]. However, the solar neutrino flux measured by these experiments was about half of that predicted by the SSM [28, 29]. This discrepancy in the count of solar neutrinos became known as the solar neutrino problem (SNP) [30]. In 2001, the SNP was first resolved in a definitive manner by the Sudbury Neutrino Observatory (SNO) in Canada. The aim of the experiment was to test the neutrino oscillation phenomenon, in which an electron neutrino $\nu_e$ produced in the solar interior changes into another flavor that is not measurable by the radiochemical detectors.

The SNO experiment used a 1000 ton heavy-water detector. Here, the target medium (heavy water) was sensitive to $\nu_e$ through the CC interaction and to all flavors through the flavor-independent NC interaction.

CC process:

$$\nu_e + d \rightarrow e^- + p + p \tag{1.3}$$

NC process:

$$\nu_{\alpha} + d \rightarrow p + n + \nu_{\alpha}, \quad \alpha = e, \mu, \tau \tag{1.4}$$

Elastic scattering (ES):

$$\nu_{\alpha} + e^{-} \rightarrow \nu_{\alpha} + e^{-}, \quad \alpha = e, \mu, \tau \tag{1.5}$$

The CC and ES processes were observed through the detection of the Cherenkov light produced by the electrons in heavy water. The NC process was observed by detecting the $\gamma$-rays emitted as a result of neutron capture. The NC reaction determines the total flux of all active neutrino flavors from the Sun, whereas the CC reaction provides the $\nu_e$ flux alone. The ES reaction offers an additional measurement of the neutrino interaction. The total neutrino flux thus obtained was in agreement with the SSM. This agreement was attributed to the flavor change of around 2/3 of the electron neutrino flux into other flavors during the journey from the Sun to the Earth. The best fit experimental results are $\delta_{21} = 7.9 \times 10^{-5}\ eV^2$ and $\sin^{2}\theta_{12} = 0.31$ [30].

An equally intriguing problem cropped up during the measurement of atmospheric neutrinos in the Super-Kamiokande (Super-K) experiment [31, 32, 33, 34, 35, 36]. It used a 50 kiloton ultra-pure water Cherenkov detector for neutrinos in a wide range of energies, from about 100 MeV to about 10 TeV. The neutrino undergoes a CC interaction with the hydrogen and oxygen nuclei to produce the corresponding charged lepton as per equation (1.6). This interaction was observed through the detection of the Cherenkov radiation in the form of rings on the surface of the PMTs. The shape of the rings differs between the two lepton types: muons, being much heavier than electrons, follow a straight track in water, whereas electrons scatter repeatedly and also produce electromagnetic showers, which results in a diffuse ring. This difference was used to discriminate the initial type of neutrino interaction.

$$\begin{aligned}
\nu_{e} + n &\rightarrow e^{-} + p \\
\bar{\nu}_{e} + p &\rightarrow e^{+} + n \\
\nu_{\mu} + n &\rightarrow \mu^{-} + p \\
\bar{\nu}_{\mu} + p &\rightarrow \mu^{+} + n
\end{aligned} \tag{1.6}$$

The mechanism of atmospheric neutrino production is illustrated in equations (1.7). At low energies (E < 1 GeV) most of the muons decay before reaching the Earth's surface, so the neutrino fluxes satisfy the ratio given in equation (1.8).

$$\begin{aligned}
\pi^{+} &\rightarrow \mu^{+} + \nu_{\mu} \\
\pi^{-} &\rightarrow \mu^{-} + \bar{\nu}_{\mu} \\
\mu^{+} &\rightarrow e^{+} + \nu_{e} + \bar{\nu}_{\mu} \\
\mu^{-} &\rightarrow e^{-} + \nu_{\mu} + \bar{\nu}_{e}
\end{aligned} \tag{1.7}$$

$$r = \frac{\phi_{\nu_{\mu}} + \phi_{\bar{\nu}_{\mu}}}{\phi_{\nu_{e}} + \phi_{\bar{\nu}_{e}}} = 2 \tag{1.8}$$

At higher energies, fewer muons decay into neutrinos before reaching the Earth's surface, so the flavor ratio 'r' increases. The experimental observation of 'r' is reported in terms of the double ratio 'R', defined as

$$R = \frac{(N_{\mu}/N_e)_{\text{experimental data}}}{(N_{\mu}/N_e)_{\text{Monte Carlo data}}} \tag{1.9}$$

where $N_{\mu}$ and $N_e$ are the numbers of muon and electron events respectively. The value of R obtained in the Super-K experiment, given by equation (1.10), was significantly less than unity. This indicated a deficit in the muon neutrino flux, while the electron neutrino flux matched the predicted value. This observation was named the atmospheric neutrino anomaly.

$$R = 0.69 \pm 0.06 \tag{1.10}$$

Further, the Super-K experiment observed a zenith angle dependence of the muon neutrino flux. The zenith angle '$\theta$' is defined such that neutrinos going vertically downward have $\theta = 0$ and neutrinos coming vertically upward through the Earth have $\theta = \pi$. At neutrino energies $E \ge 1$ GeV, the fluxes of muon and electron neutrinos are symmetric under the change $\theta \longrightarrow \pi - \theta$. Thus, if there are no neutrino oscillations in this energy region, both electron and muon events must satisfy the relation

$$N_l(\cos\theta) = N_l(-\cos\theta), \quad l = e, \mu \tag{1.11}$$

However, a large violation of this relation was established for high energy upward-going muon events, while the electron events satisfied equation (1.11). This observation is referred to as the *up-down asymmetry*.

Both of the aforesaid discrepancies were resolved by the *neutrino oscillation* phenomenon. The disappearance of muon neutrinos was explained by muon to tau flavor oscillation. Also, downward-going neutrinos produced in the atmosphere travel tens of kilometers to reach the detector, whereas upward-going neutrinos travel several thousand kilometers. Consequently, upward-going neutrinos have a greater probability of undergoing this flavor transition, which explains the up-down asymmetry. The possibility of $\nu_e \rightleftharpoons \nu_{\mu}$ oscillation was rejected because the electron neutrino flux matched the predicted value. The best fit experimental results are $|\delta_{32}| = 2.2 \times 10^{-3}\ eV^2$ and $\sin^{2}\theta_{23} = 0.5$, where the sign of $\delta_{32}$ is not known.
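The difference in path length between downward-going and upward-going neutrinos can be made concrete with a short geometric sketch. It assumes a detector at the Earth's surface, a mean Earth radius of 6371 km and a production altitude of 15 km; these values, the function names and the factor 1.267 of the standard two-flavor oscillation phase are assumptions introduced for illustration and are not taken from the analysis described above.

```python
import math

EARTH_RADIUS_KM = 6371.0      # mean Earth radius
PRODUCTION_HEIGHT_KM = 15.0   # assumed production altitude (illustrative)
DELTA_M2_EV2 = 2.2e-3         # |delta_32| best fit value quoted in the text

def path_length_km(zenith_rad):
    """Distance from the production point to a detector at the surface.

    Convention of the text: zenith angle 0 is vertically downward-going,
    pi is vertically upward-going through the Earth.
    """
    r, h = EARTH_RADIUS_KM, PRODUCTION_HEIGHT_KM
    return (math.sqrt((r + h) ** 2 - (r * math.sin(zenith_rad)) ** 2)
            - r * math.cos(zenith_rad))

def oscillation_phase(zenith_rad, energy_gev):
    """Two-flavor oscillation phase 1.267 * delta * L / E (standard form)."""
    return 1.267 * DELTA_M2_EV2 * path_length_km(zenith_rad) / energy_gev

for label, zenith in (("downward", 0.0),
                      ("horizontal", math.pi / 2.0),
                      ("upward", math.pi)):
    print(f"{label:10s} L = {path_length_km(zenith):8.0f} km, "
          f"phase at 1 GeV = {oscillation_phase(zenith, 1.0):7.3f} rad")
```

Under these assumptions the phase of a downward-going 1 GeV neutrino stays far below $\pi/2$, so it has hardly begun to oscillate, whereas an upward-going neutrino accumulates many oscillation periods along the Earth's diameter.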

In 2002, neutrino oscillation was also observed for reactor neutrinos by the KamLAND experiment [37, 38]. It deployed a 1000 ton liquid scintillator detector located in the Kamioka mine, Japan. Electron anti-neutrinos produced by 55 nuclear reactors were detected using the inverse $\beta$-decay reaction. The signature of electron anti-neutrino capture was provided by the delayed coincidence between the prompt $\gamma$-rays produced by electron-positron annihilation and the delayed $\gamma$-rays produced by thermal neutron capture. The distortion of the observed electron anti-neutrino spectrum, as well as the deficit in its flux, indicated neutrino oscillation.

Subsequently, the field of neutrino experiments moved towards long baseline (LBL) experiments for more precise measurement of the mixing angles. Such an experiment deploys two detectors, one located near the neutrino source and the other at some distance 'L' away from it, so that the observations of both detectors can be compared to analyze the effect of the medium on neutrino oscillation and the mixing angles. Man-made, accelerator-based neutrinos serve as a suitable source for LBL experiments. In 1999, K2K [39] in Japan became the first LBL experiment, and it confirmed the oscillation phenomenon in accelerator neutrinos. A muon neutrino beam with a mean energy of 1.3 GeV, produced by the KEK PS accelerator, was detected using a near and a far detector. The near detector system, located at a distance of 300 meters from the pion production target, consisted of two detectors: a 1 kton water Cherenkov detector and a fine-grained detector. If there is no neutrino flavor oscillation, the neutrino energy spectrum measured in the near detector is predicted to be the same at the far detector. The Super-K detector, located at a distance of 250 km, served as the far detector. Disappearance of $\nu_{\mu}$ was identified by the deficiency in muon flux and the distortion of the neutrino spectrum at the far detector, with experimental best fit results $|\delta_{32}| = 2.8 \times 10^{-3}\ eV^2$ and $\sin^{2}2\theta_{23} = 1$. Thus, K2K reconfirmed the observed atmospheric neutrino oscillation, leaving no other explanation for the atmospheric neutrino anomaly, and also rejected the possibility of $\nu_e \rightleftharpoons \nu_{\mu}$ oscillation.

In 2005, the Main Injector Neutrino Oscillation Search (MINOS) [40] LBL experiment observed oscillations of muon neutrinos in the energy range of 1-5 GeV, produced by the NuMI (Neutrinos at the Main Injector) beamline facility at Fermilab near Chicago. In this experiment, the near detector is located a few hundred meters from the beam source and the far detector at a distance of 735 km. Both near and far detectors are steel-scintillator sampling calorimeters built from alternating planes of magnetized steel and plastic scintillator. The MINOS results provide a more precise measurement of the atmospheric oscillation parameters, $|\delta_{32}| = 2.23^{+0.12}_{-0.08} \times 10^{-3}\ eV^2$ and $\sin^{2}2\theta_{23} > 0.90$ at 90% CL [41].

LBL experiments were also performed on reactor electron anti-neutrinos to test for $\nu_e \rightleftharpoons \nu_{\mu}$ oscillation during their journey through the atmosphere. In 1997, the CHOOZ experiment [42] in France was the first experiment of this kind. Here, a 5 ton Gd-loaded liquid scintillator detector was located at a distance of about 1 km from each of the two reactors of the CHOOZ power station (8.5 GWth). The detector had a rock overburden of 300 m water equivalent, which reduced the cosmic muon flux. The anti-neutrinos were detected through the observation of the reaction given in equation (1.12):

$$\bar{\nu}_e + p \to e^+ + n. \tag{1.12}$$

The CHOOZ experiment did not observe any significant evidence for the disappearance of electron anti-neutrinos, which rejected the $\nu_e \rightleftharpoons \nu_{\mu}$ solution of the atmospheric neutrino anomaly and provided an upper limit on $\theta_{13}$ of $\sin^{2}2\theta_{13} < 0.17$. Further, in 2012, three reactor neutrino experiments, namely Double Chooz [43], Daya Bay [44] and RENO [45], with the same detection principle as KamLAND and CHOOZ, obtained the improved results on $\theta_{13}$ given in equation (1.13).

$$\begin{aligned}
\sin^{2}2\theta_{13} &= 0.086 \pm 0.041 \pm 0.030 && \text{(Double Chooz)} \\
&= 0.092 \pm 0.016 \pm 0.005 && \text{(Daya Bay)} \\
&= 0.092 \pm 0.016 \pm 0.005 && \text{(RENO)}
\end{aligned} \tag{1.13}$$

The main results obtained from these neutrino experiments are summarized as follows. Reactor neutrino data show that there is only a small admixture of the mass eigenstate $\nu_3$ in the flavor eigenstate $\nu_e$. This corresponds to a small value of the mixing angle $\theta_{13}$ used in the parameterization of the neutrino mixing matrix, and justifies a 2-flavor mixing approximation (involving $\nu_1$ and $\nu_2$) for solar neutrinos. Hence, only two parameters, $\delta_{21}$ and $\theta_{12}$, appear in the analysis of the solar and KamLAND reactor experiments. For the same reason, only one dominant mass-squared difference $\delta_{32}$ and one mixing angle $\theta_{23}$ appear in the atmospheric neutrino experiments. Thus, these experiments established neutrino mass and oscillation, which lie beyond the SM.

In spite of these remarkable results, there are several outstanding issues of fundamental importance, which are discussed below-

- Massive neutrinos are Dirac or Majorana particles: If massive neutrinos are Dirac particles, they can be distinguished from their antiparticles. By definition, a Majorana neutrino, hypothesized by Ettore Majorana in 1933, is identical to its antiparticle. The Majorana nature of massive neutrinos can be identified by the observation of neutrino-less double beta decay, where the neutrino produced by the first beta decay would serve as the anti-neutrino in an inverse beta decay.
- How many neutrino species are there? Do sterile neutrinos exist? The number of active species with masses less than half the mass of the Z boson is limited to three by the LEP experiments. However, recent results suggest the existence of at least one more relatively light species, which has to be sterile (i.e. it cannot experience weak interactions), since there cannot be more than three light active species.
- What is the precise value of $|\Delta m_{31}^2|$? What is the sign of $\Delta m_{31}^2$, or the character of the neutrino mass hierarchy? The sign of $\Delta m_{31}^2$, i.e. the neutrino mass hierarchy (ordering of neutrino mass states), is still not known. Determination of the mass hierarchy is of prime importance because it dictates the structure of the neutrino mass matrix, and hence could give vital clues towards the underlying theory of neutrino masses and mixing. If sign$(\Delta m_{31}^2) > 0$, it is known as the normal hierarchy $(m_1 < m_2 < m_3)$. If sign$(\Delta m_{31}^2) < 0$, it is known as the inverted hierarchy $(m_3 < m_1 < m_2)$.

- **Tiny neutrino mass puzzle:** The fact that the masses of neutrinos are considerably smaller than the masses of charged leptons or quarks is a big puzzle in particle physics.
- Absolute scale of neutrino masses: Direct mass measurements from beta decay place an upper limit of about 2.2 eV. Model dependent interpretation of the astrophysical results limits the sum of the neutrino masses to around 0.5 eV. Neutrino-less double beta decay, if observed, will also determine the mass scale.
- Why the mixing angles  $\theta_{12}$  and  $\theta_{23}$  are so large while  $\theta_{13}$  is relatively small: The bilarge neutrino mixing pattern is also a mystery to theorists because it is anomalously different from the familiar tri-small quark mixing pattern.
- Is there leptonic CP violation: A necessary condition for the existence of CP violation in normal neutrino oscillations is  $\theta_{13} \neq 0$ . As CP violation has been discovered in the quark sector, there is no reason why CP should be conserved in the lepton sector.
- What is the importance of neutrino physics in our understanding of dark matter and dark energy? There are two astronomical theories about the ingredients of dark matter (the unseen mass of the universe): baryonic matter and non-baryonic matter. Baryonic matter is composed of protons, neutrons, and electrons but fails to be detected. Non-baryonic dark matter is hypothesized to be composed of particles that were created during the early, hot phase of the universe and survive in the present epoch. Some of the plausible candidates for non-baryonic dark matter are axions and neutrinos. As neutrinos have been found to possess a tiny but finite mass, their further study may be the key to the mystery of dark matter.
- **Importance of neutrino to solve the matter-antimatter asymmetry:** The explanation for the survival of some matter may lie in the Majorana particle nature of neutrinos.

The missing links discussed above have paved the way for several near-future neutrino experiments worldwide, aimed at precise measurement of the mixing angle $\theta_{13}$ and of CP violation in the leptonic sector. With a view to re-entering this area, an initiative has been taken to set up an underground neutrino observatory in India. This is the second indigenous effort after the KGF experiment on atmospheric neutrinos, performed in 1965.

## 1.4 India based Neutrino Observatory (INO) experiment

The INO [46, 47] experiment is an India-based multi-institutional venture to build an underground laboratory with a rock overburden of at least 1 km in all directions. It aims to study the parameters related to atmospheric neutrino oscillations and the matter effects on them. The rock overburden will shield the detector from background radiation (from cosmic rays and natural radioactivity), which interacts with the detector much more readily than neutrinos do, and thereby improves the sensitivity of the neutrino measurements. The laboratory will also be able to host other experiments, such as neutrinoless double beta decay and dark matter searches, which require a low cosmic ray background environment. The proposed detector for the neutrino observatory is ICAL (iron calorimeter detector), which will be located at Theni, at a distance of 110 km from the city of Madurai in South India.

The main objectives of ICAL experiment are-

- To reconfirm the occurrence of oscillations in atmospheric muon neutrinos through the explicit observation of first oscillation swing in muon neutrino disappearance as a function of L/E.
- To obtain a significantly improved measurement of the oscillation parameters with respect to the earlier measurements.
- To obtain unambiguous evidence for the matter effects in neutrino oscillation.
- To determine the sign of the mass-squared difference $\delta_{32}$ between the 2<sup>nd</sup> and 3<sup>rd</sup> mass eigenstates using the matter effect.
- To determine whether the mixing angle $\theta_{23}$ is maximal and, if not, in which octant it lies.
- To determine whether sterile neutrino exists.

#### 1.4.1 Iron calorimeter (ICAL) detector

The magnetized ICAL detector is a static device weighing 50 kilotons, without moving parts. It has a modular structure with overall dimensions of $48\,m \times 16\,m \times 14.5\,m$, as shown in Fig. 1.2(a). It consists of a stack of 151 horizontal layers of 56 mm thick magnetized iron plates, interleaved with 40 mm gaps that house the RPCs (resistive plate chambers) [48, 49, 50] (A.3). The RPC is a type of gas detector in which two electrodes made of insulating material are mounted a small gap apart, as shown in Fig. 1.2(b). The gap is filled with a mixture of gases. In the INO experiment, the RPC is used as a tracking device and provides the active detection medium for neutrino-induced muons. A magnetic field of 1.3 Tesla is applied to the iron to discriminate between the oppositely charged muons ($\mu^-$ and $\mu^+$) produced by neutrino and anti-neutrino interactions in the detector. This exploits the opposite bending directions of the muons as they traverse the layers of the ICAL detector.



Figure 1.2: Constructional details of ICAL along with RPC. These figures are adapted from Ref. [46, 47, 49]

The ICAL detector is further subdivided into three modules, each of size $16\,m \times 16\,m \times 14.5\,m$. This modular structure allows early operation with the completed modules while the others are under construction. The iron structure of this detector is self-supporting, with each layer resting on the layer immediately below using steel spacers located every 2 meters along the X-direction. This will create 2 m wide roads along the Y-direction for the insertion of RPC trays. There will be a total of 8 roads per layer per module. In each road, there will be 8 RPC units, each with dimensions of $2\,m \times 2\,m$. The readout of the RPCs will be performed by external orthogonal pick-up strips with a pitch of 30 mm (center-to-center distance of the pick-up strips). Each RPC will have 64 read-out strips along the X-direction and 64 strips along the Y-direction, arranged orthogonally as shown in Fig. 1.2(c). These strips, as shown in Fig. 1.2(d), behave like transmission lines with a typical characteristic impedance of about 50 $\Omega$.

A total of approximately 28,800 such RPC units (64 per layer per module, in 150 layers and 3 modules) will be needed to complete the detector. The specifications of the ICAL detector are given in Table 1.2.

| Parameter                        | Value                             |
|----------------------------------|-----------------------------------|
| Number of modules                | 3                                 |
| One module dimension             | $16 \times 16 \times 14.5\ m^3$   |
| Complete detector dimension      | $48 \times 16 \times 14.5\ m^3$   |
| Number of iron layers            | 151                               |
| Iron plate thickness             | 56 mm                             |
| Gap for RPC tray                 | 40 mm                             |
| Magnetic field                   | 1.3 Tesla                         |
| RPC layers                       | 150                               |
| RPC units/layer/module           | 64                                |
| RPC units/module                 | 9600                              |
| Total RPC units                  | 28,800                            |
| Total read-out strip channels    | $> 3.6 \times 10^{6}$             |
| Read-out strip pitch             | 30 mm                             |

Table 1.2: Specifications of ICAL detector
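The derived entries of Table 1.2 follow from the geometry described in the text. The short sketch below recomputes them as a consistency check; the variable names are introduced here for illustration only, and the input values are those stated in this section.

```python
# Geometry values taken from the text and Table 1.2.
modules = 3
roads_per_layer_per_module = 8
rpcs_per_road = 8
rpc_layers = 150
iron_layers = 151
iron_thickness_mm = 56
rpc_gap_mm = 40
strips_per_rpc = 64 + 64  # X-direction and Y-direction strips

rpcs_per_layer_per_module = roads_per_layer_per_module * rpcs_per_road
rpcs_per_module = rpcs_per_layer_per_module * rpc_layers
total_rpcs = rpcs_per_module * modules
total_channels = total_rpcs * strips_per_rpc

layer_pitch_mm = iron_thickness_mm + rpc_gap_mm
stack_height_mm = iron_layers * iron_thickness_mm + rpc_layers * rpc_gap_mm

print(f"RPC units/layer/module : {rpcs_per_layer_per_module}")
print(f"RPC units/module       : {rpcs_per_module}")
print(f"Total RPC units        : {total_rpcs}")
print(f"Total read-out channels: {total_channels}")
print(f"Layer pitch            : {layer_pitch_mm} mm")
print(f"Stack height           : {stack_height_mm / 1000:.3f} m")
```

The channel count of about 3.69 million is consistent with the table entry of more than $3.6 \times 10^{6}$, and the stack height of about 14.46 m agrees with the quoted module height of 14.5 m.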

#### 1.4.2 Need of time-to-digital converter (TDC) in INO

In the ICAL detector, the magnetized iron plates serve as the target volume for neutrino interactions and the RPCs as path trackers for the resulting charged particles. The CC interaction of a neutrino with the iron leads to the production of a muon and a hadronic shower, as per the reaction given in equation (1.14). A muon typically produces a long track inside the detector, traversing many layers, while hadrons give rise to a shower of secondary particles confined within a few layers. In the case of NC interactions, only hadrons are produced, which can give rise to muons after hadronic decay.

$$\nu_{\mu} + n \to \mu^{-} + p \tag{1.14}$$

In CC neutrino events, the energy of the neutrino can be estimated from the momentum of the tracked muon and the hit distribution of the hadron shower. The momentum is inferred from the curvature of the muon track in the magnetic field. At high momentum the curvature is small, resulting in a long track, whereas at low momentum the curvature is large, resulting in a short track. The directionality of the track is also used to discriminate between upward-going and downward-going neutrinos as the particle traverses the layers of the ICAL detector. This direction discrimination is planned to be carried out with the help of a time-to-digital converter (TDC), as shown in Fig. 1.3. Here, the produced muons interact with the gaseous medium of the RPC, which has 64 pickup strips [$s_0$ to $s_{64}$] in the X-direction. The interaction is localized inside the RPC and produces an electrical signal on the corresponding pickup strip. This electrical pulse is applied to the ASIC-based front-end electronics, which comprise a voltage amplifier followed by a discriminator and one-fold (1F) OR logic. The front end produces a logical pulse whose transition time corresponds to the time at which the amplified signal crosses the threshold set in the discriminator. The one-fold OR logic groups eight logical signals into one and thereby produces eight 1F signals [start-1 to start-8]. The time interval between the rising edge of 'start' and the 'global trigger', which acts as 'stop', is measured by the TDC. The 'global trigger' is obtained from the trigger module [51], which defines the validity of neutrino events based on user-defined criteria. Moreover, the 'global trigger' signal enables the data acquisition system to record the coordinates of the fired RPC pickup strips by registering the presence of logical pulses in pattern registers. The timing information provided by the TDC, along with the pattern register code, is used to build the track information of the muon. On the other hand, for NC neutrino events, the hit distribution of the hadrons provided by the pattern register code is used to estimate the neutrino energy, but there is no information about the direction of the neutrino.

Figure 1.3: Schematic representation of use of TDC in INO experiment
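The principle of up-down discrimination from timing can be illustrated with a minimal sketch. It assumes that each hit has already been converted into an arrival time in ns relative to a common reference, that layers are indexed from the bottom of the stack, and that the muon travels at the speed of light across a layer pitch of 96 mm. The function names and the straight-line fit are assumptions made for illustration and do not represent the INO reconstruction software.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

def track_direction(layer_indices, hit_times_ns):
    """Return 'up' if hit time grows with layer index (0 = bottom), else 'down'."""
    return "up" if fit_slope(layer_indices, hit_times_ns) > 0 else "down"

LAYER_PITCH_M = 0.096        # iron plate plus RPC gap
C_M_PER_NS = 0.299792458     # speed of light in m/ns
dt_ns = LAYER_PITCH_M / C_M_PER_NS

layers = list(range(10))
# Hypothetical hit times for a muon crossing ten consecutive layers.
upward_times = [5.0 + i * dt_ns for i in layers]
downward_times = [5.0 + (9 - i) * dt_ns for i in layers]

print(track_direction(layers, upward_times))    # up
print(track_direction(layers, downward_times))  # down
```

Since the time difference between adjacent layers is only about 0.32 ns, the sign of the fitted slope is reliable only when the timing resolution is good enough over the number of layers crossed, which motivates the TDC requirements listed below.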

Thus, for CC interaction events, the main objective of TDC development is to fulfill the following R&D requirements of the INO experiment:

- The time of flight (TOF) of the muon from one ICAL detector layer to the next, separated by ~ 96 mm, needs to be measured. A muon produced by a neutrino at the energy threshold of 1 GeV traverses around 10-12 layers of ICAL at nearly the speed of light. This gives a TOF of ~ 0.32 ns from one layer to the next and ~ 3.2 ns across 10 consecutive layers. This requires a TDC with an RMS resolution better than 2 ns and an LSB of 200 ps.
- A TDC with a dynamic range (DR) greater than 32 μs is required to account for the trigger latency (the time elapsed in the generation and processing of the global trigger by the trigger module). In addition, a large DR is required to capture delayed event interactions. The delayed events arise from the decay of the produced muons to neutrinos (or anti-neutrinos) within the muon lifetime of ~ 2.2 μs.
- Measurement of time over threshold (TOT) [52] is needed to implement off-line time-walk error correction. This requires a multi-hit TDC capable of measuring '*start*' pulse widths as short as 5 ns with a pulse pair resolution of 10 ns. In addition, a multi-hit TDC is required to measure the occurrence time of delayed interaction events.
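The numbers in the requirements above can be checked with a few lines of arithmetic. The sketch below recomputes the layer-to-layer TOF and estimates the counter width implied by a 32 μs dynamic range at a 200 ps LSB. The assumption of a simple binary code covering the full range is made here for illustration and is not a statement about the TDC architecture developed in this work.

```python
C_M_PER_S = 299_792_458.0

layer_pitch_m = 0.096   # 56 mm iron plate + 40 mm RPC gap
lsb_ps = 200            # required LSB
dynamic_range_ps = 32_000_000   # 32 microseconds
muon_lifetime_ps = 2_200_000    # ~2.2 microseconds

tof_per_layer_ns = layer_pitch_m / C_M_PER_S * 1e9
tof_ten_layers_ns = 10 * tof_per_layer_ns

# Number of distinct codes and the bits needed to represent them,
# assuming a plain binary output code (illustrative assumption).
codes = dynamic_range_ps // lsb_ps
bits = (codes - 1).bit_length()
lifetime_in_lsb = muon_lifetime_ps // lsb_ps

print(f"TOF per layer        : {tof_per_layer_ns:.3f} ns")
print(f"TOF over 10 layers   : {tof_ten_layers_ns:.2f} ns")
print(f"Codes for 32 us range: {codes}")
print(f"Bits required        : {bits}")
print(f"Muon lifetime in LSBs: {lifetime_in_lsb}")
```

Under this assumption the 32 μs range corresponds to 160,000 codes, i.e. an 18-bit output word, while a single layer-to-layer TOF spans less than two LSBs.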

Further, to serve the ~ 3.6 million signal read-out (pickup) channels in the ICAL detector, a power- and area-efficient multi-channel TDC is needed. Therefore, FPGA and CMOS ASIC based solutions for the development of the TDC are considered. The choice between the two depends on various technological aspects that affect the performance of the TDC.

The FPGA implementation offers a fast development cycle and reconfigurability. However, several issues, such as high power consumption, unpredictable placement and routing (P&R) delays and lack of control over delay variation, are bottlenecks in achieving a high performance TDC. In addition, for the multi-channel TDC required in the INO experiment, the FPGA imposes the constraint of non-identical placement and routing of channels, which leads to a channel-to-channel spread in resolution. Moreover, given the bulk requirement, an FPGA does not provide a cost-effective solution.

In contrast, CMOS ASIC based solutions are preferred to fulfill the bulk requirement of multi-channel TDCs with power and area efficiency. Further, an ASIC provides design flexibility to achieve high TDC performance. Therefore, in this work CMOS ASIC technology is chosen to develop the TDC prototypes, which can fulfill the above stated R&D objectives of the INO experiment.

## Part II

**Literature Survey** 

## Chapter 2

# Review of Time Interval Measurement Methods and Techniques

## 2.1 Introduction and applications of TDC

The precise measurement of the time interval between two events is required in many applications, such as medical, consumer, industrial and research instrumentation. To cater to the different requirements of these applications, various time-to-digital converter (TDC) techniques and methods have been developed and reported since the early 1960s. This chapter reviews the time interval measurement methods and techniques used in the design of TDCs, along with the key design aspects, strengths and weaknesses of each technique. A timeline flow diagram showing the evolution of the various time interval measurement techniques is also given. A brief description of TDC applications is given below.

#### 2.1.1 High energy physics (HEP) & nuclear experiments

In HEP and nuclear experiments, a TDC is required to measure time of flight (TOF) of particles [57]-[65]. This timing information is used for particle track reconstruction and particle identification. The time interval measurement is also used in time of flight mass spectrometry (TOFMS) to determine mass-to-charge (m/e) ratio of charged particles. In these measurements, the particles are equally accelerated but they gain different velocities due to different m/e ratio. A TDC is therefore required to measure the difference in arrival times of various particles.

#### 2.1.2 TOF laser range finder

The TDC is a key building block of TOF-based laser range finder instruments [66, 67]. Laser range finding has a variety of applications, such as profiling of hot surfaces [68], industrial surface analysis [69], traffic condition monitoring [70], height measurement [71], and 3-D imaging. Here, the range is determined from the time interval between the transmitted and received pulses using the expression '$\frac{1}{2} \times c \times \Delta T$', where 'c' is the velocity of light and '$\Delta T$' is the time interval. In these applications, a TDC with high resolution (of the order of a few ps) and wide dynamic range (~ 100's of ns) is required in order to achieve 'mm' level accuracy over a measurement range of several 100's of meters.
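The relation between range and time interval can be illustrated numerically. The sketch below applies the expression $\frac{1}{2} \times c \times \Delta T$ and its inverse; the function names and the example distances are chosen here for illustration.

```python
C_M_PER_S = 299_792_458.0

def range_from_interval(delta_t_s):
    """Target distance for a measured round-trip time interval."""
    return 0.5 * C_M_PER_S * delta_t_s

def interval_from_range(distance_m):
    """Round-trip time interval for a given target distance."""
    return 2.0 * distance_m / C_M_PER_S

# Time step corresponding to a 1 mm change in range.
step_ps = interval_from_range(1e-3) * 1e12
# Round-trip time for a target at 100 m.
span_ns = interval_from_range(100.0) * 1e9

print(f"1 mm of range  -> {step_ps:.2f} ps")
print(f"100 m of range -> {span_ns:.0f} ns")
```

A range step of 1 mm corresponds to about 6.7 ps, which is why resolution of the order of a few ps is needed, while a target at 100 m already requires a dynamic range of several hundred ns.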

## 2.1.3 Ultrasonic imaging and thickness measurement of metal layers

A TDC is also used in ultrasonic imaging and thickness measurement [72] of metal layers, pipe walls, and synthetic foils. Here, the ultrasonic wave is reflected from both the top and bottom surfaces of a wall of thickness 'd'. The time interval between the two reflected waves is '$2d/v$', where 'v' is the velocity of the wave in that material. In such applications, a TDC featuring moderate resolution (~ 300 ps) with large dynamic range is used, as a thickness of the order of centimeters corresponds to a time interval of the order of 10's of $\mu$s.
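As a worked example of the expression $2d/v$, the sketch below assumes a longitudinal sound velocity of 5900 m/s, a typical value for steel that is introduced here for illustration and is not taken from [72]. It computes the echo interval for a 5 cm wall and the thickness step that corresponds to a TDC resolution of 300 ps.

```python
def echo_interval_s(thickness_m, velocity_m_per_s):
    """Time between the top-surface and bottom-surface echoes (2d/v)."""
    return 2.0 * thickness_m / velocity_m_per_s

def thickness_step_m(tdc_resolution_s, velocity_m_per_s):
    """Thickness change that shifts the echo interval by one resolution step."""
    return 0.5 * velocity_m_per_s * tdc_resolution_s

SOUND_VELOCITY_STEEL = 5900.0  # m/s, assumed typical value

interval_us = echo_interval_s(0.05, SOUND_VELOCITY_STEEL) * 1e6
step_um = thickness_step_m(300e-12, SOUND_VELOCITY_STEEL) * 1e6

print(f"5 cm wall -> echo interval of {interval_us:.2f} us")
print(f"300 ps    -> thickness step of {step_um:.3f} um")
```

Under this assumption a 5 cm wall gives an interval of about 17 μs, in line with the order of magnitude quoted above, and a 300 ps resolution corresponds to a thickness step below one micrometer.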

#### 2.1.4 Ultrasonic density meter

In this application[72], a TDC is used to measure the time interval that is equivalent to the change in the velocity of sound while passing through the fluid of different concentrations. Here, the velocity of sound in the fluid directly varies with its concentration at fixed temperature. A TDC with resolution  $\sim$  500 ps is usually sufficient for these instruments.

#### 2.1.5 Positron emission tomography (PET) medical imaging

PET images [73] are constructed with the help of radioactive tracers injected into the patient's body. These tracers decay by emitting positrons, which annihilate with electrons present in the body and thereby produce a pair of 511 keV photons traveling in opposite directions. These photons are detected by the scintillator-PMT detection ring of the PET system, where a TDC determines the time coincidence (the time interval between the arrivals of the photons at the detector) of these gamma ray photons to find the line of response (LOR). With the help of the various LORs measured using parallel processing channels, a tomographic image is reconstructed. Such applications require a low cost, low power and area efficient multichannel TDC. Other medical applications of TDCs are in ultrasound diagnostic equipment and X-ray imaging detectors [74].

#### 2.1.6 Analog to digital converter (ADC)

TDCs have a traditional application in the development of high precision and low power ADCs such as the dual slope ADC. Another method of power and area efficient ADC implementation using a TDC is reported in [75]. Here, the input analog voltage is converted into a pulse delay time by a current-starved inverter that has a linear delay versus applied voltage characteristic. This pulse delay time is then measured and digitized by a TDC.

#### 2.1.7 Frequency synthesis in RF communications systems

With the advancement in CMOS technologies in sub-micron domain, TDCs have replaced the analog blocks such as phase detector and charge pump in the design of all digital PLL for modern RF communication systems [76, 77]. Here, the TDC output controls the frequency of digitally controlled oscillator (DCO).

#### 2.1.8 Test & measurement instrumentation

A high resolution TDC is required in test and measurement instruments [78] that are used to characterize timing performance parameters such as skew, jitter, clock-to-output delay, and setup & hold times of high speed circuits. Other applications of TDCs are in electronic test equipment such as digital storage oscilloscopes [79, 80] and logic analyzers, in CMOS process corner/strength estimators [81] for analog circuits, single photon counting [82], PM demodulators [83], temperature sensing [84], and optical characterization of MOS circuits [85].

Considering the wide spectrum of applications of TDCs, it is important to study the various time interval measurement methods and TDC implementation techniques on the basis of the performance parameters discussed in section (2.2).

Initially, TDCs were implemented using ECL technology and discrete devices [67, 86, 87], with the merits of high precision and accuracy. However, they were limited by high power consumption and complex circuit size. Later on, ECL gate arrays were used, which reduced the power and area consumption [88].

The contemporary TDC solutions are based upon the FPGA and CMOS ASIC. The FPGA based implementation offers fast development cycle and reconfiguration. However, issues like high power consumption, unpredictable P&R delays, logic resource bottlenecks, and delay sensitivity towards temperature & voltage have to be taken into account during design & implementation. In addition, for multi-channel TDC, the FPGA has constraints of un-identical placement and routing of channels. This leads to higher channel-to-channel specification spread.

In contrast, ASIC based solutions exhibit better power and area efficiency and are preferred for bulk requirements of multi-channel TDCs. They also provide the design flexibility needed to achieve high TDC performance.

Reviews of TDC techniques based on the implementation platforms discussed above were reported earlier, in 1984 [89] and 2004 [90]. The aim of this chapter is to provide a detailed description of the time interval measurement techniques reported so far, supported by block and timing diagrams, with an emphasis on ASIC based TDC design. This chapter also provides a timeline flow diagram of the techniques representing their chronological evolution, and briefly summarizes the techniques in tabular form.

## 2.2 Performance parameters of TDC

#### 2.2.1 Resolution (LSB)

The resolution or the equivalent term LSB (least significant bit) is the least value of time interval that can be measured distinguishably. The result, given by the TDC (as a digital word) must be multiplied by the LSB value in order to reconstitute the measured time interval.

#### 2.2.2 Precision (RMS resolution)

This parameter characterizes a TDC for repetitive measurement of a fixed time interval. Owing to random noise sources, a fixed time interval measured multiple times tends to vary around its mean value with a standard deviation denoted by $\sigma$, known as the precision of measurement. It is a useful performance parameter for comparing TDCs as it includes the impact of the various random noise sources acting during time interval measurement.

#### 2.2.3 Accuracy

Accuracy stands for the degree of correctness with which a measured time interval agrees with the actual value. Inaccuracy results from either systematic error sources or linearity error in the measurement. The systematic error appears in the form of a constant offset, which can be easily subtracted from the result as it is fixed. However, linearity error can fluctuate with the measured time interval, so it can be removed from the result only by means of a look-up table (LUT). The error sources responsible for measurement uncertainty are presented in the following subsections.

#### 2.2.3.1 Quantization error

Quantization error $Q_e$ arises from the finite step size with which a time interval can be digitized by the TDC. It is expressed in terms of the LSB of the time interval measurement. To formulate its impact on the performance of the TDC, the quantization error is assumed to be a random variable that is uniformly distributed over a bounded range and independent of the input time interval. For time interval measurement where one asynchronous input signal, '*start*' or '*stop*', arrives at any time within an LSB of the time measurement, the quantization error varies from 0 to LSB and its RMS (root mean square) value is $LSB/\sqrt{12}$. In time interval measurement techniques where both start and stop are asynchronous to the LSB of measurement, $Q_e$ varies from -LSB to +LSB and its RMS value is given by $LSB/\sqrt{6}$.
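The two RMS values quoted above can be reproduced with a short Monte Carlo sketch. It assumes, as in the text, that each asynchronous edge contributes an error uniformly distributed over one LSB, and it takes the RMS about the mean error; the function name and sample count are illustrative choices.

```python
import math
import random


def quantization_rms(lsb, both_async, n=200_000, seed=1):
    """Estimate the RMS quantization error (about its mean) by simulation.

    One asynchronous edge: error uniform in [0, lsb].
    Two asynchronous edges: difference of two such errors, in [-lsb, +lsb].
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        e = rng.uniform(0.0, lsb)
        if both_async:
            e -= rng.uniform(0.0, lsb)
        errors.append(e)
    mean = sum(errors) / n
    return math.sqrt(sum((x - mean) ** 2 for x in errors) / n)


print(quantization_rms(1.0, False), 1 / math.sqrt(12))
print(quantization_rms(1.0, True), 1 / math.sqrt(6))
```

The simulated values should agree with $LSB/\sqrt{12}$ and $LSB/\sqrt{6}$ to within the statistical scatter of the sample.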

#### 2.2.3.2 Non-linearity error

Non-linearity error appears as a deviation of the TDC transfer (output versus input) characteristic from the ideal straight line. The source of non-linearity error is variation in the LSB of the TDC, which is caused by asymmetries in circuit layout, non-homogeneity of device parameters during device fabrication, and systematic noise sources. The impact of LSB variation appears in the form of differential non-linearity (DNL), which is the deviation of a single quantization step from the ideal value of one LSB. The DNL errors accumulate over a long time interval measurement. The accumulated error appears as integral non-linearity (INL) over a range of time interval measurement. For instance, in tapped delay line based time measurement techniques (refer section (2.4.2.3)), the static temporal variation of the signal edges from their ideal positions causes non-linearity error. The non-homogeneity in device parameters is the dominant source of non-linearity in modern CMOS processes because the mismatch in device parameters increases as the physical dimensions of the device are reduced [91].
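The relation between DNL and INL described above can be sketched as follows. This is an illustrative calculation only: the bin widths are made-up numbers, the LSB is assumed to be the mean bin width, and the function name is hypothetical.

```python
def dnl_inl(bin_widths):
    """Return (lsb, dnl, inl) for a list of measured bin widths.

    lsb is taken as the mean bin width (an assumption of this sketch).
    DNL is the per-bin deviation from one LSB, in units of LSB.
    INL is the running sum of DNL, i.e. the accumulated deviation.
    """
    lsb = sum(bin_widths) / len(bin_widths)
    dnl = [(w - lsb) / lsb for w in bin_widths]
    inl = []
    accumulated = 0.0
    for d in dnl:
        accumulated += d
        inl.append(accumulated)
    return lsb, dnl, inl


# Hypothetical delay-line bin widths in ps.
lsb, dnl, inl = dnl_inl([10.0, 12.0, 8.0, 10.0])
print(lsb, dnl, inl)
```

In this example the second bin is 0.2 LSB too wide and the third 0.2 LSB too narrow, so the INL rises to 0.2 LSB and then returns to zero.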

The sources of non-linearity error do not pose a fundamental limitation: device mismatch can be reduced by using larger devices, and layout asymmetries can be reduced through multiple iterations of the back-end design. In practice, however, transistor sizing is optimized under other design specifications as well, and the design time is limited.

#### 2.2.3.3 Signal jitter

TDC uses an off-chip periodic clock as a reference for time interval measurement. The phase noise or jitter associated with the reference time acts as added component of random noise. The internal signal jitters caused due to the supply and substrate noise, bias voltage noise and MOS device noise also create measurement error. These jitter sources and their impact are discussed in chapter-3.

#### 2.2.4 Dynamic range

Maximum time interval that can be digitized by TDC is its dynamic range.

#### 2.2.5 Dead time

Dead time stands for time interval between the end of the measurement and start of next one. It depends on the conversion time of the measurement technique and read-out speed. It defines the measurement rate of TDC.

## 2.3 Methods of time interval measurement

The measurement of time interval can be generalized into two categories 'time interval measurement' and 'time stamping'.

#### 2.3.1 Time interval measurement

It is used to measure the time interval between two logical events, 'start' and 'stop'.

#### 2.3.2 Time stamping or Tagging

It is used to 'stamp' the time of occurrence of logical events with respect to the reference clock. This method can also be used to indirectly measure the time interval between two isolated events, '*start*' and '*stop*' by subtracting their corresponding time stamped values with respect to the reference clock period. It is further subclassified in single-hit and multi-hit time stamping as shown in Fig.2.1. In single-hit time stamping, time of occurrence of a single transition event (hit) is stamped. In multi-hit time stamping, the time of occurrence of multiple transitions (multi-hit) are stamped. Further, the relative time of hits is measured with respect to the trigger (reference hit), where the time of trigger is also stamped with respect to the clock.



Figure 2.1: Timing diagram of single-hit and multi-hit event time stamping
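As a rough illustration of multi-hit time stamping, the sketch below stamps each hit and the trigger with the reference clock count and then expresses the hits relative to the trigger. Times are integers in ps, the clock period is a made-up value, and the function names are hypothetical.

```python
def timestamp(t_ps, t_clk_ps):
    """Stamp an event with the number of elapsed reference clock periods."""
    return t_ps // t_clk_ps


def hits_relative_to_trigger(hit_times_ps, trigger_time_ps, t_clk_ps):
    """Relative time of each hit with respect to the trigger, in ps.

    Both hits and trigger are stamped against the same reference clock,
    so each relative time is a difference of two stamps times the period.
    """
    trigger_stamp = timestamp(trigger_time_ps, t_clk_ps)
    return [(timestamp(h, t_clk_ps) - trigger_stamp) * t_clk_ps
            for h in hit_times_ps]


# Three hits and one trigger, stamped with an assumed 100 ps clock.
print(hits_relative_to_trigger([1250, 3490, 7010], 1000, 100))
```

Each relative time is resolved only to one clock period, which is the stamping resolution in this simplified model.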

## 2.4 TDC implementation techniques

The various TDC implementation techniques, as illustrated in Fig.2.2, exhibit different performance metrics. The TAC based analog techniques provide high resolution but have limited dynamic range. An improvement in the dynamic range of TAC based TDCs has been reported through the use of a time to amplitude conversion chain (TACC) and of a TAC as a fine interpolator.

In the digital techniques, direct counting of reference clock during the time interval to be measured provides large dynamic range, defined by the width of counter. However, its resolution is limited by the period of reference clock that further limits the accuracy of time interval measurement. In order to improve the accuracy, averaging method with direct counting has been reported for periodic time interval measurements.



Figure 2.2: Classification of time interval measurement techniques

Further in digital techniques, higher resolution is achieved by using a tapped delay line (TDL) that is realized by a cascaded chain of delay elements. In TDL based TDCs delayed taps of the input signal or the reference clock are used in numerous ways for precise time interval measurement. Various TDC topologies using inverter or buffer delay element based TDL have been reported such as time stamping, input interchangeable delay line, pseudo differential delay line, time memory chip, and multi-phase clock counting. In these techniques, the TDC resolution is determined by the minimum achievable gate delay in the given CMOS process that is limited due to parasitics. In order to achieve the sub-gate delay resolution, high speed delay elements such as RC, current mode logic (CML) and transmission gate have been used to realize TDL. The sub-gate delay resolution can also be achieved by different implementation techniques such as differential delay line, pulse shrinking delay line, parallel delay line, interpolation in DLL bin size and time amplification. In some TDL based techniques, the delay provided by delay element is made independent of PVT variations by using a delay lock loop (DLL).

The TDL based TDCs exhibit high resolution as compared to the direct counting techniques. However, in order to achieve a large dynamic range in TDL based TDCs, a long TDL has to be implemented, which occupies a larger silicon area. The benefits of the large dynamic range of the direct counting technique and of the high resolution of TAC/TDL based TDC techniques have been combined in Nutt's method. In this method, a coarse interpolator provides a large dynamic range by direct counting of the reference clock, and a TAC or TDL based fine interpolator provides high resolution by measuring the fractional time within one reference clock period at the '*start*' and '*stop*' edges. Here, the length of the TDL based fine interpolator is limited to the measurement of one reference clock period only. Further, hierarchical techniques have also been reported to achieve the specifications of large dynamic range and high resolution simultaneously without using long TDLs.

There are a few other techniques of precise time interval measurement that do not incorporate a TAC or TDL in their design. These techniques are mainly based upon startable oscillators, shift registers and successive approximation. A detailed discussion of each technique, including its merits and demerits, is given in the following sections.

#### 2.4.1 Analog time interval measurement techniques

The analog time interval measurement techniques are primarily based on the use of a time-to-amplitude converter (TAC) [88], where a capacitor ($C$) is linearly charged using a constant current source ($I$) for the time interval ($\Delta T$) to be measured. The charging of the capacitor is controlled by a switch, as shown in Fig.2.3(a). Here, the voltage $\Delta V$ accumulated across the capacitor is proportional to the time interval $\Delta T$ and is given by equation (2.1). This voltage can be converted into a digital code by using an ADC.

$$\Delta V = \frac{I}{C} \times (T_2 - T_1) = \frac{I}{C} \times \Delta T \tag{2.1}$$

Ideally, a high time resolution (of the order of 1 ps), which depends on the LSB of the ADC and the choice of the constant of proportionality ($C/I$), can be attained through this technique. However, various factors such as charge injection and clock feed-through in the switch, and leakage and memory effect of the capacitor, result in a variable offset in the voltage stored across the capacitor, leading to deterioration of performance. In addition, the stability and linearity of the current source and the accuracy of the ADC are other factors that affect the accuracy and linearity of the measurement. Moreover, variation in the supply and reference voltages (common mode noise) degrades the accuracy of measurement.
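A minimal numerical sketch of equation (2.1) may help to show how the time LSB follows from the ADC LSB and the ratio $C/I$. The component values and ADC parameters below are illustrative assumptions, not those of any reported design, and the function names are hypothetical.

```python
def tac_voltage(delta_t, current, cap):
    """Equation (2.1): capacitor voltage after charging for delta_t."""
    return current / cap * delta_t


def tac_time_lsb(adc_bits, v_full_scale, current, cap):
    """Time LSB of an ideal TAC followed by an ideal ADC.

    One ADC voltage step is mapped back to time with the factor C/I.
    """
    v_lsb = v_full_scale / (2 ** adc_bits)
    return cap / current * v_lsb


# Assumed values: I = 1 mA, C = 1 pF, 10-bit ADC with 1 V full scale.
print(tac_time_lsb(10, 1.0, 1e-3, 1e-12))   # time LSB in seconds
print(tac_voltage(0.5e-9, 1e-3, 1e-12))     # voltage for a 0.5 ns interval
```

With these assumed values the time LSB is just under 1 ps, while the full-scale voltage is reached after only 1 ns, which illustrates the limited dynamic range of a plain TAC.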

The dead time of TAC based current integration techniques is of the order of few microseconds, that depends on the conversion time of ADC and resetting time of logic circuits. The conversion time of ADC can be reduced by using a high speed flash ADC.

In multi-channel TACs, the dead time can be reduced by sharing the SAR-ADC among multiple multiplexed channels [57], as shown in Fig.2.3(b). Here, the number of TAC channels that can be integrated is limited by the amount of parasitic capacitor at node 'X'. This parasitic capacitance increases with the number of channels and its non-linear charging during transients degrades the measurement linearity. The mismatch in capacitors, causing variations in specifications of different TAC channels, is minimized by using common-centroid layout technique in [57].



Figure 2.3: Schematic diagram of (a) TAC method (b) 8-channel TAC based analog memory

The following sections highlight the key design aspects and functionality of various reported TAC based TDC architectures.

#### 2.4.1.1 Differential TAC based TDC

In order to reduce the effect of common mode noise, a differential TAC [92] is preferred. It uses two identical TACs to independently convert the time intervals from '*start*' and '*stop*' to the end of a '*gate*' window into voltages, as shown in Fig.2.4. A differential amplifier amplifies the difference between the output voltages of the two TACs ($V_{cap(start)}$ and $V_{cap(stop)}$), which is equivalent to the time interval between start and stop ($\Delta T$) and is given by equation (2.2). Here, $t_{start}$ and $t_{stop}$ are taken to denote the intervals from '*start*' and '*stop*', respectively, to the end of the gate window.

$$\Delta T = t_{start} - t_{stop} = \frac{C \times (V_{cap(start)} - V_{cap(stop)})}{I} \tag{2.2}$$



Figure 2.4: Differential TAC (a) schematic diagram (b) timing diagram

#### 2.4.1.2 Dual slope or analog time stretching technique

In the above discussed TAC based time measurement techniques, a separate ADC is used, which increases circuit complexity as well as power and area consumption. The dual slope or analog time stretching technique [93] based on the principle of time-to-time-to-digital conversion, avoids the need of separate ADC. In this technique, the time interval under measurement is extended by the stretching factor (S), independent of circuit parameters. The extended time interval window, thus obtained is digitized by counting the number of reference clock periods ( $T_{ref}$ ) using counter.

To implement this scheme, the Wilkinson ADC technique, with its salient features of high resolution and linearity, is used. As shown in Fig. 2.5(a), a capacitor $C$ is charged linearly by a large constant current ($I_1$) during the applied time interval. Subsequently, this capacitor is discharged by a small constant current ($I_2 = I_1/N$) until it reaches zero volts. As $I_2 < I_1$, the capacitor discharges slowly and takes a longer time $T > \Delta T$ to discharge fully. The voltage across this capacitor is compared with zero volts by a comparator to obtain a stretched time interval window. The stretching factor $S$ is defined as the ratio of the discharging time ($T$) to the charging time ($\Delta T$) of the capacitor. For a stored voltage $\Delta V$ across the capacitor $C$, the charging and discharging durations are given by:

$$\Delta T = \frac{C}{I_1} \times \Delta V \tag{2.3}$$

$$T = \frac{C}{I_2} \times \Delta V \tag{2.4}$$

Therefore, the stretching factor is

$$S = \frac{T}{\Delta T} = \frac{C/I_2}{C/I_1} = \frac{I_1}{I_2} = N, \quad \text{where } I_2 = I_1/N \text{ and } N \text{ is constant.}$$

Thus, the stretching factor is independent of absolute values of circuit parameters (current and capacitor).

Further, the extended duration 'T' is digitized by a counter. For the count 'n', the conversion expression is:

$$\Delta T = n \times \frac{T_{ref}}{S} \tag{2.5}$$

The time interval ($\Delta T$) measured using the dual slope technique is given by equation (2.5). Here, the resolution $T_{ref}/S$ is independent of the absolute values of the current and the capacitor. This technique is therefore independent of process variations in these absolute values. However, the inaccuracies of the switches (due to charge injection and clock feed-through) and noise coupled from the clock to the substrate and supply rails contribute to the DNL error of the time interval measurement.
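The dual slope conversion of equations (2.3) to (2.5) can be sketched with an idealised model, shown here as an illustration only. Ideal switches, current sources and comparator are assumed, and the component values and function names are hypothetical.

```python
def dual_slope_count(delta_t, i1, n_ratio, cap, t_ref):
    """Idealised dual slope conversion; returns (count, stretching factor).

    The capacitor is charged by i1 for delta_t, then discharged by
    i2 = i1 / n_ratio; the stretched interval is counted with period t_ref.
    """
    dv = i1 / cap * delta_t            # stored voltage, as in equation (2.1)
    i2 = i1 / n_ratio
    t_stretched = cap / i2 * dv        # equation (2.4)
    return int(t_stretched // t_ref), t_stretched / delta_t


def dual_slope_time(count, t_ref, stretch):
    """Equation (2.5): reconstruct the interval from the count."""
    return count * t_ref / stretch


# The same 3.37 ns interval converted with two different capacitor values.
for cap in (1e-12, 2e-12):
    count, stretch = dual_slope_count(3.37e-9, 100e-6, 100, cap, 10e-9)
    print(cap, count, stretch, dual_slope_time(count, 10e-9, 100))
```

Both capacitor values give the same count in this model, which reflects the statement that the stretching factor depends only on the current ratio $N$.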

From equation (2.5), the LSB varies inversely with the stretching factor and directly with the clock period. Therefore, it can be improved by increasing either the clock frequency or the stretching factor. However, the use of a high clock frequency leads to higher power consumption. Increasing the stretching factor is therefore preferred in the revised versions of this technique, at the cost of conversion time. In [93], the stretching factor is increased by using two separate capacitors $C_1$ and $C_2$ ($C_1 < C_2$) that are discharged at different rates, as shown in Fig. 2.5(b). Initially, both capacitors are pre-charged to a reference voltage. The '*start*' signal enables the discharging of capacitor $C_1$ through a large constant current $I_1$. After the time interval $\Delta T$, the '*stop*' signal disables its discharging. Simultaneously, it enables the discharging of the larger capacitor $C_2$ through a small constant current $I_2$ until its voltage equals that across $C_1$. The modified stretching factor is:

$S = M \times N$, where $C_2 = M \times C_1$ and $M$ is constant.



Figure 2.5: Dual slope techniques using Wilkinson principle (a) with single capacitor and current source(b) with separate capacitors and current sources

Here, the significant stretching can be achieved if there is a large difference in both the capacitor sizes. However, utilization of large capacitors is avoided in CMOS ASIC designs due to die area and cost considerations. Other area efficient techniques of increasing the stretching factor are synchronized discharging and Vernier charging as discussed below.

In synchronized discharging [94], the capacitor is discharged only during the high phase of a clock with a duty cycle of less than 50%, as shown in Fig. 2.6. This slows down the discharging of the capacitor and thus improves the stretching factor.



Figure 2.6: Dual slope using clock synchronized discharging of capacitor

The Vernier charging method [95] improves the stretching factor through successive approximation of a target voltage by repeated iterations of alternate charging and discharging of a capacitor for durations $T_D$ and $T_E$ ($T_D > T_E$), respectively. It also avoids the dependency of the resolution on the reference clock frequency. As shown in Fig. 2.7, the time interval under measurement is divided into three parts such that $\Delta T = T_C + T_{fA} - T_{fB}$. The time $T_C$, equivalent to the full clock periods, is measured by the counter. To measure the difference between the fractional time intervals $T_{fA}$ and $T_{fB}$, capacitor $C_1$ is charged by a large constant current $I_1$ during $T_{fA}$. Then, it is discharged during $T_{fB}$ using the same current and attains an intermediate voltage level. This voltage is compared with the voltage $V_2$ across a larger capacitor $C_2$ using a comparator. The voltage $V_2$ approaches this intermediate voltage successively by the Vernier charging method. In this method, the non-overlapping time interval windows $T_D$ and $T_E$ are chosen such that both lie in each half of the reference clock cycle that is being counted by the counter. When the voltages across both capacitors become equal, the comparator toggles its output and disables the counter.



Figure 2.7: Vernier charging method

If 'n' is the number of counted cycles then for voltage stored across both the capacitors:

$$\frac{I_1 \times (T_{fA} - T_{fB})}{C_1} = 2 \times n \times \frac{I_2 \times (T_D - T_E)}{C_2} \tag{2.6}$$

$$\Rightarrow \frac{N \times I_2 \times (T_{fA} - T_{fB})}{C_1} = 2 \times n \times \frac{I_2 \times (T_D - T_E)}{M \times C_1} \tag{2.7}$$

$$\Rightarrow (T_{fA} - T_{fB}) = n \times LSB \tag{2.8}$$

where $LSB = 2 \times (T_D - T_E)/(M \times N)$.
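Equations (2.6) to (2.8) can be illustrated with a simple step-by-step model of the voltage $V_2$. This is only a sketch under stated assumptions: ideal components, a net voltage step of $2 \times I_2 (T_D - T_E)/C_2$ per counted clock cycle, and a comparator that trips in the first cycle in which $V_2$ reaches the target. All names and values are hypothetical.

```python
def vernier_lsb(t_d, t_e, m_ratio, n_ratio):
    """LSB of the Vernier charging method, from equation (2.8)."""
    return 2.0 * (t_d - t_e) / (m_ratio * n_ratio)


def vernier_count(t_fa, t_fb, i2, c1, t_d, t_e, m_ratio, n_ratio):
    """Count clock cycles until V2 reaches the voltage stored on C1."""
    i1 = n_ratio * i2
    c2 = m_ratio * c1
    v_target = i1 * (t_fa - t_fb) / c1            # left side of eq. (2.6)
    dv_cycle = 2.0 * i2 * (t_d - t_e) / c2        # net V2 step per cycle
    n, v2 = 0, 0.0
    while v2 < v_target:
        v2 += dv_cycle
        n += 1
    return n


# Assumed values: T_D = 4 ns, T_E = 3 ns, M = N = 10, I2 = 10 uA, C1 = 1 pF.
lsb = vernier_lsb(4e-9, 3e-9, 10, 10)
n = vernier_count(6.505e-9, 6.0e-9, 10e-6, 1e-12, 4e-9, 3e-9, 10, 10)
print(lsb, n, n * lsb)
```

With these assumed values the LSB is 20 ps, and a fractional difference of 505 ps is reported as 26 counts, i.e. within one LSB of the true value.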

The limitation of the dual slope method is its large, input dependent conversion time, given by $\Delta T \times S$. In addition, it has an inherent trade-off between the resolution and the measurement rate of the TDC. For instance, a resolution of 1 ps with a 100 MHz (10 ns) clock requires a stretching factor of S = 10,000, which implies a conversion time of 150 $\mu$s for a time interval of 15 ns.
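The numbers in the example above follow directly from $LSB = T_{ref}/S$ and the conversion time $\Delta T \times S$, as the short check below shows. The function names are illustrative.

```python
def required_stretch(t_ref, lsb):
    """Stretching factor needed for a given LSB, from LSB = T_ref / S."""
    return t_ref / lsb


def conversion_time(delta_t, stretch):
    """Input dependent conversion time of the dual slope method."""
    return delta_t * stretch


s = required_stretch(10e-9, 1e-12)     # 100 MHz clock, 1 ps resolution
print(s, conversion_time(15e-9, s))    # stretching factor, time in seconds
```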

The TAC based techniques are efficient for time measurement with high resolution but are limited by dynamic range. Time interval measurement with high resolution and large dynamic range requires measurement in two parts. First is coarse time measurement by counting the full cycles of reference clock within time interval. The second is fine measurement of the fractional time (within one clock period) by TAC or time stretcher. Based on this scheme, a TDC using time to charge converter with on-chip Wilkinson ADC and counter has been reported [96]. Another approach to implement the above scheme is 'time to amplitude conversion chain' (TACC), based on analog coarse and fine counting.

#### 2.4.1.3 Time to amplitude conversion chain (analog coarse and fine counting)

This technique [97] uses periodic charging and discharging of a capacitor triggered by '*start*' and ended by '*stop*'. The number of cycles of triangular clock thus obtained till the arrival of '*stop*' gives coarse count as shown in Fig. 2.8. The voltage across the capacitor sampled by '*stop*' is digitized by ADC to provide fine count.

This scheme is implemented using operational amplifier based charge integrator, two comparators, ADC, two flip-flops and a counter. Here, the '*start*' enables the charge integrator to produce voltage ramp-up at its output till it becomes equal to the reference voltage of ADC. This is detected by a comparator, which toggles its output to set a flip-flop. The output of this flip-flop enables the voltage ramp down until it becomes zero volts. This is detected by another comparator, which resets the flip-flop to further enable the voltage ramp up. Thus, cycle of subsequent voltage ramp up and down provides a triangular clock. The '*stop*' signal disables the integrator and enables sample and hold circuit with the help of switches. The final voltage across the capacitor in the integrator is sampled and is digitized by the ADC. This analog scheme is used to improve the dynamic range of TDC without sacrificing the resolution.



Figure 2.8: Timing diagram of TAC conversion chain

#### 2.4.1.4 TDC using TAC and tapped delay line (TDL)

In this technique[58, 59], the charging of capacitor 'C' is regulated by TDL that is realized by a cascaded chain of tri-state buffers as shown in Fig. 2.9. The delay line taps are connected to a ladder of resistor 'R'. The other end of the resistor ladder is connected together to the capacitor 'C' at the summing point. This structure forms an analog adder as each delayed replica of '*start*' passing through TDL provides an additional unit of charging current to the capacitor 'C' through resistor 'R'. The '*stop*' signal disconnects the resistor ladder from the delay line by disabling the tristate buffers. The voltage thus obtained across the capacitor 'C' is equivalent to the time interval between '*start*' and '*stop*' and is digitized by an ADC. The merits of this technique are high resolution beyond the intrinsic buffer delay and lower static power consumption. In [60], to achieve the accuracy of measured time interval, the delay of TDL is regulated against PVT variations using DLL.



Figure 2.9: Time interval measurement technique using TAC with TDL

#### 2.4.2 Digital time interval measurement techniques

The TAC based analog techniques of time interval measurement are very efficient in providing high time resolution (~ 10's ps) but this merit is being limited by reduction in the supply voltages and hence reduced noise margins in the current submicron semiconductor technologies. On the contrary, the advancements in submicron semiconductor technologies have enabled the digital methods of time interval measurement to achieve the resolution comparable to the analog techniques along with exhibiting large scales of integration with optimum power consumption. Fully digital TDCs are therefore now gaining popularity in wide range of applications such as TDC based temperature sensor, all digital PLLs and also used as a replacement of ADCs for time domain signal processing. In this section, TDCs implemented using fully digital topologies are presented along with a few mixed architectures that incorporate small analog blocks like DLL/PLL to benefit from the best of two worlds.

#### 2.4.2.1 Direct counting based TDC

This technique has been used in digital stopwatches and timers [98] for time interval measurement with large dynamic range. Here, the number of elapsed cycles of stable reference clock between '*start*' and '*stop*' is proportional to time interval between them. This number is determined using a counter by applying a clock for given time interval duration. The period of this clock defines the time resolution, so the maximum quantization error is within one count.

There are two ways of applying the clock to the counter for the duration of the time interval: direct gating and synchronized gating. Direct gating [99] is simple to construct but may result in an error of more than one count due to the truncation of the clock pulse (at the moment of gate opening and closing), as shown in Fig. 2.10(a). In order to minimize this error, synchronized gating is used, as shown in Fig. 2.10(b), which operates on the edges of the clock instead of its level. In the case of synchronized gating, the maximum error is one clock cycle at the time of gate opening as well as at gate closing (refer Fig. 2.10(a)). The total error in the time interval measurement is the difference of these two and is therefore within one count. Further, to avoid the count error at the beginning of the time interval measurement, a startable ring oscillator is used, as shown in Fig. 2.10(c). Here, the '*start*' signal triggers an oscillator, and its cycles are then counted until the arrival of '*stop*'. The end of conversion (eoc) signal, which stops the counting, is obtained by synchronizing the '*stop*' at the falling edge of the clock. To limit the measurement error to within one count at the time of '*stop*', one count is subtracted from the final result if stop occurs in the falling edge duration of the clock.
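The one-count error bound of edge-based counting can be illustrated with an idealised model in which rising clock edges occur at integer multiples of the clock period and the counter counts the edges between '*start*' and '*stop*'. Times are integers in ps; the function name and values are hypothetical.

```python
def gated_count(start_ps, stop_ps, t_clk_ps):
    """Number of rising clock edges in the window (start, stop].

    Clock edges are assumed to sit at integer multiples of t_clk_ps.
    """
    return stop_ps // t_clk_ps - start_ps // t_clk_ps


T_CLK = 10_000  # assumed 10 ns reference clock, in ps

# Sweep the phase of 'start' and check the error of a 44 ns interval.
for start in range(0, T_CLK, 2_500):
    measured = gated_count(start, start + 44_000, T_CLK) * T_CLK
    print(start, measured, measured - 44_000)
```

In this model the measured value differs from the true interval by less than one clock period for every phase of '*start*'.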



Figure 2.10: (a) Timing diagram of direct counting method (b) circuit used in synchronized gating (c) time interval measurement using startable ring oscillator

The gated clock method is not efficient for multi-stop or multi-channel time interval measurement as it requires replication of counters as per number of stops or measurement channels. Therefore, a free running counter that is shared between all measurement channels is used till its rollover. It is initialized by 'enable' signal. The inputs 'start' and 'stop' individually sample the state of the counter and latch the corresponding counts into registers. The subtraction of both the counts gives time interval. Here, if 'start' or 'stop' occurs during toggling of counter, the metastability of registers causes unpredictable outputs. Therefore, a synchronizer is used between input (start/stop) and counter to sample its state on safe edge of clock.
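The subtraction of latched counts from a shared free running counter can be sketched as below. The sketch assumes an ideal counter of a given width and an interval shorter than one full rollover period, so that modular subtraction recovers the correct difference; the function name and values are illustrative.

```python
def interval_from_stamps(start_count, stop_count, counter_bits, t_clk):
    """Time interval from two latched states of a free running counter.

    Modular subtraction handles a single rollover between start and stop.
    """
    modulus = 1 << counter_bits
    return ((stop_count - start_count) % modulus) * t_clk


# 8-bit counter, 10 ns clock: the counter rolls over between start and stop.
print(interval_from_stamps(250, 5, 8, 10))    # ns
print(interval_from_stamps(5, 250, 8, 10))    # ns
```

The first call spans a rollover (250 to 5) and still yields 11 counts, i.e. 110 ns.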

The merit of direct counting is small conversion time as settling time of registers is of 100's of picoseconds. In addition, good linearity in time interval measurement can be obtained by assuring the stability of clock. However, the accuracy of measurement in direct counting technique is limited due to poor resolution (one clock period). Higher resolution could be achieved by using a high frequency clock either from external source or from an on-chip PLL. However, the maximum operating frequency is limited by the choice of technology. Also, the on-chip PLL based solution turns out to be power consuming and complex due to the issue of phase and frequency locking.

An improvement in accuracy of direct counting method has been reported (Reed, 1964, Hewlett Packard)[99] while using a low frequency clock by averaging of large number of measurements. It is based on the statistical reduction in quantization and random errors, if time interval is asynchronous to the reference clock and inputs are repetitive. The averaging reduces the measurement error by square root of the number of measurements but it limits the measurement rate of TDC.
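The effect of averaging can be illustrated with a small simulation, assuming a repetitive interval whose phase is uniformly random with respect to the reference clock and an ideal edge counter. The interval, clock period and function name are illustrative assumptions.

```python
import random


def averaged_measurement(delta_t, t_clk, n_meas, seed=0):
    """Average of n_meas direct-counting measurements of delta_t.

    Each measurement counts the clock edges seen between start and stop,
    with the start phase drawn uniformly within one clock period.
    """
    rng = random.Random(seed)
    total_counts = 0
    for _ in range(n_meas):
        phase = rng.uniform(0.0, t_clk)
        total_counts += int((phase + delta_t) // t_clk) - int(phase // t_clk)
    return total_counts * t_clk / n_meas


# A 12.3 ns interval measured with a 10 ns clock.
print(averaged_measurement(12.3, 10.0, 1))
print(averaged_measurement(12.3, 10.0, 10_000))
```

A single measurement can only return 10 ns or 20 ns here, whereas the average over many measurements approaches 12.3 ns, at the cost of a correspondingly lower measurement rate.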

#### 2.4.2.2 Nutt's interpolation method

Nutt's interpolation technique (1960's) was reported in order to improve the resolution of the direct counting method. In this technique, the fractional time bounded between consecutive states of the counter is measured using a high resolution technique, referred to as the fine interpolator. Thus, the entire time interval measurement is carried out in two parts, coarse and fine, as shown in Fig.2.11. The coarse time $T_C$ is equivalent to the full reference clock cycles within the time interval window and is counted by the counter. The fractional time intervals $T_A$ and $T_B$ at the beginning and end of the time interval window are measured by the fine interpolator. The fine interpolator is designed for a dynamic range of one coarse clock period to synchronize the coarse and fine counts.

Figure 2.11: Nutt's interpolation technique (a) timing diagram (b) block diagram

The measured time interval is given by equation(2.9).

$$\Delta T = T_A + T_C - T_B \tag{2.9}$$
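The decomposition in equation(2.9) can be checked with a short sketch. It assumes that $T_A$ and $T_B$ are each measured from the event to the following reference clock edge, which is consistent with the signs in equation(2.9); the function name and the numerical values are hypothetical.

```python
def nutt_interval(start_ps: int, stop_ps: int, clock_period_ps: int):
    """Return (T_A, T_C, T_B, delta_T) for clock edges at multiples of the period."""

    def next_edge(t_ps: int) -> int:
        # Ceiling to the next multiple of the clock period (integer arithmetic).
        return -(-t_ps // clock_period_ps) * clock_period_ps

    edge_after_start = next_edge(start_ps)
    edge_after_stop = next_edge(stop_ps)
    t_a = edge_after_start - start_ps          # fine time at the beginning
    t_b = edge_after_stop - stop_ps            # fine time at the end
    t_c = edge_after_stop - edge_after_start   # whole clock cycles (coarse)
    return t_a, t_c, t_b, t_a + t_c - t_b


# Hypothetical example: 10 ns reference clock.
print(nutt_interval(3_200, 47_900, 10_000))
```

In a real interpolator $T_A$ and $T_B$ would themselves be quantized by the fine interpolator; the sketch uses exact values to show only the bookkeeping.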

Fig.2.11 (b) shows the block diagram of the interpolation technique, where a synchronizer is used to obtain the time interval windows '$T_A$', '$T_B$' and '$T_C$'. It is also used to mask a '*stop*' signal occurring prior to the '*start*'.

The merit of this interpolation technique is the reduction in non-linearity of the fine interpolator by averaging several measurements of a fixed time interval. Here, the values of '$T_A$' and '$T_B$' vary randomly but the absolute value of $T_A - T_B$ remains constant. This averages out the non-linearity of the interpolator.

Nutt's technique has been implemented using TAC[67], analog time stretcher[93, 100] and tapped delay line (TDL)[88, 101] in several TDC prototypes.

In Nutt's interpolation technique with a TDL based fine interpolator, a DLL is used to make the delay immune to PVT variations. However, the DLL is power and area inefficient in view of the large number of delay elements required to achieve a sub-nanosecond LSB using a low frequency clock (10's of MHz).

The requirement of DLL has been avoided by using an in-built digital calibrator to calibrate the unit delay (LSB)[102]. It is based on measuring the time-period  $(T_{osc})$  of ring oscillator derived from TDL. This time-period when divided by number of stages in TDL gives LSB of time interval measurement.

To implement this calibration scheme, a multiphase ring oscillator is designed using even number of cascaded differential inverters[103]. The oscillations are maintained by cross coupled feedback of outputs as shown in Fig.2.12. This type of structure of ring oscillator avoids coupling stage and intrinsically realizes the interpolation of clock period by using its intermediate phases.

To calibrate the LSB, the measurement circuitry is used in the calibration mode for a known time interval ($\Delta T$). An on-chip calibration counter generates the '*start*' and '*stop*' signals with a known time interval using a reference clock provided by an off-chip crystal oscillator. Using the conversion equation(2.10) of Nutt's interpolation technique, the $LSB = T_{osc}/N$ can be calibrated.

$$\Delta T = N_c \times T_{osc} + (N_A - N_B) \times \frac{T_{osc}}{N} \tag{2.10}$$
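Solving equation(2.10) for the LSB gives $LSB = \Delta T / (N_c \times N + N_A - N_B)$. A minimal sketch of this calibration step is given below; the function name and the example counts are assumptions made for illustration and do not come from [102].

```python
from fractions import Fraction


def calibrate_lsb(known_interval_ps: int, n_c: int, n_a: int, n_b: int,
                  n_stages: int) -> Fraction:
    """LSB (ps) obtained by solving equation (2.10) for T_osc / N."""
    # Express the measured interval as a total number of LSB-sized steps.
    total_lsb_counts = n_c * n_stages + (n_a - n_b)
    if total_lsb_counts <= 0:
        raise ValueError("calibration interval must span at least one LSB")
    return Fraction(known_interval_ps, total_lsb_counts)


# Hypothetical calibration run: known interval of 104 ns, 16-stage TDL.
lsb = calibrate_lsb(104_000, n_c=3, n_a=5, n_b=1, n_stages=16)
print(lsb, lsb * 16)  # LSB and the corresponding oscillator period T_osc
```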



Figure 2.12: Use of multi-phases of ring oscillator with calibration in Nutt's interpolation technique

The other variant of interpolation is reported as the asynchronous interpolation technique[71], where an event triggerable oscillator with time period ($T_{osc}$), triggered by 'start', is used for coarse counting of '$T_C$' instead of a reference clock, as shown in Fig.2.13. The fractional time interval '$T_F$' between 'stop' and the next rising edge of 'clock' is measured using a fine interpolator, realized by a time interval measurement technique with resolution ($T_d$) higher than direct counting. The conversion expression for this technique is given by equation(2.11), where '$N_C$' and '$N_F$' are the counts provided by the counter and the fine interpolator respectively. Here, the commencement of time measurement is asynchronous to the system clock. This technique is generally used for TDC applications in space exploratory missions and HEP experiments with random events.

$$\Delta T = T_C - T_F = T_{osc} \times N_C - T_d \times N_F \tag{2.11}$$



Figure 2.13: Timing diagram of reference clock asynchronous time interpolation technique

#### 2.4.2.3 Tapped Delay Line (TDL) based time interval measurement

A tapped delay line comprises a cascaded chain of delay elements that generates delayed replicas of either the '*start*' and '*stop*' signals or the '*reference clock*'. These delayed replicas are then tapped and used in numerous schemes to implement high performance TDCs.

The simplest form of TDL based time interval measurement technique uses the flash conversion principle, where, as shown in Fig.2.14, the delayed taps of the '*start*' signal are concurrently sampled and latched into a register on the arrival of the '*stop*' signal, as reported in [88, 90, 104, 105, 106]. Here, the delay ($T_d$) provided by each delay element defines the LSB or resolution of the time interval measurement. The number of delayed '*start*' edges that have elapsed before '*stop*' is proportional to the time interval between them. The latched register code is in the form of a thermometer code, where the first logic transition count multiplied by the unit delay ($T_d$) gives the time interval.
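The flash conversion described above can be sketched behaviourally as follows. The sketch assumes ideal, perfectly matched taps and ignores flip-flop setup time and metastability; the function names and the 100 ps unit delay are hypothetical.

```python
def sample_taps(interval_ps: int, unit_delay_ps: int, n_taps: int) -> list:
    """Register code latched by 'stop': tap k holds 'start' delayed by k * T_d."""
    # A tap reads '1' if its delayed 'start' edge has arrived when 'stop' samples it.
    return [1 if k * unit_delay_ps <= interval_ps else 0
            for k in range(1, n_taps + 1)]


def decode_thermometer(code: list) -> int:
    """Position of the first '1' to '0' transition, i.e. the number of leading ones."""
    count = 0
    for bit in code:
        if bit == 0:
            break
        count += 1
    return count


# Hypothetical example: 730 ps interval, 100 ps unit delay, 16 taps.
code = sample_taps(730, 100, 16)
print(code, decode_thermometer(code) * 100)
```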

The performance of TDL based TDC techniques mainly depends upon the choice of delay element characteristics. It is therefore prudent to briefly discuss various performance metrics of a TDL before describing the TDL based TDC techniques.



Figure 2.14: Time interval measurement using tapped delay line

# Performance metrics of TDL

(a) Unit delay: The bin size or resolution of time interval measurement depends on the smallest attainable delay from the delay element. This makes the choice of delay element a crucial aspect of TDC design.

(b) Matching of the delay elements: The linearity of TDL based time interval measurement techniques depends on the matching of tapped delays. The sources of delay mismatch across the taps are systematic errors and random variation in device parameters[107]. As the random variation in device parameters is inversely proportional to the device area [91], it can be reduced by choosing a large aspect ratio for the transistors in the design of delay elements.

(c) Length of TDL: The dynamic range of time interval measurement is given by the product of 'unit delay and number of delay elements' in the delay line. However, use of a long delay line in order to achieve large dynamic range, is not preferred due to following two reasons-

• It is power and area inefficient.

• The differential non-linearity error due to mismatch in delays among consecutive taps of delay line accumulates over the delay line. This accumulated error appears in the form of integral non-linearity, which increases with the length of delay line.

(d) Accuracy and stability of TDL: The stability of the unit delay across process and operating condition variations determines the accuracy of time interval measurement. The CMOS buffer delay is sensitive to PVT variations. Therefore, a delay variation compensation circuit, the DLL[108], is used, where the delay of the TDL is referenced to an external clock with the help of a feedback loop. It measures the total delay of the delay line and compares it with the external reference clock. The resulting timing error is corrected over successive iterations by the negative feedback of the DLL. This method ensures the accuracy of measurement. However, the DLL turns out to be a complex, power and area consuming solution.
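The accumulation of differential non-linearity into integral non-linearity along the delay line, noted under item (c), can be sketched as a running sum. The tap delays below are hypothetical and the errors are expressed in picoseconds rather than in LSB to keep the arithmetic exact.

```python
from itertools import accumulate


def dnl_inl(tap_delays_ps: list, ideal_delay_ps: int):
    """Per-tap DNL and accumulated INL (both in ps) for a tapped delay line."""
    # DNL: deviation of each tap delay from the ideal unit delay.
    dnl = [delay - ideal_delay_ps for delay in tap_delays_ps]
    # INL: running sum of the DNL up to each tap.
    inl = list(accumulate(dnl))
    return dnl, inl


# Hypothetical taps, each slightly slower than the ideal 100 ps.
print(dnl_inl([104, 102, 105, 103], 100))
```

In this example every tap errs in the same direction, so the INL grows steadily with the length of the line; randomly signed mismatches would partially cancel.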

# 2.4.2.3.1 TDL based time interval measurement techniques (gate delay resolution)

An inverter is the basic delay element that provides the minimum delay in a given CMOS technology. However, due to the asymmetric rising and falling edge delays of an inverter, there is a mismatch in consecutive tapped delays. Therefore, a buffer (two cascaded inverters) is generally used as the delay element. It maintains the polarity of the input signal with uniform delays among the taps. However, the least attainable delay of a buffer is twice that provided by an inverter. Further, the minimum achievable delay is also limited by the parasitics of a given technology, thereby limiting the resolution of time interval measurement in that technology. The TDC topologies reported to exhibit gate-delay resolution are as follows-

# 2.4.2.3.1.1 Direct counting using TDL

In this technique, a TDL provides delayed copies of the applied reference clock. Corresponding to each delayed clock, a counter is used to count the number of cycles between '*start*' and '*stop*'. All the counters are enabled by '*start*' and disabled by '*stop*', so that the sum of the counts provided by the counters multiplied by the resolution gives the time interval. Here, the resolution is defined by the unit delay of the TDL. This technique improves the resolution of the direct counting method, but at the cost of multiple counters, whose number is proportional to the number of delayed clocks.
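A behavioural sketch of this scheme is given below. It assumes that the delay line spans exactly one reference clock period ($N \times T_d = T_{clk}$), so that the rising edges of all delayed clocks together form a uniform grid of spacing $T_d$; this assumption, the function names and the numerical values are illustrative only.

```python
def edges_between(start_ps: int, stop_ps: int, phase_ps: int, period_ps: int) -> int:
    """Rising edges at phase + m * period that fall in (start, stop]."""
    return (stop_ps - phase_ps) // period_ps - (start_ps - phase_ps) // period_ps


def tdl_direct_count(start_ps: int, stop_ps: int, unit_delay_ps: int, n_taps: int):
    """Counts of each per-tap counter and the resulting time interval (ps)."""
    # Assumption: the delay line covers exactly one clock period.
    period_ps = unit_delay_ps * n_taps
    counts = [edges_between(start_ps, stop_ps, k * unit_delay_ps, period_ps)
              for k in range(n_taps)]
    # Sum of all counts multiplied by the resolution (unit delay).
    return counts, sum(counts) * unit_delay_ps


# Hypothetical example: 8 taps of 500 ps (4 ns clock), true interval 26.55 ns.
counts, measured_ps = tdl_direct_count(1_300, 27_850, 500, 8)
print(counts, measured_ps)
```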

## 2.4.2.3.1.2 Latch based TDL

In this technique, the TDL itself is realized by using chain of latches that are transparent for propagation of '*start*' until '*stop*' occurs. The D-to-Q delay of latch defines resolution of time interval measurement. This architecture consumes relatively less power and area as it does not require separate chain of delay elements.

# 2.4.2.3.1.3 Tapped delay line using inverter as a delay element (Pseudo differential delay line)

The resolution of TDL can be improved by a factor of two if inverter is used instead of a buffer as a delay element. However, due to the contribution of both rising and falling edge delays of the inverter in time interval measurement, linearity of tapped delays is poor as compared to that in buffer based TDL. By adjusting the aspect ratio of MOS transistors, both edge delays of inverter can be made nearly symmetric. However, the symmetry in delays is not maintained across process variation. In addition, the setup time of flip-flop used in TDL is not symmetric for input rising and falling edge transitions.

The non-linearity of the inverter based TDL is circumvented in [81], where a pseudo differential delay line with sense amplifier based flip-flops (SAFF) is used. The SAFF consists of a sense amplifier in the first stage and an S-R latch in the second stage. The sense amplifier senses the true and complementary differential inputs. It produces a high to low transition in one of the output signals on the leading clock edge. The S-R latch captures each transition and holds the state until the next clock edge arrives, so that the whole structure works as a flip-flop.

As shown in Fig.2.15 (a), '*start*' and its complementary signal pass through two identical delay lines of cascaded inverters. The '*stop*' signal samples and latches the status of the delayed versions of '*start*' and its complementary signal using SAFFs. In the latch register chain, the inputs of alternate flip-flops are interchanged between the delayed start and the corresponding delayed complementary signal to avoid the impact of the asymmetric delays of the inverter. The sense amplifier based flip-flop is designed with a small and symmetric setup time window for both rising and falling edge input transitions. Due to the small setup time window of the flip-flops, the resolution of time interval measurement is limited by the inverter delay. As shown in Fig. 2.15 (b), the combined output of the flip-flops is a thermometer code, as in the buffer based TDL, where the first logic '1' to '0' transition count multiplied by the inverter delay (average of rising and falling delay) gives the measured time interval. This technique has been used to design a TDC with 20 ps resolution in a 90 nm CMOS process for phase error detection in an all digital PLL [81].



Figure 2.15: Pseudo differential delay line (a) block diagram (b) timing diagram



#### 2.4.2.3.1.4 Input interchangeable tapped delay line

Figure 2.16: Input interchangeable tapped delay line (a) block diagram (b) timing diagram

This time interval measurement scheme [109] is capable of measuring a negative time interval, that is, when '*stop*' occurs before '*start*'. This type of measurement is similar to the sound direction sensitivity in the nervous system of the barn owl, where the direction of sound is detected by comparing the arrival times of sound from a source at its two ears. For time interval measurement, the '*start*' and '*stop*' pass through identical TDLs in opposite directions as shown in Fig.2.16 (a). The timings of the delayed versions of '*start*' and '*stop*' are compared in reverse order, i.e. the first delayed version of '*start*' is compared to the last delayed version of '*stop*'. In this order of comparison, the time interval between '*start*' and '*stop*' reduces on each tap, which leads to coincidence as shown in Fig.2.16 (b). The AND logic gate is used to detect coincidence and the leading of '*stop*' over '*start*' by setting the data input to logic '1' before the clock input of the flip-flop. The number of logic '1's in the output code multiplied by the unit delay gives the time interval to be measured. To determine the total number of logic '1's in the code, each AND gate output signal samples the status of its previous gate output.

In this architecture, the circuit has structural and functional symmetry, so the roles of '*start*' and '*stop*' are interchangeable. Hence, it fulfills the requirement of negative time interval measurement.

## 2.4.2.3.1.5 TDL based time stamping technique

In many HEP experiments, it is required to stamp the time of occurrence of random events with respect to a reference clock in a high event rate and multi-hit environment. In such applications a multi-channel, multi-hit TDC with high resolution, large dynamic range and small conversion time is preferred.

The techniques reported in [108, 110, 60, 111], have implemented a DLL based TDL to stamp the time of occurrence of random events with respect to a reference clock. The total delay of TDL is continuously calibrated using feedback loop of DLL to maintain it equal to one reference clock period as shown in Fig.2.17. The DLL ensures the accuracy of time stamping across PVT variations.

The TDL provides delayed replicas of the reference clock with time period $T_{ref}$ as per equation(2.12), where 'N' is the number of delay elements and '$T_d$' is the unit delay. The number of elapsed delayed clocks till the occurrence of '*hit*' stamps its time from the previous rising edge of the reference clock. Here, the range of time stamping is confined within one clock period ($T_{ref}$). To improve the range, the frequency of the clock has to be reduced, which in turn demands an increase in the length of the delay line (refer equation(2.12)). Alternatively, a counter synchronized to the reference clock is used to improve the range of time stamping. Thus, time stamping of a hit is carried out in two parts: coarse measurement (full clock cycles) using the counter and fine measurement (fractional time within one clock period) using the TDL, as shown in Fig.2.17 (b).

$$T_{ref} = N \times T_d \tag{2.12}$$



Figure 2.17: Time stamping using TDL (a) block diagram (b) timing diagram

The '*hit*' on its detection samples the status of counter and TDL in coarse and fine registers respectively. The first logic '1' to '0' transition count in fine register code gives fine count ( $N_f$ ) that is equivalent to number of elapsed delayed clocks before hit.

For accurate coarse measurement, it is required to ensure that the latching of coarse counter occurs on the safe edge of reference clock when the counter is in stable state. If latching occurs on the same clock edge when counter is in switching state, the coarse register may be in metastable state resulting in ambiguous coarse count. However, latching on the safe edge will cause an error of one coarse count if the hit comes after the safe clock edge during one clock cycle. A dual edge synchronization [112] and dual counter method [108] have been utilized to avoid metastability and respective coarse count errors.

If '$N_c$' and '$N_f$' are the coarse and fine register binary values respectively, the arrival time of hit is given by equation(2.13). Here, as the coarse count is added to the fine count, the accounted value of the coarse count is '$N_c - 1$'.

$$T = (N_c - 1) \times T_{ref} + N_f \times T_d \tag{2.13}$$
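Equations (2.12) and (2.13) can be combined in a short sketch that converts the latched coarse and fine counts into a time stamp. The function name and the example values (16 ns reference clock, 32 delay elements) are assumptions chosen only for illustration.

```python
def stamp_time(n_c: int, n_f: int, t_ref_ps: int, n_taps: int) -> int:
    """Arrival time of a hit (ps) from coarse count n_c and fine count n_f."""
    if not 0 <= n_f <= n_taps:
        raise ValueError("fine count must lie within the delay line")
    # Equation (2.12): the DLL locks the line so that T_ref = N * T_d.
    t_d_ps = t_ref_ps // n_taps
    # Equation (2.13): coarse part uses (N_c - 1) full clock periods.
    return (n_c - 1) * t_ref_ps + n_f * t_d_ps


# Hypothetical example: T_ref = 16 ns, N = 32, so T_d = 500 ps.
print(stamp_time(n_c=5, n_f=7, t_ref_ps=16_000, n_taps=32))
```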

In time stamping of multi-hit events, the double hit resolution is crucial design aspect. The TDL with single fine register cannot resolve two consecutive hits occurring within one clock cycle. Therefore, in the reported multi-hit TDCs [113, 114], a single transition is allowed to be detected in the range of TDL (one clock period).

Further, time stamping can be utilized in triggered as well as non-triggered applications. In non-triggered applications, time of hit is given relative to the time of the reference clock. In triggered application, the time of trigger is also stamped in the reference channel, so that the relative time of hits is given with respect to trigger.

This technique is efficient for multi-channel TDCs, as the free running counter and the TDL can be shared between multiple channels of the TDC, thereby reducing the power consumption per channel. However, the memory access time for each channel and the read-out speed add a significant dead time to the measurements. This limits the utilization of this technique in HEP experiments, where dead-time-free measurement of events along with data buffering capability is required.

## 2.4.2.3.1.6 Shift register based time stamping

In this technique [61], the hit line is injected to the pipeline of registers, synchronized to the clock. On each clock cycle, hit line is sampled and the sampled data is shifted subsequently. Thus, time-period of clock defines resolution of hit line sampling. Further, the amount of shifting needed depends on the trigger latency as relative time of sampled hits is measured with respect to time of trigger. Therefore, depth of the shift register is chosen as per the specified trigger latency.

This technique requires a high clock frequency, of the order of 1 GHz, to achieve a resolution of 1 ns during hit sampling. In addition, power consumption is high (nearly 1.2 W at the rate of 1 GHz) due to the shifting of sampled data on each cycle. This limits the application of shift register based TDCs in HEP experiments with millions of channels. To reduce the power consumption, the time memory chip (TMC) based TDC architecture [62, 63, 64] has been reported and is discussed below-

#### 2.4.2.3.1.7 Time memory chip

This technique uses a low frequency clock to achieve the same resolution for sampling the hit line (with multiple events) as in the shift register based TDC [61]. Here, the delayed replicas of the clock within one clock period are used as memory write signals to sample the hit line in low power CMOS static memories. The time interval ($T_d$) between consecutive delayed clock signals defines the resolution of sampling. A TDL is used to obtain the delayed clocks with time interval '$T_d$' as shown in Fig.2.18.



Figure 2.18: Schematic diagram of digital memory based time interval measurement

Here, reduction in power from 1.2 W for 256-bit memory [61] to 7 mW /channel for four-channel 1K memory 'TMC4004' [65] has been achieved as-

1. The required clock frequency is lowered by the length of delay line to achieve the same resolution (1ns). This leads to the increment in word size of memory in the same proportion.

2. This scheme does not involve the shifting of sampled data.
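The first point can be illustrated numerically. The sketch below assumes a hypothetical delay line of 32 taps; the tap count and the function name are illustrative and are not quoted from the TMC references.

```python
def tmc_clock(resolution_ns: float, n_taps: int):
    """Clock period (ns), clock frequency (MHz) and memory word size (bits)."""
    # One clock period is divided into n_taps sampling instants.
    period_ns = resolution_ns * n_taps
    freq_mhz = 1000 / period_ns
    # Each clock cycle writes one sample per tap, so the word widens with n_taps.
    word_bits = n_taps
    return period_ns, freq_mhz, word_bits


# Hypothetical example: 1 ns resolution with a 32-tap delay line,
# compared with the 1 GHz clock needed by a plain shift register.
print(tmc_clock(1, 32))
```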

In pipeline architectures, increasing the dynamic range of measurement requires an expansion in memory size, which is cost effective. Therefore, a low cost gate array version of TMC with separate dual port memory ($24 \times 128$) for data buffering and four channels of 32-bit memory for hit-line sampling is reported. There are some major differences between the time stamping and pipeline techniques. In the pipeline technique, the storage depth of memory is chosen as per the trigger latency, while in time stamping it is chosen as per the expected hit rates in the experiments.

In the above discussed digital techniques whether for time interval measurement or time stamping, the best attainable resolution is defined by the smallest delay provided by used delay element. The other approach to achieve high resolution is to use the sub-gate delay resolution techniques, discussed in the following section-

# 2.4.2.3.2 Sub-gate delay resolution based TDC techniques

The sub-gate delay resolution can be achieved by-

- Using high speed delay elements in TDL
- Using time interval measurement techniques that do not depend on absolute delay of the delay element. These techniques include differential delay line, pulse shrinking delay line, gate delay interpolation, and parallel delay line and time amplification based TDC architecture.

# 2.4.2.3.2.1 Use of high speed delay element in TDL

# (a) Current Mode Logic (CML) based differential delay element:

A TDL utilizing the high speed differential delay element based on current mode logic (CML) [chapter-3] is reported in [109]. It provided a time interval measurement resolution of 25.5 ps with 0.35  $\mu$ m CMOS technology, where intrinsic buffer delay is 150 ps. However, it had a high static power consumption (1 mA/delay element), making it unsuitable for HEP experiments with millions of channels and low power portable instruments.

## (b) Passive RC delay element:

The passive RC delay element can be used as a low power and fast delay element. In addition, the RC delay is stable due to its lower sensitivity to variation in temperature and supply voltage (typically 500 ppm per volt and a few tens of ppm/°C) [108]. However, it is sensitive to process variation, so only a startup calibration is needed to calibrate its delay for a given process. In spite of these advantages, utilization of these delay elements is constrained due to the following reasons-

- The propagation delay of RC-delay line is a quadratic function of number of stages as per Elmore delay equation [115]. This causes non-linearity in the tapped delays.
- The RC-delay line is not able to produce uniform delayed replicas of the applied '*clock*' or '*start*' signal due to successive attenuation in the amplitude of the clock on the taps. This is attributed to the low pass filter characteristic of the RC-delay element. The number of RC elements (N) in the TDL defines the order of the filter. The gain ($V_{out}/V_{in}$) and $-3$ dB frequency of the N-order filter reduce with its order as per equation(2.14) and equation(2.15) respectively, where $f_c$ is the calculated corner frequency. In[116], buffers are used in between the taps of a long RC-delay line in order to minimize the above discussed issues to some extent.

$$V_{out} = (1/\sqrt{2})^N \times V_{in} \tag{2.14}$$

$$f_{-3dB} = f_c \times \sqrt{2^{1/N} - 1} \tag{2.15}$$

• The transmission line [117], such as a 'co-axial cable' or a 'micro-strip' on a PCB of certain length, is used as a high speed tapped delay line. Here, the resolution is set by the length of the unit tap and is of the order of ps. The propagation delay per unit length is given by $\sqrt{L \times C}$, where 'L' and 'C' are the inductance and capacitance per unit length of the transmission line. However, the non-linearity in tapped delay, temperature, and resistive and dielectric losses affect the delay characteristics and hence the TDC precision.
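Equations (2.14) and (2.15) for the cascaded RC-delay line can be evaluated numerically to show how quickly the amplitude and bandwidth degrade with the number of stages. In this sketch the gain of equation(2.14) is interpreted as the value at the corner frequency $f_c$ of a single stage, which is an assumption since the operating frequency is not stated explicitly; the function names are hypothetical.

```python
import math


def rc_chain_gain(n_stages: int) -> float:
    """V_out / V_in of an N-stage RC line as per equation (2.14)."""
    return (1 / math.sqrt(2)) ** n_stages


def rc_chain_bandwidth(f_c_hz: float, n_stages: int) -> float:
    """-3 dB frequency of an N-stage RC line as per equation (2.15)."""
    return f_c_hz * math.sqrt(2 ** (1 / n_stages) - 1)


# Normalized corner frequency f_c = 1 for a few line lengths.
for n in (1, 2, 4, 8):
    print(n, round(rc_chain_gain(n), 4), round(rc_chain_bandwidth(1.0, n), 4))
```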

## (c) Transmission gate based delay element:

A transmission gate serves as a high speed, low power, low parasitic and area efficient delay element. However, similar to RC delay lines, cascading of transmission gates also causes attenuation of the signal amplitude and delay non-linearity due to the quadratic dependency of delay on the number of stages. In addition, due to switching inaccuracies such as charge injection, the quality of the waveform degrades at successive stages of the TDL.

## 2.4.2.3.2.2 Differential (Vernier) delay line

This technique, as reported in [118, 119, 120] uses two tapped delay lines having slightly different unit delays in order to achieve sub-gate delay resolution in time interval measurement. The inputs '*start*' and '*stop*' are applied to the '*slow*' and '*fast*' tapped delay lines having unit delays  $\tau_{st}$  and  $\tau_{sp}$  ( $\tau_{sp} < \tau_{st}$ ) respectively as shown in Fig.2.19 (a). Due to the difference in speed, on each tap, the '*stop*' signal approaches the '*start*' by the step size of  $\Delta T_d = \tau_{st} - \tau_{sp}$ , as shown in Fig.2.19 (b). Eventually, the '*stop*' signal coincides and subsequently leads the '*start*' signal. The number (n) of stages in the delay lines that are crossed by both the signals till the coincidence occurred is used to find the time interval between them. In order to determine this number, the '*stop*' signal samples and latches the status of '*start*' on each tap. Finally, the register code gives the value of 'n' in the form of a thermometer code, where the position of logic '1' to '0' transition gives the point of coincidence.



Figure 2.19: Differential delay line technique (a) block diagram including (LSB) stabilization using DLL (b) timing diagram

The conversion expression is given by equation(2.16), where the difference of unit delays ( $\tau_{st}$  and  $\tau_{sp}$ ) provides resolution of time interval measurement.

$$\Delta T = (\tau_{st} - \tau_{sp}) \times n = \Delta T_d \times n \tag{2.16}$$

The dynamic range of time interval measurement is the product of the number of delay stages (N) and the resolution ($\Delta T_d$). Here, the resolution is higher than that achieved in the single tapped delay line method. However, the differential delay line exhibits a longer conversion time ($N \times \tau_{sp}$) than the single tapped delay line based technique, which works on the principle of flash conversion. In [120], the dead time is reduced by using an asynchronous read out buffer.
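The conversion of equation(2.16) can be sketched behaviourally as below. Ideal, matched delay elements and ideal flip-flops are assumed, and the unit delays of 110 ps and 100 ps are hypothetical values chosen to give a 10 ps resolution.

```python
def vernier_convert(interval_ps: int, tau_start_ps: int, tau_stop_ps: int,
                    n_stages: int):
    """Return (n, measured interval in ps) for a differential (Vernier) delay line."""
    code = []
    for k in range(1, n_stages + 1):
        start_arrival = k * tau_start_ps               # slow line, launched at t = 0
        stop_arrival = interval_ps + k * tau_stop_ps   # fast line, launched later
        # 'stop' samples 'start' on each tap: '1' while 'start' still leads.
        code.append(1 if start_arrival <= stop_arrival else 0)
    # Position of the '1' to '0' transition in the thermometer code.
    n = code.index(0) if 0 in code else n_stages
    return n, n * (tau_start_ps - tau_stop_ps)


# Hypothetical example: 137 ps interval, 32 stages.
n, measured_ps = vernier_convert(137, tau_start_ps=110, tau_stop_ps=100, n_stages=32)
print(n, measured_ps)
print("dynamic range (ps):", 32 * (110 - 100), "conversion time (ps):", 32 * 100)
```

For these values the conversion time is ten times the dynamic range, which illustrates the penalty relative to flash conversion.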

In order to achieve stability in resolution '$\Delta T_d$' across PVT variations, a DLL architecture as shown in Fig.2.19 (a) is reported. Here, for the reference delay line, the 'start' and 'stop' signals with a time interval of one clock period are obtained on-chip using the reference clock. The number of stages in the reference delay line is chosen such that the 'start' coincides with the 'stop' at the end of the delay line. The DLL adjusts the control voltage to modify the delay of the slow delay line under PVT variations, so that 'start' and 'stop' coincide at the end of the delay line. This scheme regulates the difference of unit delays and hence the resolution of the differential delay line.

Theoretically, very fine resolution (10's of ps) can be achieved using this technique. However, with a resolution of 10's of ps over a dynamic range of 100's of ns, the delay lines are long enough (N=10,000) to have significant systematic and random errors. This may nullify the resolution gain achieved by the differential method. To reduce the requirement of long delay lines for a reasonable dynamic range, two other architectures of the differential scheme, the two dimensional differential delay line and the cyclic differential delay line, have been reported, which are discussed below-

## 2.4.2.3.2.3 Two dimensional differential delay line

The 2-D differential delay line requires fewer stages than the linear differential delay line for the same dynamic range. This reduces the power consumption as well as the integral non-linearity error prevalent in long delay lines.



Figure 2.20: 2D differential delay line technique for TDC

In the linear differential delay line, the delays are compared between the taps located in the same position. If all combinations of both delay lines are chosen, a 2D plane of time reference can be implemented as shown in Fig.2.20. Here, the range of the plane with uniform quantization defines the dynamic range of time interval measurement. For instance, for delay lines with 5 stages, the range of the linear differential delay line is from '0' to '$5\Delta T_d$', while for the 2D (5×5) Vernier plane (shaded area), the dynamic range is from '$-3\Delta T_d$' to '$9\Delta T_d$'.

Moreover, in the 2D Vernier plane, the dynamic range can be extended effectively by extending the length of only one delay line. For instance, if the number of slow delay elements is extended from 5 to 8, the modified dynamic range is from '$-3\Delta T_d$' to '$24\Delta T_d$'. Based on this concept, TDC prototypes have been reported [121] that find application as phase comparators in all digital PLLs.

## 2.4.2.3.2.4 Cyclic differential delay line



Figure 2.21: Cyclic differential delay line implemented using buffers (a) schematic diagram (b) timing diagram

The cyclic differential delay line provides re-usability of the delay stages for theoretically infinite dynamic range. Here, '*start*' and '*stop*' triggerable ring oscillators are derived from differential delay line, as shown in Fig.2.21 (a). To establish the oscillations, when '*start*' and '*stop*' complete a span of the respective delay line, their inverted outputs are coupled back to the input of delay line with the help of coupling blocks. This gives 'slow' and 'fast' clocks corresponding to 'start' and 'stop' delay line respectively as shown in Fig.2.21 (b). As delay of 'start' delay line is more than 'stop' one, so time interval between 'slow' and 'fast' clocks reduces on each rotation. Thus, if this time interval is more than the dynamic range of differential delay line, no phase coincidence takes place on its taps. So, this time interval is accounted by counting the number of cycles of 'slow' and 'fast' clocks individually with the help of coarse and fine counters respectively. This provides the expansion in the dynamic range. Eventually, after few rotations, the time interval between 'slow' and 'fast' clocks becomes less than the dynamic range of differential delay line. Subsequently, there is a phase coincidence in the differential delay line. This asserts 'eoc', which stops the counters.

At the time of coincidence, the thermometer code, equivalent to the number of stages 'n' covered in the delay line till coincidence, is produced in the register. Further, as the rising and falling edges of the oscillator clocks pass through the delay line in alternate rotations, coincidence of both the rising and falling edges is checked across the taps.

The conversion expression is given by equation(2.17), where '$N_c$' and '$N_f$' are the number of cycles provided by the coarse and fine counters respectively.

$$\Delta T = n \times (\tau_s - \tau_f) + (N_c - 1) \times T_{st} - (N_f - 1) \times T_{sp} \tag{2.17}$$

The above equation can also be written as:

$$\Delta T = n \times (T_{st} - T_{sp})/2N + (N_c - 1) \times T_{st} - (N_f - 1) \times T_{sp} \tag{2.18}$$

Where, '$\Delta T_d = (T_{st} - T_{sp})/2N$' represents the resolution of time measurement for an N-stage cyclic delay line.

However, if coincidence takes place at falling edge, the conversion expression is given by:

$$\Delta T = T_{st}/2 + n \times (T_{st} - T_{sp})/2N + (N_c - 1) \times T_{st} - (N_f - 1) \times T_{sp} - T_{sp}/2 \tag{2.19}$$

The above equation can also be written as:

$$\Delta T = n \times (\tau_s - \tau_f) + (N_c - 1/2) \times T_{st} - (N_f - 1/2) \times T_{sp} \tag{2.20}$$
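The two cases, equation(2.17) for rising edge coincidence and equation(2.20) for falling edge coincidence, differ only in the half-period offset applied to both counters. A minimal sketch is given below; it uses the relation $\tau_s - \tau_f = (T_{st} - T_{sp})/2N$ from equation(2.18), and the function name and numerical values are hypothetical.

```python
from fractions import Fraction


def cyclic_vernier_interval(n: int, n_c: int, n_f: int, t_st_ps: int, t_sp_ps: int,
                            n_stages: int, falling_edge: bool = False) -> Fraction:
    """Time interval (ps) of a cyclic differential delay line."""
    # Resolution per tap: difference of unit delays, as in equation (2.18).
    delta_t_d = Fraction(t_st_ps - t_sp_ps, 2 * n_stages)
    # Rising edge coincidence subtracts 1 cycle (2.17), falling edge 1/2 cycle (2.20).
    offset = Fraction(1, 2) if falling_edge else Fraction(1)
    return n * delta_t_d + (n_c - offset) * t_st_ps - (n_f - offset) * t_sp_ps


# Hypothetical example: 8-stage line, T_st = 3200 ps, T_sp = 3040 ps (10 ps per tap).
print(cyclic_vernier_interval(3, 4, 4, 3_200, 3_040, 8))
print(cyclic_vernier_interval(3, 4, 4, 3_200, 3_040, 8, falling_edge=True))
```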

Here, there are two conversion expressions depending on the type of edge coincidence. The other approach, which has a single conversion expression for the cyclic scheme, is used in [122]. Here, the delay lines are realized by using an odd number of 'NAND' gates, each of which functions like a basic inverter. The NAND gate is chosen instead of an inverter to provide triggerability by the inputs (start/stop) as well as to keep all delay elements identical. The oscillations are maintained by simple feedback owing to the odd number of stages. Here, there are alternate signal edge transitions on each tap. So, to detect the coincidence, two arbiters corresponding to rising and falling edge coincidence are used on each tap.



Figure 2.22: Cyclic differential delay line implemented using odd number of NAND gates

As shown in Fig.2.22, the coarse and fine counters are clocked by slow oscillator clock. The coarse counter counts full cycles ( $N_c$ ) of clock till the occurrence of '*stop*' signal. The fine counter counts the number of times ( $N_f$ ), the ring core is reused before coincidence on the taps of delay line. The outputs of arbiters are combined into thermometer code, where first logic transition gives '*eoc*' which stops fine counter and oscillators. In this scheme of coarse and fine counting, the conversion expression is independent of type of edge coincidence and is given by equation:

$$\Delta T = N_C \times T_{st} + (N_F - N_C) \times T_{st} + n \times \Delta T_d - (N_F - N_C) \times T_{sp}$$
(2.21)

#### 2.4.2.3.2.5 Parallel delay line

The parallel delay line consists of delay elements with capacitive load scaling as shown in Fig.2.23 (a). Each successive delay element is loaded with linearly scaled capacitor at node 'x'. When a rising edge of signal is applied to parallel delay line, the time required to discharge the load capacitor is proportional to the scaling factor. Therefore, the delay versus scaling factor characteristic is linear, as shown in Fig.2.23 (b).



Figure 2.23: Parallel delay line with load capacitor scaling based delay element (a) schematic diagram (b) delay vs. scaling factor characteristic

The delay $T_p$ of a delay element loaded with a capacitor of scaling factor N is given by equation (2.22), where $T_d$ is the intrinsic propagation delay and $\Delta T_d$ is the delay due to unit load (N=1C). The delay between consecutive delay elements is $\Delta T_d$ if their intrinsic propagation delays are identical.

$$T_p = T_d + N \times \Delta T_d \tag{2.22}$$

For time interval measurement, the '*start*' signal is applied to the parallel delay line to obtain its delayed replicas with time interval $\Delta T_d$. The '*stop*' signal samples and latches the status of the delayed replicas of start to determine their number just before '*stop*'. The number thus obtained, multiplied by $\Delta T_d$, gives the time interval measurement.
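The sampling scheme described above can be sketched behaviourally as follows. This is a minimal illustration, not the reported circuit: the function names, the numerical values and the assumption that '*stop*' is delayed by the intrinsic delay $T_d$ (so that the common offset cancels) are all hypothetical.

```python
def tap_delays(n_taps, t_d, delta_t_d):
    """Delay of each parallel element per equation (2.22): T_p = T_d + N * dT_d."""
    return [t_d + n * delta_t_d for n in range(1, n_taps + 1)]


def measure_interval(interval, n_taps, t_d, delta_t_d):
    """Count the delayed replicas of 'start' that have arrived when 'stop' samples them.

    Assumption of this sketch: 'stop' is delayed by the intrinsic delay t_d,
    so the offset common to all taps cancels.
    """
    stop_time = interval + t_d
    thermometer = [1 if d <= stop_time else 0
                   for d in tap_delays(n_taps, t_d, delta_t_d)]
    count = sum(thermometer)
    return count, count * delta_t_d


def capacitor_units(n_taps):
    """Total unit capacitors for linearly scaled loads 1C, 2C, ..., N*C."""
    return n_taps * (n_taps + 1) // 2


# Hypothetical values in picoseconds.
count, estimate = measure_interval(interval=130, n_taps=16, t_d=100, delta_t_d=20)
print(count, estimate, capacitor_units(16))
```

With these values, six replicas arrive before '*stop*', so the 130 ps interval is reported as 120 ps, within one $\Delta T_d$ of the true value. The last function shows how quickly the capacitor count grows with the number of stages.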

The salient features of this technique are small conversion time and sub-gate delay ( $\Delta T_d$ ) resolution in time interval measurement. In addition, the measurement has less non-linearity error, as the DNL error does not accumulate to create a high INL as it does in a cascaded delay line.

On the other hand, this technique has large current spikes, as all stages change their state at the same time. In addition, the dynamic range is limited by the number of capacitors, $N \times (N + 1)/2$, required for N stages. With the limitations of small dynamic range and high power, this technique is suitable as a fine time interpolator [123].

The key design challenges are reducing the fan-out delay of the '*start*' signal and managing the layout complexity with regard to matching of the intrinsic delays of delay elements and matching of capacitors.

#### 2.4.2.3.2.6 Hierarchical time interval measurement techniques (reduction in number of delay elements)

Hierarchical techniques (two and three stage interpolation and nested DLL) use a hierarchy of interpolators to reduce the length of the delay line while achieving high resolution. This leads to a reduction in non-linearity error as well as in power and area consumption, owing to the smaller number of delay elements. These techniques are discussed in the following sections.

#### 2.4.2.3.2.6 (a) Two stage interpolation

In two-stage interpolation [124, 125, 126], the coarse interpolator is implemented by a tapped delay line that provides delayed replicas of the reference clock over a measurement range of one clock period. It measures the time intervals $T_{st1}$ and $T_{sp1}$, each of which extends from the delayed clock edge next to the start/stop signal to the next rising edge of the reference clock, as shown in Fig.2.24. The resolution ( $T_d$ ) of the coarse interpolator is determined by the gate propagation delay.



Figure 2.24: Timing diagram for hierarchical technique

The sub-gate delay measurement techniques are used as the fine interpolator over a range that is equal to the resolution of the coarse interpolator. Two fine interpolators (corresponding to start and stop) working in parallel are used to measure the time intervals $T_{st2}$ and $T_{sp2}$. These are the time intervals from start to its adjacent delayed clock and from stop to its adjacent delayed clock respectively.

The measurement errors due to metastability in the coarse register are avoided by using fine synchronizers for the '*start*' and '*stop*' signals at the second interpolation level. This gives the coarse register sufficient time to reach a stable state from metastability. The scheme of synchronization is based on generating a signal '*syn*' corresponding to the delayed clock adjacent to the input (start/stop) with a synchronization delay. This delay is a multiple of the coarse interpolator resolution. The input (start/stop) is also delayed by the same amount to obtain the '*asyn*' signal, thereby keeping the time intervals $T_{st2}$ and $T_{sp2}$ intact. The conversion expression is given by:

$$\Delta T = T_{ctrl} + T_{st1} + T_{st2} - (T_{sp2} + T_{sp1})$$
(2.23)

In the reported architectures [125, 126], to provide the immunity to LSBs of coarse interpolator and fine interpolator across PVT variations, different schemes of DLL have been used.

In [125], a dual loop dual DLL is used, as shown in Fig.2.25 (a). It consists of slow and fast DLLs having identical delay elements. The fast DLL locks the delay of the N-stage delay line to the time period of the clock as per equation (2.24). In the slow DLL, the delay up to the $N^{th}$ delay element is matched with the delay up to the $(N + 1)^{th}$ delay element of the fast DLL as per equation (2.25). By virtue of the loop correction mechanism, the control voltages $V_{ctrlf}$ and $V_{ctrls}$ are different, providing a smaller unit delay $\tau_f$ in the fast DLL than $\tau_s$ in the slow DLL. The delay $\tau_f$ is the LSB of the coarse interpolator and $\Delta T_d$ is that of the fine interpolator. The required number of stages 'N' for a clock period $T_{ref}$ and resolution $\Delta T_d$ is calculated from equation (2.26).

$$\tau_f = \frac{T_{ref}}{N} \tag{2.24}$$

$$N \times \tau_s = (N+1) \times \tau_f \tag{2.25}$$

$$\Delta T_d = \frac{T_{ref}}{N^2} \tag{2.26}$$
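Equation (2.26) follows from equations (2.24) and (2.25), since $\tau_s - \tau_f = \tau_f / N$. The short check below illustrates this numerically. It assumes that the fine LSB is the difference of the two unit delays; the function name and the example values are hypothetical.

```python
from fractions import Fraction


def dual_dll_delays(t_ref, n):
    """Unit delays implied by equations (2.24)-(2.26) for an N-stage dual DLL."""
    tau_f = Fraction(t_ref, n)            # eq. (2.24): fast DLL spans one clock period
    tau_s = Fraction(n + 1, n) * tau_f    # eq. (2.25): N * tau_s = (N + 1) * tau_f
    delta_t_d = tau_s - tau_f             # assumed fine LSB: difference of unit delays
    return tau_f, tau_s, delta_t_d


# Hypothetical example: 6.4 ns reference period (in ps) and N = 16 stages.
tau_f, tau_s, lsb = dual_dll_delays(6400, 16)
print(tau_f, tau_s, lsb)
```

For these values the unit delays are 400 ps and 425 ps, and their 25 ps difference equals $T_{ref}/N^2$.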

In [126], a nested DLL scheme has been used, as shown in Fig.2.25 (b), where the coarse DLL locks the delay of the N-stage delay line to one clock period. The fine DLL locks the $M^{th}$ output delay of the parallel delay line to two bins, each of size $T_{ref}/N$, of the coarse DLL. The LSB of the fine interpolator is $2 \times T_{ref}/(M \times N)$.



Figure 2.25: Block diagram of (a) dual loop dual DLL (b) nested DLL scheme

#### 2.4.2.3.2.6 (b) Three stage interpolation using reference recycling with low frequency clock (high precision time interval measurement)

Low frequency reference clock due to its stability and small power requirement, improves the performance of TDCs. On the other hand, it requires large number of delay elements for delay line based fine interpolators. To further reduce the number of delay elements, reference recycling based DLL [127, 128, 129] architecture with three stage interpolation has been reported. It reduces the length of delay line with low frequency clock while maintaining the high resolution. This results in reduction of non-linearity error, which improves precision of TDC.

The scheme is based on recycling the reference clock edge several times in the delay line until one reference clock period is covered. After that, a new reference edge is applied to the delay line. The number of recycling rounds of the N-stage delay line required to complete one reference clock period is defined by the recycling factor 'R'.



Figure 2.26: Cyclic DLL based coarse interpolator (a) block diagram (b) timing diagram

The re-cycling in the delay line is carried out by using differential multiplexer based delay element [129] having a selection line (Sel), as shown in Fig.2.26 (a). Initially, clock is selected to enter into delay line. When clock completes a span of delay line, feedback loop is selected by cross coupling between last element and second channel of first element to enable recycling. The selection line of delay element is controlled by recycling counter, which toggles when count is equal to re-cycling factor (R). It further injects the jitter free clock into the delay line. The first element uses both the channels and others use single channel to avoid delay variation between first and last delay element in the loop.

The salient feature of this delay element is that it provides two outputs, such as $d_0$ and $d_4$, with different phases and an identical time period of $2 \times N \times T_d$, as shown in Fig.2.26 (b). This reduces the power consumption and doubles the effective length of the delay line.

The time interval measurement is divided into three parts: coarse count, coarse interpolation and fine interpolation. The phase $d_0$, which interpolates the reference clock by '2', is applied to the coarse counter. It counts the number of cycles between '*start*' and '*stop*'. In the coarse interpolator, '*start*' and '*stop*' latch the state of the intermediate clocks $d_0$ to $d_7$ with resolution $T_d = T_{ref}/(2 \times (2 \times N \times R))$.

In the fine interpolator, the parallel delay line with capacitor load scaling (section 2.4.2.3.2.5) is used. Here, the fine resolution $\Delta T_d$ is the $M^{th}$ part of $T_d$. Therefore, a minimum of 'M' delay elements is needed. To further reduce the number of delay elements, a 2-D arrangement of delay elements is used. As shown in Fig.2.27, one delay line contains 2 delay elements and another contains 5. Thus, 10 interpolation slots with resolution $\Delta T_d$ can be obtained using 7 delay elements. So, the final achieved interpolation of the reference clock is $T_{ref}/4NRM$.



Figure 2.27: Timing diagram of 2-D fine interpolator in three stage interpolation

The performance comparison among the hierarchical architectures is given in Table 2.1.

| Parameters          | Time Stamping | TDC based on nested DLL | TDC with two level conversion | TDC based on recycling DLL |
|---------------------|---------------|-------------------------|-------------------------------|----------------------------|
| CMOS process        | 1 µm          | 0.8 µm                  | 0.35 µm                       | 0.35 µm                    |
| Clock frequency     | 40 MHz        | 85 MHz                  | 130-160 MHz                   | 5 MHz                      |
| LSB                 | 1.56 ns       | 92 ps                   | 24-30 ps                      | 12.2 ps                    |
| Power               | 10 mW/ch      | 100 mW                  | <50 mW                        | 40 mW                      |
| Area                | 25 $mm^2$     | 3.1 × 2.2 $mm^2$        | -                             | 2.5 × 3.0 $mm^2$           |
| Delay elements      | 24            | 32                      | 16                            | 20                         |
| Precision           | 100 ps        | 50 ps                   | -                             | 8.1 ps                     |
| Interpolation ratio | 16            | 128                     | 256                           | 16384                      |

Table 2.1: Performance comparison among hierarchical architectures
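The LSB entries of Table 2.1 are consistent with dividing one reference clock period by the interpolation ratio. The sketch below reproduces this check. The function name is illustrative, and the relation LSB = $T_{ref}$ / (interpolation ratio) is assumed to hold for every column.

```python
def lsb_ps(clock_mhz, interpolation_ratio):
    """LSB in picoseconds: one reference clock period divided by the interpolation ratio."""
    period_ps = 1e6 / clock_mhz
    return period_ps / interpolation_ratio


# (clock frequency in MHz, interpolation ratio) as listed in Table 2.1
architectures = {
    "time stamping": (40, 16),
    "nested DLL": (85, 128),
    "two level conversion (130 MHz)": (130, 256),
    "two level conversion (160 MHz)": (160, 256),
    "recycling DLL": (5, 16384),
}

for name, (f_mhz, ratio) in architectures.items():
    print(f"{name}: {lsb_ps(f_mhz, ratio):.1f} ps")
```

The computed values agree with the tabulated LSBs after rounding, for example about 12.2 ps for the 5 MHz recycling DLL with an interpolation ratio of 16384.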

#### 2.4.2.3.2.7 Pulse shrinking delay line (PSDL)

In this technique [130, 131] for time interval measurement, the shrinking in the pulse length is monitored until it vanishes. To apply shrinking to the pulse, a current starved buffer is used as the pulse-shrinking element. The shrinking is introduced by adjusting the amount of current with the help of a bias voltage during the rising edge input transition. So, the difference of the rising ( $\tau_r$ ) and falling ( $\tau_f$ ) edge delays is equal to the amount of shrinking and defines the LSB of time interval measurement. The pulse with length equivalent to the time interval is applied to a cascaded delay line of pulse shrinking elements. A latch on each tap is set to logic '1' by the presence of the pulse. Fig.2.28 shows the timing diagram, where the pulse vanishes at the $n^{th}$ stage, so that the output of the corresponding latch is at logic '0'. The position 'n' of the first logic '1'-to-logic '0' transition in the output thermometer code, multiplied by the LSB $\Delta T_d$, gives the time interval.

A high resolution of time interval measurement can be achieved by controlling the amount of shrinking. On the other hand, the design value of shrinking is sensitive to PVT variations. So, the amount of shrinking needs to be calibrated frequently to ensure the correctness of the measured time. To calibrate this, a different architecture of DLL [130, 131] is used, where a reference pulse of known duration $(W_{ref})$ passes through the N-stage reference PSDL. The DLL adjusts the control voltage so that the pulse vanishes at the end of the delay line. The calibrated amount of shrinking is given by:

$$\Delta T_d = \frac{W_{ref}}{N} \tag{2.27}$$

The conversion expression of pulse shrinking delay line is given by equation(2.28), where 'n' is the number of stages carrying the pulse in the measurement chain.



$$\Delta T = n \times \frac{W_{ref}}{N} \tag{2.28}$$

Figure 2.28: Timing diagram of pulse shrinking delay line technique
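The conversion of equations (2.27) and (2.28) can be sketched with a simple behavioural model. The function name and the numerical values are hypothetical, and every stage is assumed to shrink the pulse by exactly the calibrated LSB (no mismatch).

```python
def psdl_measure(pulse_width, w_ref, n_ref_stages, n_stages):
    """Behavioural model of a linear pulse-shrinking delay line.

    Each stage shrinks the pulse by the calibrated LSB of equation (2.27).
    Returns the thermometer code of the tap latches and the estimate of
    equation (2.28).
    """
    lsb = w_ref / n_ref_stages                 # eq. (2.27)
    latches = []
    width = pulse_width
    for _ in range(n_stages):
        width -= lsb                           # shrinking applied by this stage
        latches.append(1 if width > 0 else 0)  # tap latch is set only if a pulse survives
    n = sum(latches)                           # stages still carrying the pulse
    return latches, n * lsb                    # eq. (2.28)


# Hypothetical values in ps: a 1600 ps reference pulse over 32 stages gives a 50 ps LSB.
code, estimate = psdl_measure(pulse_width=420, w_ref=1600, n_ref_stages=32, n_stages=16)
print(code, estimate)
```

In this example the pulse survives eight stages, so the 420 ps input is reported as 400 ps, within one LSB of the true width.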

In the linear PSDL, the non-linearity error due to the mismatch in the amount of shrinking increases with the number of stages. This limits the dynamic range of PSDL in view of accuracy in time interval measurement[130].

To reduce this non-linearity, the cyclic PSDL has been reported [132, 133], as shown in Fig.2.29 (a). Here, the number of rotations of the pulse through the cyclic PSDL until it vanishes is used for time interval measurement. The non-linearity error reduces, as only one element is used for shrinking. Even if more than one shrinking element is used in the delay line, non-linearity error is still not encountered in the cycle-to-cycle counting of pulse rotations. The LSB of time interval measurement is calibrated by counting the rotations of a reference pulse of known duration through the cyclic PSDL until it vanishes. To set a high resolution, the amount of shrinking is adjusted by an external reference voltage, so that the pulse covers a large number of rotations before it vanishes. In the measurement phase, the calibrated LSB multiplied by the number of pulse rotations gives the time interval measurement.



Figure 2.29: Cyclic pulse shrinking delay line (a) block diagram (b) schematic diagram with temperature compensation

To avoid bias adjustment for voltage controlled shrinking, a fully digital pulse shrinking delay line has been reported [133]. Here, the shrinking is obtained by passing the pulse through two buffers that have different driving strengths. This leads to different durations of pulses at the outputs of the buffers, so their logical AND gives a final pulse that is shrunk in length. The degree of shrinking is controlled by adjusting the aspect ratio of the buffers. The other way is to utilize the inhomogeneous dimensions of the coupling stage for shrinking, as it differs from the other inverters in the delay line. When the pulse passes through the interface of the delay line and the coupling stage, it encounters different rising and falling edge delays. To ensure shrinking, the aspect ratios of the coupling stage and inverters are adjusted so that the rising edge delays are more than the falling edge delays. This digital pulse shrinking TDC achieves 20 ps LSB and 18 $\mu$s dynamic range. However, it has a variation in LSB of ±25% over a temperature range from 0 °C to 100 °C.

The variation in shrinking over temperature is due to the temperature sensitivity of threshold voltage ( $V_T$ ) and mobility ( $\mu$ ) in the MOS transistor. Both parameters reduce with increase in temperature as per equation (2.29) and equation (2.30) respectively.

$$V_T(T) = V_T(T_0) + \alpha \times (T - T_0) \tag{2.29}$$

$$\mu(T) = \mu(T_0)(T/T_0)^{K_m} \tag{2.30}$$

where $K_m$ is in the range of -1.2 to -2.0, $T_0$ is the reference temperature (e.g., 300 K), T is the absolute temperature of the environment and $\alpha$ is in the range of -0.5 mV/K to -3.0 mV/K.
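To give a feel for the magnitudes involved, the sketch below evaluates equations (2.29) and (2.30) over the 0 °C to 100 °C range. The device values are hypothetical, and $\alpha$ is taken as negative so that the threshold voltage falls with temperature, as stated in the text.

```python
def threshold_voltage(t, v_t0, alpha, t0=300.0):
    """Equation (2.29): linear temperature dependence of the threshold voltage."""
    return v_t0 + alpha * (t - t0)


def mobility_ratio(t, k_m, t0=300.0):
    """Equation (2.30) normalised to the reference temperature: mu(T) / mu(T0)."""
    return (t / t0) ** k_m


# Hypothetical device: V_T(T0) = 0.5 V, alpha = -1 mV/K, K_m = -1.5.
for temp_c in (0, 27, 100):
    t = temp_c + 273.15
    print(temp_c,
          round(threshold_voltage(t, 0.5, -1e-3), 4),
          round(mobility_ratio(t, -1.5), 3))
```

Both quantities decrease as the temperature rises, and the two changes act on the drain current in opposite directions.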

The variations in the above factors influence the drain current in opposite directions. The mobility effect dominates, because of its exponential nature, if the bias voltage of the transistor is much greater than the threshold voltage. Otherwise, the threshold voltage effect dominates (refer chapter-3). Using this feature, a temperature compensated cyclic pulse shrinking delay line based TDC is reported [134], as shown in Fig.2.29 (b). Here, the bias voltages ( $V_n$ and $V_p$ ) compensate the mobility loss of the logic inverter. These voltages are provided by a bias circuit, where transistor $P_3$ acts as the load of the two diode-connected (saturated) transistors $P_1$ and $N_1$. As temperature increases, the threshold voltages of $P_1$ and $N_1$ decrease, which results in an increase of the conduction current of $P_3$. The current mirrors composed of ( $P_1$, $P_2$ ) and ( $N_1$, $N_2$ ) transfer the conduction current of $P_3$, which has a positive temperature coefficient, to the inverter, thus reducing the variation in delays.

In this temperature compensated cyclic delay line, the variation in LSB reduces from 25% to 6% over a temperature variation from 0 °C to 100 °C. This scheme is also used to implement a TDC based temperature sensor [84].

#### 2.4.2.3.2.8 Interpolation of bin size of DLL by array of delay lock loop (ADLL)

In time stamping, the bin size depends on the smallest delay attainable from the delay element. The current starved inverter based delay element is frequently used in time stamping due to its salient features of low power and wide delay regulation range. However, the smallest delay is limited by the parasitics of the CMOS process node. As an alternative, a high speed differential delay element with large static current has been used in [111]. However, in portable applications and HEP experiments, power efficiency is desirable. Therefore, a scheme to interpolate the smallest delay of the current starved inverter (bin size of DLL) has been reported in [135].

This scheme relies on using an array of several uniformly offset DLLs, as shown in Fig.2.30(a). To generate the precise offsets, a phase shifting DLL (PSDLL) with M delay elements is used. The unit delay ( $T_M$ ) of the PSDLL defines the required offset as per equation (2.31). It is chosen to be more than the process limited delay $(T_N)$ by some fraction (F) to obtain delay interpolation.

$$T_M = T_N + F^{-1} \times T_N \quad (F > 1) \tag{2.31}$$

The value of 'F' defines the number of offset DLLs. The number of stages, 'N' in each offset DLL and 'M' in the PSDLL, is chosen as per equation (2.32) and equation (2.33) respectively, so that the delay line in the PSDLL and in each offset DLL is locked to one reference clock period.

$$N = \frac{T_{ref}}{T_N} \tag{2.32}$$

$$M = \frac{T_{ref}}{T_M} \tag{2.33}$$

The PSDLL provides the reference clocks to the offset DLLs, each having an initial phase difference of $T_M$ from the previous one, so that the interpolated bin size in the offset DLLs is given by the equation:

$$\Delta T_d = T_M - T_N = T_N / F \tag{2.34}$$

However, the sequencing of consecutive bins with uniform time interval of $\Delta T_d$ is important before its utilization in time stamping. To find this sequence, the edge representation of delayed clocks on each tap of the offset delay lines is illustrated in Fig.2.30(b). The notations (a, b, ..., f) listed in Table 2.2 represent the position of delay elements. In the second reference clock cycle, a sequence of consecutive bins with uniform time interval of $\Delta T_d$ can be achieved. The number of bins is given as $N \times F$.
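The sequencing can be illustrated with a small model that computes the edge time of every tap modulo one reference clock period and sorts the taps by that time. This is a sketch under stated assumptions: the function name and the value of $T_N$ are hypothetical, and offset DLL $i$ is assumed to receive a reference delayed by $i \times T_M$.

```python
def adll_bins(t_n, f, n):
    """Edge times (mod one clock period) of every tap in an array of offset DLLs.

    t_n : process-limited unit delay of the offset DLLs (integer, e.g. ps)
    f   : interpolation factor F, also the number of offset DLLs
    n   : stages per offset DLL, so that T_ref = n * t_n (eq. 2.32)
    """
    t_ref = n * t_n
    t_m = t_n + t_n // f                  # eq. (2.31); t_n is assumed divisible by f
    edges = {}
    for i in range(f):                    # offset DLL i starts i * T_M late
        for j in range(n):                # tap j adds j * T_N
            edges[(i, j)] = (i * t_m + j * t_n) % t_ref
    return sorted(edges.items(), key=lambda item: item[1])


# 4 x 5 array as in Fig. 2.30(b); T_N = 100 ps is a hypothetical value.
ordered = adll_bins(t_n=100, f=4, n=5)
times = [t for _, t in ordered]
print(times)
```

For this 4 × 5 example the 20 taps fall on a uniform 25 ps grid, which corresponds to $\Delta T_d = T_N / F$ and $N \times F$ bins per clock period.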

Table 2.2: Notation for position of delay elements in ADLL

| Notation           | Position of delay element |
|--------------------|---------------------------|
| [$a$ to $e$]       | $M_0$ [$N_1$ to $N_5$]    |
| [$a_1$ to $f_1$]   | $M_1$ [$N_1$ to $N_5$]    |
| [$a_2$ to $f_2$]   | $M_2$ [$N_1$ to $N_5$]    |
| [$a_3$ to $f_3$]   | $M_3$ [$N_1$ to $N_5$]    |

The drawbacks of ADLL are power and area inefficiency as well as jitter accumulation in the offset DLLs. By sharing the ADLL among a number of time stamping channels, its power and area efficiency can be improved. A single hit TDC prototype using ADLL for sub-gate delay resolution has been reported in [136].



Figure 2.30: (a) Block diagram of ADLL based time interval measurement (b) edge representation of delayed clocks for 2-D matrix ( $4 \times 5$ ) of delay elements in ADLL

#### 2.4.2.3.2.9 Interpolation of bin size of DLL by RC-delay line

The gate delay interpolation using ADLL technique requires a number of closed feedback loops in a complex topology. This invariably leads to high power and area consumption.

Alternatively, a time stamping scheme to interpolate the process limited bin size of DLL relies on acquiring 'M' samples of the status of DLL on arrival of '*hit*' signal. This fractional sampling is obtained by delaying the hit signal using tapped RC delay line. It provides 'M' delayed versions of '*hit*' signal with time interval ( $\Delta T_d$ ) that is  $M^{th}$  part of bin size ( $T_d$ ). The DLL bin size interpolation is therefore achieved by determining the number of delayed hit signals (m) that shift the DLL status to next bin.



Figure 2.31: Time stamping using DLL and RC delay line with sub-gate delay resolution

As shown in Fig.2.31, the '*hit*' signal ( $S_0$ ) occurs in between the $N^{th}$ and $(N+1)^{th}$ delayed clocks. From the samples of the status of delayed clocks, the logic '0' to '1' transition is in the $N^{th}$ bin. This provides the arrival time of hit, $N \times T_d$, with bin size $T_d$ (process limited delay) of the DLL. Further, the delayed sample of hit corresponding to $S_4$ exists in the $(N + 1)^{th}$ bin, so m=4. This gives the fractional time of '*hit*' within the $N^{th}$ bin as:

$$\text{fractional time of hit in } N^{th} \text{ bin} = T_d - \frac{(T_d \times m)}{M} \tag{2.35}$$

So, the occurrence time (T) of hit over a range of one clock period, with sub-gate delay resolution of $\Delta T_d = T_{ref}/(M \times N)$, is given as:

$$T = N \times T_d + \frac{(T_d \times (M - m))}{M} \quad (0 \le m \le M) \tag{2.36}$$
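A minimal sketch of this interpolation is given below. It assumes integer times (e.g. picoseconds), an ideal RC delay line with uniform tap spacing $T_d/M$ and delayed copies indexed $S_0$ to $S_M$; the function name and the values are hypothetical.

```python
def rc_interpolated_time(hit_time, t_d, m_taps):
    """Time stamp of a hit using a DLL bin plus an M-tap RC delay line.

    The hit and its delayed copies S_0..S_M each sample the DLL status; m is the
    index of the first copy that lands in the next bin (equation 2.36).
    All times are integers; the tap spacing is t_d / m_taps.
    """
    n = hit_time // t_d                            # DLL bin seen by the undelayed hit S_0
    # Copy k arrives at hit_time + k * t_d / m_taps; compare in integer arithmetic.
    m = next(k for k in range(m_taps + 1)
             if (hit_time * m_taps + k * t_d) // (t_d * m_taps) > n)
    return n, m, n * t_d + t_d * (m_taps - m) / m_taps


# Hypothetical values in picoseconds: 800 ps bin, M = 8 taps, hit at 2050 ps.
n, m, estimate = rc_interpolated_time(hit_time=2050, t_d=800, m_taps=8)
print(n, m, estimate)
```

Here the hit lies in bin 2 and the fourth delayed copy is the first to cross into the next bin, so the hit is placed at 2000 ps, within one $\Delta T_d$ (100 ps) of its true arrival time.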

Using this scheme, a TDC prototype with a single DLL and a passive RC-delay line, with the same potential of interpolation as achieved from ADLL, has been reported [137, 138].

#### 2.4.2.3.2.10 Interpolation in bin size of DLL by multiple period locking

The operating principle of conventional DLL architecture is based on one clock period locking [108]. If the delay of voltage controlled delay line with N-stages is locked to multiple 'M' clock periods as per equation(2.37), the interpolation in the bin size of DLL can be achieved with the condition that numbers 'M' and 'N' are co-prime to each other.

$$M \times T_{ref} = N \times T_d \tag{2.37}$$

$$\Delta T_d = T_d / M \tag{2.38}$$

The interpolated delay is given by equation (2.38). For instance, for N=32, M=3, $T_{ref} = 8$ ns and $T_d = 750$ ps, the value of $\Delta T_d$ is 250 ps. The tap position 'X' in the chain of delay elements to achieve the sequence of uniformly delayed clocks within a reference clock period (after three clock cycles) is listed in Table 2.3. This architecture has been used in the development of a multi-hit TDC with resolution of 50 picoseconds [139] using a 0.8 $\mu m$ CMOS process.

| X  | ΔT (ps) | X  | ΔT (ps) | X  | ΔT (ps) | X  | ΔT (ps) |
|----|---------|----|---------|----|---------|----|---------|
| 0  | 0       | 24 | 2000    | 16 | 4000    | 8  | 6000    |
| 11 | 250     | 3  | 2250    | 27 | 4250    | 19 | 6250    |
| 22 | 500     | 14 | 2500    | 6  | 4500    | 30 | 6500    |
| 1  | 750     | 25 | 2750    | 17 | 4750    | 9  | 6750    |
| 12 | 1000    | 4  | 3000    | 28 | 5000    | 20 | 7000    |
| 23 | 1250    | 15 | 3250    | 7  | 5250    | 31 | 7250    |
| 2  | 1500    | 26 | 3500    | 18 | 5500    | 10 | 7500    |
| 13 | 1750    | 5  | 3750    | 29 | 5750    | 21 | 7750    |

Table 2.3: Sequence of uniformly delayed clocks with respect to position (X) of delay element
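The tap ordering of Table 2.3 can be reproduced by taking the delay of each tap modulo the reference clock period and sorting. The following sketch assumes integer times in picoseconds; the function name is illustrative.

```python
from math import gcd


def tap_sequence(n_stages, m_periods, t_ref):
    """Order the DLL taps by their phase within one reference clock period.

    The N-stage line is locked to M clock periods (eq. 2.37), so tap X sits at
    X * T_d modulo T_ref. Times are integers (e.g. picoseconds).
    """
    if gcd(n_stages, m_periods) != 1:
        raise ValueError("N and M must be co-prime")
    t_d = m_periods * t_ref // n_stages                 # eq. (2.37)
    phases = {x: (x * t_d) % t_ref for x in range(n_stages)}
    return sorted(phases.items(), key=lambda item: item[1])


sequence = tap_sequence(n_stages=32, m_periods=3, t_ref=8000)
print(sequence[:8])
# -> [(0, 0), (11, 250), (22, 500), (1, 750), (12, 1000), (23, 1250), (2, 1500), (13, 1750)]
```

The co-prime condition guarantees that all 32 taps land on distinct phases, which here form a uniform 250 ps grid across the 8 ns clock period.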

#### 2.4.2.3.2.11 DLL based time amplification

In this technique, the input time interval is amplified by a predefined integral factor 'N'. This amplified time interval is then measured using a relatively low resolution TDC scheme, such as a FLASH TDC using a TDL. The measured value, when divided by N, gives the initial time interval between '*start*' and '*stop*', thereby exhibiting sub-gate delay resolution.



Figure 2.32: Block diagram of time amplifier using DLL

To implement this, a DLL based time amplifier (TA) is reported in [140], as shown in Fig.2.32. In this scheme, the '*start*' and '*stop*' signals trigger two identical ring oscillators providing clocks in order to allow the DLL to maintain the loop in locked condition by iterative corrections. The respective rising edges of these clocks have a delay equal to the applied time interval. These clocks are then applied to the delay elements $D_0$ and $C_0$, which are identical in structure and sizing. The time interval between the rising edges of the delayed outputs is compared by the PD. The feedback loop adjusts the delay of $D_0$ using the control voltage to obtain output phase coincidence. At coincidence, the difference in the delays of $C_0$ and $D_0$ is equal to the input time interval $\Delta T$, expressed by equation (2.39). To produce amplification, these delay cells are replicated 'N' times, so that the output delay difference is equal to 'N' times the input time interval. The amplified output is given by equation (2.40).

$$\Delta T = (T_{C0} - T_{D0}) \tag{2.39}$$

$$\text{output time interval} = N \times \Delta T \tag{2.40}$$

The dynamic range of the time amplifier is limited, as it depends on the delay regulation range of the voltage controlled delay element $D_0$. Therefore, it can be further improved by increasing the number of delay elements in the feedback loop.

Additionally, multiplexer has been used in [141] in order to digitally program the amplification to any integral value ranging from '1' to 'N'.

The amplified time interval is measured by low resolution TDC such as tapped delay line. This measured value when divided by 'N', gives the initial time interval between '*start*' and '*stop*'. Thus, this method provides sub-gate delay resolution with limited dynamic range. It is preferred for very short time interval measurement to characterize the rise time, fall time and skew degradation in high speed circuits.
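The effect of amplification on the resolution can be sketched as follows. The model assumes an ideal amplifier with gain N followed by a coarse TDC that truncates to its LSB; the function name and the values are hypothetical.

```python
def amplified_measurement(interval, gain, coarse_lsb):
    """Measure a short interval by amplifying it N times before a coarse TDC.

    Returns the coarse TDC code and the interval referred back to the input.
    """
    amplified = gain * interval                  # eq. (2.40)
    code = int(amplified // coarse_lsb)          # quantisation by the low-resolution TDC
    return code, code * coarse_lsb / gain        # effective LSB is coarse_lsb / gain


# Hypothetical values in picoseconds: 100 ps tapped delay line, gain N = 8.
code, estimate = amplified_measurement(interval=37, gain=8, coarse_lsb=100)
print(code, estimate)
```

With a 100 ps coarse LSB and a gain of 8, the effective LSB referred to the input is 12.5 ps, so the 37 ps interval is reported as 25 ps.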

#### 2.4.2.4 Miscellaneous Techniques

The techniques that do not incorporate a TDL as the fine interpolator are classified as miscellaneous and are discussed below.

#### 2.4.2.4.1 Re-triggerable ring oscillator based Vernier technique

The Vernier technique utilizes two oscillators, '*slow*' and '*fast*', with a slight difference in their time periods for time interval measurement [142, 143, 144, 145, 146, 147, 148]. The slow and fast oscillators start oscillating with time periods $T_{oscst}$ and $T_{oscsp}$ ( $T_{oscsp} < T_{oscst}$ ) on the arrival of the '*start*' and '*stop*' inputs respectively, as shown in Fig.2.33. As $T_{oscsp}$ is slightly less than $T_{oscst}$, on each cycle the fast oscillator approaches the slow one by the step size of $\Delta T_d = T_{oscst} - T_{oscsp}$, which defines the resolution of time interval measurement. Eventually, the fast oscillator coincides with or leads the slow oscillator. The phase coincidence or leading is detected by the phase detector, which toggles 'end of conversion' (eoc) and stops the ring oscillators.

The elapsed cycles of both oscillators till phase coincidence are counted by two counters. The coarse counter counts the number $N_c$ of slow oscillator cycles till phase coincidence. It provides a long measurement range of $(2^b - 1) \times T_{oscst}$, where b is the number of bits of the coarse counter. The fine counter counts the cycles $N_f$ of the fast oscillator, corresponding to the number of $\Delta T_d$ steps within $T_1$, till phase coincidence. The conversion equation of the time interval under measurement is given by:

$$\Delta T = (N_c - 1) \times T_{oscst} - (N_f - 1) \times T_{oscsp}$$
(2.41)

The above equation can be rearranged as-

$$\Delta T = (N_c - N_f) \times T_{oscst} + (N_f - 1) \times \Delta T_d$$
(2.42)



Figure 2.33: Vernier technique (a) block diagram (b) timing diagram

where $(N_c - N_f)T_{oscst}$ gives the coarse and $(N_f - 1)\Delta T_d$ gives the fine time measurement.

The crucial design aspect is the number of bits in the fine counter. It is chosen in order to count at least the maximum number of $\Delta T_d$ steps within $T_{oscst}$, given by the stretching factor 'S':

$$S = \frac{T_{oscst}}{T_{oscst} - T_{oscsp}} = \frac{T_{oscst}}{\Delta T_d}$$
(2.43)
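The counting process and the conversion expression of equation (2.41) can be checked with a small event-level model. The sketch uses integer time units and assumes that each counter includes the first edge of its oscillator, a convention chosen here to match the $(N_c - 1)$ and $(N_f - 1)$ terms; the function names and values are hypothetical.

```python
def vernier_counts(delta_t, t_slow, t_fast):
    """Event-level model of the two-oscillator Vernier TDC (integer time units).

    The slow oscillator starts at 'start' (t = 0) and the fast one at 'stop'
    (t = delta_t). Each counter includes the first edge of its oscillator.
    Requires t_fast < t_slow, otherwise the edges never coincide.
    """
    n_c = delta_t // t_slow + 1          # slow edges up to the arrival of 'stop'
    n_f = 1                              # first fast edge coincides with 'stop'
    slow_edge = (n_c - 1) * t_slow
    fast_edge = delta_t
    while fast_edge > slow_edge:         # fast edge still lags the slow edge
        slow_edge += t_slow
        fast_edge += t_fast
        n_c += 1
        n_f += 1
    return n_c, n_f


def vernier_interval(n_c, n_f, t_slow, t_fast):
    """Conversion expression of equation (2.41)."""
    return (n_c - 1) * t_slow - (n_f - 1) * t_fast


# Hypothetical periods in picoseconds: 1000 ps and 990 ps give a 10 ps step.
n_c, n_f = vernier_counts(2345, 1000, 990)
print(n_c, n_f, vernier_interval(n_c, n_f, 1000, 990))
```

For a 2345 ps interval the model stops at $N_c = 38$ and $N_f = 36$, which converts to 2350 ps, within one 10 ps step of the input. The stretching factor for these periods is $S = 1000/10 = 100$, so the fine counter needs at least 7 bits.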

The key design challenges in the Vernier technique are generation of two frequencies with slight difference and their stabilization across PVT variations. In the TDC prototype [143], two phase lock loops (PLLs) have been incorporated to generate the stabilized frequencies with slight difference. To avoid the area and power inefficient DLL or PLL, in [145, 146] on-chip time-period calibration method, which corrects the timing expression in PVT variation has been used. The slight difference in frequencies of ring oscillators is obtained by using a fan-out difference of logic gate in [144, 145] and by different wire length in automatic P&R of buffers in [147].

Theoretically, a high resolution $(T_{oscst} - T_{oscsp})$ can be achieved using this technique. However, the minimum phase error detectable by the phase coincidence detector limits the achievable resolution. In the reported phase detectors [144, 145], phase detection is performed by sampling the status of the slow clock with the fast clock using a D flip-flop. At the time of phase coincidence, the setup time violation of the flip-flop results in an unpredictable (metastable) output. This situation can be avoided with the help of metastability hardened flip-flops. However, their dead zone characteristic further limits accurate phase coincidence detection and hence the disabling of counters and oscillators. To mitigate the impact of dead zone and metastability on the accuracy of phase detection, a phase frequency detector (PFD) followed by a $C^2MOS$ register has been reported in [148]. In the PFD shown in Fig.2.34 (a), both oscillator clocks trigger two separate flip-flops with their inputs preset to logic '1'. This avoids the violation of setup time and the dead zone of the flip-flop at the time of phase coincidence. The dynamic range of the PFD is limited by the resetting time of the flip-flops. The resetting time is reduced by designing the PFD using dynamic logic with an in-built reset that uses feedback of its outputs, as shown in Fig.2.34 (b).



Figure 2.34: (a) PFD based phase detector (b) PFD implemented using dynamic logic to reduce its resetting time

Further, to obtain 'eoc',  $C^2MOS$  register is used to sample the status of UP signal by DN, provided by PFD. When UP leads DN, the output of register is at logic one, which enables the counters. At the time of coincidence, the output is at logic zero and it disables the counter and oscillators.

The merits of Vernier technique are minimal logic resource consumption, area efficient, process independent resolution, large dynamic range. The limitation is higher conversion time ' $S \times T'_2$ , which limits the measurement rate of TDC.

In the Vernier technique, the accuracy of the ring oscillator's time period cannot be maintained over long time interval measurements due to the inherent accumulated jitter in the oscillators. A dual Vernier technique has been reported to improve the accuracy of the Vernier technique. In this scheme, two Vernier converters are used, corresponding to '*start*' and '*stop*' respectively. The difference of their counts in the conversion expression reduces the impact of jitter and systematic errors on the accuracy of time interval measurement.

This technique uses two event-triggerable oscillators with identical time periods. The identical time period ($T_{osc}$) of both oscillators is slightly greater than the time period ($T_{ref}$) of the reference clock. Therefore, on each cycle, the reference clock independently approaches both oscillators with a step size of $\Delta T_d = T_{osc} - T_{ref}$, as shown in Fig.2.35 (a). Eventually, each oscillator has a phase coincidence with the reference clock.


Figure 2.35: (a) Edge representation of start oscillator coincidence with reference clock in dual Vernier technique (b) timing diagram of dual Vernier technique

In this technique, the cycles of the oscillators and the reference clock are counted by three counters: one coarse and two fine. The fine counters count the cycles of the start oscillator '$N_{f1}$' and the stop oscillator '$N_{f2}$' until their phase coincidence with the reference clock, as shown in Fig.2.35 (b). These counts correspond to the number of $\Delta T_d$'s within $T_{ref}$. The coarse counter counts the cycles of the reference clock '$N_c$' within the time interval between the individual phase coincidences of both oscillators with the reference clock. It provides the dynamic range of the time interval between '*start*' and '*stop*'. The conversion expression is given by-

$$\Delta T = N_{f1} \times T_{osc} + N_c \times T_{ref} - N_{f2} \times T_{osc}$$
(2.44)

The number of bits in the fine counter is a crucial design parameter. It is chosen to count at least the maximum number of $\Delta T_d$'s within $T_{ref}$. Here, '$\Delta T_d$' defines the resolution of time measurement.

This technique can also measure a negative time interval, i.e. when stop comes before start. Based on this technique, the time interval counter 'HP5370A' has been reported with 20 ps single-shot resolution [149].
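The conversion expression (2.44) can be illustrated with a minimal, idealized sketch. It assumes jitter-free oscillators, event times that fall on the $\Delta T_d$ grid (so that coincidence is exact), and reference clock edges at integer multiples of $T_{ref}$. The function names and the numerical values ($T_{ref} = 4\Delta T_d$, $T_{osc} = 5\Delta T_d$) are illustrative assumptions and do not describe a specific implementation.

```python
def first_coincidence(t_event, t_osc, t_ref, max_cycles=10_000):
    """Return (fine count, reference edge index) at the first phase coincidence.

    The triggered oscillator produces edges at t_event + n * t_osc and the
    reference clock produces edges at k * t_ref. All times are integers in
    units of the step size (assumption: ideal, jitter-free edges).
    """
    for n in range(max_cycles):
        t_edge = t_event + n * t_osc
        if t_edge % t_ref == 0:          # oscillator edge meets a reference edge
            return n, t_edge // t_ref
    raise ValueError("no coincidence found within max_cycles")


def dual_vernier_interval(t_start, t_stop, t_osc, t_ref):
    """Evaluate Delta T = N_f1*T_osc + N_c*T_ref - N_f2*T_osc (equation 2.44)."""
    n_f1, k_start = first_coincidence(t_start, t_osc, t_ref)
    n_f2, k_stop = first_coincidence(t_stop, t_osc, t_ref)
    n_c = k_stop - k_start               # reference cycles between coincidences
    return n_f1 * t_osc + n_c * t_ref - n_f2 * t_osc


# Hypothetical values in units of the step size: T_ref = 4, T_osc = 5
print(dual_vernier_interval(1, 11, t_osc=5, t_ref=4))   # stop after start
print(dual_vernier_interval(11, 1, t_osc=5, t_ref=4))   # stop before start
```

In this sketch a stop event that precedes the start event simply yields a negative result, which reflects the ability of the technique to measure negative time intervals.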

#### 2.4.2.4.2 Time interval using successive approximation technique

Successive approximation is a well-known principle for ADC implementation that achieves high resolution but exhibits long conversion time. In the time domain, this technique can likewise be implemented to measure a time interval using a binary search. Here, the time interval between '*start*' and '*stop*' is compared in '*N*' iterations with delay adjustment in '*start*' or '*stop*', where '*N*' is the number of bits in the result. Each comparison resolves one bit at a time, starting from the MSB. The algorithm advances in the time domain such that the leading signal (start or stop) is always delayed after a comparison. The amount of delay starts at half of full scale and is halved before each subsequent comparison.

The timing diagram of this technique is shown in Fig.2.36 for measuring a time interval of 10.4 LSB over a full scale ($T_{FS}$) of 16 LSB. The 'start' signal is first delayed by $T_{FS}/2$ to compare it with '*stop*' at the middle of the dynamic range. If the delayed 'start' lags 'stop', the time interval is below half of the dynamic range and the MSB is set to logic '0'. If it leads 'stop', the time interval is greater than half the dynamic range and the MSB is set to logic '1'. In this example the delayed 'start' leads 'stop', so the MSB is set to logic '1' and 'start' is further delayed by $T_{FS}/4$ and compared with the 'stop' signal. As it now lags 'stop', the time interval is between '$t_1 + T_{FS}/2$' and '$t_1 + T_{FS}/2 + T_{FS}/4$', so the next bit is set to logic '0'. Next, the 'stop' signal is delayed by $T_{FS}/8$ and compared with the delayed 'start'. As 'start' leads 'stop', the corresponding bit is set to logic '1'. Finally, 'start' is further delayed by one LSB, $T_{FS}/16$, and it approximately coincides with 'stop'; the corresponding bit is set to logic '0'. As the signals are now aligned with an error of less than one LSB, this condition marks the end of conversion. At the end of conversion, the bits '1010' form the conversion result. Using this technique, time interval measurement with 1 ps resolution over 312.5 ps range has been reported [150].
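The binary search described above can be sketched behaviourally as follows. This is a simplified model under stated assumptions: ideal delay elements, an ideal comparator that reports 'start leads stop' when the delayed start edge is not later than the delayed stop edge, and time expressed in LSB units. The function name `sar_tdc` is hypothetical.

```python
def sar_tdc(interval_lsb, n_bits):
    """Resolve a time interval (in LSB units) bit by bit, MSB first.

    After each comparison the leading edge is delayed, and the delay is
    halved for the next step, as in the successive approximation scheme.
    """
    start = 0.0                      # arrival time of the start edge
    stop = float(interval_lsb)       # arrival time of the stop edge
    delay = (2 ** n_bits) / 2        # first delay: half of full scale
    start += delay                   # start is delayed by T_FS/2 first
    bits = []
    for i in range(n_bits):
        start_leads = start <= stop  # ideal comparator (assumption)
        bits.append(1 if start_leads else 0)
        delay /= 2
        if i < n_bits - 1:           # no further delay after the LSB decision
            if start_leads:
                start += delay       # the leading signal is delayed
            else:
                stop += delay
    return bits


bits = sar_tdc(10.4, n_bits=4)
code = int("".join(str(b) for b in bits), 2)
print(bits, code)
```

For the 10.4 LSB example with a 16 LSB full scale, the sketch reproduces the bit sequence '1010' traced in the text.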



Figure 2.36: Time interval measurement using successive approximation technique

## 2.5 Summary

This chapter describes time interval measurement techniques since their inception, along with several modifications over time. The description is supported by block diagrams, timing diagrams and conversion expressions, and is organized according to the classification of techniques in terms of performance. To show the chronology of the evolution of the techniques, a time line flow diagram is given in Fig. 2.37.

Moreover, the techniques are briefly summarized in Table 2.4. For high precision and high resolution time interval measurement, a discrete TAC with ADC has been used. However, it is limited by high power and area consumption as well as cost. The integrated solution reduces area and cost. However, it is not robust across PVT variations, which degrades the precision of measurement.

Among digital techniques, those based on delay lines achieve high resolution with moderate conversion time. However, their precision is limited by the non-linearity error due to mismatch among the tapped delays. The Vernier delay line (VDL) offers technology-independent high resolution but is limited in dynamic range. The cyclic Vernier delay line offers good performance with respect to both precision and dynamic range, but suffers from a long conversion time. The hierarchical structures reduce the non-linearity error by reducing the length of the delay lines at moderate conversion time. Thus, a technique can be preferred according to the specific requirements of the experiment or application after thoroughly weighing its advantages and limitations. Table 2.4 assists in selecting a technique based on the specifications of a particular application.

The roadblocks to achieving accuracy in a CMOS TDC are PVT variations and timing jitter. This makes LSB calibration a crucial step to assure the correctness of measurement. It can be carried out using an in-built digital calibrator or automatic calibration circuits (DLL or PLL). A separate identical reference channel can be used for calibration, or the measurement channel itself can be used in calibration mode. The first option features independent time interval measurement and resolution calibration at the cost of calibration accuracy against local mismatches. The second option reduces the inaccuracy in calibration; however, it limits the measurement rate of the TDC.

Moreover, local mismatches, which dominate in modern CMOS processes, limit the precision of long delay line based TDCs. This prevents efficient use of modern processes for high resolution. Further, techniques that use ring oscillators are limited in accuracy by accumulated jitter.

| Parallel Delay Line<br>(1) Sub-gate delay resolution (M)<br>(2) Strongly affected by PVT variations (D)<br>(3) High linearity as DNL does not accumulate (M)<br>(4) Requires N(N+1)/2 capacitors for N stage (D)<br>(5) Careful hand design of each stage (R)<br>(6) No loop feasible (R)<br>(7) Efficient to be used as fine interpolator (R) | Tapped Delay Line (TDL)<br>(1) Negligible conversion time (M)<br>(2) Resolution dependent on absolute gate delay and sensitive to PVT variation (D) so requires calibration (R)<br>(3) De-skewing required at stop input (D)<br>(4) DR (10's of ns) defined by number of stages<br>(5) Loop (cyclic) structure is possible for theoretically infinite DR (R)<br>(6) Low precision limited by matching among tapped delays (D)<br>(7) Trade-off in resolution and dynamic range (D) |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Differential and Shrinking Delay Line<br>(1) Sub-gate delay resolution (M)<br>(2) Large DR is possible in loop (M)<br>(3) Large conversion time (D)<br>(4) Trade-off in resolution and DR (D)<br>(5) Low precision limited by matching among tapped delays and amount of shrinking (D)<br>(6) Power hungry if continuous calibration is required (R) | TAC with ADC<br>(1) High resolution (M)<br>(2) DR limited by non-linearity of current source (D)<br>(3) Conversion time depends on ADC (R)<br>(4) Not efficient for modern CMOS process (D)<br>(5) Susceptible to process variation and environmental noise (D)<br>(6) Accuracy much affected by noise coupling and nonlinearity (D)<br>(7) Trade-off in resolution and DR (D) |
| Digital Counter<br>(1) Small non-linearity error depends on stability of clock (M)<br>(2) Theoretically infinite DR (M)<br>(3) Negligible conversion time (M)<br>(4) Resolution and accuracy depend on the clock frequency (D)<br>(5) Trade-off in resolution and power consumption (D) | Loop Structures<br>(1) Number of stages is reusable to expand dynamic range (M)<br>(2) Less non-linearity error as pulse covers the delay line once in each cycle (M)<br>(3) Chip-area efficient (M)<br>(4) High resolution (M)<br>(5) Large conversion time (D) |

Table 2.4: Summary of TDC techniques

| Dual Slope Technique<br>(1) Robust and free from calibration as gain depends upon ratio, instead of absolute values of circuit parameters (M)<br>(2) Area and power efficient as compared to TAC with ADC due to elimination of separate ADC (M)<br>(3) High resolution and high precision time measurement (M)<br>(4) Large conversion time (D) | Pseudo Differential Delay Line<br>(1) Sensitive to PMOS and NMOS strength mismatch (M)<br>(2) Resolution is twice as compared to buffer based tapped delay line (M)<br>(3) Output is in the form of thermometer code (R)<br>(4) Careful layout is required due to twisting at taps (R) |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Single DLL with RC Delay Line<br>(1) Uses single DLL to provide same potential of interpolation as that in array of delay lock loop (M)<br>(2) Power and area efficient as compared to ADLL (M)<br>(3) Start-up calibration is required for passive delay element (R)<br>(4) New decoding logic is required for fine time measurement (R) | Array of DLL<br>(1) Sub-gate delay resolution defined by interpolation factor (M)<br>(2) Power and area inefficient due to number of DLLs (D)<br>(3) Accumulation of jitter in number of DLLs (D)<br>(4) Number of bins is not represented by power of two (R)<br>(5) Dynamic range is equal to one reference clock period (R)<br>(6) Better performance in multi-channel TDC (R) |
| Time Stamping<br>(1) Less conversion time so efficient for multi-hit time stamping (M)<br>(2) Capable of multi-channel integration by sharing counter and TDL (M)<br>(3) Large DR limited by number of bits of counter (M)<br>(4) LSB depends on the interpolation ratio of reference clock (D)<br>(5) Power dominated by DLL so provides power and area efficiency in multi-channels (R) | Vernier Ring technique<br>(1) Sub-gate delay resolution (M)<br>(2) Sensitivity of resolution is small for process variation (M)<br>(3) Theoretically infinite DR (M)<br>(4) Accuracy limited by jitter in ring oscillators (R)<br>(5) Large conversion time (D)<br>(6) Area efficient (M)<br>(7) Negligible non-linearity due to local mismatch (R) |

| Dual Vernier Ring technique<br>(1) Sub-gate delay resolution (M)<br>(2) Theoretically infinite DR (M)<br>(3) High accuracy due to reduced effect of jitter obtained by difference in counts of both Vernier converters (M)<br>(4) Large conversion time (D)<br>(5) Requires reference clock (R)<br>(6) Negligible non-linearity due to local mismatch (R)<br>(7) Negative time interval measurement is possible (M) | Nutt's Interpolation<br>(1) High resolution and large dynamic range (M)<br>(2) Needs two interpolators corresponding to start and stop (R)<br>(3) Dynamic range of interpolator is defined by clock period (R)<br>(4) Needs synchronizer to obtain time intervals for interpolators and to synchronize them with counter (R) |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Time Amplifiers<br>(1) Very small dynamic range (100's ps) (D)<br>(2) Very high resolution (1's ps) (D)<br>(3) Moderate conversion time (R)<br>(4) Low power (M) | Successive Approximation<br>(1) Sub-gate delay resolution (1's ps) with low cost (0.35 μm) CMOS process (M)<br>(2) Requires digital to time converters in the TDC implementation (R)<br>(3) Dynamic range (100's ns) is limited by the number of conversion bits<br>(4) Low power consumption<br>(5) Large conversion time (D) |
| Cyclic DLL<br>(1) Sub-gate delay resolution (10's ps) with small number of delay elements and low frequency clock (5 MHz) (M)<br>(2) Requires MUX based delay element to match the feedback delay with other delay elements (R)<br>(3) High precision due to less non-linearity in short delay lines and stable clock (M)<br>(4) Low power consumption (D)<br>(5) Conversion time depends on the required rotations to match one clock period (R) | Hierarchical techniques<br>(1) Efficient in reducing the length of delay line at low reference clock frequency (M)<br>(2) High precision due to reduced non-linearity error (M)<br>(3) Area and power efficient due to less number of stages (M)<br>(4) Moderate conversion time (R)<br>(5) Synchronizers are required at each interpolation level |




Figure 2.37: Chronological order of TDC techniques

## Chapter 3

# TDC Design Aspects in CMOS Technology

## 3.1 Introduction of CMOS Technology



Figure 3.1: CMOS technology with history and road-map source. This figure is adopted from Ref.[151]

Today's field of microelectronics is dominated by the 'complementary metal oxide semiconductor' (CMOS) transistor by virtue of its low power consumption and sizing flexibility to achieve the desired performance. The trend of feature size scaling in MOS technology has successively shrunk the consumed die area, which has led to a spectacular increase in integration level and a cost reduction for bulk production of ICs. As a result, MOS technology has gained widespread acceptance for ASIC based front-end electronics in HEP experiments, replacing the power- and area-consuming HMCs based on ECL and bipolar logic. In addition, the current ASIC design methodology enables faster design cycles due to the coupling of passive components, bipolar transistors and pre-characterized CMOS analog and digital standard cells with the IC design CAD tools. The rapid development of CMOS process technology since the 1960s is shown in Fig.3.1.

## 3.1.1 Structure of MOS device



Figure 3.2: (a) MOS as capacitor (b) NMOS driven by the gate and drain voltage (c) cross sectional view of NMOS device

The MOS device is built using metal (conductive plate), insulator (dielectric material) and semiconductor materials. The choice of materials depends on their electrical properties, stability at high temperature and ease of fabrication. The MOS structure follows the simple geometry of a parallel plate capacitor, as shown in Fig.3.2(a). The 'insulator' of high dielectric constant 'k' is sandwiched between the conductive 'metal' plate at the top and the 'semiconductor' plate at the bottom. An extrinsic semiconductor (n or p type) is chosen in order to have sufficient conductivity for mirroring the charge placed on the top plate, named the gate terminal. The semiconductor plate serves as the substrate of the MOS device. When a voltage is applied to the gate, it attracts the opposite charge to the $SiO_2$-Si interface and creates a channel of mobile charge carriers (electrons) that can serve as a conductive path for current. To allow current to flow through this path, two ohmic metal contacts are attached to the substrate via heavily doped 'n/p' type semiconductor regions. These contacts are known as the 'source' and 'drain', where the former provides the charge carriers and the latter absorbs them, as shown in Fig. 3.2(b). The 'n' type source and drain form back-to-back (n-p-n) junction diodes with the p-type substrate, which prevent the flow of current through the substrate. The dimension of the gate along the source-drain path is called the channel length 'L', and the dimension perpendicular to it is called the channel width 'W'. Since the source/drain junctions side-diffuse during fabrication, the actual distance '$L_{eff}$' between source and drain is slightly less than L and is equal to '$L - 2L_D$', where '$L_D$' is the amount of side diffusion, as shown in Fig.3.2(c).

The MOS device serves as a voltage-controlled current source, as the density of charge carriers is a function of the applied gate voltage. In order to have strong control by the gate voltage, the gate capacitance per unit area, $C = \epsilon_{SiO_2}/t_{ox}$, has to be high, where '$\epsilon_{SiO_2}$' is the permittivity of the insulator $SiO_2$ and '$t_{ox}$' is the oxide thickness. This is feasible by using a very thin layer of oxide (~ few nanometers) and by using an insulator with a high dielectric constant.
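As a rough numerical illustration of these two relations, the sketch below evaluates the gate capacitance per unit area and the effective channel length. The oxide thickness (7.5 nm) and the side diffusion (0.03 $\mu$m) are assumed, illustrative values and are not taken from a specific process; the relative dielectric constant of $SiO_2$ is taken as 3.9.

```python
EPS_0 = 8.854e-12    # vacuum permittivity in F/m
K_SIO2 = 3.9         # relative dielectric constant of SiO2


def oxide_capacitance_per_area(t_ox_m, k_rel=K_SIO2):
    """Gate capacitance per unit area, C = eps / t_ox, in F/m^2."""
    return k_rel * EPS_0 / t_ox_m


def effective_channel_length(l_drawn_m, l_d_m):
    """Effective channel length, L_eff = L - 2 * L_D, in metres."""
    return l_drawn_m - 2.0 * l_d_m


# Assumed, illustrative values (not from a specific process)
c_ox = oxide_capacitance_per_area(7.5e-9)
l_eff = effective_channel_length(0.35e-6, 0.03e-6)

# 1 F/m^2 corresponds to 1e3 fF/um^2
print(f"C_ox  = {c_ox * 1e3:.2f} fF/um^2")
print(f"L_eff = {l_eff * 1e6:.2f} um")
```

Halving the assumed oxide thickness doubles the capacitance per unit area, which is the reason a thin oxide gives the gate stronger control over the channel charge.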

There is no conductive path in the vertical direction of the device from the substrate to the gate due to the insulating layer between them. To reduce the leakage current through the source-to-substrate and drain-to-substrate junctions, both back-to-back p-n junctions are reverse biased.

The NMOS transistor consists of 'n+' drain and source regions, fabricated in a p-type substrate. The current is carried by electrons moving through an n-type channel between source and drain. In a PMOS transistor, 'p+' drain and source regions are embedded in an n-type substrate. In a complementary MOS technology (CMOS), NMOS and PMOS are present in the same substrate.

## 3.2 MOS operating regions and logic styles

The MOS transistor can be operated in four operating regions: cutoff, sub-threshold, saturation and linear. These operating regions are defined by the relation among the gate-to-source voltage '$V_{GS}$', the drain-to-source voltage '$V_{DS}$' and the threshold voltage '$V_t$'.

The sub-threshold region is characterized by a supply voltage '$V_{dd}$' below '$V_t$' and a small drive current that depends exponentially on '$V_{GS}$', as shown in Fig.3.3(a). This operating region is gaining attention for the design of logic circuits [152] used in low power battery operated applications like pacemakers and digital wrist watches. However, due to the strong exponential dependence of delay on '$V_{GS}$', sub-threshold logic is not suitable for low power TDC designs.



Figure 3.3: Operating regions of typical NMOS transistor with size  $10\mu$ m/0.35 $\mu$ m

The sub-threshold current is viewed as an attendant evil in traditional static CMOS logic due to its contribution to the leakage current when the device is in standby. In various static logic styles like complementary CMOS, complementary pass-transistor logic (CPL) [153] and differential cascode voltage switch logic (DCVSL) [154], the MOS transistor works as a switch by operating in the deep linear region (when on) and in cutoff (when off) for stable inputs. As shown in Fig.3.3(b), the deep linear region has a linear relation between '$V_{DS}$' and the drain current '$I_{DS}$'; therefore, the MOS transistor is modeled as a static resistor with finite on-resistance. In cutoff, the resistance is theoretically infinite. The finite on-resistance acts as a pull-up (pull-down) path that holds the voltage across the load capacitor at $V_{dd}$ (gnd) for stable inputs. However, during a transition of the input signal, the on-resistance varies as the pull-down (or pull-up) transistor passes from cutoff to linear (or linear to cutoff) through the saturation region while discharging (or charging) the load capacitor. This operating region transition causes a significant delay in the propagation of the signal from input to output. This delay is larger in complementary CMOS logic than in the CPL and DCVSL logic styles. In CPL logic, the pull-up and pull-down resistors are in parallel and therefore provide a less resistive path for charging and discharging the load capacitor. In DCVSL logic, the regenerative positive feedback provides fast pull-up and pull-down.

The on-resistance reduces with increasing aspect ratio of the transistor. Therefore, to achieve a small transition time (rise/fall time), transistors with a large aspect ratio can be used. On the other hand, this increases the input parasitic capacitances, which contribute to the load capacitance, and thereby increases the propagation delay. Thus, in static CMOS logic, there are various design and process parameters (discussed in section (3.3)) that affect and limit the gate propagation delay and therefore the TDC resolution for gate delay based techniques.

In terms of power consumption, static logic draws no rail-to-rail (static) current when the inputs are stable. The power consumption includes only the dynamic and sub-threshold leakage currents; therefore, this logic style has been frequently used for low power TDC design.

The MOS current mode logic (MCML) [155] is preferred for the design of high speed buffers in mixed signal high resolution TDCs at the cost of high power consumption. Here, the switching of current between the two halves of a differential circuit with respect to the input swing determines the logic levels. High speed is attained as the driver transistor is maintained in the saturation region (Fig.3.3) over the whole partial input swing. The differential signals in this logic offer high immunity to common mode noise. It also exhibits low switching noise, achieved by drawing a continuous rail-to-rail static current during stable inputs. However, the partial signal swing reduces its ability to interface with other logic styles in mixed signal TDC designs. Also, the rail-to-rail current leads to high static power consumption, which limits its use in low power TDC designs.

Alternatively, the current balanced logic (CBL) [156] provides high speed and reduced switching noise by a low power and simple technique. This logic is derived from pseudo NMOS logic with additional static current sinking (or balancing) transistor, when driver NMOS is in cutoff mode. Here, improvement in speed is obtained due to reduced parasitics and fan-out capacitance (one NMOS transistor). This logic does not use current sources; therefore the output swing is almost rail-to-rail, which makes it compatible in interfacing with standard cells available in PDK used in mixed signal designs.

## 3.3 Impact of various parameters on speed of CMOS inverter

The speed of the CMOS inverter is an important aspect in the design of digital TDCs, as it defines the unit of digitization (resolution) in time interval measurement. The CMOS inverter provides the minimum propagation delay with logic inversion. To maintain the polarity of the applied signal, CMOS buffers (two cascaded inverters) are usually used.

The propagation delay of an inverter can be analyzed using two different approaches. The first approach is based on the switch model [153] of the CMOS inverter, where the propagation delay is defined by the time constant '$R_{on} \times C_{load}$'. Here, the on-resistance '$R_{on}$' of the MOS transistor varies non-linearly with the output (drain) voltage of the transistor. This variation is addressed by using an average value of the resistance over the voltage range of interest (between $V_{dd}$ and $V_{dd}/2$ for the NMOS pull-down). This gives the average propagation delay as-

$$t_{p} = 0.69 \times C_{load} \times \left(\frac{R_{eqn} + R_{eqp}}{2}\right), \quad \text{where } R_{eqn} = \frac{1}{V_{dd}/2} \int_{V_{dd}/2}^{V_{dd}} \frac{V_{out}}{I_{DS}(V_{out})} \, dV_{out}$$
(3.1)

where '$R_{eqn}$' and '$R_{eqp}$' are the average on-resistances of the NMOS and PMOS transistors respectively. The current $I_{DS}$ used for calculating the on-resistance is determined by the operating region of the transistor, given in Fig.3.3(b).

Another approach to find the propagation delay is to integrate the transient current '$I_{DS}$' through the load capacitor '$C_{load}$' as the output moves from $V_{dd}$ or gnd to 50 % of the supply voltage (for rail-to-rail swing). The falling edge propagation delay is given by equation (3.2), where the drain current $I_{DS}$ is a non-linear function of '$V_{out}$' and is determined by the operating region of the transistor, given in Fig.3.3(b).

$$t_p = C_{load} \times \int_{V_{dd}/2}^{V_{dd}} \frac{dV_{out}}{I_{DS}(V_{out})}$$
(3.2)

In both approaches to delay calculation, it is assumed that the load capacitance is fixed across the operating region transitions of the transistor as well as the type of input transition.
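The two approaches can be compared numerically for the falling edge with a minimal sketch. It assumes a simple long-channel square-law model for $I_{DS}$ instead of the expressions of Fig.3.3(b), a gate held at $V_{dd}$, and hypothetical parameter values ($k = 2\,mA/V^2$, $V_t = 0.6\,V$, $V_{dd} = 3.3\,V$, $C_{load} = 20\,fF$). The second function integrates the charge balance $C_{load}\,dV_{out} = -I_{DS}\,dt$ over the discharge from $V_{dd}$ to $V_{dd}/2$.

```python
def ids_square_law(v_ds, v_gs, v_t, k):
    """Long-channel NMOS drain current (assumed model), k = mu * C_ox * W / L."""
    v_ov = v_gs - v_t
    if v_ov <= 0.0:
        return 0.0                              # cutoff
    if v_ds >= v_ov:
        return 0.5 * k * v_ov ** 2              # saturation
    return k * (v_ov * v_ds - 0.5 * v_ds ** 2)  # linear region


def integrate(f, a, b, steps=20000):
    """Midpoint-rule integration of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def falling_delay_switch_model(c_load, v_dd, v_t, k):
    """0.69 * C_load * R_eqn with R_eqn averaged between V_dd/2 and V_dd."""
    r_eqn = integrate(lambda v: v / ids_square_law(v, v_dd, v_t, k),
                      v_dd / 2, v_dd) / (v_dd / 2)
    return 0.69 * c_load * r_eqn


def falling_delay_current_integral(c_load, v_dd, v_t, k):
    """C_load times the integral of dV / I_DS between V_dd/2 and V_dd."""
    return c_load * integrate(lambda v: 1.0 / ids_square_law(v, v_dd, v_t, k),
                              v_dd / 2, v_dd)


# Hypothetical parameters chosen only for illustration
params = dict(c_load=20e-15, v_dd=3.3, v_t=0.6, k=2e-3)
t_switch = falling_delay_switch_model(**params)
t_integral = falling_delay_current_integral(**params)
print(f"switch model    : {t_switch * 1e12:.2f} ps")
print(f"current integral: {t_integral * 1e12:.2f} ps")
```

Under these assumptions both estimates agree within a few percent, and both scale linearly with the load capacitance.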

It can be deduced from the above equations that the propagation delay in CMOS logic is a function of design parameters (load capacitance '$C_{load}$' and aspect ratio (W/L) of the transistors), process parameters (threshold voltage '$V_t$' and oxide thickness '$t_{ox}$') and operating conditions (supply voltage '$V_{dd}$' and temperature 'T').

#### 3.3.1 Capacitive load effect on propagation delay

The inverter delay depends on the load capacitance '$C_{load}$'. This load capacitance comprises the drain-to-bulk diffusion capacitances '$C_{db}$' and the Miller capacitance of value '$2 \times W \times L_d \times C_{ox}$' (the intrinsic parasitic capacitances of the inverter), the fan-out capacitance and the parasitic capacitances of the wire interconnects [153]. A larger load capacitance results in a higher propagation delay. To improve the speed of the inverter, the diffusion capacitances can be reduced by using a multi-finger layout technique [157] and the interconnect parasitics can be reduced by limiting the interconnect length.

#### 3.3.2 Temperature effect on speed of MOSFET

There are two primary temperature dependent effects in a MOS device, given by equation (3.3) and equation (3.4). The first is a change in the threshold voltage, which tends to have a negative temperature coefficient (TC) '$\alpha$', similar to that of '$V_{BE}$' $\approx -2\,mV/^{\circ}C$ of a bipolar transistor. The threshold voltage decreases with increase in temperature, where '$\alpha$' is in the range of $-0.5\,mV/K$ to $-3.0\,mV/K$.

$$V_T(T) = V_T(T_0) + \alpha(T - T_0)$$
(3.3)

$$\mu(T) = \mu(T_0) \left(\frac{T}{T_0}\right)^{K_m}$$
(3.4)



Figure 3.4: Simulated typical delay versus temperature characteristic of buffer in 0.35  $\mu$ m CMOS process

The second effect is the reduction of mobility with increase in temperature, where $K_m$ is in the range of -1.2 to -2.0, $T_0$ is the reference temperature (300 K) and T is the absolute ambient temperature.

The temperature dependent mobility and threshold voltage effects act on the conduction current in opposite directions. The threshold voltage effect dominates when '$V_{GS}$' is comparable to the threshold voltage '$V_t$' of the MOS transistor and leads to an increase in current with temperature. The mobility effect dominates if '$V_{GS}$' is much higher than '$V_t$' and leads to a decrease in current with temperature. As in logic circuits the '$V_{GS}$' of the NMOS (or PMOS) is $V_{dd}$ (or $-V_{dd}$), the mobility effect dominates and the conduction current decreases with increase in temperature. This ultimately leads to an increase in the propagation delay of the CMOS buffer, as shown in Fig.3.4.
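The competition between the two effects can be illustrated with a small sketch that combines equations (3.3) and (3.4) with a square-law current, $I \propto \mu(T)\,(V_{GS} - V_T(T))^2$. The square-law form and the parameter values ($V_T(T_0) = 0.6\,V$, $\alpha = -2\,mV/K$, $K_m = -1.5$) are assumptions made only for illustration; near threshold a real device is better described by its sub-threshold characteristic.

```python
def threshold_voltage(t_k, v_t0=0.6, alpha=-2e-3, t0_k=300.0):
    """Equation (3.3): V_T(T) = V_T(T0) + alpha * (T - T0)."""
    return v_t0 + alpha * (t_k - t0_k)


def mobility_factor(t_k, k_m=-1.5, t0_k=300.0):
    """Equation (3.4), normalized to mu(T0): (T / T0) ** K_m."""
    return (t_k / t0_k) ** k_m


def relative_drive_current(v_gs, t_k):
    """Square-law current normalized to mu(T0) * k / 2 (assumed model)."""
    v_ov = max(v_gs - threshold_voltage(t_k), 0.0)
    return mobility_factor(t_k) * v_ov ** 2


for v_gs in (3.3, 0.7):
    cold = relative_drive_current(v_gs, 300.0)
    hot = relative_drive_current(v_gs, 400.0)
    trend = "decreases" if hot < cold else "increases"
    print(f"V_GS = {v_gs} V: current {trend} from 300 K to 400 K")
```

With $V_{GS}$ at the assumed supply of 3.3 V the mobility term dominates and the current falls with temperature, while with $V_{GS}$ only slightly above $V_t$ the threshold term dominates and the current rises.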

#### 3.3.3 Threshold voltage effect on inverter delay

Referring to the current equation shown in Fig.3.3, a lower '$V_t$' increases the current and thereby reduces the propagation delay. Therefore, in time interval measurement techniques where the resolution depends on the gate propagation delay, low '$V_t$' transistors can be preferred to improve the resolution. Usually, two sets of $V_t$ devices are available in the mixed high performance process. However, a low $V_t$ device needs an additional mask, which costs more than a standard $V_t$ device.

## 3.4 Jitter sources in time interval measurement

A CMOS integrated circuit carries jitter mainly due to two types of noise: device electronic noise and environmental noise. The device electronic noise is contributed by the active (MOS transistor) and passive (resistor and capacitor) devices [91]. The main noise sources for a MOS transistor are 'channel thermal noise', '$1/f$ noise', 'bulk resistance thermal noise', 'gate resistance thermal noise' and 'gate leakage current noise', discussed in section (3.4.1). The environmental noise is contributed by power supply and substrate noise, discussed in section (3.4.2). Both device and environmental noise sources manifest themselves as sources of random jitter[158].

## 3.4.1 Types of noise in MOS transistor

#### 3.4.1.1 Channel thermal noise

The thermal noise is due to the random motion of charge carriers in the channel. The spectral power density of the current or voltage noise varies with the operating region of the device, from weak inversion to strong inversion. The drain current spectral power density of channel noise in strong inversion and saturation is given by equation (3.5)[159], where '$g_m$' and '$g_{mb}$' are the gate transconductance and bulk transconductance, 'K' is the Boltzmann constant and 'T' is the absolute temperature.

$$\frac{\overline{i_{ds}^{2}}}{\Delta f} = \frac{8}{3} \times K \times T \times (g_m + g_{mb})$$
(3.5)

The input referred noise is obtained by dividing the above equation by $g_m^2$-

$$\frac{\overline{i_{ds}^{2}}}{\Delta f \times g_{m}^{2}} = \frac{8}{3} \times K \times T \times \left(\frac{g_{m} + g_{mb}}{g_{m}^{2}}\right)$$
(3.6)

The equation (3.6) can be rearranged as-

$$\frac{\overline{v_{in}}^2}{\Delta f} = 4 \times n \times K \times T \times \frac{2}{3} \times \frac{1}{g_m}$$
(3.7)

Where $n = (g_m + g_{mb})/g_m$ is proportional to the inverse of the subthreshold slope. This expression of 'n' is valid in the weak, moderate and strong inversion regions.

Channel thermal noise when device is operating in weak inversion and saturation is given by[160]-

$$\frac{\overline{i_{ds}^{2}}}{\Delta f} = 2 \times q \times I_d \tag{3.8}$$

Where 'q' is the electronic charge and '$I_d$' is the drain current, which can be expressed in terms of the transconductance '$g_m$' and the thermal voltage '$V_T$' by the equation-

$$g_m = \frac{I_d}{n \times V_T}, \quad \text{where } V_T = KT/q \tag{3.9}$$

Dividing equation(3.8) by  $g_m^2$ -

$$\frac{\overline{v_{in}}^2}{\Delta f} = \frac{2 \times q \times I_d}{g_m^2} = 2qnV_T \frac{1}{g_m}$$
(3.10)

Putting the expression for  $V_T = KT/q$ , the input referred noise in weak inversion is given by-

$$\frac{\overline{v_{in}}^2}{\Delta f} = 4 \times K \times T \times n \times \frac{1}{2} \times \frac{1}{g_m}$$
(3.11)

Using equations (3.7) and (3.11), the general expression for input referred channel thermal noise in saturation and in any inversion region is given by-

$$\frac{\overline{v_{ch}^{2}}}{\Delta f} = 4 \times K \times T \times n \times \gamma \times \frac{1}{g_m}$$
(3.12)

where $\gamma$ varies from 1/2 in weak inversion to 2/3 in strong inversion for an ideal MOS device.

#### 3.4.1.2 Flicker noise

The 1/f (or flicker) noise, according to the McWhorter model [161], is due to random trapping and detrapping of mobile carriers in the traps located at the Si-$SiO_2$ interface and within the gate oxide. It is the dominant source of noise in a MOS device at low frequencies and its input referred power spectral density [162] is given by-

$$\frac{\bar{v}_{1/f}^2}{\Delta f} = \frac{K_a}{C_{ox}^2 \times W \times L} \times \frac{1}{f^{\alpha}}$$
(3.13)

Where '$\alpha$' is a parameter close to 1 and '$K_a$' is a technology dependent parameter, which expresses the noise characteristic of the process. The constant '$K_a$' has different values for NMOS and PMOS transistors in a given technology. The PMOS transistor has lower flicker noise than the NMOS transistor.

#### 3.4.1.3 Bulk resistance thermal noise

The three-dimensional distributed substrate resistance '$R_b$' introduces a noise component which, when referred to the input, can be expressed by equation (3.14)[159]

$$\frac{\bar{v}_{R_b}^2}{\Delta f} = 4 \times K \times T \times R_b \times \frac{g_{mb}^2}{g_m^2}.$$
(3.14)

#### 3.4.1.4 Gate resistance thermal noise

This noise originates from the poly-silicon gate resistance $R_G$. This noise component is directly present at the input and is expressed by equation (3.15)[159]

$$\frac{\bar{v}_{R_g}^2}{\Delta f} = 4 \times K \times T \times R_G \tag{3.15}$$

The total input referred noise power spectral density is obtained by adding the four noise contributions discussed above and is given as-

$$\frac{\overline{V_{in}^{2}}}{\Delta f} = 4KTR_b \frac{g_{mb}^2}{g_m^2} + 4KTn\gamma \frac{1}{g_m} + \frac{K_a}{C_{ox}^2 \times W \times L} \times \frac{1}{f^\alpha} + 4KTR_G$$
(3.16)
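As a rough illustration of equation (3.16), the sketch below evaluates the four contributions separately and estimates the frequency at which the flicker term equals the sum of the frequency-independent terms. The function and parameter names are hypothetical, and any device values passed in are assumptions of the example rather than data of the process used in this work.

```python
# Illustrative sketch of equation (3.16): input-referred noise PSD of a MOS
# transistor. Names and example device values are assumptions.

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def noise_terms(freq, temp_k, gm, gmb, r_bulk, r_gate, gamma, ka, cox,
                width, length, af=1.0):
    """Return the four contributions of equation (3.16) in V^2/Hz."""
    n = (gm + gmb) / gm                  # factor n defined after equation (3.7)
    four_kt = 4.0 * K_BOLTZMANN * temp_k
    return {
        "bulk": four_kt * r_bulk * (gmb / gm) ** 2,
        "channel": four_kt * n * gamma / gm,
        "flicker": ka / (cox ** 2 * width * length) / freq ** af,
        "gate": four_kt * r_gate,
    }


def total_noise(freq, **device):
    """Sum of all four contributions at the given frequency."""
    return sum(noise_terms(freq, **device).values())


def corner_frequency(**device):
    """Frequency at which flicker noise equals the sum of the white terms."""
    terms = noise_terms(1.0, **device)   # flicker term evaluated at 1 Hz
    white = terms["bulk"] + terms["channel"] + terms["gate"]
    af = device.get("af", 1.0)
    return (terms["flicker"] / white) ** (1.0 / af)
```

Below the corner frequency returned by `corner_frequency` the flicker term dominates, above it the thermal terms dominate.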

Other kinds of intrinsic noise whose spectral density has a higher-order dependency on frequency, such as popcorn or burst noise, mostly reflect the quality of material processing. Their amplitude probability density function (PDF) is not Gaussian. Finally, avalanche or breakdown noise is caused by the avalanche process just before junction breakdown. Its spectral density is usually flat and its amplitude PDF is not Gaussian.

#### 3.4.2 Substrate and power supply noise

The noise generated in the substrate is due to the minority carrier injection and coupling through the parasitics [157].

The minority charge carriers (electrons in a p-type substrate, holes in an n-type substrate) are injected from the source and drain diffusions into the substrate when the source/drain-to-substrate p-n junction diode becomes forward biased. This may happen for the following reasons-

- Inductive ground path causes substrate to bounce.
- Potential drop due to resistive power and ground path from the power pins to the N-well and substrate.
- Fast switching signal with significant overshoot.

In order to reduce the disturbance from the minority carriers, proper guard rings are placed around the MOS transistors. For a PMOS transistor in the n-well, a p-diffusion guard ring tied to ground is used. The p-type source/drain diffusions inject stray holes into the n-well. These stray holes are collected efficiently by the p-type guard ring, which is biased to ground to attract the holes. Similarly, an n-well guard ring tied to $V_{dd}$ is used to surround an NMOS transistor in the p-substrate. The n-type source/drain diffusions inject stray electrons into the substrate. These stray electrons are collected efficiently by the n-well guard ring, which is biased to $V_{dd}$ to attract the electrons.

The noise is coupled to the substrate through the junction capacitance (well-to-substrate or diffusion-to-substrate) at the time of switching at the drain or source node. The coupled noise magnitude is proportional to the 'magnitude of junction capacitance', the 'amount of current during switching' and the 'number of gates switching at the same time'. This noise depends entirely upon the complexity of the system and its activity factor[163, 164].

Noise coupled from the power supply is another source of substrate noise. The power supply noise is due to the variation in the dc voltages of the power and ground distribution networks. This variation combines with the resistance 'R' of the network and causes IR drops. The second type of power supply noise is 'delta-I' noise [165]. It is produced by the simultaneous switching of off-chip drivers and internal logic, usually synchronized with the clock. This switching activity injects a large current spike into the supply network, as shown in Fig.3.5. This current spike combines with the inductance 'L' of the supply lines and the package, resulting in voltage fluctuation on the power supply network. The amount of inductive noise depends upon the switching speed and the distributed inductance per unit length 'L'.



Figure 3.5: Noise in supply voltage due to transient current in CMOS logic. This figure is adapted from Ref.[178]

The impact of substrate noise on a time interval measurement circuit appears as a source of random jitter[158]. It changes the threshold voltage of the device through the body effect, which results in variation of the propagation delay of the logic circuit. In addition, sensitive analog nodes, such as the bias voltage of analog voltage controlled delay elements and the filter capacitor voltage of a DLL or PLL, are also affected by substrate noise picked up through junction capacitances. To reduce the effect of substrate coupling noise, guard rings with the configuration 'p-type tied to ground for NMOS and n-type tied to $V_{dd}$ for PMOS device' can be used around the critical transistors.

## 3.5 Interconnect noise

This noise is due to the interference of the behavior of one part of the circuit with the sensitive nodes of another part of the design. The level of interaction depends on the circuit layout design style and signal distribution topology. The common types of noise coupling in the integrated circuit domain are capacitive coupling and inductive coupling.

## 3.5.1 Capacitive coupling

Capacitive coupling is due to the existence of electric fields between any two conductors. The current flowing through the coupling capacitor is a function of the rate of change of the potential difference across its terminals. Therefore, any signal variation on one plate of the coupling capacitor induces a variation on the other plate. This effect is often known as *crosstalk*[166, 167]. It may be significant if the coupling capacitance is large (e.g., two long parallel lines) or if high frequency, large amplitude signal variations occur close to a weak signal path. The cross talk impacts the signal integrity of a weak signal line in three ways-

- If one line is switching and the other is quiet, a glitch appears on the quiet line due to energy transfer through the coupling capacitance.
- If both lines have simultaneous signal transitions in opposite directions, the coupling capacitance appears as twice its nominal value due to the Miller effect [164, 168, 169, 170], thereby maximizing the effective capacitive load driven by the victim line. This increases the signal transition delay on the victim line due to the increased RC time constant.
- If both lines have simultaneous transitions in the same direction, the capacitive load driven by the victim line becomes minimum, since there is no contribution from the coupling capacitance [169, 170]. This reduces the RC time constant and minimizes the propagation delay.

The gap between the maximum and minimum propagation delays increases with the coupling capacitance [164, 170]. The coupling capacitance increases in deep sub-micron CMOS processes due to the reduced pitch and spacing of interconnects in the same metal layer. Therefore, cross talk appears prominently in bus dominant architectures designed in deep sub-micron CMOS processes[171].
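The three switching cases can be summarised with a lumped model in which the coupling capacitance is weighted by a factor of 1, 2 or 0. The sketch below is only an approximation under stated assumptions: a single-pole RC victim line with a $0.69 \times R \times C$ delay estimate, and hypothetical function names and values.

```python
# Illustrative sketch: effect of neighbour switching on the victim-line delay.
# Assumptions (not from the thesis): lumped RC model with a 0.69*R*C delay
# estimate; names and example values are hypothetical.

MILLER_FACTOR = {
    "quiet": 1.0,      # neighbour static: coupling capacitance counted once
    "opposite": 2.0,   # opposite transitions: coupling capacitance doubled
    "same": 0.0,       # same-direction transitions: no contribution
}


def effective_capacitance(c_ground, c_couple, neighbour):
    """Capacitive load seen by the victim driver for a given neighbour state."""
    return c_ground + MILLER_FACTOR[neighbour] * c_couple


def victim_delay(r_driver, c_ground, c_couple, neighbour):
    """50 % delay estimate of the victim line for a given neighbour state."""
    return 0.69 * r_driver * effective_capacitance(c_ground, c_couple, neighbour)


def delay_spread(r_driver, c_ground, c_couple):
    """Gap between the slowest (opposite) and fastest (same) case."""
    worst = victim_delay(r_driver, c_ground, c_couple, "opposite")
    best = victim_delay(r_driver, c_ground, c_couple, "same")
    return worst - best
```

In this model the spread equals $0.69 \times R \times 2C_c$, so it grows linearly with the coupling capacitance and is independent of the capacitance to ground.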

## 3.5.1.1 Key points to reduce the cross talk

Various techniques are used to reduce the cross talk in CMOS digital circuits such as metal layer widening, increasing adjacent metal layer spacing, buffer insertion, net ordering, shielding, and differential signaling[164]-

- By increasing the layer spacing, the coupling capacitance can be reduced[173].
- Widening of metal wires helps to reduce the effect of coupling capacitance. This is achieved by increasing the net-to-ground capacitance through a wider overlapping area, which results in a decrease in the ratio of coupling to total capacitance. On the other hand, the increased line-to-ground capacitance results in increased delay.
- Net ordering is an effective method to reduce the cross talk. It is implemented by placing the sensitive nets apart in the design to reduce the interaction with interfering nets [174, 175, 176].
- To reduce the cross talk, two metal lines carrying high frequency clocks and routed adjacently in the same metal layer are shielded by a metal line tied statically to ground or supply[174, 175, 176], the so-called 'passive shielding'. This effectively turns the coupling capacitance into capacitance to ground and eliminates interference[177]. Signals in adjacent metal layers of the CMOS process can be routed orthogonally to reduce the overlapping area, which reduces the coupling capacitance.
- To maintain the signal integrity between nets that switch in opposite directions, active shielding[176] has been proposed. It uses shields on either side of the wire, which speed up the signal propagation through the Miller effect.

## 3.5.2 Inductive coupling

In integrated digital designs, inductive coupling mainly affects the supply voltage by injecting '$L \times di/dt$' noise[163, 164]. This is because, during a transition, a transient current is sourced from (or sunk into) the supply rails to charge (or discharge) the load capacitance inside the chip. Both supply and ground rails are connected to the external supplies through the bonding wires and package pins, which possess a non-negligible series inductance of $\approx 7 \text{ nH/mm}$ and 30 nH/mm respectively[153]. Therefore, by the property of an inductor, a large spike of transient current produces a voltage '$L \times di/dt$' that opposes the change in current producing it [Lenz's law]. This results in a difference between the external (on-board) and internal (on-chip) supply voltages. This effect is severe at the output pads, where driving a large external load capacitance generates large current surges. The variation in internal supply voltage may affect the logic levels and noise margin. Several methods [163, 164] can be used to reduce the inductive noise-

- Power supply pins dedicated for pads and internal core logic can be separated so that noise will not severely affect the internal logic performance.
- By careful selection of location of power supply pins on the package (middle pins for smallest length of bonding wire), type of the package and length of bonding wire, the inductance can be reduced.
- Increasing the rise time or fall time of the external signal applied to the input pad driver, as well as of the input to the output buffer pad, helps to reduce the '$L \times di/dt$' noise, as this is proportional to the rate of change of current.
- An IBIS model of the package can be used to simulate and optimize the issues due to inductive coupling before selecting the package type, such as DIL, PLCC, QFP and PGA.
- By using on-chip decoupling capacitors, the voltage fluctuations on the power supply lines can be reduced.
- Reduced supply bounce CMOS logic can be used, where digital circuits are implemented in static CMOS together with a simple circuit for reducing the switching noise on the supply lines.
- Constant current CMOS logic styles (discussed in section 3.2), such as current balanced logic (CBL), current mode logic (CML) and current steering logic (CSL), can be chosen, where the idea is to keep the supply currents as stable as possible.
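The dependence of the '$L \times di/dt$' noise on edge rate and on the number of simultaneously switching drivers can be estimated with a few lines of code. This is a minimal sketch assuming that the current of each driver ramps linearly from zero to its peak during the edge time; the function name and the numbers in the example are hypothetical.

```python
# Illustrative sketch: supply bounce from simultaneous output switching.
# Assumptions (not from the thesis): linear current ramp per driver and the
# example inductance, current and edge-time values.


def supply_bounce(inductance, peak_current, edge_time, n_drivers=1):
    """Peak L*di/dt voltage (in volts) when n_drivers switch together.

    Each driver current is assumed to ramp from zero to peak_current in
    edge_time, so di/dt = n_drivers * peak_current / edge_time.
    """
    di_dt = n_drivers * peak_current / edge_time
    return inductance * di_dt


if __name__ == "__main__":
    # Eight drivers of 10 mA each through an assumed 5 nH supply inductance.
    for edge in (1e-9, 4e-9):
        bounce = supply_bounce(5e-9, 10e-3, edge, n_drivers=8)
        print(f"edge time {edge * 1e9:.0f} ns: bounce = {bounce:.2f} V")
```

In this simple model, slowing the edges by a factor of four reduces the bounce by the same factor, which is the reasoning behind controlling the rise and fall times at the pads.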

## 3.6 CMOS process variation

In CMOS technology, process variation is caused by the inability to precisely control the fabrication steps. This leads to variation in silicon process parameters such as device dimensions 'W/L', oxide thickness '$t_{ox}$' and dopant concentration. The limited resolution of the photo-lithographic process causes a variation in the device dimensions 'W' and 'L' [178]. The channel length 'L' also varies randomly across the width of the device, the so-called line edge roughness (LER), which is caused by statistical variations in the photon count or imperfections during photoresist removal [179]. The variations in 'W' and 'L' are uncorrelated, as 'W' is defined by the field oxide step and 'L' by the gate and source/drain diffusion steps of the fabrication process.



Figure 3.6: Average number of dopant atoms versus CMOS process nodes. This figure is adapted from Ref.[178]

The variation in threshold voltage '$V_t$' is due to changes in oxide thickness, in the impurity level of the substrate and poly-silicon, and to statistical variations in implant doping. In the MOS transistor, the channel region is doped with impurity atoms for threshold voltage adjustment by the technique called 'dopant implantation'. Here, the random variation in the number of impurity atoms, known as random dopant fluctuation (RDF)[180], causes a shift in threshold voltage. This variation is more dominant in modern (nano-scale) CMOS processes, as the number of dopant atoms in the channel region keeps reducing with the process node, as shown in Fig.3.6. Therefore, a slight variation in the small number of atoms in the channel region causes a significant shift in threshold voltage.

The process variations are categorized into global and local as per their impact on device performance. Global variations, such as those in oxide thickness and dopant concentration, appear equally for all devices. The lot-to-lot, wafer-to-wafer and die-to-die variations fall into this category[181]. In the case of local variations, each device within the die is affected individually by process induced variations. Local variations can be further subdivided into systematic and random. Systematic variations are caused by lithographic aberrations and are spatially correlated; therefore, nearby devices share the same device parameters. RDF and LER induce random variations, which are spatially uncorrelated; therefore, the randomly varying parameters of a transistor differ from those of its immediate neighbors.
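The distinction between global and random local variation can be illustrated with a small statistical experiment. In the sketch below, which is only a toy model, each die receives one shared threshold voltage shift and each device an additional independent shift; the spatially correlated systematic component is deliberately left out, and the sigma values and names are assumptions.

```python
# Illustrative sketch: global (per-die) versus random local (per-device)
# threshold-voltage variation. Sigma values and names are assumptions.
import random
import statistics


def sample_die(rng, vt_nominal, sigma_global, sigma_local, n_devices):
    """Threshold voltages of n_devices that share one global shift."""
    die_shift = rng.gauss(0.0, sigma_global)
    return [vt_nominal + die_shift + rng.gauss(0.0, sigma_local)
            for _ in range(n_devices)]


def mismatch_and_spread(n_dies=4000, vt_nominal=0.6, sigma_global=30e-3,
                        sigma_local=5e-3, seed=1):
    """Return (sigma of neighbour mismatch, sigma of absolute V_t)."""
    rng = random.Random(seed)
    neighbour_diff = []
    absolute = []
    for _ in range(n_dies):
        vt_a, vt_b = sample_die(rng, vt_nominal, sigma_global, sigma_local, 2)
        neighbour_diff.append(vt_a - vt_b)   # global shift cancels here
        absolute.append(vt_a)
    return statistics.pstdev(neighbour_diff), statistics.pstdev(absolute)
```

The mismatch between two neighbouring devices depends only on the local component (about $\sqrt{2}$ times the assumed local sigma), whereas the absolute value of $V_t$ across dies is dominated by the global component.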

#### 3.6.1 Impact of process variation on performance of TDC

The variation in device dimension 'W/L', threshold voltage '$V_t$' and device transconductance 'K' causes a variation in device current. This affects digital circuits by changing the propagation delay of a logic gate, as it is inversely proportional to the device current. The impact of local variations in these parameters appears as a mismatch in the propagation delay of neighboring identical gates. This mismatch in propagation delay can be found from the partial derivatives of the delay equation (3.17)[182]. Here '$K_n$' and '$K_p$' are the transconductances of the NMOS and PMOS devices respectively, and $\alpha < 2$ holds for a short channel device.

$$t_{p} = \frac{1}{2} \left( \frac{C_{load} \times V_{dd}}{K_{n} (V_{dd} - V_{tn})^{\alpha}} + \frac{C_{load} \times V_{dd}}{K_{p} (V_{dd} - V_{tp})^{\alpha}} \right)$$
(3.17)

The result of the differentiation is given by-

$$\sigma_{tp}^{2} = \frac{t_{p}^{2}}{2} \left( \frac{2\sigma_{C_{load}}^{2}}{C_{load}^{2}} + \frac{\alpha^{2}\sigma_{V_{tp}}^{2}}{\left(V_{dd} - V_{tp}\right)^{2}} + \frac{\sigma_{K_p}^{2}}{K_{p}^{2}} + \frac{\sigma_{K_n}^{2}}{K_{n}^{2}} + \frac{\alpha^{2} \times \sigma_{V_{tn}}^{2}}{\left(V_{dd} - V_{tn}\right)^{2}} \right)$$
(3.18)

Where $\sigma_{V_{tn}}^2$, $\sigma_{V_{tp}}^2$, $\sigma_{K_n}^2$, $\sigma_{K_p}^2$ and $\sigma_{C_{load}}^2$ are the variances of $V_{tn}$, $V_{tp}$, $K_n$, $K_p$ and $C_{load}$ respectively.
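The delay spread can also be cross-checked by sampling equation (3.17) directly instead of using its partial derivatives. The following Monte Carlo sketch perturbs $V_{tn}$, $V_{tp}$, $K_n$ and $K_p$ with independent Gaussian errors while keeping $C_{load}$ fixed. The nominal values, the sigma values and the function names are assumptions made for illustration and do not represent the process used in this work.

```python
# Illustrative sketch: Monte Carlo sampling of the delay equation (3.17).
# Nominal values and sigmas are assumptions, not process data.
import random
import statistics

NOMINAL = {"c_load": 20e-15, "kn": 2e-4, "kp": 1e-4,
           "vtn": 0.5, "vtp": 0.65, "alpha": 1.5}


def propagation_delay(c_load, vdd, kn, kp, vtn, vtp, alpha):
    """Equation (3.17); vtp is taken as the magnitude of the PMOS threshold."""
    t_fall = c_load * vdd / (kn * (vdd - vtn) ** alpha)
    t_rise = c_load * vdd / (kp * (vdd - vtp) ** alpha)
    return 0.5 * (t_fall + t_rise)


def relative_delay_sigma(vdd, sigma_vt=0.02, rel_sigma_k=0.01,
                         n_samples=5000, seed=7):
    """Monte Carlo estimate of sigma_tp / mean(t_p) for local mismatch."""
    rng = random.Random(seed)
    delays = []
    for _ in range(n_samples):
        delays.append(propagation_delay(
            c_load=NOMINAL["c_load"],
            vdd=vdd,
            kn=NOMINAL["kn"] * (1.0 + rng.gauss(0.0, rel_sigma_k)),
            kp=NOMINAL["kp"] * (1.0 + rng.gauss(0.0, rel_sigma_k)),
            vtn=NOMINAL["vtn"] + rng.gauss(0.0, sigma_vt),
            vtp=NOMINAL["vtp"] + rng.gauss(0.0, sigma_vt),
            alpha=NOMINAL["alpha"],
        ))
    return statistics.pstdev(delays) / statistics.mean(delays)
```

Comparing `relative_delay_sigma(2.0)` with `relative_delay_sigma(3.3)` shows a smaller relative spread at the higher supply voltage for these assumed values, because the threshold voltage errors are divided by a larger overdrive $V_{dd} - V_t$.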



Figure 3.7: Standard deviation in the delay for chain of buffer due to local mismatch

Equation (3.18) shows that, to reduce the variation, the device transconductance ($K_n$ or $K_p$), $C_{load}$ and $V_{dd}$ should be large. The standard deviation $\sigma_{tp}$ of the delay of each buffer in the cascaded chain accumulates over the entire length, as shown in Fig.3.7. The total delay variation after the $N^{th}$ buffer is given by equation (3.19)[178]. This delay variation manifests itself as a non-linearity error in the characteristics of a TDL based TDC.

$$\sigma_{tpN} = \sqrt{N}\,\sigma_{tp} \tag{3.19}$$
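For independent, identically distributed stage errors the variances add, so the standard deviation of the accumulated delay grows with the square root of the number of stages. The sketch below checks this with a simple Monte Carlo experiment; the nominal stage delay, the stage sigma and the function name are hypothetical.

```python
# Illustrative sketch: accumulation of independent stage-delay errors along a
# buffer chain. Stage delay and sigma values are assumptions.
import random
import statistics


def chain_sigma(n_stages, sigma_stage, nominal_delay=100e-12,
                n_chains=4000, seed=3):
    """Standard deviation of the total delay after n_stages buffers."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_chains):
        total = sum(rng.gauss(nominal_delay, sigma_stage)
                    for _ in range(n_stages))
        totals.append(total)
    return statistics.pstdev(totals)


if __name__ == "__main__":
    for n in (16, 64):
        print(f"N = {n}: sigma = {chain_sigma(n, 2e-12) * 1e12:.1f} ps")
```

With an assumed stage sigma of 2 ps, the estimate after 64 stages is close to 16 ps, i.e. $\sqrt{64}$ times the single-stage value rather than 64 times.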

Local variations become more pronounced with the scaling of CMOS technologies. This is because, with decreasing transistor dimensions, the standard deviations of the threshold voltage and the transconductance factor increase, since they are proportional to the inverse of the square root of the active device area[91]. These issues therefore need to be taken into account carefully during the design phase to achieve a high performance TDC.

The global variation in gate delay affects the offset and resolution of the TDC. By performing Monte Carlo (MC) statistical simulation of the transistor level circuit, the impact on delay variation can be observed across the process corners[183]. For instance, the standard deviation of inverter delay due to process variation in the 0.35 $\mu$m CMOS process is shown in Fig.3.8. As MC simulations are computationally intensive and hence hardly feasible for large designs, corner simulations are typically used to evaluate the impact of global variations. The corners represent extreme cases, where the devices deviate the most from their nominal (typical) characteristics. For the fast corner 'WP', all process fluctuations increase the drive current of a transistor, leading to maximum speed. At the slow corner 'WS', a device is slowed down the most by the process variations. Besides the fast and slow corners, cross corners also exist, with maximum PMOS and minimum NMOS speed and vice versa. Cross corners are critical in the design of a digital or mixed signal TDC.



Figure 3.8: Effect of global variations on the gate delay in 0.35 $\mu$m CMOS process at 27 $^{\circ}$C and 3.3 V using Monte Carlo simulator

## 3.7 Aspect of delay lock loop (DLL) in TDC

The sensitivity of TDC resolution to global process variations poses a challenge for the design and development of an accurate TDC. In order to achieve process portability, a delay variation compensation circuit, the 'DLL' [158, 184, 185], is integrated along with the time interval measurement circuitry. The working principle of the DLL is based on a 'servo mechanism', where the control loop has three main functions: it compares the delayed clock edge with the reference clock, converts the phase difference into a pulse of equivalent duration, and integrates this duration into a voltage that is held until new information arrives. This functionality is implemented with the help of a 'voltage controlled delay line (VCDL)', 'phase detector (PD)', 'charge pump (CP)' and 'filter capacitor (C)', as shown in Fig.3.9.



Figure 3.9: Block diagram of DLL

The VCDL is realized by a cascaded chain of 'N' voltage controlled delay elements. It delays the applied clock by an amount '$N \times T_d$', where '$T_d$' is the unit delay provided by the delay element used. Using the PD, the delayed output clock is compared with the rising edge of the reference clock corresponding to the next cycle. If the rising edge of the delayed clock lags the rising edge of the reference, the 'DN' output of the PD goes 'high' for a duration equivalent to the phase error. The 'UP' signal is maintained at the supply voltage so that it keeps the switch $S_1$ off. The 'DN' signal turns on the switch $S_2$ of the CP, which discharges the filter capacitor through the constant bias current. This decreases the voltage across the capacitor, which results in a slight reduction in the delay '$T_d$' (if the VCDL delay increases with the control voltage) and hence in the VCDL delay. The phase error is thus reduced, as the time gap between the rising edges of the two clocks becomes smaller. After a few iterations of loop correction, the rising edges of both clocks coincide. This locks the VCDL delay to one clock period of the reference clock.

On the contrary, if the rising edge of the delayed clock leads the rising edge of the reference clock, the 'UP' output of the PD goes low for a duration equivalent to the phase error. The 'DN' signal remains at gnd and keeps the switch $S_2$ off. The 'UP' signal turns on the switch $S_1$, which charges the filter capacitor through the constant bias current. This increases the voltage across the capacitor, which increases the delay of the VCDL so that after some iterations the rising edges of both clocks coincide. On coincidence, the PD does not detect any phase error; therefore, the control voltage '$V_{ctrl}$' ideally remains steady.

The slope of delay versus control voltage characteristics of delay element used to realize VCDL may be positive or negative, depending on the design criteria of delay element. The PD is designed so that its functionality is in agreement with slope of delay characteristic to fulfil the requirements of negative feedback loop of DLL.
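The locking process described above can be reproduced with a cycle-by-cycle behavioural model. The sketch below is not the implemented circuit: it assumes an ideal PD without dead zone, a linear VCDL characteristic with positive slope, one charge pump correction per reference cycle, and hypothetical values for the number of stages, pump current, filter capacitor and delay gain.

```python
# Behavioural sketch of the DLL loop of Fig.3.9.
# Assumptions (not from the thesis): ideal PD, linear VCDL characteristic
# with positive slope, and all numeric values below.


def vcdl_delay(v_ctrl, n_stages=32, t_offset=200e-12, gain=100e-12):
    """Total VCDL delay N*T_d with T_d = t_offset + gain * v_ctrl."""
    return n_stages * (t_offset + gain * v_ctrl)


def simulate_dll(t_ref=10e-9, v_start=2.0, i_pump=100e-6,
                 c_filter=5e-12, n_cycles=500):
    """Return a list of (v_ctrl, vcdl_delay) pairs, one per reference cycle."""
    v_ctrl = v_start
    history = []
    for _ in range(n_cycles):
        # Positive error: delayed clock lags, 'DN' is active (discharge).
        # Negative error: delayed clock leads, 'UP' is active (charge).
        phase_error = vcdl_delay(v_ctrl) - t_ref
        v_ctrl -= i_pump * phase_error / c_filter
        history.append((v_ctrl, vcdl_delay(v_ctrl)))
    return history


if __name__ == "__main__":
    v_final, delay_final = simulate_dll()[-1]
    print(f"V_ctrl = {v_final:.4f} V, VCDL delay = {delay_final * 1e9:.4f} ns")
```

For these assumed values the loop settles at the control voltage for which $N \times T_d$ equals the reference period, whether it starts from a lagging or a leading condition. A delay element with negative slope would require the opposite PD polarity to keep the feedback negative.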

## 3.8 Description of DLL design blocks

#### 3.8.1 Voltage controlled delay element

Voltage controlled delay elements are used in the DLL due to their capability of fine-grain delay variation. These delay elements are categorized into two groups: full swing elements based on a single ended architecture[185, 182] and partial swing elements based on a differential architecture[188, 189].

#### 3.8.1.1 Full swing delay element

The Current Starved Inverter (CSI) and the RC loaded inverter fall in the category of full swing delay elements. As shown in Fig.3.10, the CSI[185, 182, 187] is realized by adding voltage controlled PMOS or NMOS transistors, or both, in series with the pull-up path, the pull-down path, or both paths of a CMOS inverter. This is followed by another high strength CMOS inverter to improve the transition time of the output signal as well as to restore the polarity of the applied clock.

The choice of added transistors depends on whether the delay is to be introduced in the falling transition, the rising transition, or both transitions of the applied signal. The propagation delay of the CSI (Fig.3.10(a)) is controlled by varying the rate of discharge of the load capacitor at node X. This is achieved by varying the current through transistor '$N_2$'. Due to the second inverter at the output node, the delay is introduced in the rising edge transition of the applied signal. To achieve delay in both edge transitions, a PMOS transistor '$P_1$' is also added in series with the pull-up transistor '$P_2$', as shown in Fig.3.10(b). In this CSI architecture, two separate control voltages for PMOS and NMOS are needed to control the falling and rising edge transition delays. To avoid the use of two control voltages, another CSI architecture is used, in which only NMOS voltage controlled transistors ('$N_2$' and '$N_4$') with a common control voltage introduce the delay in both transitions, as shown in Fig.3.10(c).



Figure 3.10: Full swing delay elements (a) CSI with delay in rising edge transition (b) CSI with delay in both edge transitions (c) CSI realized by NMOS transistors with delay in both edge transition (d) RC loaded inverter



Figure 3.11: Pre-layout delay versus control voltage characteristics of full swing delay elements

In the RC loaded delay element[184], shown in Fig. 3.10(d), the transistor '$N_1$' acts as a voltage controlled linear resistor. It defines the charging (or discharging) current for the capacitor implemented by MOS transistor '$N_2$' with its source and drain terminals shorted to ground. In this type of delay element the capacitor occupies a large silicon area.

The full swing delay elements based on static CMOS logic (discussed in section 3.2) exhibit negligible static power dissipation and full rail-to-rail swing. However, their delay regulation ranges differ, as shown in Fig. 3.11, where the CSI has a wider delay regulation range than the RC loaded inverter. The delay regulation range impacts the performance of the DLL in terms of its portability across PVT variations. Also, the CSI delay versus control voltage characteristic has higher gain and more non-linearity than that of the RC loaded inverter. Usually, the delay characteristic is adjusted so that the gain at the target delay is small, as noise on the control voltage easily disturbs the loop stability if the target delay is biased in the high gain region. Also, the least achievable delay in these architectures is limited by the RC parasitics of the MOS technology used. The propagation delay is also highly sensitive to power supply noise.

#### 3.8.1.2 Differential delay element

The differential delay element[188, 189], based on CML (discussed in section 3.2) is preferred due to high speed and its common mode noise immunity. It consists of source coupled differential pair with loads and biasing tail current source ( $I_{ss}$ ). The load may be resistive or diode connected depending on the design considerations such as output swing, control over the delay and supply noise rejection.

For diode connected loads, as shown in Fig.3.12 (a), the control voltage varies the tail current, which varies the equivalent resistance '$r_{out}$' of the load transistors '$P_1$' and '$P_2$' as per equation (3.20). This leads to a variation in propagation delay as per equation (3.21), where $C_{load}$ is the load capacitance at the output node. This architecture has several drawbacks, such as limited voltage headroom, which limits the maximum output swing, and an uncontrolled output dc voltage.

$$r_{out} \approx \frac{1}{g_m} = \sqrt{\frac{1}{2 \times \mu_n \times C_{ox} \times W/L \times I_{ss}}}$$
(3.20)

$$T_d = \sqrt{2} \times r_{out} \times C_{load} = \sqrt{\frac{1}{\mu_n \times C_{ox} \times W/L \times I_{ss}}} \times C_{load}$$
(3.21)


Figure 3.12: Differential delay elements (a) diode connected loads (b) linear loads

In case of triode loads, as shown in Fig.3.12(b), the MOS transistors '$P_1$' and '$P_2$' are biased in the linear operating region by the control voltage. The propagation delay is controlled by varying the on-resistance of the load transistors through the gate overdrive voltage $V_{dd} - V_{ctrl} - V_{tp}$, as per equation (3.22). The expression for the propagation delay is given by equation (3.23). The maximum output voltage of this architecture is '$V_{dd}$'; however, maintaining the active PMOS load transistors in the triode region is difficult.

$$r_{out} \approx \frac{L}{\mu_p \times C_{ox} \times W \times (V_{dd} - V_{ctrl} - V_{tp})}$$
(3.22)

$$T_d = \sqrt{2} \times r_{out} \times C_{load} = \frac{\sqrt{2}C_{load}L}{\mu_p \times C_{ox} \times W \times (V_{dd} - V_{ctrl} - V_{tp})}$$
(3.23)
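Equations (3.20) to (3.23) can be evaluated directly to see how the control quantity sets the delay in each load configuration. The sketch below is only a numerical illustration: the product of mobility and oxide capacitance is passed as a single assumed parameter `mu_cox`, and the function names and any example values are hypothetical.

```python
# Illustrative sketch of equations (3.20)-(3.23) for the differential delay
# element. Function names and example values are assumptions.
import math


def diode_load_delay(i_ss, c_load, mu_cox, w_over_l):
    """Equations (3.20)-(3.21): delay with diode connected loads."""
    r_out = math.sqrt(1.0 / (2.0 * mu_cox * w_over_l * i_ss))
    return math.sqrt(2.0) * r_out * c_load


def triode_load_delay(v_ctrl, c_load, mu_cox, w_over_l, vdd, vtp):
    """Equations (3.22)-(3.23): delay with triode (linear) loads."""
    overdrive = vdd - v_ctrl - vtp
    if overdrive <= 0.0:
        raise ValueError("load is off: vdd - v_ctrl - vtp must be positive")
    r_out = 1.0 / (mu_cox * w_over_l * overdrive)
    return math.sqrt(2.0) * r_out * c_load
```

With diode connected loads the delay scales as $1/\sqrt{I_{ss}}$, so quadrupling the tail current only halves the delay, whereas with triode loads the delay is inversely proportional to the overdrive voltage of the load transistors.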

#### 3.8.2 Phase detector

The phase detector (PD)[190] generates a digital pulse with a duration equivalent to the phase difference between the delayed and reference signals. The crucial performance metrics in this respect are 'dead zone' and 'dynamic range'. The dead zone limits the response of the PD when the phase difference between the two clocks is too small to be detected. It appears as a static phase error after the DLL achieves lock. The dynamic range defines the maximum phase difference that the PD can detect and respond to. Several PD architectures, such as the phase frequency detector (PFD), XOR gate and true single phase clock (TSPC)[185], have been reported earlier with different performance in terms of dead zone and dynamic range.

The phase frequency detector (PFD) is realized using two D flip-flops and a NAND gate, as shown in Fig.3.13(a). This PD works on the edges of the input clocks and therefore does not depend on their duty cycle. As shown in Fig.3.13(b), when the delayed signal lags the reference, the width of the 'UP' signal is larger than that of 'DN', and vice versa. The difference in their widths is equivalent to the phase difference between the two clocks. When the phases of both clocks are identical, 'UP' and 'DN' have equal widths. This opens a direct current path from '$V_{dd}$' to ground in the charge pump.


Figure 3.13: PFD based phase detector (a) schematic diagram (b) timing diagram

The difference between the clock period and the reset path delay ($\approx$ 100's of ps) defines the dynamic range of the PFD. For a small phase difference, the pulse width of the 'UP' or 'DN' signal is too small to trigger the charge pump, which results in a static phase error. To address this issue, an additional delay is introduced in the reset path, which enhances the pulse duration. On the other hand, the additional delay reduces the dynamic range of the PFD, which may cause an issue of missing edge (refer chapter-7) during loop correction in the DLL.

In the XOR gate based PD, as shown in Fig.3.14, the dc average value of the output pulses is proportional to the phase difference of the two input signals and becomes zero when both are 90° out of phase. As a result, an XOR PD is often used in quadrature locking, where the two input signals to the PD are 90° out of phase in the locked state. The simple XOR PD has two limitations: a single output and signal level detection. The single output is difficult to interface with the subsequent CP circuit. This issue is addressed by implementing the XOR based PD in dynamic logic style[158], as shown in Fig.3.14(a). The signal level detection makes the output dependent on the duty cycle of the input signals. As a consequence, it can give wrong phase difference information if a duty cycle corrector is not used. The dynamic range is less than that of the PFD and is defined by half the clock period, with both clocks having 50% duty cycle.
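The duty cycle dependence of level-sensitive XOR detection can be seen with a simple sampled model. The sketch below assumes ideal square waves sampled at a fixed number of points per period and reports the average XOR output as a fraction of the logic swing, so that the quadrature condition corresponds to the mid-level value of 0.5; the function name and sampling scheme are hypothetical.

```python
# Illustrative sketch: average output of an ideal XOR phase detector.
# Assumptions (not from the thesis): ideal square waves, integer sampling.


def xor_pd_average(shift, period=1000, duty_ref=0.5, duty_dly=0.5):
    """Average XOR output (0..1) for a delay of `shift` samples per period."""
    high_ref = round(period * duty_ref)
    high_dly = round(period * duty_dly)
    ones = 0
    for i in range(period):
        ref = (i % period) < high_ref
        dly = ((i - shift) % period) < high_dly
        ones += ref != dly          # XOR of the two logic levels
    return ones / period


if __name__ == "__main__":
    for shift in (0, 250, 500):
        print(f"shift = {shift / 1000:.2f} T: average = {xor_pd_average(shift):.2f}")
    print("quadrature, 30 % duty on delayed clock:",
          xor_pd_average(250, duty_dly=0.3))
```

For 50% duty cycle on both inputs the average rises linearly from 0 at zero shift to 1 at half a period and equals 0.5 in quadrature. With a 30% duty cycle on the delayed clock, the same quadrature shift gives 0.3 instead, which would be interpreted as a phase error.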



Figure 3.14: XOR logic based PD (a) schematic diagram (b) timing diagram. This figure is adapted from Ref.[158]

The TSPC PD is based on the dynamic logic style. As shown in Fig.3.15, the basic structure of a TSPC PD includes two blocks that generate the 'UP' and 'DN' signals. The two blocks have exactly the same design except that the two input signals are swapped. Each block consists of two cascaded stages with a pre-charge PMOS in each stage. The pre-charge activity of the second stage is often controlled by the output of the first stage. The dynamic PD eliminates flip-flops and has the advantages of a simple structure and a fast transition time, which results in a small dead zone; it is therefore used in the design of high speed DLLs. The dead zone can be reduced further by increasing the aspect ratio of the MOS transistors, at the price of a high peak transient current. The timing diagram of phase error detection is shown in Fig.3.16, where the dynamic range is limited to half the clock period.



Figure 3.15: Schematic diagram of true single phase clock based PD



Figure 3.16: Timing diagram of TSPC based PD (a) when delayed clock lags reference clock (b) when delayed clock leads reference clock

#### 3.8.3 Charge pump and filter capacitor

The phase difference between the reference and the delayed output is sensed by the phase detector and converted into voltage pulses (UP and DN) whose duration is equivalent to the phase error. These voltage pulses are applied to the charge pump, which adds charge to or removes charge from the filter capacitor for the duration of the pulse. It thereby adjusts the voltage across the capacitor, which alters the delay of the VCDL.

The basic structure (refer Fig.3.9) of the charge pump consists of two switches '$S_1$' and '$S_2$' associated with the current source and current sink respectively. The switches are controlled by the 'UP' and 'DN' signals provided by the PD respectively. When a switch is closed, the current source (or sink) adds charge to (or removes charge from) the filter capacitor. This process continues until the error signal (pulse width of the UP or DN signal) ideally becomes zero, resulting in a stable control voltage across the filter capacitor. With a PFD based PD, after the locked state is achieved, the CP carries a rail-to-rail current for the on duration of the switches in each cycle of the reference clock. The control voltage does not fluctuate, and jitter is minimized, only if the charging and discharging currents are equal. Along with matching of the currents, this also requires matching of the on duration '$t_{on}$' of both switches. The phase offset '$\Phi_{offset}$' due to a mismatch '$\Delta i$' in the charging and discharging current '$I_{cp}$' is given by [191]-

$$\Phi_{offset} = 2\pi \frac{t_{on}}{T_{ref}} \frac{\Delta i}{I_{cp}}$$
(3.24)

This equation shows that, to minimize the phase offset (and hence the jitter) after locking, the on duration should be short or the charge pump current should be large relative to the current mismatch.
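As an illustration of equation(3.24), the short sketch below evaluates the phase offset for a set of purely hypothetical values (an on duration of 200 ps, a 25 ns reference period and a 5 % current mismatch); none of these numbers are taken from the design, and the function name is an assumption made for the example.

```python
import math


def phase_offset_rad(t_on_ns: float, t_ref_ns: float,
                     delta_i_ua: float, i_cp_ua: float) -> float:
    """Phase offset of equation (3.24): 2*pi * (t_on / T_ref) * (delta_i / I_cp)."""
    return 2.0 * math.pi * (t_on_ns / t_ref_ns) * (delta_i_ua / i_cp_ua)


# Hypothetical example values, chosen only to show the scaling.
offset = phase_offset_rad(t_on_ns=0.2, t_ref_ns=25.0, delta_i_ua=5.0, i_cp_ua=100.0)
print(f"offset = {offset * 1e3:.3f} mrad")

# Halving the on duration halves the offset.
halved = phase_offset_rad(t_on_ns=0.1, t_ref_ns=25.0, delta_i_ua=5.0, i_cp_ua=100.0)
print(f"offset with half t_on = {halved * 1e3:.3f} mrad")
```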

Single ended and differential are the two types of charge pumps used in the design of DLLs [191]. A single-ended topology has the advantages of smaller area and lower power dissipation, but is more vulnerable to supply and substrate noise than a differential topology. There are three basic configurations for a single-ended CP: switching in the source, drain and gate, as shown in Fig.3.17. In these architectures, the 'UP' and 'DN' signals are processed with the help of an inverter to control the switches, as shown in Fig.3.17(d). The delay of the inverter causes a mismatch in the on duration of the switches, which disturbs the control voltage with a PFD based PD. To suppress this effect, a resistor implemented by CMOS switches is used to balance the delay. Also, dummy transistors are added to match the operating conditions of the switches.



Figure 3.17: Three single ended charge pump configurations (a) drain switching (b) source switching (c) gate switching (d) suppression of skew in UP and DN signals by CMOS resistor

Further, the drain switching architecture (Fig.3.17(a)) suffers from charge sharing between the common drains of the switches and the loop filter when a switch is on (closed). A structure with an active unity-gain amplifier was proposed in [192] to address this problem. Another enhancement to the basic CP is the addition of two current steering switches [193], which improve the switching speed. A third technique [191] is to use only NMOS switches to avoid the mismatch

between NMOS and PMOS currents.

A fully differential CP, as shown in Fig.3.18, consists of a set of NMOS & PMOS switches, two loop filters, and some common-mode feedback circuitry. Although differential CPs are not as widely used as single-ended CPs, they do possess unique advantages- (1) the fully differential structure offers better immunity to common-mode noise sources such as supply and substrate noise (2) the output voltage range can be doubled if the voltages of both loop filters are used. These advantages are achieved at the expense of double chip area and higher power dissipation.

Figure 3.18: Schematic diagram of differential charge pump

The loop filter capacitor has two functions: one is to generate the control voltage and the other is to reduce the noise ripple on the control voltage. In a DLL, a large value capacitor is used to implement this functionality.

## 3.9 Full custom ASIC design flow

The flow of full custom analog ASIC design is briefly given below-

- The first phase is technology identification, based on the support for passives, low noise characteristics, design kit integration in the available CAD tools and finally the fabrication cost with multi-project wafer support.
- The second phase is the partitioning of the design into blocks and sub-topologies. These topologies are translated to circuits and then simulated at transistor level with SPICE simulators, using the SPICE models (BSIM3), for the required performance. This allows the sizing of the MOS devices to be frozen. At this stage of the design, the tentative layout features are decided and embedded in the circuit as parameters such as length, width, area and perimeter of source and drain, along with the number of gate fingers. Such an approach gives a smaller deviation with respect to the post layout simulation.
- After the circuit is finalized, the layout is crafted for a typical mask design of about ten levels. The layout is then checked for design rule errors (DRC) and compared for equivalence with the circuit (LVS, layout versus schematic).
- After clearing DRC/LVS, the circuit and its parasitics are extracted from the layout. The extracted views are tool dependent and independent of the initially drawn circuit. The extracted view is simulated to check whether the desired specifications are met across the process corners.
- The design mask data is shipped to the manufacturer in GDS/CIF file format for IC fabrication.

The whole process is iterative and time consuming. Because the layout is hand crafted, the effort and time needed are very high.

## Part III

**Research Work** 

### Chapter 4

## Design and Implementation of CMOS Standard Cell based Vernier Time-to-Digital Converter

#### 4.1 Introduction

In the INO experiment, the time interval between the discriminator output signal and the trigger needs to be measured. This corresponds to the time interval between the interaction of a neutrino induced muon with the RPC and the trigger. It provides up and down direction discrimination of the neutrino particle traversing the layers of the ICAL detector. For time interval measurement, a TDC needs to be designed to the specifications provided by the INO working group. The resolution required for the TDC is specified as better than 200 ps. To account for the trigger latency, the dynamic range required for the TDC is set to 1 $\mu$s. Further, to cater to millions of RPC detector channels, a TDC with good area and power efficiency is required. In the design and development of the TDC, the crucial step is the choice of a time interval measurement technique that can fulfill these specifications with 0.35 $\mu$m CMOS technology.

Various time interval measurement techniques with their relevant TDC ASICs are discussed in chapter-2. The sub-gate delay time interval measurement techniques are a good choice, as they remove the dependency of resolution on the capability of the available technology. Among them, the Vernier ring technique is a simple one that uses two triggerable oscillators, '*start*' and '*stop*', with a slight difference in their time-periods. The difference in the time-periods of the two oscillators sets the resolution of the TDC. The time interval measurement requires counting the number of cycles of the start and stop oscillators until their phase coincidence is achieved. The technique provides a resolution of the order of 10's of ps, low noise and an area efficient design. However, the key design challenges in the Vernier technique with CMOS technology are-

- Design of start and stop oscillators with precise time period difference (<200 ps).
- Stable oscillator frequencies across PVT variations.
- Issues of noise coupling among oscillators through RC parasitics
- Matching in resolution among channels, in presence of random and systematic process induced variations in device parameters

To overcome the design challenges of stabilizing the oscillator frequencies while maintaining a small difference, a CMOS TDC [143] was reported earlier that uses dual PLLs with different frequency division ratios, referenced to an accurate off-chip clock. The bias voltages are provided by their control loops to tune and stabilize the frequency of the voltage controlled oscillators. However, the use of PLLs adds high design complexity and additional power requirement. Also, time interval measurement in the INO experiment would require both PLLs, along with their corresponding oscillators, to run constantly to keep their frequencies stable. This in turn leads to higher power consumption and noise generation across millions of TDC channels. In addition, the DLL/PLL based delay variation compensation techniques pose high layout design challenges and are power & area inefficient. Thus, the problem statement is to develop a low noise Vernier TDC that fulfills the time interval measurement specifications for INO while using a simple and elegant design idea. In this endeavor, a 4-channel Vernier ring oscillator based TDC ASIC with inbuilt SPI based read-out is designed and implemented using the standard cells available in the PDK of 0.35 $\mu$m CMOS technology. The pre-optimized standard cells (in terms of switching speed, power and area consumption) provided by the foundry reduce the design effort, time and complexity. Moreover, the standard cell based digital TDC design is scalable to modern CMOS processes in view of improved power and area efficiency.

Further, with the aim of low noise, reduced design complexity and high power & area efficiency, event triggerable ring oscillators and their time period calibration circuits are carefully designed. To obtain a slight difference in the frequencies of standard cell based ring oscillators, one method is to use the tiny mismatch in place and route (P & R) of the standard cells in the ring oscillators. However, due to the unpredictable P & R strategy used by automation tools, matching among the different oscillator channels in our multi-channel TDC is difficult to achieve. This leads to a spread in resolution across channels. Therefore, in this work, a small frequency difference is achieved by using feedback gates with different fan-in in the two ring oscillators. To circumvent the impact of variations in process (5.6 % variation in $t_{ox}$ and 30 % variation in $V_t$), temperature and supply voltage on the accuracy of time interval measurement, the designed digital time-period calibrator calibrates the time-periods of the ring oscillators. These calibrated values are used to evaluate the time interval, which ensures the accuracy of the measurement.

Moreover, this TDC ASIC design approach is based on a mixed design flow, in which a manual P & R layout approach is used to control the route delays and keep them identical for the 'start' and 'stop' oscillators. Owing to the differential nature of the Vernier technique, this reduces the variation in resolution among channels. Also, to minimize the spread in resolution across TDC channels, inter-channel variations are reduced by placing the channels in close proximity while applying methods to minimize the noise coupling.

The calibration circuit and the SPI based read-out logic are implemented using a digital design approach, which reduces the design effort and time. These blocks are designed in Verilog HDL and synthesized using 'Build gates' to produce a gate level netlist. The layout of the gate level netlist is designed by automatic P & R using 'SOC encounter'. The top level integration is carried out using a custom layout editor tool.

#### 4.2 Architecture of TDC ASIC

This ASIC consists of four TDC channels interfaced with common SPI based readout block as shown in Fig.4.1. Each channel has separate inputs '*start*' and '*stop*' to achieve its independent performance. This enables the utilization of ASIC in both 'common start' and 'common stop' mode. In common start mode, the '*trigger*' starts the time interval measurement in all four channels. In common stop mode, '*trigger*' stops the time interval measurement in all four channels. Each TDC channel includes start & stop ring oscillators, coarse & fine counters and leading edge phase detector.

To make oscillator's time period calibration independent of time interval measurement, the replicas of start and stop ring oscillators are designed in the calibration channel along with calibration circuit. The calibration channel is interfaced with the separate SPI based data read-out circuit.

Both SPIs communicate independently with the micro-controller based external interface to transfer the measured and calibrated data.



Figure 4.1: Block diagram of Vernier four channel TDC ASIC

#### 4.2.1 Working of TDC channel

The 'start' and 'stop' inputs start the triggerable slow and fast ring oscillators with time periods '$T_{oscst}$' and '$T_{oscsp}$' respectively, as shown in Fig.4.2. Here, '$T_{oscsp}$' is designed to be slightly less than '$T_{oscst}$', so, on each cycle, the rising edge of the fast oscillator clock (spclk) approaches that of the slow one (stclk) by a step of '$T_{oscst} - T_{oscsp}$'. Eventually, either the two oscillator clocks have a phase coincidence or the rising edge of spclk leads that of stclk. This phase coincidence or phase lead is detected by the leading edge detector, followed by the assertion of end of conversion (eoc). The 'eoc' signal switches off the oscillators, which reduces the power consumption in the idle state. The elapsed cycles of the slow and fast oscillator clocks till phase coincidence are counted by the coarse and fine counters respectively.

The coarse counter counts the number '$N_c$' of slow oscillator cycles till phase coincidence. It provides a long measurement range of '$(2^b) \times T_{oscst}$', where 'b' is the number of bits of the coarse counter. The fine counter counts '$N_f$' cycles of the fast oscillator till phase coincidence. This corresponds to the number of steps of size '$T_{oscst} - T_{oscsp}$' (LSB) required within '$T_{oscst}$' till phase coincidence.

The time interval ($\Delta T$) measurement is given by-

$$\Delta T = (N_c - 1)T_{oscst} - (N_f - 1)T_{oscsp}$$
(4.1)

Here, one count is subtracted from  $N_c$  and  $N_f$ , due to the count considered during phase coincidence. The above equation can be rearranged as-

$$\Delta T = (N_c - N_f)T_{oscst} + (N_f - 1)(T_{oscst} - T_{oscsp})$$
(4.2)

Here, '$(N_c - N_f) \times T_{oscst}$' gives the coarse and '$(N_f - 1) \times (T_{oscst} - T_{oscsp})$' gives the fine time measurement. The applied time interval is measured accurately provided that $T_{oscst}$ and $T_{oscsp}$ are accurately known.
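The equivalence of equations (4.1) and (4.2) can be checked with a small numerical sketch. The counter values used below ($N_c = 15$, $N_f = 11$) are hypothetical, and the time periods are the typical designed values quoted later in this chapter; the function names are assumptions made for the example.

```python
import math

T_OSCST_NS = 7.382  # slow (start) oscillator period, typical designed value
T_OSCSP_NS = 7.268  # fast (stop) oscillator period, typical designed value


def interval_eq_4_1(n_c: int, n_f: int,
                    t_st: float = T_OSCST_NS, t_sp: float = T_OSCSP_NS) -> float:
    """Equation (4.1): (N_c - 1) * T_oscst - (N_f - 1) * T_oscsp, in ns."""
    return (n_c - 1) * t_st - (n_f - 1) * t_sp


def interval_eq_4_2(n_c: int, n_f: int,
                    t_st: float = T_OSCST_NS, t_sp: float = T_OSCSP_NS):
    """Equation (4.2): returns the (coarse, fine) contributions in ns."""
    coarse = (n_c - n_f) * t_st
    fine = (n_f - 1) * (t_st - t_sp)
    return coarse, fine


# Hypothetical counter values, for illustration only.
n_c, n_f = 15, 11
total = interval_eq_4_1(n_c, n_f)
coarse, fine = interval_eq_4_2(n_c, n_f)
print(f"eq. (4.1): {total:.3f} ns")
print(f"eq. (4.2): coarse = {coarse:.3f} ns, fine = {fine:.3f} ns")
```

For these counts both forms give 30.668 ns, split into 29.528 ns of coarse and 1.140 ns of fine measurement.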



Figure 4.2: Vernier ring oscillator method (a) block diagram (b) timing diagram

The conversion time $(C_T)$ of this technique is given by equation(4.3), where 'S' is the stretching factor, which defines the maximum number of counts that the fine counter needs to cover. The stretching factor depends on the frequencies of the ring oscillators; therefore its value varies across PVT variations.

$$C_T = S \times T_{oscsp} = \left(\frac{T_{oscst}}{T_{oscst} - T_{oscsp}}\right) \times T_{oscsp}$$
(4.3)

#### 4.2.2 Front-end design aspects

The critical design aspects to achieve the required dynamic range and resolution are-

- Number of bits in coarse and fine counters
- Frequency of ring oscillators

In this design, the target dynamic range is $\sim 1.4\ \mu$s; therefore, a 9-bit coarse counter is designed, of which 8 bits are used in the dynamic range calculation, with a start oscillator clock frequency of $\sim 135$ MHz. Another aspect is the frequency of the ring oscillators, which impacts the design at two points-

- The high frequency of oscillators reduces the conversion time of TDC channel but on the other side increases the dynamic power consumption
- The operating frequency of 9-bit counter has to be less than 200 MHz after taking into account the constraints imposed by the design kit

Therefore, the time-period of the slow oscillator is designed to be $T_{oscst} = 7.382$ ns. For the target resolution of better than 200 ps, the time-period of the fast oscillator is designed to be $T_{oscsp} = 7.268$ ns, so that the typical designed value of resolution is $T_{oscst} - T_{oscsp} = 114$ ps. The number of bits in the fine counter is chosen so that it can count at least $S = T_{oscst}/(T_{oscst} - T_{oscsp}) \approx 65$ cycles of spclk. However, across PVT variations, the value of 'S' varies from the typical designed value, in the range from 54 to 80. Therefore, to safely accommodate the variation in 'S', an 8-bit fine counter is chosen.

The conversion time of the TDC channel for these chosen values of $T_{oscst}$ and $T_{oscsp}$ is calculated in equation(4.4). This value (470.63 ns) is acceptable and is not of much significance, as the event rate in the INO experiment is low (~100 Hz).

$$C_T = \left(\frac{T_{oscst}}{T_{oscst} - T_{oscsp}}\right) \times T_{oscsp} = \left(\frac{7.382}{114 \times 10^{-3}}\right) \times 7.268\,\text{ns} = 470.63\,\text{ns}$$
(4.4)
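The arithmetic behind the resolution, the stretching factor and the conversion time can be reproduced with the short sketch below. It uses the typical designed time periods and multiplies the stretching factor by $T_{oscsp}$, as in equation(4.3); the variable names are assumptions made for the example.

```python
T_OSCST_NS = 7.382   # slow oscillator period (typical)
T_OSCSP_NS = 7.268   # fast oscillator period (typical)

lsb_ns = T_OSCST_NS - T_OSCSP_NS         # resolution (LSB), about 0.114 ns
stretch = T_OSCST_NS / lsb_ns            # stretching factor S, about 64.75
conversion_ns = stretch * T_OSCSP_NS     # conversion time C_T, equation (4.3)

# Worst-case S quoted across PVT variations is 80; bits needed to count up to it.
S_WORST = 80
min_fine_bits = S_WORST.bit_length()

print(f"LSB  = {lsb_ns * 1e3:.0f} ps")
print(f"S    = {stretch:.2f}")
print(f"C_T  = {conversion_ns:.2f} ns")
print(f"bits needed for S = {S_WORST}: {min_fine_bits}")
```

A count of 80 needs 7 bits, so the 8-bit fine counter leaves one bit of margin.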

To limit the dynamic range of the TDC, the '*stop*' signal is disabled after rollover of eight bits of the coarse counter. Therefore, the maximum designed dynamic range of the TDC is $(2^b - S - 1)T_{oscst}$, where b = 8 is the number of coarse counter bits that count before the rollover. If $2^b$ is much greater than S, a longer dynamic range can be obtained by increasing the number of bits in the coarse counter. The theoretical maximum dynamic range calculated from the above expression is $(2^8 - 65 - 1) \times 7.382$ ns $\approx 1.402\ \mu$s.
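The dynamic range expression can be evaluated directly. In the sketch below (function name assumed for illustration), the rounded stretching factor S = 65 is used; the quoted value of about 1.402 $\mu$s is reproduced when eight coarse counter bits are counted, consistent with the stop signal being disabled after rollover of eight bits, whereas nine bits would give about 3.29 $\mu$s.

```python
def dynamic_range_ns(bits: int, stretch: int, t_oscst_ns: float) -> float:
    """Maximum dynamic range (2^b - S - 1) * T_oscst, in ns."""
    return (2 ** bits - stretch - 1) * t_oscst_ns


T_OSCST_NS = 7.382
S_TYPICAL = 65

range_8bit = dynamic_range_ns(8, S_TYPICAL, T_OSCST_NS)
range_9bit = dynamic_range_ns(9, S_TYPICAL, T_OSCST_NS)
print(f"8 coarse bits: {range_8bit / 1e3:.3f} us")
print(f"9 coarse bits: {range_9bit / 1e3:.3f} us")
```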

#### 4.3 Description of design blocks

#### 4.3.1 Ring oscillator



Figure 4.3: Schematic representation of ring oscillators

The input triggerable ring oscillators, shown in Fig.4.3, are designed to generate two clocks (stclk and spclk) with a small difference in time-periods. The AND gate provides a re-trigger ability. The time-period difference is obtained by changing the type of feedback gate, which also presents a different fan-out to the cell '*DLY*'. The '*DLY*' cell has a fan-out of '2' (AOI) and '1' (NOR) in the slow and fast oscillators respectively. Due to this fan-out difference, the cell sees a different output load capacitance in each oscillator, which results in unequal propagation delays ($T_d \propto C_{load}$) introduced by '*DLY*' in the two ring oscillators. In addition, the delay introduced by the '*AOI*' gate is higher than that of the NOR gate. This ensures the difference in frequencies of the two oscillators.

#### 4.3.2 Leading edge phase detector

A phase detector is designed to detect the phase coincidence (or phase lead) of the ring oscillators. It is realized using D flip-flops and a logic gate, as shown in Fig.4.4. The flip-flop (FF-1) samples the status of the slow oscillator clock on the rising edge of the fast one. The second flip-flop (FF-2) stores the sample obtained from FF-1 for one clock period. The combinational logic compares the current and stored samples. The '*eoc*' signal goes from logic '0' to '1' either at rising edge phase coincidence or when 'spclk' directly leads 'stclk'.

This architecture of phase detector has theoretically infinite dynamic range. However, the minimum detectable phase error is limited by the issue of dead zone and metastability in FF-1. The probability of metastability is reduced by using standard cell flip-flop with smallest setup and hold time windows.
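The detection of phase coincidence can be illustrated with an idealized behavioural sketch of the Vernier principle. It is not a model of the flip-flop circuit of Fig.4.4: duty cycle, gate delays, dead zone and metastability are ignored. It assumes that 'eoc' is raised at the first 'spclk' rising edge that coincides with, or leads by less than one LSB, the next 'stclk' rising edge, and that this 'stclk' edge is included in $N_c$. The function names are hypothetical.

```python
import math

T_OSCST_NS = 7.382
T_OSCSP_NS = 7.268


def vernier_counts(delta_t_ns: float,
                   t_st: float = T_OSCST_NS, t_sp: float = T_OSCSP_NS):
    """Return (N_c, N_f) for an ideal measurement of delta_t_ns.

    stclk rising edges occur at (n - 1) * t_st, spclk rising edges at
    delta_t_ns + (n - 1) * t_sp, with n counted from 1.
    """
    lsb = t_st - t_sp
    for n_f in range(1, 1000):
        sp_edge = delta_t_ns + (n_f - 1) * t_sp
        # Count of the first stclk rising edge at or after this spclk edge.
        n_c = math.ceil(sp_edge / t_st) + 1
        lead = (n_c - 1) * t_st - sp_edge
        if lead < lsb:          # spclk has caught up to within one LSB
            return n_c, n_f
    raise RuntimeError("no phase coincidence found")


def interval_ns(n_c: int, n_f: int,
                t_st: float = T_OSCST_NS, t_sp: float = T_OSCSP_NS) -> float:
    """Equation (4.1)."""
    return (n_c - 1) * t_st - (n_f - 1) * t_sp


for applied in (0.0, 30.0, 100.25, 1000.0):
    n_c, n_f = vernier_counts(applied)
    print(f"applied = {applied:8.3f} ns, N_c = {n_c:3d}, N_f = {n_f:2d}, "
          f"measured = {interval_ns(n_c, n_f):8.3f} ns")
```

Under these assumptions the reconstructed interval exceeds the applied one by less than one LSB (114 ps); for an applied interval of 30 ns the sketch gives $N_c = 10$, $N_f = 6$ and a measured value of 30.098 ns.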



Figure 4.4: Schematic diagram of leading edge phase detector

#### 4.3.3 Calibration block



Figure 4.5: Timing diagram for time period calibration of oscillator clocks

In CMOS integrated circuits, the time-periods of oscillators vary because propagation delays are sensitive to PVT variations. In the earlier reported work [143], an area and power inefficient delay variation compensation circuit such as a PLL was used to stabilize the time periods and their difference. In this work, a digital calibrator is designed, which measures the real time values of the time-periods of the ring oscillator clocks. The calibrated time-periods are used in the conversion equation(4.1) to calculate the applied time interval between the '*start*' and '*stop*' signals.

The calibration scheme is based on *'counting the number of cycles of unknown frequency clock for a known period of time'*, so that the total count is proportional to the frequency of the clock, as shown in Fig.4.5. It is expressed by equation(4.5), where '$T_0$' is the unknown time period of the input clock, 't' is the known period of time, named the calibration time window, and 'N' is the number of pulses counted within the calibration window.

A calibration time window of duration $t = 80\,\mu\text{s}$ is designed. The window is long enough to calibrate the time-periods with an accuracy of the order of 1 ps. For instance, with a calibration window of $t = 80\,\mu\text{s}$ and a count of N = 11000, $T_0$ evaluates to 7.272 ns, and with N = 11002, $T_0$ evaluates to 7.271 ns; the difference is about 1 ps.

$$T_0 = \frac{t}{N} \tag{4.5}$$

From equation(4.5), the number of counts is proportional to the frequency of the clock if the duration 't' of the calibration window is constant. Therefore, to maintain the duration of the calibration window across process and operating condition variations, an accurate 40 MHz system clock is used for window generation. This clock is provided by a precise and stable off-chip crystal oscillator.
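The calibration arithmetic of equation(4.5) can be sketched as follows. The sketch assumes an ideal 80 $\mu$s window derived from the 40 MHz system clock and ignores the partial cycle at the window boundary; the names are chosen for illustration.

```python
SYS_CLK_MHZ = 40.0   # off-chip system clock
WINDOW_US = 80.0     # calibration time window t

# Number of system clock cycles that make up the calibration window.
window_cycles = int(WINDOW_US * SYS_CLK_MHZ)


def period_ns(count: int, window_us: float = WINDOW_US) -> float:
    """Equation (4.5): T_0 = t / N, returned in ns."""
    return window_us * 1e3 / count


t_a = period_ns(11000)
t_b = period_ns(11002)
print(f"window = {window_cycles} system clock cycles")
print(f"N = 11000 -> T_0 = {t_a:.4f} ns")
print(f"N = 11002 -> T_0 = {t_b:.4f} ns")
print(f"difference = {(t_a - t_b) * 1e3:.2f} ps")
```

The window corresponds to 3200 system clock cycles, and a change of two counts shifts the computed period by roughly 1.3 ps.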



Figure 4.6: Block diagram for time period calibration of oscillator clocks

Fig.4.6 depicts the block diagram of time-period calibrator, where the reference slow and fast ring oscillators are triggered simultaneously by 'calibration start' signal. An 'oscillator ready' signal asserts when both oscillators become stable. It enables 12-bit counter clocked by system clock to generate the calibration time window. The 14-bit calibration counters count the cycles of slow clock 'stclk' and fast clock 'spclk' within calibration window. The 'data ready' signal asserts when calibration window is over. It enables the transfer of 14-bit data provided by calibration counters 'spclk [13 : 0]' and 'stclk [13 : 0]' to internal register of SPI (Serial Peripheral Interface) based read-out block.

#### 4.3.4 SPI based read-out logic

The SPI [194] interface was developed by Motorola to provide full-duplex, synchronous and serial communication between master and slave device. SPI master and slave communicate with each other using the serial clock (SCK), master out slave in (MOSI), master in slave out (MISO), and slave select (SSEL) lines. Also, the master may communicate with multiple slaves but one at a time. So, the signals '*SCK*', '*MOSI*', and '*MISO*' can be shared by slaves and therefore are designed in tri-state logic. However, each slave has a unique '*SSEL*' line to be interfaced with the master at a time.



Figure 4.7: Timing diagram of four modes of data transfer through SPI

As shown in Fig.4.7, there are four modes of data transfer between the master and the selected slave, depending on the polarity of the issued clock (SCK) and the clock phase at which data is shifted on the MISO line. If clock polarity and phase are both '0' (Mode-0), data is sampled on the leading rising edge of the clock. For clock polarity = '1' and clock phase = '0' (Mode-2), data is sampled on the leading falling edge of the clock. Likewise, clock polarity = '0' and clock phase = '1' (Mode-1) results in data sampled on the trailing falling edge, and clock polarity = '1' with clock phase = '1' (Mode-3) results in data sampled on the trailing rising edge.
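The four modes described above can be summarized as a small lookup, sketched here with hypothetical function names. The mode number follows the usual convention of polarity as the upper bit and phase as the lower bit.

```python
def spi_mode(cpol: int, cpha: int) -> int:
    """SPI mode number from clock polarity (CPOL) and clock phase (CPHA)."""
    return (cpol << 1) | cpha


def sampling_edge(cpol: int, cpha: int):
    """Return (position, direction) of the SCK edge on which data is sampled."""
    leading = "rising" if cpol == 0 else "falling"    # first edge after idle
    trailing = "falling" if cpol == 0 else "rising"   # edge returning to idle
    return ("leading", leading) if cpha == 0 else ("trailing", trailing)


for cpol in (0, 1):
    for cpha in (0, 1):
        position, direction = sampling_edge(cpol, cpha)
        print(f"Mode-{spi_mode(cpol, cpha)}: CPOL={cpol}, CPHA={cpha}, "
              f"sampled on {position} {direction} edge")
```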

In our design, the SPI interface is designed for unidirectional data transfer in Mode-2 from a single TDC ASIC (slave) to an off-chip micro-controller (master). Since there is only one slave on the SPI bus, 'MISO' is not designed in tri-state logic, and the 'MOSI' signal is not incorporated in the design.

The overall readout block consists of two modules: TDC channel interface and SPI as shown in Fig.4.8. The channel interface logic interfaces the TDC core channels to SPI in the order of 1 to 4. This is to avoid the ambiguities like-

- If data is ready to transfer to external interface through SPI in more than one channel at the same time.
- If in any channel data is ready while SPI is busy in transferring the TDC data pertaining to other channel



Figure 4.8: Block diagram of SPI based read-out logic

The timing diagram of the working of SPI is shown in Fig.4.9. On each rising edge of the system clock, if the status of 'SPI busy' (derived from the SSEL signal) is active high, the interface logic samples the status of the 'eoc' lines from the TDC channels in the order of 1 to 4. The 17-bit TDC data of the interfaced channel is latched into the 24-bit internal register of the SPI, followed by the assertion of the 'data ready' signal with a time margin of two system clock periods. In the 24-bit SPI internal register, bits 0 to 16 are reserved for the 17-bit word (9-bit coarse count and 8-bit fine count) from the TDC channel, two bits hold the channel ID and the last five bits are reserved as logic '0', as shown in Fig.4.10. The 'data ready' signal issues a request for data transfer to the micro-controller based external interface. The micro-controller acknowledges the TDC ASIC by issuing the clock 'SCK' (8 MHz) and the slave select 'SSEL' signal with active low status.
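On the micro-controller side, the 24-bit word has to be split back into its fields. The sketch below shows one possible packing and unpacking. The field widths (17-bit TDC word in bits 0 to 16, 2-bit channel ID, five zero bits) follow the text, but the exact positions are assumptions: the fine count is placed in bits 0 to 7, the coarse count in bits 8 to 16 and the channel ID in bits 17 to 18. The actual ordering is defined by Fig.4.10, and the function names are hypothetical.

```python
def pack_word(coarse: int, fine: int, channel: int) -> int:
    """Pack the fields into a 24-bit word (assumed bit positions, see lead-in)."""
    if not (0 <= coarse < 2 ** 9 and 0 <= fine < 2 ** 8 and 0 <= channel < 4):
        raise ValueError("field out of range")
    # Bits 19..23 stay at logic '0'.
    return (channel << 17) | (coarse << 8) | fine


def unpack_word(word: int):
    """Return (coarse, fine, channel) from a 24-bit word packed as above."""
    fine = word & 0xFF
    coarse = (word >> 8) & 0x1FF
    channel = (word >> 17) & 0x3
    return coarse, fine, channel


word = pack_word(coarse=10, fine=6, channel=2)
print(f"word = 0x{word:06X}")
print("fields =", unpack_word(word))
```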

To detect the active low status of 'SSEL', a 3-bit shift register is designed. It samples and shifts the status of 'SSEL' on each rising edge of the system clock. Initially the status of the shift register is '111'. A low status of 'SSEL' is thus detected by a logic '0' in the first bit of the shift register, which asserts the '*SSEL\_active*' signal from logic '0' to logic '1'. This enables:

- Serially shifting of data out over '*MISO*' line on each falling edge of master clock (SCK) in the transmitter block of SPI.
- Counting of transferred bits on each rising edge of master clock (SCK) in the receiver block of SPI.



Figure 4.9: Timing diagram of data transfer through SPI



Figure 4.10: Data format of 24-bit SPI register

The logic for synchronization of the master clock '*SCK*' with the system clock is implemented by comparing the last two bits of a 3-bit shift register, which samples the status of SCK on the system clock. When the last two bits of the shift register are '10' or '01', the pulse '*SCK\_risingedge*' or '*SCK\_fallingedge*' is generated respectively.
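The edge detection scheme can be illustrated with a behavioural sketch of a 3-bit shift register clocked by the system clock. The bit ordering is an assumption: the new sample enters the first position, and the two older positions are compared as (newer, older), so that (1, 0) marks a rising edge and (0, 1) a falling edge. The register is initialized to the idle-high level of SCK in Mode-2, and the function name is hypothetical.

```python
def detect_edges(sck_samples, idle_level: int = 1):
    """Return one entry per system clock tick: 'rising', 'falling' or None."""
    reg = [idle_level] * 3          # reg[0] = newest sample, reg[2] = oldest
    events = []
    for sample in sck_samples:
        reg = [sample, reg[0], reg[1]]      # shift on each system clock edge
        newer, older = reg[1], reg[2]       # the "last two bits"
        if (newer, older) == (1, 0):
            events.append("rising")
        elif (newer, older) == (0, 1):
            events.append("falling")
        else:
            events.append(None)
    return events


# SCK sampled by the faster system clock: high, a low phase, then high again.
samples = [1, 1, 0, 0, 0, 1, 1, 1]
print(detect_edges(samples))
```

Each SCK transition produces a single-tick pulse two system clock ticks after the transition is first sampled, which is the price of synchronizing SCK to the system clock.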

In the data transmitting block of SPI, on the falling edge of the system clock, when the '*SCK\_falling*' pulse and '*SSEL\_active*' are both high, data from the internal register is loaded into the 24-bit SPI register, followed by the transfer of the $24^{th}$ bit over the '*MISO*' line. Thereafter, after each falling edge of 'SCK', one bit of the SPI register shifts out over the 'MISO' line.

On the receiver side, a 5-bit counter is designed to count the number of bits transferred and to inform the master about completion of the data transfer. It counts each transferred bit on the falling edge of the system clock when '*SCK\_risingedge*' is high. Once 24 bits have been transferred serially, the '*SPI\_done*' signal asserts and informs the master that the data transfer is over. The '*SSEL*' line is then pulled up, which enables the interfacing of another channel for data transfer.

The time required to transfer the TDC channel data depends on the frequency of the master clock SCK, which is chosen as 8 MHz. It therefore takes 125 ns to transfer one bit from the SPI register. This corresponds to $(125 \times 24)$ ns $= 3\ \mu$s to transfer one 24-bit word through SPI.

#### 4.4 Layout design aspects

In this design, a mixed signal layout design approach is used where, the TDC channels including oscillators, counters and phase detectors are designed by manual P & R so that routing delays, noise coupling through parasitics and crosstalk can be controlled. The channel interface with SPI and time-period calibrator is designed by automatic P & R with the benefits of compact layout as well as reduced design time & effort.

In the design of TDC channels using manual P & R approach, some layout design techniques and protocols are followed to achieve the high performance-

#### 4.4.1 Matching of oscillator channels

Matching among oscillator channels is important to reduce the spread in resolution across channels. The major source of mismatch is systematic variation, caused by non-identical placement of standard cells and non-identical routes among the four channels of start/stop oscillators, which are designed to be identical. The other source of systematic variation is process induced, due to photo-lithography aberrations, which may be spatially correlated.

To reduce the mismatch, the start and stop oscillators are laid out identically so that the difference in their time periods is contributed only by the difference incorporated in their designs. Also, to reduce the inter-channel variations, the channels are placed in close proximity to each other, as shown in Fig.4.11.



Figure 4.11: Layout representation of two channels of TDC showing ring oscillators with phase detector

#### 4.4.2 Substrate coupling noise

As the oscillators are placed in close proximity, noise coupling among them through the common substrate dominates. It causes a fluctuation in the time period of the ring oscillators over the oscillation cycle, which degrades the accuracy of the measured time interval. To minimize the noise coupling, each oscillator is enclosed by a p-tap guard ring, tied to ground with a large number of contacts, as shown in Fig.4.11. This provides a low impedance path for the substrate noise to ground and thus reduces its coupling among the ring oscillators. In addition, each channel is separated from the others by another p-tap guard ring connected to ground.

#### 4.4.3 Supply rail routing

The supply rails of the ring oscillator channels are connected together through the filler cell 'PFILL' (provided in the core-library), as shown in Fig.4.11. This avoids over-the-cell routing for the local connection of supply rails and thereby avoids noise coupling through parasitics. Also, noise on $V_{dd}$ (glitch) and gnd (ground bounce) can drastically increase the jitter and skew problems. Therefore, at each block level and at the top level, supply layers are routed in parallel so that opposite voltage glitches can cancel each other.



Figure 4.12: Layout representation of 4-channel Vernier TDC prototype ASIC

#### 4.4.4 Crosstalk

Crosstalk refers to the capacitive and inductive coupling between adjacent interconnects, which impacts their signal integrity. To reduce the crosstalk between the slow and fast oscillator clock interconnects, the two signal lines are separated to a large extent, with a metal layer shorted to supply placed between them as a shield. In addition, the two clocks run orthogonally through the local metal layer to reduce their overlapping area.

#### 4.4.5 Reduction of clock skew for coarse counter

In the layout design of the 9-bit counter, a buffer tree is placed in the middle of the nine stages to reduce the clock skew. This provides a negative skew in the first half of the stages, as the direction of clock routing is opposite to the direction of bit toggling.

The layout of the top-level integration of the ASIC is shown in Fig.4.12, where the main functional blocks are highlighted. The whole logic is encapsulated using 64-pin plastic package CLCC68.

### 4.5 Simulation results

This section presents the performance validation of the Vernier TDC ASIC design through simulation results. The functional and timing verification of the TDC channels is carried out with SPICE simulators, using device models provided by the foundry. This approach is timing accurate, as it includes the impact of parasitics related to the MOS devices and the routing of standard cells on the performance of the TDC. However, it is computationally complex and time consuming. Therefore, the non-timing-critical blocks, such as the read-out logic and the calibration circuit working at the system clock frequency, are verified by a digital simulator with the SDF (standard delay format) delay file provided by Encounter. The top-level verification is carried out using a SPICE simulator before sign-off of the ASIC.

The functional and timing accuracy of the design blocks has been verified by applying the following tests.

#### 4.5.1 Time period of oscillators and resolution across design process corners

The values of the time periods $T_{oscst}$ and $T_{oscsp}$ and their difference for the four TDC channels across five design process corners are shown in Table 4.1. On the typical corner, the time periods of the slow and fast ring oscillators are $T_{oscst} = 7.382$ ns and $T_{oscsp} = 7.268$ ns respectively. Their difference is 114 ps, which is the resolution of the TDC. Across the corners, there is a maximum 27 % variation in the time periods of the oscillators with respect to the typical designed values. Also, the variation in the difference of time periods is less than the typical designed value of 114 ps.

#### 4.5.2 Calibration of time periods of reference start and stop oscillators

The calibrated values of the time periods of the reference oscillators across design process corners are shown in Table 4.1. The calibrated values of time periods and resolution are consistent with the simulated ones.

#### 4.5.3 Functional verification of Vernier TDC channel

In order to verify the functionality of a TDC channel, '*start*' and '*stop*' inputs with a time interval of 1.5 ns are applied to trigger the corresponding ring oscillators. The assertion of the '*eoc*' signal at the phase coincidence of the slow and fast clocks is shown in Fig.4.13(a). The coarse and fine counts (number of cycles of clocks till phase coincidence) are 15. The time interval calculated using equation (4.1) is 1.482 ns.

| <b>Corners</b> →                                | WP    | TYP   | WS    | WO    | WZ    |
|-------------------------------------------------|-------|-------|-------|-------|-------|
| Channels↓                                       |       |       |       |       |       |
| Ch1:T <sub>oscst</sub> (ns)                     | 5.392 | 7.382 | 9.804 | 7.465 | 7.074 |
| Ch1:T <sub>oscsp</sub> (ns)                     | 5.32  | 7.268 | 9.639 | 7.356 | 6.963 |
| $\Delta \mathbf{T}$ (ns)                        | 0.072 | 0.114 | 0.165 | 0.109 | 0.111 |
| Ch2:T <sub>oscst</sub> (ns)                     | 5.392 | 7.383 | 9.804 | 7.464 | 7.074 |
| Ch2:T <sub>oscsp</sub> (ns)                     | 5.32  | 7.268 | 9.639 | 7.356 | 6.962 |
| $\Delta T$ (ns)                                 | 0.072 | 0.115 | 0.165 | 0.108 | 0.112 |
| Ch3:T <sub>oscst</sub> (ns)                     | 5.392 | 7.383 | 9.804 | 7.464 | 7.074 |
| Ch3:T <sub>oscsp</sub> (ns)                     | 5.32  | 7.268 | 9.639 | 7.358 | 6.962 |
| $\Delta T$ (ns)                                 | 0.072 | 0.115 | 0.165 | 0.106 | 0.112 |
| Ch4:T <sub>oscst</sub> (ns)                     | 5.392 | 7.383 | 9.804 | 7.464 | 7.074 |
| Ch4:T <sub>oscsp</sub> (ns)                     | 5.32  | 7.268 | 9.639 | 7.358 | 6.962 |
| $\Delta \mathbf{T}$ (ns)                        | 0.072 | 0.115 | 0.165 | 0.106 | 0.112 |
| Calibrated Time-Period of Reference Oscillators |       |       |       |       |       |
| $T_{oscst}$ (ns)                                | 5.394 | 7.384 | 9.808 | 7.464 | 7.076 |
| $T_{oscsp}$ (ns)                                | 5.318 | 7.265 | 9.634 | 7.353 | 6.959 |
| $\Delta T$ (ns)                                 | 0.076 | 0.119 | 0.174 | 0.111 | 0.117 |

Table 4.1: Time-periods of ring oscillator across process corners for four channels (Ch1, Ch2, Ch3, Ch4)
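The $\Delta T$ rows of Table 4.1 follow directly from the two period rows above them. As a quick arithmetic check, the short Python sketch below recomputes the resolution of channel 1 on each corner from the tabulated periods. The dictionary and function names are illustrative only and are not part of the design.

```python
# Channel-1 periods copied from Table 4.1, in ns, keyed by process corner.
T_OSCST_NS = {"WP": 5.392, "TYP": 7.382, "WS": 9.804, "WO": 7.465, "WZ": 7.074}
T_OSCSP_NS = {"WP": 5.320, "TYP": 7.268, "WS": 9.639, "WO": 7.356, "WZ": 6.963}


def resolution_ps(corner):
    """Resolution (LSB) = T_oscst - T_oscsp, rounded to the nearest ps."""
    return round((T_OSCST_NS[corner] - T_OSCSP_NS[corner]) * 1000)


resolution_by_corner = {corner: resolution_ps(corner) for corner in T_OSCST_NS}

for corner, lsb in resolution_by_corner.items():
    print(f"{corner}: {lsb} ps")
```

The values reproduce the channel-1 $\Delta T$ row of the table (72, 114, 165, 109 and 111 ps).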

#### 4.5.4 Frequency stability plot over number of cycles for slow and fast oscillators

In order to verify the impact of device component noise and noise coupling through RC parasitics, the time period of oscillator clocks is simulated over 60 cycles. The variation in time-periods is less than 3 ps, as shown in Fig.4.13(b).

#### 4.5.5 Channel-to-channel variation across process corners

In order to find the channel-to-channel variation, as well as to verify the functionality of the channel interface logic with SPI, a fixed input time interval of 50 ns is applied simultaneously to the four channels of the TDC ASIC and simulated across design process corners. Fig.4.14 shows the output of SPI for the four channels, in the order 1 to 4, on the WS corner. The MISO data is as per the format shown in Fig.4.10. The corresponding MISO code along with coarse and fine counts is given in Table 4.2. The time interval calculated using equation (4.1) is 50.175 ns, an error of 175 ps. The MISO codes for the four channels on the WP, WZ and WO corners are also given in Table 4.2, where the maximum error is 202 ps on the WO corner. Also, due to the impact of noise coupling and crosstalk, the fourth channel shows a mismatch (maximum 106 ps) in the measured time interval on the WO and WP corners.

Figure 4.13: (a) Waveform representing phase coincidence of ring oscillators for time interval measurement (b) Variation in time periods over number of cycles

Figure 4.14: Waveform representing the SPI output over MISO line corresponding to four TDC channels

| WO  | MISO                 | $N_c$ | $N_f$ | OT (ns) | WP  | MISO                 | $N_c$ | $N_f$ | OT (ns) |
|-----|----------------------|-------|-------|---------|-----|----------------------|-------|-------|---------|
| Ch1 | 0000110100 000111010 | 58    | 52    | 50.202  | Ch1 | 0000010111 000100000 | 32    | 23    | 50.112  |
| Ch2 | 0100110100 000111010 | 58    | 52    | 50.202  | Ch2 | 0100010111 000100000 | 32    | 23    | 50.112  |
| Ch3 | 1000110100 000111010 | 58    | 52    | 50.202  | Ch3 | 1000010111 000100000 | 32    | 23    | 50.112  |
| Ch4 | 1100110100 000111010 | 57    | 51    | 50.096  | Ch4 | 1100010110 000011111 | 31    | 22    | 50.04   |
| **WS** | **MISO**          | $N_c$ | $N_f$ | **OT (ns)** | **WZ** | **MISO**      | $N_c$ | $N_f$ | **OT (ns)** |
| Ch1 | 0000001000 000001101 | 13    | 8     | 50.175  | Ch1 | 0000000110 000001101 | 13    | 6     | 50.087  |
| Ch2 | 0100001000 000001101 | 13    | 8     | 50.175  | Ch2 | 0100000110 000001101 | 13    | 6     | 50.087  |
| Ch3 | 1000001000 000001101 | 13    | 8     | 50.175  | Ch3 | 1000000              | 13    | 6     | 50.087  |

Table 4.2: Counts of four channels of TDC across corners for applied time interval of 50 ns. OT stands for output time interval

#### 4.5.6 Output versus input time interval characteristic

Because the SPICE-based (analog) simulation strategy is computationally complex and time consuming, this test is carried out for only four input time intervals. Time intervals of 20 ns, 500 ns, 1 $\mu$s and 1.5 $\mu$s are applied to the four channels of the TDC across design process corners. Fig.4.15 shows the plot of measured versus applied time interval on the typical corner. The maximum deviation from the best fit line on the typical corner is 10 ps. The maximum deviations of the time interval characteristics from their best fit lines are 201 ps on WP, 64 ps on WS, 620 ps on WZ and 14 ps on WO design process corners.



Figure 4.15: Output versus input time interval characteristic

#### 4.6 Summary

The Vernier TDC ASIC prototype has been implemented using standard cells of a 0.35 $\mu$m commercial CMOS process. The inherent features of the Vernier technique are process independent resolution and low logic resource consumption. The standard cell based Vernier TDC design approach with a digital time period calibrator is the new initiative taken in this work. To achieve high performance, the timing critical blocks are designed using manual placement and routing, while the calibrator and read-out blocks are designed using an automatic P&R tool. The standard cell based implementation of the Vernier TDC benefits from short design time, less design effort, low power, low noise, small chip area and high resolution. The significant achievement is the reduction in power consumption compared to the reported PLL based Vernier TDC. Thus the standard cell based Vernier TDC meets most of the specifications required by HEP experiments and portable instruments. Therefore, the design of a multi-hit Vernier TDC based on the standard cell approach is the next agenda of this work.

## **Chapter 5**

## Design and Implementation of 8-channel Multi-hit TDC using Vernier Technique

In the INO experiment, the specified requirement is to measure the time interval between the discriminator and trigger signals. To meet this requirement, a power and area efficient standard cell based 4-channel Vernier TDC ASIC was developed and characterized (as discussed in chapter-4). Subsequently, a requirement was added to measure the occurrence time of four edge transitions in the discriminator output signal, in order to capture delayed muon interactions with the RPC and to implement time over threshold (TOT) logic[52] for off-line time walk error correction of the timing data. The enhanced specifications are as follows:

- Minimum time interval measurement of 5 ns between the first rising and falling edge input transitions, a measure of the pulse width of the discriminator signal, to implement TOT logic.
- Minimum time interval measurement of ~10 ns between two consecutive pulses (pulse pair resolution), which corresponds to delayed muon interaction events. This time interval can be as large as tens of μs, so the TDC should be capable of measuring transitions over a range higher than 32 μs.

In order to meet the added requirements, a multi-hit TDC ASIC is developed. In this design, the Vernier TDC ASIC is extended with multi-hit capabilities without affecting the resolution, dynamic range, and low power and area requirements of the INO experiment. The design specifications of the multi-hit Vernier ASIC are listed in Table 5.1. The ASIC is implemented using standard cells available in the PDK of a 0.35 $\mu$m CMOS technology. The design approach is based on a mixed signal design flow: to assure timing accuracy, the oscillators and leading edge detector are customized by manual P&R (analog approach), while the rest of the logic, including counters, latches, logic control block, memory and read-out interface, is designed by automatic P&R (digital design approach) to reduce layout design time and effort.

| Specifications                   | Multi-hit TDC                                                    |
|----------------------------------|------------------------------------------------------------------|
| Bin Size                         | Better than 200 ps                                               |
| Dynamic Range                    | $10 \ \mu s / 20 \ \mu s / 30 \ \mu s / 60 \ \mu s$ (Selectable) |
| Number of Channels               | 8                                                                |
| System Clock frequency           | 100 MHz                                                          |
| Start Oscillator Clock frequency | 135 MHz                                                          |
| Stop Oscillator Clock frequency  | 136.9 MHz                                                        |
| Number of bits in Fine Count     | 7 (after encoding)                                               |
| Number of bits in Coarse Count   | 15                                                               |
| Pulse width measurement          | 5 ns                                                             |
| Pulse Pair resolution            | 10 ns                                                            |
| Number of Events per Measurement | 1                                                                |
| Number of bits in read-out       | 34 bits in two 17-bit words                                      |
| Operating Mode                   | 4 (Normal, Common Start, Common Stop and Calibration)            |
| Calibration circuit              | Digital Time Period Calibrator                                   |

Table 5.1: Design specifications of Vernier multi-hit TDC
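The bin size in Table 5.1 can be cross-checked against the two oscillator frequencies listed in the same table. The sketch below is a plain arithmetic illustration; the variable names are chosen here for readability and do not come from the design. It also estimates how many Vernier steps span one start-oscillator period, which bounds the fine count needed for a single conversion under this idealized view.

```python
F_START_HZ = 135e6    # start (slow) oscillator clock, Table 5.1
F_STOP_HZ = 136.9e6   # stop (fast) oscillator clock, Table 5.1

t_oscst_ns = 1e9 / F_START_HZ
t_oscsp_ns = 1e9 / F_STOP_HZ

# Bin size (LSB) is the difference of the two periods.
bin_ps = (t_oscst_ns - t_oscsp_ns) * 1e3

# Number of LSB steps that fit in one start-oscillator period.
steps_per_period = t_oscst_ns / (t_oscst_ns - t_oscsp_ns)

print(f"T_oscst = {t_oscst_ns:.3f} ns, T_oscsp = {t_oscsp_ns:.3f} ns")
print(f"bin size = {bin_ps:.1f} ps, steps per period = {steps_per_period:.1f}")
```

With these nominal frequencies the bin size is about 103 ps, within the "better than 200 ps" specification, and roughly 72 steps cover one period, which fits in the 7-bit fine count.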

### 5.1 Architecture of multi-hit TDC ASIC

The multi-hit TDC ASIC consists of nine time measurement channels (including one trigger channel), a logic control block, a TDC channel interface, a $256 \times 17$ bit dual port memory and read-out logic. All the channels are interfaced with the memory by the channel interface logic, as shown in Fig.5.1. To make the ASIC usable in various applications, the TDC is designed with four operating modes, listed in Table 5.2. These operating modes are 'normal', 'common start', 'common stop' and 'calibration', selected by the 2-bit '*mode*' control signal. The detailed operation of each mode is discussed below.



Figure 5.1: Block diagram of multi-hit TDC ASIC

#### 5.1.1 Normal mode (mode-0)

In this mode, the TDC ASIC is used for time interval measurement between two events, '*start*' (normal\_start) and '*stop*' (first transition in the 'multi-hit' signal), as shown in Fig.5.2. The phase coincidence of 'start' and 'stop' ring oscillator clocks asserts end of conversion (eoc) signal, which enables the transfer of measured timing data to memory using channel interface logic.

The dynamic range of time interval measurement is selectable through the 2-bit '*dr\_sel*' control signal. Based on the value of '*dr\_sel*', one of four bits ($11^{th}$, $12^{th}$, $13^{th}$ and $14^{th}$) of the coarse counter is used to obtain the 'dynamic range over' signal for the TDC channel.



Figure 5.2: Timing diagram of normal operating mode

#### 5.1.2 Common start mode (mode-1)

In this mode, the times of four consecutive transitions in the 'discriminator' signal occurring after the 'trigger' are measured within a pre-defined dynamic range window, as shown in Fig.5.3. The 'trigger' signal starts the time interval measurement in the TDC channels and opens the dynamic range window with the help of the logic control block. This window is selectable from 10 $\mu$s, 20 $\mu$s, 30 $\mu$s and 60 $\mu$s by using the 2-bit 'dr\_sel' control signal. Within the selected window, the timing data of four consecutive transitions in the multi-hit signal of each channel are latched along with the assertion of their corresponding 'eoc' signals. These four 'eoc' signals are processed to obtain the channel specific end of conversion 'eoc\_ch' signal.

The logic control block issues the '*read command*' after checking the presence of transitions through the status of '*eoc\_ch*' signal when dynamic range window is over. The '*read command*' transfers the latched data into the memory.



Figure 5.3: Timing diagram in common start operating mode

#### 5.1.3 Common stop mode (mode-2)

In this mode, within the dynamic range window, the times of four consecutive transitions in the '*multihit*' signal and the time of the '*trigger*' are measured with respect to the '*event reset*' signal, as shown in Fig.5.4. The '*event reset*' starts the time interval measurement in the TDC channels as well as in the trigger channel. The measured time of the trigger is used to find its associated events within the dynamic range window, which is selectable by the '*dr\_sel*' signal.

The logic control block opens the dynamic range window on an external 'event reset' or 'internal reset' signal. It also detects the presence of the trigger within the selected window duration prior to issuing the 'read command' signal for TDC channel data transfer to memory. This is carried out by sampling the status of the 'eoc\_ch' signals corresponding to the measurement and trigger channels at the falling edge of the dynamic range window. If both are active high, it issues a read command, which transfers the data associated with the trigger. If the signal for the trigger channel is active low, the trigger is absent, so the timing data of each TDC channel is discarded. Further, it generates an '*internal reset*' signal, which reopens the dynamic range window to rescan the transitions that are associated with the trigger.



Figure 5.4: Timing diagram in common stop operating mode

#### 5.1.4 Calibration mode (mode-3)

In a CMOS standard cell based TDC, the standard cell delay is a function of PVT variations. This leads to variation in the time period of the ring oscillators and thereby affects the resolution of the TDC. To mitigate the impact of delay variations on the accuracy of the measured time interval, a calibration mode is implemented to account for and correct these effects. In this mode, the measurement channels are used to find the real time value of the time period of the start oscillator clock and the resolution of the TDC. This approach differs from that chosen in the previous Vernier ASIC, where the time periods of reference ring oscillators were calibrated independently. The current approach improves the accuracy of the calibrated parameters across local process variations. The measured parameters are used to calculate the time interval of transitions in the multi-hit signal with respect to 'start' (normal start/trigger/event reset) in the measurement modes (mode-0, mode-1 and mode-2).

Further, the resolution of each TDC channel is equal to the difference between the time period of the start oscillator and that of the stop oscillator corresponding to each multi-hit transition. It is difficult to match the time periods of oscillators across TDC channels due to process induced random and systematic variations. This may lead to variation in resolution among adjacent TDC channels. Hence, separate calibration is implemented for each channel.

The calibration is initiated by asserting the 'calibration start' signal, which triggers the start and stop ring oscillators simultaneously in each TDC channel. It also enables the generation of coarse and fine calibration windows to accurately calibrate the time period of the slow oscillator and the resolution of the TDC respectively. At the falling edge of the coarse calibration time window, a '*calibration over*' signal is asserted, which enables the transfer of calibrated data to memory.

The measured or calibrated data from the TDC channels is transferred to an in-built memory using the channel interface logic. The data transfer is carried out in the order channel-1 to channel-8, followed by the trigger channel. Further, the memory is interfaced with the read-out logic for TDC data transfer to the external interface. The read-out logic consists of a serial peripheral interface (SPI) and a parallel interface, either of which can be selected for the external interface by the control signal '*ser\_par*'.

| Mode | Mode ID | Start | Stop | Description of Data |
|------|---------|-------|------|---------------------|
| 0 | 00 | normal start | First transition in multi-hit | Data corresponding to time interval between two events for all channels |
| 1 | 01 | trigger | Four transitions in multi-hit | Timing data corresponding to arrival time of maximum 4 transitions with respect to start in all channels |
| 2 | 10 | event reset or internal reset | Four transitions in multi-hit and trigger | Time data corresponding to arrival time of maximum four transitions and trigger with respect to event reset in all channels |
| 3 | 11 | calibration start | calibration over | Data corresponding to time period calibration and LSB calibration for all channels |
| Data format from TDC channel | | | | 0-6 bits reserved for fine count, 7-21 bits reserved for coarse count |

Table 5.2: Description of operating modes
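The data format in the last row of Table 5.2 (bits 0-6 for the fine count, bits 7-21 for the coarse count) can be illustrated with a small pack/unpack sketch. This is only a software illustration of the bit layout; the function names are hypothetical, and the split of the word across the 17-bit read-out words is not modelled here.

```python
FINE_BITS = 7      # bits 0-6: fine count
COARSE_BITS = 15   # bits 7-21: coarse count


def pack_word(n_c, n_f):
    """Pack coarse and fine counts into one 22-bit channel word."""
    if not 0 <= n_f < (1 << FINE_BITS):
        raise ValueError("fine count does not fit in 7 bits")
    if not 0 <= n_c < (1 << COARSE_BITS):
        raise ValueError("coarse count does not fit in 15 bits")
    return (n_c << FINE_BITS) | n_f


def unpack_word(word):
    """Return (coarse count, fine count) from a 22-bit channel word."""
    n_f = word & ((1 << FINE_BITS) - 1)
    n_c = (word >> FINE_BITS) & ((1 << COARSE_BITS) - 1)
    return n_c, n_f


word = pack_word(8100, 72)   # hypothetical counts, for illustration only
print(f"{word:022b}", unpack_word(word))
```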

#### 5.2 Vernier multi-hit TDC channel

In each TDC channel, as shown in Fig.5.5, a preprocessor block is designed to separate the four transitions in the multi-hit signal. It provides four stop signals corresponding to the four transitions. In order to measure the occurrence time of each of the four transitions separately with respect to the '*start*' signal, the fine time measurement block (stop oscillator, leading edge phase detector, fine counter, synchronizer and latch) is replicated four times, and the coarse time measurement block (start oscillator and coarse counter) is shared among them. This scheme can measure a minimum pulse width of ~1 ns and is independent of the large conversion time (~500 ns, refer chapter-4) of the Vernier technique.

The 'start' signal triggers the startable slow ring oscillator to oscillate with a time period $T_{oscst}$. Each 'stop' signal provided by the pre-processor triggers an individual fast (stop) ring oscillator to oscillate with a time period $T_{oscsp}$ ($T_{oscsp} < T_{oscst}$). As shown in Fig.5.6, the rising edge of each 'stop oscillator' clock approaches the rising edge of the 'start oscillator' clock by a step size of $\Delta T_d = T_{oscst} - T_{oscsp}$ on each cycle. Eventually, the phase of each 'stop' oscillator clock coincides with the phase of the start oscillator clock; this is detected by the respective phase detector, which toggles the 'eoc' signal. The number of cycles ($N_f$) covered by each stop oscillator till phase coincidence is measured by the respective 7-bit fine counter. The 'eoc' signal disables the respective fine counter and stop oscillator, which reduces the power consumption.
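The catch-up behaviour described above can be sketched as an idealized software model. The Python function below is not the ASIC logic: it uses integer picosecond times, zero-based cycle counts (so it omits the extra counts introduced by the phase detector and synchronizer), and hypothetical periods of 7407 ps and 7305 ps obtained by rounding the nominal frequencies of Table 5.1.

```python
def vernier_measure(delta_ps, t_oscst_ps, t_oscsp_ps):
    """Idealized Vernier conversion using integer picosecond times.

    Returns (start_cycles, stop_cycles, measured_ps), where the cycle
    counts are zero-based and measured_ps is the reconstructed interval.
    """
    step_ps = t_oscst_ps - t_oscsp_ps
    if step_ps <= 0 or delta_ps < 0:
        raise ValueError("need T_oscst > T_oscsp and a non-negative interval")

    # Start cycles elapsed before 'stop' arrives, and the lag of the first
    # stop edge behind the most recent start edge.
    start_cycles, lag_ps = divmod(delta_ps, t_oscst_ps)

    # On every cycle the stop edge gains one step on the start edge.
    stop_cycles = 0
    while lag_ps > 0:
        lag_ps -= step_ps
        start_cycles += 1
        stop_cycles += 1

    measured_ps = start_cycles * t_oscst_ps - stop_cycles * t_oscsp_ps
    return start_cycles, stop_cycles, measured_ps


n_start, n_stop, measured = vernier_measure(50_000, 7407, 7305)
print(n_start, n_stop, measured)   # 61 55 50052
```

In this model the reconstructed interval always exceeds the applied one by less than one step (here 52 ps against a 102 ps step), which is the quantization error expected from the Vernier principle.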

For the coarse time measurement of each '*stop*' signal, the number of elapsed cycles ($N_c$) of the start oscillator is measured by sampling and latching the status of the coarse counter with the respective '*eoc*' signal. Here, as the coarse counter drives four latches corresponding to the four transitions, there is significant fan-out and wire delay in the routes of the counter bits. This can cause an error in the latched count when a counter bit toggles on the clock cycle corresponding to phase coincidence. To avoid this error, a coarse count synchronizer is designed for each transition, which latches the coarse counter status with one additional count.

The schematic of the synchronizer is shown in Fig.5.5, where the status of '*eoc*' is sampled at the falling edge of the slow oscillator clock. This gives a time margin of half a clock cycle for the '*eoc*' signal to settle to its active high status. Using this sample, the '*latch*' signal is obtained on the next falling edge of the clock. This gives a time margin of half a clock cycle for the toggled bits of the coarse counter to settle. The '*latch*' signal latches the coarse and fine counts into the 22-bit latch when the coarse counter is in the idle state and the fine counter is in the disabled state. The 22-bit latch data consists of the 15-bit coarse count ($N_c$) and the 7-bit fine count ($N_f$).

The time interval of each 'transition' (stop) with respect to 'start' in all measurement modes is given by equation (5.1), where the one additional count considered during phase coincidence is subtracted from both the coarse ($N_c$) and fine ($N_f$) counts. The other count subtracted from the coarse count is the cycle added by synchronization.

$$\Delta T_1 = (N_c - 2) \times T_{oscst} - (N_f - 1) \times T_{oscsp} \tag{5.1}$$
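As a worked illustration of equation (5.1), the sketch below converts latched counts into a time interval and derives a pulse width from two consecutive transitions. The periods and counts used here are hypothetical values chosen for illustration; in practice the calibrated values of $T_{oscst}$ and $T_{oscsp}$ would be used.

```python
def time_interval_ns(n_c, n_f, t_oscst_ns, t_oscsp_ns):
    """Equation (5.1): dT = (N_c - 2) * T_oscst - (N_f - 1) * T_oscsp."""
    return (n_c - 2) * t_oscst_ns - (n_f - 1) * t_oscsp_ns


# Hypothetical periods (ns) and latched counts, for illustration only.
T_OSCST_NS, T_OSCSP_NS = 7.407, 7.305

rising_ns = time_interval_ns(10, 5, T_OSCST_NS, T_OSCSP_NS)
falling_ns = time_interval_ns(12, 6, T_OSCST_NS, T_OSCSP_NS)

# The pulse width is the relative time between the two transitions.
pulse_width_ns = falling_ns - rising_ns

print(f"rising = {rising_ns:.3f} ns, falling = {falling_ns:.3f} ns")
print(f"pulse width = {pulse_width_ns:.3f} ns")
```

With these example counts the two transitions fall at about 30.036 ns and 37.545 ns after 'start', giving a pulse width of about 7.509 ns.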

The relative time between the rising edge and subsequent falling edge transitions gives the pulse width of multi-hit signal in both common start and stop mode. The measured time intervals of third and fourth transitions are used to find delayed events. In this architecture, the number of transitions in the multi-hit time measurement can be extended by replicating the stop oscillator, fine counter, and latches with respect to each transition.



Figure 5.5: Block diagram of multi-hit TDC using Vernier technique



Figure 5.6: Timing diagram of multi-hit TDC using Vernier technique

### 5.3 Stopping scheme of start oscillator in four operating modes

The 'start' oscillator is disabled after latching the measured or calibrated data in the TDC channel, to reduce the power consumption and switching noise. The moment at which the start oscillator is disabled depends on the operating mode of the TDC.

In mode-0, the start oscillator is disabled, when coarse and fine counts corresponding to the time interval between '*start*' and first transition in '*multihit*' signal are latched. In absence of transition in multihit, the start oscillator is disabled by the selected roll over count of coarse counter.

In mode-1 and mode-2, the start oscillator is disabled, when timing data corresponding to  $4^{th}$  transition is latched in TDC channel. In the absence of  $4^{th}$  transition, the TDC channel waits for its occurrence over the selected dynamic range and 'dynamic range over' signal disables the oscillator.

In mode-3, the start oscillator is disabled when the calibration of the time period of the start oscillator, as well as of the TDC resolution, is over. The '*calibration over*' signal disables the start oscillator.

#### 5.4 Design of calibration block

The design principle of the time-period calibrator is based on 'counting the number of cycles of an unknown frequency clock for a known period of time', so that the total count is proportional to the frequency of the clock. It is expressed by equation 5.2, where $f_0$ is the unknown frequency, $t$ is the known period, named the calibration window, and $N$ is the pulse count within the calibration window.

$$T_0 = \frac{1}{f_0} = \frac{t}{N} \tag{5.2}$$

The logic control block is used to obtain an accurate calibration time window of duration 80 $\mu$s. This duration is sufficiently long to calibrate the time-period of the slow oscillator with an accuracy of a few picoseconds.

The accuracy of the window duration is achieved by referencing it to a stable 50 MHz reference clock provided by an off-chip crystal oscillator. The logic of window generation is based on counting the number of cycles (M) of the reference clock, which when multiplied by its time period gives the window duration ('t'). Thus, for $t = 80 \,\mu s$ and a reference clock time-period of 20 ns, the required reference clock count is 4000.



Figure 5.7: Timing diagram of calibration window generation

As shown in Fig.5.7, the '*calibration start*' pulse triggers window generation by enabling the counting of the reference clock (on its rising edge) using a 12-bit counter. At M = 3, it asserts an internal signal '*st\_en*' from logic '0' to '1'. At M = 4003, another internal signal '*sp\_en*' is asserted from logic '0' to '1'. Thus, the logical XOR combination of these two internal signals (*st\_en* and *sp\_en*) gives a calibration time window of duration $4000 \times 20\,ns = 80\,\mu s$.

The calibration time window enables the 15-bit coarse counter (calibration counter) to calibrate the time-period of slow oscillator. The calibration counter counts the number of cycles of slow oscillator clock within the calibration window duration. When calibration is over (at the falling edge of calibration window) the counter data is latched.
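The window generation and the period calculation of equation (5.2) can be illustrated with a short behavioural model. This is a minimal sketch, assuming an ideal reference clock in which every counter value for which *st\_en* XOR *sp\_en* is high contributes exactly one reference period; the function and constant names are hypothetical.

```python
REF_PERIOD_NS = 20             # 50 MHz reference clock
ST_COUNT, SP_COUNT = 3, 4003   # reference counts at which st_en and sp_en assert


def calibration_window_ns(ref_period_ns=REF_PERIOD_NS):
    """Accumulate the time for which (st_en XOR sp_en) is high."""
    window_ns = 0
    for m in range(4096):              # full range of a 12-bit counter
        st_en = m >= ST_COUNT
        sp_en = m >= SP_COUNT
        if st_en != sp_en:             # XOR of the two enable signals
            window_ns += ref_period_ns
    return window_ns


def calibrated_period_ns(window_ns, n_cycles):
    """Equation (5.2): T0 = t / N."""
    return window_ns / n_cycles
```

Calling `calibration_window_ns()` returns the window duration in nanoseconds, and `calibrated_period_ns(window, N)` converts a latched calibration count N into an oscillator period.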

#### 5.4.1 LSB calibration scheme

By the operating principle of the Vernier technique, the number of steps of size $T_{oscst} - T_{oscsp}$, or cycles of the 'stop' oscillator, within the time-period ($T_{oscst}$) of the start oscillator defines the stretching factor (S) expressed by equation 5.3. With the known value of $T_{oscst}$, the LSB of the TDC is calibrated by determining the value of 'S' under PVT variations. It is determined by counting cycles of the stop oscillator within the fine calibration time window, whose duration is equal to the time interval between the first two consecutive phase coincidences of the 'start' and 'stop' oscillator clocks, as shown in Fig.5.8.

$$S = \frac{T_{oscst}}{T_{oscst} - T_{oscsp}} \tag{5.3}$$

$$LSB = T_{oscst} - T_{oscsp} = \frac{T_{oscst}}{S} \tag{5.4}$$
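The relation between the stretching factor and the cycle count between two phase coincidences can be checked with a small idealised model. The sketch below assumes integer time units, clocks that start exactly in phase, and an integer stretching factor; the function names are hypothetical, and a real phase detector fires on the nearest edge crossing rather than on an exact coincidence.

```python
from fractions import Fraction


def stretching_factor(t_st, t_sp):
    """Equation (5.3): S = T_oscst / (T_oscst - T_oscsp)."""
    return Fraction(t_st) / (Fraction(t_st) - Fraction(t_sp))


def stop_cycles_between_coincidences(t_st, t_sp):
    """Count stop-clock cycles from one rising-edge coincidence to the next.

    Assumes both clocks have a rising edge at t = 0 and integer periods.
    """
    n_sp = 1
    while (n_sp * t_sp) % t_st != 0:   # stop edge not yet aligned with a start edge
        n_sp += 1
    return n_sp


def lsb(t_st, s):
    """Equation (5.4): LSB = T_oscst / S."""
    return Fraction(t_st) / s
```

With `t_st = 5` and `t_sp = 4` (in units of the period difference), the counted stop cycles equal the stretching factor, and `lsb(5, 5)` recovers one unit.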



Figure 5.8: Timing diagram representing first two consecutive phase coincidences

The fine calibration window is obtained by using the leading edge phase detector with additional logic, as shown in Fig 5.9. It provides the two consecutive '*eoc*' pulses corresponding to the first and second phase coincidence of oscillator clock signals. The logical XOR combination of latched versions of 'eoc' pulses gives the fine calibration window. This window enables the fine counter to count the number of stop oscillator clock (spclk) cycles.



Figure 5.9: Schematic diagram of fine calibration window generator

### 5.5 TDC channel interface with memory controller and read-out interface



Figure 5.10: Block diagram of FIFO based memory with interface and read-out logic

As shown in Fig.5.10, the memory is designed as a $256 \times 17$ bit dual-port FIFO, where the two ports are unidirectional without address inputs and are dedicated to memory write and read operations. Both operations are carried out at a time interval of 20 ns (50 MHz system clock) with the help of the write and read logic blocks respectively. The TDC channels are interfaced with the memory write logic to transfer their timing data in the order of 1 to 9. The timing data corresponding to a single transition in the multi-hit signal consists of 22 bits and is copied into the FIFO as two separate 17-bit words. The first word consists of a 3-bit channel id, a 2-bit event id, a 1-bit part id, the 7-bit fine count (bits [0-6]) and 4 bits of the coarse count (bits [7-10]). The second word consists of a 3-bit channel id, a 2-bit event id, a 1-bit part id and the remaining 11 bits of the coarse count (bits [11-21]).
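The splitting of one 22-bit timing value into two 17-bit words can be sketched as follows. The field widths come from the description above, but the bit positions inside each word (header fields in the most significant bits) and the part id values (0 for the first word, 1 for the second) are assumptions made for illustration, as are the function names.

```python
def pack_timing_words(channel_id, event_id, timing_data):
    """Split one 22-bit timing value into two 17-bit FIFO words.

    Assumed layout, MSB to LSB:
    channel id (3) | event id (2) | part id (1) | payload (11).
    """
    if not 0 <= channel_id < 8:
        raise ValueError("channel id must fit in 3 bits")
    if not 0 <= event_id < 4:
        raise ValueError("event id must fit in 2 bits")
    if not 0 <= timing_data < (1 << 22):
        raise ValueError("timing data must fit in 22 bits")
    header = (channel_id << 3) | (event_id << 1)      # part id occupies bit 0
    low = timing_data & 0x7FF             # bits [0-10]: fine [0-6] + coarse [7-10]
    high = (timing_data >> 11) & 0x7FF    # bits [11-21]: remaining coarse count
    word0 = ((header | 0) << 11) | low    # part id 0 (assumed) marks the first word
    word1 = ((header | 1) << 11) | high   # part id 1 (assumed) marks the second word
    return word0, word1


def unpack_timing_words(word0, word1):
    """Recover (fine_count, coarse_count) from the two payloads."""
    timing = ((word1 & 0x7FF) << 11) | (word0 & 0x7FF)
    return timing & 0x7F, timing >> 7     # 7-bit fine count, 15-bit coarse count
```

A round trip such as `unpack_timing_words(*pack_timing_words(5, 2, (1234 << 7) | 77))` returns the original fine and coarse counts, which is a quick way to check that no payload bits are lost in the split.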

The FIFO is designed to be addressed in the form of a continuous ring with the help of read and write pointers. Due to the sequential access nature of the memory, the FIFO is subject to underflow and overflow conditions. These conditions are intimated to the read and write logic with the help of '*full*' and '*empty*' flags. These flags assert conditionally as per the count of the 8-bit '*status\_counter*' register. This register count starts from zero at the time of initialization, increases by 1 count at every write and decreases by 1 count at every read operation.

The timing diagram of data transfer from TDC channel to FIFO is shown in Fig.5.11, where the interface logic, write & read control logic blocks work on falling edge of clock. The FIFO block, status counter and memory address pointers work on rising edge of clock.

The channel interface logic checks the status of '*read command*' signal issued from logic control block corresponding to each TDC channel in the order of 1 to 9. If status of '*read command*' is active high, interface logic issues active high '*eoc\_active*'. Simultaneously, it changes the status of '*eoc\_busy*' from low to high to disable the interfacing of next TDC channel until entire data corresponding to the selected channel is transferred to the memory.

In the write logic, a 4-bit counter '*count\_wr\_enable*' is designed for each channel, which increments at the falling edge of the clock if the '*eoc\_active*' and '*read command*' signals are high and the '*full*' flag is low. The first bit of this counter is used to obtain the '*write enable*' signal. Simultaneously, the 17-bit data to be transferred to memory from the TDC channel is loaded on the bus '*data\_write\_FIFO[0:16]*'.

The '*write enable*' signal enables the memory write operation. On the subsequent rising edge of clock, the 17-bit data on the bus is copied into the FIFO memory.

At each write operation, 8-bit '*status\_counter*' register increments by one count to update its status and causes '*empty*' flag to switch to low state. Thereby, it intimates the read logic for data read-out from FIFO. The read control logic detects the active low status of '*empty*' flag on subsequent falling edge of clock and asserts a '*read enable*' signal. On the subsequent rising edge of clock, the data from FIFO is loaded on bus '*FIFO dataout*' to be read-out by external interface.

To locate the address in the FIFO during read and write operations, 8-bit counters named 'write pointer' and 'read pointer' are designed. They provide the memory locations with an address range from 0 to 255. Both counters increment on the rising edge of the clock when their corresponding enable signals ('write enable' and 'read enable') are in the high state. The write and read operations start at FIFO address '00000000'.
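The ring addressing, status counter and flags described above can be modelled behaviourally as shown below. This is a sketch, not the RTL: clock edges are abstracted away, the class and method names are hypothetical, and the exact count at which '*full*' asserts is an assumption (taken here as the full depth).

```python
class RingFifo:
    """Behavioural model of a ring-addressed FIFO with a status counter."""

    def __init__(self, depth=256, width=17):
        self.depth = depth
        self.mask = (1 << width) - 1
        self.mem = [0] * depth
        self.write_pointer = 0      # both pointers start at address 0
        self.read_pointer = 0
        self.status_counter = 0     # +1 per write, -1 per read

    @property
    def empty(self):
        return self.status_counter == 0

    @property
    def full(self):
        return self.status_counter == self.depth   # assumed threshold

    def write(self, word):
        """Store a word; returns False (and stores nothing) when full."""
        if self.full:
            return False
        self.mem[self.write_pointer] = word & self.mask
        self.write_pointer = (self.write_pointer + 1) % self.depth  # wrap around
        self.status_counter += 1
        return True

    def read(self):
        """Return the oldest word, or None when empty."""
        if self.empty:
            return None
        word = self.mem[self.read_pointer]
        self.read_pointer = (self.read_pointer + 1) % self.depth
        self.status_counter -= 1
        return word
```

In this model the '*empty*' flag drops as soon as one word has been written, which corresponds to the condition that lets the read logic assert its read enable.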

When eight 17-bit words corresponding to four transitions in a single channel have been copied to the FIFO, the active high channel reset signal '*reset\_fr\_memory*' asserts. It resets the corresponding channel so that the respective '*read command*' switches from logic '1' to '0'. On the subsequent falling edge, the channel interface and write logic are reset. After a reset duration of three clock periods, the write logic is ready to commence interfacing with the next TDC channel.



Figure 5.11: Timing diagram of data transfer from TDC channels to FIFO based memory

#### 5.5.1 Read-out logic interface

The 17-bit data can be transferred from the FIFO to the external interface either serially, using SPI, or through a parallel interface. Either of the two modes can be selected as per the need of the application. The control signal '*ser\_npar*' selects the read-out mode for data transfer. When the '*ser\_npar*' signal is at logic '0', the parallel read-out mode is selected and the SPI is in reset state. The '*data\_ready*' signal issued by the read logic at the falling edge of the system clock enables the parallel interface. On the subsequent rising edge of the clock, the data from the FIFO is loaded in the internal register '*FIFO dataout*' to be transferred to the external interface through the bus '*par\_dataout*'. The external interface issues an acknowledgment '*par\_ack*' after reading the data on the bus, which resets the read logic. A typical acknowledgment time of 1.5 $\mu$s is assumed in this mode to estimate the event rate that can be handled with respect to the size of the memory. The event rate is calculated as follows:

*Number of transitions per event per channel = 4* 

*Number of words in FIFO per transition = 2* 

*Number of words in FIFO per event for 9-channels* =  $4 \times 2 \times 9 = 72$  *words* 

*Time per event* =  $72 \times 1.5 \mu s = 108 \mu s \approx 100 \mu s$ 

So event rate = $1/(100 \ \mu s) = 10 \ \text{kHz}$

As the depth of the FIFO is 256 words, it can hold 3 events, which corresponds to a 30 kHz event rate in parallel data transfer mode.

When the '*ser\_npar*' signal is at logic '1', the SPI is ready to transfer data serially. The '*data\_ready*' signal issues a request to the external interface for data transfer. The external interface acknowledges by issuing an active low slave select '*SSEL*' signal and 24 cycles of the master clock '*SCK*', starting with a falling edge transition. The active low status of SSEL is checked on the next rising edge of the system clock, which leads to loading of the 17-bit data from the '*FIFO dataout*' bus into the 24-bit internal register of the SPI. Simultaneously, an acknowledgment '*ser\_data\_ack*' is issued, which resets the read logic of the memory controller.

The data transfer logic from the internal register of the SPI to the external interface is the same as discussed in chapter-4. However, in this design the frequency of the master clock is chosen as 4 MHz, so a time of 250 ns is required to transfer one bit. Transferring a 24-bit word through the SPI therefore requires $250\,ns \times 24 = 6\,\mu s$. The time required to transfer the 72 words corresponding to one event is $72 \times 6\,\mu s = 432\,\mu s$, so the event rate is $1/(432\,\mu s) \sim 2.3\,kHz$. As the depth of the FIFO is 256 words, sufficient for three events, this ASIC can handle ~ 6 kHz event rate in serial data transmission mode.
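The read-out time budget for both modes follows the same arithmetic and can be reproduced with a few lines of Python. This is a sketch using only the figures quoted in the text (72 words per event, 1.5 µs per word in parallel mode, 24 SCK cycles at 4 MHz in serial mode, FIFO depth 256); the helper names are hypothetical. Note that the text rounds the 108 µs parallel read-out time to about 100 µs before quoting a rate.

```python
FIFO_DEPTH = 256
WORDS_PER_EVENT = 4 * 2 * 9    # transitions x words per transition x channels


def event_time_us(word_time_us, words_per_event=WORDS_PER_EVENT):
    """Time needed to read out all words of one event."""
    return words_per_event * word_time_us


def events_buffered(depth=FIFO_DEPTH, words_per_event=WORDS_PER_EVENT):
    """Number of complete events that fit in the FIFO."""
    return depth // words_per_event


def rate_khz(time_us):
    """Convert a per-event read-out time in microseconds to a rate in kHz."""
    return 1000.0 / time_us


parallel_us = event_time_us(1.5)         # 1.5 us acknowledgment per word
serial_us = event_time_us(24 / 4.0)      # 24 SCK cycles at 4 MHz = 6 us per word
```

The quantities `rate_khz(parallel_us)` and `rate_khz(serial_us)` give the sustained per-event rates, while `events_buffered()` gives the number of events the FIFO can absorb in a burst.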

#### 5.6 Layout design aspects

The layout design of this TDC ASIC is based on mixed signal design flow using both manual and automatic P & R design approaches. In the first plan of layout design, the TDC channel including ring oscillators, phase detectors, coarse and fine counters and latches is designed using manual P & R as shown in Fig.5.12 to control the routing delays and placement of blocks.



Figure 5.12: Snapshot of layout of TDC channel using first approach



Figure 5.13: Snapshot of layout of TDC ASIC

The area occupied by a single TDC channel is $950 \times 685 \mu m^2$, which makes it difficult to integrate nine such channels (including the trigger channel), along with the memory and interface logic occupying an area of $1.9 \times 1.9 \ mm^2$, within the cavity size of $3.0 \times 3.0 \ mm^2$. Therefore, in the second plan, the counters and latches are integrated along with the memory and interface logic using the automatic P & R tool Encounter. The TDC channel, including oscillators, phase detectors and LSB calibration logic, is manually placed and routed. The modified layout of a single channel occupies a relatively smaller area of $432 \times 207 \ \mu m^2$, as shown in Fig.5.13. The layout of the memory is designed in two parts, each having a size of $17 \times 128$ bits and dedicated to four channels of the TDC. The top level interfacing of the Encounter block and the TDC channels is carried out by manual P & R using the Virtuoso layout editor and is shown in Fig.5.13.

Further, similar layout design precautions as discussed in chapter-4 have been taken to reduce the issues of systematic mismatch among stop oscillator channels, cross talk between interconnects and substrate coupling noise among oscillator channels.

### 5.7 Simulation results

In this section, the simulation results of the multi-hit Vernier TDC ASIC are discussed. The functional and timing verification of the TDC channels is carried out with the help of SPICE simulators using device models provided by the foundry. This approach is timing accurate, as it includes the impact of parasitics related to MOS devices and to the routing of standard cells on the performance of the TDC. The functional verification of the automatic P & R layout is carried out using a mixed mode AMS simulator with SDF delays provided by Encounter. The top level functional verification is also carried out by SPICE simulator. The following tests have been performed to verify the performance of this ASIC:

## 5.7.1 Variation in time periods of ring oscillators and resolution of time interval measurement across five design process corners

The values of the time periods $T_{oscst}$ and $T_{oscsp}$ and their difference ($\Delta T_d$) for four transitions in the multi-hit signal across five design process corners are shown in Table 5.3. On the typical corner, the time-periods $T_{oscst}$ and $T_{oscsp}$ of the ring oscillators are 7.402 ns and 7.304 ns and their difference is 98 ps, which is the resolution of the TDC. Across the PVT variations, there is a maximum 50 % variation in the time-periods of the oscillators with respect to the typical designed values. However, the variation in the difference of time periods is less than the typical designed value of 98 ps.

| Corners → / Channels ↓ | WP (0 °C) | WP (27 °C) | TYP (27 °C) | WS (84 °C) | WS (27 °C) | WO (27 °C) | WZ (27 °C) |
|------------------------|-----------|------------|-------------|------------|------------|------------|------------|
| Stclk: T<sub>oscst</sub> (ns) | 4.93 | 5.422 | 7.402 | 11.71 | 9.845 | 7.501 | 7.108 |
| Spclk_transition<sub>1</sub>: T<sub>oscsp</sub> (ns) | 4.876 | 5.339 | 7.304 | 11.53 | 9.66 | 7.401 | 6.998 |
| Spclk_transition<sub>2</sub>: T<sub>oscsp</sub> (ns) | 4.877 | 5.339 | 7.302 | 11.53 | 9.661 | 7.402 | 6.999 |
| Spclk_transition<sub>3</sub>: T<sub>oscsp</sub> (ns) | 4.876 | 5.339 | 7.304 | 11.53 | 9.659 | 7.401 | 6.998 |
| Spclk_transition<sub>4</sub>: T<sub>oscsp</sub> (ns) | 4.877 | 5.339 | 7.304 | 11.53 | 9.659 | 7.401 | 6.998 |
| LSB_transition<sub>1</sub> (ps) | 54 | 83 | 98 | 180 | 185 | 99 | 101 |
| LSB_transition<sub>2</sub> (ps) | 53 | 82 | 100 | 180 | 184 | 100 | 101 |
| LSB_transition<sub>3</sub> (ps) | 54 | 83 | 98 | 180 | 186 | 100 | 101 |
| LSB_transition<sub>4</sub> (ps) | 53 | 83 | 98 | 180 | 186 | 100 | 101 |

Table 5.3: Variation of time period and LSB across process corners

## 5.7.2 Impact of temperature over time period of clocks and LSB of time interval measurement

The propagation delay of a standard cell increases with temperature, as discussed in chapter-3. Therefore, the variation in the time-period of the ring oscillators as well as their difference ( $T_{oscst} - T_{oscsp}$ ) over temperature is simulated on the typical corner and is shown in Fig.5.14 (a). The maximum frequency of the oscillator, corresponding to a temperature of 0 °C, is ~155 MHz, at which the safe operation of the 15-bit counter is ensured. Also, as shown in Fig.5.14 (b), by virtue of the Vernier technique, the variation in resolution (difference in time periods) is zero over a small range of temperature, such as from 61 °C to 65 °C, due to identical variations in the oscillators' time periods. The end-to-end variation in resolution is 25 ps over the entire range of temperature, which is less than the typical chosen value of 98 ps.



Figure 5.14: (a) Variation in time period of oscillator over temperature (b) LSB variation over temperature

## 5.7.3 Variation in resolution across stop (transition) channels due to local mismatch

The time periods of identically designed stop oscillators are expected to be identical. However, due to systematic variation in their layout drawing as well as the effect of local mismatch, there may be a slight variation in their time-periods. This leads to a variation in the resolution corresponding to the four transitions. Care has been taken in the layout design to avoid systematic variations. The impact of unavoidable local mismatch on the time periods of the stop oscillators is evaluated by Monte Carlo simulation. The RMS variation in the time period of the stop oscillators is evaluated as 11.90 ps, which leads to the same variation in resolution for the stop channels in time interval measurement, as shown in Fig.5.15. As the calibration accuracy is finer than this value, this variation will be accounted for through calibration of each channel independently.



Figure 5.15: LSB variation due to local mismatches over 73 runs of Monte-Carlo simulation

#### 5.7.4 Time period and resolution (LSB) calibration

To account for the variations due to process, temperature and mismatch, individual calibration of the resolution corresponding to the four transitions and of the time period of the start oscillator is provided in the calibration mode of this ASIC.

The calibration mode is tested by setting the '*dr\_sel*' control signal to '11' to select operating mode-3 of the TDC channels. The '*calibration start*' pulse is applied using a Verilog test bench and the TDC channel is accurately simulated using a mixed mode simulator. On the typical corner, the duration of the time period calibration window obtained from the 'logic control block' is t = 79.99954 $\mu$s. The count obtained from the 15-bit calibration counter within this duration is N = 10786. Thus, using equation (5.2), the calibrated time-period of the start oscillator is approximately 7.417 ns, which closely matches the simulated value of time-period given in Table 5.4. Table 5.4 shows the calibrated time-periods using the pre-layout estimated delay netlist as well as the extracted netlist. The maximum error in time period calibration is $\sigma_{Tcal} = 3 ps$.

In the LSB calibration, the durations of the fine calibration windows, the stretching factors and the LSBs corresponding to four consecutive transitions are listed in Table 5.5. The calibrated resolution closely matches the simulated results. The error in calibrated resolution is $\sigma_{lsbcal} = 3.6 \, ps$ for $\pm 1$ count in the stretching factor, also considering the error of time period calibration.

| Corner | Operating condition | t (μs) | Count (N) | Calculated time period (ns) | Calibrated time period (ns) |
|--------|---------------------|--------|-----------|-----------------------------|-----------------------------|
| **With pre-layout estimated delay netlist** | | | | | |
| TYP | 3.3 V & 25 °C | 80 | 10710 | 7.469 | 7.469 |
| WP | 3.3 V & 25 °C | 80 | 17087 | 4.682 | 4.6819 |
| WP | 3.3 V & 0 °C | 80 | 18042 | 4.43 | 4.434 |
| WP | 3.6 V & 0 °C | 80 | 19455 | 4.1 | 4.112 |
| WS | 3.3 V & 25 °C | 80 | 7477 | 10.7 | 10.699 |
| WS | 3.3 V & 75 °C | 80 | 6746 | 11.85 | 11.843 |
| WS | 3.0 V & 75 °C | 80 | 6256 | 12.787 | 12.7987 |
| **With RC-extracted delay netlist** | | | | | |
| TYP | 3.3 V & 27 °C | 79.9 | 10786 | 7.417 | 7.4169 |
| WS | 3.3 V & 27 °C | 80 | 8126 | 9.845 | 9.8449 |

Table 5.4: Calibrated results for time period of start oscillator

#### 5.7.5 Jitter analysis for ring oscillators

In the time interval measurement mode, the time-period of the '*start*' and '*stop*' ring oscillator clocks is a crucial design parameter. Ideally, the designed value of the time-period of the clocks is fixed. However, in practice, due to the effect of device noise, cross talk, and substrate and supply noise, the time-period becomes a function of the cycle of oscillation (n). The fluctuation ( $\Delta T_n$ ) in time-period (period jitter) on each cycle is determined by:

$$\Delta T_n = T_n - T_{avg} \tag{5.5}$$

Where,  $T_n$  is the time period of oscillator clock at  $n^{th}$  cycle and  $T_{avg}$  is the average time period calculated over large number of cycles.

In this design, the period-jitter is determined over 1500 cycles by accurate simulation of the RC-extracted netlist of start and stop oscillators with spice models (includes the noise models) using Spectre simulator. Fig.5.16 shows the plot of period jitter over number of cycles.

To represent the long term average effect of period jitter, its RMS value is calculated by [158]:

$$\Delta T_{c} = \lim_{N \to \infty} \sqrt{\frac{1}{N} \sum_{n=1}^{N} \Delta T_{n}^{2}} \tag{5.6}$$
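Equations (5.5) and (5.6) can be evaluated on a finite list of simulated periods as sketched below. This is a minimal illustration, assuming the periods have already been extracted from the simulator output into a Python list; the function names are hypothetical, and a finite number of cycles replaces the limit in equation (5.6).

```python
import math


def period_jitter(periods):
    """Equation (5.5): deviation of each period from the average period."""
    t_avg = sum(periods) / len(periods)
    return [t - t_avg for t in periods]


def rms(values):
    """Equation (5.6) evaluated over a finite number of cycles."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

For a list `periods` of per-cycle time periods, `rms(period_jitter(periods))` gives the RMS period jitter in the same unit as the input.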

The RMS value of the period jitter is $\sigma_{Tjitter} = 0.3 \, ps$, which is much smaller than the time period of the start ring oscillator clock and therefore does not affect the expected number of cycles in a given time interval.



Figure 5.16: Period jitter in start oscillator clock (stclk)



Figure 5.17: Variation in (a) start oscillator clock time period (b) stop oscillator clock time period (c) LSB (difference in time periods), over number of oscillation cycles

The effect of the period jitter of the stop and start oscillators on the resolution of time interval measurement is verified by triggering both oscillators simultaneously and plotting their time-periods over 92 clock cycles. This number of cycles covers the designed stretching factor S = 74 for fine time interval measurement. The fluctuation in time-period is largest at the start of oscillation and settles after a few cycles. Fig.5.17 (a, b) shows the variation in the time-periods of the start and stop oscillators and Fig.5.17 (c) shows the variation in their difference (resolution) over the number of cycles. The RMS value of the variation in the resolution is $\sigma_{LSBjitter} = 0.95 \, ps$, or about 0.0095 LSB.

The overall impact of these errors on the precision of time interval measurement is calculated using the equation [129]:

$$\sigma_{rms} = \sqrt{\sigma_q^2 + \sigma_{lsbcal}^2 + \sigma_{Tcal}^2 + \sigma_{Tjitter}^2 + \sigma_{LSBjitter}^2} \tag{5.7}$$

Where, $\sigma_q$ is the RMS error due to quantization noise and is given as $\sigma_q = LSB/\sqrt{12} = 100/\sqrt{12} \approx 28.9 \, ps$

The total theoretical RMS error is:

$$\sigma_{rms} = \sqrt{(28.9)^2 + (3.6)^2 + (3.2)^2 + (0.3)^2 + (0.95)^2} \approx 29.3 \, ps \tag{5.8}$$
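The error budget of equation (5.7) is a root-sum-square of contributions that are treated as independent. A minimal Python sketch of that combination is given below; the helper names are hypothetical, and all inputs are assumed to be in the same unit.

```python
import math


def quantization_rms(lsb):
    """RMS quantization error of a uniform quantizer: LSB / sqrt(12)."""
    return lsb / math.sqrt(12)


def total_rms_error(*components):
    """Equation (5.7): root-sum-square of independent error terms."""
    return math.sqrt(sum(c * c for c in components))
```

A call of the form `total_rms_error(quantization_rms(lsb), sigma_lsbcal, sigma_tcal, sigma_tjitter, sigma_lsbjitter)` evaluates the budget. Because each term enters as a square, the quantization term dominates whenever it is several times larger than the others.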

#### 5.7.6 Output versus input time interval characteristics

The linearity of TDC channel is verified by applying linear sweep patterns of multihit signal in steps of 200 ps with respect to trigger signal over 40 ns range, as shown in Fig.5.18 (a). Fig.5.18 (b) shows the plot between measured relative times of transitions in multi-hit signal versus applied time steps on typical (TYP) corner.



Figure 5.18: (a) Applied pattern for multi-hit signal with respect to trigger (b) output versus input time interval over 40 ns range on typical corner

The DNL and INL errors are derived from this characteristic and are shown in Fig.5.19. The DNL error is less than 0.3 LSB and INL error is less than 0.5 LSB.
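The procedure used to derive DNL and INL from the swept characteristic is not detailed here. One common way to do it, given as a sketch under that caveat, is to express the deviation of each measured interval from the applied interval in LSB (INL) and to take the step-to-step change of that deviation (DNL). The function name and the choice of the ideal line as reference are assumptions.

```python
def inl_dnl(applied_ps, measured_ps, lsb_ps):
    """Return (inl, dnl) in LSB for a swept input/output characteristic.

    INL[i] = (measured[i] - applied[i]) / LSB
    DNL[i] = INL[i + 1] - INL[i], i.e. the error of each measured step.
    """
    inl = [(m - a) / lsb_ps for a, m in zip(applied_ps, measured_ps)]
    dnl = [inl[i + 1] - inl[i] for i in range(len(inl) - 1)]
    return inl, dnl
```

For a sweep in 200 ps steps, `applied_ps` would be `[0, 200, 400, ...]` and `measured_ps` the corresponding TDC outputs converted to picoseconds.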



Figure 5.19: Simulated DNL and INL plots for linearity of occurrence time of transition-1 measurement

The same test is also performed across slow (WS) and fast (WP) design process corners for step size of 200 ps over 30 ns range. The plots of output versus input characteristic are shown in Fig.5.20 (a) and (b). The derived DNL and INL error is less than 1 LSB on both the corners, as shown in Fig.5.21.



Figure 5.20: Output versus input time interval over 40 ns range on (a) WP corner (b) WS corner

## 5.8 Negative time interval measurement in Vernier technique: suitability of Vernier multi-hit TDC in INO experiment

Figure 5.21: Simulated DNL and INL plots for linearity of occurrence time of transition-1 measurement

When the '*stop*' signal occurs before '*start*', the time interval between them is negative in sign. This negative time interval can be measured using the Vernier technique due to the assured phase coincidence between the 'start' and 'stop' oscillator clocks, which have a slight difference in their time periods. As shown in Fig.5.22, the fast (stop) oscillator with time period $T_{oscsp} = 4 \times \Delta T_d$ starts oscillating before the slow one (start) with time period $T_{oscst} = 5 \times \Delta T_d$. The rising edge of stclk (slow clock) approaches the trailing rising edge of spclk (fast clock) by a step size of $\Delta T_d$ on each clock cycle, where $\Delta T_d = T_{oscst} - T_{oscsp}$. Eventually, the rising edges of both clocks coincide. This coincidence is detected by the leading edge phase detector (refer chapter-4). The phase detection is based on sampling and shifting the status of stclk by spclk. It asserts the '*eoc*' signal when the previous sample is at logic '1' and the current one is at logic '0', which happens when the rising edge of '*spclk*' comes to lead that of '*stclk*', in both positive and negative time intervals. However, in a negative time interval, the number of elapsed cycles of spclk (fine count) is greater than that of stclk (coarse count) by the time coincidence is achieved.

The number of bits 'b' of the coarse counter is chosen to cover the stretching factor $S_N$, which is defined as the number of steps of $\Delta T_d$ spanned over a range of $T_{oscsp}$. Therefore the stretching factor $S_N = \frac{T_{oscsp}}{\Delta T_d}$ is smaller than that designed for positive time interval (refer equation (5.3)). The number of bits 'm' of the fine counter is chosen as per the required dynamic range $(2^m - 1 - S_N) \times T_{oscsp}$.

The concept of negative time interval measurement is applied to the conceptualized architecture of the Vernier multi-hit TDC (discussed in section 5.2). This makes the Vernier multi-hit TDC channel usable in both common start and common stop modes for the two input signals '*trigger*' and '*multi-hit*' (discriminator output signal) available in the INO experiment. Here, if the trigger comes before the multi-hit, the TDC channel works in common start mode (section 5.1.2) and is based on positive time interval measurement. However, if the trigger comes later than the multi-hit (as is being planned in the INO experiment), the same TDC channel works in common stop mode and is based on negative time interval measurement.

Figure 5.22: Edge representation of spclk and stclk rising edge coincidence in negative time interval

The working of the Vernier multi-hit TDC channel (refer Fig.5.5) in common stop mode, based on negative time interval measurement, is shown in Fig.5.23. Here, the '*multi-hit*' signal occurs before the '*trigger*' signal. The first transition in multi-hit initiates the stop oscillator ( $T_{oscsp}$ ) and the '*trigger*' signal initiates the start oscillator ( $T_{oscst}$ ), with $T_{oscst}>T_{oscsp}$. The rising edge of the start oscillator clock approaches the trailing rising edge of stop-1. After $N_{c1}$ and $N_{f1}$ cycles of the slow and fast clocks, measured by the coarse and fine counters respectively, the oscillators have phase coincidence. The time interval between the first transition and the start signal is:

$$\Delta T_1 = (N_{c1} - 1) \times T_{oscst} - (N_{f1} - 1) \times T_{oscsp} \tag{5.9}$$

As the fine count $N_{f1}(=9)$ is greater than the coarse count $N_{c1}(=4)$, the time interval between the first multi-hit transition and the trigger is negative in sign. The second transition has a positive time interval with respect to the trigger and is measured through the basic Vernier technique.
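The negative-interval case can be reproduced with an idealised edge model. The sketch below assumes integer time units of $\Delta T_d$, exact edge coincidence and a cycle count that starts at 1 on the first rising edge of each clock; the function names and the 17-unit lead of the multi-hit transition are assumptions, the latter chosen so that the model yields the counts $N_{c1} = 4$ and $N_{f1} = 9$ quoted above for $T_{oscst} = 5\Delta T_d$ and $T_{oscsp} = 4\Delta T_d$.

```python
def vernier_counts(lead, t_st, t_sp, max_cycles=10_000):
    """Find the first rising-edge coincidence of stclk and spclk.

    The stop (fast) oscillator starts at t = 0 and the start (slow)
    oscillator starts `lead` time units later. Returns (n_c, n_f), the
    elapsed start-clock and stop-clock cycles at coincidence.
    """
    for j in range(max_cycles):
        t_edge = lead + j * t_st            # time of the (j + 1)-th stclk edge
        if t_edge % t_sp == 0:              # an spclk edge falls at the same time
            return j + 1, t_edge // t_sp + 1
    raise ValueError("no coincidence found within max_cycles")


def time_interval(n_c, n_f, t_st, t_sp):
    """Equation (5.9): interval of the transition relative to the trigger."""
    return (n_c - 1) * t_st - (n_f - 1) * t_sp
```

With these assumptions, `vernier_counts(17, 5, 4)` gives the coarse and fine counts, and `time_interval(4, 9, 5, 4)` evaluates to a negative value, consistent with the transition preceding the trigger.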

Thus, in the Vernier TDC channel, transitions occurring both before and after the trigger can be measured, and therefore it can be used in both common start and common stop modes. *Also, to make the dynamic range equal for both modes, coarse and fine counters of equal width need to be chosen.* Here, in common start mode (positive time interval), the width of the coarse counter defines the dynamic range, whereas in common stop mode (negative time interval) the width of the fine counter defines the dynamic range. This feature of the Vernier technique is planned to be incorporated in the second version of the multi-hit Vernier TDC ASIC.



Figure 5.23: Negative time interval measurement using Vernier technique

### 5.9 Simulation result

Fig.5.24 shows the test patterns of the start and multi-hit signals, applied using a Verilog test bench, to verify negative time interval measurement using the Vernier technique. Here, in common start mode, the multi-hit signal is swept with a step size of 300 ps over a range of 40 ns. In common stop mode, the trigger signal is swept with a step size of 300 ps over a range of 40 ns with respect to the fourth transition in the multi-hit signal. In both test patterns, the absolute time interval between the fourth transition in multi-hit and the trigger (negative time interval) is the same as the time interval between the first transition in multi-hit and the trigger (positive time interval).



Figure 5.24: Applied test pattern for time interval (a) common start (b) common stop

For the first transition in common stop mode (Fig.5.24(b)), the waveform representing the phase coincidence together with the coarse ($N_c$) and fine ($N_f$) counts is shown in Fig.5.25. The output versus input time interval characteristics for both modes are shown in Fig.5.26. Here, the time of the first transition with respect to the trigger in common start mode is equal to the time of the fourth transition with respect to the trigger in common stop mode.

[Waveform: 1st transition in multi-hit, with traces start, stclk, stop oscillator-1, eoc-1, coarse count bits Nc-0 to Nc-4 and fine count bits N<sub>f</sub>-0 to N<sub>f</sub>-4; annotation: fine count = 8, coarse count = 3]

Figure 5.25: Waveform representing the phase coincidence for transition-1 in common stop mode



Figure 5.26: Output versus input time intervals for applied test patterns in common start and stop mode

### 5.10 Summary

This chapter presents the design and implementation of a 0.35 $\mu$m CMOS standard cell based Vernier TDC ASIC prototype with multi-hit capability. This ASIC is an extended version of the standard cell based Vernier TDC ASIC (discussed in chapter-4) and is designed for general purpose use in HEP experiments. It supports four operating modes: 'time interval', 'common start', 'common stop' and 'calibration'. In common start mode, the trigger starts the measurement and a maximum of four transitions occurring after the trigger within the dynamic range window can be measured. This mode is used in the INO experiment if the trigger occurs before the multi-hit signal. In common stop mode, an 'event reset' signal is required to commence the time measurement of the four transitions and the trigger. This mode is suitable for accelerator based HEP experiments, where a 'bunch reset' signal is available to act as event reset. In the INO experiment, if the trigger comes after the multi-hit signal, this mode is not suitable because no 'event reset' signal is available. Therefore, the concept of negative time interval measurement using the Vernier technique has been analyzed and applied to the designed architecture of the Vernier multi-hit TDC in common start mode, which makes the TDC channel usable in common stop mode as well. This concept is planned to be incorporated in the next version of the multi-hit design, with a slight modification in the input signal processing and identical widths of the coarse and fine counters.

In the calibration mode, the TDC channels themselves are used to calibrate the design parameters, such as the time period of the oscillators and the LSB of time measurement. This differs from the previous design of the Vernier TDC ASIC, where a separate reference channel of oscillators with a separate read-out channel was used to calibrate the design parameters. The present approach improves the accuracy of the calibrated values used at the time of measurement. On the other hand, it reduces the measurement rate of the TDC ASIC, which is an acceptable compromise because the event rate in the INO experiment is low ($\sim$ 100 Hz).

A $256 \times 17$ bit FIFO based memory is designed to store the timings of multiple transitions within the selected dynamic range. The size of the memory is chosen so as to store the data corresponding to three events, each having a maximum of four transitions. The timing data corresponding to each transition is copied into memory in two parts, each of 17-bit word size. This approach is chosen so that the area occupied by the memory fits within the cavity size of the package used. The data read-out from memory to the external interface is based on either an SPI or a parallel interface, and either of the two can be selected. The TDC channels are designed using a manual P & R approach to control the routing delays. The memory and read-out logic are designed in Verilog HDL with an automatic P & R approach to reduce the design effort and time as well as to achieve a compact design.

The functional verification has been completed by simulating the logic with pre-estimated delays using a mixed-signal simulator. The timing accuracy of the TDC channels and of the memory and read-out interface is assured by simulating the extracted netlist using the Spectre simulator. The specifications achieved after performance validation of the TDC design are a 100 ps LSB over a selectable dynamic range of 10 $\mu s$/20 $\mu s$/30 $\mu s$/60 $\mu s$ and an RMS error of 30.713 ps in measurement.

| Temperature<br>Supply voltage<br>Process corner | Hit | F6 | F5 | F4 | F3 | F2 | F1 | F0 | Duration of fine calibration window (ns) | S | LSB\* (ps) | LSB (ps) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 27 °C<br>3.3 V<br>TYP | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 1<br>1<br>1<br>1 | 547.8165<br>547.7305<br>547.7830<br>547.7830 | 75<br>75<br>75<br>75 | 98.69<br>98.69<br>98.69<br>98.69 | 98<br>100<br>98<br>98 |
| 27 °C<br>3.3 V<br>WZ | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>1<br>0<br>0 | 1<br>0<br>1<br>1 | 482.5721<br>489.7308<br>482.6923<br>482.7115 | 69<br>70<br>69<br>69 | 102.89<br>101.42<br>102.89<br>102.89 | 101<br>101<br>101<br>101 |
| 0 °C<br>3.3 V<br>WP | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 1<br>1<br>1<br>1 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 453.191<br>458.511<br>453.684<br>458.511 | 93<br>93<br>93<br>93 | 53.010<br>53.010<br>53.010<br>53.010 | 54<br>54<br>54<br>54 |
| 27 °C<br>3.3 V<br>WO | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 1<br>1<br>1<br>1 | 614.286<br>614.286<br>614.286<br>614.286 | 83<br>83<br>83<br>83 | 90.2<br>90.2<br>90.2<br>90.2 | 91<br>92<br>91<br>91 |
| 84 °C<br>3.3 V<br>WS | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 749.0249<br>748.9638<br>748.9555<br>748.9069 | 65<br>65<br>65<br>65 | 180<br>180<br>180<br>180 | 180<br>180<br>180<br>180 |
| 2 °C<br>3.3 V<br>TYP | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>1<br>0<br>0 | 495.71<br>501.78<br>495.71<br>495.71 | 74<br>75<br>74<br>74 | 91.7<br>90.52<br>91.7<br>91.7 | 91<br>91<br>91<br>91 |
| 1 °C<br>3.3 V<br>TYP | Hit-1<br>Hit-2<br>Hit-3<br>Hit-4 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>0<br>0<br>0 | 1<br>1<br>1<br>1 | 0<br>1<br>0<br>0 | 493.571<br>500<br>493.571<br>493.571 | 74<br>75<br>74<br>74 | 91.4<br>90.3<br>91.4<br>91.4 | 92<br>91<br>92<br>92 |

Table 5.5: Calibrated resolution (LSB\*) using stretching factor (S) and calculated resolution (LSB) for time interval measurement

## Chapter 6

## Design of TDC using CBL delay element based tapped delay line

The time stamping technique is chosen for TDC development as it can handle both time tagging and time interval measurement with high event rates and good linearity. Here, the occurrence time of a transition in the discriminator (multi-hit) signal is marked with respect to the reference clock using a counter and a tapped delay line (TDL). This technique can meet the TDC specifications given by the INO group in terms of dynamic range ($\approx 32 \ \mu s$), multi-channel integration and multi-hit capability. However, the resolution of time stamping depends on the smallest delay attainable from the delay element used to realize the TDL. In the earlier reported TDCs [110, 113], the current starved inverter is used as the delay element due to its wide delay regulation range and low power consumption (negligible static current). However, its smallest attainable delay is $\approx 400$ ps in the available 0.35 $\mu$m CMOS process. This rules out the current starved inverter for the targeted resolution of better than 200 ps.

To achieve a resolution better than that attainable in a given technology, the earlier reported TDCs [135, 136] use a delay interpolation technique, the 'array of delay lock loops' (ADLL), with a current starved inverter based delay element. This technique requires multiple DLLs to interpolate the delay of the current starved inverter, leading to large power and area consumption. In a second approach, used in [111], a high speed delay element based on current mode logic (CML) is used, which requires a high static current ($\approx 1$ mA) to provide a small delay. In this design, where low power is a constraint due to the millions of detector channels, a high speed voltage controlled buffer based on 'current balanced logic' (CBL) [156] is designed. It has the merits of small area (4 transistors), identical edge transition delays, a wide delay regulation range ($\approx$ 136 ps to 1.6 ns) and a small propagation delay (136 ps, compared to $\approx$ 400 ps in the current starved inverter). The CBL delay element also has lower design complexity, as both transition delays are controlled by a single bias voltage, whereas the current starved inverter needs two bias voltages to control the rising and falling edge delays. In addition, it contributes less 'di/dt' switching noise and has lower power consumption (300 $\mu$A/delay element) than the CML delay element.

This chapter discusses the design of the CBL based TDC, designed in time interval and multi-hit operating modes with the specifications listed in Table 6.1. A 100 MHz off-chip crystal clock is chosen as the reference for time stamping of transitions in the input '*start/stop/trigger*' signals. With this clock frequency, a 12-bit counter is chosen to achieve a time stamping range of $(2^{12} - 1) \times 10\,ns = 40.95\,\mu s$, which exceeds the required value of 32 $\mu s$.
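The choice of counter width can be reproduced with a few lines of arithmetic. The sketch below uses the reference period and the required range stated above; the function names are illustrative only.

```python
T_REF_NS = 10                # period of the 100 MHz reference clock
REQUIRED_RANGE_NS = 32_000   # required dynamic range of 32 us

def stamping_range_ns(bits, t_ref_ns=T_REF_NS):
    """Time stamping range of a free running counter with the given width."""
    return (2**bits - 1) * t_ref_ns

def min_counter_bits(required_ns, t_ref_ns=T_REF_NS):
    """Smallest counter width whose range covers required_ns."""
    bits = 1
    while stamping_range_ns(bits, t_ref_ns) < required_ns:
        bits += 1
    return bits

print(min_counter_bits(REQUIRED_RANGE_NS))  # 12
print(stamping_range_ns(12))                # 40950 ns = 40.95 us
```

An 11-bit counter would cover only 20.47 $\mu s$, so 12 bits is the smallest width that meets the 32 $\mu s$ requirement.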

Apart from the time stamper, the other crucial blocks of the TDC ASIC are the logic control block, the memory with its control logic, and the read-out interface. The time stamping block has been designed using an analog design approach to assure the timing accuracy. The control logic block, memory and read-out interface are designed using a digital design approach with automatic P & R tools. The top level integration of the custom and automatically designed blocks is planned to be carried out using a layout design editor tool.

| Specifications                   | Multi-hit TDC                                                                                   |
|----------------------------------|-------------------------------------------------------------------------------------------------|
| Bin Size                         | 150 ps @ 3.3 V (adjustable)                                                                     |
| Dynamic Range                    | 10 $\mu s$ / 20 $\mu s$ / 30 $\mu s$ / 40 $\mu s$ (Selectable)                                  |
| Number of Channels               | 8 in time interval mode, 4 in multi-hit mode                                                    |
| System Clock frequency           | 100 MHz                                                                                         |
| Number of bits in Fine Count     | 7 (after encoding)                                                                              |
| Number of bits in Coarse Count   | 12                                                                                              |
| Pulse width measurement          | 5 ns                                                                                            |
| Pulse Pair resolution            | 10 ns                                                                                           |
| Number of Events per Measurement | 4                                                                                               |
| Number of bits in read-out       | 25 bits including: 7-bit fine count + 12-bit coarse count + 1-bit 'mode' ID + 5-bit channel ID  |
| Operating Mode                   | Common stop                                                                                     |
| Calibration circuit              | Delay Lock Loop                                                                                 |

Table 6.1: Design specifications of multi-hit TDC

## 6.1 Architecture of TDC ASIC



Figure 6.1: TDC ASIC (a) block diagram (b) timing diagram

This TDC ASIC is designed to work in two measurement modes, 8-channel time interval and 4-channel multi-hit, by stamping the occurrence time of transitions in the input signals '*start*', '*stop*' and '*trigger*' with respect to the reference clock (100 MHz). It consists of the following blocks: logic control block, pre-processor, CBL based TDL along with DLL, transition detector, 12-bit coarse counter, 17 fine and 17 coarse registers, memory with its control logic, and read-out interface, as shown in Fig.6.1(a).

Figure 6.2: Timing diagram for generation of read command to transfer the TDC data to memory

The control logic block enables a 'clock' for the TDL and counter on an external 'event reset' signal to initialize the time stamping. Subsequently, it opens a 'dynamic range window' of selectable duration from 10 $\mu$s to 40 $\mu$s in steps of 10 $\mu$s using the 2-bit 'dr\_sel' signal. Within this window, the pre-processor block processes the input 'start', 'stop' and 'trigger' signals so that their transition times are stamped in both modes of the TDC.
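As a rough illustration of the window selection, the sketch below maps the 2-bit 'dr\_sel' value to a window duration and to the equivalent number of reference clock cycles. The encoding (0 for 10 $\mu$s up to 3 for 40 $\mu$s) is an assumption made here for illustration; the text only states that the four durations are selectable with two bits.

```python
T_REF_NS = 10  # period of the 100 MHz reference clock

def window_ns(dr_sel):
    """Dynamic range window duration for a 2-bit dr_sel (encoding assumed)."""
    if dr_sel not in (0, 1, 2, 3):
        raise ValueError("dr_sel is a 2-bit value")
    return (dr_sel + 1) * 10_000

def window_cycles(dr_sel):
    """Number of reference clock cycles spanned by the window."""
    return window_ns(dr_sel) // T_REF_NS

for sel in range(4):
    print(sel, window_ns(sel), window_cycles(sel))
```

Under this assumption the longest window spans 4000 clock cycles, which fits within the 4095 counts of a 12-bit counter.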

In time interval mode, each pre-processor block is dedicated to two channels of the TDC. In each channel, the rising edge transitions ($T_1$ and $T_2$) of '*start*' and '*stop*' are stamped independently with respect to the clock, as shown in Fig.6.1(b). The difference of their time stamped values gives the time interval between them.

In multi-hit mode, each pre-processor block is dedicated to a single channel of the TDC to distinguish the four transitions (1, 2, 3 & 4) in the '*start*' (or multi-hit) signal (Fig.6.1(b)). The arrival time of each transition is stamped independently with respect to the clock. The difference in the time stamped values of the first two transitions gives the pulse width of the '*start*' signal. The stamped times of the third and fourth transitions are used to find delayed events. The time of the rising edge transition in the '*trigger*' signal is also stamped with respect to the clock to find its associated transition events in '*start*' within the dynamic range window. The stamping of the trigger is helpful during event building in the INO experiment.

The time stamped data from each channel of the TDC is transferred to the inbuilt memory using the TDC channel interface logic. The data transfer is controlled by the control logic block. It issues a read command to the memory to read the channels only if '*start*' and '*stop*' are present (in time interval mode) or '*trigger*' is present (in multi-hit mode) within the dynamic range window. This is implemented by sampling the active high status of the '*end of conversion*' (eoc) signals, obtained from the measurement channel and the trigger channel, when the dynamic range window is over, as shown in Fig.6.2. In the absence of '*start*' or '*trigger*', the channel data in both modes is discarded. Also, in multi-hit mode, in the absence of a trigger, the control logic generates an event reset internally and reopens the dynamic range window to rescan for transition events in '*start*' that are associated with a trigger.

Further, the memory is interfaced with readout logic for TDC data transfer to external interface. The read-out logic consists of serial peripheral interface (SPI) and parallel interface. Any one of these two can be selected for external interface depending on the required measurement rate of TDC.

The resolution of the TDC depends on the CBL delay. The CBL delay is stabilized across PVT variations by the control voltage provided by the DLL. The DLL and time stamping blocks are placed in close proximity to each other in order to minimize the process induced CBL delay mismatch between the two blocks. Along with the DLL, an optional provision for CBL delay tuning using an off-chip reference voltage is designed and incorporated. To find the tuned CBL delay value, a calibration scheme is designed and implemented. The design aspects of time stamping, as well as the implementation of CBL delay tuning and calibration, are discussed in the following sections.

### 6.2 Design aspects of time stamping

In the time stamping block, the key design blocks are the coarse time measurement block (including the 12-bit counter, synchronizer and coarse registers), the fine time measurement block (including the TDL and fine registers), and the reference DLL. The occurrence time of a transition in '*start/stop/trigger*', processed by the pre-processor, is stamped in two parts, coarse and fine (refer Fig.6.1(b)). The coarse time, corresponding to the full clock cycles elapsed until the occurrence of the '*transition*', is measured using the 12-bit free running counter. The fine (fractional) time within one clock period ($T_{ref} = 10\,ns$) covered by the '*transition*' is measured using the TDL.

#### 6.2.1 Coarse time measurement

The dynamic range of the TDL is defined by the reference clock period (10 ns), so it cannot discriminate between the arrival times of two transitions separated by multiples of the reference clock period. The dynamic range can be improved by increasing the reference clock period. However, this requires a large number of stages in the TDL (refer equation (6.2)), which is power and area inefficient and increases nonlinearity in the time measurement. Alternatively, a 12-bit counter synchronized to the clock of the TDL is used to provide a time stamping range of 40 $\mu$s. The '*transition*' signal provided by the pre-processor samples and latches the count status '$N_c$' of the free running counter in the coarse register.

Reliable sampling and latching in the coarse register requires that the 12-bit counter data is settled before the transition arrives. Because the transition is asynchronous to the operation of the counter, a setup time violation can cause metastability in the coarse register. The metastable output of the coarse register is ambiguous and requires a long exponential settling time to reach a stable state. This may lead to a timing error due to an incorrect coarse count. In addition, the dead zone ($\approx$ 80 ps typical) of the D flip-flops can also cause an error in the coarse count. To avoid these issues, the counter status is sampled while it is idle, using a version of the 'transition' signal that is synchronized to the falling edge of the clock, as shown in Fig.6.3(a).

This synchronization scheme alone still causes a timing error of one coarse count when the 'transition' falls in the negative half of the clock. To address this, the coarse and fine counts need to be synchronized, along with sampling the counter status on the safe edge, so that both counts belong to the same clock cycle. This is implemented by a coarse count synchronizer based on the dual synchronization method [112], as shown in Fig.6.3(b). Here, the rising edge of the 'transition' signal provided by the pre-processor is synchronized to both the rising and falling edges of the clock, generating the latched signals '$Q_1$' and '$Q_2$'. A multiplexer selects the synchronized version $Q_1$ or $Q_2$ of the transition according to its position within the positive ('I') or negative ('II' and 'III') half cycle of the clock, as shown in Fig.6.3(c). Using the selected sample $Q_1$ or $Q_2$, 'syn\_tran' is generated at the falling edge of the clock to sample and latch the counter status $N_c$. Thus, the synchronizer samples and latches the counter status while it is idle, thereby avoiding timing violations in the coarse register. Also, the latched count of elapsed clock periods contains one extra count, irrespective of the event position (I, II or III) within a clock period.

The coarse count used is '$N_c - 2$', as two counts are subtracted: one due to the synchronization scheme and the other due to the addition of the fine time to the coarse time. Thus, the coarse time ($T_c$) of the 'transition' occurrence is given by:

$$T_c = (N_c - 2) \times T_{ref} \tag{6.1}$$



Figure 6.3: (a) Generation of syn\_tran signal on safe clock edge to sample and latch the counter status (b) schematic diagram of dual edge synchronizer (c) timing diagram of dual edge synchronizer

In the earlier reported work [108, 110], the dual counter method has been used, where two counters are synchronized to opposite phases of the clock. This also requires two coarse registers, one for the rising edge triggered counter and another for the falling edge triggered counter, so that at the time of occurrence of a transition at least one of the registers is in a stable state. The stable coarse register is selected with the help of the fine counter value.

In the present design, the coarse count synchronizer uses only four flip-flops with a multiplexer (Fig.6.3(b)) and requires a single coarse counter and register. Hence, it reduces the area and power consumption compared to the dual counter method, where two coarse registers are required for each '*transition*' signal.

#### 6.2.2 Fine time measurement

The TDL is realized by a chain of '$N$' cascaded CBL buffers, each with delay '$T_d$', which defines the resolution of time stamping. For a stabilized CBL delay of 174 ps, equation (6.2) shows that 58 CBL buffers are required for the TDL to cover a range of one clock period, $T_{ref} = 10\,ns$. However, to allow the least tunable CBL delay of 136 ps at $V_{dd} = 3.6\,V$ as the resolution for time stamping, 74 CBL delay elements are used in the TDL. It provides 74 uniformly delayed replicas of the applied '*clock*' signal with a time spacing of $T_d$. These delayed clocks are shared among 17 fine registers, 16 for the measurement channels (4 channels in multi-hit or 8 channels in time interval mode) and one for the trigger channel, to measure the fine time ($T_f$) of the transition signal. The fine time is measured by determining the number ($N_f$) of delayed clock edges provided by the TDL until the occurrence of the transition and is given by equation (6.3).

$$T_{ref} = N \times T_d \tag{6.2}$$

$$T_f = N_f \times T_d \tag{6.3}$$
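The tap counts quoted above follow from equation (6.2) when the result is rounded up to a whole number of buffers. A minimal sketch, using the delay values given in the text (the function names are illustrative only):

```python
T_REF_PS = 10_000  # reference clock period, 10 ns

def taps_needed(t_d_ps, t_ref_ps=T_REF_PS):
    """Smallest N such that N * T_d covers one clock period (eq. 6.2, rounded up)."""
    return (t_ref_ps + t_d_ps - 1) // t_d_ps  # integer ceiling division

def fine_time_ps(n_f, t_d_ps):
    """Fine time from equation (6.3)."""
    return n_f * t_d_ps

print(taps_needed(174))  # 58 buffers for the stabilized delay of 174 ps
print(taps_needed(136))  # 74 buffers for the least tunable delay of 136 ps
```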

In order to determine '$N_f$', D flip-flops are used as sampling and data storing elements. Based on the way of sampling, two architectures have been analyzed.

#### **First Architecture**

In the first architecture of time stamping, depicted in Fig.6.4(a), the rising edges of consecutive delayed clocks stamp the status of the 'transition' signal and store the data in an N-bit fine register ($Q_1, Q_2..Q_N$). When the 'transition' signal leads the $M^{th}$ ($0 \le M \le N$) delayed clock, the corresponding flip-flop is set to the logic-1 state, as shown in Fig.6.4(b). The subsequent $(M + 1)^{th}$ to $N^{th}$ delayed clocks set the remaining flip-flops to the logic-1 state. The output of the register is an N-bit thermometer code, where the count 'M' of the first logic '0-to-1' transition (provided by a thermometer decoder) multiplied by the resolution '$T_d$' gives the fine time.

A transition of the last bit $Q_N$ of the register is used as the end of conversion (eoc) signal. It stops further stamping of the 'transition' signal by the delayed clocks of the next clock cycle using either of two methods: disabling the delayed clocks or disabling the D flip-flops. However, in both methods, the processing logic has timing path delays higher than '$T_d$' that are uncontrollable across PVT variations. As a result, the first few bits of the thermometer code are overwritten by newly stamped values of the transition signal status (Fig.6.4(b)) belonging to the next clock cycle. Thus, this architecture is not suitable for measuring the time of a transition with respect to the previous rising edge of the clock.



Figure 6.4: First architecture of TDL: delayed clock edges sample the hit (a) schematic diagram (b) timing diagram



#### Second Architecture

Figure 6.5: Second architecture of TDL: the hit samples the delayed clocks (a) schematic diagram (b) timing diagram

In the second architecture of the TDL, shown in Fig.6.5(a), the asynchronous 'transition' signal stamps the state of consecutive delayed clocks. The functionality of this architecture is explained for N (= 8) delayed clocks $d_N$ (N = 1 to 8) covering the range of one clock period, as depicted in Fig.6.5(b). The notation $t_1$ to $t_6$ represents various moments of arrival of the asynchronous 'transition' signal within one clock period. The 'transition' signal occurring at the moment '$t_1$' stamps the status of the delayed clocks and stores the values in an 8-bit register. The count corresponding to the logic '1' to logic '0' transition, multiplied by '$T_d$', gives the arrival time of the transition signal with respect to the previous rising edge of the clock.

Thus, the second architecture is used in this design of time stamping. However, for a bin size corresponding to a unit delay $T_d > 136\,ps$ (that is, greater than $T_{ref}/N$), fewer than 74 delayed clocks cover the range of one clock period. This results in a repetition of the logic '1' to '0' transition in the fine register code, as illustrated in Fig.6.6 for values of $T_{ref} = 10\,ns$, N = 10 and $T_d = 1.66\,ns$. The fine register codes corresponding to the transition occurrence positions $t_0$, $t_1$ and $t_2$ have repeated logic '1' to '0' transitions. The count corresponding to the first logic '1' to '0' transition gives the correct measurement of the fine time ($T_f$). This provides adjustability of the resolution (bin size) with the help of an off-chip bias voltage.
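The repetition of the '1' to '0' transition can be reproduced with a small behavioural model using the values $T_{ref} = 10\,ns$, N = 10 and $T_d = 1.66\,ns$ quoted above. The model is only a sketch under stated assumptions: the clock has a 50 % duty cycle, tap $k$ is the clock delayed by exactly $k \times T_d$ with tap 0 being the undelayed clock, and the fine count is taken as the index of the last '1' before the first '1' to '0' transition. None of these conventions is taken from the design itself.

```python
T_REF_PS = 10_000  # reference clock period, 10 ns

def sample_taps(t_ps, n_taps, t_d_ps, t_ref_ps=T_REF_PS):
    """Fine register code latched by a transition arriving t_ps after a rising
    clock edge. Tap k is the clock delayed by k * t_d_ps (assumed 50 % duty)."""
    half = t_ref_ps // 2
    return [1 if (t_ps - k * t_d_ps) % t_ref_ps < half else 0
            for k in range(n_taps)]

def one_to_zero_positions(code):
    """Indices i where code[i] = 1 is followed by code[i + 1] = 0."""
    return [i for i in range(len(code) - 1) if code[i] == 1 and code[i + 1] == 0]

code = sample_taps(4_000, n_taps=10, t_d_ps=1_660)
positions = one_to_zero_positions(code)
print(code)       # [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
print(positions)  # [2, 8]: the transition pattern repeats
print(positions[0] * 1_660)  # 3320 ps, the bin that contains the 4000 ps arrival
```

Only the first position lies in the bin containing the true arrival time; the second is an image produced by taps whose total delay exceeds one clock period.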

To determine the count corresponding to the first logic '1' to '0' transition in the 74-bit fine register code, a transition detector with an encoder is designed. This block is enabled by the '$syn\_tran$' signal provided by the coarse count synchronizer, with a time margin of one clock period for the 74-bit fine register to settle. Here, the encoder reduces the fine count ($N_f$) provided by the transition detector to a 7-bit binary value.



Figure 6.6: Repetition in logic '1' to '0' transition in fine register code for bin size greater than the value of  $T_{ref}/N$ 

Thus, the arrival time (T) of each transition (refer Fig.6.1(b)) is given by equation(6.4) by combining coarse and fine times from equations(6.1) and (6.3).

$$T = (N_c - 2) \times T_{ref} + N_f \times T_d \tag{6.4}$$
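Equation (6.4) and the time interval obtained from two stamps can be illustrated as follows. The sketch assumes the stabilized unit delay of 174 ps; the coarse and fine counts are hypothetical example values, not measured data.

```python
T_REF_PS = 10_000  # reference clock period, 10 ns
T_D_PS = 174       # stabilized CBL unit delay

def arrival_time_ps(n_c, n_f, t_d_ps=T_D_PS, t_ref_ps=T_REF_PS):
    """Arrival time of a transition from equation (6.4), in picoseconds."""
    return (n_c - 2) * t_ref_ps + n_f * t_d_ps

# Hypothetical latched counts for a 'start' and a 'stop' transition
t_start = arrival_time_ps(n_c=102, n_f=20)
t_stop = arrival_time_ps(n_c=105, n_f=7)

print(t_start)           # 1003480
print(t_stop)            # 1031218
print(t_stop - t_start)  # 27738 ps between the two transitions
```

The same difference of two stamps gives the pulse width in multi-hit mode when applied to the first two transitions of the '*start*' signal.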

The logic '1' to '0' transition detector, shown in Fig.6.7, is based on a magnitude comparison between two consecutive bits Data[i] and Data[i+1] ($0 \le i \le (N-2)$) of the fine register. The '*cntflag*' register disables the transition detector once the first logic '1' to '0' transition has been detected. This is followed by code conversion to 7-bit binary data and the assertion of the '*eoc\_td*' signal. This signal is used to generate the '*latch\_transition*' signal, which latches the 25-bit data (12-bit coarse count, 7-bit fine count, 5-bit channel ID and 1-bit mode ID) in the register and provides the '*eoc*' signal for further data processing.
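A behavioural sketch of this detector and of the 25-bit word is given below. It is not the implemented logic: the count convention (index of the last '1'), the scan order and the placement of the fields inside the 25-bit word are assumptions made for illustration, since only the field widths are stated.

```python
def detect_first_one_to_zero(data):
    """Scan consecutive bit pairs and return (count, cntflag).
    cntflag blocks further updates once the first 1-to-0 transition is found."""
    cntflag = False
    count = 0
    for i in range(len(data) - 1):
        if not cntflag and data[i] > data[i + 1]:  # magnitude comparison: 1 > 0
            count = i
            cntflag = True
    return count, cntflag

def pack_word(coarse, fine, channel_id, mode_id):
    """Pack the fields into 25 bits. Assumed layout (MSB to LSB):
    1-bit mode ID, 5-bit channel ID, 12-bit coarse count, 7-bit fine count."""
    if not (0 <= coarse < 2**12 and 0 <= fine < 2**7
            and 0 <= channel_id < 2**5 and mode_id in (0, 1)):
        raise ValueError("field out of range")
    return (mode_id << 24) | (channel_id << 19) | (coarse << 7) | fine

fine_code = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]  # example code with a repeated transition
n_f, found = detect_first_one_to_zero(fine_code)
word = pack_word(coarse=102, fine=n_f, channel_id=3, mode_id=1)
print(n_f, found)  # 2 True
```

The flag ensures that the later repetition of the transition at index 8 does not overwrite the count obtained from the first one.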



Figure 6.7: Flow diagram of logic one-to-zero transition detector for 74 bits

## 6.3 Timing critical paths

In time stamping, the occurrence time of transitions in the input '*start/stop/trigger*' signals is measured using the reference clock and the '*transition*' signal provided by the pre-processor block. The clock and its delayed replicas are used as the reference for time measurement, and the '*transition*' signal sets the exact time at which the measurement is acquired. Therefore, these signals are distributed to both the coarse and fine time measurement blocks, and their time critical paths must be handled carefully during distribution to achieve high accuracy in measurement.

The timing paths of the 'transition' signal to the coarse count synchronizer and to the fine register should be identical, so that the coarse and fine register data belong to the same cycle of the reference clock. Variation in these timing paths can arise from asymmetric capacitive load and wire length while routing the signal to the two blocks. To avoid variation due to asymmetric loading, a buffer tree is designed, as shown in Fig.6.8, to generate the identical versions '*transition\_1*' to '*transition\_8*' of the '*transition*' signal dedicated to the coarse and fine blocks. Care has also been taken in the layout design to avoid asymmetric routing of these signals.

Moreover, mismatch among the delayed replicas of the clock manifests itself as non-linearity error in measurement. The sources of mismatch are systematic errors due to asymmetric layout drawing, non-identical load environment, and geometrical mismatches. To reduce the systematic error due to asymmetric layout and geometrical mismatches, care has been taken (discussed in layout section) in the layout drawing of TDL with fine registers. To maintain an identical load environment, dummy CBL buffers are placed at the beginning and end of TDL. Also, each tap is loaded by an identical standard cell buffer, as shown in Fig.6.9.



Figure 6.8: Schematic diagram of buffer tree used to avoid loading

The main consideration in choosing the standard cell buffer is its driving capability. Its strength is chosen so that seventeen channels of D flip-flops with an equivalent capacitive load of  $\approx 64$  fF can be driven properly. This keeps the transition time of the delayed clock small and ensures correct conversion in all channels of TDC. In this design, the standard cell buffer BUF4 (realized by two cascaded inverters) is chosen. The aspect ratios in its second inverter are  $6\mu m/0.35\mu m$  and  $12\mu m/0.35\mu m$ , which is sufficient to drive a load of 64 fF while maintaining a transition time of  $\approx 300 ps$.



Figure 6.9: Buffer loading on TDL to avoid its delay variation among the taps

#### 6.4 TDL unit delay calibration

The CBL delay in time stamping block is stabilized across PVT variations by the control voltage obtained from DLL, as shown in Fig.6.10. The DLL is enabled by applying a rising edge of '*start\_DLL*' signal in the 'start control circuit'. It provides a clock (clock\_TDL) for reference TDL, whose initial delay is set to minimum value (4.09 ns) by pre-setting the capacitor voltage through 'preset switch' to avoid false locking in DLL. The feedback loop consisting of phase detector, charge pump and filter capacitor, locks the TDL delay to the half clock period (5 ns) of reference clock (complementary clock\_TDL) across PVT variations in acquisition time of 124 cycles of clock. The attained stable voltage across filter capacitor is used to stabilize the CBL delay to 174 ps across PVT variations.

The half clock period locking in DLL is designed to reduce its acquisition time. To further reduce the acquisition time, the initial delay of reference TDL can be fixed nearer to the target value by 'preset switch'. Thus, the present architecture achieves better performance as compared to earlier reported architectures[108, 110], where the TDL delay with load of fine register channels is locked to reference clock period in long acquisition time.

Along with CBL delay stabilization, an option for delay tuning using an off-chip reference voltage is provided. It is selected by resetting the D flip-flops used in the start control circuit. This turns on the 'preset switch' and disables the DLL, so that the filter capacitor is charged to the off-chip reference voltage.

The voltage across filter capacitor is applied to TDL (time stamping block) for delay stabilization or tuning through bias circuit [refer chapter-7]. The purpose of bias circuit utilization is to avoid the loading of long delay lines on filter capacitor (C) by charge sharing with the parasitic capacitances of TDL.


The tuned value of CBL delay is determined by measuring on-chip generated time intervals of 10 ns and 5 ns between the '*calstart*' and '*calstop*' signals (Fig.6.1(a)). These time intervals are generated using the reference clock. The calibration mode is selected by the 1-bit 'cal' signal. The CBL delay  $T_d$  is given by the difference of the measured time intervals using equation (6.5), where  $N_{f1}$  and  $N_{f2}$  are the counts of fine registers corresponding to the 10 ns and 5 ns time intervals respectively. The calibrated CBL delay is used to calculate the final stamped time of the event using equation (6.4).

$$T_d = (10ns - 5ns)/(N_{f1} - N_{f2}) \tag{6.5}$$
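
As a minimal numerical sketch of equation (6.5), the calibration can be written as a small Python function. The function name and the example fine counts (60 and 30) are hypothetical and chosen only to illustrate the arithmetic; they are not measured values from this design.

```python
def calibrated_cbl_delay_ps(n_f1: int, n_f2: int,
                            t1_ns: float = 10.0, t2_ns: float = 5.0) -> float:
    """Equation (6.5): T_d = (t1 - t2) / (N_f1 - N_f2), returned in picoseconds."""
    if n_f1 == n_f2:
        raise ValueError("fine counts must differ to calibrate the delay")
    return (t1_ns - t2_ns) * 1000.0 / (n_f1 - n_f2)


# Hypothetical fine counts for the 10 ns and 5 ns calibration intervals.
t_d_ps = calibrated_cbl_delay_ps(n_f1=60, n_f2=30)
print(f"calibrated CBL delay: {t_d_ps:.1f} ps")
```

Taking the difference of two intervals cancels any fixed offset common to both measurements, which is why two calibration intervals are used rather than one.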



Figure 6.10: Block diagram of CBL delay element based DLL with start control circuit

### 6.5 Layout design aspects

In this section, the techniques used for the layout design of the time stamping block are discussed.

#### 6.5.1 Matching aspect

Matching in tapped delays is an extremely important issue to reduce non-linearity errors in time measurement. The sources of mismatch can be divided in two categories: random and systematic.

Random mismatch stems from microscopic fluctuations in dimensions, doping levels, oxide thickness and other parameters that influence component values. Random mismatch cannot be completely eliminated as it is intrinsic to the fabrication process. However, its impact on circuit performance can be reduced by selecting a large device area, as these variations are inversely proportional to the square root of device area. By choosing a large device area for the transistors in the CBL delay element, the impact of random variation on the uniformity of tapped delay can be reduced. However, the crucial specifications of the delay element such as minimum output voltage in logic zero, static current and speed also depend on the selection of width (W) and length (L) of transistors. Therefore, the optimum aspect ratio of transistors in the CBL delay element is chosen while considering all the above stated aspects.

In contrast, systematic mismatch originates from improper layout design, process biases, contact resistance, non-uniform current flow, un-identical environment, and temperature gradients. In the layout of TDL, maximum design efforts are made so that critical circuit components are not sensitive to systematic mismatch and are discussed below-

- Typically, transistors of different lengths and widths do not match completely. Therefore, a uniform channel length (0.35  $\mu$m) is used for all the transistors in the layout of CBL buffer as shown in Fig.6.11. Moreover, NMOS ( $M_1$  and  $M_2$ ) as well as PMOS ( $M_3$  and  $M_4$ ) transistors, which need to be matched in each delay stage to maintain the duty cycle of clock, are divided into multiple fingers with all fingers being of the same width and length. The fingers are interdigitated to reduce systematic mismatch among transistors ( $M_3$  and  $M_4$  or  $M_1$  and  $M_2$ ) of a specific fabrication run. A transistor with multiple fingers also has reduced junction area. This leads to a reduction in parasitics such as diffusion capacitance and gate resistance, which strongly affect the speed of the delay element.
- In the layout of CBL delay element, to ensure matching in both NMOS as well as in both PMOS transistors, NMOS and PMOS transistors are placed parallel to each other with same orientation.

- The track of bias voltage (Fig.6.11) is separated from the clock signals (in/out) to reduce the cross talk.
- To maintain the uniformity of delays among the taps of TDL, care is taken to keep the length of interconnects joining the delay elements identical as shown in Fig.6.12.
- The length of interconnect from the output of buffer to the channels of D flip-flop should also be identical on each tap to maintain the uniformity among delayed clocks. This is achieved by placing the standard cell buffer adjacent to its corresponding CBL buffer in the design of TDL (Fig.6.12). This manner of placement requires the same height for CBL buffer and standard cell buffer to avoid any misalignment in the route of supply rail. Also, on the taps of TDL the length of interconnects from buffers to their corresponding D flip-flops in 16 channels is made identical to maintain the uniformity among delayed clocks, as shown in Fig.6.13.
- The 'transition' signal provided by the pre-processor drives the clock input of 74 D flip-flops (74-bit fine register). Therefore, to reduce the skew in its timing path, a buffer tree is placed, as shown in Fig.6.14. Each buffer tree is dedicated to two 10-bit fine registers (20 D flip-flops) and is placed in local proximity between them to reduce the variation in buffered signals (transition\_1 and transition\_2).



Figure 6.11: Layout diagram of CBL delay element (refer Fig.6.5(a) for schematic diagram)



Figure 6.12: Zoomed view of layout of TDL showing adjacent placement of buffer and CBL delay element



Figure 6.13: Layout representation of TDL with 8-channels of 74-bit fine registers

### 6.5.2 Skew reduction in the latch signal

The layout of the 25-bit latch holding the time stamped data is designed to reduce the skew in the timing path of the 'latch\_transition' signal. The other concern is to reduce the length of the layout so that it fits in the local placement along with the other sub-blocks in the TDC channel. The layout is shown in Fig.6.15, where the 25 stages of the latch are divided in two parts and placed in two channels instead of one. The buffer tree is placed in between the two channels. Due to the parallel routing, the effective physical length of the 'latch\_transition' signal from buffer tree to last flip-flop of latch is reduced by one half for the same fan-out.



Figure 6.14: Scheme of buffer tree placement for 74-bit fine register



Figure 6.15: Layout diagram of 25-bit latch

## 6.6 Simulation results

In this section, the simulation results of multi-hit FLASH TDC channel are discussed. The functional and timing accuracy of design blocks have been verified by applying following tests-

# 6.6.1 Verification of post layout delay characteristic of CBL delay element across process corners

The post layout delay characteristic of CBL delay element is verified across five design process corners using Spectre simulator. Fig.6.16(a) shows the delay versus control voltage characteristics of CBL delay element with identical rising and falling edge delays. Fig.6.16(b) shows the delay characteristic across design process corners. On worst case slow corner (WS), the best attainable delay from CBL delay element is 200 ps. On typical (TYP) corner, the delay of CBL delay element is 150 ps at  $V_{dd} = 3.3V$  and control voltage of 0.3 V.



Figure 6.16: Delay versus control voltage characteristic of CBL delay element

#### 6.6.2 Calculation of theoretical RMS error

Various TDC intrinsic error sources including quantization error, random errors (process mismatch, device component noise, and supply variation), and systematic errors in layout drawings, degrade the precision of measurement. These error sources cannot be completely avoided; however their impact can be minimized by careful design techniques. The errors due to these sources are analyzed in the following simulations-

# 6.6.2.1. Tapped delay variation due to process induced local mismatch using Monte Carlo simulation

The impact of process induced local variations on the CBL buffers is uncorrelated and results in a random delay variation on the taps of TDL. This delay variation accumulates over the length of TDL and results in an integral non-linearity (INL) error. The expression for standard deviation of INL for the N-tapped delay line with a negative feedback closed loop system is given by[108]-

$$\sigma_{DLL}(i) = \sigma_{td} \times \sqrt{\frac{n}{N} \times (N-n)} \tag{6.6}$$

Where,  $\sigma_{td}$  is the standard deviation in the delay element due to local mismatch and 'n' is defined by the tap position 'i' along the delay chain by:  $n = Mod((i+1), N), 0 \le i < N$ .

Thus, the maximum INL error in the TDL with a closed loop system is at the middle of the delay line, for n=N/2:

$$\sigma_{DLL}\left(i\right) = \frac{\sigma_{td}}{2} \times \sqrt{N} \tag{6.7}$$

However, in this design, the TDL (N=74) in the measurement channel is not coupled with negative feedback loop. Therefore, the maximum INL exists at the end of delay line and its expression is given by:

$$\sigma_{TDLtd}\left(i\right) = \sigma_{td} \times \sqrt{N} \tag{6.8}$$

The tapped delay variation due to local mismatches is analyzed by mismatch simulation of the extracted netlist of TDL using the Monte Carlo statistical simulator for 100 runs. Fig.6.17(a) shows the standard deviation for unit delay of 166 ps on the taps of delay line. The standard deviation in tapped delays accumulated along the length of delay line is shown in Fig.6.17(b). The simulated maximum standard deviation in tapped delay, at the end (output of  $74^{th}$  tap) of TDL, is 20 ps. This value approximately matches the value of 23.22 ps calculated from equation (6.8) for  $\sigma_{td} = 2.7ps$  and N = 74.
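
Equations (6.6) and (6.8) can be evaluated with a few lines of Python to check the quoted figures. This is only a sketch of the arithmetic; the function names are hypothetical.

```python
import math


def sigma_inl_closed_loop(i: int, n_taps: int, sigma_td: float) -> float:
    """Equation (6.6): INL standard deviation at tap i of a DLL-locked delay line."""
    n = (i + 1) % n_taps
    return sigma_td * math.sqrt(n / n_taps * (n_taps - n))


def sigma_inl_open_line(n_taps: int, sigma_td: float) -> float:
    """Equation (6.8): accumulated standard deviation at the end of an open delay line."""
    return sigma_td * math.sqrt(n_taps)


N_TAPS = 74
SIGMA_TD_PS = 2.7

open_end_ps = sigma_inl_open_line(N_TAPS, SIGMA_TD_PS)          # end of the 74-tap line
closed_mid_ps = sigma_inl_closed_loop(36, N_TAPS, SIGMA_TD_PS)  # n = 37 = N/2
print(f"open line, last tap: {open_end_ps:.2f} ps")
print(f"closed loop, middle tap: {closed_mid_ps:.2f} ps")
```

For the same $\sigma_{td}$, the closed loop maximum of equation (6.7) is half the open line value of equation (6.8), since the feedback pins the accumulated error to zero at both ends of the line.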



Figure 6.17: (a) Variation in delay on the taps of TDL due to process mismatch (b) accumulated non-linearity error

# 6.6.2.2. Variation in tapped delays due to systematic errors caused in layout drawing of TDL, parasitic cross talk and device component noise

The other sources of tapped delay variations are device component noise, noise coupling through parasitics and variation in layout drawing. To find this variation, the extracted netlist of TDL is simulated with the Spectre simulator using transistor noise models. The delay of TDL is controlled by the reference voltage source available in the Cadence library. The maximum delay variation is 17 ps on the  $25^{th}$  tap of TDL, as shown in Fig.6.18. The RMS value of delay variation ( $\sigma_{NC}$ ) for TDL is 10 ps.



Figure 6.18: Tapped delay variation due to systematic variation in layout drawing, noise coupling through parasitics and device component noise

# 6.6.2.3. Tapped delay variation due to ripples in control voltage provided by DLL over 1000 cycles

The ripple in the control voltage provided by DLL, caused by non-idealities of the feedback loop, produces variation in the tapped delays of TDL. This delay variation is the same for all the delay elements; therefore, the standard deviation in tapped delay over multiple clock cycles increases linearly along the length of TDL. To evaluate this error, the delay of TDL is controlled by the control voltage provided by DLL and the whole block (TDL + DLL) is simulated for 1000 reference clock cycles. Fig.6.19(a) shows the delay variation on each tap of TDL. The standard deviation of accumulated delay variation is shown in Fig.6.19(b) and its value at the end of delay line is 0.14 ps.

By adding the above delay variations in quadrature, the total theoretical RMS error in time measurement can be calculated by[129]-

$$\sigma_{rms} = \sqrt{(\sigma_q)^2 + (\sigma_{TDLtd})^2 + (\sigma_{DLL})^2 + (\sigma_{NC})^2} \tag{6.9}$$

Where,  $\sigma_q$  is the RMS error due to quantization noise,  $\sigma_{TDLtd}$  is the RMS error due to process mismatch,  $\sigma_{DLL}$  is the RMS error due to non-ideality of DLL, and  $\sigma_{NC}$  is the RMS error due to noise coupling in parasitics, systematic error in layout drawing and device component noise.

The rms error due to quantization noise for resolution (LSB) of  $T_d = 166 ps$  is:

$$\sigma_q = \frac{T_d}{\sqrt{12}} = \frac{166}{\sqrt{12}} \ ps \approx 47.9 \ ps \tag{6.10}$$

Therefore, the total theoretical RMS error, equation (6.11), follows by substituting the above component values into equation (6.9).

Figure 6.19: (a) Standard deviation in delay for 1000 cycles of clock on the taps of TDL (b) standard deviation in accumulated delay along the length of TDL due to ripples in the control voltage provided by DLL
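
The quadrature sum of equation (6.9) can be evaluated numerically from the component values quoted in this section. The sketch below makes two assumptions: the quantization term is recomputed from $T_d = 166$ ps rather than taken from the printed figure, and the simulated mismatch value of 20 ps is used for $\sigma_{TDLtd}$ (the calculated value of 23.22 ps could be substituted instead). The function name is hypothetical.

```python
import math


def total_rms_ps(*components_ps: float) -> float:
    """Equation (6.9): root-sum-square combination of independent error terms."""
    return math.sqrt(sum(c * c for c in components_ps))


t_d_ps = 166.0
sigma_q = t_d_ps / math.sqrt(12)   # quantization error, equation (6.10)
sigma_tdl = 20.0                   # assumption: simulated mismatch value at tap 74
sigma_dll = 0.14                   # ripple in DLL control voltage
sigma_nc = 10.0                    # noise coupling, layout and device noise

sigma_rms = total_rms_ps(sigma_q, sigma_tdl, sigma_dll, sigma_nc)
print(f"sigma_q = {sigma_q:.2f} ps, sigma_rms = {sigma_rms:.2f} ps")
```

Because the terms add in quadrature, the quantization error dominates the total and the DLL ripple contribution is negligible.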

#### 6.6.3 Test for resolution (LSB) adjustability

| V<sub>ctrl</sub> (V) | V<sub>dd</sub> (V) | LSB T<sub>d</sub> (ps) | Fine count N<sub>f</sub> | Coarse count N<sub>c</sub> | Output time interval (ns) |
|--------------------------|------------------------|----------------------------------|------------------------------------|--------------------------------------|------------------------------------|
| 0.3                      | 3.6                    | 134                              | 39                                 | 6                                    | 45.226                             |
| 0.4                      | 3.6                    | 138                              | 37                                 | 6                                    | 45.106                             |
| 0.3                      | 3.3                    | 150                              | 34                                 | 6                                    | 45.100                             |
| 0.4                      | 3.3                    | 160                              | 32                                 | 6                                    | 45.120                             |
| 0.55                     | 3.3                    | 170.3                            | 30                                 | 6                                    | 45.100                             |
| 0.7                      | 3.3                    | 182                              | 28                                 | 6                                    | 45.096                             |
| 0.9                      | 3.3                    | 196                              | 26                                 | 6                                    | 45.096                             |
| 1.0                      | 3.3                    | 208                              | 24                                 | 6                                    | 44.992                             |
| 1.2                      | 3.3                    | 224                              | 22                                 | 6                                    | 44.928                             |

Table 6.2: Tuning of CBL delay (resolution) using reference voltage

The adjustability of the TDC resolution with respect to the off-chip reference voltage is verified by tuning the delay of the CBL buffer and measuring a constant time interval of 45.116 ns. Table 6.2 shows the fine count corresponding to each tuned value of CBL delay. For each delay value, which defines the resolution of TDC, the output time interval is in agreement with the applied time. This confirms that the resolution of TDC can be adjusted with the off-chip reference voltage.
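
The agreement claimed for Table 6.2 can be checked with a short Python sketch that rebuilds each output from the fine count and the LSB. The 40 ns coarse contribution used here is an assumption inferred from the tabulated outputs; it is not stated in this section, and the exact reconstruction is the one defined by equation (6.4). The helper name is hypothetical.

```python
# (V_ctrl [V], V_dd [V], LSB T_d [ps], fine count N_f) taken from Table 6.2
ROWS = [
    (0.3, 3.6, 134.0, 39),
    (0.4, 3.6, 138.0, 37),
    (0.3, 3.3, 150.0, 34),
    (0.4, 3.3, 160.0, 32),
    (0.55, 3.3, 170.3, 30),
    (0.7, 3.3, 182.0, 28),
    (0.9, 3.3, 196.0, 26),
    (1.0, 3.3, 208.0, 24),
    (1.2, 3.3, 224.0, 22),
]
APPLIED_NS = 45.116
COARSE_PART_NS = 40.0  # assumption: inferred from the tabulated outputs


def reconstructed_ns(t_d_ps: float, n_f: int,
                     coarse_ns: float = COARSE_PART_NS) -> float:
    """Coarse contribution plus fine count times LSB, in nanoseconds."""
    return coarse_ns + n_f * t_d_ps / 1000.0


for v_ctrl, v_dd, t_d_ps, n_f in ROWS:
    err_ps = (reconstructed_ns(t_d_ps, n_f) - APPLIED_NS) * 1000.0
    print(f"V_ctrl={v_ctrl:4.2f} V  T_d={t_d_ps:5.1f} ps  error={err_ps:7.1f} ps")
```

Under this assumption every row reproduces the applied interval to within one LSB of the corresponding setting.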

# 6.6.4 Input versus output time interval measurement characteristics

The performance of TDC channels is tested using Verilog test bench. The linearity of measured time characteristics is verified by applying linear sweep patterns of multi-hit in steps of 300 ps with respect to trigger over 20 ns range as shown in Fig.6.20(a). Fig.6.20(b) shows the plot between measured relative times of transitions versus applied time steps on typical (TYP) corner.



Figure 6.20: (a) Applied test pattern (b) output versus input time intervals on typical corner

The DNL and INL error plot derived from the time interval characteristic is shown in Fig.6.21. The maximum DNL error is 0.5 LSB and INL error is 0.8 LSB.
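
One common way to derive DNL and INL from a swept input versus output characteristic is sketched below in Python. The definitions used (INL as the deviation of each measured point from the applied time after removing the offset of the first point, DNL as the difference between successive INL values) are an assumption for illustration and are not necessarily the exact procedure used to produce Fig.6.21; the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def dnl_inl_lsb(applied_ps: Sequence[float], measured_ps: Sequence[float],
                lsb_ps: float) -> Tuple[List[float], List[float]]:
    """Return (DNL, INL) in LSB for a swept input/output characteristic."""
    if len(applied_ps) != len(measured_ps) or len(applied_ps) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    # INL: deviation of each measured point from the applied time,
    # referred to the first point so that a constant offset drops out.
    offset = measured_ps[0] - applied_ps[0]
    inl = [(m - a - offset) / lsb_ps for a, m in zip(applied_ps, measured_ps)]
    # DNL: error of each measured step relative to the applied step.
    dnl = [inl[k + 1] - inl[k] for k in range(len(inl) - 1)]
    return dnl, inl


# Hypothetical sweep in 300 ps steps measured with a 166 ps LSB.
applied = [0.0, 300.0, 600.0, 900.0]
measured = [0.0, 332.0, 664.0, 830.0]
dnl, inl = dnl_inl_lsb(applied, measured, lsb_ps=166.0)
print("max |DNL| =", max(abs(x) for x in dnl), "LSB")
print("max |INL| =", max(abs(x) for x in inl), "LSB")
```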



Figure 6.21: INL and DNL error on typical corner



Figure 6.22: Plot between relative time interval versus applied time interval on (a) WP corner (b) WS corner

The same test is also performed across slow (WS) and fast (WP) design process corners for the resolution of  $T_d = 200ps$  (at control voltage = 0.1 V) and  $T_d = 150ps$  (at control voltage = 1.5 V) respectively. The plots of measured characteristics are shown in Fig.6.22(a, b). The INL and DNL errors on TYP, WP and WS are less than 1 LSB as shown in Fig.6.21 and Fig.6.23.



Figure 6.23: INL and DNL error on (a) WP corner (b) WS corner

The performance over the full dynamic range of 40  $\mu$s is verified by setting the 2-bit dr\_sel to '11'. A linear sweep of multi-hit patterns with time steps of 223 ns with respect to trigger is applied to the TDC channel. Fig.6.24 shows the plot of relative time versus applied time step for transition-1 on TYP corner.



Figure 6.24: Plot between relative time interval versus applied time interval for transition 1 over 40  $\mu$ s range

# 6.7 Summary

This chapter discusses the design of a multi-hit TDC based on time stamping, where time interval measurement is carried out in two parts: coarse and fine. The coarse counter counts the number of full clock cycles until the arrival of a transition. The remaining fractional time is measured by the fine counter based on a tapped delay line (TDL), which is realized by the novel CBL delay element. The CBL delay element improves the resolution of time stamping as compared to that achievable in  $0.35\mu$ m CMOS technology. Also, this delay element has the salient feature of identical rising and falling edge transition delays controlled by a single bias. It maintains the duty cycle of the clock while passing through TDL and thereby avoids design complexity. Further, in order to measure multi-hit pulse widths of  $\approx 5$  ns, the architecture of the fine counter is modified so that a single fine register is dedicated to a single transition in the multi-hit signal and all fine registers share the TDL. Based on the way of sampling, two architectures of TDL are analyzed and simulated. The second architecture eliminates the issue of code overwrite existing in the first architecture and hence is utilized in this design. Also, this architecture coupled with a transition detector allows the bin size of TDC to be adjusted by an off-chip reference voltage. This is a crucial aspect, as the unit delay of TDL defines the bin size and is sensitive to process and operating condition variations; the adjustment thereby improves the robustness of the design.

To ensure the portability of TDC across process and operating condition variations for a fixed resolution, the CBL delay element is made immune to these variations with the help of a delay lock loop (DLL) designed in the reference channel. The control voltage provided by DLL tunes the delay of CBL in TDL. This method avoids the loading of 16 channels of fine registers on the control loop of DLL.

The matching of delays in the consecutive delayed clocks in TDL is crucial to reduce the non-linearity error in time measurement. To address this, care has been taken in the layout design of the fine counter including fine registers and TDL. To synchronize the coarse and fine counts, a new coarse count synchronizer based on dual edge synchronization is designed. It performs better in terms of power and area consumption than the earlier used dual counter method. Using these key design blocks, the architecture of this multi-hit TDC ASIC is designed for a maximum of four hits within a selectable dynamic range window in common stop mode. In the normal mode, this ASIC can be utilized as an 8-channel time interval measurement between two events 'start' and 'stop'.

# Chapter 7

# Design and Implementation of CBL Delay Element based DLL

# 7.1 Introduction

The TDC based on time stamping (as discussed in chapter-6) uses tapped delay line (TDL), realized by cascaded connection of CBL delay elements. The propagation delay of CBL delay element defines the resolution of time stamping and is sensitive to PVT variations. Therefore, to provide immunity to this delay variation, a delay lock loop (DLL) is designed in the reference channel. The control loop of DLL references the delay of TDL to the off-chip accurate reference clock and provides a requisite control voltage. This control voltage tunes the delay of CBL delay element in the time stamping block. Thereby, it provides robustness to time stamping across PVT variations.

In this part of the dissertation, the design and implementation of a DLL using the CBL delay element, with the salient features of smaller unit delay and wide lock range, is discussed.

# 7.2 Architecture of DLL

The designed architecture of DLL as shown in Fig.7.1, consists of key design blocks: start control circuit, reference TDL referred here as voltage controlled delay line (VCDL), phase detector (PD), charge pump (CP) with loop filter capacitor and bias circuit.

The start control circuit is designed to avoid the issue of lock failure and false harmonic locking by setting the initial delay of VCDL at minimum value before the commencement of loop correction.

The operation of DLL starts with the assertion of '*start\_DLL*' signal. It turns off the preset switch 'S' and enables the reference clock '*clock\_TDL*' inside VCDL. The delayed output clock  $D_{29}$  from VCDL is applied to PD, which compares the time of its rising edge with the rising edge of the reference clock '*complementary clock\_TDL*'. The PD converts the phase error (time interval between the rising edges) into a pulse '*UP*' or '*DN*' of equivalent duration. The output of PD controls the charge pump[195], which converts the phase error into an equivalent voltage change across the loop filter capacitor.

If phase error is negative ( $D_{29}$  lags  $A_{ref}$ ), the time duration error is given by the 'UP' signal, which controls the discharging of the filter capacitor by turning on the NMOS switch of the charge pump. The 'DN' signal keeps the PMOS switch off, allowing a change in control voltage equivalent to the duration of the 'UP' signal. Thus, the decrement of voltage across the filter capacitor leads to a decrease in delay of VCDL, so that the rising edge of  $D_{29}$  approaches the rising edge of the reference clock. After a few iterations of loop correction, the phase error theoretically becomes zero and consequently the control voltage becomes stable. The DLL achieves a lock point and the stable control voltage biases the delay of VCDL to half the reference clock period.

If phase error is positive ( $D_{29}$  leads  $A_{ref}$ ), it is given by the duration of the 'DN' signal, which enables the charging of the filter capacitor. Simultaneously, the 'UP' signal turns off the NMOS switch so that the control voltage increases in proportion to the duration of the 'DN' signal. After a few cycles of clock, the control voltage becomes stable and fixes the delay of VCDL to half the clock period. The salient features of this DLL are:

- The VCDL is implemented using CBL delay element, which is faster than the conventional current starved inverter in our target 0.35 μm CMOS process. This allows us to achieve the smallest unit delay of nearly 150 ps, comparable to the gate delay of used technology.
- The CBL delay element provides identical delays in both rising and falling edge transitions. This enables the control of both edge delays using single control loop (including PD, CP, loop filter and bias circuit) with the benefits of reduced design complexity and area consumption.
- The delay of VCDL is locked to half reference clock period (5 ns) instead of one clock period as in the conventional DLL (discussed in chapter-3). The benefit is reduced static power consumption in VCDL, as fewer delay elements are required. In addition, the acquisition time of DLL reduces. However, this leads to a reduction in the lock range of DLL, which is an acceptable compromise for our application.
- Wide delay regulation range of CBL delay element, which provides higher portability of DLL across fast process as well as lower temperature (< 27 °C) and high power supply condition (3.3 V to 3.6 V).



Figure 7.1: Block diagram of DLL
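
The locking behaviour described above can be illustrated with a first-order behavioural model in Python. This is a sketch under strong assumptions and not the implemented loop: the phase detector, charge pump, filter capacitor and VCDL gain are lumped into a single hypothetical `loop_gain` (fraction of the phase error corrected per clock cycle), whose value is arbitrary and not derived from the design. Only the preset delay of 4.09 ns and the 5 ns target are taken from the text.

```python
from typing import Optional, Tuple


def simulate_dll_lock(target_ns: float = 5.0,
                      preset_delay_ns: float = 4.09,
                      loop_gain: float = 0.05,
                      tol_ps: float = 1.0,
                      max_cycles: int = 1000) -> Tuple[Optional[int], float]:
    """Iterate a first-order DLL model; return (cycles_to_lock, final_delay_ns)."""
    delay_ns = preset_delay_ns
    for cycle in range(max_cycles):
        # Positive error: D29 leads the reference, so a 'DN' pulse charges the
        # filter capacitor and the VCDL delay grows. Negative error: an 'UP'
        # pulse discharges the capacitor and the delay shrinks.
        error_ns = target_ns - delay_ns
        if abs(error_ns) * 1000.0 <= tol_ps:
            return cycle, delay_ns
        delay_ns += loop_gain * error_ns  # lumped PD + charge pump + VCDL gain
    return None, delay_ns


cycles, final_delay_ns = simulate_dll_lock()
print(f"locked after {cycles} cycles at {final_delay_ns:.4f} ns")
```

Starting from the minimum delay means the loop approaches the target from one side only, which is the mechanism by which the preset avoids false locking; a preset closer to the target shortens the acquisition in this model.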

### 7.2.1 Description of design blocks

#### 7.2.1.1 Voltage controlled delay line (VCDL)

The VCDL in the DLL consists of 'N' cascaded CBL delay elements. The number N = 29 is calculated using equation (7.1), where  $T_{ref} = 10$  ns is the reference clock period and  $T_d = 174$  ps is the target unit delay. The CBL delay elements are loaded with standard cell buffers in the same way as in the 74-tap TDL of the measurement channel. To provide the same ambient environment for all delay elements, a dummy CBL delay element is used at the end of VCDL.

$$N = \frac{T_{ref}}{2 \times T_d} \tag{7.1}$$
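
A quick numerical check of equation (7.1) is given below in Python. Rounding to the nearest whole stage is an assumption, since the text does not state how the non-integer result is converted to N = 29; the function name is hypothetical.

```python
def vcdl_stage_count(t_ref_ns: float, t_d_ps: float) -> int:
    """Equation (7.1): N = T_ref / (2 * T_d), rounded to the nearest whole stage."""
    return round(t_ref_ns * 1000.0 / (2.0 * t_d_ps))


n_stages = vcdl_stage_count(10.0, 174.0)
# Delay per stage once the line of n_stages elements is locked to 5 ns.
locked_unit_delay_ps = 5000.0 / n_stages
print(f"N = {n_stages}, locked unit delay = {locked_unit_delay_ps:.1f} ps")
```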

#### 7.2.1.1(a) Design aspects of CBL delay element

The performance of delay element is expected to fulfill the following specifications-

- High speed
- Interfacing capability with standard cell delay element
- Low power consumption
- Low switching noise sensitivity to minimize jitter
- Small area
- Preserved polarity of applied signal
- Matching in both transition delays to maintain duty cycle of clock in TDL
- Wide delay regulation range

However, it is difficult to achieve all the above stated specifications from a particular architecture of delay element. Therefore, the choice of delay element is based on the specifications that matter most. In our application, the preferred specifications of the delay element are high speed, power and area efficiency, preserved polarity, low switching noise sensitivity and matched transition delays.

To achieve these specifications, in this work, a novel delay element based on current balanced logic is designed. Before discussing the delay element, the current balanced logic is introduced in the following paragraph-



Figure 7.2: Current balanced logic circuit

The current balanced logic [156] is a low noise logic family that reduces  $di/dt$  switching noise. The noise reduction technique aims to keep the supply current steady. This scheme modifies pseudo NMOS logic by adding the NMOS transistor  $M_2$  as shown in Fig.7.2. When the NMOS logic block is on, the output node '*Out*' is pulled down, which keeps  $M_2$  in the cutoff operating region and there is a static supply current through  $M_1$ . When the NMOS logic block is off,  $M_2$  is in saturation and draws the same supply current provided  $M_1$  and  $M_2$  are well matched.

If the NMOS logic block is replaced by an NMOS transistor with the input clock applied to its gate terminal, the circuit behaves like a logic inverter with delay introduced in the output rising edge transition. However, in this work the polarity of the applied clock needs to be preserved. Therefore, the architecture of the CBL inverter is modified by adding the PMOS transistor  $M_4$ , as shown in Fig.7.3 (a), and is referred to here as the modified CBL (MCBL) delay element. This circuit behaves like a buffer with identical propagation delays for both input edge transitions.



Figure 7.3: MCBL delay element (a) schematic Diagram (b) timing Diagram

The amount of charging current 'I' through the PMOS transistors  $M_3$  and  $M_4$  controls the rising and falling edge propagation delays respectively. The current 'I' is determined by the control voltage  $V_{ctrl}$ . Both rising  $T_{pr}$  and falling  $T_{pf}$  edge delays are identical provided-

- PMOS transistors  $M_3$  and  $M_4$  are matched
- NMOS transistors  $M_1$  and  $M_2$  are matched
- The nodes 'Out' and 'Outb' face the same capacitive load ( $C_{load}$ ).

As shown in Fig.7.3(b), when a rising edge transition of clock  $V_{in}$  is applied, the 'Outb' node is pulled down. The NMOS transistor  $M_2$  enters the cutoff region and PMOS  $M_3$  starts pulling up the node 'Out' by charging the load capacitance  $C_{load}$  using I. When a falling edge transition of  $V_{in}$  is applied,  $M_1$  is turned off. Subsequently  $M_4$  starts pulling up the node 'Outb'. This turns on  $M_2$ , which pulls down the node 'Out'. The pull down is fast as the aspect ratio 'W/L' of the NMOS transistors is designed to be higher than that of the PMOS transistors. The signals  $V_{in}$  and  $V_{out}$  are of the same polarity with identical rising  $T_{pr}$  and falling  $T_{pf}$  edge delays. The variation in the supply current is small as an equal current path is maintained in rising and falling edge transitions. This ensures current balancing and thereby reduced switching noise.



Figure 7.4: Schematic diagram of current starved inverter

This CBL delay element can be interfaced to the standard cells available in the PDK owing to its large output swing. It is fast as compared to the current starved inverter[187], shown in Fig.7.4, as the resistance of the charging path of the capacitive load  $C_{load}$  is less. There is only one transistor ( $M_3$  or  $M_4$ ) in the charging path for CBL, while in the current starved inverter there are two series connected transistors  $M_7$  and  $M_8$ . In the cascaded delay line, the MCBL delay element faces a single transistor ( $M_1$ ) load (gate capacitance) as compared to the two transistor ( $M_6$  and  $M_7$ ) load in the current starved inverter. This further reduces the propagation delay of the CBL delay element.

#### 7.2.1.1 (a.1) Analysis of the CBL delay element

This section describes the equation for propagation delay as well as output voltages  $V_{OL}$  (maximum output voltage in logic '0') and  $V_{OH}$  (minimum output voltage in logic '1').

#### **7.2.1.1 (a.1.1)** Calculation of $V_{OL}$ and $V_{OH}$

To ensure that the delay element can be interfaced with the standard cells, $V_{OL}$ and $V_{OH}$ must be calculated over the control voltage range. In the calculation of $V_{OL}$, it is assumed that $V_{in}$ is less than the threshold voltage $V_{tn}$ of $M_1$ ($V_{in} < V_{tn}$) (Fig.7.3). The NMOS transistor $M_1$ is turned off and consequently $M_2$ is turned on. There is a static current through $M_3$ and $M_2$. The expression of $V_{OL}$ is derived by applying KCL at node 'Out':

$$I_{M3}(saturation) = I_{M2}(linear)$$

$$\Rightarrow \frac{1}{2}(\mu_p)(C_{ox}) \left(\frac{W}{L}\right)_{M3} (V_{ctrl} - V_{dd} - V_{tp})^2 = (\mu_n)(C_{ox}) \left(\frac{W}{L}\right)_{M2} ((V_{dd} - V_{tn}) V_{out} - \frac{V_{out}^2}{2})$$
(7.3)

Where, $\mu_n$ and $\mu_p$ are the mobilities of electrons and holes in $cm^2/(V \cdot s)$, $C_{ox}$ is the oxide capacitance per unit area, 'W/L' is the aspect ratio of the transistors, and $V_{tn}$ and $V_{tp}$ are the threshold voltages of the NMOS and PMOS transistors respectively.

$$\Rightarrow A(V_{ctrl} - V_{dd} - V_{tp})^2 = 2(V_{dd} - V_{tn})V_{out} - V_{out}^2$$
(7.4)

Where,  $A = (\mu_p \times (W/L)_{M3})/(\mu_n \times (W/L)_{M2})$ 

The equation(7.4) can be rearranged in the form of quadratic equation-

$$V_{out}^2 - 2V_{out} \left( V_{dd} - V_{tn} \right) + A \left( V_{ctrl} - V_{dd} - V_{tp} \right)^2 = 0$$
(7.5)

The roots of standard quadratic equation  $ax^2 + bx + c = 0$  are given by:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
(7.6)

Equating the coefficients of equation (7.5) and standard quadratic equation and putting the values in equation (7.6)-

$$V_{out} = V_{dd} - V_{tn} \pm S \tag{7.7}$$

Where,  $S = \sqrt{(V_{dd} - V_{tn})^2 - A \times (V_{ctrl} - V_{dd} - V_{tp})^2}$ 

The expression for $V_{OL}$ is given by the negative root of the above quadratic equation, as the value of $V_{OL}$ should lie between 0 and $V_{dd}$.

$$V_{OL} = V_{dd} - V_{tn} - S \tag{7.8}$$

To calculate $V_{OH}$, the assumption is $V_{in} \gg V_{tn}$. Consequently, $M_2$ is turned off so that $I_{M2} = 0$. By applying KCL at node 'Out':

$$I_{M3}(saturation) = I_{M2}(linear) = 0$$
(7.9)

$$\frac{K_p W}{2L} \times \{2(V_{ctrl} - V_{dd} - V_{tp})(V_{out} - V_{dd}) - (V_{out} - V_{dd})^2\} = 0$$
(7.10)

The only valid solution of equation (7.10) is $V_{out} = V_{dd} = V_{OH}$. To verify the analytical equation for $V_{OL}$, the technological and operating parameters given in Table 7.1 are used. Fig.7.5 compares the simulated and analytical data for $V_{OL}$ over the control voltage $V_{ctrl}$. The maximum deviation is 20 mV. The values of $V_{OL}$ over the entire range of control voltage are much lower than the switching point (~1.4 V) of the standard cells. Therefore, the delay elements can be interfaced with the standard cells in the design of the DLL and the time stamping block.

| Parameter | Value | Parameter | Value |
|-----------|-------|-----------|-------|
| $\mu_n$ | $475 \times 10^{-4}$ m<sup>2</sup>/(V·s) | $\mu_p$ | $148 \times 10^{-4}$ m<sup>2</sup>/(V·s) |
| $V_{tn}$ | 0.5 V | $V_{tp}$ | -0.69 V |
| $C_{ox}$ | 4.4 fF/$\mu$m<sup>2</sup> | $V_{dd}$ | 3.3 V |
| $T$ | 27 °C (300 K) | $K_p$ | 0.0006512 F/(V·s) |
| $\lambda$ | 0.13 V<sup>-1</sup> | $C_{load}$ | 14 fF |
| $(W/L)_{M3} = (W/L)_{M4}$ | 1 $\mu$m / 0.35 $\mu$m | $(W/L)_{M1} = (W/L)_{M2}$ | 1.5 $\mu$m / 0.35 $\mu$m |

Table 7.1: Process parameters and their values
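The closed-form result in equation (7.8) can be evaluated numerically. The following is a minimal sketch, assuming the mobility, threshold, supply and aspect-ratio values of Table 7.1; the function name `v_ol` and the sampled control voltages are illustrative choices and are not taken from the thesis.

```python
import math

# Values copied from Table 7.1.
MU_N = 475e-4          # electron mobility, m^2/(V*s)
MU_P = 148e-4          # hole mobility, m^2/(V*s)
VDD, VTN, VTP = 3.3, 0.5, -0.69   # volts
WL_M3 = 1.0 / 0.35     # aspect ratio of PMOS M3
WL_M2 = 1.5 / 0.35     # aspect ratio of NMOS M2

# Ratio A defined below equation (7.4).
A = (MU_P * WL_M3) / (MU_N * WL_M2)


def v_ol(v_ctrl):
    """Low output level from equation (7.8), i.e. the negative root of (7.5)."""
    x = v_ctrl - VDD - VTP
    s = math.sqrt((VDD - VTN) ** 2 - A * x ** 2)
    return VDD - VTN - s


# Illustrative sweep points (hypothetical choice of sample voltages).
for v in (0.0, 0.5, 1.0, 1.5, 2.1):
    print(f"V_ctrl = {v:.1f} V -> V_OL = {v_ol(v) * 1e3:.1f} mV")
```

With these inputs the expression evaluates to roughly 0.27 V at $V_{ctrl}$ = 0 V and falls to about 10 mV at $V_{ctrl}$ = 2.1 V, which stays below the ~1.4 V switching point quoted above. These numbers are only the analytical values; they are not the simulated data of Fig.7.5.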



Figure 7.5: V<sub>OL</sub> versus V<sub>ctrl</sub> (Control Voltage)

#### 7.2.1.1 (a.1.2) Propagation delay equation

The aim is to find the rising edge propagation delay $T_{pr}$ of the CBL buffer (Fig.7.3(a)) with respect to the control voltage $V_{ctrl}$. When $V_{in}$ is at logic '1', $V_{outb}$ is pulled down to $V_{OL}$. Consequently, the NMOS transistor $M_2$ enters the cut-off region. In the calculation explained below, it is assumed that $M_2$ enters cut-off after a constant delay, which is not modeled in order to keep the calculation simple. Assuming negligible current through $M_2$, the current $I$ through $M_3$ charges the load capacitance $C_{load}$. The relation between the current $I$ and the delay $T_{pr}$ is given by:

$$T_{pr} = C_{load} \int_{V_{OL}}^{\frac{V_{dd}}{2}} \frac{dV_{out}}{I(V_{out})}$$
(7.11)

The channel length 'L' of the PMOS transistors $M_3$/$M_4$ is designed to be 0.35 $\mu$m, and the value of $V_{DSAT}$ is 1.7 V in the 0.35 $\mu$m AMS CMOS process. Therefore, it is a good approximation to assume velocity saturated behavior from $V_{OL}$ to $V_{dd}/2$. The current $I$ in the velocity saturation operating region is given by:

$$I = I_{sd} = -I_{ds} = -\frac{K_p}{2} \left( V_{ctrl} - V_{dd} - V_{tp} \right) \left( 1 + \lambda \left( V_{out} - V_{dd} \right) \right)$$
(7.12)

Where, $K_p$ is the gain factor of the PMOS transistor and $\lambda$ is the channel length modulation factor. This equation holds during the transition of the 'Out' node until it attains the voltage $V_{dd}/2$.

Substituting the expression of *I* in equation(7.11)-

$$T_{pr} = \frac{2 \times C_{load}}{K_p} \int_{V_{OL}}^{\frac{V_{dd}}{2}} \frac{dV_{out}}{(V_{dd} - V_{ctrl} + V_{tp}) \left(1 + \lambda \ (V_{out} - V_{dd})\right)}$$
(7.13)

$$\Rightarrow T_{pr} = \frac{2 \times C_{load}}{K_p \times (V_{dd} - V_{ctrl} + V_{tp})} \int_{V_{OL}}^{V_{dd}/2} \frac{dV_{out}}{(1 + \lambda \ (V_{out} - V_{dd}))}$$
(7.14)

$$\Rightarrow T_{pr} = K_1 \left[ \ln\left(1 + \lambda \left(V_{out} - V_{dd}\right)\right) \right]_{V_{OL}}^{\frac{V_{dd}}{2}}$$
(7.15)

Where,  $K_1 = \frac{2 \times C_{load}}{\lambda \times K_p \times (V_{dd} - V_{ctrl} + V_{tp})}$ 

$$\Rightarrow T_{pr} = K_1 \ln \left[ \frac{\left(1 + \lambda \left(\frac{V_{dd}}{2} - V_{dd}\right)\right)}{\left(1 + \lambda \left(V_{OL} - V_{dd}\right)\right)} \right]$$
(7.16)

Fig.7.6 compares the simulated and analytical data for the delay $T_{pr}$ over the control voltage. The analytical and simulated data match adequately over the usable control voltage range of 0 V to 2.1 V.
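Equation (7.16), together with the $V_{OL}$ expression of equation (7.8), can be evaluated directly. The sketch below assumes the Table 7.1 values, including $K_p$ exactly as listed there; the function names are illustrative. It is meant to show the trend of the analytical delay with control voltage, not to reproduce the simulated curve of Fig.7.6.

```python
import math

# Values copied from Table 7.1.
MU_N, MU_P = 475e-4, 148e-4        # m^2/(V*s)
VDD, VTN, VTP = 3.3, 0.5, -0.69    # volts
K_P = 0.0006512                    # PMOS gain factor as listed in Table 7.1
LAMBDA = 0.13                      # channel length modulation factor, 1/V
C_LOAD = 14e-15                    # load capacitance, F
WL_M3, WL_M2 = 1.0 / 0.35, 1.5 / 0.35

A = (MU_P * WL_M3) / (MU_N * WL_M2)


def v_ol(v_ctrl):
    """Low output level, equation (7.8)."""
    x = v_ctrl - VDD - VTP
    return VDD - VTN - math.sqrt((VDD - VTN) ** 2 - A * x ** 2)


def t_pr(v_ctrl):
    """Rising-edge propagation delay in seconds, equation (7.16)."""
    k1 = 2.0 * C_LOAD / (LAMBDA * K_P * (VDD - v_ctrl + VTP))
    upper = 1.0 + LAMBDA * (VDD / 2.0 - VDD)
    lower = 1.0 + LAMBDA * (v_ol(v_ctrl) - VDD)
    return k1 * math.log(upper / lower)


# Sweep over the usable range quoted in the text (0 V to 2.1 V).
sweep = [round(0.1 * i, 1) for i in range(22)]
delays = [t_pr(v) for v in sweep]
for v, d in zip(sweep, delays):
    print(f"V_ctrl = {v:.1f} V -> T_pr = {d * 1e12:.1f} ps")
```

In this model the delay grows monotonically with $V_{ctrl}$, because the charging current shrinks as $V_{ctrl}$ approaches $V_{dd} + V_{tp}$. The absolute values scale inversely with the assumed $K_p$.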

#### 7.2.1.2. Design of voltage bias circuit for CBL delay element

In the ideal loop dynamics of the DLL, the control voltage across the loop filter capacitor is applied to the VCDL (30 CBL delay elements) and the TDL (74 CBL delay elements) to regulate their delays. However, owing to charge sharing and loading, this strongly disturbs the profile of the control voltage. This disturbance manifests itself as jitter in the delay characteristics and may lead to lock failure. To avoid this, a bias circuit is placed between the loop filter capacitor and the delay lines. It works as an analog buffer and provides a dc voltage to regulate the delay lines. In addition, it extends the range of the control voltage from 0 V to 2 V (without bias) to 0 V to 3.2 V (with bias).

The schematic diagram is shown in Fig.7.7, where an NMOS transistor ($M_1$) in common-source configuration with a diode-connected load is chosen to generate the dc bias voltage $V_{bias}$ for the CBL delay elements.



Figure 7.6: Delay versus control voltage characteristic for CBL delay element



Figure 7.7: Schematic diagram of bias circuit

#### 7.2.1.2.a Analytical equation for V<sub>bias</sub>

To find the delay characteristic of the delay element with the bias circuit, the equation of $V_{bias}$ in terms of $V_{ctrl}$ must be derived. The derivation has two steps: first, the voltage $V_{bp}$ is expressed in terms of $V_{ctrl}$ by applying KCL at node-1; second, $V_{bias}$ is expressed in terms of $V_{bp}$ by applying KCL at node-2. Combining these two KCL equations gives the expression of $V_{bias}$ in terms of $V_{ctrl}$. The current equations used in the KCL at node-1 and node-2 depend on the operating regions of the transistors. Therefore, the operating regions of the four transistors ($M_1$, $M_2$, $M_3$ and $M_4$ in Fig.7.7) over the control voltage ($V_{ctrl}$) range from 0 to 3.3 V are given in Table 7.2.

| S.No. | $V_{ctrl}$                      | $M_1$         | $M_2$         | $M_3$         | $M_4$         |
|-------|---------------------------------|---------------|---------------|---------------|---------------|
| 1     | $V_{ctrl} < V_{tn}$             | Sub-threshold | Sub-threshold | Sub-threshold | Sub-threshold |
| 2     | $V_{tn} < V_{ctrl} \leq V_{SL}$ | Saturation    | Saturation    | Saturation    | Saturation    |
| 3     | $V_{ctrl} > V_{SL}$             | Linear        | Saturation    | Saturation    | Saturation    |

Table 7.2: Operating regions of MOS transistors in bias circuit

#### 7.2.1.2.(a.1). Sub-threshold operating region

The equation of drain current in sub-threshold region is given by[196]:

$$I_{sub} = I_0 \exp(\frac{V_{gs} - V_{tn}}{n \times V_T}) \times [1 - \exp(\frac{-V_{ds}}{V_T})]$$
(7.17)

Where, $I_0 = \mu_n \times C_{ox} \times \frac{W}{L} \times (n-1)V_T^2$ represents the drain current when the gate-to-source voltage ($V_{gs}$) is equal to the threshold voltage ($V_{tn}$). In the expression of $I_0$, 'n' is the sub-threshold slope, which is a technology dependent parameter. In 0.35 $\mu$m CMOS technology, its value is evaluated as 1.6. $V_T$ is the thermal voltage and its value is 25.8 mV at room temperature (300 K).

Applying KCL at node-1:

$$I_{subM1} = I_{subM2} \tag{7.18}$$

$$\Rightarrow I_{0M1} \exp\left(\frac{V_{ctrl} - V_{tn}}{n \times V_{T}}\right) = I_{0M2} \exp\left\{-\left(\frac{V_{bp} - V_{dd} - V_{tp}}{n \times V_{T}}\right)\right\}$$
(7.19)

(The effect of drain voltage is ignored in the drain current equation to simplify the analysis)

$$\Rightarrow \frac{I_{0M1}}{I_{0M2}} \exp\left(\frac{V_{ctrl} - V_{tn}}{n \times V_{T}}\right) = \exp\left\{-\left(\frac{V_{bp} - V_{dd} - V_{tp}}{n \times V_{T}}\right)\right\}$$
(7.20)

The above expression can be simplified to:

$$\Rightarrow V_{bp} = V_{dd} + V_{tp} + V_{tn} - n \times V_T \ln\left(\frac{I_{0M1}}{I_{0M2}}\right) - V_{ctrl}$$
(7.21)

Applying KCL at node-2:

$$I_{subM3} = I_{subM4} \tag{7.22}$$

$$I_{0M3} \exp\left(\frac{V_{\text{bias}} - V_{\text{tn}}}{n \times V_{\text{T}}}\right) = I_{0M4} \exp\{-\left(\frac{V_{\text{bp}} - V_{\text{dd}} - V_{\text{tp}}}{n \times V_{\text{T}}}\right)\}$$
(7.23)

$$\Rightarrow V_{\text{bias}} = V_{\text{dd}} + V_{\text{tp}} + V_{\text{tn}} - n \times V_{\text{T}} \ln \left(\frac{I_{0M3}}{I_{0M4}}\right) - V_{bp}$$
(7.24)

Putting expression of  $V_{bp}$  from equation (7.21) in the above equation-

$$V_{\text{bias}} = V_{\text{ctrl}} + n \times V_{\text{T}} \left[ \ln \left( \frac{I_{0\text{M1}}}{I_{0\text{M2}}} \right) - \ln \left( \frac{I_{0\text{M3}}}{I_{0\text{M4}}} \right) \right]$$
(7.25)

#### 7.2.1.2.(a.2). Saturation operating region

Applying KCL at node-1:

$$I_{satM1} = -I_{satM2} \tag{7.26}$$

$$\frac{1}{2}(\mu_p)(C_{ox})(W/L)_{M2}(V_{bp} - V_{dd} - V_{tp})^2 = -\frac{1}{2}\mu_n C_{ox}(W/L)_{M1}(V_{ctrl} - V_{tn})^2$$
(7.27)

$$\Rightarrow V_{bp} = V_{dd} + V_{tp} - \sqrt{A_1} (V_{ctrl} - V_{tn})$$
(7.28)

Where,  $A_1 = (\mu_n \times (W/L)_{M1})/(\mu_p \times (W/L)_{M2})$ 

Applying KCL at node-2:

$$I_{satM3} = -I_{satM4} \tag{7.29}$$

$$\Rightarrow \frac{1}{2}(\mu_p)(C_{ox})(W/L)_{M3}(V_{bp} - V_{dd} - V_{tp})^2 = -\frac{1}{2}\mu_n C_{ox}(W/L)_{M4}(V_{bias} - V_{tn})^2 \quad (7.30)$$

$$V_{\rm bias} = \sqrt{A_2} (V_{\rm dd} - V_{\rm bp} + V_{\rm tp}) + V_{\rm tn}$$
 (7.31)

Where,  $A_2 = (\mu_p \times (W/L)_{M4})/(\mu_n \times (W/L)_{M3})$

Putting the expression of $V_{bp}$ from equation (7.28) in equation (7.31):

$$V_{\text{bias}} = \sqrt{A_2} \left( \sqrt{A_1} \left( V_{\text{ctrl}} - V_{\text{tn}} \right) \right) + V_{\text{tn}}$$
(7.32)

#### **7.2.1.2.(a.3)** Expression of $V_{SL}$ when transistor $M_1$ switches to linear region

The drain voltage $V_{bp}$ of the driver transistor decreases as the control voltage $V_{ctrl}$ increases. Therefore, at a control voltage $V_{SL}$, the driver transistor $M_1$ switches from the saturation to the linear operating region. The expression of $V_{SL}$ is obtained by setting $V_{ctrl} = V_{SL}$ and replacing $V_{bp}$ by $V_{SL} - V_{tn}$ in equation (7.28):

$$V_{SL} = \frac{V_{dd}}{(\sqrt{A_1} + 1)} + V_{tn} + \frac{V_{tp}}{(\sqrt{A_1} + 1)}$$
(7.33)

#### 7.2.1.2.(a.4). Linear operating region

Here, the driver transistor  $M_1$  works in linear region and other transistors are operating in saturation region.

Applying KCL at node-1:

$$\frac{1}{2}(\mu_p)(C_{ox})(W/L)_{M2}(V_{bp}-V_{dd}-V_{tp})^2 = -\frac{1}{2}\mu_n C_{ox}(W/L)_{M1}\{2(V_{ctrl}-V_{tn})\times V_{bp}-V_{bp}^2\}$$
(7.34)

Solving the above equation for $V_{bp}$ and taking the negative root as the solution:

$$V_{\rm bp} = \frac{A_3 - \sqrt{A_3^2 - (A_1 + 1) \times (V_{\rm dd} + V_{\rm tp})^2}}{A_1 + 1}$$
(7.35)

Where,  $A_3 = V_{tp} + V_{dd} + A_1 \times V_{ctrl} - A_1 \times V_{tn}$ 

As the other transistors are in the saturation region, the expression of $V_{bias}$ in terms of $V_{bp}$ is given by equation (7.31). Substituting $V_{bp}$ from equation (7.35) in equation (7.31) gives the expression of $V_{bias}$ in terms of $V_{ctrl}$ in the linear operating region.

Using the parameters listed in Table 7.1 and the bias circuit aspect ratios $(W/L)_{M1} = 7\,\mu m/0.35\,\mu m$, $(W/L)_{M2} = 150\,\mu m/1\,\mu m$, $(W/L)_{M3} = 10\,\mu m/0.35\,\mu m$ and $(W/L)_{M4} = 150\,\mu m/1\,\mu m$, the plot of bias voltage versus control voltage is shown in Fig.7.8.



Figure 7.8: Bias voltage V<sub>bias</sub> versus control voltage (V<sub>ctrl</sub>) characteristic
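The piecewise relation between $V_{bias}$ and $V_{ctrl}$ for the saturation and linear regions can be sketched numerically from equations (7.28), (7.31), (7.33) and (7.35). The sketch below assumes the Table 7.1 parameters, the aspect ratios quoted above, and the definitions of $A_1$ and $A_2$ as written in the text. The sub-threshold branch (equation (7.25)) is left out because it needs the $I_0$ ratios, and the function names are illustrative.

```python
import math

# Values from Table 7.1 and the bias-circuit aspect ratios quoted in the text.
MU_N, MU_P = 475e-4, 148e-4
VDD, VTN, VTP = 3.3, 0.5, -0.69
WL = {"M1": 7 / 0.35, "M2": 150 / 1.0, "M3": 10 / 0.35, "M4": 150 / 1.0}

A1 = (MU_N * WL["M1"]) / (MU_P * WL["M2"])   # defined below equation (7.28)
A2 = (MU_P * WL["M4"]) / (MU_N * WL["M3"])   # defined below equation (7.31)
V_SL = (VDD + VTP) / (math.sqrt(A1) + 1.0) + VTN   # equation (7.33)


def v_bp_saturation(v_ctrl):
    """Equation (7.28): M1 in saturation."""
    return VDD + VTP - math.sqrt(A1) * (v_ctrl - VTN)


def v_bp_linear(v_ctrl):
    """Equation (7.35): M1 in the linear region (negative root)."""
    a3 = VTP + VDD + A1 * v_ctrl - A1 * VTN
    return (a3 - math.sqrt(a3 ** 2 - (A1 + 1.0) * (VDD + VTP) ** 2)) / (A1 + 1.0)


def v_bias(v_ctrl):
    """V_bias for V_tn <= V_ctrl <= V_dd using equation (7.31)."""
    if v_ctrl < VTN:
        raise ValueError("sub-threshold region is not covered by this sketch")
    v_bp = v_bp_saturation(v_ctrl) if v_ctrl <= V_SL else v_bp_linear(v_ctrl)
    return math.sqrt(A2) * (VDD - v_bp + VTP) + VTN


print(f"V_SL = {V_SL:.3f} V")
for v in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.3):
    print(f"V_ctrl = {v:.1f} V -> V_bias = {v_bias(v):.3f} V")
```

With these inputs $V_{SL}$ evaluates to about 2.08 V, and the two expressions for $V_{bp}$ coincide at that point, so the modeled $V_{bias}$ curve is continuous across the saturation-to-linear boundary.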

#### 7.2.1.2.(a.5). Equation of CBL delay with bias circuit

The equation of propagation delay with the bias circuit is found by replacing $V_{ctrl}$ by $V_{bias}$ and substituting its expression for the different operating regions. Rewriting equation (7.16) with the bias circuit:

$$\Rightarrow T_{pr} = K_1 \ln \left[ \frac{\left(1 + \lambda \left(\frac{V_{dd}}{2} - V_{dd}\right)\right)}{\left(1 + \lambda \left(V_{OL} - V_{dd}\right)\right)} \right]$$
(7.36)

Where,  $K_1 = \frac{2 \times C_{load}}{\lambda \times K_p \times (V_{dd} - V_{bias} + V_{tp})}$ 

Expression of constant  $K_1$  for bias circuit in sub-threshold operating region-

$$K_{1} = \frac{2 \times C_{load}}{\lambda K_{p} \left( V_{DD} - V_{bias} + V_{tp} \right)} = \frac{2 \times C_{load}}{\lambda K_{p} \left( V_{DD} - \left( V_{ctrl} + n \times V_{T} \left[ \ln \left( \frac{I_{0M1}}{I_{0M2}} \right) - \ln \left( \frac{I_{0M3}}{I_{0M4}} \right) \right] \right) + V_{tp} \right)}$$
(7.37)

Expression of constant  $K_1$  for bias circuit in saturation operating region using equation(7.32)-

$$K_{1} = \frac{2 \times C_{load}}{\lambda K_{p} \left( V_{DD} - V_{bias} + V_{tp} \right)} = \frac{2 \times C_{load}}{\lambda K_{p} \left( V_{DD} - \left( \sqrt{A_{2}} \left( \sqrt{A_{1}} \left( V_{ctrl} - V_{tn} \right) \right) + V_{tn} \right) + V_{tp} \right)}$$
(7.38)

Expression of constant  $K_1$  for driver transistor  $M_1$  in linear operating region using equation(7.31) and equation(7.35)-

$$K_{1} = \frac{2 \times C_{load}}{\lambda K_{p} \left( V_{DD} - \left( \sqrt{A_{2}} \left( V_{DD} - V_{bp} + V_{tp} \right) + V_{tn} \right) + V_{tp} \right)}$$
(7.39)

Where, 
$$V_{\rm bp} = \frac{A_3 - \sqrt{A_3^2 - (A_1 + 1) \times (V_{\rm dd} + V_{\rm tp})^2}}{A_1 + 1}$$
 and  $A_3 = V_{tp} + V_{dd} + A_1 \times V_{ctrl} - A_1 \times V_{tn}$ 

#### 7.2.1.3 Phase detector

In the design of DLLs and PLLs, the phase frequency detector (PFD) [refer chapter-3] is frequently used for phase error detection owing to its zero dead zone and large dynamic range. It is implemented using two D flip-flops and a NAND gate, as shown in Fig.??(a) (shaded block). During phase error detection between the reference and delayed clocks, the 'UP' and 'DN' output signals are at high level with different widths, and the difference between their widths is equivalent to the phase error.



Figure 7.9: Modified PFD (a) schematic diagram (b) timing diagram

Thus, when the phase error between the clocks is theoretically zero, the 'UP' and 'DN' signals have equal widths, set by the delay of the NAND gate in the reset path. This opens a direct current path from $V_{dd}$ to ground in the charge pump, resulting in static power consumption. Also, the charge pump withdraws and pumps the same amount of charge simultaneously from the filter capacitor to keep the control voltage steady. However, the mismatch between charging and discharging currents, charge injection in the switches, and the difference in turn-on delay of the respective switches driven by the 'UP' and 'DN' signals create a difference between the pumped and withdrawn charges. This causes periodic ripples in the control voltage, which manifest as jitter in the delay characteristics.

To avoid these issues, in this work the PFD is modified with the help of combinational logic, as shown in Fig.7.9, so that the reset delay does not contribute to the charge pump mechanism. Here, a positive phase error is represented by the pulse duration of the 'DN' signal while 'UP' is held at logic low level. A negative phase error is represented by the pulse duration of the 'UP' signal while 'DN' is held at logic high level.

Thus, modified PFD controls the switching in charge pump such that either charging or discharging of loop filter capacitor is carried out as per the requirement of loop correction dynamics.

However, the modified PFD has a dead zone of 90 ps, as shown in Fig.7.10. It is due to the limited response speed of the NAND gate used in the combinational logic. This error manifests as a static phase error (a difference in rising edge alignment) between the delayed and reference clocks after the loop has achieved the lock point. The impact of this error on the timing of each tap is reduced by the number of delay elements in the delay line. In this design, 29 delay elements are used, so the static phase error contributes an error of about 3 ps on each tap.



Figure 7.10: DC characteristics of modified PFD

#### 7.2.1.4 Charge pump and loop filter capacitor

In this design, a single-ended charge pump architecture [195][chapter-3] is chosen, as the design operating frequency is moderate (100 MHz) and low power is the preferred specification. The reference current is generated using off-chip external resistors, so that the current can be conveniently adjusted. Dummy transistors have been added to ensure matching of the currents. A simple capacitor is used as the loop filter, with gain expressed by $I_{cp}/SC$, where $I_{cp}$ is the charging and discharging current and $C$ is the capacitance of the filter capacitor.

### 7.3 Initialization and lock range

At the time of DLL initialization, the initial delay of the VCDL can prevent the DLL from reaching the correct lock point. For conventional DLLs, the following two conditions [189] have to be satisfied to achieve locking at one clock period:

$$0.5 \times T_{ref} < D_{min} < T_{ref} \tag{7.40}$$

$$T_{ref} < D_{max} < 1.5 \times T_{ref} \tag{7.41}$$

Where,  $T_{ref}$  is reference clock period,  $D_{min}$  is the minimum VCDL delay and  $D_{max}$  is maximum VCDL delay.
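Conditions (7.40) and (7.41) amount to a simple range check on the VCDL delay limits. The following is a minimal sketch of that check; the function name and the example delay values are hypothetical and serve only to illustrate the two inequalities for a 100 MHz reference ($T_{ref}$ = 10 ns).

```python
def satisfies_lock_conditions(t_ref, d_min, d_max):
    """Return True when both (7.40) and (7.41) hold for the given delays."""
    cond_min = 0.5 * t_ref < d_min < t_ref          # equation (7.40)
    cond_max = t_ref < d_max < 1.5 * t_ref          # equation (7.41)
    return cond_min and cond_max


T_REF = 10e-9  # reference period for a 100 MHz clock

# Hypothetical delay ranges, chosen only to exercise the check.
ok = satisfies_lock_conditions(T_REF, d_min=6e-9, d_max=12e-9)
too_short = satisfies_lock_conditions(T_REF, d_min=3e-9, d_max=12e-9)
too_long = satisfies_lock_conditions(T_REF, d_min=6e-9, d_max=16e-9)

print(ok, too_short, too_long)
```

A delay line whose range fails this check is not guaranteed to lock falsely, but it is exposed to the harmonic locking described next.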

If these conditions are not satisfied at the time of initialization, the DLL may lock to harmonics of the reference clock frequency, which is called *false harmonic locking*. The dynamic range of the phase detector plays an important role in this issue. If the initial phase difference is beyond its dynamic range, the phase detector detects a wrong phase error in the subsequent cycles, as shown in Fig.7.11. Here, the phase error between the reference ($A_{ref}$) and the delayed clock is nearly equal to the dynamic range ($T_{ref}$ - reset delay) of the phase detector, as $D_{min}$ is much less than $0.5\,T_{ref}$. Therefore, during this phase error detection, the phase detector is not able to detect the second edge of the delayed clock, which occurs during its resetting time. This is known as the *missing edge issue in phase detector*. As a consequence, the phase detector detects a wrong (negative) phase error in the subsequent cycles. This drives the control loop in the reverse direction for error correction. Eventually, the DLL locks to two clock periods (second harmonic frequency) of the reference clock and undergoes false harmonic locking. Similarly, if $D_{max}$ is much greater than $1.5\,T_{ref}$, such that the phase error is nearly equal to two clock periods, then a missing edge of the reference clock in the phase detector causes the DLL to lock to the third harmonic frequency of the reference clock.

Thus at initialization, the *lock range of DLL* refers to the maximum and minimum VCDL delays that enable the DLL to achieve a correct lock point while avoiding false harmonic locking. In this design, a 'start control circuit' is designed to avoid false harmonic locking. It sets the voltage of the filter capacitor to the minimum of the designed control voltage range before the loop correction is initialized. As a result, the delay of the VCDL starts from its minimum value and gradually increases until it reaches the target delay, thereby avoiding false harmonic locking.



Figure 7.11: Missing edge issue in DLL



Figure 7.12: Schematic diagram of start control circuit

Fig.7.12 shows the schematic of the start control circuit, where an off-chip reference voltage source is used to preset the voltage of the loop filter capacitor through the preset switch (S) before initialization of the DLL. This preset voltage can be adjusted to correspond to either the minimum delay or the target delay value. The operation of the DLL starts with the assertion of the '*start\_DLL*' signal. It turns off the preset switch and simultaneously enables the reference clock inside the VCDL. The delay introduced by the VCDL in the reference clock is minimum if the preset voltage is chosen to correspond to the minimum unit delay. In the other case, if the preset voltage corresponds to the target unit delay, the delayed clock will be nearer to the reference edge, and the acquisition time will be shorter than in the former case.

# 7.4 Analytical equation for loop dynamics

Two important performance metrics of a DLL are 'tracking time' (or 'acquisition time') and 'jitter'. The tracking time is the time required to respond to the input phase error, and jitter is the amount of phase error on the output when the input is constant. Both parameters are crucial, but the DLL design has to trade one off against the other. The tracking time is inversely related to the bandwidth ($\omega_{DLL}$) of the DLL, which is defined as the rate at which the DLL responds to a change in input phase. Thus a wide bandwidth leads to a small tracking time and faster phase error correction, but it degrades the loop stability.

Therefore, to reduce the jitter and achieve high loop stability, the design rule given by equation (7.42) has been reported in [189] and is followed in this design of the DLL.

$$\omega_{\rm DLL} \le \frac{1}{10} \omega_{\rm loop} \tag{7.42}$$

Where, the loop bandwidth $\omega_{loop}$ is defined as $2\pi f_{loop}$, where $f_{loop}$ is the loop frequency and is equal to one over the total delay ($t_{vcdl}$) around the loop.

$$\omega_{\text{loopmax}} = \frac{2\pi}{t_{\text{vcdl}}} \tag{7.43}$$

To satisfy equation (7.42), the DLL bandwidth must be evaluated in terms of the circuit design parameters. Therefore, the control loop is analyzed with the help of the basic control block diagram shown in Fig.7.13. Here, $\Phi_{in}$ and $\Phi_{out}$ represent the phases of the input and output clocks of the VCDL respectively.

The relationship between $\Phi_{in}$ and $\Phi_{out}$ can be expressed by:

$$\Phi_{\text{out}} = \Phi_{\text{in}} + t_{\text{VCDL}} \times 2\pi f \tag{7.44}$$



Figure 7.13: (a) Control model of DLL (b) simplified control model

Where $t_{VCDL}$ is the delay introduced by the VCDL in the input signal. The transfer function of the DLL from Fig.7.13(b) is given as:

$$\Phi_{\text{out}} = (\Phi_{\text{in}} - \Phi_{\text{out}}) K_{\text{PD}} K_{\text{CP}} K_{\text{VCDL}} \times 2\pi f$$
(7.45)

Where, $K_{PD}$ is the gain of the phase detector, $K_{CP}$ is the gain of the loop filter and $K_{VCDL}$ is the gain of the VCDL. The above equation can be rearranged as:

$$\Phi_{\text{out}} + \Phi_{\text{out}} K_{\text{PD}} K_{\text{CP}} K_{\text{VCDL}} \times 2\pi f = \Phi_{\text{in}} K_{\text{PD}} K_{\text{CP}} K_{\text{VCDL}} \times 2\pi f$$
(7.46)

$$\Rightarrow \Phi_{\text{out}} (1 + K_{\text{PD}} K_{\text{CP}} K_{\text{VCDL}} \times 2\pi f) = \Phi_{\text{in}} K_{\text{PD}} K_{\text{CP}} K_{\text{VCDL}} \times 2\pi f$$
(7.47)

$$\Rightarrow \frac{\Phi_{\text{out}}}{\Phi_{\text{in}}} = \frac{K_{\text{PD}}K_{\text{CP}}K_{\text{VCDL}} \times 2\pi f}{(1 + K_{\text{PD}}K_{\text{CP}}K_{\text{VCDL}} \times 2\pi f)}$$
(7.48)

$$\Rightarrow \frac{\Phi_{\text{out}}}{\Phi_{\text{in}}} = \frac{1}{\left(\frac{1}{K_{\text{PD}}K_{\text{CP}}K_{\text{VCDL}} \times 2\pi f} + 1\right)}$$
(7.49)

$$\Rightarrow \frac{\Phi_{\text{out}}}{\Phi_{\text{in}}} = \frac{1}{\left(\frac{S}{\omega_{\text{DLL}}} + 1\right)}$$
(7.50)

where,  $\omega_{DLL} = S \times K_{PD} \times K_{CP} \times K_{VCDL} \times 2\pi f$.

Putting the expressions of $\omega_{DLL}$ and $\omega_{loop}$ in equation (7.42):

$$S K_{PD} K_{CP} K_{VCDL} 2\pi f \le \frac{1}{10} \frac{4\pi}{T_{ref}} \qquad \text{where, } f = 1/T_{ref}$$
(7.51)

Substituting the expressions $K_{PD} = V_{dd}/2\pi$ and $K_{CP} = I_{CP}/SC$ in the above equation (7.51):

$$\frac{I_{cp}}{C} \le \frac{1}{10} \frac{4\pi}{T_{ref} \times V_{DD} \times K_{VCDL} \times f}$$
(7.52)

The above equation is used to choose the ratio of charge pump current to loop filter capacitance that keeps jitter low. It also implies that, to minimize jitter, the loop filter capacitor should be as large as possible and the charge pump current should be small. In this design, the delay characteristics of the CBL delay element with the bias circuit are non-linear, so the gain of the VCDL $K_{VCDL}$ is taken as an average gain of 11.403 ns per volt. The charge pump current is designed as 25 $\mu$A and the loop filter capacitor is chosen as 16 pF. With these designed values, the ratio $I_{cp}/C$ is $1.5 \times 10^6$, which is smaller than the right hand side value of $3 \times 10^7$ in equation (7.52), thereby satisfying the design rule.
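The numbers quoted for this design rule can be reproduced with a few lines of arithmetic. The sketch below plugs the stated design values into both sides of equation (7.52); $T_{ref}$ = 10 ns is assumed from the 100 MHz operating frequency, and the variable names are illustrative.

```python
import math

T_REF = 10e-9        # reference period, s (assumed from the 100 MHz clock)
F_REF = 1.0 / T_REF  # reference frequency, Hz
VDD = 3.3            # supply voltage, V
K_VCDL = 11.403e-9   # average VCDL gain, s/V (11.403 ns per volt)
I_CP = 25e-6         # charge pump current, A
C_LF = 16e-12        # loop filter capacitance, F

# Left and right hand sides of equation (7.52).
lhs = I_CP / C_LF
rhs = (1.0 / 10.0) * 4.0 * math.pi / (T_REF * VDD * K_VCDL * F_REF)

print(f"I_cp / C      = {lhs:.3e}")
print(f"RHS of (7.52) = {rhs:.3e}")
print("design rule satisfied:", lhs <= rhs)
```

With these inputs the left side is about $1.56 \times 10^6$ and the right side about $3.3 \times 10^7$, in line with the rounded values given in the text, so the rule holds with a margin of roughly a factor of 20.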

# 7.5 Floor plan and layout of DLL

The challenge in the layout of an analog DLL in a mixed signal design is to reduce the impact of substrate noise, supply noise, signal cross talk and mismatches. The placement of the DLL building blocks plays an important role in this.

The progression of the clock through the VCDL injects high frequency noise into the substrate as well as the supply rails. The straightforward way to reduce the impact of substrate noise is to keep a large separation between noise sensitive analog blocks and noisy digital blocks. Therefore, the VCDL is placed far from the loop filter capacitor and phase detector, as shown in Fig.7.14. The space between them is shielded by substrate contacts, which give substrate noise a low impedance path to ground. In addition, each block is isolated from the others by guard rings, which are substrate ties connected to ground. The highly noise sensitive filter capacitor is enclosed by a double guard ring.

To reduce the impact of supply noise, two sets of power rails have been designed: one is dedicated to the VCDL and phase detector, and the other to the charge pump and bias circuit.

To reduce cross talk, the tracks of analog signals such as the bias and control voltages are separated from the clock signal tracks. Also, the clock signal tracks are separated from the analog supply tracks. Where the control voltage track crosses the digital ground track, noise coupling is reduced by a grounded metal-2 layer sandwiched between the digital and analog tracks at the intersection as a shield.

The precautions taken in the design of the VCDL to reduce the mismatch among its taps are discussed in chapter-7. Also, the current mirror structures in the charge pump and bias circuit are laid out using the inter-digitization technique to reduce current mismatch.



Figure 7.14: Snapshot of designed layout of DLL

# 7.6 Performance evaluation

The DLL performance has been analyzed by simulating the extracted netlist with the Spectre simulator, using spice models of the MOS transistors.

In the first step, the CBL delay element is characterized. The area of the unit delay cell is 225 $\mu m^2$ and its static current is 300 $\mu$A. To find the delay versus control voltage characteristics of the delay element, a linear sweep of the dc control voltage from 0 to 3.3 V in steps of 0.1 V is applied to the bias circuit. The bias circuit provides the corresponding dc voltages, which in turn vary the edge transition delay of the clock signal applied at the input of the CBL delay element. Fig.7.15 shows the identical rising and falling edge transition delays with respect to the control voltage. The gain (change in delay per unit change in control voltage) of the delay element in the operating range of interest (0 to 1 V typical) is 0.2 ps/mV.

The transient performance of the VCDL is verified by applying a 100 MHz clock signal and simulating it for various values of the control voltage. Fig.7.16 shows the consecutive delayed clocks at the taps of the VCDL with full swing and identical transition delays.



Figure 7.15: Plot of CBL delay versus control voltage on typical corner (27 °C and 3.3 V)



Figure 7.16: Delayed clocks on typical corner @ 100 MHz

To verify the functionality of the voltage controlled delay line with the negative feedback loop, the preset voltage is fixed to zero volts to set the delay of the VCDL at its minimum value. In the first step, the performance is tested without the bias block, with the control voltage directly driving the VCDL. Fig.7.17 (a) shows the plot of control voltage versus simulation time on the typical corner. The control voltage has ripples of magnitude ~1.7 mV even after achieving the lock point. This is due to charge sharing between the filter capacitor and the parasitic capacitance (~500 fF) of the VCDL. The ripples in the control voltage manifest as jitter in the delay characteristic of the VCDL. To rectify the charge sharing of the loop filter capacitor, the performance of the DLL has been tested with the bias circuit included. Here, the control voltage drives a transistor with an input capacitance of 8 fF, which is negligible compared to the 16 pF of the filter capacitor, thereby minimizing the charge sharing. Fig.7.17 (b) shows the plot of control voltage versus simulation time on the typical corner with the bias circuit. The DLL locks the delay of the VCDL to 5 ns in 124 iterations (reference clock cycles). The control voltage has a periodic ripple of 40 $\mu$V, which does not significantly affect the delay of the delay element for the gain of 0.2 ps/mV in the operating range of interest.
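A first-order estimate shows why the smaller ripple matters. The sketch below multiplies each quoted ripple amplitude by the delay element gain of 0.2 ps/mV and also compares the capacitance loading the filter capacitor in both cases. Applying the same gain to the case without the bias circuit is a simplifying assumption made here for comparison only, and the names are illustrative.

```python
GAIN_PS_PER_MV = 0.2    # delay element gain in the operating range, ps/mV
C_FILTER_F = 16e-12     # loop filter capacitor, F


def delay_variation_ps(ripple_mv):
    """First-order delay variation of one delay element for a given ripple."""
    return GAIN_PS_PER_MV * ripple_mv


def loading_ratio(c_load_f):
    """Capacitance seen by the control node relative to the filter capacitor."""
    return c_load_f / C_FILTER_F


without_bias_ps = delay_variation_ps(1.7)     # ~1.7 mV ripple, no bias circuit
with_bias_ps = delay_variation_ps(0.040)      # 40 uV ripple, with bias circuit

ratio_without = loading_ratio(500e-15)        # ~500 fF VCDL parasitics
ratio_with = loading_ratio(8e-15)             # 8 fF bias transistor input

print(f"without bias: {without_bias_ps:.3f} ps per element, loading {ratio_without:.2%}")
print(f"with bias:    {with_bias_ps:.3f} ps per element, loading {ratio_with:.2%}")
```

Under this estimate the ripple-induced delay variation per element drops from about 0.34 ps to about 0.008 ps, while the capacitance loading the filter capacitor falls from about 3% of its value to about 0.05%.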



Figure 7.17: Control voltage profile of DLL (a) without bias circuit (b) with bias circuit

Fig.7.18 shows the output signals of the modified PFD during phase error correction. The phase error is represented by the down (DN) pulse, which leads to charging of the loop filter capacitor. The UP signal is maintained at zero volts and thereby disables the discharging path of the filter capacitor during phase error correction. When the lock point is achieved, the DN pulse vanishes and the signal settles at the supply voltage, which stabilizes the control voltage.



Figure 7.18: Phase detector output signal before and after locking of DLL

### 7.6.1 Characterization across process corners

To determine whether the target delay range of 170 ps to 200 ps exists across process corners, the delays have been characterized using SPICE models of the MOS transistors at the corner conditions provided by the commercial CMOS process, namely TYP (typical NMOS, typical PMOS), FF (fast NMOS, fast PMOS), SS (slow NMOS, slow PMOS), WO (fast NMOS, slow PMOS), and WZ (slow NMOS, fast PMOS). These transistor models represent the characteristics of the transistors at the extreme process corners. Fig.7.19 shows the plot of delay versus control voltage across the design process corners. On the WP corner, the maximum delay is much larger than the target delay range, which ensures the locking of the DLL at low temperature (0 °C) and the maximum supply voltage of 3.6 V. The WZ and WO characteristics also include the target delay range of 170 ps to 200 ps. However, on the worst-case slow corner, the smallest achievable delay is 200 ps, which restricts locking on the slow corner to a unit delay of 200 ps. Fig.7.20 shows the profile of control voltage versus acquisition time of the DLL across the design process corners.



Figure 7.19: Delay versus control voltage characteristic of CBL Delay element across design process corners



Figure 7.20: Profile of control voltage across design process corners

### 7.6.2 Static phase error

Static phase error refers to the phase difference between the output signal ( $D_{29}$ ) of the last stage of the VCDL and the input reference signal ( $A_{ref}$ ), after achieving the lock point. In the ideal case, when DLL achieves a lock point, the phases of these two signals should be perfectly matched. However, due to the limited phase resolution of the PD and CP, static phase error of ~ 114 ps exists in our design, which is shown in Fig.7.21. Table 7.3 shows the static phase errors of the DLL across the process corners.



Figure 7.21: DLL phase alignment on typical process @ 100 MHz after achievement of locking

### 7.6.3 Lock time

Lock time refers to the time interval that a DLL takes to achieve the lock point. Table 7.3 lists the lock times under different temperature and process variations. The longest lock time is 238 cycles of the reference clock (100 MHz) on the WP corner at 0 °C.
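The figures reported for the DLL can be converted into absolute time and phase units with a few lines of arithmetic. The sketch below assumes that one lock iteration equals one period of the 100 MHz reference clock and expresses the static phase error as a fraction of that period; the helper names are hypothetical.

```python
# Convert lock time (cycles) and static phase error (ps) into time and phase.
# Assumption: one lock iteration equals one period of the 100 MHz reference.

F_REF_HZ = 100e6
T_REF_NS = 1e9 / F_REF_HZ  # 10 ns reference period

# Lock times in reference clock cycles, as listed in Table 7.3.
LOCK_CYCLES = {"TYP": 124, "WP": 238, "WZ": 178, "WO": 79}


def lock_time_us(cycles: int) -> float:
    """Lock time in microseconds for a given number of reference cycles."""
    return cycles * T_REF_NS / 1000.0


def phase_error_deg(error_ps: float) -> float:
    """Static phase error expressed in degrees of the reference period."""
    return 360.0 * (error_ps / 1000.0) / T_REF_NS


lock_times_us = {corner: lock_time_us(n) for corner, n in LOCK_CYCLES.items()}
typ_phase_deg = phase_error_deg(114.0)  # ~114 ps on the typical corner

for corner, t_us in lock_times_us.items():
    print(f"{corner}: {t_us:.2f} us")
print(f"typical static phase error: {typ_phase_deg:.2f} degrees")
```

With these assumptions the longest lock time (WP corner) is 2.38 $\mu$s, and the typical static phase error is about 4 degrees of the reference period.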

Table 7.3: Achieved performance specifications of CBL delay lock loop

| Corners                 | TYP (27 °C)      | WP (0 °C)         | WZ (27 °C)        | WO (27 °C)       |
|-------------------------|------------------|-------------------|-------------------|------------------|
| Static Phase Error (ps) | 111(r)/113(f)    | 13.3(r)/216(f)    | 133(r)/71(f)      | 87(r)/10(f)      |
| Lock Time (cycles)      | 124              | 238               | 178               | 79               |
| Rising Edge Delay (ps)  | 168              | 158               | 164               | 162              |
| Falling Edge Delay (ps) | 163              | 155               | 161               | 164              |
| Lock Range              | 11 MHz to 95 MHz | 43 MHz to 100 MHz | 19 MHz to 105 MHz | 14 MHz to 90 MHz |

#### 7.7 Summary

This chapter discusses the design and implementation of a delay lock loop to regulate the CBL delays across PVT variations. Various design aspects of the CBL delay element, including propagation delay, output voltage swing and interface capability with the standard cells, are discussed and mathematically analyzed. The performance of the DLL is verified with and without the bias circuit, and it is found that the bias circuit is needed as a buffer to minimize the disturbance in the control voltage profile caused by charge sharing with the filter capacitor. The bias circuit is analyzed mathematically, and it is deduced that the range of the bias voltage can be increased by increasing the aspect ratio of the PMOS transistors. A wide bias voltage range contributes to a wide delay regulation range of the CBL delay element.

The PFD is modified in this design to reduce the impact of the 'charging and discharging current mismatch in the charge pump' as well as the 'time delay mismatch in enabling the charging and discharging paths' of the filter capacitor. Also, to avoid false harmonic locking in the DLL, a novel start control circuit is designed.

The stability of the control loop is analyzed mathematically to satisfy the design rule of the DLL, taken from the literature survey. Finally, this chapter concludes with the successful performance verification of the extracted netlist of the DLL using a SPICE simulator across the design process corners. The DLL is able to achieve the lock point across all design corners (apart from the worst-case slow corner) for the target unit delay of 174 ps. On the worst-case slow corner, it can achieve lock for a target unit delay of 200 ps. This limitation can be avoided by designing a faster CBL delay element. However, this increases the static current of the CBL delay element; therefore, in trading off speed against power, speed has been compromised on the slow corner.

## **Chapter 8**

## Characterization and Testing of Vernier TDC ASIC

#### 8.1 Experimental results

A prototype test board is designed to accept NIM, LVTTL, LVCMOS and LVDS '*start*' and '*stop*' inputs. The board also has an FPGA to generate programmable delays and a micro-controller to receive SPI data and send it to the PC. The TDC data acquisition and analysis software has been developed on the PC side using the '*LabWindows/CVI*' and '*CERN ROOT*' packages. The images of the test board and test setup are shown in Fig.8.1 and Fig.8.2.



Figure 8.1: Test board for testing the TDC ASIC



Figure 8.2: Test setup for testing the 4-channel TDC ASIC

#### 8.1.1 Test for functionality of time interval measurement circuit and readout logic

To test the functionality of the time measurement circuit and the SPI based readout logic, single-shot '*start*' and '*stop*' inputs generated by the FPGA are applied to a single channel of the TDC ASIC. Fig.8.3 depicts the slow and fast oscillator clock signals with the assertion of the '*eoc*' (output of PD) signal. Fig.8.4 depicts the SPI output over the '*MISO*' line corresponding to the applied input.



Figure 8.3: Slow and fast ring oscillator clocks and eoc signal on oscilloscope



Figure 8.4: Serial output of SPI over MISO on each falling edge of micro-controller clock

#### 8.1.2 Test for resolution by calibrating *T*<sub>oscst</sub> and *T*<sub>oscsp</sub>

The values of $T_{oscst}$, $T_{oscsp}$ and $\Delta T_d$ are obtained from calibration data in order to calculate the time interval '$\Delta T$' under measurement. The typical values of $T_{oscst}$, $T_{oscsp}$ and $\Delta T_d$ are 7.245 ns, 7.118 ns and 127 ps respectively. Over a large number of calibration measurement cycles, the value of $T_{oscst}$ is in the range 7.232 ns to 7.262 ns and $T_{oscsp}$ is in the range 7.106 ns to 7.136 ns. The difference of $T_{oscst}$ and $T_{oscsp}$ is in the range of 125 ps to 127 ps, which indicates that the LSB is nearly constant (to within ~ 3 ps). This is due to the differential nature of the Vernier technique: if $T_{oscst}$ and $T_{oscsp}$ vary by the same amount, the difference $\Delta T_d$ remains nearly constant.
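The differential behaviour described above can be illustrated with a minimal sketch: the LSB is computed as the difference of the two calibrated oscillator periods, and a common-mode drift applied to both periods leaves it unchanged. The drift value is hypothetical and chosen only for illustration.

```python
# LSB of a Vernier TDC as the difference of the two oscillator periods.
# The typical periods are taken from the calibration data quoted in the text;
# the drift value below is a made-up example of a common-mode PVT shift.

T_OSCST_NS = 7.245  # typical period of the start (slow) oscillator
T_OSCSP_NS = 7.118  # typical period of the stop (fast) oscillator


def lsb_ps(t_oscst_ns: float, t_oscsp_ns: float) -> float:
    """Vernier LSB (delta T_d) in picoseconds."""
    return (t_oscst_ns - t_oscsp_ns) * 1000.0


nominal_lsb = lsb_ps(T_OSCST_NS, T_OSCSP_NS)

drift_ns = 0.015  # hypothetical drift that shifts both periods equally
drifted_lsb = lsb_ps(T_OSCST_NS + drift_ns, T_OSCSP_NS + drift_ns)

print(f"nominal LSB: {nominal_lsb:.1f} ps")
print(f"LSB after common-mode drift: {drifted_lsb:.1f} ps")
```

Both values come out at about 127 ps, which is why the individual periods can move by tens of picoseconds while the measured LSB stays within a few picoseconds.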

#### 8.1.3 Test for precision

To measure the standard uncertainty (precision), fixed time intervals generated in the FPGA are applied repeatedly to the TDC ASIC, and a histogram of one million cumulative inputs is obtained. The measured standard deviation ($\sigma$) is less than 70 ps. Fig. 8.5(a) depicts the histogram when a 240 ns time interval is applied; the value of $\sigma$ is 52 ps. The same time intervals are also applied simultaneously to a commercial time interval counter. The time jitter measured using the commercial time interval counter is around 15 ps less than the $\sigma$ of the TDC. The precision is also tested with time intervals derived from cable delays. The histogram of the time measured by the TDC for a cable delay of 4 meters with fan-out logic is plotted in Fig.8.5 (b). The standard deviation ($\sigma$) obtained in this setup is 72 ps. The timing resolution of a $1 \text{ m} \times 1 \text{ m}$ RPC is also tested by interfacing this TDC with the front-end electronics (amplifier + discriminator) and is found to be 2.6 ns, as shown in Fig.8.5 (c), which is similar to that measured using the HPTDC[138].



Figure 8.5: (a) Standard deviation characterizing the precision of TDC with FPGA based inputs (b) precision of TDC for inputs derived from cable (c) RMS resolution (2.6 ns) for $1 \text{ m} \times 1 \text{ m}$ RPC tested with Vernier TDC ASIC

Fig.8.6 depicts the plot of precision of TDC versus applied time intervals. The variation in precision is 20 ps over a range of time intervals.



Figure 8.6: Precision of TDC over applied time intervals
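The precision figure in this test is the standard deviation of repeated measurements of one fixed interval. The sketch below reproduces that calculation on synthetic data: it assumes Gaussian jitter, uses 20000 samples instead of one million to keep the run short, and borrows the 240 ns interval and 52 ps sigma only as illustrative inputs. It is not the acquisition software used for the measurements.

```python
import random
import statistics

# Estimate single-shot precision as the standard deviation of repeated
# measurements of a fixed interval. The data here are synthetic: Gaussian
# jitter is an assumption, not a measured distribution.


def simulate_measurements(interval_ns: float, sigma_ps: float,
                          count: int, seed: int = 0) -> list:
    """Return `count` simulated readings of `interval_ns` with Gaussian jitter."""
    rng = random.Random(seed)
    sigma_ns = sigma_ps / 1000.0
    return [interval_ns + rng.gauss(0.0, sigma_ns) for _ in range(count)]


def precision_ps(samples_ns: list) -> float:
    """Sample standard deviation of the readings, in picoseconds."""
    return statistics.stdev(samples_ns) * 1000.0


samples = simulate_measurements(interval_ns=240.0, sigma_ps=52.0, count=20000)
mean_ns = statistics.mean(samples)
sigma_est_ps = precision_ps(samples)

print(f"mean interval: {mean_ns:.3f} ns")
print(f"estimated precision: {sigma_est_ps:.1f} ps")
```

Repeating the same calculation for each applied interval gives a precision-versus-interval curve of the kind plotted in Fig.8.6.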

#### 8.1.4 Test for linearity

A coarse programmable delay generator is designed and implemented in the FPGA to test the linearity of the TDC ASIC. The delay generator generates time intervals in linear steps of 5 ns from 0 to 1.25 $\mu$s. The plot of the time measured by the TDC versus the applied time is shown in Fig.8.7 (a). The Differential Non Linearity (DNL) derived from this linearity plot is shown in Fig.8.7 (b). The measured values of DNL and Integral Non Linearity (INL) are 375 ps and 300 ps respectively.



Figure 8.7: (a) Linearity plot of TDC (b) DNL plot for time interval (step size=5 ns) from 0 to 1.25  $\mu \rm s$ 
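One common way to derive DNL and INL from such a stepped sweep is sketched below. The definitions used here are assumptions, since the text does not state which convention was applied: DNL is taken as the deviation of each measured step from the nominal 5 ns step, and INL as the deviation of each reading from an ideal line starting at the first reading. The measured values are synthetic.

```python
# DNL and INL from a linearity sweep with a fixed applied step.
# Assumed definitions (not confirmed by the text):
#   DNL[i] = (measured[i+1] - measured[i]) - step
#   INL[i] = measured[i] - (measured[0] + i * step)

STEP_NS = 5.0  # applied step size of the delay generator


def dnl_ps(measured_ns: list, step_ns: float) -> list:
    """Deviation of each measured step from the nominal step, in ps."""
    return [((b - a) - step_ns) * 1000.0
            for a, b in zip(measured_ns, measured_ns[1:])]


def inl_ps(measured_ns: list, step_ns: float) -> list:
    """Deviation of each reading from the ideal straight line, in ps."""
    origin = measured_ns[0]
    return [(m - (origin + i * step_ns)) * 1000.0
            for i, m in enumerate(measured_ns)]


# Synthetic readings for applied intervals of 0, 5, 10, 15 and 20 ns.
measured = [0.0, 5.1, 10.0, 15.2, 20.0]

dnl = dnl_ps(measured, STEP_NS)
inl = inl_ps(measured, STEP_NS)
max_dnl = max(abs(x) for x in dnl)
max_inl = max(abs(x) for x in inl)

print(f"max |DNL| = {max_dnl:.0f} ps, max |INL| = {max_inl:.0f} ps")
```

With these definitions the INL at each point is the running sum of the DNL values before it, so a single figure for each can be quoted as the maximum absolute value over the sweep.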

#### 8.2 Highlights of standard cell based Vernier TDC ASIC

- The ASIC is area efficient (1.5 mm × 1 mm)
- The design proved to be low power compared to the earlier reported Vernier TDC ASIC in 0.35 μm CMOS process[143]
- The design implementation based on the standard cell approach is fully scalable to modern CMOS processes in terms of area and power
- The ASIC has an in-built digital time-period calibrator, which avoids the need for a power and area inefficient PLL
- SPI interface for data readout
- The resolution achieved is independent of the CMOS process due to the Vernier time measurement technique

#### 8.3 Achieved specifications

Table 8.1: Achieved specifications of Vernier TDC ASIC

| Parameter                      | Value              |
|--------------------------------|--------------------|
| Resolution                     | 126 ps (<200 ps)   |
| Dynamic Range                  | 1.4 $\mu$s         |
| Number of Channels             | 4                  |
| DNL error                      | 350 ps             |
| INL error                      | 300 ps             |
| Number of Bits in Fine Count   | 8                  |
| Number of Bits in Coarse Count | 9                  |
| Single Shot Precision          | 74 ps              |
| Area                           | 1.5 mm × 1 mm      |
| Power                          | 43 mW @ 3.3V/4-ch. |

#### 8.4 Summary

The TDC ASIC achieved 127 ps resolution, 1.8 $\mu$s dynamic range and 74 ps precision. An in-built robust calibration technique to compensate for PVT variations is successfully implemented and tested. The TDC has been successfully interfaced with SPI. This TDC ASIC is well suited for portable instrumentation due to its three main features: highly area-efficient design (1.5 mm × 1 mm), in-built periodic calibration and SPI interface. The achieved specifications are listed in Table 8.1.

## Part IV

Conclusions

## **Chapter 9**

### **Summary and Future Scope**

#### 9.1 The Work

The goal of this work is to design and develop the TDC ASICs with the features of: resolution better than 200 ps, dynamic range higher than 32  $\mu$ s, low power consumption (28 mW/channel) and multi-channel integration, to fulfill the time interval measurement requirements in the INO experiment.

In order to fulfill the design goals, a low power, 4-channel Vernier TDC ASIC with in-built SPI has been designed and fabricated using 0.35 $\mu$m CMOS technology. The implementation of this ASIC is based on the standard cells available in the PDK of the AMS 0.35 $\mu$m CMOS technology. The standard cell based design approach applied to a Vernier technique based TDC is a new initiative taken in this work. Also, to meet the power and area requirements, key design blocks such as the ring oscillators and the time-period calibration circuit are implemented digitally using standard cells. In this endeavor, event-triggerable ring oscillators with an assured slight difference in their frequencies have been designed using the 'different fan-out of standard cell and different feedback gate' technique. Also, to mitigate the impact of PVT variations on the accuracy of time measurement, a novel digital time-period calibrator circuit is conceptualized and designed.

The developed standard cell based ASIC is characterized, and the achieved RMS resolution of this ASIC is 74 ps. The design proved to be low power compared to the earlier reported (100 mW/channel) Vernier TDC ASIC in 0.35 $\mu$m CMOS process[143]. Also, the SPI based read-out scheme common to the four Vernier TDC channels is another feature over the earlier reported work, compared in Table 9.1. This TDC ASIC is well suited for portable instrumentation due to its three main features: highly area and power efficient design, in-built periodic calibration and SPI interface.

The achieved results of this ASIC prove that the standard cell based TDC design approach with the Vernier technique succeeds in achieving the requisite performance with reduced design complexity, time and effort.

| Vernier TDC        | [143]                | [144]         | This Work          | [142]      |
|--------------------|----------------------|---------------|--------------------|------------|
| Year               | 2007                 | 2011          | 2013               | 1971       |
| Dynamic Range      | >50 ns               | 1.8 $\mu$s    | 1.4 $\mu$s         | 50 ns      |
| LSB (ps)           | 37.5                 | 158           | 127                | 250        |
| Design Area (mm²)  | 0.222                | 199 versatile | 1.5 × 1            | N/A        |
| Power Consumption  | 150 mW per ch.       | 30.6 mW/ch.   | 48 mW/4-ch.        | N/A        |
| Used Technology    | 0.35 μm CMOS         | FPGA          | 0.35 μm CMOS       | MECL logic |
| Calibration Circuit | Dual phase lock loop | N/A           | Digital calibrator | N/A        |
| Channels           | N/A                  | N/A           | 4                  | N/A        |

Table 9.1: Performance comparison of Vernier TDC

#### The highlights of this ASIC are listed below:

- The ASIC is area efficient (1.5 mm × 1 mm)
- The design proved to be low power compared to the earlier reported Vernier TDC ASIC in 0.35 μm CMOS process [143]
- The design implementation based on the standard cell approach is fully scalable to modern CMOS processes with improved area and power efficiency
- The ASIC has an in-built digital time-period calibrator, which avoids the need for a power and area inefficient PLL
- The SPI interface reduces the pin count as well as the complexity of data readout
- The resolution achieved is independent of the CMOS process due to the Vernier technique

Further, to address added requirements such as the measurement of delayed event interactions as well as time over threshold (TOT), the Vernier TDC design is extended with multi-hit capabilities. In this endeavor, a novel architecture of a Vernier multi-hit TDC channel is conceptualized and implemented. Based on this, an 8-channel multi-hit Vernier TDC ASIC is designed and developed using readily available 0.35 $\mu$m CMOS technology. To suit the multiple needs of HEP experiments, this ASIC is designed in a generic way with multiple operating modes (normal time interval, common start and common stop).

In common start multi-hit mode, the trigger signal starts the measurement of the occurrence times of transitions in the multi-hit (discriminator) signal. In this mode, the Vernier ASIC is suitable if the 'multi-hit' signal arrives after the 'trigger' signal. In common stop mode, an 'event reset' signal is needed to start the occurrence time measurement of both the multi-hit and the trigger signals. This mode is suitable for accelerator based HEP experiments, where an 'event reset' signal is available. However, in the INO experiment, only the trigger and multi-hit signals are available to the TDC; therefore, if the trigger signal arrives after the multi-hit signal, neither mode is suitable. This motivated an analysis, through simulation, of the Vernier technique for time interval measurement with negative sign, i.e. when the 'start' signal arrives after the 'stop' signal. This endeavor proved successful, and the feature of negative time interval measurement in the Vernier technique is applied to the conceptualized architecture of the Vernier multi-hit TDC channel. This allows the same Vernier TDC channel to be used for both the common start and common stop modes of time interval measurement. However, a modification of the input signal processing logic and identical widths of the coarse and fine counters need to be incorporated in the design. Therefore, it is planned to incorporate the feature of negative time interval measurement in the second version of the multi-hit Vernier TDC ASIC.

Further, to store the occurrence times of four transitions in the multi-hit signal, an on-chip memory is designed. The complexity of memory access is reduced by designing its architecture based on FIFO scheme.

The TDC channels are designed using a manual P & R approach. The memory, interface logic and logic control blocks are designed using automatic P & R. The top-level integration of the TDC channels and the rest of the blocks is carried out using manual P & R. Before design sign-off, the performance of this TDC is validated through simulation of the post-layout netlist with extracted RC parasitics using SPICE and mixed-mode simulators. The achieved typical resolution is 100 ps, with multi-hit pulse width measurement of ~ 1 ns, RMS error of ~ 30.713 ps and programmable dynamic range with options of 10 $\mu$s/20 $\mu$s/30 $\mu$s/60 $\mu$s. At present, this ASIC is under fabrication.

The highlights of this ASIC design are listed below based on performance validation using simulation results:

- Standard cell based design, fully scalable to modern CMOS processes with improved area and power efficiency, leading to higher channel integration.
- Area of 3.3 mm × 3.3 mm for the 8-channel multi-hit ASIC.
- Multiple time measurement modes and optional parallel and SPI based data read-out to enhance the scope of ASIC application.
- Robust time-period and LSB calibration circuit with accuracy of 3 ps and 3.7 ps respectively.
- Parallel and separate calibration of measurement channels in the ASIC.
- Measurement of multi-hit pulse width of ~ 1 ns, independent of the conversion time of the Vernier technique.
- In-built FIFO based 17×256 bit memory with its control logic.

The Flash technique based TDC ASIC is designed and implemented for academic interest and for future performance comparison of the developed Flash and Vernier TDCs in terms of linearity and event rate. The Flash technique has the inherent limitation that its resolution depends on the absolute gate delay. The standard current starved inverter based delay element is limited to a smallest delay of $\sim 400$ ps in our target 0.35 $\mu$m CMOS technology. Therefore, in this work, the issue of the smallest delay needed to meet the resolution requirement is successfully mitigated by using a novel CBL based delay element. This CBL delay element attains a least propagation delay of 150 ps (@ 3.3 V). In addition, it features almost full voltage swing and relatively low static current (300 $\mu$A at 0.1 V) as well as identical rising and falling edge delays controlled by a single bias voltage, thereby avoiding design complexity. This delay element can be interfaced with the standard cells owing to its maximum $V_{OL}$ of 0.25 V for 0.1 V control voltage, and thus has the potential to be incorporated in the mixed signal DLL based Flash TDC design. In Table 9.2, the performance of the delay element is compared with other existing architectures in 0.35 $\mu$m CMOS technology.

The CBL delay element based FLASH TDC channel stamps the occurrence time in two parts: fine and coarse. The architecture of the TDL based fine counter is modified to achieve adjustability of the CBL delays, with a minimum value of 136 ps (@ 3.6 V and control voltage = 0.1 V), and multi-hit (start) pulse width measurement of ~ 1 ns. The 12-bit coarse counter, combined with the TDL through a novel synchronizer based on a dual edge synchronization scheme, provides a dynamic range of 40 $\mu$s.
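The quoted dynamic range follows directly from the coarse counter width and the reference clock period. The sketch below assumes that the coarse counter advances once per period of the 100 MHz reference clock and that a time stamp is formed by simply adding the fine interpolation to the coarse count; the actual combination in the ASIC passes through the dual edge synchronizer and may use a different sign convention, so `time_stamp_ns` is a hypothetical helper.

```python
# Dynamic range of a coarse counter and a simple coarse + fine time stamp.
# Assumptions: the coarse counter increments once per 100 MHz clock period,
# and fine bins add linearly to the coarse count.

COARSE_BITS = 12
T_CLK_NS = 10.0     # period of the 100 MHz reference clock
BIN_SIZE_PS = 150.0  # typical bin size of the CBL delay line


def dynamic_range_us(bits: int, t_clk_ns: float) -> float:
    """Time span covered by a `bits`-wide counter clocked every `t_clk_ns`."""
    return (2 ** bits) * t_clk_ns / 1000.0


def time_stamp_ns(coarse: int, fine: int,
                  t_clk_ns: float = T_CLK_NS,
                  bin_ps: float = BIN_SIZE_PS) -> float:
    """Hypothetical time stamp: coarse clock periods plus fine delay bins."""
    return coarse * t_clk_ns + fine * bin_ps / 1000.0


range_us = dynamic_range_us(COARSE_BITS, T_CLK_NS)
example_ns = time_stamp_ns(coarse=3, fine=10)

print(f"dynamic range: {range_us:.2f} us")
print(f"example time stamp: {example_ns:.1f} ns")
```

Under these assumptions a 12-bit counter covers 40.96 $\mu$s, consistent with the 40 $\mu$s figure given above.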

The memory and read-out interface block designed for the Vernier multi-hit ASIC is reused in this ASIC, as the data width of the FLASH TDC channels is compatible with the designed specifications of the memory and read-out interface block. This simplifies the data read-out in this ASIC.

A delay lock loop is also designed in the reference channel to stabilize the CBL delays in the time stamping block across PVT variations. The fundamental design blocks of the DLL, such as the bias circuit, phase detector and start control circuit, are modified to achieve high performance in terms of adjustable CBL delay and elimination of false harmonic locking and lock failure.

The performance of the time stamper along with the DLL has been validated in terms of its linearity and robustness across design process corners. It achieves a typical resolution of 150 ps at 3.3 V and control voltage = 0.3 V, and an RMS error of 48.644 ps. Table 9.3 shows the performance comparison with earlier reported TDCs based on the same technique.

The highlights of this ASIC design are listed below based on performance validation using simulation results:

- Mixed signal design, incorporating an analog DLL and a standard cell based digital time stamper
- Two provisions for CBL delay tuning: either an off-chip reference voltage or the control voltage provided by the on-chip DLL
- Bin size calibration mode to find the exact value of the CBL delay
- FIFO based 17×256 bit memory with its control logic
- Pulse width measurement of $\sim$ 1 ns, independent of the reference clock period
- Adjustable resolution better than 200 ps
- Time interval and common stop multi-hit measurement modes
- Optional parallel and SPI based data read-out to enhance the scope of ASIC application
- User programmable dynamic range from 10 $\mu$s to 40 $\mu$s in steps of 10 $\mu$s
- Theoretical RMS error of 48.644 ps in time stamping

| Delay Element    | This Work             | [182]                    | [189]              | [109]                      |
|------------------|-----------------------|--------------------------|--------------------|----------------------------|
| Type             | CBL with bias circuit | Current starved inverter | Differential (CML) | Source coupled logic       |
| Least Delay (ps) | 140                   | 244                      | 2500               | 29.3                       |
| Max Delay (ps)   | 680                   |                          | 16000              |                            |
| Power with DLL   | 25 mW @ 100 MHz       | 14 mW @ 32 MHz           | 132 mW @ 130 MHz   | 675 mW @ 3 V (without DLL) |
| N                | 29                    | 128                      | 10                 | 64                         |
| Swing            | Full                  | Full                     | Partial            | Partial                    |
| CMOS Process     | 0.35 μm               | 0.35 μm                  | 0.35 μm            | 0.35 μm                    |

Table 9.2: Performance comparison of CBL delay element

#### 9.2 Future scope

The future work is planned to be the development and characterization of multi-hit TDC prototype ASICs based on the Vernier and Flash techniques. Also, second versions of the Vernier ASICs based on the standard cell design approach can be developed in modern CMOS technologies with improved power and area efficiency. Utilizing this work and the experience of TDC development, we intend to develop time measurement instrumentation for the India-based Neutrino Observatory (INO) and for critical needs of the department.

Table 9.3: Performance comparison (\*TH code: Thermometer code, \*\*CSI: Current Starved Inverter)

\*\*\* Comparison has been made in terms of architecture of basic blocks and design specification using post layout (av-extracted) simulation results

| Parameters | This work\*\*\* | [110] | [135] | [113] | [114] | [136] | [182] |
|------------|-----------------|-------|-------|-------|-------|-------|-------|
| Architecture of DLL | Single DLL with single loop to control both edge delays | Single DLL | Array of DLL | Single DLL | PLL with ring oscillator | Array of DLL | Single DLL |
| N | 74 | 16 | 100 | 16 | 16 | 140 | 128 |
| Technology | 0.35 μm CMOS | 1 μm CMOS | 1 μm CMOS | 1 μm CMOS | 0.3 μm CMOS gate array | 0.7 μm CMOS | 0.35 μm CMOS |
| Bin Size (LSB) | 150 ps | 1.56 ns | 154 ps | 500 ps | 780 ps | 89.3 ps | 250 ps |
| Dynamic Range | Selectable (10 $\mu$s/20 $\mu$s/30 $\mu$s/40 $\mu$s) | 204.8 ms | | 4.2 ms | 100 $\mu$s | 3.2 $\mu$s | |
| Double Hit Resolution | $\sim 1$ ns | Single hit | Single hit | 8 ns | | Single hit | Single hit |
| Clock Frequency | 100 MHz | 40 MHz | 65 MHz | 125 MHz | 80 MHz | 80 MHz | 32 MHz |
| Architecture of Delay Element | Current Balanced delay element | CSI\*\* | CSI\*\* | CSI\*\* | CSI\*\* | CSI\*\* | CSI\*\* |
| Power | 50 mW | 10 mW | | | 500 mW | 800 mW | |
| Architecture of Coarse Count Latching Scheme | Dual Edge Synchronizer | Dual Counter | Not mentioned | Not mentioned | Not mentioned | Dual Counter | Not mentioned |
| Output Code of Fine Counter | Non-TH code corresponding to first '1' to '0' transition | \*TH code | \*TH code | Not mentioned | Not mentioned | \*TH code | Not mentioned |

Appendices

## Appendix A

#### A.1 Neutrino flavor oscillation

The neutrino flavor oscillation, proposed by B. Pontecorvo in the late 1950s, is a quantum mechanical phenomenon following the superposition principle. Here, the neutrino flavor eigenstates ($\nu_e$, $\nu_\mu$ and $\nu_\tau$) are linear superpositions of the mass eigenstates ($\nu_1$, $\nu_2$ and $\nu_3$). In a given flavor state, the mixture of mass eigenstates is parameterized by a set of mixing angles ($\theta_{12}$, $\theta_{23}$ and $\theta_{13}$). A large mixing angle implies a large mixing between the mass eigenstates. The coupling between flavor and mass eigenstates is represented as

$$\left(\begin{array}{c}\nu_e\\\nu_\mu\\\nu_\tau\end{array}\right) = U \left(\begin{array}{c}\nu_1\\\nu_2\\\nu_3\end{array}\right)$$

The mixing matrix U is a  $3 \times 3$  unitary matrix parameterized in terms of four independent variables as-

$$U = \begin{bmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{bmatrix} = U(\theta_{12}, \theta_{23}, \theta_{13}, \delta)$$

where $\delta$ is the phase characterizing possible CP (charge conjugation and parity) violation (matter and antimatter asymmetry in terms of charge and parity).

Flavor oscillation requires the mass eigenstates of neutrinos to be distinct from each other and also from the flavor eigenstates. At the time of neutrino production, the flavor eigenstate wave function is then a mixture of different mass eigenstates, which propagate in the medium with different speeds and become out of phase with each other. As a consequence, a neutrino emitted from the source with flavor $\alpha$, $\nu_{\alpha}$, is detected as a neutrino of flavor $\beta$, $\nu_{\beta}$, at the detector with a certain probability, given by equation (A.1) in vacuum. It depends on the distance (*L*) between the neutrino source and detector, the energy (*E*) of the neutrino and the mass-squared difference $\Delta m_{ij}^2 = m_i^2 - m_j^2$. Therefore, oscillation measurements determine the mass-squared differences rather than the absolute masses.

$$P_{\alpha\beta} = \delta_{\alpha\beta} - 4\sum_{i>j} \mathrm{Re}\left[U_{\alpha i}^{*}U_{\beta i}U_{\alpha j}U_{\beta j}^{*}\right]\sin^2\frac{\Delta m_{ij}^2 L}{4E} + 2\sum_{i>j} \mathrm{Im}\left[U_{\alpha i}^{*}U_{\beta i}U_{\alpha j}U_{\beta j}^{*}\right]\sin\frac{\Delta m_{ij}^2 L}{2E}$$
(A.1)

On the other hand, with massless neutrinos, the mass eigenstates would be degenerate and therefore would not go out of phase. Thus, the observation of neutrino flavor oscillation implies the existence of neutrino mass.
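The dependence on $L$, $E$ and $\Delta m^2$ is easiest to see in the two-flavor limit of the vacuum oscillation probability, which is not derived in this appendix but is the standard reduction of the general expression. The sketch below assumes a single mixing angle and a single mass-squared difference, with $\Delta m^2$ in eV², $L$ in km and $E$ in GeV; the factor 1.267 comes from the unit conversion, and the numerical inputs are illustrative only.

```python
import math

# Two-flavor vacuum oscillation probability (standard simplification):
#   P(alpha -> beta) = sin^2(2*theta) * sin^2(1.267 * dm2 * L / E)
# with dm2 in eV^2, L in km and E in GeV. Inputs below are illustrative.


def p_two_flavor(theta_rad: float, dm2_ev2: float,
                 l_km: float, e_gev: float) -> float:
    """Probability that a neutrino of one flavor is detected as the other."""
    phase = 1.267 * dm2_ev2 * l_km / e_gev
    return math.sin(2.0 * theta_rad) ** 2 * math.sin(phase) ** 2


DM2_EV2 = 2.5e-3  # example mass-squared difference
E_GEV = 1.0       # example neutrino energy

# Baseline at which the oscillation phase first reaches pi/2.
l_first_max_km = (math.pi / 2.0) / (1.267 * DM2_EV2 / E_GEV)

p_max = p_two_flavor(math.pi / 4.0, DM2_EV2, l_first_max_km, E_GEV)
p_massless = p_two_flavor(math.pi / 4.0, 0.0, l_first_max_km, E_GEV)

print(f"first oscillation maximum at L = {l_first_max_km:.0f} km")
print(f"P at maximal mixing: {p_max:.3f}")
print(f"P with zero mass-squared difference: {p_massless:.3f}")
```

The last line makes the point of the preceding paragraph explicit: with a vanishing mass-squared difference the probability is zero at every baseline.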

The oscillation phenomenon is more probable in a material medium than in vacuum, as predicted by Mikheyev, Smirnov and Wolfenstein (MSW) in 1985. The MSW effect amplifies the neutrino mixing and stems from the fact that the electron neutrino interacts differently with matter compared to the other neutrino flavors. In particular, $\nu_e$ undergoes both CC and NC elastic scattering with the ambient electrons and nuclei, while $\nu_{\mu}$ and $\nu_{\tau}$ have only neutral current interactions with electrons. This gives rise to an extra potential $V_e = \pm \sqrt{2}\, G_F N_e$, where $N_e$ is the electron density in matter, $G_F$ is the Fermi constant, and the positive (negative) sign applies to electron neutrinos (antineutrinos). Thus, even if the mixing angle in vacuum is small, matter effects can give rise to a large effective mixing angle under the condition

$$\sqrt{2}\, G_F N_e = \frac{\Delta m^2}{2E} \cos 2\theta \tag{A.2}$$

When this condition is satisfied, the effective mixing angle in matter reaches its maximum value of $\pi/4$; this is known as the MSW resonance [53, 54]. Thus, apart from the source-detector distance, the neutrino energy and the matter density, the neutrino oscillation probability depends on six independent parameters: two mass-squared differences ($\Delta m_{21}^2$, $\Delta m_{32}^2$), three mixing angles ($\theta_{12}$, $\theta_{23}$ and $\theta_{13}$) and the Dirac phase ($\delta$).
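To see why the condition in equation (A.2) corresponds to maximal mixing, it helps to write the effective mixing angle in matter explicitly. The expression below is the standard two-flavor result for constant matter density; the symbols $\theta_m$ and $A$ are introduced here only for illustration and are not used elsewhere in this work.

$$\sin^2 2\theta_m = \frac{\sin^2 2\theta}{\left(\cos 2\theta - \dfrac{A}{\Delta m^2}\right)^2 + \sin^2 2\theta}, \qquad A = 2\sqrt{2}\, G_F N_e E$$

Multiplying equation (A.2) by $2E$ gives $A = \Delta m^2 \cos 2\theta$, so the first term in the denominator vanishes and $\sin^2 2\theta_m = 1$, i.e. $\theta_m = \pi/4$, however small the vacuum angle $\theta$ is.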

#### A.2 Sources of neutrino particle

Naturally occurring neutrinos come from both terrestrial and extraterrestrial sources and span a wide range of energy and flux, as shown in Fig.(A.1). Due to their extremely low interaction cross-section, neutrinos serve as excellent messengers of the characteristic features of their source.



Figure A.1: Neutrino sources with cross section versus energy. The figure is adopted from Ref. [55]

#### A.2.1 Supernovae

A supernova, produced by the collapse of the core of a massive star, generates a huge flux of neutrinos. As neutrinos are neither deflected nor absorbed in space during their journey, they retain and carry energy and directional information about the innermost regions of the explosion. For instance, supernova SN1987A was detected by observing its neutrino burst with neutrino telescopes on Earth. The approximate neutrino flux observed from SN1987A was $10^{12}\ \mathrm{m^{-2}\,s^{-1}}$ [?]. This observation yielded pioneering results in neutrino astronomy.

#### A.2.2 Solar neutrinos

Eddington suggested that nuclear fusion is the dominant source of solar energy. Fusion in the Sun proceeds through the pp chain reactions listed in Table (A.1), as predicted by the standard solar model (SSM). The chain starts with protons (hydrogen nuclei) and results in the formation of alpha particles (helium-4 nuclei), electrons, positrons, photons and electron-type neutrinos. The energy distribution of the neutrinos produced by these reactions is shown in Fig.(A.2), where the sensitivity of various solar neutrino experiments, set by their CC thresholds, is indicated at the top. Since neutrinos interact only weakly with matter, they emerge virtually unscathed from their passage from the center to the surface of the Sun. They therefore carry practically all the information about solar energy production and are ideal probes for studying the interior of the Sun. The solar neutrino flux on the surface of the Earth facing the Sun is around 65 billion per cm$^2$ per second.

| Neutrino source           | Reaction                                                                            |
|---------------------------|-------------------------------------------------------------------------------------|
| pp neutrinos              | $p + p \rightarrow {}^{2}\mathrm{H} + e^{+} + \nu_e$                                |
| pep neutrinos             | $p + p + e^{-} \rightarrow {}^{2}\mathrm{H} + \nu_e$                                |
|                           | ${}^{2}\mathrm{H} + p \rightarrow {}^{3}\mathrm{He} + \gamma$                       |
|                           | ${}^{3}\mathrm{He} + {}^{3}\mathrm{He} \rightarrow p + p + \alpha$                  |
|                           | ${}^{3}\mathrm{He} + \alpha \rightarrow {}^{7}\mathrm{Be} + \gamma$                 |
| <sup>7</sup>Be neutrinos  | ${}^{7}\mathrm{Be} + e^{-} \rightarrow {}^{7}\mathrm{Li} + \nu_e$                   |
| <sup>8</sup>B neutrinos   | ${}^{8}\mathrm{B} \rightarrow {}^{8}\mathrm{Be}^{*} + e^{+} + \nu_e$                |
| hep neutrinos             | ${}^{3}\mathrm{He} + p \rightarrow {}^{4}\mathrm{He} + e^{+} + \nu_e$               |

Table A.1: Nuclear reactions in the Sun producing neutrinos



Figure A.2: Solar neutrino flux predicted by the SSM. The figure is adopted from Ref. [56]

#### A.2.3 Geologically produced neutrinos

Several isotopic constituents of the Earth, such as ${}^{238}U$, ${}^{232}Th$ and ${}^{40}K$, are naturally radioactive, and their decay is the major source of heat production inside the Earth's core. Their decay chains involve $\beta$ decay, leading to the production of anti-neutrinos or neutrinos. Precise measurement of the geo-neutrino flux and spectrum provides deep insight into the Earth's chemical composition and radiogenic heat production. The flux of geologically produced neutrinos is about $5 \times 10^{10}\ \mathrm{m^{-2}\,s^{-1}}$.

#### A.2.4 Nuclear reactors

Nuclear reactors are the major source of human-generated neutrinos. They produce a flux of electron anti-neutrinos through the $\beta$ decay of the neutron-rich daughter fragments produced in the fission process. An average nuclear fission releases about 200 MeV of energy, of which roughly 6% is radiated as neutrinos with peak energies of $\approx$ 3 to 7 MeV. A standard nuclear power reactor produces about $2 \times 10^{20}$ $\bar{\nu}_e$ per second per GW of thermal power.

#### A.2.5 Atmospheric neutrinos

Atmospheric neutrinos result from the interaction of cosmic rays (high-energy protons) with atomic nuclei in the Earth's atmosphere. This interaction produces showers of unstable mesons such as pions ($\pi^{\pm}$ and $\pi^{0}$). The charged pions, with a lifetime of $\sim$26 ns, decay during their journey from the atmosphere to the Earth and thereby produce muon-type neutrinos, as per the reactions given in Table (A.2). The muons, with a lifetime of 2.2 $\mu$s, further decay into electrons, electron-type neutrinos and muon-type neutrinos. Therefore, the muon-type neutrino flux is expected to be twice the electron-type neutrino flux. The energy of atmospheric neutrinos ranges from 100 MeV to 100 GeV; the spectrum peaks just below 1 GeV and falls thereafter faster than $1/E^{2}$.

#### A.2.6 Conventional beams (particle accelerators)

Conventional beams lie in the energy range of $\approx 100$ MeV to 10 GeV and consist of muon neutrinos with a small (less than 1%) contamination of other flavors. The electron neutrino ($\nu_e$) contamination of the beam limits an experiment's ability to observe $\nu_e$ appearance and hence to measure the mixing angle $\theta_{13}$. The J-PARC accelerator laboratory in Tokai, Japan is an example of a facility that produces artificial neutrino beams. The production technique is based on directing high-energy protons onto a target such as beryllium or carbon. The resulting positively charged mesons are collected, focused, and allowed to decay in a long decay pipe. After this decay, a reasonably collimated muon neutrino beam is obtained. A muon antineutrino beam is obtained by collecting negatively charged mesons instead of positively charged ones, with the help of a charge-sign-selecting magnetic device named a '*horn*'. Fluxes of neutrino beams are parameterized in terms of the number of protons on target (POT) per year. Conventional beams have about $10^{20}$ POT per year.

In this technique, the *muon-flavor neutrinos* are produced by the 2-body decay reactions given in Table (A.2). The *electron-flavor neutrinos* are produced through $K \rightarrow \pi \nu_e e$ ($K_{e3}$ decay) and through the decay of the muons produced in pion decay.

| Process           | Reactions                                                                                             |
|-------------------|-------------------------------------------------------------------------------------------------------|
| 2-body pion decay | $\pi^+ \rightarrow \mu^+ + \nu_\mu$ ; $\pi^- \rightarrow \mu^- + \bar{\nu}_\mu$                       |
| Muon decay        | $\mu^+ \rightarrow e^+ + \nu_e + \bar{\nu}_\mu$ ; $\mu^- \rightarrow e^- + \nu_\mu + \bar{\nu}_e$     |
| Ke3 decay         | $K^+ \to \pi^0 + e^+ + \nu_e$ ; $K^- \to \pi^0 + e^- + \bar{\nu}_e$ ; $K^0 \to \pi^- + e^+ + \nu_e$   |
| 2-body kaon decay | $K^+ \rightarrow \mu^+ + \nu_\mu$ ; $K^- \rightarrow \mu^- + \bar{\nu}_\mu$                           |

Table A.2: Sources of neutrino in atmosphere and accelerator experiments

#### A.2.7 Super-beam

Super-beams are technologically upgraded versions of conventional beams. Neutrinos in super-beams are generated using the 'off-axis technology' to produce a narrow-band beam, i.e. one whose energy spectrum has a sharp peak. In this technique, the source power is $10^{21}$ POT per year, which is higher by a factor of 10 to 50 compared to conventional beams.

#### A.2.8 Neutrino factory

Neutrino factories are based on muon storage rings, in which it will be possible to capture roughly $10^{20}$ muons (of either sign) per year. A muon storage ring has a racetrack shape, with long, parallel, straight sections connected at the ends by semicircular sections. Beams of high-energy accelerated muons ($\approx 20$ to 50 GeV) circulate in the storage ring and can be made to decay in the straight sections. These decays produce a well-collimated and intense neutrino beam. The composition and spectra of the neutrino beams are determined by the charge, momentum, and polarization of the stored muons. The beam consists of $\nu_{\mu}$ and $\bar{\nu}_{e}$ if the ring contains $\mu^{-}$, and of $\bar{\nu}_{\mu}$ and $\nu_{e}$ if the ring contains $\mu^{+}$.

#### A.3 Structure and Physics of RPC

The resistive plate chamber (RPC) [49, 50] is a type of spark chamber with resistive electrodes made of glass or Bakelite. These materials are chosen for their low cost, fine surface finish and ready availability.



Figure A.3: Structure of single gap RPC. This figure is adopted from Ref. [49]

The RPC has a planar geometry, as shown in Fig.(A.3), built from two electrodes each of dimension $1.84\,\mathrm{m} \times 1.84\,\mathrm{m}$, which provide a large coverage area for particle interaction. These electrodes, each 2 mm thick, are held 2 mm apart by T-shaped cylindrical spacers. A suitable gas mixture, which serves as the medium for particle interaction, flows continuously through this gap. The bulk resistivity of the electrodes is high ($\approx 10^9$ to $10^{12}\ \Omega$ cm), so that the effect of a muon interaction in the gaseous medium remains localized. During the interaction, the muon ionizes the gas molecules. To separate the ionization charge (electron-ion pairs) and to create avalanche multiplication, a uniform electric field (5 kV/mm) is applied across the gap by charging the electrodes from a high-voltage (HV) source. This voltage is distributed uniformly over the electrode surface through a thin layer of graphite, which provides a surface resistivity of 100-200 k$\Omega/cm^2$. The surface resistivity of the graphite coating is high enough to render it transparent to the electric pulses generated by the electron-ion drift in the gas gap. These pulses are induced on metallic pickup strips by capacitive coupling to the gas gap; the strips are mounted on the electrode surface and separated from it by an insulating layer of Mylar.

To implement the detector read-out scheme, two sets of 64 pickup strips each, oriented orthogonally to each other, are placed on the top and bottom surfaces of the RPC respectively. These strips behave like transmission lines with a typical characteristic impedance of about 50 $\Omega$ and serve as the 128 detector channels of the RPC. The center-to-center distance of the pickup strips is 30 mm, which defines the spatial resolution of the RPC (discussed in section A.3.2).

#### A.3.1 Operating modes of RPC

The gas filling the detector determines whether the RPC works in avalanche or streamer mode [48, 49, 50], resulting in different performance characteristics. The gas is a mixture of argon, isobutane, and the electronegative gas Freon (R134a), in ratios that depend on the required operating region of the RPC. Argon acts as the target for the ionizing particle. Isobutane is a polyatomic gas with a high absorption probability for the photons that result from the recombination of ion pairs. Known as the quenching gas, it helps to prevent secondary ionization caused by photons in other parts of the RPC. The electronegative Freon limits the number of avalanche electrons, which avoids the onset of streamers.



Figure A.4: Functionality of single gap RPC. This Figure is adopted from Ref.[48]

In the avalanche mode of operation, shown in Fig.(A.4), the primary electrons and ions produced by the interaction of the muon with the gas medium are accelerated by the electric field. This cluster of charge triggers an avalanche of electrons in the externally applied electric field and thereby produces secondary electron-ion pairs for charge multiplication. The growing electric field of the multiplied charge opposes the externally applied field, so the net electric field drops in the region where the discharge has occurred. The high resistivity of the electrodes prevents the high-voltage supply from delivering charge quickly enough to sustain the discharge across the electrodes. As a consequence, the charge multiplication stops when the internal electric field becomes equal to the applied electric field, after a discharge time of $\approx 10$ ns. In the absence of a net electric field, the discharged area remains inactive for the time interval $t_{charging}$ given by equation (A.3), where $R_{plate}$ and $\rho$ are the resistance and bulk resistivity of the glass or Bakelite plate, $C$ is the chamber capacitance, $A$ is the area (which cancels out), $d_{plate}$ and $d_{gas}$ are the thicknesses of the resistive plate and the gas gap respectively, $K_{gas}$ is the dielectric constant of the gas and $\varepsilon_0$ is the permittivity of free space. As the plate and the gas gap have the same thickness, the charging time depends only on the physical properties of the gas and the plate material, and evaluates to $\approx 2$ seconds for a glass plate. This is the detector dead time in the region of the primary ionization, which characterizes the counting rate capability of the RPC.

$$t_{\text{charging}} = R_{\text{plate}}\, C = \frac{\rho \, d_{\text{plate}}}{A} \cdot \frac{K_{\text{gas}}\, \varepsilon_0\, A}{d_{\text{gas}}} = \rho\, K_{\text{gas}}\, \varepsilon_0 \approx (5 \times 10^{10}\ \Omega\,\text{m})\,(\sim 4)\,(8.85 \times 10^{-12}\ \text{F/m}) \approx 2 \text{ s} \tag{A.3}$$
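The arithmetic behind equation (A.3) can be checked with a short sketch. The function name is hypothetical, and the bulk resistivity of $5 \times 10^{10}\ \Omega$ m (equivalent to $5 \times 10^{12}\ \Omega$ cm) is an assumed value for glass, chosen because it reproduces the $\approx 2$ s quoted in the text when combined with $K_{gas} \sim 4$.

```python
EPSILON_0 = 8.85e-12  # permittivity of free space, F/m

def charging_time(rho_ohm_m, k_gas, d_plate_m, d_gas_m):
    """t = R_plate * C; the cross-sectional area A cancels in the product."""
    return rho_ohm_m * k_gas * EPSILON_0 * d_plate_m / d_gas_m

# Assumed glass resistivity; 2 mm plate and 2 mm gas gap as in the text
t = charging_time(rho_ohm_m=5e10, k_gas=4.0, d_plate_m=2e-3, d_gas_m=2e-3)
print(f"charging time = {t:.2f} s")
```

Because the plate and gap thicknesses are equal, their ratio is one and the result depends only on $\rho$, $K_{gas}$ and $\varepsilon_0$; a Bakelite plate with a lower resistivity would give a proportionally shorter dead time.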

In the streamer mode, a large amount of charge (50 pC to a few nC) is produced due to secondary avalanches, which are initiated by the photons resulting from the recombination of primary ions. If the charge multiplication continues, the gas breaks down and a spark is created. Due to the large signal amplitude, no further signal amplification is required in this mode. The signals are simply discriminated against a set threshold, which makes the read-out of the RPC simple.

The avalanche process is characterized by the Townsend coefficient $\alpha$ and the electron attachment factor $\beta$. The Townsend coefficient is the number of ionizations per unit length in the direction of cluster movement. The electron attachment factor is the number of electrons captured per unit length due to the presence of the electronegative gas. If $x$ is the distance between the anode and the cluster, the number of electrons $n$ that reach the anode is given by equation (A.4), where $n_0$ is the initial number of electrons in the cluster. The ratio of the final to the initial number of electrons in the cluster defines the gain factor $M$ of the gas, given by equation (A.5). If $M$ is greater than $10^8$, the primary ionization gives rise to streamers with high probability; otherwise ($M < 10^8$) the RPC operates in the avalanche region. Table (A.3) compares the characteristics of the avalanche and streamer modes of the RPC.

$$n = n_0 \times e^{(\alpha - \beta) \times x} \tag{A.4}$$

$$M = \frac{n}{n_0} = e^{(\alpha - \beta) \times x} \tag{A.5}$$
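Equations (A.4) and (A.5), together with the $10^8$ gain threshold, can be turned into a small classifier. This is only a sketch: the function names and the values of $\alpha$ and $\beta$ are hypothetical and were picked to land on either side of the threshold, not measured for any gas mixture.

```python
import math

STREAMER_GAIN_THRESHOLD = 1e8  # threshold on M quoted in the text

def gain(alpha_per_mm, beta_per_mm, x_mm):
    """Gain factor M = n / n0 = exp((alpha - beta) * x), equation (A.5)."""
    return math.exp((alpha_per_mm - beta_per_mm) * x_mm)

def operating_mode(m):
    """Classify the operating region from the gain factor."""
    return "streamer" if m > STREAMER_GAIN_THRESHOLD else "avalanche"

# Hypothetical coefficients (per mm) for a cluster starting 2 mm from the anode
for alpha, beta in ((9.0, 1.0), (11.0, 1.0)):
    m = gain(alpha, beta, 2.0)
    print(f"alpha={alpha}, beta={beta}: M = {m:.2e} -> {operating_mode(m)}")
```

The exponential dependence means that a modest change in the effective coefficient $\alpha - \beta$ moves the detector from one operating region to the other, which is why the fraction of electronegative gas matters so much.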

| Avalanche Mode                                                   | Streamer Mode                                                      |
|------------------------------------------------------------------|--------------------------------------------------------------------|
| Count rate capability $\approx 10 \text{ kHz}$                   | Count rate capability $\approx 1 \text{ kHz}$                      |
| Requires small HV for small gain                                 | Requires high HV to achieve high gain                              |
| Smaller signals ( $\approx 1 \text{ pC}$ )                       | Large signals ( $\approx 100 \text{ pC}$ )                         |
| Needs pre-amplifier                                              | Pre-amplifier not required                                         |
| Better long-term prognosis                                       | Short life due to aging effect                                     |
| No multiple streamers                                            | Multiple streamers                                                 |
| Pulse amplitude $\approx 0.5 - 2 \text{ mV}$ across 50 ohm load  | Pulse amplitude $\approx 100 - 200 \text{ mV}$ across 50 ohm load  |
| Short recovery time due to small amount of charge produced       | Long recovery time due to heavy charge                             |

Table A.3: Comparison between operating regions of RPC

#### A.3.2 Parameters of RPC

The hit pattern, efficiency, noise rate, spatial resolution and timing resolution are the important parameters for validating RPC detector channels interfaced with the front-end (FE) electronics.

#### a. Noise rate

The noise rate is the rate at which random noise signals hit a strip. In the prototype arrangement, the RPC detects cosmic rays from the background radiation. The pulses generated by a strip interfaced with the FE are counted within a window of 1-second duration. These counts per second (cps) indicate the noise rate. For a particular channel, the plot of cps versus time is called the noise rate plot. A stable noise rate plot validates the RPC detector, as background radiation produces a stable count in a particular strip.

#### b. Hit pattern

The hit pattern latches the status of the 64 pickup strips in a 64-bit register. For one RPC, two 64-bit registers are used, corresponding to the top and bottom panels of strips. Each bit (0 or 1) of a hit pattern register indicates whether that particular strip (channel) was fired or missed by the incident particle. The hit pattern thus determines the geometrical coordinates of the strips fired by an incident particle (muon).

#### c. Spatial resolution

The spatial resolution of the detector is determined by the pitch of the pickup strips. However, it is limited by the cross talk between adjacent strips. Various factors, such as the thickness of the pickup strips, the driving scheme of the pickup signals and the electrical characteristics of the strips, contribute to the cross talk. The impact of cross talk is quantified by the cluster size (found from the hit pattern register code), which is the number of adjacent fired strips of the RPC. The cluster size should be $\leq$ 2 for acceptable performance of the RPC.
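The hit pattern register and the cluster size described above can be illustrated with a short sketch. The bit ordering (bit 0 corresponds to strip 0), the function names and the example event are assumptions made for illustration; the actual register mapping of the read-out electronics may differ.

```python
STRIP_PITCH_MM = 30  # center-to-center strip distance from the text

def fired_strips(hit_pattern, n_strips=64):
    """Indices of strips whose bit is set (bit 0 = strip 0, an assumed convention)."""
    return [i for i in range(n_strips) if (hit_pattern >> i) & 1]

def cluster_sizes(hit_pattern, n_strips=64):
    """Sizes of all runs of adjacent fired strips in the hit pattern."""
    sizes, run = [], 0
    for i in range(n_strips):
        if (hit_pattern >> i) & 1:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:  # a cluster that ends at the last strip
        sizes.append(run)
    return sizes

pattern = (1 << 20) | (1 << 21)  # hypothetical event: strips 20 and 21 fired
strips = fired_strips(pattern)
positions_mm = [s * STRIP_PITCH_MM for s in strips]  # strip centers along one axis
print(strips, positions_mm, cluster_sizes(pattern))
```

Applying the same decoding to the registers of the top and bottom strip panels gives the two orthogonal coordinates of the hit, and an event passes the criterion stated above when its largest cluster size does not exceed 2.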

#### d. Timing resolution

It is desirable to know the timing of the RPC signal in order to determine the direction of the incoming muon. The ANUPAL TDC ASIC [146] measures the timing of the RPC signal with respect to the trigger signal. A histogram of the TDC values is plotted, and its standard deviation $\sigma$ gives the timing resolution. A typical value of the timing resolution of the RPC detector is $\sim$1.6 ns.
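As a minimal sketch of this procedure, the timing resolution can be estimated as the sample standard deviation of the TDC readings. The readings below are made-up numbers for illustration, and in practice the resolution is often obtained from a Gaussian fit to the histogram rather than from the raw standard deviation.

```python
import statistics

def timing_resolution(tdc_values_ns):
    """Sample standard deviation of the TDC readings, in ns."""
    return statistics.stdev(tdc_values_ns)

# Hypothetical TDC readings (ns) of one strip, measured relative to the trigger
readings = [48.0, 50.0, 52.0, 49.0, 51.0, 50.0]
mean_ns = statistics.mean(readings)
sigma_ns = timing_resolution(readings)
print(f"mean = {mean_ns:.2f} ns, sigma = {sigma_ns:.2f} ns")
```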

#### e. Efficiency

To measure the RPC strip efficiency, two scintillator paddles are placed, one above and one below the RPC strip. A trigger pulse is derived from the coincidence of these two scintillator paddles and is used in the efficiency calculation. The ratio of the RPC pulse count in coincidence with the trigger (LVDS count) to the trigger count gives the efficiency of the RPC strip. For high performance of the RPC, a large efficiency is desired. Fig.(A.5) shows the RPC efficiency as a function of the high voltage applied between the glass electrodes for different gas mixtures. Plateau efficiencies of over 90% have been obtained for all the gas mixtures beyond 8.5 kV.
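The efficiency ratio defined above can be computed as follows. The counts are hypothetical, and the binomial uncertainty is a standard statistical estimate added here for illustration; it is not part of the procedure described in this work.

```python
import math

def strip_efficiency(coincidence_count, trigger_count):
    """Efficiency (coincidence count / trigger count) and its binomial uncertainty."""
    if trigger_count <= 0:
        raise ValueError("trigger_count must be positive")
    eff = coincidence_count / trigger_count
    err = math.sqrt(eff * (1.0 - eff) / trigger_count)
    return eff, err

# Hypothetical counts: 950 RPC pulses in coincidence with 1000 triggers
eff, err = strip_efficiency(coincidence_count=950, trigger_count=1000)
print(f"efficiency = {100 * eff:.1f} % +/- {100 * err:.1f} %")
```

Repeating this calculation at each high-voltage setting would produce an efficiency curve of the kind shown in Fig.(A.5).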



Figure A.5: Efficiency versus high voltage characteristics of RPC. This figure is adopted from Ref.[47]

#### A.3.3 Introduction to multi-gap RPC

The gas gap in the RPC serves two purposes: it provides the primary ionization and the gas gain. A wide gas gap produces a large-amplitude signal, but the time resolution is poor because the longer path produces large fluctuations in the signal arrival time. To address this issue, the concept of the multi-gap RPC (MRPC) was introduced [49]. The MRPC allows the induced signals from many small gas gaps to be read out in parallel. It is constructed from a stack of resistive plates arranged in parallel, as shown in Fig.(A.6). The electrodes are applied to the outer surfaces of the top and bottom plates of the stack, while the intermediate plates are left electrically floating and create a series of small gas gaps. A strong electric field is generated in each sub-gap by applying a high voltage across the electrodes. A charged particle passing through the RPC generates avalanches in the gas gaps. As the glass plates act as dielectrics, a signal can be induced on the pickup strips by the movement of charge in any of the gas gaps. Thus, the induced signal is the sum of the avalanches created in all the gas gaps. For $n$ gaps, this improves the efficiency of the RPC to $1 - (1 - \epsilon)^n$, where $\epsilon$ is the efficiency of a single-gap RPC. In addition, the timing resolution improves to $\sigma_t/\sqrt{n}$, where $\sigma_t$ is the resolution of a single-gap RPC. The improvement in resolution is attributed to the small gap width, which reduces the statistical variations.
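The two scaling laws for an $n$-gap MRPC can be evaluated with a short sketch. The single-gap efficiency of 60% and resolution of 120 ps are hypothetical inputs chosen only to show the trend, and the gaps are assumed to respond independently.

```python
import math

def mrpc_efficiency(single_gap_eff, n_gaps):
    """Probability that at least one of n independent gaps fires: 1 - (1 - eps)^n."""
    return 1.0 - (1.0 - single_gap_eff) ** n_gaps

def mrpc_time_resolution(single_gap_sigma, n_gaps):
    """Time resolution scaling sigma_t / sqrt(n)."""
    return single_gap_sigma / math.sqrt(n_gaps)

# Hypothetical single-gap values: 60 % efficiency, 120 ps resolution
for n in (1, 5):
    eff = mrpc_efficiency(0.60, n)
    sigma = mrpc_time_resolution(120.0, n)
    print(f"n = {n}: efficiency = {eff:.4f}, sigma = {sigma:.1f} ps")
```

Even a modest single-gap efficiency approaches unity after a few gaps, while the resolution improves only as the square root of the number of gaps.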



Figure A.6: Structure of multi-gap RPC with five gaps. This figure is adopted from Ref.[49]

#### A.4 Types of RPC

RPCs are classified as trigger or timing RPCs [49] depending on their application in an experiment. A timing RPC is required in time-of-flight (TOF) experiments to achieve high resolution. A trigger RPC is used to detect the passage of a minimum ionizing particle (MIP), such as a muon, and thereafter to signal the other co-detectors and their data acquisition systems to record the data. Large-area RPCs with 2 mm single or double gaps operated in avalanche mode provide 98% efficiency up to a particle flux of several $kHz/cm^2$. This type is used as a trigger RPC, with a time resolution of $\sim$1 ns. For timing applications, a large-area, small-gap (0.2 to 0.3 mm) RPC in multi-gap configuration is used. It provides an efficiency of 99% and a time resolution of 50 ps.

## Appendix **B**

# B.1 Answers to Questions Asked by the Referees (Dr. Jayanta Mukherjee, IIT Bombay and Dr. Yasuo Arai, KEK, Japan)

**Question 1.** Does phase noise of fast and slow oscillators affect the phase coincidence and performance of coarse and fine counters?

**Answer 1.** Yes, jitter (phase noise) affects the phase coincidence, but the maximum error is +1 LSB and its statistical contribution is marginal. The simulated RMS period jitter over 1000 cycles is 0.3 ps, which corresponds to a peak-to-peak jitter of $7.44 \times 0.3$ ps $= 2.23$ ps.

Case Study-

The input signals '*start*' and '*stop*', separated by a time interval $\Delta T$ ($< T_{oscst}$), trigger the slow and fast oscillators with time periods $T_{oscst}$ and $T_{oscsp}$ respectively. The time interval between the first rising edges of the slow and fast oscillators is $\Delta T$. On each oscillation cycle, this time interval reduces by a step of $\Delta T_d$ ($= T_{oscst} - T_{oscsp}$), which defines the LSB of the time interval measurement. Therefore, at the $n^{th}$ cycle, the remaining time interval is $\Delta T - (n \times \Delta T_d)$. Depending on the applied time interval, there are two possible cases just before coincidence:

1. $\Delta T - (n \times \Delta T_d) = \Delta T_d$: At the $n^{th}$ cycle, the remaining time interval between the rising edges of the oscillators equals the LSB of the measurement. The rising edges then coincide at the $(n+1)^{th}$ cycle, and at the $(n+2)^{th}$ cycle the end of conversion (EOC) signal is asserted by the leading edge phase detector. Here, the maximum error of +1 LSB is introduced in the measured time interval between start and stop.
2. $\Delta T - (n \times \Delta T_d) < \Delta T_d$: At the $n^{th}$ cycle, the remaining time interval between the rising edges of the oscillators is less than the LSB of the measurement. The rising edge of the fast oscillator then directly leads that of the slow oscillator, followed by the assertion of the EOC signal.

The peak-to-peak jitter of 2.23 ps may cause an interchange between case 1 and case 2 and may thereby affect the phase coincidence. However, the timing error is at most one LSB. As jitter is statistical in nature, its actual contribution is insignificant.
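The two cases of the case study can be reproduced with an idealized, jitter-free model of the Vernier measurement. This is a sketch under stated assumptions rather than a description of the implemented circuit: the oscillator periods are hypothetical, times are kept as integer femtoseconds to avoid rounding effects, and the phase detector is assumed to fire only when the fast edge strictly leads the slow edge.

```python
def vernier_cycles(delta_t_fs, t_slow_fs, t_fast_fs):
    """Count cycles until the fast-oscillator edge leads the slow-oscillator edge.

    Returns the cycle count and the quantization error (measured minus applied
    interval), both for an idealized jitter-free pair of oscillators.
    """
    lsb_fs = t_slow_fs - t_fast_fs
    if lsb_fs <= 0 or not 0 <= delta_t_fs < t_slow_fs:
        raise ValueError("need T_slow > T_fast and 0 <= delta_t < T_slow")
    n = 0
    while delta_t_fs - n * lsb_fs >= 0:  # remaining interval at the n-th cycle
        n += 1
    return n, n * lsb_fs - delta_t_fs

# Hypothetical periods: 10.00 ns (slow) and 9.90 ns (fast), so the LSB is 100 ps
T_SLOW_FS, T_FAST_FS = 10_000_000, 9_900_000
for delta_t_fs in (500_000, 530_000):  # case 1 (multiple of the LSB), case 2
    n, err_fs = vernier_cycles(delta_t_fs, T_SLOW_FS, T_FAST_FS)
    print(f"dT = {delta_t_fs // 1000} ps: n = {n}, error = {err_fs // 1000} ps")
```

For the 500 ps input the edges coincide exactly before the fast edge takes the lead, so the error equals one full LSB (case 1); for the 530 ps input the fast edge leads directly and the error is smaller (case 2). In both cases the error stays within one LSB, in line with the bound stated above.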

**Question 2.** What is the impact of jitter on the performance of the coarse and fine counters?

**Answer 2.** Jitter does not introduce an error in the counts of the coarse and fine counters, as both counters are stopped by the EOC signal with one extra count. The subtraction of this extra count from both the coarse and fine counters is included in the conversion expression, explained on page no. 119.

**Question 3.** Delay Lock Loops are explained in detail as an automatic delay calibration method, but unfortunately not much is said about PLLs. I would like to see a comparison between DLL and PLL.

**Answer 3.** In this work, the objective is to develop a power- and area-efficient, low-noise Vernier TDC with a simple and elegant design, catering to millions of detector channels. The previously reported Vernier TDC employed PLLs for frequency stabilization of the oscillators. Such a PLL-based Vernier TDC is complex to design because of the complex architecture of the PLL and its stability issues. Also, from the point of view of using the TDC in the INO experimental setup, the PLLs would have to run continuously (without stopping), as neutrino events are random and have a low rate (100 Hz). This would produce heavy circuit noise and high power consumption. Therefore, in this work, the PLL is replaced by a digital calibration circuit with an accuracy of 3 ps, which ensures the accuracy of the measured time interval.

In the FLASH TDC, a DLL is used to stabilize the absolute delay of the CBL delay element across PVT variations and thereby maintain the resolution. The DLL is shared among 4 time-stamping channels, which reduces the power per channel and helps meet the power requirements. As an alternative option, the designed architecture can also accommodate variations in the CBL delay through an off-chip reference voltage, with a minimum delay of 136 ps at 3.6 V supply voltage.

Moreover, the DLL has superior performance over PLL as compared in Table (B.1), based on the study of reported literature.

| Phase Lock Loop (PLL)                                    | Delay Lock Loop (DLL)                                                                                               |
|----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
| VCO, complex loop filter, PFD are key design blocks      | VCDL, simple loop filter, PD are key design blocks                                                                  |
| Jitter accumulation                                      | No jitter accumulation                                                                                              |
| High-order system                                        | First-order system                                                                                                  |
| Stability issues                                         | Highly stable                                                                                                       |
| Hard to design                                           | Easy to design                                                                                                      |
| Costly to integrate loop filter                          | Easier to integrate loop filter                                                                                     |
| Less dependency on reference clock signal                | High dependency on reference clock                                                                                  |
| Easy frequency multiplication                            | Frequency multiplication is difficult                                                                               |
| Wide lock range                                          | Limited lock range                                                                                                  |
| VCO can be implemented elegantly                         | Delay element dependent                                                                                             |
| For accurate frequency generation                        | For delay variation locking                                                                                         |
| Vernier, direct counting, multi-phase clock based TDC    | Pulse shrinking delay line, differential delay line, Flash, Cyclic TDC etc.; almost all major work in TDC uses DLL  |

Table B.1: Comparison between PLL and DLL on various aspects

**Question 4.** Since the proposed circuit is to be used in an ionizing environment, as discussed on pages 14-19, is the circuit immune to ionizing radiation?

**Answer 4.** In the INO experiment, the proposed laboratory is underground with a rock overburden of 1 km, which acts as a shield against total ionizing dose and SEU events. Also, the site has no significant radiation background, and the rock contains no radioactive species of concern except ${}^{40}K$, whose contribution is negligible. The iron plates and RPC trays also act as shielding for the electronic circuits. Therefore, radiation-hard electronics are not required in the INO experiment.

## Bibliography

- [1] J. Chadwick, 'Distribution in intensity in the magnetic spectrum of the β-rays of radium', Verhandlungen der Deutschen Physikalischen Gesellschaft, 1914, vol.16, pp.383-391.
- [2] N. Bohr, 'Chemistry and Quantum Theory of Atomic Constitution (Faraday Lecture)', *Journal of the Chemical Society*,1932,135:349-384.
- [3] W. Pauli, 'Letter to the participants of workshop at Tubingen', Germany, 1930.
- [4] J. Chadwick, 'Possible Existence of a Neutron', Nature, 129 (1932) 312.
- [5] E. Fermi, Zeitschr. f. Phys. 88 (1934) 161.
- [6] E. Fermi. La Ricerca Scientifica, 2:12, 1933.
- [7] E. Fermi. Z. Phys., 88:161, 1934.
- [8] H. Bethe and R. Peierls, 'The Neutrino', Nature, 133:532, 1934.
- [9] C.L.Cowan et al., 'Detection of the free neutrino: a Confirmation', *Science*, 1956, vol.124, issue:3212, pp-103-104.
- [10] F. Reines et al., 'The Neutrino', Nature, 1956, 178:446
- [11] G. Danby et al., 'Observation of High-Energy Neutrino Reactions and the Existence of two Kinds of Neutrinos' *Physical Review Letter*, 1962, vol-9,pp-36-44.
- [12] M.L.Perl et al., 'Evidence for Anomalous Lepton Production in e<sup>+</sup> e<sup>-</sup> Annihilation ', *Physical Review Letter*, 1975, vol-35, pp-1489-1492.
- [13] B. Adeva et al., '(L3 collaboration), A determination of the properties of the neutral intermediate vector boson Z<sub>0</sub> ', *Physical Letter B*, 1989, vol.231, pp-509-518.

- [14] D. Decamp et al., '(ALEPH collaboration), Determination of the number of light neutrino species ', *Physical Letter B*, 1989, vol.231, pp-519-529.
- [15] M.Z. Akrawy et al., '(OPAL collaboration), Measurement of the Z<sub>0</sub> mass and width with the opal detector at LEP ' ,*Physical Letter B*, 1989, vol.231, pp-530-538.
- [16] S.L. Glashow, 'Partial-symmetries of weak interactions', Nuclear Physics, 1961, vol.22, pp-579-588.
- [17] S. Weinberg, 'A Model of Leptons' ,Physics Review letter, 1967, vol-19, pp-1264-1266.
- [18] S. Chatrchyan et al., '(CMS collaboration). Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC ', *Physics letter B*, 2012, vol-716, pp-30-61.
- [19] T.D. Lee and C.N. Yang, 'Proceedings of the international conference on high energy physics, Rochester ', *New York: Interscience*,1960 page 567.
- [20] B. Pontecorvo. Sov. Phys. JETP, 1957, 6:429.
- [21] B. Pontecorvo. Sov. Phys. JETP, 1958, 7:172-173.
- [22] R. Davis, 'Solar Neutrinos.II', Physical Review Letter, 1964, vol. 12, pp-303-305.
- [23] R. Davis, et.al., 'Search for Neutrinos from the Sun', *Physical Review Letter*, 1968, vol.20, pp.1205-1209.
- [24] P. Anselmann et al., 'Solar neutrinos observed by GALLEX at Gran Sasso', *Physical Letter B*,1992,vol.285,pp.376-389.
- [25] P. Anselmann et al., '(GALLEX collaboration), Implications of the GALLEX determination of the solar neutrino flux', *Physical Letter B*, 1992, vol.285, pp.390-397.
- [26] A.I. Abazov et al., 'Search for neutrinos from the Sun using the reaction 71 Ga ( $\nu_e, e^-$ )71 Ge ', *Physical Review Letter*, 1991, vol.67, pp.3332-3335.
- [27] M. Altmann et al., '(GNO collaboration), Complete results for five years of GNO solar neutrino observations', *Physical Letter B*,2005,vol.616, pp.174-190.

- [28] J.N. Abdurashitov et al., 'Measurement of the solar neutrino capture rate by the Russian-American Gallium solar neutrino experiment during one half of the 22-Year cycle of Solar activity', *J. Exp. Theor. Phys.*, 2002, vol.95, pp.181-193.
- [29] J. Boger et al., '(The SNO Collaboration), The Sudbury Neutrino Observatory measurement of the solar neutrino', *Nuclear Instruments and Methods A*, 2000, vol.449, pp.172-207.
- [30] T. Schwetz et al., 'Neutrino oscillation physics', *Talk given at the XXIX International Conference of Theoretical Physics*, 2005, B 36:3203-3214.
- [31] Y. Fukuda et al., 'Super-Kamiokande collaboration', *Physical Review Letters*, 1998, 81, 1562-1567, hep-ex/9807003.
- [32] Y. Fukuda et al., 'Super-Kamiokande collaboration', *Physical Review Letters*, 1999, 82, 2644-2648, hep-ex/9812014.
- [33] S. Fukuda et al., 'Super-Kamiokande collaboration', *Physical Review Letters*, 2000, 85, 3999, hep-ex/0009001.
- [34] Y. Fukuda et al., 'Super-Kamiokande collaboration', *Physical Review Letters*, 2001, 82, 2430-2434.
- [35] M. B. Smy et al., 'Super-Kamiokande collaboration', *Physical Review D*, 2004, 69, 011104, hep-ex/0309011.
- [36] J. Yoo et al., 'Super-Kamiokande collaboration', *Physical Review D*, 2003, 68, 092002, hep-ex/0307070.
- [37] K. Eguchi et al., 'KamLAND collaboration', *Physical Review Letters*, 2003, 90, 021802.
- [38] T. Araki et al., 'KamLAND collaboration', *Physical Review Letters*, 2005, 94, 081801.
- [39] M.H. Ahn et al., 'K2K collaboration', *Physical Review Letters*, 2003, 90, 041801.
- [40] D.G. Michael et al., 'MINOS collaboration', *Physical Review Letters*, 2006, 97, 191801.
- [41] P. Adamson et al., 'MINOS collaboration', *Physical Review Letters*, 2011, 106, 181801.
- [42] M. Apollonio et al., 'CHOOZ collaboration', *Eur. Phys. J.*, 2003, C27, 331.
- [43] Y. Abe et al., 'Double CHOOZ collaboration', arXiv:1112.6353v3.
- [44] F.P. An et al., 'Daya Bay collaboration', *Physical Review Letters*, 2012, 108, 171803; arXiv:1203.1669v1.
- [45] J. Ahn et al., 'RENO collaboration', *Physical Review Letters*, 2012, 108, 191802.
- [46] 'INO Collaboration, Detailed Project Report on INO-ICAL Detector Structure II',2008.
- [47] V.M.Datar, 'INO Project Report INO/2006/01', Proceeding of Nuclear Physics Symposium, Vadodara, 2006. Web pages: http://www.ino.tifr.res.in and http://www.imsc.res.in/ino.
- [48] Anushree Ghosh et al., 'Study of RPC and calculation of efficiency'. Available at: http://www.hecr.tifr.res.in/ bsn/INO/rpc-anushree.pdf.
- [49] Satyanarayana Bheesette, 'Design and characterization studies of resistive plate chambers', Ph.D thesis, 2009, Department of Physics, Indian Institute of Technology Bombay, India.
- [50] Saikat Biswas et al., 'Study of Bakelite-based RPC detector performance with cosmic ray using different gas mixtures', RPC report version 2; Tata Institute of Fundamental Research.
- [51] S. Dasgupta, N.K. Mondal et al., 'Development of trigger scheme for the ICAL detector of India-based neutrino observatory', Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 2012, vol.678, pp.105-113.
- [52] Krzysztof Iniewski, 'Electronics for radiation detection', chapter-12, CRC Press, Taylor & Francis Group, New York.
- [53] L. Wolfenstein, 'Neutrino oscillations in matter', Phys. Rev. D, 17:2369,1978.
- [54] S.P. Mikheev and A.Y. Smirnov, 'Resonance Enhancement of Oscillations in Matter and Solar Neutrino Spectroscopy', Sov. J. Nucl. Phys., 42:913-917, 1985.
- [55] Formaggio, J.A. et al., 'Neutrino cross section across energy scale', Rev. Mod. Phys., 2012, 84, 1307, arXiv:1305.7513 [hep-ex]. Available at: https://inspirehep.net/record/1236362/plots

- [56] J. N. Bahcall, A.M. Serenelli, and S. Basu, Astrophysical Journal, 2005, 621, L85.
- [57] Stevens A.E, 'A time-to-voltage converter and analog memory for colliding beam detectors', *IEEE*, 1989, vol.24, pp. 1748-1752.
- [58] K.Koch, H.Hardel et al., 'A new TAC based multi-channel front end electronics for TOF experiments with very high resolution', *IEEE Transactions on Nuclear Science*, 2005, vol.52, pp.745-747.
- [59] Holger Flemming, Harald Deppe, 'Development of high resolution TDC ASIC at GSI', *IEEE Nuclear Science Symposium Conference Record*,2007.
- [60] Harald Deppe, Holger Flemming, 'The GSI event-driven TDC with 4 channels GET4', IEEE Nuclear Science Symposium Conference Record, 2009.
- [61] Y. Kawakami et al., '1.2 GaAs shift register IC for dead time less TDC', *IEEE Transactions on Nuclear Science*, 1989, vol.36, pp.512-516.
- [62] T.Ohsugi, Y.Arai et al., 'TMC-A CMOS time-to-digital converter VLSI for high energy physics' ,IEEE Transactions on Nuclear Science,1989,vol.36,pp.528-531.
- [63] Y.Arai and T.Matsumura, 'A CMOS four channel X 1k time memory LSI with 1ns/b resolution converter', *IEEE Journal of Solid State Circuits*, 1992, vol.27, pp.359-364.
- [64] Y.Arai, M.Ikeno and T.Matsumura, 'A development of time-memory cell VLSI and a CAMAC module with 0.5 ns resolution', *IEEE Transaction on Nuclear Science*, 1992, vol.39, No.4, pp.784-788.
- [65] Yasuo Arai, Masahiro Ikeno, 'A time digitizer CMOS gate-array with a 250 ps time resolution', *IEEE Journal of Solid State Circuits*, 1996, vol.31, No.2, pp.212-220.
- [66] K.Mitta et al., 'Time-to-digital conversion for fast and accurate laser range finding', Proceeding of International Congress on Optical Science and Engineering, 1988, vol. 1010 industrial inspection, pp 60-67.
- [67] Kostamovaara, Juha et al., 'Time-to-digital Converter with analog interpolation circuit', *Review of Scientific Instrument*,1986,vol.57.
- [68] K.Maatta, J.Kostamovaara, 'Profiling of hot surfaces by pulsed TOF laser range finding application', *Appl. Opt.*, 1993, vol.32, no.27, pp.5334-5347.

- [69] W.Neagle et al., 'Surface analysis techniques and applications', *Royal Society of Chemistry, Special Publication*, no-84.
- [70] S.Kawashima, 'A traffic condition monitoring by laser radar for advance safety driving', *in Proceeding Intelligent Vehicles symposium*, 1995, pp.299-303.
- [71] K.Karadamoglou, N.Paschalidis, N.Stamatopouosl, G.Kottarasl, 'A 32-bit high resolution asynchronous time-to-digital converter for space instruments', *IEEE Aerospace Conference Proceeding*, 2004, vol, 1010 industrial inspection, pp.2398-2403.
- [72] 'Available at: http://www.acam.de/products/time-to-digitalconverters/applications/ultrasonic-density-meter/'.
- [73] Brian K.Swann et al., 'A 100 ps time-resolution CMOS time-to-digital converter for positron emission tomography imaging applications', *IEEE Journal of Solid-State Circuits*, 2004, vol.39, No.11.
- [74] Barbosa A.F.,Lima H.P. et al., 'A TDC based system for X-Ray imaging detectors', *IEEE Transactions on Nuclear Science*, 2004, vol.39, No.11.
- [75] Holly Pekau, Abdel Yousif, James W. Haslett, 'A CMOS integrated linear voltage-to-pulse-delay-time converter for Time based analog-to-digital converters', *IEEE international symposium on circuits and systems*, 2006.
- [76] Staszewski R.B et al., 'All digital TX frequency synthesizer and discrete time receiver for bluetooth radio in 130 nm CMOS', *IEEE Journal of Solid State Circuits*, 2004.
- [77] V.Ramakrishnan, Poras T.Balsara, 'A wide range, high resolution, compact CMOS time-to-digital converter', Proceeding of 19th International Conference on VLSI design, 2006.
- [78] M.J.Hsiao et al., 'A built-in parametric timing measurement unit', *IEEE Design and Test of Computers*, 2004, vol.21, no.4, pp.332-330.
- [79] K.Park and J.Park, '20 ps resolution time-to-digital converter for digital storage oscilloscope', *IEEE Nuclear Science Symposium*, 1998, vol.2, pp.876-881.
- [80] Keunoh Park et al., 'Time-to-digital converter of very high pulse stretching ratio for digital storage oscilloscope', *Review of Scientific Instruments*, 1999, vol-70, no-2.

- [81] R.Staszewski, S.Vemulapalli, P.Vallur, J.Wallberg, and P. Balsara, '1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS', *IEEE Transactions on Circuits and Systems II: Express Briefs*, 2006, vol.53, no.3, pp.220-224.
- [82] D.V.O'Connor and D.Phillips, 'Time correlated single photon counting', *Academic Press London*, 1984.
- [83] Rapeli et al., 'Method and circuitry for demodulation of angle modulated signals by measuring cycle time', *US patent*, 1993.
- [84] Poki Chen et al., 'A TDC based CMOS smart temperature sensor', IEEE Journal of Solid State Circuits, 2005, vol.40.No.8.
- [85] F.Stellari et al., 'Tools for non-invasive optical characterization of CMOS circuits', Proceedings of IEDM, Washington D.C. USA,1999.
- [86] Legrele C et al., 'A one nano-second resolution time interval counter', *IEEE Transaction on Nuclear Science*, 1983, vol NS-30. No-1.
- [87] R.Reverberi et al., 'A versatile high quality time-to-digital converter', *Journal Physics-E:Science and Instrumentation*, 1985, vol.18.
- [88] Rahkonen, J.K Timo et al., 'ECL and CMOS ASIC devices for measurement of short time interval', *IEEE*, 1988, 1593-1596.
- [89] Porat DI, 'Review of sub-nanosecond time interval measurement', *IEEE Transaction on Nuclear Science*, 1973, vol.20, pp.35-51.
- [90] Jozef Kalisz, 'Review of time interval measurement techniques with picoseconds resolution', *Institute of Physics publishing*, 2004.
- [91] B.Razavi, 'Design of analog CMOS integrated circuits', *McGraw-Hill, New York*, 2001.
- [92] T.Tanimori, 'Design and performance of semi custom analog IC including two-TAC and two-current integrators for super-Kamiokande', *IEEE Transaction on Nuclear Science NS36*, 1989, pp. 497-501.
- [93] Ruotsalainen, E.Raisanen, 'A Bi-CMOS time-to-digital converter with 30ps resolution', Proceedings of the IEEE International Symposium on Circuits and Systems, 1999, vol.1, pp.278-281.

- [94] P.Chen, C.Chen, S.member, J.Zheng and Y.Shen, 'A low cost low power CMOS time-to-digital converter based on pulse stretching', *IEEE transaction on Nuclear Science*, 2006, vol.53, pp.2215-2220.
- [95] P.Xin-Gang Wang, Hai-Gang Yang, et al., 'Successive approximation time-to-digital converter based on vernier charging method', *IEICE Electronics Express*, 2006, vol.11, pp.1-6.
- [96] E.J.Gerds, S.Member, J.V.Spiegel, et al., 'A CMOS time-to-digital converter IC with two-level analog CAM', *IEEE Journal of Solid State Circuits*, 1994, vol.26, pp.1068-1076.
- [97] W.Sun, C.Chang and N.Wang, 'A time-to-amplitude conversion chain for time interval measurement', *IOP Science Measurement and Technology*, 1994, vol.11, No.3.
- [98] Jeff C Gust, 'Stop watch and timer calibration', *National Institute of Standards and Technology*.
- [99] W.Sun, C.Chang and N.Wang, 'Time interval averaging', *Hewlett-Packard*, Application Note: 162-1, 1994.
- [100] E.Raisanen-Ruosalainen, T.Rahkonen and J.Kostamovaara 'An integrated time-to-digital converter with 30 ps single shot precision', *IEEE Journal of Solid State Circuits*, 2000, vol.35, pp.1507-1510.
- [101] Jozef Kalisz, Ryszard Szplet, Ryszard Pelka 'Interpolating time counter with 200-ps resolution', *IEEE Transaction on Instrumentation and Measurement*, 2000, vol.49, No.4.
- [102] I.Nissinen, A.Mantyniemi and J.Kostamovaara, 'A CMOS time-to-digital converter based on a ring oscillator for a laser radar', *Proceedings of the 29th European Solid-State Circuits Conference (ESSCIRC)*, 2003, pp.469-472.
- [103] Ken Martin, 'Digital integrated circuit design, differential CMOS circuits, chapter-5', *Oxford University Press*, 2002.
- [104] Rahkonen.T, Kostamovaara.J, & Synjkangas.S, 'Time interval measurements using integrated tapped CMOS delay lines' ,In Proceedings of 32<sup>nd</sup> Midwest Symposium on Circuits and Systems, Urbana, Illinois, USA,1989, vol.1, pp.201-205.

- [105] Rahkonen, Timo, et al., 'Circuit techniques and integrated CMOS implementation for measuring short time intervals'.
- [106] Rahkonen, et al., 'The use of stabilized CMOS delay line for the digitization of short time-interval', *IEEE Journal on Solid State Circuits*, 1993, vol.28, pp.3-10.
- [107] Pelgrom.M. et al., 'Matching properties of MOS transistors', IEEE Journal of Solid State Circuits, 1989, vol.24, No-5, pp 1433-1440.
- [108] M.Mota, 'Design and characterization of CMOS high-resolution time-to-digital converters', *Ph.D thesis*, *Microelectronics Group*, *CERN*, *Geneva*, 2000.
- [109] T.R.Mozsary, Andras, Jen-Feng Chung, Angel Rodriguez, 'Bio inspired 0.35μm CMOS time-to-digital converter with 29.3ps lsb', 32-European Solid State Circuit Conference, 2006, pp.170-173.
- [110] C.Ljuslin, J.Christiansen, 'An integrated 16-channel CMOS time-to-digital converter', *IEEE Transaction on Nuclear Science*, 1994, vol.41, pp.1104-1108.
- [111] C. Herve, K. Torki, 'A 75 ps RMS time resolution Bi-CMOS time-to-digital converter optimized for high rate imaging detectors', *Nuclear Instruments and Methods in Physics Research A*, 2002, vol.481, pp.566-574.
- [112] Jansson J-P, Mantyniemi A & Kostamovaara.J, 'Synchronization in a multilevel CMOS time-to-digital converter', *IEEE Transactions on Circuits and Systems I: Regular Papers*, 2009, 56(8):1622-1634.
- [113] F.Bigongiari, R.Roncella, 'A 250-ps time-resolution CMOS multi-hit time-to-digital converter for nuclear physics experiments', *IEEE Transaction on Nuclear Science*, 1999, vol.46, pp.73-77.
- [114] Y.Arai 'A multi-hit time-to-digital converter VLSI for high-energy physics experiments', *Proceedings of the Asia and South Pacific- Design Automation Conference*,2001,pp.5-6.
- [115] Rabaey, 'Digital integrated circuits-A design perspective (2nd Edition), chapter-4, lumped RC-model'.
- [116] Chao Yang, 'RC-delay chain-based RF time to digital converter',2006,*Proceeding of IEEE*,*International Conference on Mechatronics and Automation*, pp.170-173.

- [117] L.Li, W.Zhou, F.Wang, et al. 'A time-to-digital converter based on time space relationship' ,2007, Frequency Control Symposium, Joint with the 21<sup>st</sup> European Frequency and Time Forum IEEE International, pp.815-819.
- [118] J.Kalisz, R.Szplet, 'Field programmable gate array based time-to-digital converter with 200 ps resolution', *IEEE Transaction on Instrumentation and Measurement*, 1997, vol.46, pp.51-55.
- [119] Piotr Dudek, et al., 'A high resolution CMOS TDC utilizing a Vernier delay line', *Proceedings of Eurosensors XI Conference*, *Warsaw*, *Poland*, *September*, 1997.
- [120] V.Hatfield, Piotr Dudek and John, 'A zero dead-time high temporal resolution time-of-flight particle detector IC', Proceeding of Eurosensors 4th conference, 1997.
- [121] Vercesi L.,Liscidini A.,et al., 'Two dimensional Vernier time-to-digital converter', *IEEE Conference on Custom Integrated Circuits*,2009.
- [122] Jianjun Yu, Fa Foster Dai, Richard C.Jaeger, 'A 12-bit Vernier ring time-to-digital converter in 0.13μm CMOS technology', IEEE Symposium on VLSI Circuits Digest of Technical Papers, 2009, vol.3, pp.2009-2010.
- [123] A. M. Abas, G. Russell, and D. J. Kinniment, 'Design of sub 10-picoseconds on-chip time measurement circuit', *In Proceeding of Design Automation Test Europe Conference*, 2004, vol.2, pp.804-809.
- [124] D.K.Xie and Q.C.Zang, 'Cascading Delay Line TDC with 75 ps resolution and reduced number of delay cells', *Review of Scientific Instruments*, 2005, vol.76, pp.
- [125] C.Hwang, S.Member, P.Chen and H.Tsao, 'A high-precision time-to-digital converter using two-level conversion scheme',*IEEE Transaction on Nuclear Science*,2004,pp.1349-1352.
- [126] A.Mantyniemi, T.Rahkonen and J.Kostamovaara, 'A high resolution digital CMOS time-to-digital converter based on nested delay lock loop', *Proceedings of IEEE international symposium on circuits and systems*, 1999, vol.2, pp.537-540.
- [127] A. Waizman, 'A delay lock loop for frequency synthesis of de-skewed clock', *IEEE international solid-state circuits conference digest of technical papers*, 1994, pp.289299.

- [128] R.FarjadRad, W.Dally N, 'A 0.2-2 GHz 12 mW multiplying DLL for low-jitter clock synthesis in highly-integrated data communication chips', *IEEE international solid-state circuits conference digest on technical papers*, 2002, vol.1, pp.76-77.
- [129] T.S.Precision, J.Jansson, A.Mantyniemi and J.Kostamovaara, 'A CMOS time-to-digital converter with single shot resolution better than 10 ps', *IEEE journal of solid state circuits*, 2006, vol.41, pp.1286-1296.
- [130] J.Kostamovaara, 'Low-power CMOS time-to-digital converter', *IEEE journal of solid state circuits*, 1995, vol.30, pp.984-990.
- [131] Poki Chen 'A low power high accuracy CMOS TDC', *IEEE international symposium on circuits and systems*, 1997.
- [132] R.Sambles 'Highly accurate cyclic CMOS time-to-digital converter with extremely low power consumption' ,*IEEE Electronics letters*,1997,vol-33, pp.858-860.
- [133] S.liu, P.Chen, 'A cyclic CMOS time-to-digital converter with deep subnanosecond resolution', *Proceedings of the IEEE on custom integrated circuits*, 1999, pp. 605-608.
- [134] C.Chen, W.Chang and P.Chen, 'A precise cyclic CMOS time-to-digital converter with low thermal sensitivity', *IEEE nuclear science symposium conference record*,2004,vol.3, pp.1364-1367.
- [135] J.Christiansen 'An integrated CMOS 0.15 ns digital timing generator for TDC's and clock distribution system', IEEE transaction on nuclear science,1995,vol.42, pp.753-757.
- [136] M.Mota, 'A four channel self calibrating high resolution time-to-digital converter for TDC's and clock distribution system', *IEEE international conference on electronics, circuits and systems*, 1998, pp.409-412.
- [137] M.Mota et al., 'A high resolution time interpolator based on a delay locked loop and an RC delay line', *IEEE journal of solid state circuits*, 1998, vol.34, No.10, pp.1360-1366.
- [138] J. Christiansen, A. Marchioro, P. Moreira, M. Mota et al., 'A data driven high performance Time to Digital Converter'. Available at:https://cdsweb.cern.ch/record/478865/files/p169.pdf

- [139] M. S. Gorbics, J. Kelly, K. M. Roberts and R. L. Sumner, 'A high resolution multi-hit time-to-digital converter integrated circuit', *IEEE transactions on nuclear science*, 1998, vol.44, no.3.
- [140] W.C.Rashid, Majid Ahmadi, 'Short time interval measurement using CMOS time amplifier', *Canadian conference on electrical and computer engineering*, *CCECE*,2009,pp.000321-000324.
- [141] W.C.Rashid, Majid Ahmadi, 'A delay generation technique for narrow time interval measurement', *IEEE transactions on instrumentation and measurement*, 2009, vol.58, No.7.
- [142] King, Barton R. D., 'Two Vernier time-interval-digitizer', *IEEE transaction on nuclear instrumentation and methods*, 2003, pp.359-370.
- [143] P.Chen, C.Chen, S.Member, Y.Shen, 'PVT sensitive Vernier based time-to-digital converter with extended input range and high accuracy', *IEEE transaction on nuclear science*, 2007, vol.54, pp.294-302.
- [144] K. Hari Prasad, V.B Chandratre, Pooja Saxena et al., 'FPGA based time-to-digital converter', Proceedings of nuclear physics symposium, 2011, vol.54, pp.294-302.
- [145] K. Hari, Menka Sukhwani, Pooja Saxena et al., 'A CMOS standard cell based TDC', *the international conference on VLSI, VCASAN-2013.*
- [146] K. Hari, Menka Sukhwani, Pooja Saxena et al., 'A four channel time-to-digital converter ASIC with in-built calibration and SPI interface', *Nuclear Instruments and Methods in Physics Research, Section A*, 2014, vol.737, Feb.11.
- [147] Yongminpark and David.D et al. 'A cyclic vernier time-to-digital converter synthesized from 65 nm standard cell library'.
- [148] Chin-Hsin Lin, Marek Syrzycki et al. 'single stage Vernier time-to-digital converter with sub-gate delay time resolution', *Circuits and systems, scientific research*,2011,vol.2, pp.365-371.
- [149] Chu D.C, Allen M.S and Foster A.S., 'Universal counter resolves picoseconds in time interval measurement', *Hewlett-Packard Journal*, 1978, vol.29, pp.2-11.
- [150] Antti Mantyniemi, et al., 'A CMOS TDC based on cyclic time domain successive approximation interpolation method', *IEEE journal of solid state circuits*, 2009, vol.44, No.11.

- [151] V.B.Chandratre, 'Thesis entitled: Analog pulse processing techniques in nuclear instrumentation for silicon strip & pin detectors', *Physics department*, *Mumbai university*, *Mumbai*, 2010.
- [152] Vishal Sharma, 'Thesis entitled: Design of low power CMOS cell structures based on sub-threshold conduction principle', *Department of electronics and communication, Thapar university, Patiala*, 2010.
- [153] Jan M. Rabaey et al., 'Digital integrated circuits: A design perspective, II edition', 2002, PHI, ISBN-10:8120322576.
- [154] Ken Martin, 'Digital integrated circuit design', Oxford University Press, 2000.
- [155] Syed Muhammad Masood, 'Master thesis:active loads in current mode logic (CML) topology', *Technical university of Denmark*,2006.
- [156] E.F.M. Albuquerque, M.M.Silva, 'current balanced logic for mixed signal IC ',Proceeding of the IEEE international symposium on circuit and system,1999, vol.1,pp.274-277.
- [157] Lee Eng Han et.al, 'CMOS transistor layout KungFu',2009, Available at: www.eda-utilities.com.
- [158] Cheng Jia, 'Ph.D. Thesis: A Delay-locked loop for multiple clock phases/delays generation Topology', *Georgia institute of technology*, 2005.
- [159] G. Anelli, F. Faccio, S. Florian, P. Jarron, 'Noise characterization of a 0.25 μm CMOS technology for the LHC experiments', *Nuclear instruments and methods in physics research A*,2001,vol.457, pp.361-368.
- [160] Yannis Tsividis, 'Operation and modeling of the MOS transistor', II Edition, McGraw-Hill Int. Ed., New York, 1999, vol.421, pp.426-427.
- [161] A.L.McWhorter, 'Semiconductor surface physics', 1956, *University of Pennsylvania Press*, pp.207-227.
- [162] Z.Y. Chang, W.M.C. Sansen, 'Low-noise wide-band amplifiers in bipolar and CMOS technologies', *Kluwer Academic Publishers*,1991,pp.20.
- [163] Syed Muhammad Yasser Sherazi, 'Reduction of simultaneous switching noise in analog signal band on a chip', *Master thesis in ISY, Linköping Institute of Technology*, 2008.

- [164] M. Masud Hasan Chowdhury, 'Ph.D. thesis: Noise analysis and design methodologies in deep sub-micron VLSI circuits', Northwestern University, Evanston, Illinois, 2004.
- [165] Kenneth.L.Shepard,et al., 'Conquering noise in deep sub-micron digital ICs', *IEEE*,1998.
- [166] Kevin T. Tang and Eby G. Friedman, 'Interconnect coupling noise in CMOS VLSI circuits', Proceedings of international symposium on physical design (ISPD),1999,pp.48-53.
- [167] Ashok Vittal, Lauren Hui Chen, Malgorzata Marek-Sadowska, Kai-Ping Wang, and Sherry Yang, 'Crosstalk in VLSI interconnections, ',IEEE transactions on computer-aided design of integrated circuits and systems,1999,vol.18,pp.1817-1824.
- [168] T. Sakurai, 'Closed-form expressions for interconnection delay, coupling, and crosstalk in VLSI', *IEEE transaction on electron devices*, 1993, vol.40, pp.118-124.
- [169] Masud H. Chowdhury, Y. I. Ismail, C. V. Kashyap, and B.L. Krauter, 'Performance analysis of deep sub-micron VLSI circuits in the presence of self and mutual inductance', *IEEE international symposium on circuits and systems* (*ISCAS*),2002,vol.4, pp.197-200.
- [170] Andrew B. Kahng, Sudhakar Muddu, and Devendra Vidhani, 'Noise and delay uncertainty studies for coupled RC interconnects', *IEEE international ASIC/SOC conference*, 1999, vol.4, pp.3-8.
- [171] N.D. Arora, K.V. Raol, R. Schumann, and L. M. Richardson 'Modeling and extraction of interconnect capacitance for multi-layer VLSI circuits ', IEEE transaction on computer-aided design of integrated circuits and systems,1996,vol.15,pp.58-67.
- [172] T.Gao and C.L.Liu, 'Minimum crosstalk channel routing ', *Proceedings of IC-CAD*,1993, pp.692-696.
- [173] J. S. Yim and C. M. Kyung, 'Reducing cross-coupling among interconnect wires in deep-submicron datapath design ',*Proceedings of DAC*,1999.
- [174] Lei He and Kevin M. Lepak, 'Simultaneous shield insertion and net ordering for capacitive and inductive coupling minimization', Proceedings of the 2000 international symposium on physical design, 2000.

- [175] A. Vittal and M. Marek-Sadowska 'Crosstalk reduction for VLSI',*IEEE transaction on computer-aided design*,1997,vol.16, pp.290-298.
- [176] H. Kaul, D. Sylvester and D. Blaauw, 'Active shield: A new approach to shield global wires', *Proceeding of GLS VLSI*,2002.
- [177] Erich Barke, 'Line-to-ground capacitance calculation for VLSI: A comparison, ',*IEEE transactions on computer-aided design*,1999, vol.7,pp.295-298.
- [178] Martin Wirnshofer, 'Chapter-2:Variation Aware adaptive voltage scaling for digital CMOS circuits' ,Springer Series,2002,ISBN: 978-94-007-6195-7
- [179] M.Ercken, L.Leunissen, I.Pollentier, G.P.Patsis et al., 'Effects of different processing conditions on line edge roughness for 193 nm and 157 nm resists', *Metrology, inspection and process control for micro-lithography XVIII, ser. Proc. SPIE*, 2004, vol.5375, pp.266-275.
- [180] Hamid Mahmoodi, Saibal Mukhopadhyay, et al., 'Estimation of delay variations due to random-dopant fluctuations in nano-scale CMOS circuits', *IEEE Journal of solid-state circuits*, 2005, vol.40.
- [181] Smruti R. Sarangi, 'VARIUS: A model of process variation and resulting timing errors for micro architects', *IEEE transactions on semiconductor manufacturing*, 2008, vol.21.
- [182] Popong Effendrik, 'M.SC. thesis: time-to-digital converter (TDC) for WiMAX ADPLL in state-of-the-art 40-nm CMOS', *faculty of electrical engineering, mathematics and science, Delft university of technology*,2011.
- [183] Hector Hung and Vladislav Adzic, 'Monte Carlo simulation of device variations and mismatch in analog integrated circuits', *Proceedings of the national conference on undergraduate research* (NCUR),2006.
- [184] Chin-Kong Ken Yang, 'Delay locked loops- an overview', IEEE press, 2003.
- [185] Wu Gao, 'Design of a monolithic front-end readout chip with a high precision TDC and a time-based ADC in CMOS technology for PET imaging', Ph.D thesis, University of Strasbourg.
- [186] O. Bourrion, L. Gallin-Martel, 'An integrated CMOS time-to-digital converter for coincidence detection in a liquid Xenon PET prototype', *Nuclear instruments and methods in physics research section A*,2006,vol.563,issue.1,pp.100-103.

- [187] Nihar R.Mahapatra, et al., 'Comparison and analysis of delay elements', *IEEE symposium on circuits and systems*, 2002, vol.2.
- [188] Amir Ghaffari and Adib Abrisshamifar, 'A novel wide-range delay cell for DLLs', *International conference on electrical and computer engineering*,2006, pp.497-500.
- [189] H. Chang, J.Lin, and C. Yang, et al. 'A wide-range delay locked loop with a fixed latency of one clock cycle', *IEEE journal of solid-state circuits*,2002, vol.37,pp 1021-1027.
- [190] Jan M. Rabaey, 'Digital integrated circuits-A design perspective, chapter-10, timing issues in digital circuits, clock synthesis and synchronization using a phase-locked loop', II Edition.
- [191] W. Rhee, 'Design of high performance CMOS charge pumps in phase-locked loops', Proceeding on IEEE international symposium on circuits and systems, 1999, vol.1, pp.545-548.
- [192] M. G. Johnson and E.L. Hudson, 'A variable delay line PLL for CPU-coprocessor synchronization', *IEEE journal of solid-state circuits*, 1988, vol.23, pp.1218-1223.
- [193] I. A. Young, J. K. Greason, and K. L. Wong, 'A PLL clock generator with 5 to 110MHz of lock range for microprocessors', *IEEE journal of solid-state circuits*,1992, vol.27, pp.1599-1607.
- [194] Mark Balch, 'Chapter-5: A complete digital design- A comprehensive guide to digital electronics and computer system architecture', McGraw-Hill.
- [195] Sukhwani Menka, Chandratre V.B. et al., '500 MHz delay lock loop based 128-bin, 256 ns deep analog memory ASIC, Anusmriti', 2011, Proceeding of IEEE symposium on computer society.
- [196] A.Wang, B. H. Calhoun, and A. P. Chandrakasan, 'Sub-threshold design for ultra low-power systems', 2006, *Springer*.