# HIGH SPEED DATA CONVERTERS FOR ULTRA-WIDEBAND AND SOFTWARE DEFINED RADIO APPLICATIONS

A Dissertation

Presented in Partial Fulfillment of the Requirements for the

Degree of Doctor of Philosophy

with a

Major in Electrical Engineering

in the

College of Graduate Studies

University of Idaho

by

Islam T. Abougindia

December 2014

Major Professor: Suat U. Ay, Ph.D., P.E.

# AUTHORIZATION TO SUBMIT DISSERTATION

This dissertation of Islam T. Abougindia, submitted for the degree of Doctor of Philosophy with a Major in Electrical Engineering and titled "High Speed Data Converters for Ultra-Wideband and Software Defined Radio Applications," has been reviewed in final form. Permission, as indicated by signatures and dates below, is now granted to submit final copies to the College of Graduate Studies for approval.

| Role                                    | Name                           | Date |
|-----------------------------------------|--------------------------------|------|
| Major Professor                         | Suat U. Ay, Ph.D., P.E.        |      |
| Committee Members                       | Aicha Elshabini, Ph.D., P.E.   |      |
|                                         | Saied Hemati, Ph.D., P.Eng.    |      |
|                                         | Ahmed Abdel-Rahim, Ph.D., P.E. |      |
| Department Administrator                | Fred Barlow, III, Ph.D.        |      |
| Dean of College of Engineering          | Larry Stauffer, Ph.D., P.E.    |      |
| Final Approval and Acceptance           |                                |      |
| Dean of the College of Graduate Studies | Jie Chen, Ph.D.                |      |

## ABSTRACT

Ultra-wideband (UWB) communications and software defined radio (SDR) have been widely researched topics for the past several years. This is mainly because of the increased demand for robust, multi-purpose, and reconfigurable high data rate wireless communication systems with low energy consumption. Moreover, the limited availability of RF spectrum bands for commercial use, and their variation from one country to another, require wireless-capable devices to be more reconfigurable. Therefore, the communication systems of the future will not only have to support multiple applications, but also operate properly in a variety of environments alongside many other communication systems. Applications of these technologies range from short-range battlefield military communication, interoperability of different radio signals, and wireless indoor data transfer and connectivity to wireless sensor networks that require continuous data transmission.

High-speed, medium resolution (5-8 bits), and low-power analog-to-digital converters (ADCs) are essential components of most high data transfer rate communication systems. In UWB systems, as well as digital oscilloscopes, SDRs, and many other high speed communication applications, the speed and resolution of the ADC blocks limit how wide a frequency band the communication system can cover. This dissertation reviews and presents research findings on novel, reconfigurable ADC structures designed for high-speed, medium to high resolution operation in UWB and SDR applications.

The technical inquiry is divided into four research objectives. The first objective is to investigate fundamental principles, architectures, and transistor-circuit-system level limitations and design challenges of available state of the art ADC topologies for UWB and SDR applications. The second objective is to develop novel ADC topologies by utilizing asynchronous signal processing functionalities and by using hybrid structures to overcome the architecture level limits. The third objective is to study the main causes of high-speed ADC imperfections and develop novel correction techniques to achieve better performance. The fourth objective is to realize those novel ADC structures and the correction techniques on silicon and compare actual measurements with the simulation results as a proof of concept.

The research outcomes include: (1) understanding the challenges of high-speed ADC design, (2) understanding the challenges in the design of flexible ADC architectures that can easily be reconfigured between high data rate, medium resolution and low data rate, high resolution operation, (3) understanding the impact of circuit non-idealities on the overall ADC performance, and the different techniques used to reduce them, (4) developing novel and efficient techniques to correct for the top contributors to the circuit non-idealities, and (5) developing novel ADC architectures that can fulfill the requirements of UWB and SDR systems.

The intellectual merits reported in this dissertation include two asynchronous time-interleaved (TI) ADC architectures and a novel, efficient offset correction technique, all targeting the research goals set out above. The novel coarse-fine-calibration (CFC) offset correction technique was developed to increase the accuracy of the comparators used in ADCs while keeping the area and power consumption of the correction circuits small.

Major contributions of this research are: (1) design of a novel offset correction technique for dynamic latched comparators, which are the key blocks in most ADC topologies, to improve the ADC's sampling speed and bit resolution, (2) design of a new asynchronous successive approximation register (ASAR) ADC in a time-interleaved structure, (3) design of a novel asynchronous two bit per stage, binary search ADC with indirect reference switching (ABS-IRS) in a time-interleaved structure, and (4) design, implementation, and testing of the three structures on silicon.

## ACKNOWLEDGMENTS

I would like to thank my major professor Dr. Suat Utku Ay for giving me the opportunity to conduct research on mixed-signal design of data converters, for his continuous guidance and support in solving difficult problems throughout my doctoral studies and research, and for his patience and encouragement throughout my doctoral research. He provided me support both academically and morally throughout the entire period of my doctoral program.

I would like to thank my committee members Dr. Aicha Elshabini, Dr. Saied Hemati, and Dr. Ahmed Abdel-Rahim for their valued comments that helped focus my research direction and for being there to support and guide me when I faltered on any matter.

I would also like to thank my colleagues in VLSI Sensors Research Group (VSRG) at the University of Idaho for their support and help throughout my research.

Finally, I would like to thank my parents, my wife and kids for their blessings, incessant support to me, and spiritual guidance that was not limited to my doctoral research, but throughout my entire life.

## DEDICATION

This dissertation is lovingly dedicated to my mother, Nagwan Sallam. Her support, encouragement, constant love, moral guidance, and blessings have been the main driving force throughout my entire life. I would also like to dedicate it to my beloved wife, Yasmin Abougindia. She has always been by my side, her encouragement has been my most priceless possession, and she has been my greatest inspiration in all the difficult times and situations.

# TABLE OF CONTENTS

- AUTHORIZATION TO SUBMIT DISSERTATION
- ABSTRACT
- ACKNOWLEDGMENTS
- DEDICATION
- LIST OF FIGURES
- LIST OF TABLES
- CHAPTER 1 – INTRODUCTION
  - 1.1 Motivation and Goals
  - 1.2 Contributions
  - 1.3 Organization of the Thesis
- CHAPTER 2 – High speed Analog-to-Digital Converters
  - 2.1 Fundamentals of Analog to Digital Converters
    - 2.1.1 Sample-and-Hold Circuits
    - 2.1.2 Quantization Error
  - 2.2 Analog-to-Digital Converter Performance Metrics
    - 2.2.1 Static Performance Metrics for ADCs
    - 2.2.2 Dynamic Performance Metrics for ADCs
  - 2.3 Nyquist-Rate Data Converters
    - 2.3.1 Flash ADC
    - 2.3.2 Successive Approximation Register (SAR) ADC
    - 2.3.3 Pipelined ADC
    - 2.3.4 Asynchronous Processing ADC
  - 2.4 Oversampling Data Converters
    - 2.4.1 Over Sampling Concept
    - 2.4.2 Sigma-Delta ($\Sigma\Delta$) Modulators
  - 2.5 Time Interleaved (TI) Structures
- CHAPTER 3 – Comparators Offset Correction Techniques
  - 3.1 Introduction
  - 3.2 Comparator Offset Influence on ADC Performance
    - 3.2.1 Offset Immune ADCs
    - 3.2.2 Offset Sensitive ADCs
  - 3.3 Offset Correction Approaches
    - 3.3.1 Background Calibration Techniques
    - 3.3.2 Foreground Calibration Techniques
    - 3.3.3 Threshold Tuning Technique using Bulk Voltage Trimming
    - 3.3.4 Current Trimming Technique using Shunt Devices
    - 3.3.5 Capacitive Trimming Techniques
  - 3.4 Proposed Coarse-Fine-Calibration (CFC) Technique
    - 3.4.1 Circuit Implementation of the Proposed CFC Technique
    - 3.4.2 Post-Layout Simulation Results of CFC Technique
    - 3.4.3 Measurement Results of CFC Technique
    - 3.4.4 Summary
- CHAPTER 4 – High Speed, Energy Efficient SAR ADCs
  - 4.1 Synchronous Versus Asynchronous SAR Processing
  - 4.2 Asynchronous SAR ADC
  - 4.3 Proposed 8-bit, 4-Channel TI, ASAR-ADC
    - 4.3.1 Circuit Implementation
  - 4.4 Simulation Results
    - 4.4.1 Single Channel Simulation Results
    - 4.4.2 Four-Channel Simulation Results
  - 4.5 Measurements Results
    - 4.5.1 Chip Micrographs
    - 4.5.2 Measurement Setup
    - 4.5.3 Measurements Results
  - 4.6 Summary
- CHAPTER 5 – Asynchronous Binary Search ADCs (ABS)
  - 5.1 Comparator Based Asynchronous Binary Search ADC (CABS)
  - 5.2 Asynchronous Binary Search-ADC with Reduced Comparator Count
  - 5.3 Proposed Asynchronous Binary Search with Indirect Reference Shifting ABS-IRS ADC
    - 5.3.1 Circuit Implementation of the Proposed 8-bit, TI ABS-IRS ADC Architecture
  - 5.4 Simulation Results
    - 5.4.1 Single Channel Simulation Results
    - 5.4.2 Four Channels Simulation Results
  - 5.5 Chip Micrograph
  - 5.6 Summary
- CHAPTER 6 – Conclusion and Future Directions
  - 6.1 Future Directions
- REFERENCES

# LIST OF FIGURES

- Figure 1-1. Classical low-IF radio receiver front end
- Figure 1-2. Future radio architecture
- Figure 2-1. Basic ADC block diagram
- Figure 2-2. (a) ADC input/output characteristics; (b) ADC quantization error
- Figure 2-3. Quantization error probability density function
- Figure 2-4. Transfer curve illustrating ADC offset error
- Figure 2-5. Transfer curve illustrating ADC gain error
- Figure 2-6. Transfer curve illustrating DNL errors of a 3-bit non-ideal ADC
- Figure 2-7. Transfer curve illustrating 3-bit ADC INL error
- Figure 2-8. Flash analog-to-digital converter
- Figure 2-9. General Architecture of SAR ADC
- Figure 2-10. 3-bit SAR ADC operation
- Figure 2-11. General architecture of pipelined ADC
- Figure 2-12. Comparison between commonly used Nyquist rate ADCs
- Figure 2-13. Conventional first order sigma-delta ADC
- Figure 2-14. Simplified TI ADC architecture with timing diagram
- Figure 3-1. (a) Ideal comparator characteristics, (b) comparator characteristics with offset
- Figure 3-2. Digitally controlled offset trimming [32]
- Figure 3-3. Digitally controlled trimming example
- Figure 3-4. Bulk trimming correction of PMOS input pair
- Figure 3-5. Current trimming scheme using shunt devices
- Figure 3-6. Capacitance trimming scheme, [47]
- Figure 3-7. CFC calibration technique operation and timing diagram, [48]
- Figure 3-8. Dynamic latched comparator with enhanced resolution, [48]
- Figure 3-9. Dynamic latched comparator with CFC, [48]
- Figure 3-10. Layout of the dynamic comparator with CFC in AMS 0.35µm process
- Figure 3-11. Dynamic comparator with CFC and the supplementary circuits
- Figure 3-12. Simulated bulk trimming voltage versus offset voltage
- Figure 3-13. Simulated shunt trimming gate voltage versus offset correction
- Figure 3-14. Post layout Monte Carlo simulation of 1000 offset sample
- Figure 3-15. Post layout simulation of bulk trimming calibration (A) versus CFC (B)
- Figure 3-16. AMS chip layout showing the comparators with CFC
- Figure 3-17. Micrographs of the fabricated chip in AMS 0.35µm CMOS process
- Figure 3-18. Designed custom PCB (front and back) for testing
- Figure 3-19. CFC test setup
- Figure 3-20. Measured offset variation samples
- Figure 3-21. Measured bulk trimming voltage versus corrected offset voltage
- Figure 3-22. Measured shunt trimming gate voltage versus corrected offset
- Figure 3-23. Measured offsets levels before and after correction using bulk trimming only with 10-bit DAC resolution versus CFC with 4-bit coarse and 4-bit fine DACs
- Figure 4-1. Asynchronous 6-bit SAR ADC using single comparator, [54]
- Figure 4-2. Loop-unrolled asynchronous SAR ADC and its timing diagram, [56]
- Figure 4-3. Critical path for one bit conversion
- Figure 4-4. Improved 8-bit, single channel, ASAR ADC architecture
- Figure 4-5. Parasitic capacitance issue in the C-2C DAC topology
- Figure 4-6. Dynamic latched comparator schematic
- Figure 4-7. 8-bit, R-2R segmented DAC architecture
- Figure 4-8. Four channels TI, 8-bit, ASAR architecture
- Figure 4-9. (a) Four phase clock generator schematic, (b) timing diagram
- Figure 4-10. (a) Chip layout showing the 4-channels TI AADC, (b) layout of a single channel
- Figure 4-11. Simulated DNL and INL of the proposed ASAR ADC
- Figure 4-12. Frequency spectrum of the proposed 8-bit ASAR ADC for 1MHz sine input
- Figure 4-13. Simulated DNL and INL of the proposed 4-channels TI, ASAR ADC
- Figure 4-14. Spectrum of the proposed 4-channels, 8-bit ASAR ADC for 1MHz sine input
- Figure 4-15. AMS chip micrograph showing the proposed 4-channel, TI ASAR ADC
- Figure 4-16. Measurement setup for the ASAR-ADC
- Figure 4-17. FFT spectrum of the proposed ASAR ADC
- Figure 5-1. 3-bit implementation of CABS ADC architecture [61]
- Figure 5-2. 3-bit Asynchronous BS-ADC with reference range detection [62]
- Figure 5-3. 5-bit Asynchronous BS-ADC with reference range detection [62]
- Figure 5-4. Architecture of the proposed 8-bit ABS-IRS ADC
- Figure 5-5. Conceptual architecture and operation of the proposed 8-bit ABS-IRS ADC
- Figure 5-6. Simplified schematic diagram of the track and hold circuit
- Figure 5-7. Dynamic comparator used in the ABS-IRS ADC
- Figure 5-8. C-8C charge redistribution DAC
- Figure 5-9. Asynchronous logic block schematic
- Figure 5-10. Chip layout of the proposed 8b, 4 Channel TI, ABS_IRS ADC in IBM 0.13µm
- Figure 5-11. Simulated DNL and INL of the proposed ABS-IRS ADC
- Figure 5-12. FFT of the proposed 8-bit, ABS-IRS ADC for 1MHz sine input
- Figure 5-13. Simulated DNL and INL of the proposed 4-channels TI, ASAR ADC
- Figure 5-14. Spectrum of the proposed 4-channels, 8-bit ABS-IRS ADC for 10MHz sine input
- Figure 5-15. IBM chip micrograph

# LIST OF TABLES

- Table 3-1. Comparison summary between CFC and bulk trimming with various DAC resolutions
- Table 4-1. Performance summary of the proposed ASAR-ADC
- Table 4-2. Performance summary of the proposed 4-channels TI, ASAR-ADC
- Table 4-3. Measured performance summary of the proposed TI-ASAR-ADC
- Table 5-1. Maximum jitter $\Delta T_s$, for 0.5 LSB sampling uncertainty
- Table 5-2. Performance summary of the proposed ABS-IRS ADC
- Table 5-3. Performance summary of the proposed 4-channels TI, ABS-IRS ADC
- Table 5-4. Comparison summary between ABS-IRS ADC and the state-of-the-art BS ADCs

## CHAPTER 1 – INTRODUCTION

Ultra-wideband (UWB) research started more than 45 years ago. It has moved from the laboratory to military applications, and finally into the commercial marketplace. UWB has shown great performance improvements for wireless voice, data, and video communications. Moreover, due to its wide bandwidth, it has enabled both high data rate wireless connectivity for personal area networks (PANs) [1] and long-range, low data rate communication applications [2]. Recently, UWB research has shifted towards lower rate indoor communications.

The U.S. Federal Communications Commission (FCC) used to allocate spectrum on a request by request basis. Due to the shortage of frequency spectra and the explosive demand for new wireless applications, the FCC made UWB a reality. On February 14, 2002, the FCC designated an unlicensed radio spectrum ranging from 3.1 GHz to 10.6 GHz as UWB spectrum, expressly targeting public-safety, enterprise, and consumer applications [3]. Some advantages of UWB communications are high data communication rates, resistance to multipath fading and jamming, and reduced cost of prototyping.

Figure 1-1 shows the block diagram of a classical low intermediate frequency (IF) radio frequency (RF) receiver front end. The receiver mainly consists of a band pass filter, a low-noise amplifier (LNA), a mixer to down-convert the signal, a low pass filter (LPF), a variable gain IF amplifier, and an analog-to-digital converter (ADC) that digitizes the signal and then passes it to a digital signal processor (DSP) for further processing.



Figure 1-1. Classical low-IF radio receiver front end.

Most UWB research has focused on pushing the ADC as close as possible to the antenna, reducing the analog pre-processing and relying more on digital post-processing. This is mainly because, as integrated circuit (IC) process technologies have been scaled down to several tens of nanometers, they have allowed for higher integration of digital processing circuits. At the same time, digital logic circuits have become more power and cost efficient, while providing reduced design complexity and improved system flexibility, programmability, and speed. To take advantage of this trend, alternative radio architectures have been explored.

Figure 1-2 shows the architecture of a future radio transceiver. The ultimate goal of this new topology is an all-digital transceiver, which is yet to be realized [4]. The aim is to define all functional blocks of a classical radio receiver at the software level. Addressing this vision, the concept of software-defined radio (SDR) was first proposed by Mitola in 1993 [5]. The new concept aimed to build a universal platform for receiving and transmitting at a reasonable data rate. Mitola defined the SDR architecture as one in which the only analog components are an RF ADC at the receiver and an RF DAC at the transmitter. All of the other communication functions are implemented in the digital domain using DSPs.



Figure 1-2. Future radio architecture

For most RF systems, bipolar junction transistors (BJTs) and GaAs MESFETs show superior performance in terms of noise, speed, and linearity. However, their high production cost and low yield restrict their widespread use. In recent decades, CMOS process technology has become a viable alternative, with feature sizes scaled to a few tens of nanometers, improved transit time, and hence higher cutoff frequency ($f_T$). As a result, CMOS scaling has made it possible to integrate complete low cost gigahertz (GHz) radio receivers and transmitters (transceivers) on a single chip, including the front end analog circuits and the back end DSP.

Although CMOS transceivers are smaller, cheaper, and consume less power, their design is more challenging. The undesired outcomes of CMOS scaling for analog front ends are the decreased system supply voltage and the reduced intrinsic gain $(g_m r_o)$ of transistors.

### 1.1 Motivation and Goals

The main focus of this research is to develop novel ADC architectures for UWB-SDR applications. To achieve this goal, the proposed new ADCs have to be able to fulfill the requirements of UWB and SDR applications. For UWB applications, the ADC should be designed for high data rate and medium to low resolution. Also, one of the most challenging parts of multi-standard SDR systems is the ADC, because of the varying sampling rates and resolutions required to handle the wide range of signals corresponding to each individual operation mode. Therefore, for SDR applications, the ADC has to be designed at the system, circuit, and transistor levels to achieve flexibility and reconfigurability, operating over a wide range of speeds and resolutions. Thus, the research efforts were focused on three aspects to realize these ADC requirements: (1) reducing the delays of ADC circuits per code conversion by investigating and developing new ADC architectures, (2) enhancing the ADC's flexibility for reconfigurability by breaking the exponential growth of the number of required sub-blocks (i.e., comparators, resistors, capacitors) with every bit of resolution increase, and (3) accurately correcting the non-idealities of sub-circuits that limit the ADC accuracy.

### 1.2 Contributions

This thesis focuses on the design of high-speed, medium to high resolution ADCs, with a particular emphasis on their implementation and circuit non-idealities in deep sub-micron CMOS processes. The research resulted in several advancements in the field of high speed data converters for UWB and SDR applications. It contributes architecture-level and circuit-level improvements to: (1) classical digitally controlled trimming techniques for offset correction, through a novel two-step hybrid calibration technique that calibrates more accurately without increasing the size of the calibration circuits, reduces the calibration time, and consumes less power and area; (2) successive approximation register (SAR) ADCs, enabling operation at high sampling frequency and high resolution while breaking the exponential growth in the number of components, simplifying the design, and improving flexibility for programmability; and (3) a new ADC architecture targeting high to medium speed and medium to high resolution, which implements a novel technique called "indirect-reference-switching" that simplifies the design and thus makes the architecture more flexible for programmability.

### 1.3 Organization of the Thesis

This thesis is organized in six chapters. Chapter 1 introduces a general ADC concept for UWB-SDR applications, research motivation, and goals. It also summarizes the research contributions and the organization of this thesis.

Chapter 2 gives general background information on different ADC architectures. The first section covers general information on the fundamentals of analog-to-digital converters. The second section gives a brief background on different ADC performance metrics. The third section covers popular Nyquist-rate data converter architectures, their functional sub-blocks, and their design requirements. The fourth section briefly describes oversampled data converters. Last, the fifth section covers time interleaved data converter structures.

Chapter 3 gives general background information on different offset correction techniques, and introduces the novel correction technique. The first section covers the main causes of offset in the ADC's most important block, the comparator. The second section covers the ADC sensitivity to the comparator offset based on the ADC architecture. The third section gives a brief introduction to the two main offset calibration methods while focusing on the popular digitally controlled trimming techniques. Finally, the fourth section introduces the novel coarse-fine-calibration technique, showing the concept of operation and its effectiveness versus the other techniques, accompanied by simulation results and actual measurements from the chip fabricated in the AMS 350nm CMOS process.

Chapter 4 focuses on successive approximation register (SAR) ADCs, and introduces the proposed asynchronous SAR-ADC architecture. The first section shows the difference between classical synchronous operation and asynchronous operation in SAR ADCs. The second section covers the state-of-the-art asynchronous SAR ADC architectures and their operation, showing their pros and cons from both the circuit level and system level perspectives. The last three sections introduce the proposed improved asynchronous SAR ADC at both the architecture and circuit levels, its concept of operation, and the simulation results and actual measurements for an 8-bit, four-channel time-interleaved prototype that was designed and fabricated in the AMS 350nm CMOS process as a proof of concept.

Chapter 5 focuses on binary-search (BS) ADCs, and introduces the novel asynchronous binary-search ADC with indirect-reference switching (ABS-IRS). The first two sections cover the advances and improvements in BS ADC architectures in detail. The third section covers the new asynchronous binary-search ADC architecture, which improves speed and resolution, and presents it at the architecture and circuit levels along with its concept of operation. The last two sections show the simulation results and the micrograph of the 8-bit, four-channel time-interleaved prototype fabricated in the IBM 130nm cmrf8sf CMOS process as a proof of concept.

Chapter 6 summarizes major accomplishments achieved in this thesis and presents new ideas for future research and development of the next generation of high speed data converters.

## CHAPTER 2 – High speed Analog-to-Digital Converters

An analog-to-digital converter (ADC) is a mixed-signal electronic system that interfaces analog domain signals to the digital world. An ADC samples the analog signal in one form (i.e. voltage, current, time, charge, etc.) and quantizes it to the equivalent discrete digital codes (1s and 0s) that can be further processed by digital backend processors. ADCs are traditionally divided into two main types: Nyquist rate and oversampling data converters. In this chapter, the fundamentals of ADCs and the popular power-efficient ADC architectures that are suitable for high data rate applications are explained briefly. The performance metrics of ADCs are also presented.

## 2.1 Fundamentals of Analog to Digital Converters

An analog signal is converted into the digital domain in three processing steps. The first two steps involve filtering and sampling of the analog signal. These two steps band limit the analog signal, remove the high frequency noise on the signal, and change the filtered signal from continuous time to discrete time. The third step deals with quantization of the sampled discrete analog signal and the assignment of a digital code to each of the quantized levels. The basic block diagram of an analog-to-digital converter is shown in Figure 2-1.

The analog input signal $X_a(t)$, a continuous time analog signal, is first applied to a low pass filter (LPF) known as an anti-aliasing filter (AAF). The AAF removes any signal outside of the desired frequency band and prevents out-of-band signals from folding back into the desired band after the sampling process, reducing the "aliasing" effect. Then the filtered continuous time analog signal, $X_o(t)$, is sampled using a sample-and-hold circuit (S/H). The S/H operates at a sampling frequency ($f_s$) that is always higher than the Nyquist frequency [6].

Figure 2-1. Basic ADC block diagram

The Nyquist frequency or Nyquist rate is the minimum sampling rate that allows proper restoration of the sampled signal. After the continuous analog signal is converted into a discrete DC level and held at that level by the S/H circuit, the ADC starts conversion of the sampled signal into discrete digital codes. The result of the conversion is the digitized bits $(B_{N-1}...B_0)$, where (N) is the resolution of the ADC. The conversion process can be represented mathematically using equation (2.1).

$$X_a(t) = \left(\sum_{m=0}^{N-1} B_m 2^m + q_e\right) \times \Delta \tag{2.1}$$

Where $X_a(t)$ is the continuous time analog input signal, $(B_m 2^m)$ are the binary weighted digital bits, $(q_e)$ is the quantization error that results from the conversion, expressed in units of the quantization step, and $(\Delta)$ is the ADC quantization step.
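As an illustration of equation (2.1), the following minimal sketch models an ideal unipolar quantizer. The function names `quantize` and `reconstruct`, the rounding-to-nearest behavior, and the saturation at the top code are assumptions made for this example and are not taken from the original text.

```python
import math

def quantize(x, n_bits, full_scale):
    """Ideal N-bit quantizer (illustrative assumption: round to nearest level).

    Returns (bits, q_e): bits is [B_0, ..., B_{N-1}] (LSB first) and q_e is
    the quantization error in units of the step, so that
    x == (sum(B_m * 2**m) + q_e) * delta, as in equation (2.1).
    """
    delta = full_scale / 2 ** n_bits                 # quantization step
    code = math.floor(x / delta + 0.5)               # nearest quantization level
    code = max(0, min(code, 2 ** n_bits - 1))        # saturate at the code limits
    bits = [(code >> m) & 1 for m in range(n_bits)]  # binary weighted bits
    q_e = x / delta - code                           # residual error in LSBs
    return bits, q_e

def reconstruct(bits, q_e, n_bits, full_scale):
    """Evaluate the right-hand side of equation (2.1)."""
    delta = full_scale / 2 ** n_bits
    return (sum(b * 2 ** m for m, b in enumerate(bits)) + q_e) * delta

bits, q_e = quantize(0.7, n_bits=3, full_scale=1.0)
print(bits, q_e, reconstruct(bits, q_e, 3, 1.0))
```

For inputs inside the full scale range, the error term stays within half a quantization step, which is the property used in section 2.1.2.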

#### 2.1.1 Sample-and-Hold Circuits

The sample and hold (S/H) circuit, as is obvious from its name, is an analog circuit that samples the continuously varying analog signal and then holds its value as a constant DC level for a specified period of time. The S/H circuit can be considered an analog memory device. The main benefit of placing the S/H circuit before the ADC is to eliminate variations in the analog input signal $X_o(t)$ that can corrupt the data conversion process.

In most analog-to-digital converters, the S/H circuit samples the analog signal at a continuous rate of ( $f_s$ ). Frequency response of the sampled signal is given by equation (2.2).

$$X_s(f) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X_a \left( f - k f_s \right)$$
(2.2)

Where $(f_s)$ is the ADC sampling frequency, which is equal to the reciprocal of the sampling clock period $(T_s)$, and $X_a(f)$ is the frequency spectrum of the filtered analog input signal.

Data converters can be classified into two categories based on the ratio of their sampling rate ( $f_s$ ) to the bandwidth of the sampled signal. The first category is the Nyquist rate data converters, where the sampling rate is usually 3 to 20 times larger than the signal bandwidth. The second type is the oversampled data converters, where the sampling rate is much faster than the Nyquist rate, usually 4 to 1024 times (or more) faster. The ratio of the ADC sampling rate ( $f_s$ ) to the Nyquist rate is given by the over-sampling ratio (OSR). One of the main advantages of the over-sampling data converters is that the design specifications of the anti-aliasing filter are more relaxed compared to the brick wall shaped filters required for the Nyquist rate data converters.
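The two ideas above, spectral replicas at multiples of $f_s$ in equation (2.2) and the oversampling ratio, can be sketched numerically. This is a minimal illustration; the helper names `oversampling_ratio` and `alias_frequency` are hypothetical and the example frequencies are arbitrary.

```python
def oversampling_ratio(f_s, bandwidth):
    """OSR: sampling rate divided by the Nyquist rate (twice the bandwidth)."""
    return f_s / (2.0 * bandwidth)

def alias_frequency(f_in, f_s):
    """Frequency in [0, f_s/2] where a tone at f_in appears after sampling.

    Follows from equation (2.2): the sampled spectrum repeats at every
    multiple of f_s, so any tone folds back into the first Nyquist zone.
    """
    f = f_in % f_s
    return f_s - f if f > f_s / 2 else f

# A 70 MHz tone sampled at 100 MS/s folds back to 30 MHz.
print(alias_frequency(70e6, 100e6))
# Sampling a 200 MHz bandwidth at 400 MS/s corresponds to an OSR of one.
print(oversampling_ratio(400e6, 200e6))
```

The folding shown here is what the anti-aliasing filter has to suppress; a larger OSR moves the first replica further away and relaxes the filter transition band.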

#### 2.1.2 Quantization Error

During quantization, the continuous time analog signal with infinite signal levels loses some information as it is represented with quantized discrete levels that depend on the ADC resolution and input range. Thus, the quantization process adds an irreversible loss or error to the sampled analog signal. Figure 2-2 (a) shows the transfer characteristics of a 3-bit ADC, where the dashed line represents the analog input applied to the input of the ADC. The x-axis of the staircase represents the discrete analog levels used by the ADC quantizer to map the sampled analog signal to the digital domain. Figure 2-2 (b) shows the corresponding quantization error.



Figure 2-2. (a) ADC input/output characteristics; (b) ADC quantization error.

The number of quantization levels is given as:

$$\text{Number of quantization levels} = 2^{N} \tag{2.3}$$

Where N is the ADC resolution, which is equal to 3 in this case. Therefore, for the ADC of Figure 2-2, we have 8 different quantization levels. For an N-bit quantizer, least significant bit (LSB) equivalent analog quantization level ( $\Delta$ ) is defined as:

$$\Delta = \frac{X_{FS}}{2^N} \tag{2.4}$$

Where  $(X_{FS})$  is the full scale input range of the quantizer.

An analog input signal, $(X_j+q_e)$, is ideally quantized to $(X_j)$, where $(X_j)$ is the quantization level. The quantization error $(q_e)$ ideally lies in the range $(-\Delta/2 < q_e \le +\Delta/2)$. Typically, quantization error is a nonlinear function of the input signal level [7]. The quantization error can be modeled as white noise with a uniform distribution under certain conditions: (a) the input signal stays within the quantizer full scale range, since otherwise the quantizer saturates, (b) the input signal amplitude covers a wide range of quantization levels, (c) the quantization errors are not correlated, and (d) the quantizer has an infinite number of levels. Under these conditions, the quantization error will have the probability density function shown in Figure 2-3. It can also be represented mathematically as in equation (2.5).

$$P(q_e) = \begin{cases} \dfrac{1}{\Delta}, & \text{if } -\dfrac{\Delta}{2} < q_e < \dfrac{\Delta}{2} \\ 0, & \text{otherwise} \end{cases} \tag{2.5}$$

Where  $(q_e)$  is the quantization noise,  $P(q_e)$  is the quantization noise probability density function, and ( $\Delta$ ) is the ADC quantization step.




Figure 2-3. Quantization error probability density function.

The input-referred root-mean-square (RMS) value of the quantization error (or noise) can be represented as:

$$q_{e,rms} = \sqrt{\frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} q_e^2 \, dq_e} = \frac{\Delta}{\sqrt{12}} \tag{2.6}$$
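As a sketch of the intermediate steps behind equation (2.6), assuming the uniform probability density of equation (2.5), the mean-square quantization error can be worked out as follows:

$$q_{e,rms}^2 = \int_{-\Delta/2}^{\Delta/2} q_e^2 \, P(q_e) \, dq_e = \frac{1}{\Delta} \left[ \frac{q_e^3}{3} \right]_{-\Delta/2}^{\Delta/2} = \frac{1}{\Delta} \cdot \frac{\Delta^3}{12} = \frac{\Delta^2}{12}$$

Taking the square root gives $q_{e,rms} = \Delta/\sqrt{12}$.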

The RMS value of a full scale sinusoidal input signal for an N-bit ADC can be represented as:

$$X_{rms} = \frac{2^N \Delta}{2\sqrt{2}} \tag{2.7}$$

Therefore, the signal to noise ratio can be represented as the ratio of these two as:

$$SNR = 20 \log \frac{X_{rms}}{q_{e,rms}} = 20 \log \frac{\frac{2^N \Delta}{2\sqrt{2}}}{\frac{\Delta}{\sqrt{12}}} = 6.02N + 1.76 \, dB \tag{2.8}$$
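Equation (2.8) can be checked numerically with a short simulation. The sketch below quantizes a full scale sine wave with an ideal rounding quantizer and compares the measured SNR to the formula. The function names, the number of samples, and the choice of 127 input cycles are assumptions made for this example only.

```python
import math

def ideal_snr_db(n_bits):
    """Ideal SNR of an N-bit quantizer from equation (2.8)."""
    return 6.02 * n_bits + 1.76

def simulated_snr_db(n_bits, n_samples=4096, cycles=127):
    """SNR of an ideal rounding quantizer driven by a full scale sine.

    The input spans -1 to +1, so the full scale range is 2 and the
    quantization step is 2 / 2**n_bits. The cycle count is chosen coprime
    with the record length so that many distinct phases are exercised.
    """
    delta = 2.0 / 2 ** n_bits
    signal_power = 0.0
    error_power = 0.0
    for k in range(n_samples):
        x = math.sin(2 * math.pi * cycles * k / n_samples)
        x_q = delta * math.floor(x / delta + 0.5)   # round to nearest level
        signal_power += x * x
        error_power += (x_q - x) ** 2
    return 10 * math.log10(signal_power / error_power)

print(ideal_snr_db(8), simulated_snr_db(8))
```

The simulated value is expected to land close to the ideal one, with small deviations because the error of a quantized sine wave is only approximately uniform.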

## 2.2 Analog-to-Digital Converter Performance metrics

The analog-to-digital converter performance in the presence of circuit non-idealities can be determined by its performance metrics. The performance metrics can be divided into two categories: static and dynamic. In this section, methods used to evaluate the ADC performance metrics in the presence of circuit non-idealities will be illustrated.

#### 2.2.1 Static Performance Metrics for ADCs

Static ADC errors are the errors that are stationary over time and signal levels. They are the offset error, the gain error, the integral nonlinearity (INL) error, and the differential nonlinearity (DNL) error.

#### 2.2.1.1 Offset Error

Offset error, also known as 'zero-scale' error, quantifies how well the actual transfer curve matches the ideal transfer curve at a single point. For an ideal ADC quantizer, the first transition occurs at 0.5LSB above the zero input level. For a non-ideal ADC, the zero-scale voltage is applied to the analog input and increased until the first transition occurs. The offset error is then defined as the difference between the actual and the ideal values of the first code transition, as shown in Figure 2-4.



Figure 2-4. Transfer curve illustrating ADC offset error.

#### 2.2.1.2 Gain Error

The gain error of a non-ideal ADC quantifies how well the slope of the actual ADC transfer curve matches the slope of the ideal transfer curve, which has a gain of unity. Gain error is usually determined at the full scale code after the offset error is removed, as shown in Figure 2-5. It is expressed in LSB or as a percentage of the full-scale range (%FSR).



Figure 2-5. Transfer curve illustrating ADC gain error.

#### 2.2.1.3 Differential Nonlinearity (DNL) Error

For an ideal ADC, the analog input levels that trigger any two consecutive output codes should differ by exactly one LSB. After the offset and gain errors are removed, any deviation from this one LSB difference is defined as differential nonlinearity (DNL). Figure 2-6 illustrates the transfer curve of a non-ideal 3-bit ADC and the associated DNL errors. As shown in the figure, DNL is a good metric for identifying missing codes, which occur when the measured DNL is -1LSB.



Figure 2-6. Transfer curve illustrating DNL errors of a 3-bit non-ideal ADC.

#### 2.2.1.4 Integral Nonlinearity (INL) Error

The integral nonlinearity (INL) error in non-ideal ADCs is defined as the difference between the non-ideal ADC transition points and the ideal ADC transition points after removing the offset and gain errors. Figure 2-7 shows an INL error example of a 3-bit non-ideal ADC. INL can also be defined as the cumulative summation (integral) of the DNL error as given in equation (2.9).

$$INL[m] = \sum_{i=1}^{m-1} DNL[i]$$
 (2.9)
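The relationship in equation (2.9) can be illustrated with a short sketch that computes DNL from measured code transition levels and then accumulates it into INL. The function name `dnl_inl`, the zero-based indexing, and the sample transition levels are assumptions made for this example; offset and gain errors are assumed to have been removed already.

```python
def dnl_inl(transitions, lsb):
    """Compute DNL and INL (both in LSBs) from code transition levels.

    transitions[k] is the input level at which the k-th code transition
    occurs. DNL is the deviation of each code width from one LSB, and INL
    is the running sum of the DNL values, as in equation (2.9).
    """
    dnl = [(transitions[k + 1] - transitions[k]) / lsb - 1.0
           for k in range(len(transitions) - 1)]
    inl = []
    running_sum = 0.0
    for d in dnl:
        running_sum += d
        inl.append(running_sum)
    return dnl, inl

# Hypothetical transition levels, expressed in LSBs.
dnl, inl = dnl_inl([0.5, 1.5, 2.75, 3.5], lsb=1.0)
print(dnl)   # code widths relative to the ideal 1 LSB
print(inl)   # accumulated deviation from the ideal transfer curve
```

A code whose width shrinks to zero would show up here as a DNL of -1, which is the missing code condition described in section 2.2.1.3.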

#### 2.2.2 Dynamic Performance Metrics for ADCs

For non-ideal ADCs, dynamic performance for both Nyquist rate and oversampled ADCs is specified using parameters obtained via frequency-domain analysis.



Figure 2-7. Transfer curve illustrating 3-bit ADC INL error.

#### 2.2.2.1 Signal-to-Noise Ratio (SNR)

The SNR is defined as the ratio of the full scale RMS amplitude of the input signal to the RMS amplitude of the noise in a given band. The noise is not limited to the quantization noise; it also includes 1/f noise, thermal noise, and other noise introduced by the ADC itself. The ideal SNR of an N-bit ADC is given in equation (2.8).

#### 2.2.2.2 Signal-to-Noise and Distortion Ratio (SNDR or SINAD)

The SNDR is defined as the ratio of the full scale RMS input signal amplitude to the total RMS magnitude of the converter noise plus distortion at the converter output, when the input is a sinusoidal waveform. RMS noise plus distortion includes all spectral components up to the Nyquist frequency, excluding the fundamental and the DC offset. SNDR is typically expressed in dB and used to quantify the bit resolution performance of the ADC.

#### 2.2.2.3 Spurious-Free Dynamic Range (SFDR)

SFDR is the ratio of the RMS amplitude of the fundamental (maximum signal component) to the RMS magnitude of the next largest spurious harmonic component measured at the output of the ADC, excluding DC offset. SFDR is specified in decibels relative to the carrier (dBc).

#### 2.2.2.4 Total Harmonic Distortion (THD)

For ADCs, THD is the ratio of the RMS sum of the selected harmonics of the measured output signal to the fundamental itself. Only harmonics within the Nyquist limit are included in the measurement.

#### 2.2.2.5 Dynamic Range (DR)

Dynamic range of an ADC is defined as the ratio of the maximum output signal level to the minimum output signal level for which maximum SNR is achieved. The DR is always related to the ADC resolution. Dynamic range can be expressed mathematically as shown in equation (2.10).

$$DR = 20 \log \left[\frac{2^N - 1}{1}\right] \approx 6.02 \times N dB \qquad (2.10)$$

#### 2.2.2.6 Effective Number of Bits (ENOB)

The ENOB quantifies the ADC performance in terms of the number of bits of resolution it can effectively produce in a non-ideal environment at a specific input frequency and sampling rate. Ideally, the error of an ADC consists only of quantization noise. As the input frequency increases, the overall noise and distortion (particularly THD) also increase, thereby reducing the SNDR and ENOB. The ENOB is calculated from the SNDR when a full scale input is applied to the ADC. It is expressed mathematically as shown in equation (2.11).

$$ENOB = \frac{SNDR - 1.76 \, dB}{6.02} \tag{2.11}$$

#### 2.2.2.7 Figure-of-Merit (FOM)

The ADC FOM is used to quantify and compare different ADC topologies by combining power consumption, effective number of bits, and sampling frequency ($f_s$) in a single equation. The FOM also gives valuable information about the amount of energy required per conversion step. The FOM can be expressed mathematically as:

$$FOM = \frac{Power}{(2^{ENOB}) \times f_s} (pJ/step)$$
(2.12)
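Equations (2.11) and (2.12) are straightforward to evaluate together. The following minimal sketch does so; the function names and the example operating point (10 mW, 49.92 dB SNDR, 100 MS/s) are hypothetical values chosen only for illustration.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB, equation (2.11)."""
    return (sndr_db - 1.76) / 6.02

def fom_pj_per_step(power_w, sndr_db, f_s_hz):
    """Energy per conversion step in pJ, equation (2.12)."""
    energy_joule = power_w / (2 ** enob(sndr_db) * f_s_hz)
    return energy_joule * 1e12   # convert joule to picojoule

# Hypothetical ADC: 10 mW, 49.92 dB SNDR (8 effective bits), 100 MS/s.
print(enob(49.92))
print(fom_pj_per_step(10e-3, 49.92, 100e6))
```

Because the ENOB appears in the exponent, each lost effective bit doubles the FOM for the same power and sampling rate.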

## 2.3 Nyquist-Rate Data Converters

Nyquist rate ADCs usually sample the analog input at a minimum of 1.5 times the Nyquist rate, which is equal to double the bandwidth of the analog input signal. This type of ADC offers a wide range of bandwidths with low-to-medium bit resolutions. Nyquist rate ADCs are subject to circuit non-idealities that directly affect their performance metrics. These non-idealities emanate not only from component mismatches due to environmental or process variations, but also from different types of noise, comparator hysteresis, and finite operational amplifier (OPAMP) gain and bandwidth [8]. Different techniques were developed to correct and compensate for such non-idealities [9]. Based on the application requirements, the designer should select the appropriate ADC topology to satisfy the required resolution, power consumption, area, and bandwidth specifications. This section examines different types of ADC topologies and discusses the existing trade-offs among accuracy, power, speed, and area.

#### 2.3.1 Flash ADC

As its name "flash" (it is sometimes also called a parallel ADC) suggests, this topology has the highest speed among the ADC topologies. The flash ADC converts the analog signal into a digital code by comparing the sampled analog input with fixed analog reference levels, as shown in Figure 2-8. The number of fixed reference levels determines the resolution of the digital output code. The flash ADC uses parallelism to achieve a high quantization rate at the cost of a large number of comparators. To achieve N-bit resolution, the input signal level is simultaneously compared against ($2^{N}-1$) reference levels, usually generated by a resistor ladder, using ($2^{N}-1$) comparators. The comparators compare the input against the reference levels and generate the digital output in a thermometer code format with a size of $2^{N}-1$ bits. The thermometer code is then converted into an N-bit binary code using a thermometer-to-binary decoder.

The main advantage of the flash ADC structure is that it can achieve very high conversion speed, since all ($2^{N}-1$) comparators complete the quantization in one clock cycle. The conversion period is the time for one comparison plus the time required for the digital processing. Due to its superior speed performance, the flash topology has been used for high-speed ADCs with conversion rates of more than 500MHz [10] [11] [12].
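The flash conversion described above can be sketched as a behavioral model. In this minimal example the reference taps are assumed to sit at $k \cdot V_{ref}/2^{N}$ for an ideal resistor ladder, and the thermometer-to-binary decoder is modeled as a simple count of ones; both choices, as well as the function name `flash_adc`, are assumptions made for illustration and ignore comparator offset.

```python
def flash_adc(v_in, v_ref, n_bits):
    """Behavioral model of an ideal N-bit flash ADC.

    Returns (thermometer, code): the 2**N - 1 comparator outputs and the
    decoded binary code.
    """
    levels = 2 ** n_bits
    # Ideal ladder taps: k * V_ref / 2**N for k = 1 .. 2**N - 1 (assumed).
    references = [k * v_ref / levels for k in range(1, levels)]
    # All comparators decide at the same time in a real flash ADC.
    thermometer = [1 if v_in > ref else 0 for ref in references]
    # Ideal thermometer-to-binary decoding: count the number of ones.
    code = sum(thermometer)
    return thermometer, code

thermometer, code = flash_adc(0.7, v_ref=1.0, n_bits=3)
print(thermometer, format(code, "03b"))
```

Note that the list of references, and therefore the comparator count, doubles with every added bit, which is the exponential growth discussed in the following paragraphs.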



Figure 2-8. Flash analog-to-digital converter.

On the other hand, the flash topology has many drawbacks. It suffers from limited resolution, because the number of comparators and reference ladder resistors grows exponentially with the resolution, making it less attractive for energy efficient applications. For an N-bit flash ADC, ($2^{N}-1$) comparators and resistors are required. Therefore, the area and power consumption increase rapidly with resolution. Hybrid flash ADC topologies, such as interpolating [13] and folding [14] flash ADCs, can reduce the number of preamplifiers and latches, but the exponential growth of the number of comparators remains a fundamental problem of the flash topology.

In addition to the high power consumption and large area requirements of the flash ADC topology, many circuit non-idealities severely affect its performance. The random offset error of the ($2^{N}-1$) comparators is the major contributor. To achieve N-bit resolution, the offset of each comparator has to be corrected to within the analog LSB of the ADC (as given in equation (2.4)) using offset correction techniques [15]. The size of each comparator can be scaled down if offset correction circuits are used. However, the additional area required for each offset correction circuit would still increase the overall area while adding design complexity, making the design of a flash ADC not only more challenging but also power inefficient.

Another serious issue with the flash ADC topology is the high input capacitive loading caused by the large number of comparators that are connected to the sample and hold circuit at the same time. This limits the input signal bandwidth while rendering it impractical to increase the bandwidth by time interleaving many flash ADCs.

It is also worth mentioning that the 2-bit flash ADC topology is a special case, because it needs only 3 comparators, which allows energy-efficient operation. Moreover, according to [16], for low resolutions the flash topology might be more energy efficient than binary search ADC topologies such as successive approximation register (SAR) ADCs, which are known as the most energy efficient ADCs, as will be shown later in this chapter.

#### 2.3.2 Successive Approximation Register (SAR) ADC

The successive approximation register (SAR) ADC topology is widely used because of its energy efficient operation. The power consumed by the SAR ADC scales linearly with its resolution, making it the best choice for medium resolution applications. The SAR ADC uses a single comparator to quantize the analog input in a feedback loop, achieving area and energy efficiency while trading off conversion speed. As CMOS process technology scales down, the ability to achieve faster processing speeds in SAR ADCs becomes more and more pronounced.

The classical topology of a SAR ADC is shown in Figure 2-9. It consists of a comparator, an N-bit digital-to-analog converter (DAC), and digital SAR logic that implements the successive approximation (SA) or binary search algorithm. Unlike flash ADCs, SAR ADCs are sequential converters: in each clock period one bit is quantized, proceeding from the most significant bit (MSB) to the least significant bit (LSB). The SA algorithm uses N quantization cycles to find the closest quantization step to the sampled input signal.



Figure 2-9. General Architecture of SAR ADC

For a better understanding of the SAR ADC operation, consider a 3-bit SAR ADC and a sampled input signal $V_{in}$ between $5V_{ref}/8$ and $6V_{ref}/8$, as shown in Figure 2-10. In the first clock cycle, the analog input signal ($V_{in}$) is sampled at the input of the ADC. Then, the DAC inputs D<2:0> are set to the middle code <100> to generate a reference voltage equal to the middle step ($V_{ref}/2$). If the input signal is larger than ($V_{ref}/2$), the MSB remains 1; otherwise, it is set to 0. In this example, D<2> remains 1. In the next clock cycle, the DAC input is set to <110> to generate a voltage equal to ($3V_{ref}/4$). Again, if the input is larger than ($3V_{ref}/4$), D<1> remains 1; otherwise, it is set to 0. In this example, D<1> is set to 0 because the generated reference voltage is larger than the sampled input voltage. During the last iteration, the DAC input is set to <101> to generate a voltage equal to ($5V_{ref}/8$), and the LSB D<0> is determined through the comparison. In this example, the final output digital code is D<2:0> = <101>. From this example, we can conclude that the analog input signal is sampled and quantized in 4 clock cycles. In general, SAR ADCs require (N+1) clock cycles to complete one conversion, as opposed to one for the flash ADC.



Figure 2-10. 3-bit SAR ADC operation
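The 3-bit example above can be reproduced with a short behavioral sketch of the successive approximation algorithm. The DAC is assumed ideal, and the function name `sar_convert` and the recorded trace format are choices made for this illustration only.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Behavioral model of the successive approximation (binary search) loop.

    Returns (code, trace), where trace lists for each cycle the trial DAC
    code, the DAC output voltage, and the comparator decision.
    """
    code = 0
    trace = []
    for bit in range(n_bits - 1, -1, -1):          # MSB first, LSB last
        trial = code | (1 << bit)                  # tentatively set this bit
        v_dac = trial * v_ref / 2 ** n_bits        # ideal DAC output
        keep = v_in > v_dac                        # comparator decision
        if keep:
            code = trial                           # bit stays at 1
        trace.append((format(trial, f"0{n_bits}b"), v_dac, keep))
    return code, trace

# Input between 5*Vref/8 and 6*Vref/8, as in the example above.
code, trace = sar_convert(0.7, v_ref=1.0, n_bits=3)
for trial, v_dac, keep in trace:
    print(trial, v_dac, keep)
print(format(code, "03b"))
```

The loop body runs once per bit, which reflects the N quantization cycles of the algorithm; the additional sampling cycle is not modeled here.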

The advantage of the SAR ADC is that it requires few analog components to implement an N-bit ADC, which results in a simple design, efficient power consumption, and compact silicon area. However, there are limitations on the maximum operating frequency of the SAR ADC. For an N-bit SAR ADC, a clock frequency of (N+1) times the Nyquist rate is required for sampling and converting the analog input signal. For example, digitizing a 200MHz analog signal with an 8-bit SAR ADC running at the Nyquist rate (an OSR of one) means sampling at 400MHz, so a clock of (8+1) × 400MHz = 3.6GHz is required. Generating and routing an on-chip 3.6GHz clock consumes a huge amount of dynamic power compared to the power consumed by the SAR ADC itself.

Another serious issue that limits the maximum operating frequency of a SAR ADC is the DAC settling. The speed of the SA operation is determined by the total delay of the feedback loop. Therefore, the time of each conversion cycle is equal to the sum of the comparator quantization delay, the digital SAR logic propagation delay, and the DAC settling time, which is the largest delay in the loop. As the sampling frequency increases, accomplishing DAC settling in the allowed time frame becomes more challenging. If the DAC output is not settled properly at any bit quantization level other than the LSB level, the output code will have an error larger than 1 LSB, which has a negative impact on the overall ADC performance.

In recent years, much design effort has been spent on improving the SAR operating speed while maintaining its low energy consumption [17] [18] [19] [20].

#### 2.3.3 Pipelined ADC

The pipelined ADC topology saves area by operating low resolution sub-ADCs in series while taking advantage of the parallelism among the pipelined stages. Thus, a pipelined ADC consists of a cascade of low resolution ADC stages. Each pipeline stage is composed of a sample-and-hold amplifier (SHA), a low resolution sub-ADC and sub-DAC, an operational amplifier (OPAMP) based residue amplifier, and a subtractor, as shown in Figure 2-11.

Each stage processes its input as follows: First, it samples the analog input  $V_i$  using the SHA which works as an analog memory between the pipelined stages [21]. Then the n-bit sub-ADC quantizes the sampled analog input ( $V_i$ ) and feeds its digital output to the sub-DAC which converts the ADC outputs into an analog signal. The analog output of the DAC is subtracted from the sampled analog input signal. The subtractor output is considered as the "residue" or the unconverted part of the input signal. The OPAMP based residue amplifier with the gain of  $2^n$  amplifies the residue signal which is often brought back to full-scale level, and becomes the analog input to the next pipeline stage ( $V_{i+1}$ ). The combined digital outputs of all sub-ADCs form the final ADC digital output code.
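The per-stage processing described above can be sketched as a behavioral model. The example below assumes 1-bit stages (n = 1, so the residue gain is 2), ideal components, and a unipolar input range from 0 to $V_{ref}$; the function name `pipeline_convert` is hypothetical. The loop walks one sample through all stages in sequence, whereas in hardware the stages work concurrently on successive samples.

```python
def pipeline_convert(v_in, v_ref, n_stages):
    """Behavioral model of an ideal pipelined ADC with 1-bit stages.

    Returns the list of stage decisions, MSB first.
    """
    bits = []
    residue = v_in                                  # input of the first stage
    for _ in range(n_stages):
        bit = 1 if residue >= v_ref / 2 else 0      # 1-bit sub-ADC
        v_dac = bit * v_ref / 2                     # 1-bit sub-DAC
        residue = 2 * (residue - v_dac)             # subtract, then gain of 2**1
        bits.append(bit)                            # residue feeds the next stage
    return bits

print(pipeline_convert(0.7, v_ref=1.0, n_stages=3))
```

Because every later decision is made on the amplified residue, an error in the first stage subtraction or gain propagates to all following stages, which is why the first stage accuracy is the most critical.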



Figure 2-11. General architecture of pipelined ADC

For an N-bit pipelined ADC, (N/n) pipeline stages are required, where n is the resolution of the sub-ADC and sub-DAC in each stage. The pipelined ADC reduces the number of comparators required by a flash ADC by breaking the flash operation into multiple steps [22]. Since all the pipelined stages work concurrently, the conversion speed of a pipelined ADC is high. Moreover, by using the pipelined ADC topology, a high-resolution converter can be broken into multiple stages, which greatly reduces the total number of comparators when compared to the flash ADC topology.

Despite the smaller area and power consumption of the pipelined ADC topology compared to flash ADCs, pipelined ADCs have many disadvantages. The main disadvantage emanates from the strict requirements on the gain, slew rate (SR), and bandwidth of the OPAMP residue amplifier. Moreover, the accuracy of the first stage, which resolves the MSB, is critical, since a slight error in this stage propagates through the converter and results in large errors. Also, the linearity of each sub-DAC must be better than the resolution of the remaining stages. This means that the sub-DAC in the first stage needs to have an accuracy higher than the resolution of the whole ADC.

Figure 2-12 shows a comparison between the three Nyquist rate ADCs: flash, pipelined, and SAR. The comparison is mainly focused on the speed and power consumption for N-bit resolution. Since flash ADCs possess the highest speed performance and finish the conversion in a single clock cycle, the conversion speeds of the others are normalized to that of the flash ADC. Also, since a flash ADC uses ($2^{N}-1$) comparators for N-bit quantization, we can roughly consider its power consumption as $2^{N}$. The pipelined ADC avoids the exponential increase of the number of comparators in flash ADCs. For N bits, a pipelined ADC with 1-bit-per-stage quantization uses N comparators. However, because of the OPAMP residue amplifier used in every stage, we consider the power consumed by the pipelined ADC to be larger than N, with a speed of less than 1. In the SAR ADC, the conversion is divided into several comparison cycles as in the pipelined ADC, but the SA algorithm runs sequentially rather than in parallel as in the pipelined and flash ADCs. Since the SAR ADC uses only a single comparator to quantize N bits, we can normalize its power consumption roughly to 1 and its speed to 1/N [23]. The ratio of power to speed represents the total energy consumed per conversion. It is clear that for high resolutions [16], SAR ADCs are the most power efficient but have the lowest speed because of their sequential operation. This conclusion was the starting point for new research approaches that aim to remove the speed limitation of Nyquist-rate ADCs. Some of these approaches are time interleaved structures [24] [25], hybrid structures [26], and asynchronous processing structures [27] [28].



Figure 2-12. Comparison between commonly used Nyquist rate ADCs

#### 2.3.4 Asynchronous Processing ADC

One of the research approaches towards increasing the conversion efficiency of Nyquist rate ADCs is asynchronous clocking of the ADCs. The main target of such an approach is to push the speed and power limits of ADCs to achieve further improvements in their conversion efficiency. Different techniques were used to achieve asynchronous processing [29]. Some examples will be discussed in detail in Chapter 3 and Chapter 4.

## 2.4 Oversampling Data Converters

As mentioned in section 2.1.1, oversampling data converters sample the analog input at a rate much higher than the Nyquist rate; the ratio between the two defines the oversampling ratio (OSR). Because of their lower sensitivity to circuit non-idealities, oversampling data converters have been the best choice for low-speed and high-resolution applications. Most digital audio systems that require a resolution of 18 bits or higher rely on oversampling ADCs. This is because digital signal processing circuits are used instead of complex and precise analog signal processing circuits, which are minimized in oversampling ADCs. Thus, the accuracy of the converter does not depend on the component non-idealities. Unlike Nyquist rate converters, oversampling converters do not need precise anti-alias filtering. They also do not require a dedicated sample and hold circuit, because the comparison is performed by a modulator and the encoding is in most cases done using a digital filter.

#### 2.4.1 Over Sampling Concept

The concept of over sampling can be used to reduce the in-band quantization noise of data converters. The analog signal is sampled at a rate higher than the Nyquist rate, and the ratio between the two defines the oversampling ratio (OSR). For a better understanding of how over sampling improves the converter resolution, consider the quantization noise discussed in section 2.1.2 as having equal probability of lying anywhere in the range of $\pm 0.5\Delta$, with the RMS value given in equation (2.6). Also assume that all the power is in the positive range of frequencies. When a quantized signal is sampled at frequency $f_s$, all of its power folds into the frequency band ($0 \le f \le f_s/2$). Treating the quantization noise as white noise, the spectral density of the sampled noise can be given by [30]:

$$E(f) = q_{e,rms} \sqrt{\frac{2}{f_s}}$$
(2.13)

Therefore, the noise power that falls into the signal band can be given by:

$$P_{q_e} = \int_{0}^{f_{in}} E^2(f) \, df = q_{e,rms}^2 \left(\frac{2f_{in}}{f_s}\right) = \frac{q_{e,rms}^2}{OSR} \tag{2.14}$$

Assuming a sine wave input, the signal power can be given by:

$$P_S = \frac{\Delta^2 2^{2N}}{8} \tag{2.15}$$

Therefore, we can consider the maximum signal-to-quantization noise as:

$$SQNR_{max} = 10 \log\left(\frac{P_S}{P_{q_e}}\right) = 6.02N + 1.76 + 10 \log(OSR) \quad (dB) \tag{2.16}$$

The last term in equation (2.16) is the enhancement caused by over sampling. Doubling the OSR improves the SQNR by 3dB, which is equivalent to 0.5 bit of resolution. The drawback of this improvement is the increased power consumption of the quantizer. Therefore, increasing the OSR alone does not efficiently improve the ADC performance. To overcome this drawback, delta-sigma ( $\Delta\Sigma$ ) (also referred to as sigma-delta ( $\Sigma\Delta$ )) modulators were introduced.
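The effect of the OSR term in equation (2.16) can be evaluated with a few lines of code. This is a minimal sketch; the function name `sqnr_max_db` and the example resolution of 8 bits are assumptions made for illustration.

```python
import math

def sqnr_max_db(n_bits, osr):
    """Maximum signal-to-quantization-noise ratio in dB, equation (2.16)."""
    return 6.02 * n_bits + 1.76 + 10 * math.log10(osr)

base = sqnr_max_db(8, 1)
for osr in (1, 2, 4, 16, 64):
    gain_db = sqnr_max_db(8, osr) - base
    extra_bits = gain_db / 6.02        # each bit is worth about 6.02 dB
    print(osr, round(gain_db, 2), round(extra_bits, 2))
```

Each doubling of the OSR adds about 3 dB, so gaining a single extra bit through oversampling alone requires a fourfold increase of the sampling rate.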

The advantages of oversampling are that it relaxes the transition band requirements of the anti-aliasing filter, reduces the baseband quantization noise power, and utilizes low cost digital filtering. The main disadvantage of oversampling ADCs is that they trade speed for resolution.

#### 2.4.2 Sigma-Delta ( $\Sigma\Delta$ ) Modulators

The $\Sigma\Delta$ modulator uses the oversampling technique in conjunction with another technique called "noise shaping" [31]. Both techniques increase the SNDR of the converter, which translates into increased effective bit resolution or ENOB. As the order of the $\Sigma\Delta$ modulator increases, the bit resolution increases but its stability becomes an issue.

Figure 2-13 shows the architecture of a first-order $\Sigma\Delta$ modulator. The converter consists of a 1-bit DAC, a 1-bit ADC (a comparator), an analog integrator circuit, and a digital decimation filter.

The difference between the input signal and the DAC output is fed to the integrator, and then the integrated signal is quantized by the 1-bit ADC. The negative feedback loop in the $\Sigma\Delta$ architecture forces the time-average of the quantizer output to converge to the input signal value. The digital filter then decimates the quantizer output and removes the out-of-band noise.



Figure 2-13. Conventional first order sigma-delta ADC.
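The feedback behavior described above can be sketched with a discrete-time behavioral model of a first-order modulator. The model assumes a DC input between -1 and +1, a 1-bit DAC with output levels of ±1, and a plain average standing in for the decimation filter; these choices and the function name are assumptions made for this example.

```python
def sigma_delta_first_order(x, n_samples):
    """Behavioral model of a first-order sigma-delta modulator.

    x is a DC input in the open interval (-1, +1). Returns the 1-bit output
    stream represented as +1.0 / -1.0 values.
    """
    integrator = 0.0
    feedback = 0.0                       # DAC output, zero before the first decision
    stream = []
    for _ in range(n_samples):
        integrator += x - feedback       # integrate input minus DAC output
        out = 1.0 if integrator >= 0 else -1.0   # 1-bit quantizer (comparator)
        feedback = out                   # 1-bit DAC closes the loop
        stream.append(out)
    return stream

stream = sigma_delta_first_order(0.3, 10000)
average = sum(stream) / len(stream)      # crude stand-in for the decimation filter
print(average)
```

Since the integrator state stays bounded, the running average of the bit stream approaches the input value as more samples are accumulated, which is the convergence property of the negative feedback loop.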

## 2.5 Time Interleaved (TI) Structures

In a time-interleaved ADC, high speed operation is achieved by using relatively low speed sub-ADCs working in parallel, as shown in Figure 2-14. If M sub-ADCs are time-interleaved such that each sub-ADC operates at a conversion rate of $f_s$, the effective ADC speed is M times the conversion rate of a single sub-ADC.

Time-interleaving has been proposed to achieve very high conversion rates that are beyond the limits of state-of-the-art single ADCs.

There are many practical implementation issues related to time-interleaved ADC structures. These arise from channel-to-channel mismatches caused by offset, gain, and time-skew variations.

Moreover, the sample-and-hold (S/H) circuit implementation in TI structures is always challenging. A single front-end S/H circuit can be placed to drive all channels. In this case, the S/H has to operate at the effective speed ( $Mf_s$ ), and the capacitive loading of the M parallel channels limits the S/H bandwidth, making its design a challenging task. An alternative is to place a separate S/H circuit in each channel, which relaxes both the capacitive loading and the speed by a factor of M. However, mismatch among the M S/H circuits can degrade the SNDR of the overall TI ADC. Consequently, calibration circuits are always required for this kind of ADC.



Figure 2-14. Simplified TI ADC architecture with timing diagram
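A minimal sketch of the round-robin operation and of how channel mismatch enters the output is given below. It models only per-channel offset and gain errors (timing skew is omitted), and the function names and numerical values are illustrative assumptions.

```python
def effective_rate(sub_adc_rate_hz, m):
    """Effective conversion rate of M interleaved sub-ADCs."""
    return m * sub_adc_rate_hz


def time_interleave(analog_samples, offsets, gains):
    """Convert samples with M sub-ADCs in round-robin order.

    Sample n is handled by channel n mod M, which applies that channel's
    gain and offset error. Timing skew is not modeled.
    """
    m = len(offsets)
    out = []
    for n, v in enumerate(analog_samples):
        ch = n % m
        out.append(gains[ch] * v + offsets[ch])
    return out


if __name__ == "__main__":
    # Four sub-ADCs at 250 MS/s each (illustrative numbers).
    print("effective rate:", effective_rate(250e6, 4), "S/s")
    # A zero input exposes the offset mismatch as a pattern repeating every M samples.
    print(time_interleave([0.0] * 8, [0.0, 0.01, -0.01, 0.0], [1.0] * 4))
```

With a zero input, the output repeats the channel offsets with a period of M samples, which is how offset mismatch appears as fixed-pattern tones in the spectrum of a TI ADC.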

# **CHAPTER 3** – Comparators Offset Correction Techniques

As discussed in the previous chapter, most data converters suffer from circuit non-idealities. Examples of such non-idealities are the input-referred DC offset of dynamic comparators and OPAMPs, clock jitter, kT/C noise, kickback noise, clock feedthrough, and quantizer resolution. These non-idealities affect the overall performance of the data converter. The non-linear input-referred DC offset can be considered the most important issue affecting ADC performance metrics, especially in architectures that use multiple quantizers, such as flash ADCs [32]. This chapter discusses the causes of offset in circuits, the sensitivity of ADCs to various offsets, and different techniques for cancelling offsets in critical circuits, specifically dynamic latched comparators.

## 3.1 Introduction

Comparators are considered the most important blocks in ADCs; they are widely used as the decision-making circuits that convert analog-domain signals into their digital-domain counterparts. Comparators can be designed to work continuously (static), without requiring a clock signal to initiate the comparison, or to be clocked (dynamic).

In recent years, dynamic comparators have been used to minimize power consumption without trading off speed. Minimum-size transistors are often used in dynamic comparator circuits to reduce parasitic node capacitances, thus increasing the speed and bandwidth while reducing power consumption. Dynamic comparators typically use positive feedback to achieve high gain, high speed, low power, and full signal swing at the outputs. Because supply current is drawn only when the clock signal is activated, mainly dynamic power is consumed, during the comparison or evaluation phase.

The key performance parameter of any type of comparator is the input-referred random offset, which results mainly from the use of minimum-size transistors; these exhibit large physical and electrical mismatches even when they are built close to each other on silicon. These mismatches can be categorized as static, such as threshold voltage $(V_{th})$ and transconductance ( $\beta$ ) mismatch, or dynamic. Dynamic clocked comparators often exhibit larger offset than static comparators because, in addition to static transistor mismatches, they also suffer from dynamic mismatches caused by imbalance of the parasitic capacitances between the internal nodes during the evaluation phase [33]. With technology scaling, both static and dynamic mismatches of the transistors affect the offset performance more severely and cannot be relieved by layout techniques. It was reported in [34] that a capacitive imbalance of only 1 fF can lead to offset voltages of several tens of millivolts in a typical 0.18- $\mu$ m CMOS latch. A detailed mathematical derivation of the mismatch sensitivity of a latched comparator using a perturbation method can be found in [35].

Variation could occur during the fabrication process of the metal oxide semiconductor (MOS) transistors. As a result of this, the parameters of two identical devices on an integrated circuit show a random variation that is considered as device mismatch [36]. The static mismatch of MOS transistor's threshold voltage ( $V_{th}$ ) and the current gain ( $\beta$ ) can be modeled as in the following equations (3.1), (3.2) [8] [37].

$$\sigma_{V_{th}} = \frac{A_{V_{th}}}{\sqrt{WL}} \tag{3.1}$$

$$\sigma_{\beta} = \frac{A_{\beta}}{\sqrt{WL}} \tag{3.2}$$

Where  $\sigma_{Vth}$  is the standard deviation of the threshold voltage,  $\sigma_{\beta}$  is the standard deviation of the current gain,  $A_{Vth}$  and  $A_{\beta}$  are process dependent parameters, W is the transistor channel width, and L is the transistor channel length.
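To make the area dependence in equations (3.1) and (3.2) concrete, the sketch below evaluates the mismatch standard deviation for a few device sizes. The coefficient value used (9 mV·µm) is an assumed, illustrative number and not a parameter of any process discussed here.

```python
import math


def pelgrom_sigma(a_coeff, w_um, l_um):
    """Standard deviation of a matched-pair parameter: sigma = A / sqrt(W*L).

    a_coeff is the process-dependent coefficient (for example A_Vth in mV*um),
    w_um and l_um are the channel width and length in micrometers.
    """
    return a_coeff / math.sqrt(w_um * l_um)


if __name__ == "__main__":
    A_VTH_MV_UM = 9.0  # assumed illustrative coefficient, not from the text
    for w, l in ((1.0, 1.0), (2.0, 2.0), (4.0, 4.0)):
        sigma = pelgrom_sigma(A_VTH_MV_UM, w, l)
        print(f"W={w} um, L={l} um -> sigma_Vth = {sigma:.2f} mV")
```

Halving the mismatch requires four times the gate area, which is why offset cannot be reduced by sizing alone without paying in parasitic capacitance and speed.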

Random parameter mismatch of the MOS transistors in the dynamic latched comparators of ADCs leads to random DC offsets. As a result, ADC topologies that use multiple comparators with different random offsets suffer direct degradation of performance parameters, including differential nonlinearity (DNL), integral nonlinearity (INL), and signal-to-noise-and-distortion ratio (SNDR).

## 3.2 Comparator Offset Influence on ADC Performance

As the most important blocks in most ADCs, comparators have a large influence on overall ADC performance, including SNDR, ENOB, INL, DNL, and monotonicity. The accuracy of a comparator is often associated with its input-referred offset voltage. As minimum feature sizes scale down, the random offset of comparators impacts the yield of ADCs more severely.

Figure 3-1 shows the symbol, circuit model, and the transfer characteristics of ideal and non-ideal comparator. Gain of the comparator determines its resolution (noise margin or uncertainty region) which is equal to  $(V_{IH}-V_{IL})$ . In Figure 3-1 (a), the resolution is centered on the reference ground voltage, while in Figure 3-1 (b), the transfer curve is shifted due to the dc offset introduced by the random process mismatch.



Figure 3-1. (a) Ideal comparator characteristics, (b) comparator characteristics with offset.

ADCs can be categorized into offset-immune and offset-sensitive topologies.

#### 3.2.1 Offset Immune ADCs

All single-channel, single-comparator ADC topologies, such as successive approximation register (SAR) and first-order delta-sigma ( $\Delta\Sigma$ ) ADCs, are considered offset-immune structures. This is because the offset of a single decision-making comparator only shifts the transfer curve in the direction of the offset polarity. If the comparator offset is static, every quantized digital output carries the same dc offset, and the ADC characteristic remains linear.

#### 3.2.2 Offset Sensitive ADCs

All single-channel topologies with multiple comparators and all multiple-channel ADC topologies are considered offset-sensitive structures. As the number of comparators in an ADC topology increases, more random offset variation is observed, resulting in more nonlinear ADC operation.
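The difference between the two categories can be illustrated with a small flash-ADC threshold model. The sketch assumes ideal reference levels at multiples of one LSB, adds a static offset to each comparator threshold, and computes DNL from the resulting code widths; the 2-bit resolution, reference voltage, and offset values are arbitrary illustrative choices.

```python
def flash_thresholds(n_bits, v_ref, offsets):
    """Comparator thresholds of a flash ADC with per-comparator offsets.

    Returns (thresholds, lsb). There are 2**n_bits - 1 comparators; the ideal
    threshold of comparator k is (k + 1) * LSB.
    """
    n_comp = 2 ** n_bits - 1
    if len(offsets) != n_comp:
        raise ValueError("one offset per comparator is required")
    lsb = v_ref / 2 ** n_bits
    return [(k + 1) * lsb + offsets[k] for k in range(n_comp)], lsb


def dnl(thresholds, lsb):
    """DNL in LSB for the inner codes: actual code width over LSB, minus one."""
    return [(thresholds[k + 1] - thresholds[k]) / lsb - 1.0
            for k in range(len(thresholds) - 1)]


if __name__ == "__main__":
    # Same offset on every comparator: the curve shifts but stays linear.
    th, lsb = flash_thresholds(2, 1.0, [0.05, 0.05, 0.05])
    print("equal offsets   ->", dnl(th, lsb))
    # Different offsets: code widths change and DNL appears.
    th, lsb = flash_thresholds(2, 1.0, [0.05, -0.05, 0.0])
    print("unequal offsets ->", dnl(th, lsb))
```

An identical offset on all thresholds leaves the code widths unchanged, which mirrors the single-comparator case, whereas unequal offsets directly alter the code widths.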

ADCs cannot achieve speeds beyond the process limits, except through novel topologies that use parallel structures, such as time-interleaved ADCs and asynchronous structures. Unfortunately, most of these structures rely on multiple comparators, which have large random offset variations. Implementing offset correction circuits has therefore become very important for maintaining ADC performance metrics.

## 3.3 Offset Correction Approaches

Correcting the random offset variation of comparators has become more necessary because most novel ADC structures rely on multiple comparators, from N comparators in asynchronous ADC structures to $(2^{N}-1)$ in flash ADCs.

Offset correction techniques can be divided into two categories: background and foreground. In background correction, the correction is applied during regular ADC or comparison operation, and the correction value needs to be refreshed periodically. In foreground correction, the correction is applied only once, during startup of the ADC or comparison phase, and the correction amount is determined before regular operation.

#### 3.3.1 Background Calibration Techniques

Background calibration can correct both static and dynamic offsets. It requires an extra clock cycle per conversion, which limits its use to low-speed ADC applications. Some common background calibration techniques are explained below.

#### 3.3.1.1 Offset Averaging Technique

This technique is limited to flash ADCs, where it averages the random offsets using lateral resistors between the outputs of adjacent preamplifiers and the comparators. It degrades speed because the ADC is affected by the parasitic capacitances of the averaging resistors [38]. Moreover, it is not suitable for correcting large offsets [39].

#### 3.3.1.2 *Auto-Zeroing Technique*

In this technique, the comparator offset is sampled by switched capacitors on the input side of the comparator and subtracted from the input signal, thus cancelling the offset in the next clock cycle [40]. One disadvantage of this technique is that it needs several switched capacitors, which limit the sampling rate and operating bandwidth. Also, it can be used only with clocked comparators. Thus, it does not allow continuous conversion in flash ADCs, preventing high-speed operation. Offset sampling techniques introduce additional capacitance in the signal path. They also do not cancel the offset voltages of latched comparators completely, because the cancellation is limited by the charge-injection mismatches of the MOS switches used by the correction circuits.

#### 3.3.1.3 Correlated Double Sampling (CDS) Technique

The CDS technique is an auto-zeroing technique followed by an analog sample-and-hold circuit [41]. This technique compensates for the input-referred offsets as well as the 1/f noise, which helps increase the input common-mode range of the comparator. Its disadvantages are that it needs input capacitors, which limit the sampling rate and the bandwidth, and that it needs complex clocking with three non-overlapping clocks. In addition, the correction depends strongly on the signal amplitude, limiting the output swing [42]. This technique is not suitable for continuous-time comparators or high-speed ADCs.

#### 3.3.1.4 *Chopper Stabilization Technique*

In the random chopping calibration technique, the polarity and random offset of a chopping comparator are detected by monitoring the digital output code density, and binary feedback is then used to digitally compensate the offset until it is minimized [43]. One disadvantage of the chopping calibration technique is that it needs a chopping circuit composed of four switches in front of the comparator. These analog switches in the analog signal path may cause signal distortion and degrade the signal bandwidth. Finally, the chopping technique cannot be used for continuous-time comparators or high-speed ADCs.

#### 3.3.2 **Foreground Calibration Techniques**

The foreground calibration can correct only for the static offsets. Implementing the foreground calibration is not as complicated as the background calibration. Moreover the foreground calibration can be implemented with any ADC structure and is more suitable for high-speed ADCs. Some common techniques used for foreground calibration are explained below.

#### 3.3.2.1 *Threshold Tuning using Floating-Gate Transistors*

This technique uses a nonvolatile floating-gate charge storage element to store and trim the threshold voltage of the comparator transistors in order to correct the offset voltages [44]. The main disadvantage of this technique is that it requires special process steps to build stacked gates, one of which is floating, which makes it process dependent.

#### 3.3.2.2 Redundancy and Re-Assignment Technique

This technique is mainly used in flash ADCs [45]. More than two comparators are integrated for each reference voltage in the flash topology, but only one is chosen to be active in normal operation while the rest are powered off. During startup, a core calibration circuit assigns, from the redundant comparators, the one with the least offset drift from the reference voltage. The main disadvantage of this calibration technique is that it needs complex correction circuits (e.g., a summing encoder), which cause speed degradation. In addition, using a large number of redundant comparators requires a large silicon area.

#### 3.3.2.3 Digitally Controlled Trimming Techniques

Digitally controlled trimming of the offset voltage is a popular calibration technique developed for high-speed ADCs, and it can be implemented using different approaches. In this technique, a trimming signal is applied to compensate for the input-referred offset, either by imposing a counter imbalance or by threshold tuning. In both approaches, the offset is not sampled. Therefore, external circuits have to determine the amount of correction, either by linear search or by statistically equalizing the output codes of the data converter. The trim value is stored in a register and converted to current, voltage, or charge with extra digital-to-analog converters (DACs). An example of digitally controlled trimming, including the external circuits, is shown in Figure 3-2 [32]. The correction circuits comprise DACs that supply the comparator with the linear trimming signals, control circuitry that monitors the comparator outputs and controls the DACs, a memory circuit that stores the final trim values determined for minimum comparator offset, and logic that performs the linear search operation.

Figure 3-2. Digitally controlled offset trimming [32].

There are two main drawbacks to this technique. The first is related to the correction DACs, which must trade off step size, calibration time, power consumption, silicon area, and correction resolution. If the correction step size (the amount of offset voltage removed per 1-LSB increment of the DAC trim signal) is made small by using high-resolution DACs to achieve fine correction resolution, the calibration time becomes longer. The longer the calibration time, the more power the correction circuits consume. High-resolution DACs also require a large silicon area. Conversely, if the correction step size is made large by using low-resolution DACs, giving a coarse correction resolution, the calibration takes less time and the external correction circuits consume less power and silicon area.

The second drawback is related to the linearity of the correction. Although the DAC trimming step is set to 1 LSB, the correction effect of each step is not constant over the whole offset range. This is due to the nonlinear relation between the DAC trimming signal and the amount of offset reduction, which leads to a nonlinear correction process.

Figure 3-3 shows an example of correcting the offset voltage of a comparator using the digitally controlled trimming technique. Two DACs are used in this example: DAC1 has 4-bit resolution and DAC2 has 5-bit resolution. Assuming a linear calibration over the entire offset range, the two staircase signals produced by DAC1 and DAC2 have different step sizes and therefore cancel different amounts of offset voltage for every 1-LSB increment/decrement of the trimming signal.



Figure 3-3. Digitally controlled trimming example.

The comparator has a positive static offset voltage of $(V_{off-i})$. Both DACs start calibration at time $(T_0)$. Each DAC increments/decrements the trimming signal by 1 LSB at every correction clock while the comparator outputs are monitored. Each 1-LSB step thus decreases the offset $(V_{off-i})$ by $\Delta V_1$ for DAC1 and by $\Delta V_2$ for DAC2. Since the LSB size of the 4-bit DAC is larger than that of the 5-bit one, $\Delta V_1$ is larger than $\Delta V_2$. This process continues reducing the initial offset until the comparator outputs flip polarity from "10" to "01". The comparator outputs flip polarity in one of two cases: either the final offset voltage has been reduced to within the meta-stability region ( $V_{inH} < V_{off-F1,2} < V_{inL}$ ), which is the best achievable correction result, or the final offset voltage is larger than the meta-stability region ( $V_{off-F1,2} > V_{inL}$ ). The final value of the offset voltage after correction determines the precision of the correction process: the closer the final offset voltage ( $V_{off-F1,2}$ ) is to the meta-stability region, the more precise the correction. As shown in Figure 3-3, DAC1 and DAC2 finish the calibration at times T<sub>1</sub> and T<sub>2</sub>, respectively, such that (T<sub>1</sub> < T<sub>2</sub>), with final offset voltages of ( $V_{off-F1}$ ) and ( $V_{off-F2}$ ), where ( $V_{off-F1} > V_{off-F2}$ ).

From the previous example, we can conclude that a high-resolution trimming DAC corrects the offset more precisely than a lower-resolution DAC. However, a high-resolution trimming DAC corrects the offset more slowly than a low-resolution DAC, and it also means larger area and power consumption for the correction electronics. Therefore, there is a tradeoff between the precision of the correction process and the speed, power, and area of the correction circuits.
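The 4-bit versus 5-bit comparison can be reproduced with a simple linear-search model. The sketch assumes an ideal, linear trim (each LSB removes a fixed voltage), a positive initial offset, and a comparator that flips exactly when the residual offset reaches zero or changes sign, so the meta-stability region has zero width; the 32 mV trim range and 22.5 mV offset are illustrative values.

```python
def linear_trim(v_offset_mv, lsb_mv, n_bits):
    """Step a trim DAC by 1 LSB per clock until the comparator output flips.

    Assumes a positive initial offset and an ideal linear trim.
    Returns (number_of_steps, residual_offset_mv).
    """
    residual = v_offset_mv
    for code in range(2 ** n_bits):
        residual = v_offset_mv - code * lsb_mv
        if residual <= 0:  # comparator output flips polarity
            return code, residual
    return 2 ** n_bits - 1, residual  # offset exceeds the trim range


if __name__ == "__main__":
    RANGE_MV = 32.0   # assumed trim range
    V_OFF_MV = 22.5   # assumed initial offset
    dac1 = linear_trim(V_OFF_MV, RANGE_MV / 2 ** 4, 4)  # 2 mV steps
    dac2 = linear_trim(V_OFF_MV, RANGE_MV / 2 ** 5, 5)  # 1 mV steps
    print("DAC1 (4-bit): steps, residual =", dac1)
    print("DAC2 (5-bit): steps, residual =", dac2)
```

With these numbers, the 4-bit search stops after 12 steps with a 1.5 mV residual magnitude, while the 5-bit search needs 23 steps and leaves 0.5 mV, matching the precision versus calibration time tradeoff described above.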

Based on the type of the trimming signal applied to the comparator under correction, we can list the most commonly used techniques as explained below.

#### 3.3.3 Threshold Tuning Technique using Bulk Voltage Trimming

In this technique, the trimming voltage signals are applied to the bulks of the input transistors (M2, M3) of the comparator, as shown in Figure 3-4. Since no additional devices are connected to the input transistors, no extra capacitive loading is introduced in the analog signal path, which makes the bulk voltage trimming technique more suitable for high-speed ADCs [15] [32].

Figure 3-4. Bulk trimming correction of PMOS input pair.

This technique is mainly dependent on imposing a counter imbalance by tuning the threshold voltage of one of the two input transistors in the comparator, based on the polarity of the dc offset voltage. Tuning the threshold voltage will directly change the drain current of the transistor, reducing the offset voltage imposed by the devices mismatch.

Equation (3.3) shows the relation between the threshold voltage of a PMOS transistor ( $V_{thp}$ ) and the trimming source-to-bulk voltage signal ( $V_{TRIM}$ ).

$$|V_{thp}| = |V_{thp0}| + |\gamma| \left( \sqrt{2\phi_f + |V_{TRIM}|} - \sqrt{2\phi_f} \right) \tag{3.3}$$

Where $V_{thp0}$ is the threshold voltage for $V_{TRIM}=0V$, $\gamma$ is the body-effect coefficient, and $\phi_f$ is the Fermi potential. Normally, the bulk terminal of a PMOS transistor is tied to the source terminal or to the supply voltage ( $V_{DD}$ ), which is the highest voltage in the circuit.

If channel length modulation is ignored, equation (3.4) shows the drain current equation of a PMOS transistor ( $I_{DSp}$ ), where  $\mu_p$  is the hole mobility,  $C_{ox}$  is the transistor gate oxide capacitance, W and L are the transistor's channel width and length, and  $V_{SG}$  is source to gate voltage of the transistor.

$$I_{DSp} = \frac{1}{2} \mu_p C_{ox} \frac{W}{L} (V_{SG} - |V_{thp}|)^2 \tag{3.4}$$

By increasing the bulk voltage above the source voltage, the threshold voltage magnitude ( $|V_{thp}|$ ) increases, which in turn reduces the drain current ( $I_{DSp}$ ). The drawback of this technique is that it can only be applied to PMOS transistors in an N-well CMOS process; otherwise, a triple-well CMOS process has to be used, which makes the technique process dependent. Moreover, this technique requires an extra power supply for the bulk trimming DACs that is higher than the circuit's supply voltage, to ensure that the source-bulk junction is always reverse biased for normal transistor operation. Finally, equation (3.4) is nonlinear, which means that the offset correction is a nonlinear function of the trimming source-bulk voltage. Because the correction is nonlinear, the offset calibration accuracy, expressed in terms of the DAC LSB size, is only moderate.
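The nonlinearity of bulk trimming can be seen numerically with the sketch below. It uses the standard body-effect expression, in which the bracketed term vanishes at $V_{TRIM}=0$ so that $|V_{thp}| = |V_{thp0}|$, together with the square-law current of equation (3.4). All device values (body-effect coefficient, surface potential, zero-bias threshold, transconductance factor, bias) are assumed for illustration and do not describe the process used in this work.

```python
import math

# Assumed illustrative device values, not taken from the text.
GAMMA = 0.5        # body-effect coefficient, V^0.5
TWO_PHI_F = 0.7    # 2*phi_f, V
VTH0 = 0.7         # |Vthp0|, V
K = 100e-6         # mu_p * Cox * W / L, A/V^2
VSG = 1.2          # source-to-gate voltage, V


def vthp_mag(v_trim):
    """|Vthp| versus source-to-bulk trim voltage (standard body-effect form)."""
    return VTH0 + GAMMA * (math.sqrt(TWO_PHI_F + abs(v_trim))
                           - math.sqrt(TWO_PHI_F))


def i_dsp(v_trim):
    """Square-law PMOS drain current, channel-length modulation ignored."""
    v_ov = VSG - vthp_mag(v_trim)
    return 0.5 * K * v_ov ** 2 if v_ov > 0 else 0.0


if __name__ == "__main__":
    for k in range(6):
        v = 0.1 * k
        print(f"V_TRIM={v:.1f} V  |Vthp|={vthp_mag(v):.4f} V  "
              f"I_D={i_dsp(v) * 1e6:.2f} uA")
```

Equal steps in the trim voltage produce progressively smaller threshold shifts, so a trim DAC with uniform LSBs yields non-uniform offset correction steps.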

#### 3.3.4 Current Trimming Technique using Shunt Devices

In this technique, calibration is achieved by adding two extra transistors ( $M_6$  and  $M_7$ ) in parallel to input transistors ( $M_2$  and  $M_3$ ) as shown in Figure 3-5. The trimmed drain currents of  $M_6$  and  $M_7$  are added to the drain currents of the input transistors. Therefore, slew rate of the comparator is no longer dependent only on the input voltages, but is also dependent on the amount of trimming currents. By trimming the gate voltages ( $V_{G1}$  and  $V_{G2}$ ) of transistors  $M_6$  and  $M_7$ , the effective offset voltage of the comparator can be changed [46].

The main drawback of this technique is that the extra transistors $M_6$ and $M_7$ add capacitive loading to the analog signal path and thus decrease the comparator speed. This can be minimized by using minimum-size transistors. Moreover, as can be inferred from equation (3.4), the nonlinear relation between the added currents of transistors $M_6$ and $M_7$ and the trimming gate-source voltages $V_{GS1}$ and $V_{GS2}$ causes the correction to be nonlinear. This technique also increases the power dissipation, due to the static drain currents of transistors $M_6$ and $M_7$ during normal comparator operation, if it is used in continuous-time comparators.



Figure 3-5. Current trimming scheme using shunt devices.

#### 3.3.5 Capacitive Trimming Techniques

In this technique, digitally controlled capacitance loading of comparator nodes is used for correcting the comparator offset. A binary weighted array of variable capacitors is added to the output nodes of the input differential pair as shown in Figure 3-6 [47]. Applying different capacitive loads to nodes  $N_A$  and  $N_B$  will effectively change the comparator's input reference voltage and correct for the input referred offset voltage. The variable capacitors can also be implemented using MOSFET varactor by shorting the drain and source terminals while controlling its gate voltage.

The main drawback of this technique is the extra capacitive load in the signal path, which limits the operating frequency of the comparator. Moreover, the nonlinear relation between the gate voltage of the MOSFET varactor and its capacitance affects the linearity and range of the calibration.



Figure 3-6. Capacitance trimming scheme, [47].
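As an illustration of how a digital code sets the loading in a binary-weighted array, the sketch below computes the capacitance switched onto each node and the resulting imbalance between the two nodes. The 4-bit width and 0.5 fF unit capacitor are assumed values, and the mapping from capacitance imbalance to offset voltage is circuit dependent and is not modeled.

```python
def array_capacitance_ff(code, n_bits, c_unit_ff):
    """Capacitance switched in by a binary-weighted array, in fF.

    Bit i of the code connects 2**i unit capacitors to the node.
    """
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    return sum(((code >> i) & 1) * (2 ** i) * c_unit_ff for i in range(n_bits))


def imbalance_ff(code_a, code_b, n_bits=4, c_unit_ff=0.5):
    """Load difference between the two nodes (node A minus node B), in fF."""
    return (array_capacitance_ff(code_a, n_bits, c_unit_ff)
            - array_capacitance_ff(code_b, n_bits, c_unit_ff))


if __name__ == "__main__":
    print("code 0b1010 with 1 fF units:",
          array_capacitance_ff(0b1010, 4, 1.0), "fF")
    print("imbalance for codes (3, 1):", imbalance_ff(3, 1), "fF")
```

Every unit capacitor added for trim resolution also adds load to the signal path, which is the speed penalty noted above.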

## 3.4 Proposed Coarse-Fine-Calibration (CFC) Technique

Based on the foreground digitally controlled trimming techniques discussed in the previous sections, we can conclude that a tradeoff exists between the resolution of the trimming DACs and the precision of the calibration process. Moreover, all of the available techniques are nonlinear over the full offset range.

One of the goals of this research was to develop a hybrid calibration architecture that guarantees precise calibration of comparators and ADCs, comparable to that of high-resolution trimming DACs, but with shorter calibration time, smaller silicon area, lower power consumption, and linear calibration over large offset ranges. The proposed coarse-fine-calibration (CFC) technique, which achieves these goals, is explained below.

The CFC technique is implemented by sub-ranging the calibration process into two steps using hybrid calibration schemes as shown in Figure 3-7 [48]. In the first step; the calibration process starts at time (T<sub>0</sub>) with an initial positive dc offset value (V<sub>offi</sub>). A digitally controlled bulk voltage trimming is used to perform the coarse calibration. The trimming signal is generated using a low resolution coarse-DAC of LSB size of ( $\Delta V_c$ ) and with a reference voltage larger than the comparator's supply voltage.

The coarse calibration process continues reducing the offset until the comparator changes its polarity at time ( $T_1$ ). At this time, the dc offset has been reduced to a smaller, negative dc offset of $V_{off-c}$. At this point, the coarse calibration process ends, the coarse-DAC holds the trimming signal, and the fine calibration process starts.

At time $(T_1)$, the second step starts, during which digitally controlled current trimming using shunt devices is used to perform the fine calibration. A fine trimming signal is generated using a secondary low-resolution fine-DAC with an LSB size of $(\Delta V_f)$. The fine calibration process continues reducing the negative dc offset $(V_{off-c})$ until the comparator changes its polarity again at time (T<sub>2</sub>). The entire calibration process ends with a final reduced positive dc offset of $(V_{off-F})$.

Figure 3-7. CFC calibration technique operation and timing diagram [48].
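The two-step sequence can be sketched behaviorally as follows. The model assumes a positive initial offset, ideal linear coarse and fine steps referred to the comparator input, and a comparator that flips exactly at zero residual offset; the step sizes and the initial offset are illustrative and are not taken from the implemented circuit.

```python
def cfc_calibrate(v_off_mv, dv_coarse_mv, dv_fine_mv):
    """Two-step coarse-fine calibration of a positive input-referred offset.

    Returns (coarse_steps, fine_steps, final_offset_mv).
    """
    offset = v_off_mv
    coarse_steps = 0
    while offset > 0:             # step 1: coarse DAC until the output flips
        offset -= dv_coarse_mv
        coarse_steps += 1
    fine_steps = 0
    while offset <= 0:            # step 2: fine DAC until the output flips back
        offset += dv_fine_mv
        fine_steps += 1
    return coarse_steps, fine_steps, offset


def single_dac_steps(v_off_mv, dv_mv):
    """Steps needed by one DAC using only the fine step size."""
    steps = 0
    offset = v_off_mv
    while offset > 0:
        offset -= dv_mv
        steps += 1
    return steps


if __name__ == "__main__":
    print("CFC (coarse, fine, final mV):", cfc_calibrate(22.5, 4.0, 0.5))
    print("single fine DAC steps:", single_dac_steps(22.5, 0.5))
```

For these values, the two-step search reaches a 0.5 mV residual in 10 steps in total (6 coarse plus 4 fine), whereas a single DAC with the same 0.5 mV step would need 45 steps to cover the same offset.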

Three questions arise about implementing this technique on silicon. First, how can a two-step calibration be applied to a single comparator? Second, how can the step-2 fine calibration be achieved with a low-resolution fine-DAC? Third, how can the calibration process be made linear over the entire offset range? Answers to these questions are given below in detail.

#### 3.4.1 Circuit Implementation of the Proposed CFC Technique

#### 3.4.1.1 *Comparator Architecture*

A high-speed dynamic latched comparator is designed in order to investigate the effectiveness of the proposed CFC calibration technique. The circuit diagram of the comparator is shown in Figure 3-8. The comparator is inspired from [49], but it has been modified to increase the comparator sensitivity [48].



Figure 3-8. Dynamic latched comparator with enhanced resolution, [48].

The comparator comprises three stages. The first stage is a clocked differential pair formed by transistors $M_1$-$M_5$. It provides extra gain, amplifying the differential input signals $V_{ip}$ and $V_{in}$. The second stage is a gain and reset stage formed by the common-source transistors $M_6$ and $M_7$. The third stage is a cross-coupled dynamic latch formed by transistors $M_8$-$M_{13}$. The outputs of the first stage, $V_{op1}$ and $V_{on1}$, are used as a reset clock and as a driving signal for the third stage through transistors $M_6$-$M_9$. The design is a single-phase clocked dynamic latched comparator, which reduces the design complexity compared with two-phase clocked dynamic latched comparators such as the one in [50].

When the clock signal (CLK) is asserted high, the comparator enters the "reset phase": nodes $V_{op1}$ and $V_{on1}$ are reset low through M<sub>4</sub> and M<sub>5</sub>, while nodes $V_{on2}$, $V_{op2}$, $X_1$, and $X_2$ are reset high through transistors M<sub>6</sub>, M<sub>7</sub>, M<sub>8</sub>, and M<sub>9</sub>, respectively, in order to eliminate the memory effect of the decision circuit. Transistors M<sub>6</sub> and M<sub>7</sub> help reduce the kickback noise caused by the cross-coupled inverters in stage three. When CLK goes low, the comparator enters the "decision phase". Transistors M<sub>12</sub>, M<sub>13</sub>, and the tail transistor M<sub>1</sub> are turned ON, and nodes $V_{op1}$ and $V_{on1}$ charge up with different slew rates that are proportional to the input signals $V_{in}$ and $V_{ip}$, respectively. Nodes $V_{on2}$ and $V_{op2}$ then discharge with different slew rates, driving the cross-coupled latch to make the proper decision. The cross-coupled inverters, which use positive feedback, regenerate the output to a rail-to-rail signal. The two outputs $V_{op2}$ and $V_{on2}$ are then buffered using tapered buffers to reduce the noise and to drive large capacitive loads without increasing the propagation delay.

The CFC technique, using a two-step hybrid calibration scheme, is implemented on the comparator as shown in Figure 3-9 [48]. During the first step, a digitally controlled bulk voltage trimming technique is used to perform the coarse calibration. The threshold tuning is achieved by applying calibration signals directly to the bulks of the two input devices $M_2$ and $M_3$, without introducing any extra capacitive loading to the analog signal path. A coarse trimming signal is generated using a low-resolution DAC and applied directly to one of the two bulk terminals, $B_1$ or $B_2$. For every 1-LSB change at the DAC output, the threshold voltage ( $V_{th}$ ) of one input transistor ( $M_2$ or $M_3$ ) changes accordingly, which in turn changes its drain current as a counter imbalance to reduce the offset.



Figure 3-9. Dynamic latched comparator with CFC, [48].

During the second step, a digitally controlled current trimming technique using shunt devices is used to perform the fine tuning. The counter imbalance approach is applied by adding two extra devices, $M_{14}$ and $M_{15}$, in parallel to the comparator's second-stage outputs. A fine trimming signal is generated using another low-resolution DAC and applied directly to the gate of transistor $M_{14}$ or $M_{15}$ through gate terminal $G_1$ or $G_2$. Minimum device sizes are used for $M_{14}$ and $M_{15}$ to reduce the capacitive loading of the dynamic latch. For every 1-LSB change at the DAC output, the drain current of transistor $M_{14}$ or $M_{15}$ changes accordingly, which in turn changes the charge/discharge slew rate of node $V_{on2}$ or $V_{op2}$ as a counter imbalance to reduce the offset.

By implementing the coarse trimming in the first stage and the fine trimming in the second, the hybrid structure can be used on a single comparator. As a result, this approach makes the CFC technique applicable to any dynamic comparator structure.

Although a low-resolution DAC is used to generate the fine calibration signal, the offset correction achieved is the same as if a single high-resolution DAC were used. This is because, by introducing the signal into the comparator's second stage, its input-referred effect is amplified by a ratio equal to the gain of the comparator's second-stage amplifier [48].

Figure 3-10 shows the layout of the proposed dynamic comparator with CFC implemented in the AMS 0.35µm process.



Figure 3-10. Layout of the dynamic comparator with CFC in AMS 0.35µm process.

#### 3.4.1.2 Supplementary Circuits for CFC Technique Implementation

Figure 3-11 shows the proposed comparator with CFC and the supplementary circuits required for proper operation of the offset correction algorithm. The calibration is initiated by the user triggering the start calibration signal (STC). At this point, the control logic shorts the two comparator inputs $V_{ip}$ and $V_{in}$ to a common-mode voltage while keeping all four trimming ports $B_1$, $B_2$, $G_1$, and $G_2$ shorted to the supply voltage $V_{dd}$. Due to process-voltage-temperature (PVT) variations and device mismatch, the comparator exhibits an input-referred offset, and its output has a polarity of "10" or "01" depending on the direction of the offset. The control logic circuit then enables the coarse calibration by setting the (CE) signal to logic high, which in turn enables the counter and the latches.

Figure 3-11. Dynamic comparator with CFC and the supplementary circuits.

Based on the output polarity, the control logic circuit will short one of the two bulk trimming terminals  $B_1$  or  $B_2$  to the output of DAC1 while keeping the other terminal shorted to  $V_{dd}$  using one of the two signals ENC1 or ENC2. At this point the counter starts counting and its digital output is transferred to the DAC1 input through the coarse calibration latches.

The output of DAC1 increases by 1 LSB every clock cycle while the control logic monitors the comparator output, OP. At the clock edge at which OP flips polarity, the control logic circuit disables the coarse calibration by setting the CE signal to logic low, which in turn disables the counter, and the coarse calibration latches store the last counter code that caused OP to change polarity.

At the next clock edge, the control logic resets the counter using the RST signal. At the following clock edge, the control logic enables the fine calibration by setting the FE signal to logic high, which in turn enables the counter and the fine calibration latches. Based on the output polarity, the control logic circuit shorts one of the two gate trimming terminals $G_1$ or $G_2$ to the output of DAC2 while keeping the other terminal shorted to $V_{dd}$, using one of the two signals ENF1 or ENF2. At this point the counter starts counting and its digital output is transferred to the DAC2 input through the fine calibration latches. The output of DAC2 decreases by 1 LSB every clock cycle while the control logic again monitors the comparator output OP. At the clock edge at which OP changes polarity again, the control logic disables the fine calibration by setting the FE signal to logic low, which in turn disables the counter, and the fine calibration latches store the last counter code that caused OP to change polarity.
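The two-step search described above can be summarized with a short behavioral sketch. It is not the fabricated control logic: the comparator is reduced to a sign test on the residual input-referred offset, each trimming DAC is represented only by the input-referred correction it produces per LSB, and the step sizes (`coarse_step_mv`, `fine_step_mv`) as well as all function names are hypothetical choices made for illustration.

```python
def comparator_output(residual_offset_mv):
    """Polarity of OP with both inputs shorted: "10" or "01"."""
    return "10" if residual_offset_mv > 0 else "01"


def ramp_until_flip(offset_mv, lsb_mv, bits):
    """Step a trimming DAC one LSB per clock until OP changes polarity.

    Returns the latched counter code and the residual offset in mV.
    """
    start_polarity = comparator_output(offset_mv)
    direction = 1 if offset_mv > 0 else -1
    max_code = 2 ** bits - 1
    residual = offset_mv
    for code in range(1, max_code + 1):
        # Assumed linear, input-referred correction per counter step.
        residual = offset_mv - direction * code * lsb_mv
        if comparator_output(residual) != start_polarity:
            return code, residual  # counter disabled, code latched
    return max_code, residual      # DAC range exhausted without a flip


def cfc_calibrate(offset_mv, coarse_step_mv=10.0, fine_step_mv=0.7, bits=4):
    """Coarse ramp (CE phase) followed by fine ramp (FE phase)."""
    coarse_code, after_coarse = ramp_until_flip(offset_mv, coarse_step_mv, bits)
    fine_code, after_fine = ramp_until_flip(after_coarse, fine_step_mv, bits)
    return coarse_code, fine_code, after_fine


coarse, fine, residual = cfc_calibrate(73.3)
print(coarse, fine, round(residual, 3))
```

In this model the coarse phase leaves a residual no larger than one coarse step, and the fine phase reduces it to less than one fine step, which mirrors the division of labor between DAC1 and DAC2.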

#### 3.4.2 Post-Layout Simulation Results of CFC Technique

To guarantee a linear calibration process over full offset range, we had to predict the linear range by simulating both the digitally controlled bulk trimming calibration and the current trimming calibration used in the proposed CFC technique. For each technique, we apply a 1-LSB change at a time using a high resolution DAC while monitoring its effect on offset cancellation.

Figure 3-12 shows the post layout simulation for the coarse bulk trimming voltage applied to one of the bulk terminals based on the offset direction and its effect on offset cancellation. It shows that the bulk trimming can compensate for an offset range of  $\pm 150$ mV



Figure 3-12. Simulated bulk trimming voltage versus offset voltage.

with a linear normalized range from approximately 10mV to 150mV. This linear calibration range makes it suitable for CFC as it operates as coarse trimming for large offset voltages, correcting large offsets to a minimum value less than 10mV while achieving a linear correction process as required.

Figure 3-13 shows the post layout simulation for fine current trimming gate voltage applied to one of the gate terminals of the shunt devices based on the offset direction and its effect on offset cancellation. It is shown that the current trimming calibration can compensate for an offset range of  $\pm 150$ mV but with a linear normalized calibration range from approximately the comparator resolution to 10mV only while being nonlinear over the rest of



Figure 3-13. Simulated shunt trimming gate voltage versus offset correction.

the range. The linear calibration range makes it very suitable for CFC as it operates as fine trimming for small offset voltages, correcting small offsets to within the comparator's meta-stability range while remaining linear.

From the previous results, we conclude that by using the bulk trimming scheme in the comparator's first stage and the current trimming scheme in its second stage, the CFC technique provides a fully linear offset correction range from $\pm 150$ mV down to within the comparator's meta-stability range.

Before fabrication, an estimate of the offset range is needed based on the variations of the process used, which is the AMS 0.35µm process. Therefore, a post layout Monte Carlo simulation was run for 1000 offset samples to obtain the distribution of the offset variations. The Monte Carlo simulation result is shown in Figure 3-14. The simulation



Figure 3-14. Post layout Monte Carlo simulation of 1000 offset samples.

showed that the offset variation in this process follows a normal distribution with a mean of -70.5mV and a standard deviation ( $\sigma$ ) of 56mV. The calibration circuits have been designed to accommodate this static offset voltage range.

The post layout simulations for the designed comparator with CFC technique applied are shown in Figure 3-15. This simulation shows the effectiveness of using the CFC technique over the other conventional high-resolution techniques. The designed dynamic comparator that was explained in section 3.4.1.1 was used in this simulation along with the traditional bulk trimming technique with a trimming DAC of 8-bit resolution, and the CFC with two trimming DACs each of 4-bit resolution. At each run, a static DC offset was imposed to the



Figure 3-15. Post layout simulation of bulk trimming calibration (A) versus CFC (B).

comparator under test, and the test then proceeds in three steps. The first step is recording the initial static DC offset before the correction (X-axis in Figure 3-15). The second step is running the conventional bulk trimming calibration with the 8-bit trimming DAC and recording the final DC offset after the correction (A). The third and final step is running the CFC calibration and recording the DC offset after correction (B). The simulation results show that any static offset voltage in the range of $200\mu$V to 100mV is corrected to an average offset of -2.17mV by bulk trimming, as shown in (A), and to an average offset of 140 $\mu$V by the CFC scheme (B), a 15X improvement for the proposed CFC technique over the single-stage calibration.
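As a quick check of the quoted improvement ratio, the two average residual offsets reported above can be compared in magnitude:

$$\frac{\left| -2.17\,\text{mV} \right|}{140\,\mu\text{V}} = \frac{2170\,\mu\text{V}}{140\,\mu\text{V}} = 15.5$$

which is consistent with the stated 15X improvement.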

#### 3.4.3 Measurement Results of CFC Technique

The CFC technique was implemented and fabricated in the AMS 2P3M, 3.3V, 0.35µm CMOS process. An array of four comparators with different orientations was fabricated such that each comparator has a different offset variation. The proposed CFC occupies an area of (265.65µm X 56.05µm) of the total AMS chip area, as shown in Figure 3-16. The micrograph of the fabricated chip is shown in Figure 3-17.

A full-custom test bench was designed and fabricated for testing the fabricated IC containing the comparators with CFC. The PCB containing the AMS chip, two 16-bit resolution DACs, and other supplementary circuits is shown in Figure 3-18.

The test bench is composed of a personal computer (PC) workstation, the custom-built PCB with supplementary ICs, an FPGA auxiliary board, and the fabricated chip plugged into a socket, as shown in Figure 3-19.



Figure 3-16. AMS chip layout showing the comparators with CFC.



Figure 3-17. Micrographs of the fabricated chip in AMS  $0.35 \mu m$  CMOS process.



Figure 3-18. Designed custom PCB (front and back) for testing.



Figure 3-19. CFC test setup.

The PC workstation has a user interface program that allows monitoring and controlling the calibration process. The control and monitoring signals are transmitted from the PC to the test chip or other auxiliary ICs on the PCB through the USB2 interface and managed by the FPGA board. Firmware running on the FPGA board manages both the communication with the PC and the generation of the control signals for the test PCB components and the test chip.

Figure 3-20 shows the static offset variation range for 24 comparator samples from six different chips, four comparators in each chip. The measurements show a mean offset of 52mV and a standard deviation of 32.07mV. The maximum reported offset value was 118.154mV and the minimum reported offset value was 9.443mV.



Figure 3-20. Measured offset variation samples.

Figure 3-21 shows the bulk trimming voltage applied to one of the bulk terminals based on the offset direction and its effect on offset cancellation. It is shown that the bulk trimming calibration can compensate as much as 105mV with a linear calibration range (linear curve fit with  $R^2$ =99.8%) from approximately 10mV to 105mV. This linear calibration range almost matches the post layout simulation results.



Figure 3-21. Measured bulk trimming voltage versus corrected offset voltage.

Figure 3-22 shows the current trimming voltage applied to one of the gate terminals of the shunt devices based on the offset direction and its effect on offset cancellation. It is shown that the current trimming calibration can compensate for an offset range of up to 110mV. As expected, the calibration down to 10mV offset is fairly nonlinear (power series fit with  $R^2$ =98.1%), while the compensation of small offset voltages is very linear (linear fit with  $R^2$ =98%). The linear calibration range almost matches the post layout simulation results.

Figure 3-23 shows the measurement results of the offset voltages before and after correction using the single digitally controlled bulk trimming scheme with a 10-bit DAC and the proposed CFC technique with two 4-bit DACs for the coarse and fine calibration steps. It shows how effective each scheme is at compensating for offset voltages



Figure 3-22. Measured shunt trimming gate voltage versus corrected offset.



Figure 3-23. Measured offset levels before and after correction using bulk trimming only with 10-bit DAC resolution versus CFC with 4-bit coarse and 4-bit fine DACs.

up to 100mV. The single digitally controlled bulk trimming is implemented using 6 bit, 8 bit, and 10 bit resolution DACs, and only 10-bit is shown on Figure 3-23.

It is clear from the measurements that the CFC is very competitive with the bulk-trimming-only calibration that uses a 10-bit resolution DAC, and is better than any bulk trimming with a DAC resolution of less than 8 bits. Table 3-1 compares the proposed CFC technique with the single bulk trimming technique at different resolutions.

|                                | CFC                          | Single-stage bulk trimming |            |            |
|--------------------------------|------------------------------|----------------------------|------------|------------|
| DAC resolution                 | 4-bit coarse<br>+ 4-bit fine | 10-bit                     | 8-bit      | 6-bit      |
| DAC size*                      | 32                           | 1024                       | 256        | 64         |
| Area                           | 1X                           | 32X                        | 8X         | 2X         |
| Number of calibration clocks** | 1X                           | 32X                        | 8X         | 2X         |
| Power                          | 1X                           | 32X                        | 8X         | 2X         |
| Calibration linearity          | Linear                       | Nonlinear                  | Nonlinear  | Nonlinear  |
| FoMc***                        | 1X                           | 32768X                     | 512X       | 8X         |

Table 3-1. Comparison summary between CFC and bulk trimming with various DAC resolutions.

\* Based on schemes using resistive ladder or capacitive charge redistribution trimming DACs.

\*\* Based on worst case maximum offset range.

\*\*\* FoMc (Figure of merit calibration) = Area X Time X Power.

The CFC scheme uses fewer DAC components (resistors or capacitors), which makes it 32 times smaller, 32 times faster, and 32 times lower in power, while performing linear calibration over the full-scale offset correction range, when compared to single-stage bulk trimming with 10-bit DAC resolution. This comparison is based on the DAC size only. If the other auxiliary correction circuits are considered, the CFC outperforms the other techniques as well. For example, a 4-bit resistive ladder DAC requires 32 resistors, a 4-bit tree decoder (30 transistors), a 4-bit counter (4 D-type flip-flops, DFF), and a 4-bit register (4 DFF). A 5-bit resistive ladder DAC requires 64 resistors, a 5-bit tree decoder (62 transistors), a 5-bit counter (5 DFF), and a 5-bit register (5 DFF). As the trimming DAC resolution increases, the consumed area and power grow dramatically, making the calibration process impractical.

A calibration figure of merit (FoMc) was defined to quantify the correction efficiency and cost as the product of the area, correction time, and total power consumption. The CFC outperforms the other schemes with the same correction capabilities by up to several orders of magnitude, as listed in Table 3-1.
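The relative entries of Table 3-1 can be reproduced with a few lines of arithmetic. The sketch below takes the DAC sizes directly from the table and assumes, as the table implies, that area, calibration clocks, and power each scale linearly with DAC size; the dictionary keys and the function name are illustrative only.

```python
# DAC sizes (number of unit elements) as listed in Table 3-1.
dac_size = {
    "CFC (4-bit + 4-bit)": 32,
    "10-bit": 1024,
    "8-bit": 256,
    "6-bit": 64,
}


def relative_cost(size, reference=32):
    """Cost relative to CFC, assuming linear scaling with DAC size."""
    ratio = size // reference
    return {
        "area": ratio,
        "clocks": ratio,
        "power": ratio,
        "fomc": ratio ** 3,  # FoMc = Area x Time x Power
    }


table = {name: relative_cost(size) for name, size in dac_size.items()}
for name, row in table.items():
    print(f"{name:>20}: {row}")
```

Because FoMc multiplies three quantities that each scale with the same ratio, it grows with the cube of that ratio, which is why the 10-bit case reaches 32768X.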

#### 3.4.4 Summary

A novel coarse-fine calibration (CFC) technique has been introduced. Measurements showed superior performance when compared to the available digitally controlled trimming techniques that use high-resolution trimming DACs. The new technique provides a very precise offset correction, almost equivalent to that of a single-stage correction technique with a 10-bit DAC, by implementing two different trimming techniques in a two-step hybrid structure in which each step uses a 4-bit trimming DAC. It also provides linear correction over the full offset range.

# CHAPTER 4 – High Speed, Energy Efficient SAR ADCs

The successive approximation register (SAR) ADC is the most energy-efficient ADC topology; however, it is limited to medium- to low-speed applications. The limitation is due to the incomplete settling of the DAC and the required high-speed sampling clock. Thus, the speed of standard SAR ADC structures is limited to 100MS/s. In this chapter, more advanced SAR architectures are introduced and a novel architecture is proposed to further improve the speed of SAR ADCs.

### 4.1 Synchronous Versus Asynchronous SAR Processing

SAR ADC architectures can be divided into synchronous processing [51] and asynchronous processing [52] types. The synchronous types typically use an internal high-frequency clock to divide the conversion phase into equally timed slots as the conversion proceeds from the MSB to the LSB. For them, if the targeted resolution is N bits, the internal clock frequency should be at least (N+1) times the sampling frequency of the ADC, $f_s$. Equation (4.1) gives the delay time of a StrongARM dynamic latched comparator.

$$t_{comp} = \frac{2C_L V_{th}}{I_{tail}} + \frac{C_L}{g_m} \ln \left( \frac{1}{V_{th}} \sqrt{\frac{I_{tail}}{2\beta}} \frac{\Delta V_{out}}{\Delta V_{in}} \right) \tag{4.1}$$

where $C_L$ is the capacitive load of the comparator, $I_{tail}$ is the tail current of the differential input stage, $V_{th}$ is the transistor's threshold voltage, $g_m$ is the transconductance of the input transistor, $\beta$ is a technology-related parameter, $\Delta V_{out}$ is the rail-to-rail output range, and $\Delta V_{in}$ is the input difference. Equation (4.1) reveals that among the SAR components, only the comparator delay ( $t_{comp}$ ) depends on the input signal magnitude.

For the SA algorithm, there is only one quantization step that leads to a very long comparison delay ( $t_{comp}$ ), which occurs when the sampled signal level is within $\frac{1}{2}$ LSB of the reference voltage. Thus, the comparator has a different delay for every conversion step during the conversion phase. Therefore, in synchronous processing, the internal clock frequency must be relaxed to ensure that every bit conversion is fully completed.
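To illustrate how the delay of equation (4.1) grows as the input difference shrinks, the expression can be evaluated numerically. The sketch below is a direct transcription of equation (4.1); all default component values (`c_load`, `i_tail`, `v_th`, `g_m`, `beta`, `delta_vout`) are hypothetical placeholders and are not taken from the designed comparator.

```python
import math


def comparator_delay(delta_vin, c_load=50e-15, i_tail=100e-6, v_th=0.6,
                     g_m=1e-3, beta=200e-6, delta_vout=3.3):
    """Evaluate equation (4.1) in seconds for a given input difference (V)."""
    # First term: time to discharge the load by V_th with the tail current.
    t_discharge = 2 * c_load * v_th / i_tail
    # Second term: regeneration time, logarithmic in 1 / delta_vin.
    log_arg = (1 / v_th) * math.sqrt(i_tail / (2 * beta)) * delta_vout / delta_vin
    t_regen = (c_load / g_m) * math.log(log_arg)
    return t_discharge + t_regen


for dv in (100e-3, 10e-3, 1e-3, 100e-6):
    print(f"delta_vin = {dv * 1e3:8.3f} mV -> t_comp = {comparator_delay(dv) * 1e12:7.1f} ps")
```

Each halving of the input difference adds a fixed increment of $(C_L/g_m)\ln 2$ to the delay, so the step in which the residue is closest to the reference dominates the conversion time.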

In contrast, asynchronous processing does not require an internal clock; clocking is handled dynamically such that every bit quantization is triggered by the previous quantization event. In [53], it was shown that in SAR ADCs, asynchronous processing achieves the fastest conversion speed for large inputs. Also, as the required resolution increases, asynchronous processing may run as much as two times faster than synchronous processing.

The main drawback of asynchronous processing emanates from comparator meta-stability, which makes the comparator spend an unbounded time resolving a small input difference. To avoid this issue, the comparator resolution and regeneration speed must be increased, which in turn increases its power consumption. Alternatively, digital calibration techniques can be used to correct the output code error caused by meta-stability [54].

### 4.2 Asynchronous SAR ADC

In 2006, the first conventional asynchronous SAR architecture using a single comparator and a charge redistribution network was introduced as shown in Figure 4-1 [54].

One advantage of this architecture is that it uses only a single comparator, which keeps the implementation as simple as that of the conventional SAR ADC. Another benefit of using a single comparator is that the offset voltage does not have to be corrected in the analog domain; it can be subtracted in the digital domain instead. Moreover, the charge redistribution capacitor network is used to sample the input signal and also serves as a DAC for creating and subtracting the reference levels required by the SA algorithm. Thus, a separate sample-and-hold circuit is not needed. Each comparison result is stored in an SR latch, which acts as a temporary bit buffer, while the comparator outputs are detected by a ready signal generator that flags the completion of each comparison cycle. This ready signal is then used to drive the sequencer, which provides the asynchronous clocks for the switching logic and the SR latch. The pulse generator is used to create the reset phase for the comparator.



Figure 4-1. Asynchronous 6-bit SAR ADC using single comparator, [54].

The major disadvantage of this topology is that the overall conversion speed is slowed down because the comparator must be reset after every conversion cycle. Also, similar to the conventional SAR architecture, the bottleneck in this asynchronous architecture is the digital logic delay during operation of the SA algorithm. After the comparator quantizes, the digital output must first pass through the digital SAR logic (the Switch Logic block in Figure 4-1) before being sent to the capacitive network DAC to set the charge for the next bit quantization. As pointed out in [55], this logic gate delay can occupy up to 75% of the cycle time, thus limiting any possibility of increasing the speed in the same process.

The loop unrolled SAR architecture was first introduced in [56], as shown in Figure 4-2. Unlike the asynchronous single comparator SAR ADC that uses a single comparator followed by digital logic to determine, store, and transfer the comparison results, this architecture uses N number of comparators for N-bit conversion, latching each comparison result into the digital output of each comparator. The digital outputs are connected to two different paths: The first path is directly fed to the capacitive network DAC, thus the DAC can respond immediately to the outputs and generate the SA analog reference without being delayed by any digital logic. The second path is to a digital clock generator that generates the asynchronous clock after it detects the completion of the current quantization and then generates a "ready signal" starting with the MSB and ending with the LSB.

The advantages of this architecture are that all comparators are reset at the same time during the sampling phase, that no additional digital logic is required for the SA process, and that the SAR logic is distributed, with each stage generating the timing signal for the next SA step. Also, the clock delay of each subsequent comparator can be adjusted stage by stage, allowing the speed of the overall ADC to be optimized.



Figure 4-2. Loop-unrolled asynchronous SAR ADC and its timing diagram, [56].

The major disadvantage of this architecture is its critical signal path delay for every one-bit conversion as shown in Figure 4-3. The critical signal path delay ( $T_{critical}$ ) is determined by the time delay required for generating the asynchronous ready clock ( $t_{ready}$ ) which must be made longer than the DAC settling time ( $t_{DAC}$ ); otherwise the comparator's input reference voltage will not be settled at the edge of the quantization clock.

When a comparator is clocked by $clk_i$, it generates its outputs $Op\langle i\rangle$ and $On\langle i\rangle$ after a propagation delay ( $t_{comp}$ ). They are then passed through the ready logic, which adds a further propagation delay ( $t_{ready}$ ) before generating an asynchronous clock



Figure 4-3. Critical path for one bit conversion.

signal ( $clk_{i-1}$ ) for the next stage. The sum of these two delays is the critical path delay depicted in Figure 4-3, and the ready delay must exceed the DAC settling time, as expressed by the following equations:

$$T_{critical} = t_{comp} + t_{ready} \tag{4.2}$$

$$t_{ready} > t_{DAC} \tag{4.3}$$

Another disadvantage for this topology is that the capacitive DAC size increases exponentially with the required bit resolution. Thus, as the DAC size increases, the input signal bandwidth will be limited by the DAC capacitances plus the input parasitic capacitances of N number of comparators.

As also seen from Figure 4-2, N comparators operating in series are required for an N-bit ADC. Thus, power consumption, area, and the number of correction circuits grow as the number of bits increases.

### 4.3 Proposed 8-bit, 4-Channel TI, ASAR-ADC

By investigating the disadvantages of the loop-unrolled asynchronous SAR (ASAR) ADC topologies discussed in the previous section and developing unique solutions, we can further improve the speed and performance of ASAR topologies. Equations (4.2) and (4.3) show that the key limiting factors for the conversion speed are the comparator's propagation delay ( $t_{comp}$ ) and the asynchronous logic propagation delay ( $t_{ready}$ ), assuming that the condition in equation (4.3) holds.

The time required to convert 1-bit in an ASAR ADC can be given as:

$$T_{one-bit} = T_{critical} = t_{comp} + t_{async-logic} \tag{4.4}$$

Where  $T_{one-bit}$  is the total time required for 1-bit conversion, ( $t_{async-logic}$ ) is the asynchronous logic propagation delay, and  $t_{comp}$  is the comparator's propagation delay. Therefore, the time required to convert N-bits can be given as:

$$T_{N-bit} = N \times \left[ T_{one-bit} + t_{sample} \right] \tag{4.5}$$

Where  $(T_{N-bit})$  is the total time required for converting N-bits, (N) is the ADC resolution, and  $(t_{sample})$  is the time required for the ADC to sample the analog input properly. To increase the overall ADC speed, we need to decrease  $T_{one-bit}$  and  $t_{sample}$  as much as possible.
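A small numerical sketch shows how the terms of equations (4.4) and (4.5) combine. The functions follow the two equations exactly as written above; the delay values in the example are hypothetical and are expressed in picoseconds so that the arithmetic stays in integers.

```python
def one_bit_time(t_comp, t_async_logic):
    """Equation (4.4): time to resolve one bit."""
    return t_comp + t_async_logic


def n_bit_time(n_bits, t_comp, t_async_logic, t_sample):
    """Equation (4.5), as written: total time to convert N bits."""
    return n_bits * (one_bit_time(t_comp, t_async_logic) + t_sample)


# Hypothetical delays in picoseconds, for illustration only.
baseline = n_bit_time(8, t_comp=1500, t_async_logic=500, t_sample=1000)
faster_logic = n_bit_time(8, t_comp=1500, t_async_logic=250, t_sample=1000)
print(baseline, faster_logic)
```

In this example, halving the asynchronous logic delay removes 250 ps from every bit cycle, which is the kind of saving targeted by the circuit changes that follow.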

#### 4.3.1 Circuit Implementation

Figure 4-4 shows the proposed architecture of the 8-bit, single channel, ASAR-ADC. Two major issues of the loop unrolled asynchronous SAR ADC of [56] were addressed in this ADC topology.



Figure 4-4. Improved 8-bit, single channel, ASAR ADC architecture.

The first issue addressed is related to speed. Three circuit modifications are proposed for speed improvement: (1) a new asynchronous logic circuit was developed to decrease the logic delay ( $t_{async-logic}$ ), (2) a different DAC structure was used to reduce both ( $t_{DAC}$ ) and ( $t_{sample}$ ), and (3) the speed is further improved by placing the single-channel ASAR ADC in a four-channel time-interleaved (TI) configuration.

In the original design [56], the asynchronous logic circuit is composed of cascaded XNOR and NOR gates. The proposed topology uses three different asynchronous logic circuits, as shown in Figure 4-4. The *digital core0* is used for the MSB, *digital core2* for the LSB, and *digital core1* for the remaining 6 bits. The *digital core0* is composed of two-input NOR and OR gates, *digital core2* of two NOR gates and an inverter, and *digital core1* of two NOR gates. Since each bit of the original loop-unrolled SAR architecture of [56] uses 20 transistors (for the XNOR and NOR gates), a total of 160 transistors is required for an 8-bit ASAR ADC. The proposed asynchronous logic requires 10 transistors in each of *digital core0* and *digital core2*, and only 8 transistors in each *digital core1*, resulting in a total of 68 transistors for 8-bit resolution. Assuming that delay is related to the number of transistors, a reduction of more than 2X in delay time is expected for the proposed asynchronous logic compared to the original design.
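The transistor counts quoted above follow from the per-core figures for an 8-bit converter:

$$N_{original} = 8 \times 20 = 160, \qquad N_{proposed} = 10 + 10 + 6 \times 8 = 68, \qquad \frac{N_{original}}{N_{proposed}} = \frac{160}{68} \approx 2.35$$

Under the stated assumption that delay scales with transistor count, this ratio corresponds to the expected reduction of more than 2X.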

The choice of DAC topology directly affects the settling time as well as the sampling time. A binary-weighted capacitive DAC topology was used in the original design [56]. Two alternatives to the original DAC topology are available: the C-2C and the R-2R. For the C-2C topology, the DAC size increases linearly with the required resolution and therefore consumes less area and energy than a binary-weighted DAC. Moreover, the capacitor sizes are fixed, so the switches see the same capacitive load for every conversion, which simplifies both the switch design and the layout. Also, there is no need to charge and discharge large capacitors, which makes the C-2C topology settle much faster than the binary-weighted capacitive DAC.

However, in contrast to the binary-weighted DAC, the C-2C DAC has a major drawback due to the parasitic capacitances at the interconnecting nodes inside the DAC, as shown in Figure 4-5. The parasitic capacitances at these nodes change the capacitor ratios and thus the radix, which, if not corrected by additional calibration circuits, severely limits the ADC accuracy. Also, the radix is not only modified but also becomes bit-dependent, making the calibration a challenging process. Many techniques have been used to calibrate such errors [57] [58], but adding calibration circuits that consume energy and need extra calibration time limits the ADC speed.



Figure 4-5. Parasitic capacitance issue in the C-2C DAC topology.

For the above-mentioned reasons, an R-2R segmented DAC structure with current switching was implemented in the improved ASAR-ADC. Using the R-2R segmented DAC structure allows a separate sample-and-hold circuit to be used instead of the top plates of the charge redistribution DAC capacitors. In the conventional loop-unrolled ASAR, the analog input signal is sampled on 48 times the unit capacitance ( $C_u$ ), as shown in Figure 4-2. In the proposed structure, the analog input signal is sampled on a single unit capacitance ( $C_u$ ), as long as this capacitance satisfies the kT/C noise limit of the input. This means that the sampling time is reduced by almost 48 times. Moreover, the R-2R segmented DAC structure with current switching makes it easier to increase the resolution: every extra bit requires only a binary-weighted current source, which is essentially a MOS transistor, as long as the current matching issue is resolved.

The overall speed quadruples if four of these ADCs are integrated and operated in a time-interleaved fashion. The only penalties are the power consumption and the silicon area, both of which increase linearly with the number of TI channels.

The second issue addressed is related to the overall performance of the ASAR topology. The performance improvement is achieved by implementing programmable delay blocks in the asynchronous clock signal generation path, as shown in Figure 4-4. Although adding extra delays in the asynchronous signal path reduces the overall speed, they are essential to ensure monotonic operation. In the conventional loop-unrolled SAR, it is difficult to guarantee that the condition in equation (4.3) is satisfied for every bit conversion because of process variations. For that reason, the designer must run thousands of Monte Carlo simulations to estimate the maximum DAC settling time and then design the asynchronous logic with a fixed delay per bit conversion that is larger than the maximum DAC settling time to guarantee proper operation. Adding an extra delay as a safety margin is also good design practice. In the proposed technique, a programmable delay is added to (N-1) bits to guarantee that every comparator is asynchronously clocked only after the DAC has settled properly. These delays can be programmed on a bit-by-bit basis, with the option of almost zero delay.
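The per-bit delay programming can be pictured as choosing, for each bit, the smallest delay code that satisfies equation (4.3). The sketch below is only an illustration of that selection rule: the delay step, code range, intrinsic ready delay, and per-bit DAC settling times are hypothetical numbers, and the function name is not taken from the design.

```python
def select_delay_code(t_ready_ps, t_dac_ps, delay_step_ps, max_code, margin_ps=0):
    """Smallest code for which t_ready + added delay exceeds t_DAC (+ margin)."""
    for code in range(max_code + 1):
        if t_ready_ps + code * delay_step_ps > t_dac_ps + margin_ps:
            return code
    raise ValueError("programmable delay range cannot cover this settling time")


# Hypothetical settling times for the (N-1) = 7 programmable bits, MSB first.
t_dac_per_bit_ps = [900, 700, 500, 400, 350, 300, 250]
codes = [
    select_delay_code(t_ready_ps=400, t_dac_ps=t_dac, delay_step_ps=100, max_code=7)
    for t_dac in t_dac_per_bit_ps
]
print(codes)
```

Bits whose DAC settling is already shorter than the intrinsic ready delay receive code 0, which corresponds to the near-zero delay option, so only the slower-settling bits pay a speed penalty.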

#### 4.3.1.1 Dynamic Latched Comparator

Eight dynamic comparators are used in the proposed 8-bit improved loop unrolled asynchronous SAR. In ASAR ADCs, the comparators are the main contributors for power consumption. Therefore, a comparator structure that does not consume static power was used. Figure 4-6 shows the schematic of the dynamic latched comparator used.



Figure 4-6. Dynamic latched comparator schematic.

The regenerative latch formed by transistors  $M_8$ - $M_{11}$  is in the pre-charge mode when CLK is high to eliminate the comparator's memory effect, while it is in the compare mode when CLK is low. Transistor  $M_{14}$  has been added to the comparator to reduce the effect of kick back noise which affects the ADC's overall performance. By controlling the gate voltage (Pbias) of transistor  $M_{14}$ , we can reduce the kick back noise by limiting the current flow through  $M_{14}$ . Although limiting the drain current of transistor  $M_{14}$  reduces the kick back noise, it also slows down the comparator speed.

#### 4.3.1.2 8-bit, R-2R Segmented DAC

Figure 4-7 shows the architecture of the R-2R segmented DAC used in the proposed ADC. The DAC consists of two sections. The first section is an R-2R voltage-mode DAC with unit current sources for the six LSBs; this architecture was first implemented by Bernard M. Gordon [59]. The second section is a segmented current DAC with binary-weighted current sources for the two MSBs. In this architecture, the output impedance of the DAC is equal to 2R. Because it is resistive rather than capacitive as in the C-2C topology, it does not limit the speed as severely. That is why R-2R structures are used in high-speed ADC applications [60].



Figure 4-7. 8-bit, R-2R segmented DAC architecture.

#### 4.3.1.3 Time-Interleaved 4-channel ADC Structure

Figure 4-8 shows the architecture of the designed four-channel TI ADC structure. Each channel is an exact replica of the proposed 8-bit ASAR ADC shown in Figure 4-4. A four phase clock generator is designed to generate the required four phase clock timing ( $\Phi$ 1,  $\Phi$ 2,  $\Phi$ 3, and  $\Phi$ 4) for proper time interleaving operation. Each channel operates at one fourth the clock frequency (Clock). Each channel samples the analog input signal using its own sample-and-hold circuit. The sampled signal is then digitized by the proposed asynchronous ADC structure, and then 8-bits are latched with a data ready flag "Latch" generated for synchronization purposes. The data bytes (Data1, Data2, Data3, and Data4) are then multiplexed to form the digitized form of the analog input at an effective sampling frequency that is four times the sampling speed of a single-channel ADC.
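The data flow of the time-interleaved structure can be sketched at the sample level. The model below ignores the analog conversion entirely and only shows how samples are distributed round-robin to four channels and recombined by the output multiplexer; the function names are illustrative, and the number of samples is assumed to be a multiple of the channel count.

```python
def time_interleave(samples, channels=4):
    """Distribute consecutive samples to the channels in round-robin order."""
    return [samples[ch::channels] for ch in range(channels)]


def multiplex(channel_data):
    """Recombine per-channel outputs (Data1..Data4) into one output stream."""
    stream = []
    for group in zip(*channel_data):
        stream.extend(group)
    return stream


samples = list(range(16))            # stand-in for consecutive input samples
per_channel = time_interleave(samples)
print(per_channel[0])                # samples handled by the first channel
print(multiplex(per_channel) == samples)
```

Each channel handles every fourth sample, so it has four master-clock periods to complete a conversion while the multiplexed output still advances by one sample per master-clock period.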



Figure 4-8. Four-channel TI, 8-bit, ASAR architecture.

#### 4.3.1.4 Four Phase Clock Generator

Figure 4-9 shows the schematic diagram and timing diagram of the four phase clock generator that is used in the TI structure. The circuit generates four different clock phases ( $\Phi_1$ ,  $\Phi_2$ ,  $\Phi_3$ , and  $\Phi_4$ ), each with an effective frequency of one-fourth of the master clock (CLK) frequency.
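The phase relationship can be illustrated with a simple behavioral model. The sketch below assumes a one-hot ring counter advanced once per master clock cycle, which is only one possible way to obtain four phases at one-fourth of the master clock frequency; the actual schematic, pulse widths, and any non-overlap margins in Figure 4-9 may differ.

```python
def four_phase_clocks(n_cycles):
    """Return four lists of 0/1 values, one entry per master clock cycle."""
    phases = [[] for _ in range(4)]
    state = [1, 0, 0, 0]                  # assumed one-hot start state
    for _ in range(n_cycles):
        for k in range(4):
            phases[k].append(state[k])
        state = state[-1:] + state[:-1]   # rotate the active phase by one
    return phases


phi = four_phase_clocks(8)
for k, wave in enumerate(phi, start=1):
    print(f"phi{k}: {wave}")
```

In this model each phase repeats every four master clock cycles and exactly one phase is active per cycle, so the four channels are triggered in sequence.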



Figure 4-9. (a) Four phase clock generator schematic, (b) timing diagram.

#### 4.3.1.5 Chip Layout

The proposed 8-bit, 4-channel TI, ASAR ADC structure was implemented in the 2P3M, 3.3V, 0.35µm CMOS process. Figure 4-10 shows the chip layout, where (a) shows the four-channel, 8-bit, time-interleaved structure of the proposed ADC, including the clock generator and the supplementary circuits. The complete architecture occupies an area of $(1\text{mm X } 291\mu\text{m})$ of the total AMS chip area. Figure 4-10 (b) shows the layout of a single channel of the proposed ASAR ADC, which includes the R-2R DAC, the (8X1) comparator array, the asynchronous logic circuits, the programmable delays, the memory blocks, and the scan chain. The single channel occupies an area of (244.3 $\mu$m X 170.4 $\mu$m).



Figure 4-10. (a) Chip layout showing the 4-channels TI AADC, (b) layout of a single channel.

### 4.4 Simulation Results

#### 4.4.1 Single Channel Simulation Results

The proposed 8-bit ASAR ADC was designed in the 3.3V AMS 0.35µm CMOS process. The simulations were performed at a sampling frequency of 25MHz. Simulation results show that the proposed ADC achieves a peak SNDR of 49.1dB. Figure 4-11 shows the simulated DNL and INL of the proposed ASAR ADC. The simulated DNL was 0.23/-0.21 LSB, and the INL was 0.12/-0.26 LSB. The proposed ASAR ADC consumes a total power of 9.725mW in the comparators, the R-2R DAC, and the clock drivers.



Figure 4-11. Simulated DNL and INL of the proposed ASAR ADC.

Figure 4-12 shows the 2048-point FFT spectrum of the proposed ASAR ADC for a 0.8V peak-to-peak sine wave input at 1MHz. The SNDR and SFDR for this input frequency are 49.1dB and 63.8dB, respectively, corresponding to an ENOB of 7.86 bits. Table 4-1 summarizes the simulated performance of the proposed ASAR ADC.



Figure 4-12. Frequency spectrum of the proposed 8-bit ASAR ADC for 1MHz sine input.

Table 4-1. Performance summary of the proposed ASAR-ADC.

| Process | Supply (V) | Sampling<br>Rate (fs) | SNDR   | Power   | FoM          | Resolution |
|---------|------------|-----------------------|--------|---------|--------------|------------|
| 350nm   | 3.3        | 25M                   | 49.1dB | 9.725mW | 1.67pJ/Conv. | 8 bits     |
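The ENOB and FoM entries for the single-channel ASAR ADC can be cross-checked with a few lines of code. The sketch assumes the usual definitions, ENOB = (SNDR - 1.76)/6.02 and FoM = P/(2^ENOB · fs); the text does not state which FoM definition was used, so this choice is an assumption, and the function names are hypothetical.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Assumed figure of merit: energy per conversion step, in joules."""
    return power_w / (2 ** enob(sndr_db) * fs_hz)


# Simulated single-channel values quoted above: 49.1 dB, 9.725 mW, 25 MHz.
bits = enob(49.1)
fom = walden_fom(9.725e-3, 25e6, 49.1)
print(f"ENOB = {bits:.2f} bits, FoM = {fom * 1e12:.2f} pJ/conv")
```

With the quoted single-channel numbers, this reproduces an ENOB of about 7.86 bits and a FoM of about 1.67pJ per conversion step.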

### 4.4.2 Four-Channel Simulation Results

Four instances of the proposed 8-bit ASAR ADC were placed in a time-interleaved structure and simulated. The auxiliary circuits, including the four-phase clock generator and the multiplexing logic, were also included in the simulation. The simulations were performed at a sampling frequency of 100MHz. Figure 4-13 shows the simulated DNL and INL of the proposed 4-channel TI ASAR ADC. The simulated DNL was 0.23/-0.21 LSB, and the INL was 0.2/-0.27 LSB. The TI ASAR ADC consumes a total power of 42.79mW in the four ADC channels, the four-phase clock generator, and the clock drivers.



Figure 4-13. Simulated DNL and INL of the proposed 4-channels TI, ASAR ADC.

Figure 4-14 shows 2048 point FFT spectrum of the proposed 8-bit, 4-channels TI ASAR ADC for a 0.8V peak-to-peak 1MHz sine wave input sampled at 100MHz. The SNDR and SFDR for this input are 48.8dB and 59.76dB, respectively, achieving ENOB of 7.83 bits. Table 4-2 summarizes the simulated performance of the proposed TI ASAR ADC.



Figure 4-14. Spectrum of the proposed 4-channels, 8-bit ASAR ADC for 1MHz sine input.

Table 4-2. Performance summary of the proposed 4-channels TI, ASAR-ADC.

| Process | Supply | Sampling<br>Rate (fs) | Power   | Resolution | SNDR   | FoM          |
|---------|--------|-----------------------|---------|------------|--------|--------------|
| 350nm   | 3.3    | 100M                  | 42.79mW | 8 bits     | 48.8dB | 1.56pJ/Conv. |

## 4.5 Measurements Results

As a proof of concept, the proposed 8-bit, four-channel time-interleaved ASAR-ADC was taped out in the 2P3M, 3.3V, AMS 0.35µm CMOS process.

### 4.5.1 Chip Micrographs

Figure 4-15 shows the micrograph of the fabricated chip with the proposed ADC structure highlighted. The ADC occupies (1mm X 291 $\mu$ m) of the total chip area.



Figure 4-15. AMS chip micrograph showing the proposed 4-channel, TI ASAR ADC.

### 4.5.2 Measurement Setup

The test board developed for comparator testing was also capable of testing the individual and TI ASAR ADCs. In this test, the FPGA and the USB2 communication link were used to control the ADC settings. A mixed-signal oscilloscope (Agilent MSO-X 2024A) with a 2GS/s sampling rate and 1MB of storage memory was used. The digital outputs of the single and TI ASAR ADCs were captured and stored in the scope and then transferred to a PC for analysis. MATLAB code was used for post-processing of the captured data. Figure 4-16 shows the actual measurement setup for testing the ASAR-ADC.

### 4.5.3 Measurements Results

The ASAR-ADC, together with all of the supplementary circuits and the ESD pads, consumes 160µA from a single 3.3V power supply. Figure 4-17 shows the FFT spectrum for a 0.7V peak-to-peak sine-wave analog input at 13.46kHz sampled with a 0.5MHz sampling clock. Table 4-3 summarizes the overall ADC performance.



Figure 4-16. Measurement setup for the ASAR-ADC.



Figure 4-17. FFT spectrum of the proposed ASAR ADC.

Table 4-3. Measured performance summary of the proposed TI-ASAR-ADC.

| Process | Supply | Sampling<br>Rate fs | Resolution | SNDR    | Power   | FoM      |
|---------|--------|---------------------|------------|---------|---------|----------|
| 350nm   | 3.3    | 0.5M                | 8 bits     | 31.6 dB | 0.53mW* | 34pJ/con |

\* The measured power includes the total ADC power as well as the power dissipated by all other supplementary components like the analog and digital I/O PADS.

## 4.6 Summary

An improved 8-bit loop-unrolled asynchronous SAR ADC was designed and fabricated in a 0.35µm, 2P3M, 3.3V CMOS process. An individual ASAR ADC can run at sampling rates of up to 25MS/s. A 4-channel time-interleaved (TI) ADC structure containing the improved ASAR ADCs was also fabricated as a proof of concept. Post-layout simulations confirmed operation at 100MS/s with an ENOB of 7.83 bits.

# **CHAPTER 5** – Asynchronous Binary Search ADCs (ABS)

As discussed in the previous sections, most digital ultra-wideband (UWB) receivers require an ADC at their front end to digitize the high-frequency RF signal. For UWB applications, this ADC should be power efficient in order to extend the lifetime of battery-powered portable devices. Moreover, most software defined radio (SDR) receivers also require a front-end ADC to digitize signals of various bandwidths. The required ADC resolution also varies with the radio application. This ADC is expected to be power efficient, and its architecture should be flexible in both speed and resolution.

For UWB and SDR applications, flash ADCs have been used due to their superior data rate and low latency. However, the power consumption of the flash ADC is determined by the number of active comparators, which increases exponentially with the ADC resolution, as discussed in section 2.3.1. On the other hand, SAR ADCs are considered to be the most power-efficient ADC topologies, but they achieve this at the cost of reduced data rate and high latency because of their sequential mode of operation and settling issues, as discussed in section 2.3.2. Thus, the research presented in this chapter focused on developing new ADC topologies that combine the high speed of a flash ADC with the energy efficiency of SAR ADCs. Employing the asynchronous processing technique made it possible to implement such ADC topologies. One of those new topologies is based on the binary search algorithm. The proposed binary search ADC can be considered a hybrid structure that combines the flash ADC and successive approximation ADC architectures while utilizing asynchronous conversion processing. Different asynchronous binary search ADC topologies are introduced and discussed in detail in the following sub-sections.

## 5.1 Comparator Based Asynchronous Binary Search ADC (CABS)

The comparator-based asynchronous binary-search (CABS) ADC, reported in 2008, was the first ADC to combine a structure similar to the flash ADC with asynchronous processing based on the binary search algorithm of the SAR ADC [61]. For a better understanding of the technique and the operation of the binary search ADC, consider the circuit shown in Figure 5-1, which is a 3-bit CABS ADC assuming $V_{ref}$ = 1 V.



Figure 5-1. 3-bit implementation of CABS ADC architecture [61].

Similar to the flash ADC, the CABS architecture uses $2^N-1$ comparators and resistors. Unlike the parallel structure of the comparators in the flash ADC, the $2^N-1$ comparators in the CABS ADC are placed in a binary search tree structure. The CABS topology asynchronously triggers only one comparator per resolved bit to execute the binary search algorithm. In this architecture, the single comparator of stage 1 is clocked by the synchronous global clock Clk.

During the quantization phase of Clk, the comparator compares the input signal with zero to determine the MSB $D_2$. Based on the first comparator's output, one of the two comparators of stage 2 is triggered asynchronously. If the analog input is larger than zero, the input is compared to $\frac{1}{2}$; otherwise, it is compared to $-\frac{1}{2}$. Bit $D_1$ is generated by the asynchronous logic shown in Figure 5-1 based on the outputs of the two comparators in stage 2. The asynchronous logic in this topology is an OR gate. Similarly, based on the outputs of the comparators in stage 2, one of the four comparators in stage 3 is triggered asynchronously, closing in on the input signal. Based on the digital outputs of the comparators in stage 3, the LSB $D_0$ is generated by the asynchronous logic.
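The search described above can be sketched behaviorally. The code below is a minimal model that assumes ideal comparators and a bipolar input range of ±$V_{ref}$; instead of instantiating the full comparator tree, it computes the threshold of the single comparator that would be triggered in each stage. The function name `cabs_convert` is hypothetical.

```python
def cabs_convert(vin, n_bits=3, vref=1.0):
    """Behavioral binary search; returns (bits MSB first, thresholds of fired comparators)."""
    threshold = 0.0          # stage 1 compares the input with zero
    step = vref / 2
    bits, fired = [], []
    for _ in range(n_bits):
        fired.append(threshold)          # only this comparator is triggered
        bit = 1 if vin > threshold else 0
        bits.append(bit)
        threshold += step if bit else -step   # pick the branch for the next stage
        step /= 2
    return bits, fired


print(cabs_convert(0.6))    # input above zero, then compared with 1/2 and 3/4
print(cabs_convert(-0.3))   # input below zero, then compared with -1/2 and -1/4
```

For every input exactly N thresholds are visited, one per stage, while the set of all thresholds that can be visited across inputs has $2^N-1$ members, matching the comparator count of the tree.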

During the reset phase of the global Clk, the analog input signal is sampled, and the first comparator in stage 1 will be pre-charged, and thus asynchronously pre-charging the remaining comparators, stage-by-stage like dominos.

The CABS ADC has many advantages over the flash and SAR ADCs. Since only N comparators are clocked per conversion in this architecture, the power consumption is significantly reduced compared to the flash ADC with its parallel processing topology. Moreover, since only the first comparator of stage 1 is clocked synchronously by the global clock and all other internal clocks are generated asynchronously, the high-speed clock generation and routing requirements of conventional SAR ADCs vanish.

However, the CABS ADC topology also has several drawbacks. Despite the lower power consumption, circuit complexity and silicon area increase exponentially with the resolution. Even though the CABS ADC conversion speed per bit is not limited by the DAC settling time as in the SAR ADC, the overall conversion speed is still limited by N comparator delays plus the asynchronous logic delays. Hence, the CABS ADC is roughly N+1 times slower than the flash ADC.

## 5.2 Asynchronous Binary Search-ADC with Reduced Comparator Count

In 2010, a new asynchronous binary-search ADC topology was proposed in [62]. This asynchronous BS-ADC is based on the CABS topology and utilizes a new technique called reference range prediction. This structural modification reduces the number of required comparators from $2^N-1$ to $2N-1$.

Figure 5-2 shows the architecture of a 3-bit asynchronous BS ADC with reference range prediction. Similar to the CABS ADC, the global clock signal CLK is applied to the first comparator only. The output signals of the first comparator are the trigger signals for the 2<sup>nd</sup> stage comparators. Once the first comparator makes its decision, it triggers one of the 2<sup>nd</sup> stage comparators. The output of the comparator in the first stage also serves as the control signal for the reference switching network of the third stage. The reference switching network is mainly a 4 to 1 multiplexer with a two-bit control. There are four possible reference levels for the LSB. If the MSB is 1, which means that the input is larger than half the reference voltage, then only 5/8 and 7/8 are the possible references for the LSB comparisons. The selected reference is then connected to the LSB comparators via the multiplexer.

Figure 5-2. 3-bit Asynchronous BS-ADC with reference range detection [62].

The main advantage of this ADC topology is the reduced number of comparators. It uses only $2N-1$ comparators, unlike the CABS ADC, which uses $2^N-1$ comparators. Therefore, the number of comparators grows linearly with the bit resolution. The cost of the reduced comparator count is a complicated reference switching network, which grows exponentially with the bit resolution. More control signals are necessary for the LSB stages, which results in a complicated switching network design [62]. To illustrate this issue, consider the 5-bit version of this topology in Figure 5-3.



Figure 5-3. 5-bit Asynchronous BS-ADC with reference range detection [62].

Every bit requires two comparators, except for the MSB, which requires only a single comparator (Comp 4). Only three comparators (Comp 4, Comp 3H, and Comp 3L) in the first two stages ($B_4$ and $B_3$) take their reference voltages directly from the reference ladder, while the comparators in the remaining stages ($B_2$, $B_1$, and $B_0$) are connected to their references indirectly through multiplexers, as shown in Figure 5-3. The third-stage comparators (Comp 2H and Comp 2L) connect to their reference voltages through a 2 to 1 multiplexer with two control bits ($B_4$ and $B_3$). The fourth-stage comparators connect to their reference voltages through a 4 to 1 multiplexer with three control bits ($B_4$, $B_3$, and $B_2$). For the LSB $B_0$, the comparators connect to their reference voltages through an 8 to 1 multiplexer with four control bits ($B_4$, $B_3$, $B_2$, and $B_1$).

The reference switching network formed by the multiplexers and the comparator outputs grows exponentially with the resolution. The exponential growth and design complexity are thus shifted from the analog domain to the digital domain. This structure still uses a large silicon area. Therefore, like the CABS, this ADC topology is not feasible for resolutions above 5 bits, and it is not suitable for SDR applications.
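The difference between the two comparator-count expressions quoted above, $2^N-1$ for the CABS ADC [61] and $2N-1$ for the BS-ADC with reference range prediction [62], can be tabulated with a short sketch. The function names are hypothetical, and the sketch only evaluates the two expressions; it does not model the growth of the reference switching network.

```python
def cabs_comparators(n_bits):
    """Comparator count of the CABS tree: 2^N - 1."""
    return 2 ** n_bits - 1


def range_prediction_comparators(n_bits):
    """Comparator count with reference range prediction: 2N - 1."""
    return 2 * n_bits - 1


print("N  CABS  range-prediction")
for n in (3, 5, 8):
    print(n, cabs_comparators(n), range_prediction_comparators(n))
```

At 8 bits the tree would need 255 comparators against 15 for the range-prediction topology, which is why the remaining exponential cost in the latter sits in the multiplexers rather than in the comparators.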

## 5.3 Proposed Asynchronous Binary Search with Indirect Reference Shifting (ABS-IRS) ADC

A review of the different asynchronous binary search ADCs shows that the classical CABS ADC in [61] suffers from exponential growth of the number of comparators with increased resolution. Likewise, the asynchronous BS ADC with reference range detection in [62] suffers from exponential growth of the reference switching network with resolution. Other asynchronous binary search ADCs reported so far have not solved this exponential component-count growth problem. For instance, an asynchronous BS-ADC with 2-bit flash quantizers was proposed in [63]. Even though this topology can operate two times faster than the CABS, it still requires 2<sup>N</sup>-1 comparators. Therefore, in all of the previously reported asynchronous BS ADCs, the resolution was limited to 6 bits.

The proposed asynchronous binary search with indirect reference shifting ABS-IRS ADC addresses the problem of the exponential growth of the required number of components with resolution.

Figure 5-4 shows the block diagram of the proposed 8-bit ABS-IRS ADC. The ADC is composed of four stages, and each stage quantizes two bits. Stage 1 is composed of an asynchronous logic block, a global sample and hold circuit, and three comparators $C_1$, $C_2$, and $C_3$. Comparator $C_1$ is clocked by the global synchronous clock SHC and generates the MSB B<sub>7</sub>. The positive inputs of comparators $C_1$, $C_2$, and $C_3$ are connected to the appropriate reference nodes in the resistive ladder, which is not shown in Figure 5-4 for simplicity. The negative inputs of comparators $C_1$, $C_2$, and $C_3$ are connected to the output of the global sample and hold circuit $V_{sh}$. The outputs of comparator $C_1$ trigger either comparator $C_2$ or comparator $C_3$ asynchronously. The asynchronous logic block is responsible for generating bit $B_6$ and the asynchronous clock for the next stage, which in this case is stage 2.

Stage 2 uses three 4 to 1 multiplexers Mux-1, Mux-2, and Mux-3 for reference switching. They are controlled by two bits  $B_7$  and  $B_6$ , and their 12 inputs are connected directly to the resistive ladder. The multiplexers connect the negative inputs of comparators C4, C5, and C6 with the appropriate reference voltage. The positive inputs for comparators C4, C5, and C6 are connected directly to the output of the global sample and hold circuit V<sub>SH</sub>. Bits  $B_5$  and  $B_4$  are generated asynchronously by stage 2.

Stage 3 is similar to stage 1 except for two differences. The first difference is that it has an extra C-8C charge redistribution DAC. The second difference is that the positive inputs of comparators $C_7$, $C_8$, and $C_9$ are connected to the top plates of the capacitive DAC instead of the global sample and hold circuit. During the sampling phase of the global clock SHC, the analog input signal is sampled by both the global sample and hold circuit of stage 1 and the top plates of the capacitive DAC. Based on the four output bits $B_7$, $B_6$, $B_5$, and $B_4$, the capacitive DAC level shifts the sampled input signal. Bits $B_3$ and $B_2$ are generated asynchronously by stage 3.

Stage 4 is similar to stage 2 except for one difference. Unlike the comparators in stage 2, the positive inputs of comparators $C_{10}$, $C_{11}$, and $C_{12}$ are connected to the top plates of the capacitors in the DAC.

Figure 5-5 shows the conceptual architecture and timing diagram of the proposed 8-bit ABS-IRS ADC, summarizing the concept of indirect reference shifting. The 4 stages in Figure 5-4 are divided into 8 slices, such that each stage is composed of two consecutive slices, starting from the MSB $B_7$ in slice 0 and ending with the LSB $B_0$ in slice 7. Each slice represents the sequential processing that is executed every clock cycle, such that one bit is evaluated per slice. Based on the binary search algorithm, the K<sup>th</sup> slice requires 2<sup>K</sup> comparison references.

After the analog input signal is sampled by the global sample and hold circuit and the capacitive charge redistribution DAC, the conversion phase starts with an undefined output digital byte (XXXXXXXX). The first slice (slice-0) is triggered by the global synchronous clock. Slice-0 requires a single reference voltage ($2^0$) to quantize the MSB B<sub>7</sub>. The MSB is quantized by comparator C<sub>1</sub>, which compares the sampled analog input with V<sub>ref</sub>/2 from the resistive ladder. After the MSB is quantized, the second slice (slice-1) is asynchronously triggered by one of the two outputs of comparator C<sub>1</sub>. Slice-1 is composed of two comparators, C<sub>2</sub> and C<sub>3</sub>, that are connected directly to the reference ladder and the global sample and hold circuit. Slice-1 requires two reference voltages ($2^1$) to quantize bit B<sub>6</sub>. Based on the outputs of comparator C<sub>1</sub>, the sampled analog input is compared with either 3V<sub>ref</sub>/4 or V<sub>ref</sub>/4. After bit B<sub>6</sub> is generated, the third slice (slice-2) is triggered by the outputs of comparators C<sub>2</sub> and C<sub>3</sub>. Slice-2 is composed of a single comparator C<sub>4</sub> and a single 4 to 1 multiplexer Mux1.

Slice-2 requires four reference voltages ($2^2$) to quantize bit B<sub>5</sub>. The four inputs of Mux1 are connected to the reference ladder. Mux1 connects comparator C<sub>4</sub> to the appropriate node on the reference ladder based on bits B<sub>7</sub> and B<sub>6</sub>. After bit B<sub>5</sub> is quantized, the fourth slice (slice-3) is asynchronously triggered.

Figure 5-4. Architecture of the proposed 8-bit ABS-IRS ADC.

Figure 5-5. Conceptual architecture and operation of the proposed 8-bit ABS-IRS ADC.

Slice-3 is composed of two comparators (C<sub>5</sub> and C<sub>6</sub>) and two 4 to 1 multiplexers (Mux2 and Mux3). Slice-3 requires eight reference voltages ($2^3$) to generate bit B<sub>4</sub>. The eight inputs of Mux2 and Mux3 are connected to the reference ladder. Mux2 and Mux3 connect comparators C<sub>5</sub> and C<sub>6</sub> to the appropriate nodes on the reference ladder based on bits $B_7$ and $B_6$. After bit $B_4$ is generated, the fifth slice (slice-4) is asynchronously triggered.

Slice-4 is composed of a single comparator $C_7$. The positive input of comparator $C_7$ is connected to the C-8C capacitive DAC. The other input is connected to the 31$V_{ref}$/32 node of the reference ladder. Slice-4 requires 16 reference voltages ($2^4$) to generate bit B<sub>3</sub>. The top plates of the DAC hold the sampled analog input signal, while the bottom plates are controlled by the inverted values of the first four output bits, as shown in Figure 5-5. By switching the bottom plates of the C-8C DAC between V<sub>ref</sub> and the virtual ground, the input is level shifted and compared with the constant reference voltage. For "0000" DAC control signals, the sampled analog input V<sub>in</sub> is not level shifted, and for "1111" control signals, the analog input is level shifted by an offset of 30$V_{ref}$/32.

To demonstrate how this level shifting works, consider the case in which the C-8C DAC digital control is "1110". This means that the first 4 bits were "0001" and that the sampled analog input $V_{in}$ is between $V_{ref}/16$ and $2V_{ref}/16$. The next comparison in the regular binary search algorithm would compare V<sub>in</sub> with 3V<sub>ref</sub>/32. The DAC shifts the analog input up to (V<sub>in</sub> + 28V<sub>ref</sub>/32), so the comparison becomes (V<sub>in</sub> + 28V<sub>ref</sub>/32 > 31$V_{ref}$/32), which is equivalent to ($V_{in}$ > 3$V_{ref}$/32). Without the capacitive DAC, the input of comparator C<sub>7</sub> would have to connect to one of 16 possible nodes on the reference ladder. With the capacitive DAC, comparator $C_7$ is connected to a single fixed reference point.
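The equivalence between comparing the level-shifted input against the fixed 31$V_{ref}$/32 reference and comparing the unshifted input against the regular binary search reference can be checked numerically. The sketch below assumes an ideal 4-bit DAC whose shift equals the control word times $V_{ref}$/16, with no parasitic loading; the function names are hypothetical, and exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction

VREF = Fraction(1)
FIXED_REF = Fraction(31, 32) * VREF   # single reference node seen by comparator C7


def dac_shift(msb_code):
    """Ideal level shift; the DAC control word is the inverted 4-bit MSB code."""
    control = 15 - msb_code            # bitwise inverse of a 4-bit code
    return Fraction(control, 16) * VREF


def direct_reference(msb_code):
    """Reference a regular binary search would use to resolve bit B3."""
    return Fraction(2 * msb_code + 1, 32) * VREF


def b3_shifted(vin, msb_code):
    """Decision made on the level-shifted input against the fixed reference."""
    return vin + dac_shift(msb_code) > FIXED_REF


def b3_direct(vin, msb_code):
    """Decision made on the unshifted input against a code-dependent reference."""
    return vin > direct_reference(msb_code)


# Worked case from the text: first four bits "0001", DAC control "1110".
print(dac_shift(1), direct_reference(1))          # 7/8 (= 28/32) and 3/32
print(b3_shifted(Fraction(5, 64), 1), b3_direct(Fraction(5, 64), 1))
```

Under these ideal assumptions the two decisions agree for every 4-bit code, which is the property that lets one fixed reference node replace the 16 code-dependent ones.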

After bit $B_3$ is generated, the operation continues with the same asynchronous processing as described before. The sixth slice (slice-5) needs 32 reference voltages (2<sup>5</sup>) to generate bit $B_2$. Two comparators connected to the 61V<sub>ref</sub>/64 and 63V<sub>ref</sub>/64 nodes on the reference ladder can perform the comparison for the whole range, since the input signal is shifted to the correct sub-range by the C-8C DAC. The process continues in a similar way for the last two slices (slice-6 and slice-7), generating the last two bits $B_1$ and $B_0$.

The proposed ABS-IRS ADC eliminates the exponential growth of the component count that affects the classical CABS and the other BS-ADCs. The proposed ADC uses only 12 comparators, six 4 to 1 multiplexers, and a C-8C charge redistribution DAC with a total capacitance of $16C_{unit}$. If the resistive reference ladder and multiplexers were used without the capacitive DAC, slice-4 and slice-5 would require three 16 to 1 MUXs, and slice-6 and slice-7 would require three 64 to 1 MUXs. By using a single 4-bit capacitive DAC as a level shifter for the last 4 slices, the exponential growth of the MUXs is eliminated. The ADC is similar to a pipeline structure. If a pipeline structure were used to build an 8-bit ADC from two 4-bit ADCs, a residue amplifier would amplify the residue of the first 4 bits, and the amplified signal would be compared with $V_{ref}$. In this ADC, the second 4-bit ADC uses an effective reference of $V_{ref}/16$ and compares the non-amplified residue against this reference.

The proposed ADC computes the output digital bits in 8 clock cycles. The only concern for the proposed ADC is the DAC settling. The $8C_{unit}$ capacitor switches after the decision of comparator $C_1$ and should settle before comparator $C_7$ is triggered. The $C_{unit}$ capacitor is switched after the decision of comparator $C_5$ or $C_6$ and should also settle before comparator $C_7$ is triggered. Therefore, the DAC settling times are adjusted so that the corresponding comparator and logic delays are sufficient for each capacitor in the DAC to settle.

## 5.3.1 Circuit Implementation of the Proposed 8-bit, TI ABS-IRS ADC Architecture

#### 5.3.1.1 Track and hold circuit

Random variation of the sampling instant is known as aperture jitter or sampling clock uncertainty. Aperture jitter is a particularly critical issue in high-speed data converters. It mainly originates from the phase noise of the sampling clock generator and from sampling circuit noise. As a result, if a periodic analog input signal is sampled periodically at the same value, slight variations in the hold value occur, creating a sampling error that limits the converter's dynamic performance.

Therefore, digitizing high-speed signals at high resolution requires careful selection of a clock oscillator and a track and hold circuit that will not degrade the sampling performance of the data converter [64] [65]. The signal-to-noise ratio is limited by clock jitter. The relation between SNR and clock jitter is given in equation (5.1).

$$SNR = -20 \log(2\pi f_{in} \Delta T_s) \tag{5.1}$$

where $f_{in}$ is the frequency of the sampled input signal, and $\Delta T_s$ is the RMS value of the clock jitter.
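Equation (5.1) can be evaluated directly, as in the short sketch below. The function name and the example operating point (a 10MHz input with 12.5ps RMS jitter) are chosen for illustration only and are not results from this work; the logarithm is assumed to be base 10, as is usual for a quantity expressed in dB.

```python
import math


def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """Jitter-limited SNR in dB from equation (5.1), assuming a base-10 logarithm."""
    return -20 * math.log10(2 * math.pi * f_in_hz * jitter_rms_s)


# Illustrative operating point (assumed values).
snr = jitter_limited_snr_db(10e6, 12.5e-12)
print(f"Jitter-limited SNR: {snr:.1f} dB")

# Doubling either the input frequency or the jitter costs about 6 dB.
print(f"{snr - jitter_limited_snr_db(20e6, 12.5e-12):.2f} dB")
```

The 6dB penalty per doubling follows directly from the 20·log form of the equation, which is why the jitter budget tightens as the input frequency rises.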

Table 5-1 shows the clock oscillator stability requirements for a data converter with N bit resolution when an aperture error less than 0.5 LSB is required for the data converter with a sampling frequency of 100MHz.

Track and hold circuits can be categorized into two groups: active and passive. The passive track and hold is mainly a CMOS switch and a sampling capacitor. A series dummy CMOS switch is used to reduce charge injection to the sampling capacitor.

Table 5-1. Maximum jitter $\Delta T_s$ for 0.5 LSB sampling uncertainty.

| Resolution (N) | Stability (ppm) | $\Delta T_{s}$ (ps) |
|----------------|-----------------|---------------------|
| 6              | 5000            | 50                  |
| 8              | 1250            | 12.5                |
| 10             | 312.5           | 3.125               |
| 12             | 78.1            | 0.78                |

The active track and hold is similar to the passive one but with an added source follower at its output. For high speed ADCs, using the active track and hold will be very challenging because the source follower requires wide bandwidth at high sampling rates to be able to track the high speed input. Also the source follower limits the dynamic range of the input signal, especially in modern processes that utilize low supply voltage. Figure 5-6 shows a simplified schematic diagram of the single ended passive track and hold that was implemented in the ABS-IRS ADC.



Figure 5-6. Simplified schematic diagram of the track and hold circuit.

## 5.3.1.2 Dynamic Latched Comparator

Eight dynamic latched comparators with automatic offset calibration are used in the proposed 8-bits ABS-IRS ADC. In this ADC, comparators and the resistive ladder consume most of the power. Therefore, a comparator structure that does not consume static power is used in the ADC. Figure 5-7 shows the schematic of the designed dynamic latched comparator with the trimming transistors (calibration circuits are not shown in this section). The regenerative latch formed by transistors ( $M_{10}$ - $M_{13}$ ) is in the pre-charge mode when CLK is high to eliminate the comparator's memory effect, and it is in the quantize mode when CLK is low.



Figure 5-7. Dynamic comparator used in the ABS-IRS ADC.

The comparator compares its inputs at the falling edge of CLK. Transistors $M_{14}$ and $M_{15}$ have been added to the comparator to compensate for the offset voltage, which affects the overall ADC performance. By controlling the gate voltage of one of the transistors $M_{14}$ and $M_{15}$, extra drain current is injected into one of the two nodes $V_{op2}$ or $V_{on2}$, which counterbalances the current mismatch caused by the offset voltage.

## 5.3.1.3 C-8C Charge Redistribution DAC

Figure 5-8 shows the schematic diagram of the C-8C charge redistribution DAC. $C_p$ is the parasitic capacitive load. The DAC is required for the indirect reference shifting of the last 4 LSBs, so it should be accurate to 8-bit resolution. Carefully designed MOS switches were implemented to guarantee fast switching without imposing parasitic capacitive loading that might affect the DAC accuracy. The fixed unit capacitor $C_{unit}$ was removed to compensate for the parasitic capacitive loading caused by the six comparator inputs connected to the top plates of the DAC. Moreover, the $V_{ref}$ applied to the capacitive DAC is trimmed in order to compensate for the gain error caused by the capacitive loading due to the comparator parasitics.

Figure 5-8. C-8C charge redistribution DAC.

### 5.3.1.4 Asynchronous Logic Block

The asynchronous logic block is connected to the four outputs of two comparators that are responsible for quantizing bit  $B_N$  as shown in Figure 5-9. The asynchronous logic block is responsible for three tasks. The first task is to trigger the next stage asynchronously for generating bit  $B_{N-1}$  using the asynchronous clock signal Aclk. The second task is to evaluate the required bit at that stage bit  $B_N$ . The third task is generating the control signals for the capacitive C-8C DAC using signal NOT ( $B_N$ ). This is limited to the first two stages only.



Figure 5-9. Asynchronous logic block schematic.

## 5.3.1.5 Chip Layout

The proposed 8-bit, four-channel time-interleaved ABS-IRS structure was designed and taped out using the IBM 0.13µm cmrf8sf process. Figure 5-10 shows the layout of the proposed chip. The total chip area is 4mm<sup>2</sup>. The chip contains four channels of the proposed ADC and a 4-phase clock generator for the time interleaving. The four highlighted blocks in Figure 5-10 show the four channels. Each channel includes its own resistive ladder, capacitive DAC, and offset correction logic.

Figure 5-10. Chip layout of the proposed 8b, 4 Channel TI, ABS-IRS ADC in IBM 0.13µm.

## 5.4 Simulation Results

#### 5.4.1 Single Channel Simulation Results

The proposed 8-bit ABS-IRS ADC was designed in the IBM 130nm cmrf8sf CMOS process with a 1.4V supply. The simulations were performed at a sampling frequency of 100MHz. Simulation results show that the proposed ABS-IRS ADC achieves a peak SNDR of 49.08dB. Figure 5-11 shows the simulated DNL and INL of the proposed ABS-IRS ADC. The DNL was 0.29/-0.4 LSB, and the INL was 0.37/-0.33 LSB. The proposed ABS-IRS ADC consumes a total power of 0.15mW.



Figure 5-11. Simulated DNL and INL of the proposed ABS-IRS ADC.

Figure 5-12 shows 2048 point FFT spectrum of the proposed ABS-IRS ADC for a 0.7V peak-to-peak sine wave input at 1MHz. The SNDR and SFDR for this input are 49.08dB and 63.35dB respectively.



Figure 5-12. FFT of the proposed 8-bit, ABS-IRS ADC for 1MHz sine input.

Table 5-2 summarizes the proposed ABS-IRS ADC performance.

Table 5-2. Performance summary of the proposed ABS-IRS ADC.

| Process | Supply | Sampling<br>Rate fs | Resolution | SNDR     | Power  | FoM        |
|---------|--------|---------------------|------------|----------|--------|------------|
| 130nm   | 1.4    | 100M                | 8 bits     | 49.08 dB | 0.15mW | 6.24fJ/con |

### 5.4.2 Four Channels Simulation Results

Four channels of the proposed 8-bit ABS-IRS ADC were placed in a time-interleaved structure. A four-phase clock generator was added to generate the four global synchronous clocks, together with the multiplexing logic that organizes the four output bytes of the four channels into a single output stream. The simulations were performed at a sampling frequency of 400MHz. Simulation results show that the proposed ADC achieves a peak SNDR of 39.98dB. Figure 5-13 shows the simulated DNL and INL of the proposed 4-channel TI ABS-IRS ADC. The simulated DNL was 0.36/-0.48 LSB, and the INL was 0.34/-0.4 LSB. The proposed TI ABS-IRS ADC consumes a total power of 0.667mW in the four ADC channels, the four-phase clock generator, and the clock drivers.



Figure 5-13. Simulated DNL and INL of the proposed 4-channel TI, ABS-IRS ADC.

Figure 5-14 shows the 2048-point FFT spectrum of the proposed 4-channel TI, 8-bit ABS-IRS ADC for a 0.7V peak-to-peak sine wave input at 10MHz. The SNDR and SFDR for this input are 39.533dB and 52.07dB, respectively.



Figure 5-14. Spectrum of the proposed 4-channels, 8-bit ABS-IRS ADC for 10MHz sine input.

Table 5-3 summarizes the performance of the proposed 4-channel TI, ABS-IRS ADC.

Table 5-3.

Performance summary of the proposed 4-channels TI, ABS-IRS ADC.

| Process | Supply | Sampling rate (fs) | Resolution | SNDR     | Power  | FoM                    |
|---------|--------|--------------------|------------|----------|--------|------------------------|
| 130nm   | 1.4V   | 400MS/s            | 8 bits     | 39.533dB | 0.67mW | 21.5fJ/conversion-step |
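The FoM entries in Tables 5-2 and 5-3 can be cross-checked with a short calculation. The sketch below assumes the commonly used Walden figure of merit, FoM = P / (fs · 2^ENOB) with ENOB = (SNDR − 1.76) / 6.02; this section does not state the formula, so the definition and the function names are assumptions.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR (ideal-quantizer relation, assumed)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in joules per conversion step (assumed definition)."""
    return power_w / (fs_hz * 2.0 ** enob(sndr_db))

# Simulated values taken from Tables 5-2 and 5-3.
single = walden_fom(0.15e-3, 100e6, 49.08)
four_ch = walden_fom(0.667e-3, 400e6, 39.533)

print(f"single channel: {single * 1e15:.2f} fJ/conversion-step")
print(f"four channels:  {four_ch * 1e15:.2f} fJ/conversion-step")
```

Under these assumptions the four-channel case evaluates to about 21.5fJ/conversion-step, in agreement with Table 5-3. The single-channel case evaluates to about 6.5fJ/conversion-step, which is close to but not identical with the 6.24fJ listed in Table 5-2.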

## 5.5 Chip Micrograph

Figure 5-15 shows the micrograph of the fabricated chip.



Figure 5-15. IBM chip micrograph.

## 5.6 Summary

An 8-bit asynchronous binary search with indirect reference switching (ABS-IRS) ADC was designed and fabricated in the IBM 130nm CMRF8SF, 1.2V CMOS process. A single ABS-IRS ADC can run at sampling rates up to 100MS/s. A 4-channel time-interleaved (TI) ADC structure built from ABS-IRS ADCs was also fabricated. Post-layout simulations confirmed operation at 400MS/s with an SNDR of 39.5dB and an FoM of 21.5fJ/conversion-step. Measurements of the fabricated chip are left for future work.

The ABS-IRS ADC uses the fewest comparators (1.5N) among the compared binary-search architectures. It also breaks the exponential growth of the multiplexer size with resolution, which makes the architecture expandable. Table 5-4 compares the 8-bit ABS-IRS ADC with 8-bit implementations of the state-of-the-art binary-search ADCs of [61] and [62]. Based on the total transistor counts in Table 5-4, the ABS-IRS ADC occupies about 11 times less area than the CABS ADC and about 13 times less area than the ABS ADC with reference switching.

Table 5-4.

Comparison summary between ABS-IRS ADC and the state-of-the-art BS ADCs.

| 8-bit architecture            | This work | ABS-Reference switching\* [62] | CABS [61]  |
|-------------------------------|-----------|--------------------------------|------------|
| No. of comparators            | 12        | 15                             | 255        |
| C-8C DAC                      | 1         | -                              | -          |
| 2:1 MUXs                      | -         | 2                              | -          |
| 4:1 MUXs                      | 6         | 2                              | -          |
| 8:1 MUXs                      | -         | 2                              | -          |
| 16:1 MUXs                     | -         | 2                              | -          |
| 32:1 MUXs                     | -         | 2                              | -          |
| 64:1 MUXs                     | -         | 2                              | -          |
| No. of MUX transistors\*\*    | 176       | 4424                           | -          |
| Total no. of transistors\*\*\* | 356 (1x)  | 4649 (13x)                     | 3825 (11x) |

\* For single ended architecture.

\*\* Using the same multiplexer architecture of [62].

\*\*\* The total number of transistors in multiplexers and dynamic latched comparators, using a 15 transistor comparator architecture.
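The totals in Table 5-4 follow from the comparator counts, the multiplexer transistor counts, and the 15-transistor comparator stated in the footnote. The short check below reproduces that arithmetic; the dictionary layout and function name are illustrative, and treating the CABS multiplexer count as zero is an assumption based on the table listing no multiplexers for it.

```python
TRANSISTORS_PER_COMPARATOR = 15  # from the third footnote of Table 5-4

# (number of comparators, number of MUX transistors), copied from Table 5-4.
architectures = {
    "ABS-IRS (this work)": (12, 176),
    "ABS-Reference switching [62]": (15, 4424),
    "CABS [61]": (255, 0),  # assumption: "-" in the table means no multiplexers
}

def total_transistors(comparators, mux_transistors):
    """Comparator transistors plus multiplexer transistors."""
    return comparators * TRANSISTORS_PER_COMPARATOR + mux_transistors

totals = {name: total_transistors(*counts) for name, counts in architectures.items()}
baseline = totals["ABS-IRS (this work)"]

for name, total in totals.items():
    print(f"{name}: {total} transistors ({total / baseline:.1f}x)")
```

The computed totals are 356, 4649, and 3825 transistors, and the ratios to the ABS-IRS total round to the 13x and 11x factors quoted in the table.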

# **CHAPTER 6** – Conclusion and Future Directions

Recent progress in digital communications, particularly in UWB and SDR systems, requires high-speed, medium-to-high resolution, power-efficient data converters. Along with the breakthroughs in digital signal processing, demand has grown for all-digital transceivers that replace analog components with low-cost digital circuit elements. For these reasons, modern receiver architectures place the ADC as close as possible to the antenna. This system-level change places strict requirements on the ADC's speed, linearity, power consumption, and accuracy. Therefore, the ADC should be able to operate at high speed, with medium-to-high resolution, and with optimized power consumption. Unfortunately, as CMOS processes scale further into the deep sub-micron regime, meeting these performance requirements becomes more challenging and requires both system-level and circuit-level considerations when designing the ADC.

This thesis addresses the most important aspects of designing high-speed, medium-to-high resolution, and power-efficient ADCs for ultra-wideband (UWB) and software defined radio (SDR) communication systems. This research investigated the ADC design challenges, including high-speed operation, medium-to-high bit resolution, energy efficiency, ease of design, and flexibility towards programmability. Architectural improvements to the conventional energy-efficient successive approximation register (SAR) ADC and the high-speed binary-search ADC have been introduced, improving their sampling rates and their flexibility for reconfiguration. In the loop-unrolled asynchronous SAR (ASAR) ADC, a new DAC scheme that improves the speed and flexibility for programmability was introduced. The proposed DAC scheme not only improves the speed and operating flexibility, but also reduces the total ADC input capacitance, which allows higher sampling speeds. Also, a new asynchronous processing scheme that improves speed, area, and energy efficiency was introduced. An 8-bit, 4-channel TI prototype was designed and fabricated in the AMS 0.35µm CMOS process as a proof of concept.

A new ADC architecture, named "asynchronous binary search with indirect reference switching ADC," or ABS-IRS ADC, was introduced in this thesis. The ABS-IRS ADC architecture implements the binary-search algorithm and quantizes two bits per stage. This novel architecture reduces the number of components required for each additional bit of resolution by breaking the exponential growth. It also uses fewer comparators (1.5N) than the state-of-the-art binary-search ADCs, which use (2N-1) comparators. Moreover, this architecture is more flexible for programmability. An 8-bit, 4-channel TI prototype was designed and fabricated in the IBM 130nm CMOS process as a proof of concept.

Finally, a novel offset correction technique, named "coarse-fine calibration," was introduced in this thesis. The proposed architecture implements digitally controlled trimming in two stages, using hybrid structures. This offset correction technique reduces the circuit complexity, calibration time, and power consumption while being more accurate.

## 6.1 Future Directions

The proposed circuits and techniques presented in this thesis could be improved in the future by focusing on two tasks. The first task is to improve the ADC accuracy by including calibration for parasitics and mismatch in the capacitive and resistive networks, and for channel-to-channel mismatch in TI structures. Also, for high-speed ADC topologies, calibration for clock jitter, which severely affects ADC performance, has to be developed, particularly for the generation of the different clock phases in TI structures. Moreover, more effort should be spent on developing new calibration techniques to eliminate the comparator's kick-back noise and clock feedthrough, as they are key limiting factors for the ADC's speed and accuracy.

The second task is to improve the ADC architectures, targeting higher speeds and lower power consumption while maintaining or increasing the accuracy. For example, in the proposed asynchronous binary-search indirect reference switching (ABS-IRS) ADC, removing the 4-to-1 multiplexers would result in a topology in which no switching occurs during ADC operation, which would improve speed, accuracy, and power consumption. Also, in the asynchronous successive approximation register (ASAR) ADC, the ASAR structure requires roughly N+1 comparator delays for N bits of resolution. Thus, further speed improvement could be achieved by changing the ASAR ADC operation from sequential processing to pipelined processing.

# REFERENCES

- [1] E. Saberinia, A. H. Tewfik, K.-C. Chang, and G. E. Sobelman, "Analog to digital converter resolution of multi-band OFDM and pulsed-OFDM ultra wideband systems," in *First International Symposium on Control, Communications and Signal Processing*, pp. 787–790, 2004.
- [2] P. P. Newaskar, R. Blazquez, and A. R. Chandrakasan, "A/D precision requirements for an ultra-wideband radio receiver," in *IEEE Workshop on Signal Processing Systems (SIPS)*, pp. 270–275, 2002.
- [3] "First report and order in ET docket no.98-153, 17 FCC Rcd 7435", April 22, 2002.
- [4] V. Giannini, J. Craninckx, S. D'Amico, and A. Baschirotto, "Flexible Baseband Analog Circuits for Software-Defined Radio Front-Ends," *IEEE J. Solid-State Circuits*, vol. 42, no. 7, pp. 1501–1512, Jul. 2007.
- [5] J. Mitola III, "Software radios: Survey, critical evaluation and future directions," *IEEE Aerosp. Electron. Syst. Mag.*, vol. 8, no. 4, pp. 25–36, Apr. 1993.
- [6] H. Nyquist, "Certain topics in telegraph transmission theory," *Proc. IEEE*, vol. 90, no. 2, pp. 280–305, Feb. 2002.
- [7] R. M. Gray, "Quantization noise spectra," *IEEE Trans. Inf. Theory*, vol. 36, no. 6, pp. 1220–1244, Nov. 1990.
- [8] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, no. 5, pp. 1433–1439, Oct. 1989.

- [9] K.-S. Tan, S. Kiriaki, M. de Wit, J. W. Fattaruso, C.-Y. Tsay, W. E. Matthews, and R. K. Hester, "Error correction techniques for high-performance differential A/D converters," *IEEE J. Solid-State Circuits*, vol. 25, no. 6, pp. 1318–1327, Dec. 1990.
- [10] A. Varzaghani, A. Kasapi, D. N. Loizos, S.-H. Paik, S. Verma, S. Zogopoulos, and S. Sidiropoulos, "A 10.3-GS/s, 6-Bit Flash ADC for 10G Ethernet Applications," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3038–3048, Dec. 2013.
- [11] J. Pernillo and M. P. Flynn, "A 1.5-GS/s Flash ADC With 57.7-dB SFDR and 6.4-Bit ENOB in 90 nm Digital CMOS," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 58, no. 12, pp. 837–841, Dec. 2011.
- [12] Y.-Z. Lin, C.-W. Lin, and S.-J. Chang, "A 5-bit 3.2-GS/s Flash ADC With a Digital Offset Calibration Scheme," *IEEE Trans. Very Large Scale Integr. VLSI Syst.*, vol. 18, no. 3, pp. 509–513, March 2010.
- [13] M. Steyaert, R. Roovers, and J. Craninckx, "A 100 MHz 8 bit CMOS interpolating A/D converter," in *Proceedings of the IEEE 1993 Custom Integrated Circuits Conference*, pp. 28.1.1–28.1.4, 1993.
- [14] R. C. Taft, C. Menkus, M. R. Tursi, O. Hidri, and V. Pons, "A 1.8-V 1.6-GSample/s 8-b self-calibrating folding ADC with 7.26 ENOB at Nyquist frequency," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2107–2115, Dec. 2004.
- [15] J. Yao, J. Liu, and H. Lee, "Bulk Voltage Trimming Offset Calibration for High-Speed Flash ADCs," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 57, no. 2, pp. 110–114, Feb. 2010.

- [16] B. P. Ginsburg and A. P. Chandrakasan, "Dual Time-Interleaved Successive Approximation Register ADCs for an Ultra-Wideband Receiver," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 247–257, Feb. 2007.
- [17] Y.-C. Lien, "A 4.5-mW 8-b 750-MS/s 2-b/step asynchronous subranged SAR ADC in 28-nm CMOS technology," in *2012 Symposium on VLSI Circuits (VLSIC)*, pp. 88–89, 2012.
- [18] Y.-D. Jeon, Y.-K. Cho, J.-W. Nam, K.-D. Kim, W.-Y. Lee, K.-T. Hong, and J.-K. Kwon, "A 9.15mW 0.22mm2 10b 204MS/s pipelined SAR ADC in 65nm CMOS," in *2010 IEEE Custom Integrated Circuits Conference (CICC)*, pp. 1–4, 2010.
- [19] H. Wei, C.-H. Chan, U.-F. Chio, S.-W. Sin, U. Seng-Pan, R. Martins, and F. Maloberti, "A 0.024mm2 8b 400MS/s SAR ADC with 2b/cycle and resistive DAC in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp. 188–190, 2011.
- [20] K. Doris, E. Janssen, C. Nani, A. Zanikopoulos, and G. van der Weide, "A 480mW 2.6GS/s 10b 65nm CMOS time-interleaved ADC with 48.5dB SNDR up to Nyquist," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp. 180–182, 2011.
- [21] T. B. Cho and P. R. Gray, "A 10 b, 20 Msample/s, 35 mW pipeline A/D converter," *IEEE J. Solid-State Circuits*, vol. 30, no. 3, pp. 166–172, Mar. 1995.
- [22] S. Sutarja and P. R. Gray, "A pipelined 13-bit 250-ks/s 5-V analog-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 23, no. 6, pp. 1316–1323, Dec. 1988.

- [23] S.-W. M. Chen and R. W. Brodersen, "A 6-bit 600-MS/s 5.3-mW Asynchronous ADC in 0.13-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2669–2680, 2006.
- [24] L. Kull, T. Toifl, M. Schmatz, P. Francese, C. Menolfi, M. Braendli, M. Kossel, T. Morf, T. M. Andersen, and Y. Leblebici, "A 90GS/s 8b 667mW 64× interleaved SAR ADC in 32nm digital SOI CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp. 378–379, 2014.
- [25] C.-C. Huang, C.-Y. Wang, and J.-T. Wu, "A CMOS 6-Bit 16-GS/s time-interleaved ADC with digital background calibration," in *IEEE Symposium on VLSI Circuits (VLSIC)*, pp. 159–160, June 2010.
- [26] J. Fredenburg and M. Flynn, "A 90MS/s 11MHz bandwidth 62dB SNDR noise-shaping SAR ADC," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp. 468–470, 2012.
- [27] P. Harpe, C. Zhou, X. Wang, G. Dolmans, and H. de Groot, "A 30fJ/conversion-step 8b 0-to-10MS/s asynchronous SAR ADC in 90nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp. 388–389, 2010.
- [28] Y.-J. Chen, J.-H. Tsai, M.-H. Shen, and P.-C. Huang, "A 1-V 8-bit 100kS/s-to-4MS/s asynchronous SAR ADC with 46fJ/conv.-step," in *International Symposium on VLSI Design, Automation and Test (VLSI-DAT)*, pp. 1–4, 2011.
- [29] F. Akopyan, R. Manohar, and A. B. Apsel, "A level-crossing flash asynchronous analog-to-digital converter," in *IEEE International Symposium on Asynchronous Circuits and Systems*, pp. 11–22, 2006.

- [30] R. Schreier and G. C. Temes, *Understanding Delta-Sigma Data Converters*, 1st ed. Piscataway, NJ; Hoboken, NJ; Chichester: Wiley-IEEE Press, 2004.
- [31] M. Vogels and G. Gielen, "Architectural selection of A/D converters," in *Design Automation Conference*, pp. 974–977, 2003.
- [32] Y. Xu, L. Belostotski, and J. W. Haslett, "Offset-corrected 5GHz CMOS dynamic comparator using bulk voltage trimming: Design and analysis," in *New Circuits and Systems Conference (NEWCAS)*, pp. 277–280, June 2011.
- [33] H. Jeon, Y.-B. Kim, and M. Choi, "Offset voltage analysis of dynamic latched comparator," in *IEEE International Midwest Symposium on Circuits and Systems (MWSCAS)*, pp. 1–4, 2011.
- [34] A. Nikoozadeh and B. Murmann, "An Analysis of Latch Comparator Offset Due to Load Capacitor Mismatch," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 53, no. 12, pp. 1398–1402, 2006.
- [35] R. Sarpeshkar, J. L. Wyatt, Jr., N. C. Lu, and P. D. Gerber, "Mismatch sensitivity of a simultaneously latched CMOS sense amplifier," *IEEE J. Solid-State Circuits*, vol. 26, no. 10, pp. 1413–1422, 1991.
- [36] P. Kinget and M. Steyaert, "Impact of transistor mismatch on the speed-accuracy-power trade-off of analog CMOS circuits," in *IEEE Custom Integrated Circuits Conference*, pp. 333–336, 1996.
- [37] M. J. M. Pelgrom, H. P. Tuinhout, and M. Vertregt, "Transistor matching in analog CMOS applications," in *Electron Devices Meeting, IEDM Technical Digest*, pp. 915–918, 1998.

- [38] M. Choi and A. A. Abidi, "A 6-b 1.3-Gsample/s A/D Converter in 0.35-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 36, no. 12, pp. 1847–1858, 2001.
- [39] H. Okada, Y. Hashimoto, K. Sakata, T. Tsukada, and K. Ishibashi, "Offset Calibrating Comparator Array for 1.2-V 6-bit, 4-Gsample/s Flash ADCs using 0.13-µm Generic CMOS Technology," in *Solid-State Circuits Conference ESSCIRC*, pp. 711–714, 2003.
- [40] I. Mehr and D. Dalton, "A 500-MSample/s, 6-bit Nyquist-rate ADC for disk-drive read-channel applications," *IEEE J. Solid-State Circuits*, vol. 34, no. 7, pp. 912–920, 1999.
- [41] C. C. Enz and G. C. Temes, "Circuit techniques for reducing the effects of op-amp imperfections: autozeroing, correlated double sampling, and chopper stabilization," *Proc. IEEE*, vol. 84, no. 11, pp. 1584–1614, 1996.
- [42] O. A. Hafiz, X. Wang, P. J. Hurst, and S. H. Lewis, "Immediate Calibration of Operational Amplifier Gain Error in Pipelined ADCs Using Extended Correlated Double Sampling," *IEEE J. Solid-State Circuits*, vol. 48, no. 3, pp. 749–759, 2013.
- [43] C.-C. Huang and J.-T. Wu, "A background comparator calibration technique for flash analog-to-digital converters," *IEEE Trans. Circuits Syst. Regul. Pap.*, vol. 52, no. 9, pp. 1732–1740, Sept. 2005.
- [44] Y. L. Wong, M. H. Cohen, and P. A. Abshire, "A floating-gate comparator with automatic offset adaptation for 10-bit data conversion," *IEEE Trans. Circuits Syst. Regul. Pap.*, vol. 52, no. 7, pp. 1316–1326, July 2005.

- [45] C. Donovan and M. P. Flynn, "A 'digital' 6-bit ADC in 0.25-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 37, no. 3, pp. 432–437, Mar. 2002.
- [46] B. Verbruggen, P. Wambacq, M. Kuijk, and G. Van der Plas, "A 7.6 mW 1.75 GS/s 5 bit flash A/D converter in 90 nm digital CMOS," in *IEEE Symposium on VLSI Circuits*, pp. 14–15, 2008.
- [47] G. Van der Plas, S. Decoutere, and S. Donnay, "A 0.16pJ/Conversion-Step 2.5mW 1.25GS/s 4b ADC in a 90nm Digital CMOS Process," in *Solid-State Circuits Conference, ISSCC Digest of Technical Papers*, p. 2310–, 2006.
- [48] I. Abougindia, I. Cevik, S. U. Ay, and F. N. Zghoul, "A fast two-step coarse-fine calibration (CFC) technique for precision comparator design," in *IEEE International Conference on Electronics, Circuits, and Systems (ICECS)*, pp. 153–156, 2013.
- [49] C.-H. Chan, Y. Zhu, U.-F. Chio, S.-W. Sin, S.-P. U, and R. P. Martins, "A reconfigurable low-noise dynamic comparator with offset calibration in 90nm CMOS," in *Solid State Circuits Conference (A-SSCC)*, pp. 233–236, 2011.
- [50] M. Abbas, Y. Furukawa, S. Komatsu, J. Y. Takahiro, and K. Asada, "Clocked comparator for high-speed applications in 65nm technology," in *Solid State Circuits Conference (A-SSCC)*, pp. 1–4, 2010.
- [51] J. J. Kang and M. P. Flynn, "A 12b 11MS/s successive approximation ADC with two comparators in 0.13 μm CMOS," in *Symposium on VLSI Circuits*, pp. 240–241, June 2009.

- [52] P. Nuzzo, C. Nani, C. Armiento, A. Sangiovanni-Vincentelli, J. Craninckx, and G. Van der Plas, "A 6-bit 50-MS/s threshold configuring SAR ADC in 90-nm digital CMOS," in *Symposium on VLSI Circuits*, pp. 238–239, 2009.
- [53] S.-W. Chen and R. W. Brodersen, "A 6b 600MS/s 5.3mW Asynchronous ADC in 0.13μm CMOS," in *Solid-State Circuits Conference, ISSCC Digest of Technical Papers*, pp. 2350–2359, 2006.
- [54] J. Yang, T. L. Naing, and R. W. Brodersen, "A 1 GS/s 6 Bit 6.7 mW Successive Approximation ADC Using Asynchronous Processing," *IEEE J. Solid-State Circuits*, vol. 45, no. 8, pp. 1469–1478, Aug. 2010.
- [55] Z. Cao, S. Yan, and Y. Li, "A 32mW 1.25GS/s 6b 2b/step SAR ADC in 0.13μm CMOS," in *Solid-State Circuits Conference, ISSCC Digest of Technical Papers*, pp. 542–634, 2008.
- [56] T. Jiang, W. Liu, F. Y. Zhong, C. Zhong, K. Hu, and P. Y. Chiang, "A Single-Channel, 1.25-GS/s, 6-bit, 6.08-mW Asynchronous Successive-Approximation ADC With Improved Feedback Delay in 40-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 10, pp. 2444–2453, Oct. 2012.
- [57] L. Cong, "Pseudo C-2C ladder-based data converter technique," *IEEE Trans. Circuits Syst. II Analog Digit. Signal Process.*, vol. 48, no. 10, pp. 927–929, Oct. 2001.
- [58] S. P. Singh, A. Prabhakar, and A. B. Bhattacharyya, "C-2C ladder-based D/A converters for PCM codecs," *IEEE J. Solid-State Circuits*, vol. 22, no. 6, pp. 1197–1200, Dec. 1987.
- [59] B. M. Gordon and R. P. Talambiras, "Signal conversion apparatus," U.S. Patent 3108266 A, Oct. 22, 1963.

- [60] D. Seo, A. Weil, and M. Feng, "A 14 bit, 1 GS/s digital-to-analog converter with improved dynamic performances," in *IEEE International Symposium on Circuits and Systems, ISCAS*, vol. 5, pp. 541–544, 2000.
- [61] G. Van der Plas and B. Verbruggen, "A 150MS/s 133μW 7b ADC in 90nm digital CMOS Using a Comparator-Based Asynchronous Binary-Search sub-ADC," in *Solid-State Circuits Conference, ISSCC Digest of Technical Papers*, pp. 242–610, 2008.
- [62] Y.-Z. Lin, S.-J. Chang, Y.-T. Liu, C.-C. Liu, and G.-Y. Huang, "An Asynchronous Binary-Search ADC Architecture With a Reduced Comparator Count," *IEEE Trans. Circuits Syst. Regul. Pap.*, vol. 57, no. 8, pp. 1829–1837, Aug. 2010.
- [63] A. Mesgarani and S. U. Ay, "A 6-Bit 1GS/s asynchronous binary search ADC with 2 bit flash quantizers," in *2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS)*, pp. 1008–1011, 2012.
- [64] V. J. Arkesteijn, E. A. M. Klumperink, and B. Nauta, "ADC clock jitter requirements for software radio receivers," in *IEEE 60th Vehicular Technology Conference (VTC)*, vol. 3, pp. 1983–1985, 2004.
- [65] G. Mitteregger, C. Ebner, S. Mechnig, T. Blon, C. Holuigue, and E. Romani, "A 20-mW 640-MHz CMOS Continuous-Time ADC With 20-MHz Signal Bandwidth, 80-dB Dynamic Range and 12-bit ENOB," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2641–2649, Dec. 2006.