Digital To Analog Conversion in High Resolution Audio - v2

Digital-to-Analog Conversion in
High Resolution Audio

by

Ivar Løkken, Ph.D.

Abstract

This book is based on the PhD study and thesis by the author. It describes theoretical and
simulation-based work on digital-to-analog conversion for high resolution audio. The
emphasis of the work has been exploration and clarification of issues of contention in
previous art. The work has resulted in six scientific papers published in international peer-
reviewed journals and conference proceedings, and these papers constitute the main
contribution. The papers are included as appendixes, whereas the preceding monograph serves
to provide the necessary background for understanding the results, and also their relevance in
an audio context. It should be noted that although the research primarily treats DA
conversion, the findings and conclusions are largely transferable also to AD conversion since
audio ADC performance is often limited by its usually compulsory feedback DAC.
The first paper, published in the Journal of the Audio Engineering Society, explores power
modulation of the quantization error and the need for dithering in delta-sigma modulators.
This issue has been much disputed: previous publications have argued both that the
DSM is self-dithering and that it has the same dither requirements as a regular REQ. By
exploring noise power modulation in the baseband it is shown that even high order DSMs are
not self-dithering in the true sense, but that the adverse effects of quantization are reduced
when the loop filter is of high order. If the REQ is multi-bit the noise power modulation can
be made negligible compared to any practical levels of circuit noise.
The second paper, published in IEEE Transactions on Circuits and Systems Part II,
explores a class of DSM called non-overloading or NOL modulators. Designing the DSM to
be NOL is the only known way to guarantee stability for high order loops, and also the only
way to guarantee no quantization noise power modulation. The paper proves that NOL design
criteria are equivalent for OF and EF modulators, repudiating a claim of difference in a
previous publication, and also their equivalence for rounding and truncating quantizers.
Although the results are developed for a certain class of modulators, the methods are easily
generalized to any DSM design. It is found desirable to use a many-bit REQ since a NOL
DSM with good input swing is then allowed.
The third paper, presented at the 31st Conference of the Audio Engineering Society, shows a
useful utilization of the results developed in the second paper. Using a many-bit DSM is
desirable for several reasons, but will in straightforward implementation require a DEM
network of excessive complexity. A previously proposed method to circumvent this is to
segment the DAC and DEM using a dedicated Segmentation-DSM. Previous art has used
SDSMs with a FIR loop filter to ensure no DAC saturation, restricting the concept to very
non-optimal designs. This publication utilizes the NOL method to design IIR SDSMs with
significantly improved performance.
The fourth paper, submitted to Analog Integrated Circuits and Signal Processing, describes
the development of simplified estimates for DSM DAC errors. The mathematical treatment of
high order DSMs is exceedingly difficult, but simplifications and rules of thumb have been
developed that enable design engineers to make quite straightforward optimization of relevant
DSM parameters. A major drawback is that these approximations do not account for analog
error sources in the DAC and may therefore lead to unfortunate design choices. This paper
explores how common DAC errors depend on the DSM transfer function, and presents
extensions of known approximation methods to also include the impact the DSM has on DAC
waveform distortion. Again it is confirmed that using a many-bit DSM is advantageous, and
also that a conservative DSM design will make the DAC less susceptible to errors.

The fifth paper, presented at the 124th Convention of the Audio Engineering Society, utilizes
the methods presented in the fourth paper to optimize a DAC with regard to jitter noise.
Clock jitter is one of the most critical performance bottlenecks in high resolution audio, and
the paper proposes ways to minimize the DAC's jitter susceptibility. The simplified
approximation methods are employed and extended to show that a semidigital FIR DAC gives
a more benign output waveform than a segmented DEM DAC of comparable complexity, and
that it will be a preferable solution if jitter dominates the error budget. A simple method is
also shown to estimate effects of implementation inaccuracies in the analog filter coefficients.
The sixth and last paper, published in ISAST Transactions on Electronics and Signal
Processing, explores and gives some advice on spectral analysis of delta-sigma data
converters. In a regular Nyquist rate converter, coherent sampling suffices to ensure no
spectral leakage, and thus to provide a spectral plot and measures of noise and distortion
that are representative of the real component's performance. In a DSM converter, on the other
hand, there is spectral leakage not only of the signal component and possible harmonic
distortion components, but also of the high-power out-of-band noise. The noise leakage might
smear the in-band spectrum and as such give a wrong estimate of the actual performance of
the converter. This paper explores the nature of the noise leakage and how to prevent it. It
concludes that a combination of both coherent sampling and windowing is recommended for
spectral analysis of delta-sigma data converters.

Table of Contents

Abstract
Table of Contents
List of Figures
List of Abbreviations
Chapter 1 Introduction
1.1 Hearing and Audio Quality
1.2 A Brief Historical Review of Digital Audio
1.3 Organization of This Book
Chapter 2 Fundamental Theory
2.1 Sampling and Reconstruction
2.2 Quantization
2.3 Oversampling
2.4 Dither
2.5 Delta-Sigma Modulation
2.6 The DAC
2.7 DAC Errors
Chapter 3 The Delta-Sigma Modulator
3.1 Delta Sigma Modulator Design
3.2 Alternative Delta-Sigma Structures
3.3 Stability
3.4 Cyclic Behaviour, Tones and Noise Power Modulation
3.5 Non-Overloading Delta-Sigma Modulators
Chapter 4 Mismatch Shaping
4.1 Mismatch Error Randomization
4.2 Element Rotation Techniques
4.3 Other Techniques
4.4 Segmented Mismatch Shaping
Chapter 5 Delta-Sigma and Dynamic DAC Errors
5.1 Delta-Sigma and Jitter Error Estimation
5.2 Delta-Sigma and Switching Error Estimation
5.3 Techniques for Reducing Dynamic Errors
Chapter 6 Conclusions

Appendix 1 Frequency Analysis
Appendix 2 Paper I
Appendix 3 Paper I Errata
Appendix 4 Paper II
Appendix 5 Paper III
Appendix 6 Paper IV
Appendix 7 Paper V
Appendix 8 Paper VI

Bibliography

List of Figures

Figure 1: Equal loudness contours (ISO226)
Figure 2: A-weighting function (IEC/CD 1672)
Figure 3: Digital audio recording and playback chain
Figure 4: Conceptualization of simultaneous masking
Figure 5: Sampling of a continuous-time signal
Figure 6: a) Continuous spectrum b) Sampled spectrum c) Alias distortion
Figure 7: Sampled waveform of fig.5 and an alias
Figure 8: Conceptual ADC and AAF
Figure 9: Conceptual DAC and RCF
Figure 10: Output waveform from PCM DAC
Figure 11: Hold reconstruction filtering effect
Figure 12: Uniform scalar mid-thread quantizer
Figure 13: Quantizer input PDF (a) and output PDF (b)
Figure 14: DAC oversampling in the time and frequency domains
Figure 15: Oversampling DA-converter with REQ
Figure 16: Dithered quantization
Figure 17: First two error moments as function of input level
Figure 18: Basic delta-sigma modulator
Figure 19: Illustration of DSM noise shaping
Figure 20: Processing gain of modN DSM
Figure 21: Resistor ladder type DAC
Figure 22: DCT integrator SC DAC
Figure 23: Current mode DAC with external I-V conversion
Figure 24: Jitter error in the time domain
Figure 25: Jitter distortion from sinusoid, white and pink jitter
Figure 26: Generalized schematic of binary encoded DAC
Figure 27: Generalized schematic of thermometer encoded DAC
Figure 28: DAC transfer function, ideal and with INL
Figure 29: DAC element on and off switching and error waveform
Figure 30: Equivalent small signal circuit for current steering DAC
Figure 31: INL from finite output impedance
Figure 32: Basic delta-sigma modulator
Figure 33: The Silva-Steensgaard modified DSM structure
Figure 34: Generalized DSM structure
Figure 35: Basic modN distributed feedback DSM
Figure 36: Generalized distributed feedback DSM
Figure 37: Distributed feedback DSM with resonator for NTF optimization
Figure 38: Optimization of NTF zeros
Figure 39: Distributed feed-forward DSM structure
Figure 40: The error-feedback DSM structure
Figure 41: A two-stage MASH modulator
Figure 42: Principle for the ultimate modulator
Figure 43: Trellis noise shaping modulator
Figure 44: Example of instability in high order DSM
Figure 45: Modified linear DSM model used in Root Locus method
Figure 46: Processing gain with 1-bit stable DSM
Figure 47: Output spectrum from fifth order DSM with rational DC input
Figure 48: Input PDF (a) and output PDF (b), single-bit quantizer
Figure 49: DAC element randomization, B=3 bit example
Figure 50: DWA DAC element rotation, B=3 bit example
Figure 51: Element selection sequence with DWA
Figure 52: Element selection sequence with second order DWA
Figure 53: Switching sequence for each element in a 3-bit DSM DAC
Figure 54: Switching sequence for each element in a 3-bit DSM DAC with DWA
Figure 55: Two element swapper cell
Figure 56: Swapping cell network for DEM, B=3
Figure 57: Data splitting and reduction for tree structure DEM
Figure 58: Complete reduction tree with first order mismatch shaping
Figure 59: DEM and DAC segmentation
Figure 60: Equivalent signal flow diagram of segmented DAC
Figure 61: DEM and DAC segmentation with SDSM
Figure 62: Two time DEM and DAC segmentation
Figure 63: Area error model for jitter distortion analysis
Figure 64: SJNRmax example, 50ps white jitter
Figure 65: Jittered spectrum with a) sinusoid, b) white, and c) mixed jitter
Figure 66: Simulated spectrum, 10ps switching asymmetry
Figure 67: Simulated SSNRmax example, 10ps switching asymmetry
Figure 68: Simulated ISI error spectrum
Figure 69: Simulated spectrum, 10ps switching asymmetry, DWA
Figure 70: Simulated spectrum of LPCM DAC with DWA
Figure 71: Simulated spectrum, 10ps switching asymmetry, R2DWA
Figure 72: Return-to-zero waveform
Figure 73: SJNRmax, 50ps white jitter and RZ DAC
Figure 74: Dual-RZ waveform
Figure 75: DAC time-interleaving, a) functional diagram, b) waveform
Figure 76: 1-bit DSM REQ with semidigital filtering DAC for multi-level output
Figure 77: Multi-bit DSM REQ with semidigital filtering DAC
Figure 78: a) Analog PWM modulation b) Digital PCM-PWM conversion
Figure 79: UPWM error
Figure 80: PWM-based algorithm used by Reefman et al. to eliminate mismatch and ISI
Figure 81: Illustration of DFT spectral leakage
Figure 82: Spectrum of sine multiplied with rectangular (top) and hann (bottom) window
Figure 83: Convoluted spectrum and DFT samples with coherent sampling
Figure 84: Illustration of signal leakage and noise leakage impairing DSM DFT

List of Abbreviations

AAF Anti-Alias Filter
ABE Analog Back End
ADC Analog to Digital Converter (or: Analog to Digital Conversion)
ADDA Analog-Digital-Digital-Analog
AES Audio Engineering Society
AFE Analog Front End
ANSI American National Standards Institute
ARA Acoustic Renaissance for Audio
ASRC Asynchronous Sample-Rate Converter
BIBO Bounded Input Bounded Output
BJT Bipolar Junction Transistor
CD Compact Disc
CF Characteristic Function
CMOS Complementary Metal Oxide Semiconductor
DAC Digital to Analog Converter (or: Digital to Analog Conversion)
DB DeciBel
DBFS DeciBel relative to full-scale
DCT Direct Charge Transfer
DEM Dynamic Element Matching
DFT Discrete Fourier Transform
DIN Deutsches Institut für Normung
DNL Differential Non-Linearity
DSM Delta Sigma Modulator (or: Delta Sigma Modulation)
DSD Direct Stream Digital
DSP Digital Signal Processing
DTFT Discrete Time Fourier Transform
DVD Digital Versatile Disc
DVD-A DVD-Audio
DWA Data Weighted Averaging
EF Error Feedback
ENOB Effective Number of Bits
FET Field Effect Transistor
FFT Fast Fourier Transform
FIR Finite Impulse Response
FOM Figure of Merit
FPGA Field Programmable Gate Array
FS Full-Scale (or, if written fs: sampling frequency)
HD Harmonic Distortion
HD2 Second Harmonic Distortion
HD3 Third Harmonic Distortion
HF High Frequency
Hi-res High Resolution
IC Integrated Circuit
IEEE Institute of Electrical and Electronics Engineers
IEC International Electrotechnical Commission
IFIR Interpolated FIR
IIR Infinite Impulse Response
ILA Individual Level Averaging
INL Integral Non-Linearity
ISI Inter Symbol Interference
ISO International Organization for Standardization
JTF Jitter Transfer Function
LF Low Frequency
LFSR Linear Feedback Shift Register
LPCM Linear Pulse Code Modulation
LSB Least Significant Bit
MAC Multiplier Accumulator
MASH Multi stAge noise-SHaping
MOS Metal Oxide Semiconductor
MSB Most Significant Bit
MSE Mean Square Error
NOL Non-Overloading
NOS Non-Oversampling
NRZ Non Return to Zero
NTF Noise Transfer Function
OF Output Feedback
OSR Oversampling Ratio
PCM Pulse Code Modulation
PDF Probability Density Function
PDM Pulse Density Modulation
PLL Phase Locked Loop
PRNG Pseudo Random Number Generator
PSD Power Spectral Density
PWM Pulse Width Modulation
RCF Reconstruction Filter
REQ Re-Quantizer (or: Re-Quantization)
RMS Root Mean Square
ROC Region of Convergence
RZ Return to Zero
SACD Super Audio Compact Disc
SC Switched Capacitor
SFDR Spurious Free Dynamic Range
SJNR Signal to Jitter Noise Ratio
SMNR Signal to Mismatch Noise Ratio
SNDR Signal to Noise and Distortion Ratio
SNR Signal to Noise Ratio
SP-DIF Sony/Philips Digital Interface Format
SPL Sound Pressure Level
SQNR Signal to Quantization Noise Ratio
SSNR Signal to Switching Noise Ratio
STF Signal Transfer Function
THD Total Harmonic Distortion
THD+N Total Harmonic Distortion and Noise
TNSM Trellis Noise-Shaping Modulator
VLSI Very Large Scale Integration
VQ Vector Quantization

Chapter 1

Introduction

1.1 Hearing and Audio Quality

When Edison invented the phonograph in the 1870s [1], he probably didn't envision what a
major industry the recording, conservation and reproduction of music would become.
Advances in technology have steadily increased the performance as well as availability of
reproduced sound, and a listener can now fit an entire music library in transparent quality into
his pocket.
In the development of audio technology, the qualitative context is provided by knowledge
of the human auditory system and its properties. Fletcher and Munson did
important early work in quantifying the bandwidth and sensitivity of human hearing [2],
which resulted in the equal loudness contours and the phon as a unit of perceived
loudness. The Fletcher-Munson curves were later revised as the Robinson-Dadson curves [3],
which became the basis of the ISO226 equal loudness standard.
Figure 1 shows the equal loudness curves according to ISO226. The 0-phon curve is known
as the threshold of audibility and the 120-phon curve as the threshold of pain. The span
between these two thresholds is generally acknowledged as the usable dynamic range of the
human auditory system. It thus represents a measure for the desirable dynamic range in audio
equipment. The y-axis is the absolute SPL in dB relative to a reference of 20 µPa RMS.
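The SPL scale used for the y-axis can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard 20 µPa RMS reference pressure; the function names `spl_db` and `pressure_from_spl` are hypothetical and chosen only for this illustration.

```python
import math

P_REF = 20e-6  # reference sound pressure in Pa RMS (20 micropascal)

def spl_db(p_rms):
    """Sound pressure level in dB for an RMS pressure given in Pa."""
    return 20.0 * math.log10(p_rms / P_REF)

def pressure_from_spl(level_db):
    """RMS pressure in Pa corresponding to a sound pressure level in dB."""
    return P_REF * 10.0 ** (level_db / 20.0)

if __name__ == "__main__":
    # A 120 dB span corresponds to a pressure ratio of one million.
    print(pressure_from_spl(120.0) / pressure_from_spl(0.0))
    print(spl_db(20.0))
```

A 120dB span in SPL thus corresponds to a ratio of 10^6 in sound pressure, which gives a sense of the range that audio equipment is asked to cover.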

Figure 1: Equal loudness contours (ISO226)

In the frequency range of approximately 2kHz to 5kHz, called the midrange, the dynamic
range exceeds 120dB. It is well maintained into the lower (bass) and higher (treble)
frequency regions, but below 100 Hz and above 10 kHz it reduces significantly. The
bandwidth of hearing varies from person to person, but the normal convention is to
assume 20Hz to 20kHz for young, healthy people. Some studies suggest, however, that the
way we perceive the timbre of a sound is affected by significantly higher frequencies than this
[4]-[5]. Many musical instruments also have a larger bandwidth than 20kHz [6].
Because of the large variation with frequency in our hearing sensitivity, uniform frequency
weighting can give misleading figures when measuring audio quality. A widely accepted
frequency weighting norm for sound measurement is the so-called A-weighting function
(IEC/CD1672), which approximates the inverse of the 40-phon curve using six poles and four
differentiating zeros. The frequency response of the standardized A-weighting function is
shown in fig.2.

Figure 2: A-weighting function (IEC/CD 1672)

A-weighting is frequently used in the specification and measurement of audio equipment,
including audio data converters. For instance, noise is often A-weighted when measuring
SNR. A-weighting reduces the power of a predominantly white noise spectrum by around 3dB
in the range 20Hz to 20kHz, meaning A-weighted SNR values are approximately 3dB
better than unweighted ones, given white or predominantly white noise and this bandwidth.
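The effect of A-weighting on white noise can be estimated numerically. The sketch below is an illustration only: it assumes the commonly tabulated analog A-weighting magnitude response (pole frequencies near 20.6Hz, 107.7Hz, 737.9Hz and 12194Hz, normalized to 0dB at 1kHz), which is not spelled out in the text, and it integrates an ideally flat noise spectrum with a simple trapezoidal rule. The function names are hypothetical.

```python
import math

def a_weight_db(f):
    """A-weighting gain in dB at frequency f (Hz), analog magnitude response."""
    f2 = f * f
    num = (12194.217 ** 2) * f2 * f2               # four differentiating zeros
    den = ((f2 + 20.598997 ** 2)                   # double pole near 20.6 Hz
           * math.sqrt((f2 + 107.65265 ** 2)       # single pole near 107.7 Hz
                       * (f2 + 737.86223 ** 2))    # single pole near 737.9 Hz
           * (f2 + 12194.217 ** 2))                # double pole near 12.2 kHz
    return 20.0 * math.log10(num / den) + 2.00     # offset gives 0 dB at 1 kHz

def white_noise_change_db(f_lo=20.0, f_hi=20000.0, steps=20000):
    """Change in total power (dB) when flat noise in [f_lo, f_hi] is A-weighted."""
    df = (f_hi - f_lo) / steps
    total = 0.0
    for k in range(steps + 1):
        f = f_lo + k * df
        w = 0.5 if k in (0, steps) else 1.0        # trapezoidal end-point weights
        total += w * 10.0 ** (a_weight_db(f) / 10.0) * df
    return 10.0 * math.log10(total / (f_hi - f_lo))

if __name__ == "__main__":
    print(round(a_weight_db(1000.0), 2))    # gain at the 1 kHz reference
    print(round(white_noise_change_db(), 2))
```

The printed power change can be compared with the figure of around 3dB quoted above; an idealized, perfectly white spectrum need not reproduce that figure exactly, since practical converter noise is only predominantly white.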
Based on the known characteristics of the human auditory system, the ARA commission in
1995 suggested that a high resolution audio carrier capable of full transparency should have at
least 120dB dynamic range and 26kHz usable bandwidth [7]. It should be noted though that
terms like transparency and audio quality are subject to an ongoing dispute between two
camps, the so-called objectivist and subjectivist factions within the hi-fi community [8].
The subjectivists are generally sceptical of the authority of empirical data, and will often
use arguments of a solipsist and/or panpsychist nature to contest the truisms of established
science. As a scientific document, this thesis is founded on the objectivist point of view
without any further discussion thereof.

1.2 A Brief Historical Review of Digital Audio

When audio entered the digital world where storage and processing capabilities increase
exponentially with time as predicted by Moore [9], it rapidly became feasible to process
digital audio carriers exceeding the transparency requirements defined by ARA. Practical
considerations and standardization efforts have however led to a more erratic increase in de
facto performance than the feasibility limits governed by Moore's Law.
Digital audio was brought to the consumer with the introduction of the Compact Disc audio
playback system in the early 1980s [10]. Marketed under the pretentious slogan "Perfect
Sound Forever", the CD format offered 20kHz bandwidth and 96dB dynamic range in stereo.
The first commercial CD players, the Sony CDP101 (Japan) and Philips CD100 (Europe),
featured around 90dB dynamic range.

A complete digital audio chain will look approximately like fig.3. An instrument emits
sound to an acoustoelectric transducer or microphone and the resulting electric signal is
amplified and filtered by an AFE. It is then converted to digital data with an ADC, before the
data is stored on a CD or other digital audio medium. During playback the medium is read
and output data is transformed back to an analog signal with a DAC, amplified and filtered by
an ABE and converted to sound through an electroacoustic transducer or loudspeaker. Ideally
this entire process should be audibly transparent.

Figure 3: Digital audio recording and playback chain

It is well known that the electroacoustic transducers introduce more distortion than the other
elements in the chain. Nevertheless the development process usually aspires to achieve local
transparency, so that the component in question can be disregarded as an error source when
evaluating the system. The CD format's transparency is questionable both in terms of
dynamic range and bandwidth, and its limitation to two channels makes spatial transparency
unobtainable [11]. Still the CD-system has proved to be very resilient. Part of the reason for
this must be attributed to the fact that it took many years before converter technology reached
a level where ADC and DAC performance approached the fundamental limits of the format.
Entering the 1990s, the effective resolution of ADCs and DACs began to reach a plateau
where the CD format itself limited the performance of the ADDA process [12]. This led to an
emerging demand for, and research activity into, higher resolution carriers, including the
aforementioned ARA study. By the turn of the century, two competing bids for the next generation
audio carrier were launched: Philips and Sony, the companies behind the CD success,
fronted the SACD [13] as its heir, whereas the working group behind the then already highly
successful DVD video standard promoted the audio-specific DVD-A [14].
SACD is based on DSD, a radical 1-bit noise-shaped storage format theoretically facilitating
the abolition of non-linear ADC and DAC units. It features 120dB dynamic range and
100kHz bandwidth in up to six channels. The DVD-A format uses more conventional 24-bit
LPCM storage and offers a theoretical dynamic range of 144dB. The bandwidth can be up to
96kHz in two channels or 48kHz in five channels. Double-blind listening tests have failed to
prove any audible differences between the two formats [15] and both have fundamental
performance limits well beyond what is achievable with current converter technology. Still,
despite their high promises and impressive technological potential, the DVD-A and SACD
formats have both failed to gain mass-market appeal [16]. This coincides with severe
problems for the music business as a whole, as the internet media revolution threatens to put
both hi-res audio and the conventional recording industry out of contention [17].
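The theoretical dynamic range and bandwidth figures quoted for the LPCM-based formats follow from simple arithmetic. As a minimal sketch, the code below takes the dynamic range as the ratio between full scale and one quantization step, 20·log10(2^N), and the bandwidth as half the sampling rate. The 16-bit word length of the CD and the 192kHz and 96kHz sampling rates used for DVD-A are assumptions brought in for the illustration; they are not stated in the text above. The function names are hypothetical.

```python
import math

def lpcm_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2.0 ** bits)

def nyquist_bandwidth_hz(sample_rate_hz):
    """Maximum representable bandwidth for a given sampling rate."""
    return sample_rate_hz / 2.0

if __name__ == "__main__":
    # Assumed word lengths: 16 bits for CD, 24 bits for DVD-A.
    for name, bits in (("CD", 16), ("DVD-A", 24)):
        print(name, round(lpcm_dynamic_range_db(bits), 1), "dB")
    # Assumed DVD-A sampling rates: 192 kHz (two channels), 96 kHz (multichannel).
    for rate in (192000.0, 96000.0):
        print(rate, "Hz ->", nyquist_bandwidth_hz(rate), "Hz bandwidth")
```

Under these assumptions the sketch reproduces the 96dB and 144dB figures to within rounding. The full-scale sine SQNR of an ideal quantizer is usually quoted with an additional constant term, so this should be read as an approximation of the format limit rather than a measured converter figure.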
From an engineering point of view, internet music shifted the prime focus of digital audio
research. Since internet lines offer limited bandwidth, the emphasis has been moved to
compression and optimization of quality versus data-rate trade-offs, using sophisticated
quantization and coding methods based on perceptual modelling of the auditory system [18].

Perceptual coding algorithms vary in complexity, but all are fundamentally based on the ear's
masking property: that loud sounds overwhelm weaker sounds and render them inaudible.
Hearing has temporal, spatial and frequency-based masking properties that have led
to some quite sophisticated models and compression formats. This is only tangentially
relevant to data converter design, and the only masking property touched upon in this thesis is
simultaneous or frequency masking: the hearing threshold depends greatly on the distance in
frequency to a strong signal component or masker, which is illustrated in fig.4.
Consequently the spectral properties affect the severity of many distortion mechanisms.
Distortion audibility will depend on the distance to maskers as well as the harmonic
coherence of the distortion spectrum [19].
Modern computer-based compressed audio formats are generally scalable, and with the rapid
increase in network and storage capacity, high-bandwidth transfer is gaining in popularity.
Combined with advances in the sophistication of perceptual compression routines, the
dynamic range and bandwidth limitations are catching up to SACD and DVD-A levels. This
means that converter technology is again becoming the limiting factor of the ADDA process.

Figure 4: Conceptualization of simultaneous masking

As part of the digital audio history, growing concerns among both consumers and
mastering engineers about the so-called "loudness war" [20] should also be mentioned. The
omnipresence of reproduced music has led to a move from record labels to increase the
nominal volume in recordings. The intention of doing so is to make a record stand out in the
plethora of airwave broadcasts and marketing, since people will notice a sound quicker if it is
loud. This means that the high dynamic range of modern digital formats is often not utilized.
Unfortunate as it may be this is however not related to the capabilities of converter or digital
format technology per se, and is thus only mentioned in the introduction for its contextual and
historical relevance.

1.3 Organization of This Book

In the introduction, the motivation for the work has been epitomized, based on a brief
review of some fundamental psychoacoustic limitations and a historical retrospect of digital
audio. The next chapter will bridge this with fundamental data converter theory. It reviews the
processes of sampling and quantization, as well as DA conversion and what waveform errors
will typically be introduced by a DAC circuit. This chapter will also establish the case for
using oversampled conversion and delta-sigma modulation in audio.


Following this, the third chapter moves on to explore the DSM and its properties. The
history of delta-sigma modulation as well as principles and complications surrounding its
implementation are reviewed. The concepts of stability and loop filter design are introduced,
and the chapter also takes a brief look at some more recent structures and why they are used.
The reader should through this gain a pragmatic understanding of delta-sigma.
The fourth chapter deals with static DAC errors and how these will limit the performance of
the DA conversion process. It introduces DEM and the notation used in the fourth paper to
argue for a simple estimation method to predict errors in generic DEM DACs. In addition to
traditional rotation based DEM, it also explains the reasoning behind some alternative
structures that have been introduced in more recent times.
The fifth chapter deals with dynamic DAC errors and how these will limit the performance
of the DA conversion process. Since dynamic errors are waveform dependent, it means they
will be strongly affected by the output sample sequence from the delta-sigma modulator. This
sequence is generally impossible to predict analytically, but the chapter shows how its
spectral properties can be used to create dynamic error estimates. This chapter has significant
overlap with the contents of paper four, but was included in the monograph to make it appear
more complete and coherent.
The monograph is primarily intended to provide an overview with a unified notation of the
subjects touched upon in this Ph.D. project work. Having read it, the reader should be
provided with the foundation necessary for a general understanding of the papers, their
relevance and what their contributions constitute. The papers are themselves the main
contribution; their contents having been briefly reviewed in the abstract. They are to be found
in appendixes two to eight, whereas the first appendix reviews the DFT and discrete time
spectral analysis of finite length signals. Such analysis is used in most converter performance
evaluations, both in this work and generally, and it is therefore important to understand the
properties of the DFT and the limitations and pitfalls in finite length spectral analysis.


Chapter 2

Fundamental Theory

In this chapter basic data converter theory is reviewed; it is described how data conversion
works and what fundamental limitations and practical errors are inherent in ADC and DAC
processing. They must be assessed in the context of sound perception as reviewed in the first
chapter, forming the cognitive basis for understanding the book and later contents.

2.1 Sampling and Reconstruction

The foundation for digital signal processing was to a large extent laid with the
breakthrough discovery of the sampling theorem. It was implied as early as 1928, through the
derivation by Harry Nyquist [21] that a system of bandwidth B could transmit independent
pulse samples at a rate 2B. Nyquist's work focused on transmission capacity and did not
consider sampling and reconstruction of continuous-time signals as such. The now obvious
duality of Nyquist's discovery, the theory of how any continuous-time signal can be sampled
with no loss of information given a sampling frequency of at least twice its bandwidth, was
first formulated by Soviet information theory pioneer Vladimir Kotelnikov in 1933 [22]
(see footnote 1) and made known to the larger international scientific community through
Claude Shannon's legendary 1948 publication "A Mathematical Theory of Communication" [23].
Shannon formulated the theorem in that publication, and gave its proof and coined the term
"sampling theorem" in his 1949 follow-up paper "Communication in the Presence of Noise" [24]. These
two papers are generally acknowledged to be in large part the origin of modern information
theory and digital signal processing, and Shannon is renowned as the father of information
theory. The sampling theorem is also often called "Shannon's sampling theorem" and
sometimes incorrectly "Nyquist's sampling theorem". The bandwidth limit at half the
sampling frequency for any sampled signal is known rightfully as the Nyquist frequency.

2.1.1 Sampling

In order to enable a digital representation of a signal, samples must be taken to which
data values can be assigned. The signal is measured at a fixed interval $T_s$, hereafter called the
sampling period. The inverse of the sampling period is known as the sampling frequency,
$f_s = 1/T_s$.

Figure 5: Sampling of a continuous-time signal

1 Whittaker arguably gave the theorem first, implicitly [25]. History enthusiasts may enjoy the IEEE anniversary review [26].
Sampling is illustrated with a simple sinewave in fig.5. From the figure it is seen that in
mathematical terms sampling is the multiplication of the input signal with a string of Dirac
pulses at all integer multiples of the sampling period $T_s$. The mathematical description of this
operation is given by:

$$x[n] = x(t) \cdot T_s \sum_{n=-\infty}^{\infty} \delta(t - nT_s). \tag{1}$$

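With the Fourier transform $\mathcal{F}$, its definition and use assumed familiar to the reader, an
expression for the sampled signal frequency spectrum $\mathcal{F}\{x[n]\}$ can be found as a function of
the continuous signal frequency spectrum $\mathcal{F}\{x(t)\}$:

$$\begin{aligned}
X_s(f) &= \mathcal{F}\{x[n]\} \\
&= \mathcal{F}\Big\{x(t) \cdot T_s \sum_{n=-\infty}^{\infty} \delta(t - nT_s)\Big\} \\
&= \mathcal{F}\Big\{x(t) \sum_{n=-\infty}^{\infty} e^{i 2\pi n f_s t}\Big\} \\
&= \sum_{n=-\infty}^{\infty} X(f - n f_s), \quad X(f) = \mathcal{F}\{x(t)\}.
\end{aligned} \tag{2}$$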

It is seen from this result that sampling gives a spectrum repeating around multiples of $f_s$, as
illustrated in fig.6. From both the equation and the figure it is now understood how having the
sampled signal x bandlimited to below $f_s/2$, the Nyquist frequency, is a requirement for
preservation of its spectral integrity. If the repeated spectra, known as aliases, are removed
during DA conversion, the ideal ADDA process leads to an output identical to its input.

Figure 6: a) Continuous spectrum b) Sampled spectrum c) Alias distortion


If the signal bandwidth on the other hand exceeds the Nyquist frequency, or equivalently
$f_s$ violates the sampling theorem, the aliases will overlap as illustrated in fig.6c). Spectral
integrity is then lost in the overlap region. In fact any energy content residing above the
Nyquist frequency at the point of sampling will create an alias below it. This is known as alias
distortion or just aliasing. It follows that unless the input to an ADC is limited
strictly below the Nyquist frequency, alias distortion will compromise its performance. It is
thus necessary to ensure that as little energy as possible in violation of the sampling
theorem enters the ADC. This is done by using an antialias filter before sampling to suppress
any energy that may exist above $f_s/2$. The necessary attenuation of this filter is determined by the
expected amount of out-of-band energy and the required level of signal integrity preservation
in the baseband.
An intuitive way to understand why sampling produces a repetitive spectrum is to look at
fig.5 and acknowledge that other high frequency sinewaves can be defined by the exact same
sequence of samples. Thus the sample sequence contains information of many waveforms.
Figure 7 shows two sinewaves giving exactly the same sample sequence. If the high
frequency sinewave was the one sampled to generate this sequence, obviously in violation of
the sampling theorem, reconstruction would form its low frequency alias from the samples.
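
The observation that two sinewaves can share one sample sequence is easy to check numerically. The following is a minimal Python sketch, not taken from the thesis: the sampling frequency of 48 kHz, the 1 kHz tone and the helper name `sample_sine` are all arbitrary assumptions chosen for illustration.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples):
    """Return num_samples samples of sin(2*pi*freq*t) taken at rate fs."""
    return [math.sin(2.0 * math.pi * freq_hz * n / fs_hz)
            for n in range(num_samples)]

fs = 48000.0         # assumed sampling frequency
f_low = 1000.0       # tone below the Nyquist frequency fs/2
f_high = f_low + fs  # tone above fs/2, violating the sampling theorem

low = sample_sine(f_low, fs, 32)
high = sample_sine(f_high, fs, 32)

# The two sequences agree to within floating point rounding, so the
# samples alone cannot tell the 49 kHz tone from its 1 kHz alias.
max_diff = max(abs(a - b) for a, b in zip(low, high))
print(f"largest sample difference: {max_diff:.2e}")
```

Any tone at $f + k f_s$ for integer k gives the same samples, which is the time-domain counterpart of the repeated spectrum in (2).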

Figure 7: Sampled waveform of fig.5 and an alias

Schematically the sampling process can be seen as an AAF followed by a sampling network
or an ADC. The desired input signal, filtered to conform to the sampling theorem and entering
the sampler, is in this thesis denoted as x(t).

Figure 8: Conceptual ADC and AAF

Since the sampled spectrum is repetitive around $f_s$, and since processing the sampled signal
does not necessarily imply any a priori knowledge of the sampling frequency, the sampled
waveform is more conveniently expressed through its angular frequency defined as:

$$\omega \overset{\mathrm{def}}{=} 2\pi \frac{f}{f_s}. \tag{3}$$

Then it is given from the derivation of the sampling theorem that the frequency spectrum of
the discrete sequence can be rewritten as:

$$X_s(\omega) = \sum_{n=-\infty}^{\infty} x[n] \, e^{-i\omega n}. \tag{4}$$

This result is the normal definition of the DTFT. It is also valid for finite length sequences
as the (finite) DFT. The simulated spectra presented in this thesis and associated papers are of
course of finite length and found by DFT calculation on finite sequences. The DFT may if not
used carefully have incongruities due to the Gibbs phenomenon, which can be alleviated with
windowing or coherent sampling as reviewed in Appendix 1. A generalization of the DTFT is
given by the z-transform:

$$X_s(z) = \sum_{n=-\infty}^{\infty} x[n] \, z^{-n}, \quad z = r \cdot e^{i\omega}. \tag{5}$$

It is seen that the DTFT is identical to the z-transform for r=1, or evaluation along the unit
circle in the complex plane. Although introduced for completeness, it is assumed that the
reader has prior knowledge of the fundamental properties for the z-transform and related
terms such as unit circle, poles, zeros and ROC.

2.1.2 Reconstruction

In the DAC process, the sample sequence must be transformed back to an analog continuous
time waveform. Ideal reconstruction, i.e. $x_{out}(t) = x_{in}(t)$, would imply removing all spectral
content above the Nyquist frequency and retaining all spectral content below it. This requires an
infinitely steep reconstruction filter which is not feasible to implement. Rather, a real-life
RCF is specified from how much high frequency alias energy is tolerable at the output. The
RCF is typically placed outside the DAC chip as shown in fig.9. The DAC converts sample
data into a continuous time waveform which is then low-pass filtered to approximate the
original input. The DAC output has been given its own denotation y(t). Since this thesis deals
primarily with issues in DAC design, the nature of y(t) is of essential interest and will be paid
special attention in the theory introduction.

Figure 9: Conceptual DAC and RCF

The typical way of constructing y(t) is to connect the output to a current or voltage
proportional to the sample value and hold it over the duration of the sample period. In other
words the output is defined as:

$$y(t) = x[n], \quad nT_s \le t < (n+1)T_s. \tag{6}$$

This ensures that the output is in principle linearly proportional to the input signal x. The
hold reconstructed waveform can also be described as the time convolution of the sample
sequence and a rectangular window:


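$$y(t) = \sum_{n=-\infty}^{\infty} x[n] \, \frac{1}{T_s} \, \mathrm{rect}\!\left(\frac{t - nT_s}{T_s} - \frac{1}{2}\right), \qquad
\mathrm{rect}\!\left(\frac{t}{T}\right) \overset{\mathrm{def}}{=}
\begin{cases} 1, & -\frac{T}{2} \le t < \frac{T}{2} \\ 0, & \text{otherwise} \end{cases} \tag{7}$$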

The resulting output from the DAC described in (7) is shown in fig.10 for a sinusoidal
sample sequence. This is the well-known stair-case output waveform.

Figure 10: Output waveform from PCM DAC

The frequency spectrum of this output waveform is found by taking the Fourier transform of
the time domain expression (7). Since it is known that convolution in the time domain equals
multiplication in the frequency domain this is relatively simple:

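$$\begin{aligned}
Y(f) &= \mathcal{F}\{y(t)\} \\
&= \mathcal{F}\{x[n]\} \cdot \mathcal{F}\left\{\frac{1}{T_s} \, \mathrm{rect}\!\left(\frac{t}{T_s} - \frac{1}{2}\right)\right\} \\
&= X_s(f) \cdot \frac{1}{T_s} \int_{0}^{T_s} e^{-i 2\pi f t} \, dt \\
&= X_s(f) \cdot e^{-i\pi f/f_s} \, \mathrm{sinc}\!\left(\frac{f}{f_s}\right).
\end{aligned} \tag{8}$$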

According to usual signal processing notation, the normalized sinc-function is defined as:

$$\mathrm{sinc}(x) \overset{\mathrm{def}}{=} \frac{\sin(\pi x)}{\pi x}. \tag{9}$$

Hold reconstruction in other words performs first order sinc-filtering of the sampled
spectrum. This means that the aliases are suppressed somewhat, but also that there is some
inband attenuation below the Nyquist frequency. This is typically compensated for at the
digital side of the DAC.
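
To give a feeling for the size of this effect, the sinc weighting in (8) can be evaluated numerically. The sketch below is only illustrative: the CD sampling rate of 44.1 kHz and the 20 kHz band edge are assumed example values, and `hold_gain_db` is a hypothetical helper name.

```python
import math

def sinc(x):
    """Normalized sinc function sin(pi*x)/(pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def hold_gain_db(f_hz, fs_hz):
    """Magnitude of the hold reconstruction response |sinc(f/fs)| in dB."""
    return 20.0 * math.log10(abs(sinc(f_hz / fs_hz)))

fs = 44100.0  # assumed sampling frequency (CD rate)

# In-band droop at the upper edge of the audio band.
droop_20k = hold_gain_db(20000.0, fs)
# Suppression of the first alias of a 20 kHz tone, located at fs - 20 kHz.
first_alias = hold_gain_db(fs - 20000.0, fs)

print(f"gain at 20 kHz:        {droop_20k:.2f} dB")
print(f"gain at first alias:   {first_alias:.2f} dB")
```

At this rate the in-band droop at 20 kHz is on the order of 3 dB, while the nearest alias is attenuated only slightly more, which is why the hold function cannot replace the RCF.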



Figure 11: Hold reconstruction filtering effect

Current steering and DCT switch-cap DAC circuits, used in most audio DAC
implementations, will both provide this type of waveform. The aliases are suppressed further,
or analogously the stair-case is smoothed, through the external RCF. There also exist other
types of reconstruction, some of which will be touched upon later.

2.2 Quantization

Digitization of a signal is a two-step process. After the signal is sampled, the samples must
be given a data representation. The process of mapping samples to a finite set of data values is
known as quantization. The most common method is scalar quantization, where each sample
x[n] is mapped to one in a range of values $Q(x) \in \mathcal{Q}$, $\mathcal{Q}$ being a scalar set that is typically integer.
Another possibility is to map an input vector $\mathbf{x} = [x[1], \ldots, x[N]]$ to one in a set of output vectors
$Q(\mathbf{x}) \in \mathcal{Q}^N$, where $\mathcal{Q}^N$ is an N-dimensional vector space; this is called N-dimensional vector
quantization. This thesis deals only with uniform scalar quantization which is used in
practically all data converter applications.
A scalar quantizer defined by $\mathcal{Q}$ being the integer set $\{-(2^{B-1}-1), \ldots, -1, 0, 1, \ldots, 2^{B-1}-1\}$ can be
realized with a B bit binary output. It is hence called a B-bit quantizer. Mapping the input to
the nearest integer in $\mathcal{Q}$ can be done by rounding:

$$Q(x) = \left\lfloor \frac{x}{\Delta} + \frac{1}{2} \right\rfloor, \quad Q(x) \in \mathcal{Q}. \tag{10}$$

This is a symmetric or mid-tread quantizer which has $M = 2^B - 1$ levels when it is B-bit. A
B-bit quantizer can also have $M = 2^B$ integer levels if it is made asymmetric. The symbol $\Delta$ is
used for the input-referred quantizer step-size. It is shown graphically in fig.12 together with
the quantization error e, which is the deviation of Q(x) from x. It is seen that the error is
constrained to $|e| \le 1/2$, or input-referred to $|e| \le \Delta/2$, as long as the output is within a range of
$2^B$. The corresponding input range $|R| \le (2^{B-1} - 1/2)\Delta$ is the input non-overload range.
Using Nyquist sampling and uniform scalar quantization to digitize signals is known as
Linear Pulse Code Modulation and the resulting data as LPCM or just PCM samples [27].
This is the original and most direct/intuitive approach to signal digitization, but as will shortly
become clear other modulation schemes can be used.
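
The rounding rule in (10) and the non-overload range can be sketched in a few lines of Python. This is a hedged illustration rather than an implementation from the thesis: the function name `quantize_mid_tread`, the step size of 0.25 and the explicit clipping to the outermost levels on overload are assumptions made here.

```python
import math

def quantize_mid_tread(x, step, bits):
    """Return the integer output code of a symmetric B-bit mid-tread quantizer.

    The code is floor(x/step + 1/2) as in (10). Clipping to the outermost
    levels +/-(2**(bits-1) - 1) on overload is an assumption of this sketch.
    """
    max_code = 2 ** (bits - 1) - 1
    code = math.floor(x / step + 0.5)
    return max(-max_code, min(max_code, code))

step = 0.25   # assumed step size (Delta)
bits = 3      # 3-bit quantizer: codes -3..3, i.e. 2**3 - 1 = 7 levels

inputs = (-1.0, -0.3, 0.0, 0.12, 0.13, 0.8, 5.0)
codes = [quantize_mid_tread(x, step, bits) for x in inputs]
for x, code in zip(inputs, codes):
    print(f"x = {x:+.2f}  ->  code {code:+d}, error {code * step - x:+.3f}")
```

Inside the non-overload range of $\pm 3.5\Delta$ the printed error never exceeds $\Delta/2$ in magnitude; the inputs -1.0 and 5.0 lie outside it and are clipped.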



Figure 12: Uniform scalar mid-tread quantizer

The quantization error is not a continuous function of the input signal, but has
discontinuities making it difficult to analyze. In his classic paper Bennett showed that under
certain conditions the quantization error can be approximated as an additive noise source,
uniformly distributed in the range $\pm\Delta/2$ [28]. The conditions he stipulated were that the
quantizer had an input range large compared to $\Delta$ and that the input signal was active
over significant parts of this range without overloading it. He proved the approximation to be
asymptotically correct for a Gaussian input distribution as $\Delta \to 0$, and showed through
simulations it was a valid approximation for sampled and quantized sinusoids spanning over
an amplitude range of many $\Delta$. Denoting the error PDF as $f_e(e)$ it can be written as:

$$f_e(e) = \begin{cases} \frac{1}{\Delta}, & -\frac{\Delta}{2} \le e < \frac{\Delta}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{11}$$

From (11) the first two statistical moments of the error, i.e. the input-referred mean and
variance, are given by:

$$\begin{aligned}
E(e^m) &= \int_{-\infty}^{\infty} e^m f_e(e) \, de \\
E(e) &= 0 \\
E(e^2) &= \sigma_e^2 = \frac{\Delta^2}{12}.
\end{aligned} \tag{12}$$

If the quantizer is B-bit its total input non-overload range is $2^B\Delta$. The highest level input
sinewave that doesn't overload it is hence $x[n] = 2^{B-1}\Delta \sin(\omega n)$, and its power is
$\sigma_x^2 = 2^{2B}\Delta^2/8$. The peak SQNR for sinusoid input is consequently:

$$SQNR_{max} = 10 \log_{10}\left(\frac{2^{2B}\Delta^2/8}{\Delta^2/12}\right) = 6.02B + 1.76 \ \text{[dB]}. \tag{13}$$
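
The prediction in (13) can be compared with a direct simulation of a quantized full-scale sinewave. The sketch below is an illustration under stated assumptions: the record length of 65536 samples, the 1019 signal cycles per record (chosen odd so that all sample phases are distinct) and the function names are not from the thesis, and the quantizer is an ideal rounding operation without overload limiting.

```python
import math

def simulated_sqnr_db(bits, num_samples=65536, cycles=1019):
    """Measure the SQNR of a full-scale sinewave after rounding quantization."""
    step = 1.0                          # quantizer step size (Delta)
    amplitude = 2 ** (bits - 1) * step  # full-scale amplitude used in the text
    error_power = 0.0
    for n in range(num_samples):
        x = amplitude * math.sin(2.0 * math.pi * cycles * n / num_samples)
        q = step * math.floor(x / step + 0.5)  # rounding as in (10)
        error_power += (q - x) ** 2
    error_power /= num_samples
    signal_power = amplitude ** 2 / 2.0
    return 10.0 * math.log10(signal_power / error_power)

def predicted_sqnr_db(bits):
    """Peak SQNR according to the additive noise approximation, eq. (13)."""
    return 6.02 * bits + 1.76

for bits in (8, 12, 16):
    print(f"B = {bits:2d}: simulated {simulated_sqnr_db(bits):6.2f} dB, "
          f"predicted {predicted_sqnr_db(bits):6.2f} dB")
```

The simulated values are expected to land close to the formula for these word lengths; for very coarse quantizers the additive noise approximation itself becomes less accurate.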


This is the well known "6dB per bit" rule also used to calculate ENOB. Bennett in the same
paper also showed that if the input signal had a smooth power spectrum, the error samples
would be approximately orthogonal. Then the error autocorrelation function is given by
(see footnote 2):

$$r_{ee}(k) \overset{\mathrm{def}}{=} E(e[n] \, e[n+k]) = \begin{cases} \sigma_e^2, & k = 0 \\ 0, & \text{otherwise.} \end{cases} \tag{14}$$

Using the Wiener-Khinchin theorem the error power spectral density is found to be:

$$S_e(\omega) = \frac{1}{2\pi} \mathcal{F}\{r_{ee}(k)\} = \frac{\sigma_e^2}{2\pi}. \tag{15}$$

Widrow [29] extended the work of Bennett by applying sampling theory to the quantizer to
find a statistical model for an arbitrary input PDF. This enabled Widrow to find the criteria
for conditional input independence in any statistical moment of the quantization error. While
the input PDF is a continuous function, the output has discrete probabilities in the value set $\mathcal{Q}$,
or input-referred in multiples of $\Delta$. This means that the output PDF is an area sampled version
of the input PDF with a sampling frequency of $1/\Delta$, the quantization frequency. The probability
for the output to take any discrete level is given by the cumulative input PDF within $\pm\Delta/2$ of
this level.

Figure 13: Quantizer input PDF (a) and output PDF (b)

For simplicity of notation, the quantizer output Q(x) is denoted q in the figure and in the text
from here on. The discrete PDF of the quantizer output becomes:

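$$\begin{aligned}
f_q(q) &= \sum_{n=-\infty}^{\infty} \delta(q - n\Delta) \int_{n\Delta - \Delta/2}^{n\Delta + \Delta/2} f_x(x) \, dx \\
&= \sum_{n=-\infty}^{\infty} \delta(q - n\Delta) \left[\mathrm{rect}\!\left(\frac{q}{\Delta}\right) * f_x(q)\right].
\end{aligned} \tag{16}$$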

2 This is the discrete time definition of the autocorrelation function. Bennett used continuous time analysis to
show the error had approximately zero autocorrelation in two arbitrary time instants $t$ and $t+\tau$, thus the error PSD
of a sampled and quantized process would be white.

The definition of the rectangular window is the same as before. Similar to the derivation of the
sampling theorem, Widrow took the Fourier transform of the discrete output PDF to find:

$$\Phi_q(u) \overset{\mathrm{def}}{=} \mathcal{F}\{f_q(q)\} = \sum_{n=-\infty}^{\infty} \Phi_x\!\left(u - \frac{n}{\Delta}\right) \mathrm{sinc}(\Delta u - n). \tag{17}$$

The Fourier transform of a PDF is known by definition as the characteristic function. The
CF is periodic and sinc-weighted similar to the spectrum of a signal sampled and
reconstructed with hold reconstruction. To avoid PDF aliasing, the input CF must be zero
above $1/(2\Delta)$. If so the input PDF is merely convolved with a rectangular window of width $\Delta$,
equalling the Bennett approximation of an additive error with rectangular PDF. A large
Gaussian PDF converges towards such a CF, confirming Bennett's conditions.
Widrow however went further and found a requirement for conditional independence in any
statistical error moment. Any moment can be found by differentiating the CF at the origin:

$$\begin{aligned}
E(q^m) &= \int_{-\infty}^{\infty} q^m f_q(q) \, dq = \left(\frac{i}{2\pi}\right)^m \left.\frac{d^m \Phi_q(u)}{du^m}\right|_{u=0} \\
&= \left(\frac{i}{2\pi}\right)^m \sum_{n=-\infty}^{\infty} \left.\frac{d^m}{du^m}\left[\Phi_x\!\left(u - \frac{n}{\Delta}\right) \mathrm{sinc}(\Delta u - n)\right]\right|_{u=0}.
\end{aligned} \tag{18}$$

If the requirement for no PDF aliasing is fulfilled, the $m^{th}$ output moment equals the $m^{th}$
input moment plus a constant. But there is also a weaker condition: The assumption that:

$$\left.\frac{d^m}{du^m}\left[\Phi_x(u) \, \mathrm{sinc}(\Delta u)\right]\right|_{u = n/\Delta} = 0, \quad \forall n \ne 0, \tag{19}$$

leads to the following simplification of (18):

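$$\begin{aligned}
E(q^m) &= \left(\frac{i}{2\pi}\right)^m \left.\frac{d^m}{du^m}\left[\Phi_x(u) \, \mathrm{sinc}(\Delta u)\right]\right|_{u=0} \\
E(q) &= E(x) \\
E(q^2) &= E(x^2) + \frac{\Delta^2}{12}.
\end{aligned} \tag{20}$$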

It follows from (20) that the error is conditionally independent and additive in its statistical
moment m if (19) is fulfilled for the $m^{th}$ derivative. Of course this cannot be ensured
without a priori knowledge of the input statistics, but as will be seen shortly one can force
this condition to hold in any given statistical moment by applying dither.
The reader should be aware that since both Bennett's and Widrow's methods are statistical
methods, validity is limited to cases of static input PDF and they do not describe the
dynamic behaviour of the quantization noise. Many studies have been made on the dynamic
characteristics of quantization noise, and reviewing them all would be very extensive. Interested
readers are recommended to take a look at Gray's comprehensive survey paper [30] and its
references for an overview.


2.3 Oversampling

Oversampling is a technique that has become invaluable in high resolution, low bandwidth
converters. DAC oversampling is helpful in making the RCF design easier and it also gives a
processing gain allowing re-quantization to fewer bits. One of the earliest papers on
oversampled DA conversion was published in 1974 [31], and a patent was filed in 1981 [32].
Oversampling DACs have been used in most digital audio units all the way back to the Philips
CD100 which had an OSR of 4.
Looking first at the ADC: sampling with a rate far higher than twice the signal bandwidth
can be of benefit for several reasons. First and foremost one can use a much simpler AAF.
Since the sampled spectrum is periodic in $f_s$, the transition band of the AAF can range from $f_b$
to $f_s - f_b$, where $f_b$ is the signal bandwidth limit. An increase in $f_s$ in other words relaxes the
requirements for the AAF by making the transition band wider, and designing a high
performance AFE becomes much easier. Using the Bennett approximation for quantization
noise, it is also found that the total in-band noise power, being the quantization error PSD in
(15) integrated over the input Nyquist range, is inversely proportional to the OSR. If the
sampling rate is increased so that $f_s = f_{s\_in} \cdot L$, the in-band quantization noise power is:

$$\sigma_{e,\text{in-band}}^2 = \int_{-\pi/L}^{\pi/L} S_e(\omega) \, d\omega = \frac{\sigma_e^2}{L}. \tag{21}$$

The signal-band SQNR as a function of the number of bits is consequently given by:

$$\begin{aligned}
SQNR_{max} &= 10 \log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2 / L}\right) \\
&= 6.02B + 1.76 + 10 \log_{10}(L) \ \text{[dB]}.
\end{aligned} \tag{22}$$

It follows from (22) that for each doubling of L one can reduce B by half a bit and get the
same SQNR. If L=256 four bits are saved, making ADC design simpler.
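
The half-bit-per-octave rule follows directly from the $10 \log_{10}(L)$ term in (22), and a short calculation makes the trade-off concrete. The helper names below are hypothetical and the chosen OSR values are only examples.

```python
import math

def oversampling_gain_db(osr):
    """SQNR improvement from plain oversampling, the 10*log10(L) term in (22)."""
    return 10.0 * math.log10(osr)

def bits_saved(osr):
    """Bits that can be removed at constant SQNR: half a bit per doubling of L."""
    return 0.5 * math.log2(osr)

def osr_needed(bits):
    """OSR required to save a given number of bits by oversampling alone."""
    return 4 ** bits

for osr in (4, 64, 256):
    print(f"L = {osr:3d}: gain {oversampling_gain_db(osr):5.2f} dB, "
          f"{bits_saved(osr):.1f} bits saved")

# Going from 16 bits to 1 bit by oversampling alone would need L = 4**15.
print(f"OSR needed to save 15 bits: {osr_needed(15)}")
```

The last line shows why plain oversampling does not scale to large reductions in word length: every additional bit saved costs a factor of four in sampling rate.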
In a DAC the advantages of oversampling are not as intuitively appreciated, but the same
fundamental mechanisms apply. The space between aliases can be extended by first
increasing the sample rate by zero-padding the signal and then low-pass filtering it. This is the same
as interpolation and it is shown in the time-domain as well as the frequency domain in fig.14.
When the aliases are moved apart like this, the requirements for the analog RCF are greatly
relaxed. In essence it means that the burden of filtering unwanted energy is moved from the
analog to the digital domain. Digital filter implementations are much more flexible, much
cheaper and have much higher performance than their analog counterparts.
Design of oversampling filters is a large field within DSP and is not reviewed in detail in
this thesis. Unlike general purpose digital filters, oversampling filters are typically FIR filters
since they can then be implemented very efficiently in a polyphase filter structure [33]. In
audio applications the oversampling filter will typically be realized as several cascaded stages
of halfband filters [34] or as IFIR filters [35], using a multiplier-accumulator realization [36].
For more insight into the design of oversampling filters the reader is recommended to read a
textbook covering the subject, e.g. Mitra's "Digital Signal Processing" [37] chapter 13.


Figure 14: DAC oversampling in the time and frequency domains

With DAC oversampling the same processing gain as for the ADC will also apply to any
post-oversampling quantization operation. This means it is possible to use a REQ to reduce
the number of bits while maintaining a high effective resolution. This is shown in fig.15.

Figure 15: Oversampling DA-converter with REQ

Any noise or distortion inherent in x[n] is to be regarded as part of the input signal for all
succeeding processing blocks, meaning it is not reduced when oversampling. But by using
oversampling, errors introduced after the sampling rate is increased may be spread over a
larger frequency range. This is significant especially for the quantization error, which has
in-band noise power as given in (21). For instance a REQ can have a 12-bit arithmetic output,
but with an OSR of 256 have 16 bits effective resolution. The DAC then needs to resolve $2^{12}$
levels instead of $2^{16}$, meaning its implementation will be much simpler. It must however be
stressed that any in-band error introduced by the DAC must still be at a 16-bit level. It is only
its number of elements that is reduced; the requirements for DAC in-band noise density,
DAC linearity and so forth still remain the same.

2.4 Dithering

As mentioned in the section on quantization, it is possible to exploit the weaker condition of
CF derivatives being zero at multiples of the quantization frequency to obtain conditional
input independence in any error moment of choice. This is done by adding dither: a small,
independent noise source that is applied to the input of the quantizer as shown in fig.16.
In the figure an additional signal v is added prior to the quantizer. Here the quantizer is a
REQ used in re-quantizing DACs, meaning that the input and dither signals are generated
digitally. If the input is assumed to have so high resolution compared to $\Delta$ that it can be
approximated as continuous amplitude, a DAC with digital input and dither can be regarded
as equivalent to an ADC with analog input and dither.



Figure 16: Dithered quantization

The first use of applied noise in quantization was seen in a 1962 publication on
low-resolution image digitization [38], while the term "dither" was established two years later [39].
Its etymology comes from the word "didder" which means to shiver or shake with cold. The
term was coined because the noise was seen to shake up perceptually annoying quantization
error patterns. In the past, especially with regards to digital audio, there has been some
dissension on the nature of dither and requirements for dithering. Results published from the
research of Lipshitz, Vanderkooy and Wannamaker [40]-[42] have been very central to the
development of an understanding of dither in the audio community. Their work is based on
Widrow's statistical model of quantization.
Looking at fig.16, the quantizer input is now w=x+v. The dither signal is assumed
statistically independent of the input so the PDF $f_w$ is simply a convolution of $f_v$ and $f_x$. Eq.(16)
rewritten for the dithered case then becomes:

$$f_q(q) = \sum_{n=-\infty}^{\infty} \delta(q - n\Delta) \left[\mathrm{rect}\!\left(\frac{q}{\Delta}\right) * f_v(q) * f_x(q)\right]. \tag{23}$$

Consequently the rewritten CF becomes:

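$$\Phi_q(u) = \sum_{n=-\infty}^{\infty} \Phi_v\!\left(u - \frac{n}{\Delta}\right) \Phi_x\!\left(u - \frac{n}{\Delta}\right) \mathrm{sinc}(\Delta u - n). \tag{24}$$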

Going back to (19), we need the $m^{th}$ derivative of $\Phi_q$ to be zero at all integer multiples of the
quantization frequency if the error is to be input independent in its $m^{th}$ statistical moment.
What is now noteworthy is that if any of the factors in (24) is zero, the whole expression
becomes zero. This means that if the dither sequence is made to conform, it does not matter
how the input signal behaves. Then the $m^{th}$ output moment will regardless be given as:

$$\begin{aligned}
E(q^m) &= \left(\frac{i}{2\pi}\right)^m \left.\frac{d^m}{du^m}\left[\Phi_x(u) \, \Phi_v(u) \, \mathrm{sinc}(\Delta u)\right]\right|_{u=0} \\
E(q) &= E(x) + E(v) \\
E(q^2) &= E(x^2) + E(v^2) + \frac{\Delta^2}{12}.
\end{aligned} \tag{25}$$

Thus, with dither no a priori knowledge about the statistics of the input signal is necessary.
All that has to be done to ensure conditional error independence is to apply a dither signal
whose CF has the required derivative equal to zero at all integer multiples of $1/\Delta$. As it turns out
the sum of N independent random sources with uniform distribution in $\pm\Delta/2$ has a total CF of:

$$\Phi_v(u) = \mathrm{sinc}^N(\Delta u). \tag{26}$$

This function has zeros of order N at $u = k/\Delta$ for all integers $k \ne 0$, so together with the sinc
factor already present in (25) all derivatives of the product from order 0 to N are zero there. The
dither mean is zero and its variance is $N\Delta^2/12$. Using a single random source (N=1) is called RPDF
(rectangular PDF) or 1PDF dithering. Adding two independent sources (N=2) makes $f_v$
triangular in $\pm\Delta$ and it is therefore referred to as TPDF (triangular PDF) or 2PDF dithering. A
total of N added sources, NPDF dither, will render the first N quantization error moments
input independent. Studies suggest that only the first two moments, the mean and the
variance, make the error audibly different (from white noise) if they are input dependent
[43]. Even very coarse quantization gives no detectability of dependence in the skew, kurtosis
or higher error moments. Since additional noise power from the dithering increases with N,
TPDF is therefore regarded as optimal dithering in audio.

Figure 17: First two error moments as function of input level

Figure 17 shows through simulations how dither renders error moments conditionally
input-independent. With RPDF dither the error mean is zero, implying no distortion of the input
signal. With TPDF dither both the expected value of the error and the error power are constant over
the input range. The average error power is increased from $\Delta^2/12$ to $\Delta^2/4$, which needs to be
included in SQNR estimation (13). With TPDF dither we in other words have
$SQNR_{max} = 6.02B - 3.01$ dB, and it is no longer just an approximation.
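
The behaviour described for the first two error moments can be reproduced with a small Monte Carlo experiment. The sketch below is an assumption-laden illustration and not the simulation behind fig.17: it uses a unit step size, three static input levels, 20000 trials per point and Python's built-in pseudo-random generator as dither source.

```python
import math
import random

def quantize(w, step=1.0):
    """Uniform mid-tread rounding quantizer (no overload limit)."""
    return step * math.floor(w / step + 0.5)

def error_moments(x, num_sources, rng, trials=20000, step=1.0):
    """Estimate mean and power of the total error Q(x + v) - x.

    The dither v is the sum of num_sources independent uniform sources in
    +/- step/2 (0 = no dither, 1 = RPDF, 2 = TPDF).
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        v = sum(rng.uniform(-step / 2, step / 2) for _ in range(num_sources))
        e = quantize(x + v, step) - x
        total += e
        total_sq += e * e
    return total / trials, total_sq / trials

rng = random.Random(1)  # fixed seed so that the run is repeatable
for name, num_sources in (("none", 0), ("RPDF", 1), ("TPDF", 2)):
    for x in (0.0, 0.25, 0.5):
        mean, power = error_moments(x, num_sources, rng)
        print(f"{name:5s} x={x:4.2f}  mean={mean:+.3f}  power={power:.3f}")
```

Without dither both moments depend on the input level. With RPDF dither the mean settles near zero but the power still varies with the input (noise modulation), while with TPDF dither the power stays near $\Delta^2/4$ for all three levels.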

2.5 Delta-Sigma Modulation

Oversampling gives a nominal processing gain which enables a reduction of the number of
bits while maintaining high dynamic range. Its effect in this regard is however limited, since
the processing gain as mentioned is one half bit per doubling of the sampling rate. A
substantial reduction of the number of bits, which is very desirable for complexity reasons,
will require an unfeasibly high OSR.
This led to research on modulation alternatives for improvement of the processing gain. The
DSM, being an extension of the delta-modulator or differential PCM encoder, was first
published by Inose and Yasuda in 1962 [44]. The possibility to use it to improve processing
gain in oversampled data conversion was first treated in a 1969 paper by Goodman [45],
although a less known patent preceding this work exists that describes in essence the same
basic principle [46].

The reader should be aware that both delta-sigma and sigma-delta are commonly used terms
for the same process [47]. The causal order of the process suggests delta-sigma whereas the
modulator's functional hierarchy suggests sigma-delta. This thesis uses the original and
arguably most used term: delta-sigma modulation.

Figure 18: Basic delta-sigma modulator

The basic functionality of a DSM is depicted in fig.18. It uses filtered negative feedback
compensation, causing the quantization error to be spectrally shaped. The loop filter H(z)
determines the spectral properties of the DSM. Using Bennett's additive noise approximation,
the input-output relation of the modulator can be described by the linear sum:

$$\begin{aligned}
Q(z) &= \frac{H(z)}{1 + H(z)} X(z) + \frac{1}{1 + H(z)} E_q(z) \\
&= STF(z) X(z) + NTF(z) E_q(z).
\end{aligned} \tag{27}$$

The closed loop transfer functions of the signal x and quantization error $e_q$ have been
denoted Signal Transfer Function and Noise Transfer Function respectively. It is seen that if
H(z) is large, NTF(z) approaches zero and STF(z) approaches unity, meaning the signal is
preserved while the quantization noise is suppressed. To achieve lowpass modulation the loop
filter must be an integrator type function with high gain for low frequencies. Bandpass
modulation can be realized by replacing the integrators with resonators having high gain at
the frequency band of interest. The output PSD as a result of (27) is:

$$S_q(\omega) = S_x(\omega) \, |STF(\omega)|^2 + S_e(\omega) \, |NTF(\omega)|^2. \tag{28}$$

Remembering the SQNR derivation (12)-(13) it is seen how an appropriate NTF can
improve the processing gain since, assuming $|STF(\omega)|^2 \approx 1$, the maximum SQNR will be:

$$SQNR_{max} = 10 \log_{10}\left(\frac{2^{2B}}{\frac{1}{3\pi} \int_{-\pi/L}^{\pi/L} |NTF(\omega)|^2 \, d\omega}\right) \ \text{[dB]}. \tag{29}$$

An illustration is shown in fig.19, where the shaded area indicates the quantization noise
falling in-band. For obvious reasons DSM is also referred to as noise-shaping.
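
As a concrete instance of the loop in fig.18, a first-order modulator with a 1-bit quantizer can be simulated in a few lines. This is a minimal sketch under assumptions made here, not a design from the thesis: the loop filter is taken to be a single delaying integrator, which in the linear model of (27) gives $STF(z) = z^{-1}$ and $NTF(z) = 1 - z^{-1}$, and the DC input level of 0.3 is arbitrary.

```python
def first_order_dsm(samples):
    """Simulate a 1-bit first-order delta-sigma modulator.

    state is the integrator (loop filter) output. Each output is the sign of
    the state, and the state then accumulates input minus fed-back output.
    """
    state = 0.0
    out = []
    for x in samples:
        q = 1.0 if state >= 0.0 else -1.0  # 1-bit quantizer, levels +/-1
        out.append(q)
        state += x - q                     # integrate the loop error
    return out

num = 10000
dc = 0.3                                   # assumed DC input, |dc| < 1
bits_out = first_order_dsm([dc] * num)
average = sum(bits_out) / num

print("first outputs:", bits_out[:12])
print(f"average of output: {average:.4f} (input {dc})")
```

Although every output sample is either +1 or -1, the running average tracks the input closely, because the quantization error is pushed away from DC by the feedback loop.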

Figure 19: Illustration of DSM noise shaping

A self-evident condition for realizability is that the DSM has no delay-free loops. Thus it
isn't possible to substitute H(z) with a huge gain and expect noise to disappear. The
realizability condition can be formalized as ntf[0]=1 in the time domain or equivalently
$NTF(\infty)=1$ in the z-domain. Maximum error suppression at $\omega=0$ obviously suggests all NTF
zeros should be located at DC, i.e. at $z=e^{i0}=1$. Both these conditions are fulfilled for any order N if
$NTF(z)=(1-z^{-1})^N$. This is often called a basic Nth order DSM or simply a modN in the literature; fig.20
shows its processing gain in bits (according to the 6dB per bit rule) vs. OSR for N=0 (only
oversampling) to N=5. It is seen that if the order is high, the processing gain is very large.
Using high order DSM with 1-bit REQ quickly became very popular in audio since a 1-bit
DAC is guaranteed to have static linearity. The first audio converters were PCM converters
[48], but high order 1-bit DSM quickly took over and soon reached a performance level where
it in many ways outperformed the fundamental limitations of the 16-bit CD-system [49].
However, while the basic functionality of a DSM is very simple, the fact that it is a non-linear
feedback system creates issues not apparent when using Bennett's linear model. For instance
a modN will be unstable for large input if N is higher than two and the REQ has few bits.
Because of this the loop filter must be damped, causing a reduction in processing gain. The
output of the DSM furthermore affects the DAC performance and its sensitivity to circuit
errors. The modulator itself is also susceptible to limit cycles and noise power modulation.
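
The processing gain plotted in fig.20 can be approximated numerically from the in-band integral of the squared NTF magnitude in (29). The sketch below does this for the modN noise transfer function, assuming NTF(z)=(1-z^{-1})^N so that the squared magnitude on the unit circle is (2 sin(ω/2))^{2N}, and assuming 6.02 dB per bit. The function name and the midpoint integration rule are choices made for this illustration.

```python
import math


def processing_gain_db(order, osr, steps=20000):
    """Gain in dB over a Nyquist-rate quantizer for NTF = (1 - z^-1)^order."""
    upper = math.pi / osr                 # band edge pi/L
    h = upper / steps
    # Midpoint rule on [0, pi/L]; the integrand is even, hence the factor 2.
    integral = 2.0 * h * sum(
        (2.0 * math.sin(0.5 * (k + 0.5) * h)) ** (2 * order)
        for k in range(steps)
    )
    # Without shaping or oversampling the integral over [-pi, pi] is 2*pi.
    return 10.0 * math.log10(2.0 * math.pi / integral)


for n in range(6):
    gain = processing_gain_db(n, 64)
    print(f"N={n}: {gain:6.1f} dB, {gain / 6.02:5.1f} bits at OSR 64")
```

For N=0 the result reduces to plain oversampling gain, 10 log10(L) dB.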

Figure 20: Processing gain of modN DSM

It is extremely complicated to do rigorous mathematical analysis of these non-linear effects
and design is often based on simplified rules-of-thumb. An important part of this thesis is to
present extensions of the rules-of-thumb, including a wider scope of error sources and
enabling easier estimation of performance as a function of DSM design parameters.

2.6 The DAC

This section presents the DAC and gives an introduction to the errors it commonly causes.
As will later be seen these errors interact with the DSM REQ and performance estimates can
be given if one knows the DSM design parameters.

2.6.1 DAC topologies

Early DAC implementations e.g. [48] were commonly realized as resistor string DACs.
An example of a resistor string DAC is shown in fig.21, where the different bits of the binary
DAC input data are denoted $b_0 \ldots b_{B-1}$. Depending on whether bit $b_i$ is one or zero, the switch it
steers is throughout the sample period connected either to ground or to the reference voltage
through a corresponding resistor in a binary weighted resistor string. The output voltage is
then given by:

$$V_o = R_F\left(\frac{b_0}{2R} + \frac{b_1}{4R} + \frac{b_2}{8R} + \cdots\right)V_{ref} = \frac{R_F}{R}\,V_{ref}\,q' \tag{30}$$

The DAC input q' is the REQ output q offset to unipolar representation, since the DAC uses
positive binary values (elaborated in 2.6.2., DAC encoding).

Figure 21: Resistor ladder type DAC

Use of resistor string DACs gradually lessened because technologies for IC implementation
are not very suitable for large resistive devices. The resistor string will use much die area and
have poor device matching. Resistor string DACs were eventually superseded by switched
capacitor (switch-cap) DACs. Switch-cap is a technique to realize resistor equivalents
through charge transfer in clocked capacitors, first shown in 1977 [50]. It transfers sampled
charge packets creating a resistor equivalent $R_{eq}=1/(C f_s)$, and can thus be used to implement
continuous amplitude amplifiers or filters with a discrete-time transfer function H(z).
In a typical switch-cap system the output of a functional stage is sampled by the next stage,
meaning that only the settled output value matters. But since a DAC output is continuous-time
it is very important that a switch-cap DAC settles linearly. It is possible to realize a switch-
cap integrator which is insensitive to op-amp slewing and nonlinearity, called a direct charge
transfer integrator. It is distinguished by the input capacitor directly depositing charge on the
integrating capacitor. The DCT integrator was proposed by Bingham in 1984 [51] and a high
performance DCT-based audio DAC was shown in 1991 [52]. Implementations with very
high performance [53] and efficiency [54] have been seen since.
A DCT-based switch-cap DAC is shown in fig.22. In the sampling phase $\phi_1$ it charges the
sampling capacitor array depending on the input sample data, and in the hold phase $\phi_2$ it
distributes this charge directly to the integrating or hold capacitor $C_h$. Evaluating the charge
redistribution it is found that the input-output transfer function of this DAC will be given by:

$$H_{DAC}(z) = \frac{V_o(z)}{Q'(z)\,V_{ref}} = \frac{C_s/C_h}{1 + C_s/C_h - z^{-1}} \tag{31}$$
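
A short numerical sketch of (31) may help. It assumes the passive charge-sharing recursion V_o[n] = (C_s q'[n] V_ref + C_h V_o[n-1])/(C_s + C_h) as the time-domain equivalent of the transfer function; the function names and the capacitor ratio are arbitrary illustration values, not taken from any cited design.

```python
import cmath
import math


def dct_dac(q_samples, c_s, c_h, v_ref=1.0):
    """Time-domain output of the DCT DAC: charge sharing between C_s and C_h."""
    v_o = 0.0
    out = []
    for q in q_samples:
        # Charge on C_s (set by the input) is shared with the charge held on C_h.
        v_o = (c_s * q * v_ref + c_h * v_o) / (c_s + c_h)
        out.append(v_o)
    return out


def h_dac(omega, c_s, c_h):
    """Frequency response of (31) at normalized angular frequency omega."""
    r = c_s / c_h
    return r / (1.0 + r - cmath.exp(-1j * omega))


step = dct_dac([1.0] * 200, c_s=1.0, c_h=4.0)
print("settled step response:", step[-1])
print("gain at DC:", abs(h_dac(0.0, 1.0, 4.0)))
print("gain at Nyquist:", abs(h_dac(math.pi, 1.0, 4.0)))
```

The DC gain is unity for any capacitor ratio, while a larger C_h relative to C_s lowers the cut-off frequency of the first-order low-pass response.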



Figure 22: DCT integrator SC DAC

The low-pass function of the DCT-based DAC will be of benefit to suppress out-of-band
noise from the DSM REQ. With the charge distributed passively between the capacitors, the
settling is given by a linear RC time constant of the capacitors and switches, and the circuit is
insensitive to op-amp slewing. Distortion is still generated from signal-dependent charge
injection and signal-dependent switch resistance variation, which must be alleviated through
good circuit design [53]. Still, its properties make the circuit very suitable for DAC use.
Although the DCT-based DAC still is quite popular in audio converter ICs, it has in recent
years started receding. Instead it becomes more and more common for hi-res DACs to have
current mode output. Then the DAC generates and holds an output current proportional to the
input data, which is externally converted to voltage. The chief reason for doing so is that
lowered supply voltages in modern IC processes reduce the headroom for SC circuits. This
makes it very difficult to implement good switches and opamps, and capacitors must be big to
achieve low kT/C-noise. One way to overcome these problems is to operate the DAC IC in
current mode and use external I-V conversion with a dedicated high supply opamp or even a
discrete transistor stage. Such an arrangement with an opamp is shown in fig.23. Here
$I_o = q' I_{ref}$ and $V_o = I_o R_F$. In practice the external I-V is often combined with the analog RCF.

Figure 23: Current mode DAC with external I-V conversion

The idea to use steered current sources in DACs is not new [55], but it has been revived in
recent times because of the development towards lower voltage IC technologies. The
approach shows good potential with very high resolution having been reported [56], and it
will probably be the dominant hi-res DAC design paradigm for the foreseeable future. Just
like the resistor ladder DAC, the current steering DAC has no discrete time filtering of the
output, and performs straightforward hold reconstruction.

2.6.2 DAC encoding

The above examples all show binary encoded DACs where the DAC elements, be it
resistors, capacitors or current sources, are weighted as binary digits and fed with binary data
from the (DSM) REQ. The base-two numeral system was first described as early as 800 B.C.
by Indian mathematician Pingala [57] and Boole in the 19th century developed the modern
concept of binary logic, the basis for all digital circuit operation [58]. Shannon was first to
show automated circuits operating on Boolean logic [59], and in addition to being regarded as
the father of information theory he is also widely acknowledged as the originator of digital
arithmetic circuits.
The term "binary encoding" is used about non-redundant base-two arithmetic, where B bits
or digits can express $2^B$ unique values including zero. A B-bit binary encoded DAC thus
contains B elements with a $2^{B-1}$ size ratio between the largest and the smallest weight or digit.
This is not necessarily the preferable way to implement a DAC since it is difficult to match
elements with large differences in size. A much used way to get around the matching problem
is thermometer encoding. The thermometer code is a redundant base-two code where every
digit has unit weight. Thermometer DACs thus need $2^B-1$ elements to resolve $2^B$ values
including zero, all being equal in size.
Digital processing is usually zero mean and operating on signed binary logic, where
negative numbers are represented by the two's complement of the absolute value, i.e. -k
is represented as $2^B-k$. The first bit is then defined as the sign bit. With switching as described in
2.6.1 the DAC input must be unipolar. This means the DAC input q' should equal the REQ
output q offset by M/2 as shown in table 1. Thermometer encoding is offset by default.

Table 1: Binary and thermometer encoding of DAC, M=8

REQ output q   Binary q in two's compl.   DAC input q'=q+M/2   Binary DAC input code   Therm. DAC input code
-4             100                        0                    000                     0000000
-3             101                        1                    001                     0000001
-2             110                        2                    010                     0000011
-1             111                        3                    011                     0000111
 0             000                        4                    100                     0001111
 1             001                        5                    101                     0011111
 2             010                        6                    110                     0111111
 3             011                        7                    111                     1111111
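
The mapping in table 1 can be sketched in a few lines. The function below is an illustration only (its name and return layout are assumptions); it takes a signed REQ output q and the number of bits B, and returns the two's complement word, the offset DAC input q'=q+M/2, the binary DAC code and the thermometer code.

```python
def encode_dac_input(q, bits):
    """Return (two's complement, DAC input, binary code, thermometer code)."""
    m = 1 << bits                              # number of levels M = 2^B
    if not -(m // 2) <= q < m // 2:
        raise ValueError("q outside the range of a signed B-bit word")
    twos = format(q & (m - 1), f"0{bits}b")    # two's complement of q
    dac_in = q + m // 2                        # offset to unipolar representation
    binary = format(dac_in, f"0{bits}b")
    therm = "0" * (m - 1 - dac_in) + "1" * dac_in   # M-1 unit elements
    return twos, dac_in, binary, therm


for q in range(-4, 4):
    print(q, *encode_dac_input(q, 3))
```

Running the loop for B=3 reproduces the rows of table 1.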

2.7 DAC errors

This section provides a review of common error sources in a typical DAC implementation.
The emphasis is on the nature of the waveform distortion and how it compromises
performance. For a more in-depth review of the circuit mechanisms causing DAC errors, the
reader is recommended to read Wikner's thesis [60] or an appropriate textbook.
DAC errors can roughly be divided into two categories; static and dynamic errors. Static
errors are errors that are time invariant whereas dynamic errors are related to the switching of
elements, both leading to distortion and/or noise at the DAC output. Since the emphasis in
this thesis is on current steering converters, errors are presented in the context of this DAC
type and the waveform it produces. It should however be noted that the same types of errors
are also present in other topologies, but then being inferred from other circuit effects (e.g.
capacitor mismatch instead of transistor mismatch). Errors are normalized to the DAC input,
or in other words the REQ output, thus assuming a unity quantizer step-size $\Delta=1$.

The performance is throughout assessed in terms of dynamic range measures related to
noise (SNR), distortion (SFDR), or both (SNDR). Static errors are also sometimes expressed
through the INL function. How INL relates to spectral performance and dynamic range is
assessed in [61]; depending on its shape it may cause harmonics and degrade the SFDR or
cause noise-like errors and degrade the SNR.

2.7.1 Jitter errors

The first class of DAC errors isn't really an error in the DAC per se, but more of an
environment variable. It has until now been assumed that $T_s$ is always constant, which in a
real implementation is not the case; deviations in $T_s$ are inevitable and referred to as jitter
errors. In a typical consumer audio system, data is transferred from the digital source to the
DAC through an SP-DIF connection. SP-DIF uses biphase mark encoding to multiplex data
and clocking in a single coax or optical line. Band limitation in the interconnect wire will then
give rise to signal dependent jitter patterns whereas transmission noise results in random jitter.
The jitter behaviour of SP-DIF was analysed in an early 90s paper by Chris Dunn and
Malcolm Hawksford [62].
The clock signal is recovered in the DAC through an input locked PLL oscillator. A PLL
clock recovery circuit has a frequency dependent jitter transfer function, typically first order
low-pass, meaning that SP-DIF transmission jitter is filtered accordingly. Important research
on the JTF and the impact of jitter on conversion quality was done by the late Julian Dunn in
the early 90s [63]-[64]. This was significant in establishing an understanding of jitter as an
error source in digital audio, for long widely regarded as "perfect sound forever". New
interface formats like IEEE1394 have been shown to be even more challenging [65].
In addition to interface jitter which is a function of the transmission and the JTF of the
receiver, the receiver and DAC themselves have intrinsic clock jitter due to on-chip and on-
board noise [66]. Intrinsic jitter is usually regarded as random and consisting of a "white"
component resulting from circuit thermal noise and a "pink" component from circuit 1/f-noise.
The jitter variance, usually quantified in ps², is typically inversely proportional to the
clock frequency [67]. This means that the jitter standard deviation in ps is proportional to
$1/f^{1/2}$, where in a DAC the clock frequency is typically $f=f_s=f_{s\_in}L$. The AES recently released
a document for standardizing jitter terminology [68].
Analysis of jitter effects in DACs; how it results in output distortion and what kind of
distortion it leads to, has been featured in several previous publications [69]-[72]. The
proposed methods have often been computationally quite heavy [69]-[70], have not
considered the special case of DSM [71], or have been based on experimental results [72].
The work conducted for this thesis resulted in a simple method for prediction of jitter
distortion susceptibility as a function of the NTF and the number of bits in the REQ. As
revealed in later chapters, the modulator output waveform significantly influences the jitter
distortion susceptibility of a DSM DAC.
Figure 24 shows the error waveform of a jittered DAC with zero order hold. Whereas the
ideal waveform should change value at the time instances $nT_s$, the change in reality occurs with a jitter
offset, at $nT_s+j(nT_s)$. This results in error pulses.



Figure 24: Jitter error in the time domain

Consider a single time instant $nT_s$ for which the corresponding jitter is given by $j_n$. In an
ideal DAC with hold reconstruction, i.e. where $y(t)=M/2+q[n]$ for $t \in [nT_s, (n+1)T_s)$, the jitter
error pulse associated with one jittered sample is given by:

$$e_{j_n}(t) = \frac{1}{T_s}\left(q[n]-q[n-1]\right)\operatorname{rect}\left(\frac{t}{j_n}-\frac{1}{2}\right)\operatorname{sign}(j_n) \tag{32}$$

The entire error waveform will be composed of all individual error pulses:

$$e_j(t) = \sum_{n=-\infty}^{\infty} e_{j_n}(t-nT_s) \tag{33}$$

Consider again the single error pulse from (32). For simplicity the output step defining its
amplitude is now denoted as $d_n$. The spectrum of this pulse is then given by:

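$$E_{j_n}(f) = \mathcal{F}\left\{e_{j_n}(t)\right\} = \begin{cases} \displaystyle\int_0^{j_n} \frac{d_n}{T_s}\,e^{-i2\pi ft}\,dt, & j_n \ge 0 \\[2ex] \displaystyle -\int_{j_n}^{0} \frac{d_n}{T_s}\,e^{-i2\pi ft}\,dt, & j_n < 0 \end{cases}$$

$$\phantom{E_{j_n}(f)} = \frac{d_n}{T_s}\,\frac{\sin(\pi j_n f)}{\pi f}\,e^{-i\pi f j_n} = \frac{d_n j_n}{T_s}\,\operatorname{sinc}(j_n f)\,e^{-i\pi f j_n} \tag{34}$$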

The spectrum for the entire error pulse train in (33) consequently becomes:


$$E_j(f) = \sum_{n=-\infty}^{\infty} E_{j_n}(f)\,e^{-i2\pi f n T_s} \tag{35}$$

It is possible to approximate this with the DTFT by sampling the rectangular error pulse
$e_{j_n}(t)$ after band limiting it with a brick-wall filter. A simulation model doing this is described
in [69]. Its problem is that the computation time is very high if a long DFT and brick-wall
filter is used for each pulse, and it thus has to be done with limited spectral resolution. A
simpler approximation is to assume the jitter is very small in the frequency region of interest,
so that $j_n f \ll 1$. Then (34) simplifies to:

$$E_{j_n}(f) = \frac{d_n j_n}{T_s}\,\operatorname{sinc}(j_n f)\,e^{-i\pi f j_n} \approx \frac{d_n j_n}{T_s} \tag{36}$$

This is a constant, i.e. a white spectrum. In other words the simplification assumes that $e_j(t)$
consists of Dirac pulses. The composite spectrum (35) can then be simplified
correspondingly:

$$E_j(f) \approx \frac{1}{T_s}\sum_{n=-\infty}^{\infty} d_n j_n\,e^{-i2\pi f n T_s} = \frac{1}{T_s}\sum_{n=-\infty}^{\infty} d_n j_n\,e^{-i\omega n} \tag{37}$$

As seen this is identical with the DTFT of a sample sequence where the sample values are
given by the relative area of each error pulse. It is in other words a sampled error area model.
For small jitter values it is quite accurate since the error area greatly dominates the distortion
contribution. A previous study gives an assessment on this [73], and it can easily be
calculated how closely (36) approximates (34) for a given $j_n$.
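
How closely (36) approximates (34) can be checked with a few lines of code. The sketch below evaluates the single-pulse spectrum with and without the sinc and phase terms, assuming the normalized definition sinc(x)=sin(πx)/(πx). The function names and the numerical values (1 ns and 1 µs of jitter evaluated at 20 kHz, with a 3.072 MHz sample rate) are assumptions chosen for illustration.

```python
import cmath
import math


def pulse_spectrum_exact(d_n, j_n, f, t_s):
    """Spectrum of one rectangular jitter error pulse, as in (34)."""
    x = math.pi * j_n * f
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return d_n * j_n / t_s * sinc * cmath.exp(-1j * x)


def pulse_spectrum_area(d_n, j_n, t_s):
    """Error-area (Dirac pulse) approximation, as in (36)."""
    return d_n * j_n / t_s


def relative_error(j_n, f, d_n=1.0, t_s=1.0 / 3.072e6):
    exact = pulse_spectrum_exact(d_n, j_n, f, t_s)
    approx = pulse_spectrum_area(d_n, j_n, t_s)
    return abs(exact - approx) / abs(approx)


for jitter in (1e-9, 1e-6):
    print(f"jitter {jitter:.0e} s at 20 kHz: relative error {relative_error(jitter, 20e3):.2e}")
```

The deviation grows with the product of jitter and frequency, which is why the error-area model is restricted to small jitter values in the frequency region of interest.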
From (37) and the convolution theorem it's apparent that the error spectrum will be the
convolution of the spectra of d and j. Its PSD can generally be expressed as:

$$S_{e_j}(\omega) = \frac{1}{T_s^2}\,S_d(\omega) * S_j(\omega), \tag{38}$$

where since $d_n=q[n]-q[n-1]$ for all n the PSD of d will be:

$$S_d(\omega) = S_q(\omega)\left|1-e^{-i\omega}\right|^2 \tag{39}$$

If the jitter spectrum is white the convolved distortion spectrum will obviously also be
white, meaning white jitter decreases the output SNR. If the jitter has a pink PSD the jitter
noise density decreases proportionally to the distance from the signal component, making it
rather benign due to the ear's frequency masking property. If the jitter sequence and input
signal are both sinusoids there is multiplication of two sinusoidal terms and the trigonometric
angle sum and difference identities can be used to find discrete mixing products at
$\omega_x \pm \omega_j$.
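
The sinusoidal case can be illustrated with the sampled error-area model. The sketch below is a minimal example under stated assumptions: T_s is normalized to 1, the jitter is expressed as a fraction of T_s, both sinusoids are periodic in the sequence length, and the bin numbers and peak jitter are arbitrary illustration values. It evaluates single DFT bins of the error sequence d_n j_n and shows energy at the sum and difference frequencies but not at the signal frequency itself.

```python
import cmath
import math

N = 1024             # sequence length (whole number of periods of both sinusoids)
K_X, K_J = 50, 7     # signal and jitter frequencies in DFT bins (illustrative)
J_PEAK = 1e-3        # peak jitter as a fraction of T_s (illustrative)

q = [math.sin(2 * math.pi * K_X * n / N) for n in range(N)]
j = [J_PEAK * math.sin(2 * math.pi * K_J * n / N) for n in range(N)]

# Sampled error areas d_n * j_n / T_s with T_s = 1. For n = 0 the index -1
# wraps around, which is consistent because the sequence is periodic in N.
e = [(q[n] - q[n - 1]) * j[n] for n in range(N)]


def dft_bin(x, k):
    """Magnitude of DFT bin k, normalized by the sequence length."""
    acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x)) for n in range(len(x)))
    return abs(acc) / len(x)


for k in (K_X - K_J, K_X, K_X + K_J):
    print(f"bin {k}: {dft_bin(e, k):.3e}")
```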


In fig.25 the output spectrum is shown for each of these three cases. In real life the jitter
spectrum will be a combination of both white, pink and sinusoid components. White and pink
jitter noise is as mentioned typically caused by thermal and 1/f noise in the clock circuitry and
by transmission noise. Sinusoid jitter can stem from parasitic coupling between signal lines or
supply ripple, with a significant contribution also being the SP-DIF interface. As highlighted
in [62], the jitter spectrum of an SP-DIF connection has lots of signal correlated sidebands.
Whereas low jitter noise is a matter of good circuit design and sufficient power for high
SNR implementation, the jitter sideband distortion is more difficult to assess and control.
Jitter noise is part of the DAC's intrinsic noise and thus limits its overall SNR performance.
Sideband distortion on the other hand stems largely from external audio sources or from the
interface, and is usually not included in a DAC specification sheet. Dunn suggested a so-
called J-test [63] for standardizing measurements of DAC jitter immunity, which has been
quite widely adopted in the audio community.

Figure 25: Jitter distortion from sinusoid, white and pink jitter

Due to the ear's frequency masking (fig.4), the audibility of jitter sidebands depends both
on their magnitude and distance in frequency from the signal content. Published jitter
audibility threshold estimates vary from hundreds of nanoseconds [74] to tens of picoseconds
[63]. What is certain is that the jitter PSD is very important to the audibility and that HF jitter
is more critical than LF jitter. Fortunately PLL-based oscillators have a low-pass JTF. It has
also become increasingly popular to use asynchronous sample-rate converters which often
feature JTFs with very low cut-off frequency [75]. ASRCs were originally intended to enable
the connection of several digital sources with different sampling rates in one system, e.g. in a
studio. But they have increasingly been incorporated in consumer audio equipment like CD-
players because of jitter concerns. The thesis by Rotacher [76] provides a comprehensive
review of the properties and design of asynchronous sample-rate converters.

2.7.2 Static (mismatch) errors

Static errors in a DAC are caused by physical mismatch between element weights, which
always occur because of production inaccuracies. In the case of a current steering DAC there
is mismatch between transistors used to realize current sources, caused by on-chip
temperature deviations, threshold voltage variations and variations in the gate oxide thickness
[77]. These errors are usually modelled as random stochastic variables although they may in
reality be graded [78]. Error grading can be minimized through layout techniques such as
common centroid, but a designer should know that random error modelling has limitations.
To generalize the notation, element weights are denoted as a set of non-physical variables
$w_i$, ideally being of unity value³. A generalized schematic of the binary weighted DAC where
q' consists of bits $b_0$ to $b_{B-1}$ will then look like fig.26.

Figure 26: Generalized schematic of binary encoded DAC

The mean element weight is:

$$\bar{w} = \frac{\sum_{i=0}^{B-1} 2^i w_i}{2^B-1} \tag{40}$$

Ideally the DAC would have a perfectly linear transfer curve from 0 to $2^B-1$, but because of
non-ideal element weights the real one is non-linear. The deviation from linearity or INL as a
function of q', derived from $q'=[b_0\ b_1 \ldots b_{B-1}]$, is:

$$INL(q') = \sum_{i=0}^{B-1} 2^i b_i w_i - \sum_{i=0}^{B-1} 2^i b_i \bar{w} = \sum_{i=0}^{B-1} 2^i b_i \left(w_i - \bar{w}\right) \tag{41}$$

As seen the MSB or close to MSB elements are very critical for the INL, which is the
reason why binary encoding is not optimal for high resolution DACs. The relative accuracy of
the largest weight must be at least $1/2^B$ for the DAC to have a monotonically increasing
transfer function. This implies the need for transistors with a very large gate area.

Figure 27: Generalized schematic of thermometer encoded DAC

³ Referred to q; if referring to the input x in a re-quantizing system the ideal weight is $\Delta$.

If the DAC is thermometer encoded the mean weight of its $2^B-1$ equal elements is:

$$\bar{w} = \frac{\sum_{i=0}^{2^B-2} w_i}{2^B-1} \tag{42}$$

Thermometer encoded, i.e. with two-level representation where $t_0 \ldots t_{q'-1}=1$ and $t_{q'} \ldots t_{2^B-2}=0$,
the INL becomes:

$$INL(q') = \sum_{i=0}^{q'-1} w_i - q'\bar{w} = \sum_{i=0}^{2^B-2} t_i \left(w_i - \bar{w}\right) \tag{43}$$

Relative element matching is now much less critical. If the mismatch is 1% for any given
DAC element, its INL contribution is 0.01 LSB. Furthermore it is guaranteed that the DAC
transfer function will be monotonically increasing since the total output value always
increases when more elements are connected to the output. This makes a thermometer
encoded DAC, albeit having significantly higher routing complexity due to its $2^B-1$ elements,
often the desirable alternative.
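
The difference between the two encodings can be made concrete with a small numerical sketch of (41) and (43). The element weights below are illustrative assumptions: a 4-bit binary DAC whose MSB element is 1% too large, and a 15-element thermometer DAC where a single unit element is 1% too large. The function names are likewise chosen for this example.

```python
def inl_binary(weights):
    """INL per code for a binary DAC; weights[i] is the relative weight of bit i."""
    bits = len(weights)
    w_mean = sum((1 << i) * w for i, w in enumerate(weights)) / ((1 << bits) - 1)
    inl = []
    for code in range(1 << bits):
        inl.append(sum((1 << i) * (weights[i] - w_mean)
                       for i in range(bits) if (code >> i) & 1))
    return inl


def inl_thermometer(weights):
    """INL per code for a thermometer DAC with len(weights) unit elements."""
    w_mean = sum(weights) / len(weights)
    inl, acc = [0.0], 0.0
    for w in weights:
        acc += w - w_mean          # one more element connected to the output
        inl.append(acc)
    return inl


binary_inl = inl_binary([1.0, 1.0, 1.0, 1.01])       # MSB element 1% too large
therm_inl = inl_thermometer([1.01] + [1.0] * 14)     # one unit element 1% too large

print("worst binary INL [LSB]:     ", max(abs(v) for v in binary_inl))
print("worst thermometer INL [LSB]:", max(abs(v) for v in therm_inl))
```

In both cases the INL is zero at the end points by construction, since the mean weight defines the reference line.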

Figure 28: DAC transfer function, ideal and with INL

As will be seen later, the redundant nature of thermometer encoding can also be exploited to
implement digital algorithms for mismatch-shaping, so-called dynamic element matching.
DEM performs spectral shaping of mismatch errors. Although the spectral distortion as the
result of a given INL curve must be found through simulations, it is later shown how to
estimate it for common DEM algorithms under the assumption of random element mismatch.

2.7.3 Dynamic (switching) errors

In addition to mismatch errors which are time invariant (except if caused by temperature
variations), another major source of waveform distortion is switching errors. In a current-
steering DAC, switching errors can be caused by charge injection from the transistor switches
as well as finite rise and fall times due to parasitic capacitances [60]. Again this thesis does
not explore the circuit behaviour, but builds its analysis on generalized error waveform
modelling [79].

If the DAC is thermometer encoded and all elements are identical, it can be assumed that
switching on one element is associated with an on-error pulse $e_{on}(t)$ and switching off an
element is associated with an off-error pulse $e_{off}(t)$. Figure 29 shows the error for one element.
To derive the spectral distortion this causes will require exact knowledge about the time
domain behaviour of the error waveform. This is not possible to find for any but the simplest
circuit approximations and pulse simulation would still be very computationally demanding
as pointed out in the jitter section. Therefore the switching error analysis just like the jitter
error analysis uses error area modelling.

Figure 29: DAC element on and off switching and error waveform.

Error area modelling assumes that a net error area is added to or subtracted from each
sample depending on how many elements are switched on or off. The mathematical analogy is
again to assume that error pulses are Dirac pulses with a white spectrum and with strength
given by the error area.
In a thermometer encoded DAC it is seen from table 1 that if the DAC input value increases
from sample n-1 to sample n, a total of q[n]-q[n-1] elements are switched on. If the DAC
input value decreases, a total of |q[n]-q[n-1]| elements are switched off. The error associated
with the sample sequence, also called its ISI, is therefore:

$$e_{ISI}[n] = \begin{cases} \left(q[n]-q[n-1]\right)e_{on} = d_n e_{on}, & d_n \ge 0 \\ \left(q[n]-q[n-1]\right)e_{off} = d_n e_{off}, & d_n < 0 \end{cases} \tag{44}$$

From the ISI sequence it is relatively straightforward to calculate the waveform distortion
produced at the output if q is sinusoid. As converter resolution increased throughout the
1980s, ISI became a dominant source of distortion and designers found it increasingly
difficult to keep the switching errors sufficiently small. This was particularly problematic for
the 1-bit DSM DACs dominant at the time, because the entire full scale value is switched each
time the DSM output changes. An interesting discussion on these difficulties is found in a
1986 paper by Adams [49], and proposed solutions like return-to-zero (RZ) appeared quickly
thereafter [80].
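
The ISI sequence of (44) is simple to compute. The sketch below assumes the sign convention in which the signed step d_n multiplies the per-element error area in both branches; that convention, the function names and the numerical error areas are assumptions made for this illustration. The second function rewrites the same sequence as a part proportional to d_n plus a part proportional to |d_n|, which makes visible that only a difference between the on and off error areas produces a non-linear term under this convention.

```python
def isi_sequence(q, e_on, e_off):
    """ISI error per sample for a thermometer DAC, starting at n = 1."""
    isi = []
    for n in range(1, len(q)):
        d = q[n] - q[n - 1]                  # signed number of elements switched
        isi.append(d * e_on if d >= 0 else d * e_off)
    return isi


def isi_split(q, e_on, e_off):
    """Same sequence written as a linear part in d_n plus a part in |d_n|."""
    linear = 0.5 * (e_on + e_off)
    nonlinear = 0.5 * (e_on - e_off)
    return [linear * (q[n] - q[n - 1]) + nonlinear * abs(q[n] - q[n - 1])
            for n in range(1, len(q))]


codes = [0, 3, 1, 1, 4, 0]                   # illustrative DAC input sequence
print(isi_sequence(codes, e_on=0.02, e_off=0.01))
print(isi_split(codes, e_on=0.02, e_off=0.01))
```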


2.7.4 Finite output impedance

In addition to mismatch, another static distortion source in a current steering DAC is the
finite output impedance of the current sources. For any DAC input value k there are k unit
current sources coupled to the output in parallel. This means that the equivalent small signal
circuit is as shown in fig.30.

Figure 30: Equivalent small signal circuit for current steering DAC

For simplicity the output impedance is assumed purely resistive, for most cases a valid
approximation [81], and for low audio frequencies it is also assumed that the load given by
the IV-converter is resistive. The output voltage then becomes:

$$V_o(q') = \frac{q' I_{ref}}{g_L + q' g_o} \tag{45}$$

In (45) g is the conductance value g=1/R. The mean LSB weight for an M-level DAC is:

$$\bar{w} = \frac{V_o(M)}{M} = \frac{I_{ref}}{g_L + M g_o} \tag{46}$$

This results in the output INL:

$$INL(q') = V_o(q') - \sum_{i=0}^{q'-1}\bar{w} = V_o(q') - q'\bar{w} \approx \frac{R_L^2}{R_o}\,q'\left(M-q'\right)I_{ref} \tag{47}$$

If the current-mode DAC is differential which is usually the case in high resolution
implementations, the output voltage is:

$$V_o(q') = \frac{q' I_{ref}}{g_L + q' g_o} - \frac{(M-q')\,I_{ref}}{g_L + (M-q')\,g_o} \tag{48}$$

By simple manipulation of the single-end procedure the INL is then found to be:

$$INL(q') = \frac{g_o^2\,q'\left(q'-M\right)\left(2q'-M\right)I_{ref}}{\left(g_L^2 + g_L g_o M + g_o^2 M q' - g_o^2 {q'}^2\right)\left(g_L + M g_o\right)} \tag{49}$$

Not surprisingly the single-ended INL is a parabola and the output SFDR is HD2 limited,
whereas in the differential case the INL is anti-symmetric around the centre point and the
SFDR is limited by HD3.
Figure 31 shows the INL pattern caused by finite current source output impedance for both
cases. The DAC is 15-level and the y-axis is normalized to the LSB. The current source
output resistance is $R_o=1\,M\Omega$ and the load resistance is $R_L=50\,\Omega$. Given that the IV-converter
represents a 50 Ω load, 20-bit linearity, or INL below -120dB relative to full-scale, will require
current elements with an output impedance of approximately 150 kΩ in the differential case
and a momentous 200 MΩ in the single ended case. It is thus very obvious why differential
output is strongly preferred for high resolution applications.
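
The quoted impedance requirements can be checked numerically from the output voltage expressions (45) and (48). The sketch below is a minimal example under stated assumptions: M=15 unit current sources, a 50 Ω resistive load, unit reference current, and INL measured against the straight line through the end points of the transfer curve. Function names are chosen for this illustration.

```python
import math


def vo_single(q, r_o, r_l, i_ref=1.0):
    """Single-ended output voltage with q unit sources on the load, as in (45)."""
    return q * i_ref / (1.0 / r_l + q / r_o)


def vo_diff(q, m, r_o, r_l, i_ref=1.0):
    """Differential output: q sources on one side, m - q on the other, as in (48)."""
    return vo_single(q, r_o, r_l, i_ref) - vo_single(m - q, r_o, r_l, i_ref)


def worst_inl_db(transfer, m):
    """Worst-case end-point INL relative to full scale, in dB."""
    v0, vm = transfer(0), transfer(m)
    lsb = (vm - v0) / m
    worst = max(abs(transfer(q) - v0 - q * lsb) for q in range(m + 1))
    return 20.0 * math.log10(worst / (vm - v0))


M, R_L = 15, 50.0
single_db = worst_inl_db(lambda q: vo_single(q, 200e6, R_L), M)
diff_db = worst_inl_db(lambda q: vo_diff(q, M, 150e3, R_L), M)
print(f"single-ended, R_o = 200 Mohm: {single_db:.1f} dB re. full scale")
print(f"differential, R_o = 150 kohm: {diff_db:.1f} dB re. full scale")
```

Both results land in the vicinity of the -120dB target even though the two output resistances differ by more than three orders of magnitude.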

Figure 31: INL from finite output impedance

In consumer audio equipment it is very common for the full-scale output voltage to be
2 V RMS. This is not a formal standard, but has established itself as a de facto standard. If the
IV-converter is a 50 Ω passive resistor the DAC output current would have to be 40 mA RMS. To
increase the transresistance of the IV-converter while maintaining a low load resistance, an
active transresistance amplifier as shown in fig.23 must be used. For the basic circuit in fig.23
the transresistance is approximately equal to $R_F$ if the opamp has high open loop gain,
whereas the equivalent load resistance it represents is:

$$R_L = \frac{R_{in} R_F}{\left(A_0+1\right)R_{in} + R_F} \tag{50}$$

$A_0$ is the opamp open loop gain and $R_{in}$ is the opamp input resistance. Since the
transresistance is approximately equal to $R_F$ it is given that $V_o \approx I_{out} R_F$. The necessary size of
$I_{out}$ and by extension $R_F$ is determined by the fundamental noise limitation of the IV-converter
and the SNR requirement for the system.
Illustrating by example: if the IV-converter SNR requirement for 2 V RMS output is specified
to 125dB@0-20kHz, its white noise voltage density must be below $8\,nV/\sqrt{Hz}$. If the opamp's
input referred noise current density is $5\,pA/\sqrt{Hz}$, a balanced version of fig.23 with two
feedback resistors and one opamp gives the following noise equation:


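$$\sqrt{2\cdot 4kTR_F + \left(5\cdot 10^{-12}\,\frac{A}{\sqrt{Hz}}\cdot R_F\right)^2} \le 8\,\frac{nV}{\sqrt{Hz}} \quad\Rightarrow\quad R_F \lesssim 1\,k\Omega \tag{51}$$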

From this it is found that the minimum DAC output current requirement is:

$$I_{out,RMS} \ge \frac{2\,V_{RMS}}{R_F} = 2\,mA_{RMS} \tag{52}$$
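
The numbers in this example can be reproduced with a short calculation. The sketch below assumes a temperature of 300 K, thermal noise from two feedback resistors and the opamp noise current flowing through one feedback resistance; this noise model and the variable names are assumptions made for illustration, in line with the balanced circuit described above.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant [J/K]
TEMP = 300.0             # assumed temperature [K]

V_FS = 2.0               # full-scale output [V RMS]
SNR_DB = 125.0           # required IV-converter SNR [dB]
BANDWIDTH = 20e3         # noise bandwidth [Hz]

# Allowed white noise voltage density for the given SNR and bandwidth.
density_limit = V_FS / 10 ** (SNR_DB / 20) / math.sqrt(BANDWIDTH)


def iv_noise_density(r_f, i_n=5e-12):
    """Output noise density: two resistors' thermal noise plus current noise."""
    return math.sqrt(2 * 4 * K_B * TEMP * r_f + (i_n * r_f) ** 2)


R_F = 1e3
i_out = V_FS / R_F       # required full-scale output current [A RMS]

print(f"allowed density:      {density_limit * 1e9:.2f} nV/sqrt(Hz)")
print(f"density at R_F = 1k:  {iv_noise_density(R_F) * 1e9:.2f} nV/sqrt(Hz)")
print(f"output current:       {i_out * 1e3:.1f} mA RMS")
```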

Note that this does not include any noise produced in the DAC itself, such as quantization
noise, noise in the current sources, parasitically coupled noise, jitter noise and so on. It is
therefore normal to include some headroom. Good design practice is to aim for a white noise
dominated system where the IV-converter's white noise specification is 3-6dB better than the
total system SNR, whereas all other noise sources such as quantization noise are designed 3-
6dB better than this again.
This means that if the targeted final system SNR is 120dB, the DAC output current is
perhaps scaled for 125dB SNR IV-conversion, while noise contributions in the DAC (current
source noise, mismatch noise, jitter noise, quantization noise and so on) are all specified to
be below -130dBFS. As an example the TI DSD1792A, a high end current-mode DAC with
127dBA SNR at 2 V RMS, has 2.75 mA RMS full-scale output current [82].

Chapter 3

The Delta-Sigma Modulator

This chapter takes a closer look at the delta-sigma modulator. Different architectures are
described, research in stability theory is briefly reviewed and it is discussed how non-ideal
behaviour affects the modulator's performance. The reader should through this gain insight in
how delta-sigma modulators are designed and how things like the number of bits in the
re-quantizer, the NTF and the OSR affect the performance and design conditions.

3.1 Delta Sigma Modulator Design

As seen in ch.2.5 delta-sigma modulation is in principle relatively straightforward. For
convenience the basic structure and the input-output relation is repeated:

Figure 32: Basic delta-sigma modulator

$$Q(z) = \frac{H(z)}{1+H(z)}\,X(z) + \frac{1}{1+H(z)}\,E_q(z) = STF(z)\,X(z) + NTF(z)\,E_q(z) \tag{53}$$

To realize modN noise shaping or $NTF(z)=(1-z^{-1})^N$ with this structure we have that:

$$H(z) = \frac{1}{NTF(z)} - 1 = \frac{1-\left(1-z^{-1}\right)^N}{\left(1-z^{-1}\right)^N}, \qquad STF(z) = 1-\left(1-z^{-1}\right)^N \tag{54}$$

The processing gain of this NTF is shown in fig.20 and it is seen by inspection that the STF
is approximately unity for $\omega \ll \pi$. Although fine for most purposes, the basic modulator
structure can be improved by adding a direct feed-forward path to the quantizer input as
shown in fig.33. This is often referred to as a Silva-Steensgaard structure [83]. The input-
output relation is now:

$$Q(z) = X(z) + \frac{1}{1+H(z)}\,E_q(z) \tag{55}$$


Figure 33: The Silva-Steensgaard modified DSM structure

To add this path forces a unity STF which allows the NTF to be chosen arbitrarily. Another
improvement that is especially of relevance in ADC design is that the input to the loop filter
H(z), which as seen from the figure is given by x-q, now only consists of shaped noise. In
an ADC the loop-filter is analog and will have some degree of nonlinearity, but since the filter
now doesn't process x, nonlinearity will not cause signal distortion⁴.
These two DSM structures are both global feedback systems, where the output is fed back
and subtracted from the input in a single node. This is not very flexible and it is entirely
possible to design the STF and NTF separately by having different inputs to the loop filter for
the input and feedback terms. A generalization of the basic structure with separate filter inputs
is shown in fig.34.

Figure 34: Generalized DSM structure

$$Q(z) = \frac{L_0(z)}{1-L_1(z)}\,X(z) + \frac{1}{1-L_1(z)}\,E_q(z) = STF(z)\,X(z) + NTF(z)\,E_q(z) \tag{56}$$

It is seen that if $L_0=-L_1=H$, the STF and NTF are identical to the basic structure. If $L_0=1-L_1$
the STF is unity regardless of how $L_1$, and from it the NTF, is designed.
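
Both statements are easy to verify numerically from (56). The sketch below evaluates the transfer functions at one point on the unit circle, assuming a first-order loop filter H(z)=1/(z-1) as a concrete example; the helper name and the chosen frequency are illustrative assumptions.

```python
import cmath


def transfer_functions(l0, l1):
    """STF and NTF of the generalized structure for given L0(z) and L1(z)."""
    return l0 / (1 - l1), 1 / (1 - l1)


z = cmath.exp(1j * 0.1)                 # evaluation point on the unit circle
h = 1 / (z - 1)                         # example loop filter: delaying integrator

stf_basic, ntf_basic = h / (1 + h), 1 / (1 + h)      # basic structure
stf_a, ntf_a = transfer_functions(h, -h)             # L0 = -L1 = H
stf_b, ntf_b = transfer_functions(1 + h, -h)         # L0 = 1 - L1

print("L0 = -L1 = H  :", abs(stf_a - stf_basic), abs(ntf_a - ntf_basic))
print("L0 = 1 - L1   :", abs(stf_b - 1), abs(ntf_b - ntf_basic))
```

All four printed differences are zero to within rounding, confirming that the choice of L0 only affects the STF.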
A very common implementation strategy is to use a distributed feedback structure, which in
its basic modN form is shown in fig.35. Looking at fig.32, the loop filter for making a mod1
NTF will be a single integrator H(z)= I(z)=1/(z-1). As it turns out the DSM can be increased
from mod1 to modN simply by cascading N integrators, given that the output is fed back to
each integrator input. This can be verified through the generalized structure in fig.34.
The input signal goes directly through all integrators, while the feedback signal is split into
N branches where each is subtracted and goes through all integrators up to the quantizer:

$$L_0(z) = \prod_{k=1}^{N} I(z) = \frac{1}{(z-1)^N}. \tag{57}$$

$$L_1(z) = -\sum_{k=1}^{N} I^k(z) = -I\,\frac{I^N-1}{I-1}. \tag{58}$$

Footnote 4: If the quantization error is input dependent in the first statistical moment, i.e. correlated with
the input, there may be some distortion, but much less than with the basic structure.

Using (56) to find the NTF and STF it is given that:

$$\mathrm{NTF}(z) = \frac{1}{1-L_1(z)} = (1-z^{-1})^N. \tag{59}$$

$$\mathrm{STF}(z) = \frac{L_0(z)}{1-L_1(z)} = z^{-N}. \tag{60}$$

Figure 35: Basic modN distributed feedback DSM

To cascade integrators like this was actually the original idea for improving on mod1 noise
shaping [31]. Analyses of second [84] and higher order [85] implementations followed. It
should be noted that although I(z) is here assumed to be a delaying integrator 1/(z-1), non-
delaying integrators I(z)=z/(z-1) can be used for all instances but the innermost to reduce
modulator latency. A few samples latency is however normally tolerable and in a DAC this
choice makes little difference. In an ADC it may be advantageous to use delaying integrators
since each one can then settle independently in a switched-capacitor loop filter. With only delaying
integrators the modulator is often called a Boser-Wooley DSM [86].
The realizability condition from ch.2.5, that there are no delay-free loops in the system,
can be formalized by writing the NTF in a form with which it must comply:

$$\mathrm{NTF}(z) = \prod_{k=1}^{N} \frac{z-z_k}{z-p_k}. \tag{61}$$

Here $z_k$ and $p_k$ denote the zeros and poles of the meromorphic NTF function. Until now only
FIR NTFs with all zeros at DC and all poles at the origin have been considered. A problem
with higher order DSMs of this kind is instability [87] and to avoid it one may have to move
poles rightward, giving less peak gain around $f_s/2$ at the cost of less damping around DC. Doing
so is quite intuitive with distributed feedback where damping the NTF means damping the
feedback terms. Maintaining STF control is done with corresponding feed-forward
coefficients, giving the structure in fig.36. Such a generic structure has a high degree of NTF
and STF controllability.

Figure 36: Generalized distributed feedback DSM



From this figure it is found that now:

$$L_0(z) = \sum_{k=1}^{N+1} b_k\,I^{N+1-k}(z) = \frac{b_1 + b_2(z-1) + \cdots + b_{N+1}(z-1)^N}{(z-1)^N}. \tag{62}$$

$$L_1(z) = -\sum_{k=1}^{N} a_k\,I^{N+1-k}(z) = -\frac{a_1 + a_2(z-1) + \cdots + a_N(z-1)^{N-1}}{(z-1)^N}. \tag{63}$$

And by the same algebraic manipulation as before:

$$\mathrm{NTF}(z) = \frac{1}{1-L_1(z)} = \frac{(z-1)^N}{a_1 + a_2(z-1) + \cdots + a_N(z-1)^{N-1} + (z-1)^N}. \tag{64}$$

$$\mathrm{STF}(z) = \frac{L_0(z)}{1-L_1(z)} = \frac{b_1 + b_2(z-1) + \cdots + b_N(z-1)^{N-1} + b_{N+1}(z-1)^N}{a_1 + a_2(z-1) + \cdots + a_N(z-1)^{N-1} + (z-1)^N}. \tag{65}$$

The NTF still has all its zeros at DC, but the poles are now determined by the feedback
coefficients. The STF has the same poles as the NTF and its zeros are controlled by the feed-
forward coefficients. If $b_k=a_k$ and $b_{N+1}=1$ the zeros cancel the poles and the STF is unity. If all
input branches but $b_1$ are removed the zeros are at infinity and the STF is $b_1/A(z)$ where $A(z)$ is
the feedback polynomial. This is typically a low-pass function which can be of benefit for
suppressing alias residues from the interpolation. The STF can also be designed to
compensate for passband droop in the interpolation filter.
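How the feedback coefficients set the NTF poles can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than a design procedure from the text: it builds the numerator and denominator of (64) as polynomials in z and uses binomial feedback weights, a choice assumed here because it collapses the denominator to $z^N$, i.e. all poles at the origin. The helper names `ntf_polys` and `binomial_feedback` are hypothetical.

```python
import numpy as np
from math import comb

def ntf_polys(a):
    """Numerator and denominator of eq. (64) as numpy.poly1d objects in z.

    a = [a_1, ..., a_N] are the feedback coefficients.
    """
    n = len(a)
    zm1 = np.poly1d([1.0, -1.0])          # the factor (z - 1)
    num = zm1 ** n
    den = zm1 ** n
    for k, a_k in enumerate(a):           # a_{k+1} multiplies (z - 1)^k
        den = den + a_k * zm1 ** k
    return num, den

def binomial_feedback(n):
    """Assumed illustrative weights a_{k+1} = C(N, k), k = 0..N-1."""
    return [comb(n, k) for k in range(n)]

a = binomial_feedback(3)                  # [1, 3, 3]
num, den = ntf_polys(a)
print(den.coeffs)                         # coefficients of z^3: all poles at z = 0
print(np.abs(den.roots))
```

Scaling the $a_k$ down from these values moves the poles away from the origin, which is the damping of the feedback terms described above.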
Another improvement that can be made is to replace some of the integrators with resonators
to spread NTF zeros across the signal band. This can give lower total in-band noise power and
improve the processing gain, especially for low OSR. It is also possible to optimize the NTF
from a psychoacoustic point of view by placing zeros where the hearing is most sensitive
[88]. A resonator introduces a pair of zeros at a resonance frequency $\omega_r$, so each resonator
must be built from two integrators. Typically a low-pass DSM has at least one NTF zero at
DC, so to add one pair of non-DC zeros the modulator must be at least third order, for two
pairs fifth order and so on. An example of the former is shown in fig.37. Note that at least one
integrator core in each resonator must be delaying to avoid delay-free loops.

Figure 37: Distributed feedback DSM with resonator for NTF optimization

With only delaying integrators, the resonator has a local loop transfer function of:

$$R(z) = \frac{a_2 + a_3(z-1)}{z^2 - 2z + 1 + g_1}. \tag{66}$$


This means that its poles, which will equal NTF zeros, are located at:

$$z_r = 1 \pm i\sqrt{g_1} \quad\Rightarrow\quad \omega_r \approx \sqrt{g_1} \quad \text{for } \omega_r \ll \pi. \tag{67}$$

It follows that if an NTF zero at e.g. 4kHz is desired and the sampling rate is 44.1kHz×64, then
$\omega_r = 2\pi\cdot 4000/(64\cdot 44100) \approx 8.9\cdot 10^{-3}$ and the resonator coefficient should be
$g_1 = \omega_r^2 \approx 7.9\cdot 10^{-5}$. Figure 38 compares a fifth order NTF with
zeros optimized for an OSR of 64 to one where all the zeros are at DC.
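The relation between the resonator coefficient and the zero frequency can be evaluated directly from the denominator of (66), whose roots satisfy $(z-1)^2+g_1=0$. The following is a small sketch under that assumption; the function names are hypothetical and the exact relation $\tan\omega_r=\sqrt{g_1}$ is used instead of the small-angle approximation.

```python
import cmath
import math

def resonator_zero_frequency(g1, fs):
    """Frequency in Hz of the NTF zero pair given by (z - 1)^2 + g1 = 0."""
    z_r = 1 + 1j * math.sqrt(g1)          # upper zero of the conjugate pair
    return cmath.phase(z_r) * fs / (2 * math.pi)

def resonator_coefficient(f_zero, fs):
    """g1 placing the zero pair at f_zero, from tan(w_r) = sqrt(g1)."""
    return math.tan(2 * math.pi * f_zero / fs) ** 2

fs = 64 * 44100.0                         # oversampled rate, OSR = 64
g1 = resonator_coefficient(4000.0, fs)
print(g1, resonator_zero_frequency(g1, fs))
```

Note that the zero pair lies slightly outside the unit circle, at radius $\sqrt{1+g_1}$, which is negligible for coefficients this small.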

Figure 38: Optimization of NTF zeros

In addition to the described distributed feedback, another popular structure is distributed
feed-forward. Then there is one global feedback path and several feed-forward branches to the
quantizer input as shown in fig.39. The NTF and STF can be derived using the same
procedure via the generalized structure of fig.34; this is left as an exercise for the reader.
Resonators can be inserted the same way as before. The advantage with this approach is that
each integrator sits in a local Silva-Steensgaard loop, meaning no integrator inputs contain any
signal component. As mentioned this gives linearity benefits in ADC implementations. A fifth
order FF-DSM is Sony's recommendation to use in ADCs for the SACD-format [89].

Figure 39: Distributed feed-forward DSM structure



3.2 Alternative Delta-Sigma Structures

From the basic topologies reviewed, many modified or slightly different modulators have
been published where some tricks are typically used to improve a certain design parameter.
To show all these would be much too comprehensive for this text, but a few of the most
significant are introduced; namely the error feedback modulator, the multi-stage or MASH
modulator and the Trellis modulator. All these have been shown in audio applications.
Beginning with the error-feedback modulator; this is a simple and seemingly very attractive
structure that is unfortunately unsuitable for ADCs but frequently used in DACs [90]-[91]. It
is shown in fig.40. The reason that it is not suitable for ADC use is apparent, since there is an
analog subtraction and a loop filter in the feedback path with no error suppression from it to
the final output. In a distributed output feedback DSM only the first integrator is without error
suppression.

Figure 40: The error-feedback DSM structure

$$Q(z) = X(z) + \left(1-H(z)\right)E_q(z) = X(z) + \mathrm{NTF}(z)\,E_q(z). \tag{68}$$

In a DAC this structure makes it simple to realize the loop filter since it is only 1-NTF(z).
Especially if the modulator can handle a basic modN NTF it is simple to implement H(z) since
it is then an FIR filter. Contrary to what is reported in [91], the stability constraints (more about this in
ch.3.3) are the same as for a regular DSM with identical NTF and unity STF.
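The error-feedback loop is short enough to simulate directly. The sketch below is a hedged illustration, not the implementation of [90]-[91]: it assumes an FIR NTF with leading coefficient 1, an ideal uniform quantizer with unlimited range (so no overload occurs), and zero initial state. The function name `error_feedback_dsm` is hypothetical.

```python
import numpy as np

def error_feedback_dsm(x, ntf, step=1.0):
    """Error-feedback modulator with FIR NTF (ntf[0] must be 1).

    Returns the output q and the quantization error e = q - v,
    where v is the quantizer input.
    """
    h = -np.asarray(ntf[1:], dtype=float)  # H(z) = 1 - NTF(z), taps for z^-1, z^-2, ...
    e = np.zeros(len(x) + len(h))          # error history with zero initial state
    q = np.zeros(len(x))
    for n, x_n in enumerate(x):
        past = e[n:n + len(h)][::-1]       # e[n-1], e[n-2], ...
        v = x_n - np.dot(h, past)          # subtract the filtered past errors
        q[n] = step * np.round(v / step)   # ideal uniform quantizer, no clipping
        e[n + len(h)] = q[n] - v
    return q, e[len(h):]

n = np.arange(200)
x = 0.5 * np.sin(2 * np.pi * n / 50)
ntf = [1.0, -2.0, 1.0]                     # mod2: (1 - z^-1)^2
q, e = error_feedback_dsm(x, ntf)

# Per (68), q - x equals the error filtered by the NTF.
shaped = np.convolve(e, ntf)[:len(x)]
print(np.max(np.abs(q - x - shaped)))
```

For a basic modN NTF the taps of H(z) are small integers, which is why the filter is simple to realize in a DAC.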
Another frequently used alternative is multi-stage or MASH noise shaping. MASH was first
introduced in 1986 as a way to obtain mod2 and later mod3 DSMs using single integrator
loops [92]-[93]. Generalized MASH was analyzed in a 1989 publication [94]. A MASH is
made up of cascaded sub-modulators, and is often described as an $o_1$-$o_2$-...-$o_M$ MASH where $o_k$
is the order of modulator k in the cascade. Many applications use a 1-0 MASH (often called a
Leslie-Singh modulator), but in audio it is more common to use a 2-1-1 MASH. A two-stage
MASH example is shown in fig.41, with its input-output relation given in (69).



Figure 41: A two-stage MASH modulator

$$\begin{aligned} Q(z) &= H_1(z)\,U_1(z) + H_2(z)\,U_2(z) \\ &= H_1(z)\left[\mathrm{STF}_1(z)\,X(z) + \mathrm{NTF}_1(z)\,E_{q1}(z)\right] \\ &\quad + H_2(z)\left[-\mathrm{STF}_2(z)\,E_{q1}(z) + \mathrm{NTF}_2(z)\,E_{q2}(z)\right]. \end{aligned} \tag{69}$$

The first stage quantization error will be cancelled if fulfilling the condition:

$$H_1(z)\,\mathrm{NTF}_1(z) - H_2(z)\,\mathrm{STF}_2(z) = 0. \tag{70}$$

(70) is fulfilled if $H_1(z)=\mathrm{STF}_2(z)$ and $H_2(z)=\mathrm{NTF}_1(z)$. The output is then given by:

$$Q(z) = \mathrm{STF}_1(z)\,\mathrm{STF}_2(z)\,X(z) + \mathrm{NTF}_1(z)\,\mathrm{NTF}_2(z)\,E_{q2}(z). \tag{71}$$

It is seen that if both the first and second stages are mod2 the total NTF will be mod4, or
more generally $o_{MASH}=\sum o_k$. At the same time loop stability is determined by the low order
sub-modulators, which is greatly advantageous. For DA conversion the disadvantage with MASH
is that it cannot possibly be used to realize a two-level DAC output, since filtering and
recombination of the sub-modulator output terms will produce a multi-level signal. In MASH
ADCs this is not a problem, but there may be some leakage of $e_{q1}$ since the analog modulator
loop cannot be made to match exactly the digital post-filter function. Still the MASH can be a
useful structure for both.
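The cancellation of the first-stage error can be reproduced with a small time-domain simulation. The sketch below assumes a 1-1 MASH built from two first-order error-feedback stages with 1-bit quantizers, unity STFs, and the negated first-stage error as second-stage input; these choices and all names are assumptions for illustration only.

```python
import numpy as np

def sign_quantizer(v):
    """1-bit quantizer with outputs +1 and -1."""
    return 1.0 if v >= 0 else -1.0

def mod1_stage(x):
    """First-order stage: q[n] = x[n] + e[n] - e[n-1], zero initial state."""
    q = np.zeros(len(x))
    e = np.zeros(len(x))
    e_prev = 0.0
    for n in range(len(x)):
        v = x[n] - e_prev
        q[n] = sign_quantizer(v)
        e[n] = q[n] - v
        e_prev = e[n]
    return q, e

def first_difference(s):
    """Apply NTF_1(z) = 1 - z^-1 with zero initial state."""
    return s - np.concatenate(([0.0], s[:-1]))

n = np.arange(256)
x = 0.4 * np.sin(2 * np.pi * n / 64)
q1, e1 = mod1_stage(x)                  # first stage, error e_q1
q2, e2 = mod1_stage(-e1)                # second stage quantizes -e_q1
out = q1 + first_difference(q2)         # H_1 = STF_2 = 1, H_2 = NTF_1

# Per (71) only the second-stage error remains, shaped by NTF_1 * NTF_2.
expected = x + first_difference(first_difference(e2))
print(np.max(np.abs(out - expected)), np.unique(out))
```

The recombined output takes more than two distinct values, which illustrates why a MASH cannot drive a two-level DAC directly.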
The last modulator structure that is looked at in this section is the Trellis noise shaping
modulator. TNSM is a look-ahead modulator scheme used to improve 1-bit noise-shaped
encoding. The basic principle for look-ahead modulation is based on the ultimate modulator
shown in fig.42, and TNSM was first introduced in a 2002 publication by Kato [95].



Figure 42: Principle for the ultimate modulator

The complete input data sequence can be written as a vector x. In an ultimate scenario, x
should be compared to every possible permutation of an equally long binary output vector and
the one producing the smallest total error should be chosen as the output. Since in-band noise
is sought to be minimized, the error evaluation is weighted by H(z) meaning a noise shaping
function NTF(z)=1/H(z) is imposed. Finding the smallest error is formalized as minimizing an
error cost function, typically the MSE. The single permutation giving minimum cost is the
ultimate output sequence. Obviously this is not feasible to implement, as x could be infinitely
long and an ultimate modulator would need infinite memory and storage.
Consider a single sample instant: A binary ultimate modulator has the choice of setting
q[n]=0 or q[n]=1. In the next sample instant it has two options of setting q[n+1]=0 or
q[n+1]=1 for each q[n]. In other words there are four possible permutations or candidate
paths 00, 01, 10 and 11 from n to n+1. There are eight candidate paths from n to n+2,
sixteen from n to n+3 and so on. The TNSM uses a variation of the Viterbi algorithm [96]
to keep the number of candidate paths constrained.
For a sub-sequence of length L there are $2^L$ candidate paths. In the sample instant following
it, either 0 or 1 can be added, bringing the number up to $2^{L+1}$. But if adding either 0 or 1
is determined depending on what gives lowest cost with the existing candidates, half the
candidates can be discarded and there are only $2^L$ candidates left for the next sample instant as
well. Saving these and accumulating the cost function, the procedure can be reiterated for
minimum accumulated cost at every sample instant, generating a candidate trellis of width $2^L$.
Usually L is referred to as the trellis order.
If this procedure runs long enough the paths will converge, meaning that all candidates in
the candidate trellis at n=k are likely to have originated from the same output for n=k-D where
D is a large integer. The output can thus be unambiguously determined from backtracking of
the trellis by D samples.
Figure 43 shows a general block diagram of a TNSM. The $2^L$ processing units are used to
determine the accumulated baseband error cost from adding 0 as well as 1 to each
candidate of the $L$th order trellis. The cost metrics are then sent to a trellis generator that
determines which to choose, discards half the candidates and advances the trellis one step.
The new trellis layer must be fed back to update the filter states according to the choice
made. A total of D trellis layers are stored in the trellis register. Since the paths converge the
output generator can create an output sequence unambiguously by backtracking through the
trellis register.
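The ultimate-modulator principle that the trellis approximates can be tried by brute force on a very short sequence. The sketch below is not a TNSM and not the algorithm of [95]; it assumes output levels of +1/-1 instead of 0/1, a weighting filter $H(z)=1/(1-z^{-1})^2$, and a simple sample-by-sample greedy modulator as a hypothetical point of comparison.

```python
import itertools
import numpy as np

def shaped_error_cost(x, q, order=2):
    """Sum of squared error after weighting by H(z) = 1/(1 - z^-1)^order."""
    w = np.asarray(q, dtype=float) - np.asarray(x, dtype=float)
    for _ in range(order):
        w = np.cumsum(w)                  # each cumulative sum is one integrator
    return float(np.sum(w ** 2))

def ultimate_modulator(x, order=2):
    """Exhaustive search over all 2^len(x) binary (+1/-1) output sequences."""
    best = min(itertools.product((-1.0, 1.0), repeat=len(x)),
               key=lambda q: shaped_error_cost(x, q, order))
    return np.array(best)

def greedy_modulator(x, order=2):
    """Pick each bit to minimize the accumulated cost so far (no look-ahead)."""
    q = []
    for n in range(len(x)):
        q.append(min((-1.0, 1.0),
                     key=lambda b: shaped_error_cost(x[:n + 1], q + [b], order)))
    return np.array(q)

x = 0.5 * np.sin(2 * np.pi * np.arange(10) / 10)
q_ult = ultimate_modulator(x)
q_greedy = greedy_modulator(x)
print(shaped_error_cost(x, q_ult), shaped_error_cost(x, q_greedy))
```

The exhaustive search already needs 1024 cost evaluations for ten samples, which is why the trellis keeps only $2^L$ surviving candidates per sample instead.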



Figure 43: Trellis noise shaping modulator

In TNSM publications it is typically suggested that the trellis order L be between two and
four. The higher the better, but the requirements for storage as well as the number of
computations double for each increment of L, meaning that the trellis order is limited by
complexity. The backtracking depth D is typically recommended to be a few thousand
samples. The cost function is usually the cumulative MSE or:

$$c_i[n] = \sum_{k=0}^{n} w_i^2(k). \tag{72}$$

Here $w_i(k)$ is the filter output of processing block i at time instant k. Because of the high
complexity, recent research has revolved around further path reduction while maintaining the
desirable properties of ultimate modulators [97]. Alternative algorithms for look-ahead
modulation have also been shown [98]. The advantages of the TNSM are probably better
understood after reading the next sections on the non-idealities in a regular DSM. Because of
a more global error optimization the TNSM is much less tonal, it is less susceptible to noise
power modulation and perhaps most importantly it is much more stable. Whereas a
traditional high order 1-bit DSM is typically limited to -6dBFS or less stable input range, a
TNSM with the same NTF may work up to -2dBFS [95]. Furthermore it loses track gently,
rather than having the catastrophic instability behaviour of a regular DSM.

3.3 Stability

As mentioned in ch.2.5 a modN NTF will not yield a stable DSM for high N. From this it is
understood that ensuring BIBO stability in the NTF is not sufficient to know if the modulator
is stable. Modelling the quantizer as an additive noise source does not take into account the
fact that it is in reality a nonlinear unit with limited input range. For example one can envision
a situation where the input signal x is so large that the input to the first integrator is always
positive; in this case the integrator output steadily increases without bound and drives the
quantizer into overload. The output then loses track of the input and typically the modulator
starts to oscillate. Figure 44 shows an example of modulator instability. For the first few
samples the output tracks the input and performs noise-shaped 7-level quantization, but when
the input gets too large the quantizer starts overloading and the system goes into oscillation.

Figure 44: Example of instability in high order DSM

Analysis of instability in non-linear feedback systems is extremely difficult and for high
order DSMs no analytical method to find absolute stability constraints exists. It is clear that
quantizer overload is not a sufficient proof of instability, since for instance a 1-bit DSM REQ
operates in overload for all non-zero input signals. Therefore design constraints are
determined empirically and the designer must use extensive simulations to ensure stability in
system implementations.
Some practical rules-of-thumb and non-rigorous mathematical methods have been published
that provide a starting point, the most famous of these probably being Lee's Rule [99]. It
formulates the following NTF constraint:

$$\left\|\mathrm{NTF}(\omega)\right\|_\infty \leq 1.5. \tag{73}$$

Lee found through extensive simulation work that if this constraint is met, the 1-bit DSM
will most likely be stable for input up to 0.5 normalized amplitude or -6dBFS. It must
however be noted that there is no mathematical proof for this, and stability must still be
validated by the designer. Some sort of automated reset function should also be included in
case of instability [100]. In a multi-bit DSM the NTF can be more aggressive since the
quantizer non-overload range is bigger compared to the quantization step and error. It has e.g.
been suggested to use the restriction $\|\mathrm{NTF}(\omega)\|_\infty \leq 3.5$ in 3-bit modulators [101]. Alternatively
one can keep the NTF conservative but allow a bigger input range. This depends on whether
quantization noise dominates the noise budget and it is a trade-off that should be done early in
the design process. If the stable input range relative to full scale output is a value $|A_{max}|<1$,
typically $0.5<|A_{max}|<1$, the maximum stable SQNR is found by modifying (29) accordingly:


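$$\mathrm{SQNR}_{max} = 10\log_{10}\left(\frac{|A_{max}|^2\,2^{2B}}{\frac{1}{3\pi}\int_{-\pi/L}^{\pi/L}\left|\mathrm{NTF}(\omega)\right|^2 d\omega}\right)\ \text{[dB]}. \tag{74}$$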

Another method that is much used in addition to Lee's Rule is the Root Locus method
[102]-[103], also developed for 1-bit DSM REQs. A 1-bit quantizer can be seen as a gain
element, where the gain g is inversely proportional to the input amplitude as shown in fig.45.
The linear DSM model can then be modified accordingly, also shown in the same figure.

Figure 45: Modified linear DSM model used in Root Locus method

The modified NTF becomes:

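$$\mathrm{NTF}_g(z) = \frac{\mathrm{NTF}_1(z)}{g + (1-g)\,\mathrm{NTF}_1(z)}. \tag{75}$$

Here $\mathrm{NTF}_1(z)$ is the NTF for unity quantizer gain, $g=1$.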

A simple check of stability can then be made by sweeping g from 0 to 1 and seeing if the NTF
remains BIBO-stable over the whole range, or in other words checking if the poles (roots) stay
inside the unit circle for all g. For small g, or in other words large input, one may find that
the roots move outside the unit circle, which will be an indication of instability.
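The gain sweep can be sketched in a few lines. The example below is an assumption-laden illustration rather than the procedure of [102]-[103]: it takes the basic modN NTF $(z-1)^N/z^N$ as the unity-gain NTF, forms the denominator of the gain-dependent NTF in (75), and checks the pole radii. The function names are hypothetical.

```python
import numpy as np

def ntf_g_poles(ntf_num, ntf_den, g):
    """Poles of NTF_g(z) = NTF(z) / (g + (1 - g) NTF(z)), eq. (75).

    ntf_num and ntf_den hold polynomial coefficients in z, highest power first.
    """
    den = g * np.asarray(ntf_den, float) + (1 - g) * np.asarray(ntf_num, float)
    return np.roots(den)

def is_stable(ntf_num, ntf_den, g):
    """True if all poles of NTF_g lie strictly inside the unit circle."""
    return bool(np.max(np.abs(ntf_g_poles(ntf_num, ntf_den, g))) < 1.0)

def modn(order):
    """Basic modN NTF (z - 1)^N / z^N as (numerator, denominator) arrays."""
    num = np.poly([1.0] * order)          # all zeros at z = 1
    den = np.zeros(order + 1)
    den[0] = 1.0                          # z^N: all poles at the origin
    return num, den

num2, den2 = modn(2)
num3, den3 = modn(3)
gains = np.linspace(0.05, 1.0, 20)
print([round(float(g), 2) for g in gains if not is_stable(num3, den3, g)])
print(all(is_stable(num2, den2, g) for g in gains))
```

In this sketch the second-order loop stays stable over the whole sweep, while the third-order loop loses stability at low quantizer gain, i.e. for large quantizer input.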
Figure 46 shows the simulated processing gain for modulators designed to be stable with 1-
bit quantization according to the above methods. Compared to fig.20 it is seen that the
processing gain is severely restricted, especially for low OSR. With multi-bit quantization the
choice of NTF is less restricted, and with more than four bits or so the processing gain can be
very close to that of fig.20. Even though multi-bit quantization has now become very popular
also in high OSR audio converters, it was the desire for low OSR and high bandwidth delta-
sigma conversion that really drove the development of multi-bit DSM.



Figure 46: Processing gain with 1-bit stable DSM

The mentioned methods for stability analysis have in common that they are simple, backed
by empirical results [104] but also that they lack a rigorous mathematical basis. Many
attempts have been made to develop a mathematical framework for better analysis of higher
order delta-sigma modulators. The quasi-linear describing function method was used in an
early publication by Ardalan [105], whereas Hein [106] and Wang [107] pursued approaches
based on geometric analysis. Several publications deal with efforts to build a framework
based on non-linear dynamics; first used in DSM analysis by Feely [107] and given a
comprehensive treatment in the thesis of Risbo [109]. A quite recent paper by Reiss [110]
provides a historical review of non-linear DSM analysis and some general assessments of the
road ahead. Reiss also presented an intriguing paper at the 124th AES Convention [111], in
which parallel decomposition of the loop filter was used to break the DSM down into a sum
of first order functions, mutually dependent only through the quantizer function. This
provided very promising simulation results in support of a framework for stability analysis of
general higher order 1-bit modulators.
Although they are significant for developing a theoretical foundation, practical application
of most of the above methods is problematic due to them being very difficult to use and
typically only shown with strong limitations on input conditions, initial conditions and
modulator designs. Schreier et al. published significant results by estimating stability bounds
based on invariant sets in the DSM state space [112], and developing computer code for how
to find them [113]. But this method too is neither rigorous nor analytical.

3.4 Cyclic Behaviour, Tones and Noise Power Modulation

An expression for the processing gain of a DSM was found in ch.2.5 using the linear
quantizer model. Typically when designing DSM-based converters, this method is used to
choose an appropriate OSR and NTF for a target SQNR. The linear model does however hide
some unfortunate effects also present during stable operation. The most serious one is perhaps
cyclic behaviour. Cyclic behaviour can be understood through a simple example; a 1-bit mod1
with a rational DC-input. The output from a mod1 with a single delaying integrator is in the
time domain given by:


$$q[n] = Q\left(\sum_{k=1}^{n}\left(x[k-1] - q[k-1]\right)\right). \tag{76}$$

This can be rewritten to:

$$q[n] = Q\left(\sum_{k=1}^{n} x[k-1] - \sum_{k=1}^{n} q[k-1]\right). \tag{77}$$

Consider a binary quantizer and an input sequence x[n]=1/3 for all n. Then q[n] will be
given as the sequence {1,1,-1,1,1,-1,1,1,-1,1,1,-1...}. Not surprisingly the output mean is 1/3
since mod1 has an NTF zero at DC. But it is seen that the bit pattern repeats and the output
energy is concentrated at $f_s/3$. This is known as a limit cycle. Repetitive output patterns or
limit cycles in mod1 were first described mathematically by Candy [114], while Friedman
[115] extended the analysis to describe limit cycles in mod2. For high order DSMs it is much
more difficult to find limit cycles but it has been proven that they exist [116]-[117].
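The limit cycle for a DC input of 1/3 is easy to reproduce. The sketch below assumes a zero initial integrator state and that the sign quantizer maps an input of exactly zero to +1; exact rational arithmetic is used so that the zero crossing is not disturbed by rounding. The function name is hypothetical.

```python
from fractions import Fraction

def mod1_binary(x, n_samples):
    """1-bit mod1 with a delaying integrator and constant input x."""
    u = Fraction(0)                 # integrator state, i.e. the running sum
    out = []
    for _ in range(n_samples):
        q = 1 if u >= 0 else -1     # sign quantizer; sign(0) taken as +1
        out.append(q)
        u += x - q                  # accumulate x - q for the next sample
    return out

q = mod1_binary(Fraction(1, 3), 12)
print(q)                            # [1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, 1]
print(Fraction(sum(q), len(q)))     # 1/3
```

With these initial conditions the pattern is a cyclic shift of the sequence quoted above; the period of three samples and the mean of 1/3 are the same.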
A problem caused by cyclic behaviour is idle-tones. Whereas a limit-cycle as such describes
the repetitive output pattern occurring under strictly defined state conditions, an idle-tone is a
discrete component appearing in the noise spectrum during normal operation because of
cyclic behaviour [116]. This should be kept in mind although a lot of literature does not
distinguish between the theoretical limit cycle and the practical idle-tone. Idle-tones occur
when the input is idle, i.e. DC. Figure 47 shows the output spectrum of a fifth order binary
DSM with optimized NTF zeros and rational DC input stimuli. It is seen to have some clearly
visible in-band idle-tones.

Figure 47: Output spectrum from fifth order DSM with rational DC input

Tones in the output spectrum inferred from cyclic behaviour also occur when the input is
active. This has been shown for simple sinusoid input signals and in the literature it is referred
to as modulator harmonic distortion [114], [118].
Since the ear is generally much more sensitive to discrete tones than to noise, cyclic
behaviour is highly undesirable. It can be reduced or avoided through dithering: If the error is
uncorrelated with the input the DSM will be tone-free. This is as known from ch.2.4 achieved
with full NPDF dither of any N. Lesser dither weakens the correlation without removing it,
thus reducing tones without eliminating them fully. Full dither is achievable with many levels
in the REQ, while for a few-level or 1-bit DSM dithering even less than this will significantly
reduce the stable input range. Alternative techniques have therefore been proposed like
making the DSM chaotic [119]. Chaotic operation is achieved if one or more NTF zeros are
moved outside the unit circle, by modifying one or more integrators so that $I(z)=1/(z-\alpha)$ where
$\alpha>1$. The thesis of Risbo [109] investigates chaotic modulators. It is also possible to use
dynamic dithering where the dither level is inversely proportional to the input level [120].
Then the dithering is strong for weak input and vice versa. As such it makes use of the ear's
masking property since a loud signal will mask tones.
Another potential problem is noise-power modulation, already introduced in the chapter on
quantization and dithering. It has often been argued that the DSM is a self-dithering system
since quantization noise is fed back into the modulator from the output. However,
Wannamaker proved in his thesis [43] that the fed-back error is not input-independent in
any other moment than those the dither forces to be so: the $m$th derivative of the (Fourier
transformed) joint PDF of the input and dither has to be zero at all multiples of the
quantization frequency if the $m$th error moment is to be input-independent:

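$$\left.\frac{d^m}{du^m}\Big[\operatorname{sinc}(u)\,P_{v,x}(u)\Big]\right|_{u=n/\Delta} = 0, \qquad \forall\, n \in \mathbb{Z}_0. \tag{78}$$

$$P_{v,x} \overset{\mathrm{def}}{=} \mathcal{F}\left\{f_{v,x}(v,x)\right\}. \tag{79}$$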

With no a-priori knowledge of the input statistics this is only ensured if:

$$\left.\frac{d^m}{du^m}\Big[\operatorname{sinc}(u)\,P_{v}(u)\Big]\right|_{u=n/\Delta} = 0, \qquad \forall\, n \in \mathbb{Z}_0. \tag{80}$$

This is exactly the same requirement as for the non-modulating dithered quantizer in ch.2.4.
What it means is that to guarantee error moments 1 to m to be input-independent, a DSM
REQ needs $m$th order dithering just like an ordinary REQ. The fundamental dither requirement
does not change. This postulate led to some dissension and heated debate between those
claiming the DSM to be self-dithering and the purveyors of Widrow's statistical analysis [42].
It is clear that if the REQ is situated in a high order DSM loop, the input-dependency of the
error is quite weak and resulting noise-power modulation quite low, just like tones are low
and the distortion is low. An investigation of practical levels of in-band noise power
modulation in several DSMs was featured in the first paper (Appendix 2), which was intended
to provide a pragmatic context to this discourse. Further investigations are also featured in the
recent thesis by Campbell [122].
It should be noted that the method of chapters 2.4 and 3.4 is not directly applicable to the
1-bit quantizer which has traditionally been used in audio DSM converters. Assuming the 1-bit
quantizer takes the sign of the input, with $\Delta$ still denoting the quantization step, its output has
two discrete probabilities as shown in fig.48.



Figure 48: Input PDF (a) and output PDF (b), single-bit quantizer

For simplicity the following equations are normalized to the quantizer output, i.e. the output
is $\pm 1$ and $\Delta=2$. Since it is known that a DSM REQ with NTF zeros at DC forces the average
output to equal the average input, the probability for the two output states can for any input
level be described by the following two relations:

$$P(q=1) - P(q=-1) = x. \tag{81}$$

$$P(q=1) + P(q=-1) = 1. \tag{82}$$

Combining these two relations, the output PDF for any static input level is found to be:

$$f_q(q) = \frac{1+x}{2}\,\delta(q-1) + \frac{1-x}{2}\,\delta(q+1). \tag{83}$$

The quantizer error is given by $e_q=q-x$ and its PDF is thus given by:

$$f_{e_q}(e_q) = \frac{1+x}{2}\,\delta(e_q+x-1) + \frac{1-x}{2}\,\delta(e_q+x+1). \tag{84}$$

From the PDF each statistical moment of the error is easily found:

$$E\left\{e_q\right\} = 0. \tag{85}$$

$$E\left\{e_q^2\right\} = 1 - x^2. \tag{86}$$

$$E\left\{e_q^3\right\} = -2x\left(1-x^2\right). \tag{87}$$

It is noteworthy that as long as the average output equals the average input, this relation is
constant regardless of any applied dither. This suggests that dither does not change the noise
power modulation in a 1-bit DSM, which was supported by simulations in the first paper
(Appendix 2). The in-band noise power modulation was found to be dominated by whether or
not any idle-tones fell in-band, and by a power increase when the quantizer started to
overload. This means that the sole purpose of dithering in a 1-bit DSM should be to eliminate
tones. Since dither reduces the stable input range and increases the occurrences of overload, a
binary DSM should not be dithered beyond rendering it sufficiently tone-free.
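The three error moments follow from the two output probabilities alone, which can be confirmed with exact arithmetic. The following is a minimal sketch; the function name and the test input level are assumptions for illustration.

```python
from fractions import Fraction

def error_moment(x, m):
    """m-th moment of e_q = q - x for a 1-bit quantizer with outputs +1/-1."""
    p_plus = (1 + x) / 2            # P(q = +1), from (81) and (82)
    p_minus = (1 - x) / 2           # P(q = -1)
    return p_plus * (1 - x) ** m + p_minus * (-1 - x) ** m

x = Fraction(1, 4)                  # arbitrary static input level
print(error_moment(x, 1))           # 0
print(error_moment(x, 2))           # 15/16, i.e. 1 - x^2
print(error_moment(x, 3))           # -15/32, i.e. -2x(1 - x^2)
```

Since only the mean output enters the probabilities, the result depends on x but not on any dither statistics, which is the point made above.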

3.5 Non-Overloading Delta-Sigma Modulators

As is now clear it is very difficult to ensure stable operation and control the quantization
noise behaviour in a DSM REQ. To eliminate both tones and noise-power modulation
completely the quantizer needs full TPDF dither. Being of width $2\Delta$ this will eat up much of
the input range in a few-bit quantizer, and if it cannot be used while maintaining sufficient
stable swing some non-ideal behaviour must be tolerated. Extensive simulations will be
needed during design to ensure non-idealities do not deteriorate the output performance beyond
what is acceptable.
If the quantizer is many-bit, there is on the other hand a simple method to guarantee
stability for a strictly defined input range and if desired with full TPDF (or any) dither to
eliminate idle-tones and noise-power modulation. This is achieved by designing the DSM
using the non-overload method [123], which ensures no quantizer overload according to the
range shown in ch.2.2. Repeating the DSM input-output relation in the z-domain,

$$Q(z) = \mathrm{STF}(z)\,X(z) + \mathrm{NTF}(z)\,E_q(z), \tag{88}$$

the input of the quantizer, w, is of course given by its output q minus its error $e_q$, meaning
that it can be expressed as:

$$W(z) = Q(z) - E_q(z) = \mathrm{STF}(z)\,X(z) + \left(\mathrm{NTF}(z)-1\right)E_q(z). \tag{89}$$

Using the inverse z-transform, the corresponding time-domain expression is found:

$$w[n] = \sum_{k=0}^{\infty} \mathrm{stf}[k]\,x[n-k] + \sum_{k=0}^{\infty} \mathrm{ntf}[k]\,e_q[n-k] - e_q[n]. \tag{90}$$

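The bounds for the peak amplitude of w can be found using the Cauchy-Schwarz inequality:

$$\left|w[n]\right| \leq \sum_{k=0}^{\infty} \left|\mathrm{stf}[k]\right|\left|x[n-k]\right| + \sum_{k=0}^{\infty} \left|\mathrm{ntf}[k]\right|\left|e_q[n-k]\right| - \left|e_q[n]\right|. \tag{91}$$

$$\|w\|_\infty \leq \|\mathrm{stf}\|_1\,\|x\|_\infty + \|\mathrm{ntf}\|_1\,\|e_q\|_\infty - \|e_q\|_\infty. \tag{92}$$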

(92) is the more compact L-norm notation of (91). The L-norms of a vector are defined as:

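$$\|x\|_p \overset{\mathrm{def}}{=} \left(\sum_{n=0}^{\infty} \left|x[n]\right|^p\right)^{1/p}. \tag{93}$$

$$\|x\|_\infty \overset{\mathrm{def}}{=} \max\left(|x|\right). \tag{94}$$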

If the STF and NTF are FIR functions the peak quantizer input can be calculated exactly. If
they are IIR functions they are infinitely long, but as long as they are BIBO-stable they
converge to zero and the L-norms can be estimated with arbitrarily high precision using a
large sample set. As long as the non-overload range R of the quantizer is larger than $\|w\|_\infty$ it
never overloads. Consequently the non-overload requirement is:

$$R \geq \|\mathrm{stf}\|_1\,\|x\|_\infty + \|\mathrm{ntf}\|_1\,\|e_q\|_\infty - \|e_q\|_\infty. \tag{95}$$

If the quantizer has a dither input v, the dither sequence must be included in (90) and the
procedure is easily repeated to find:

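$$R \geq \|\mathrm{stf}\|_1\,\|x\|_\infty + \|\mathrm{ntf}\|_1\,\|e_q\|_\infty + \|\mathrm{ntf}\|_1\,\|v\|_\infty - \|e_q\|_\infty. \tag{96}$$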

That this condition holds is a sufficient, but not necessary criterion for stability. It is a
sufficient and necessary criterion to guarantee no overload.
Assuming a B-bit mid-tread quantizer like the one shown in fig.12 is used, the non-overload
range is $|R|\leq 2^{B-1}-\frac{1}{2}$ normalized to the output. As long as there is no overload the peak
error is limited to $\|e_q\|_\infty \leq \frac{1}{2}$. If the NTF is basic modN, $\|\mathrm{ntf}\|_1=2^N$. The STF is usually unity and
then if a stable input swing of half the quantizer input range, i.e. $\|x\|_\infty \leq 2^{B-2}$, is desired, it is
easy to calculate that the number of bits B must be at least:

$$2^{B-1} - \frac{1}{2} \geq 2^{B-2} + 2^N\cdot\frac{1}{2} - \frac{1}{2} \quad\Rightarrow\quad B \geq N+1. \tag{97}$$

If the modulator has TPDF dither of width $2\Delta$, the requirement becomes much stricter:

$$2^{B-1} - \frac{1}{2} \geq 2^{B-2} + \frac{3}{2}\,2^N - \frac{1}{2} \quad\Rightarrow\quad B \geq N + 2 + \log_2\frac{3}{2}. \tag{98}$$

It is clear that since sufficiently high SQNR for hi-res audio will typically require mod3 or
higher (see fig.20), a non-overload DSM needs quite many bits in the REQ to get a good
stable input range, especially if it is dithered. A conservative $N$th order IIR NTF, e.g. one that
is designed according to Lee's rule for 1-bit stability, will have a significantly smaller $L_1$-norm
than modN, perhaps reduced by 30-50%. This is not optimal with regard to SQNR vs.
OSR, but if both N and the OSR are sufficiently high for quantization noise to be negligible
compared to other error sources, the non-overload modulator is very attractive. This is of
course because it, unlike other modulators, can be made with guaranteed stability, no idle-tones
and no noise power modulation. In the second paper (Appendix 4), non-overloading
modulators are explored for different topologies and quantizer functions.
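The bit-count requirement can be evaluated numerically from the non-overload condition. The sketch below assumes a basic modN NTF, unity STF, a peak quantization error of 1/2, a target input swing of $2^{B-2}$ and all quantities normalized to the quantization step; `dither_peak` is the peak dither amplitude in the same units (1 for TPDF dither of width two steps). The function names are hypothetical.

```python
import numpy as np

def modn_ntf(order):
    """Impulse response of the basic modN NTF (1 - z^-1)^N."""
    return np.poly([1.0] * order)

def min_bits(order, dither_peak=0.0):
    """Smallest B for which the non-overload condition holds."""
    l1 = np.sum(np.abs(modn_ntf(order)))          # L1-norm, equals 2^N
    b = 1
    # Range 2^(B-1) - 1/2 must cover input swing plus shaped error and dither.
    while 2 ** (b - 1) - 0.5 < 2 ** (b - 2) + l1 * (0.5 + dither_peak) - 0.5:
        b += 1
    return b

for order in (2, 3, 4, 5):
    print(order, min_bits(order), min_bits(order, dither_peak=1.0))
```

For a third order modulator this gives four bits without dither and six bits with full TPDF dither, in line with rounding the two bounds above up to whole bits.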

Chapter 4

Mismatch Shaping

As became clear in the previous chapter there are several advantages to using more than one
bit in the DSM REQ. Apart from greater processing gain for low OSR, it is easier to ensure
modulator stability, a larger input swing is tolerated and full RPDF or TPDF dither can be
applied to eliminate tones and/or noise power modulation. The downside is that multi-bit
DACs are not amplitude linear, which is the main reason why audio converters moved from
LPCM to highly oversampled one-bit DSM in the first place. Achieving better than 10-12
bits of resolution by physical matching alone is extremely difficult, so dynamic element
matching (DEM) algorithms were introduced to spectrally shape the mismatch error contribution.

4.1 Mismatch Error Randomization

The term DEM in a data conversion context was introduced by Van De Plassche in 1976
[124] when he used redundant switching in a binary encoded DAC to improve its mismatch
performance. In 1989 Carley showed an implementation of DEM in the sense it is known
today, when he used a butterfly switching network to randomize the element selection in a
thermometer encoded DAC, thus eliminating systematic INL [125]. This was an eight element
DAC driven by a three-bit DSM REQ as shown in fig.49. In the figure thick lines indicate an
ordinary multi-bit signal bus whereas thin lines are single (bit) lines.

Figure 49: DAC element randomization, B=3 bit example

A B-bit DAC needs a B-bit switching network and PRNG to randomize the switching. As
mentioned in ch.2 the number of elements M is typically $2^B$ or $2^B-1$ depending on whether or
not the REQ is symmetrical. For clarity the INL as a function of element weights is repeated:

$$INL(q) = \sum_{i=0}^{q-1} w_i - q\,\bar{w}, \qquad \bar{w} \overset{def}{=} \frac{1}{M}\sum_{i=0}^{M-1} w_i. \qquad (99)$$

As before the INL is referred to the numeric quantizer output q and not given a unit.
Input-referred, i.e. referred to x, its unit is $\Delta$, while the analog output y is referred to some
reference current or voltage as explained in ch.2.
With randomization a random set of weights is assigned every time, so one cannot find an
expression for the sample-to-sample error. But assuming the weights themselves are random
variables with unity expected value and variance $\sigma_w^2$, in other words that there are no
graded or correlated errors (see footnote 5), it is found that the expected value of the error
is $E\{e_w\}=0$ and the error variance as a function of q is:

$$\sigma_{e_w}^2(q) = E\left\{\left(\sum_{i=0}^{q-1} w_i - q\,\bar{w}\right)^2\right\} = q\left(1 - \frac{q}{M}\right)\sigma_w^2. \qquad (100)$$

The maximum error variance occurs at mid scale or q=0, i.e. when M/2 elements are
switched on and the rest are turned off, which corresponds to inserting M/2 elements in (100):

$$\sigma_{e_w}^2(0) = \frac{M}{4}\,\sigma_w^2. \qquad (101)$$
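The variance expression can be sanity-checked with a small Monte Carlo experiment. This is an illustrative sketch only: the function names, the seed and the trial count are arbitrary choices, the weights are assumed to be independent Gaussian variables with unity mean, and q here counts the number of switched-on elements (so mid scale is q=M/2).

```python
import random

def inl_variance_mc(num_elements, q, sigma_w, trials=5000, seed=1):
    """Monte Carlo estimate of var{INL(q)} for randomly drawn element weights."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        w = [rng.gauss(1.0, sigma_w) for _ in range(num_elements)]
        w_mean = sum(w) / num_elements
        inl = sum(w[:q]) - q * w_mean   # any q elements are statistically equivalent
        acc += inl * inl                # E{INL} = 0, so the mean square is the variance
    return acc / trials

def inl_variance_theory(num_elements, q, sigma_w):
    """Closed form q (1 - q/M) sigma_w^2."""
    return q * (1.0 - q / num_elements) * sigma_w ** 2

M, sigma = 16, 0.01
for q in (1, 4, 8, 12):
    print(q, inl_variance_mc(M, q, sigma), inl_variance_theory(M, q, sigma))
```

The simulated and closed-form values should agree to within the statistical uncertainty of the estimate, with the largest variance at q=M/2.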

This is the worst case error power with randomization. Since the elements are assumed to be
Gaussian random variables, the error has a white spectrum. The Wiener-Khinchin theorem
can be used to estimate the mismatch error PSD similarly to the quantization error:

$$S_{e_w}(\omega) = \frac{\sigma_{e_w}^2}{2\pi}. \qquad (102)$$

This PSD can be integrated over the signal band to find the resulting SMNR.

4.2 Element Rotation Techniques

Element randomization efficiently turns mismatch non-linearity into a more benign white
noise contribution, but it still does not facilitate very hi-res multi-bit conversion. Assume that
the DAC is 4-bit with a mismatch standard deviation of 1% at the LSB-level (or $\sigma_w=0.01$);
then the maximum SMNR with randomization is only around 60dB without oversampling or
80dB with an OSR of 128. This is clearly not sufficient for hi-res audio.
A few years after element randomization the concept of element rotation was introduced.
This is based on the idea that since there is zero INL when all elements are in use, ensuring
that every element contributes equally over time will cancel out the error. Several rotation
algorithms were published in the early 90s, of which Individual Level Averaging [126] and in
particular Data Weighted Averaging [127] turned out to be the most successful. More than a
decade after its conception, DWA is still arguably the most popular DEM algorithm around.
DWA element rotation is shown in fig.50. Note that the integrator must be a B-bit modulo
integrator for the rotation to work as it should.

5 This is a reasonable assumption if the DAC has good layout, utilizing common centroid techniques.


Figure 50: DWA DAC element rotation, B=3 bit example

Figure 51 exemplifies how element rotation uses each element equally over time. Over the
course of five clock cycles it is seen that every element is used exactly twice, meaning that the
net error cancels out over this time span. In real-life the averaging will of course mostly be
slower, but as n grows every element will have contributed equally.

Figure 51: Element selection sequence with DWA

To simplify the description of the rotation scheme's mismatch shaping property, a vector
notation was introduced in the fourth paper (Appendix 6). This notation defines an element
selection vector s of length M, controlling the DAC element switching. The corresponding
DAC error can then be written as:

$$e_w = \mathbf{s} \cdot (\mathbf{w} - \bar{w}). \qquad (103)$$

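Here the vector w is the static element weight vector, $\bar{w}$ is the mean weight from (99)
subtracted from every element, and the operator $\cdot$ is the vector dot product by conventional
definition. To simplify the notation another vector u, also of length M, is defined as:

$$\mathbf{u}(a): \quad u_i(a) \overset{def}{=} \begin{cases} 1, & 0 \leq i \leq a-1 \\ 0, & a \leq i \leq M-1 \end{cases}. \qquad (104)$$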

If ordinary thermometer encoding is used the q lowest elements are always selected,
meaning that the element selection vector can be described by:

$$\mathbf{s} = \mathbf{u}(q). \qquad (105)$$



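It follows from this that the DAC error for any given sample n is:

$$\begin{aligned}
e_w[n] &= \mathbf{s}[n] \cdot (\mathbf{w} - \bar{w}) \\
&= \mathbf{u}(q[n]) \cdot (\mathbf{w} - \bar{w}) \\
&= \sum_{i=0}^{q[n]-1} w_i - q[n]\,\bar{w} \\
&= INL(q[n]).
\end{aligned} \qquad (106)$$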

This means any DAC INL translates directly to output distortion. In a DWA encoder on the
other hand, the element selection is rotated by updating the starting point with a rotation
pointer p (see fig.50), given by:

$$p[n] = (p[n-1] + q[n]) \bmod M. \qquad (107)$$

It is seen (use fig.51 for inspection if necessary) that the element selection vector can now
be described as a function of the u vector as follows:

$$\mathbf{s}[n] = \begin{cases} \mathbf{u}(p[n]) - \mathbf{u}(p[n-1]), & p[n] \geq p[n-1] \\ \mathbf{u}(M) + \mathbf{u}(p[n]) - \mathbf{u}(p[n-1]), & p[n] < p[n-1] \end{cases}. \qquad (108)$$

The vector u(M) indicates that the modulo pointer has wrapped around M. Using the same
procedure as in (106) the DAC error is found to be:

$$e_w[n] = INL(p[n]) - INL(p[n-1]). \qquad (109)$$

Since INL(M)=0 this holds for both cases in (108). It means that the output distortion is a
first order noise shaped function, since the z-transform gives:

$$E_w(z) = (1 - z^{-1})\,INL(P(z)). \qquad (110)$$
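The first order shaping property can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the implementation used in the papers: the weights are hypothetical Gaussian values, the input codes are random integers between 0 and M, and the selection is done by explicit rotation so that the special case q=M (all elements selected, pointer unchanged) is also handled.

```python
import random

def inl(weights, a):
    """INL(a): sum of the a lowest-indexed weights minus a times the mean weight."""
    mean = sum(weights) / len(weights)
    return sum(weights[:a]) - a * mean

def dwa(codes, num_elements):
    """Yield (pointer, selection vector) for each input code using DWA rotation."""
    p = 0
    for q in codes:
        s = [0] * num_elements
        for k in range(q):                 # q consecutive elements, wrapping at M
            s[(p + k) % num_elements] = 1
        p = (p + q) % num_elements         # modulo pointer update
        yield p, s

rng = random.Random(0)
M = 8
w = [rng.gauss(1.0, 0.01) for _ in range(M)]   # hypothetical mismatched weights
codes = [rng.randint(0, M) for _ in range(200)]
w_mean = sum(w) / M

p_prev, usage, max_dev = 0, [0] * M, 0.0
for q, (p, s) in zip(codes, dwa(codes, M)):
    e_direct = sum(si * wi for si, wi in zip(s, w)) - q * w_mean
    e_pointer = inl(w, p) - inl(w, p_prev)     # INL(p[n]) - INL(p[n-1])
    max_dev = max(max_dev, abs(e_direct - e_pointer))
    usage = [u + si for u, si in zip(usage, s)]
    p_prev = p

print("max |direct - pointer form|:", max_dev)
print("element usage spread:", max(usage) - min(usage))
```

The directly computed error matches the pointer difference form to within rounding, and the usage count of any two elements never differs by more than one, which is the time-domain view of the error being a first difference of a bounded sequence.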

Exact derivation of the error PSD requires exact knowledge of the statistics of the pointer.
This is generally not trivial to obtain since the pointer is a modulo integral of the input.
Approximations can however be made under certain conditions: If it is assumed that the input
is a Gaussian-like random variable, p[n] approximates a white random process. This makes it
possible to use white noise estimation akin to Bennett's quantizer model. As long as the input
signal is smaller than full-scale we also know that q is centred around and close to 0, so the
worst case randomization estimate can be used as an approximation of the INL(P(z)) PSD.
Under this assumption, the DWA DAC mismatch error PSD will be:

$$S_{e_w}(\omega) \approx \frac{\sigma_{e_w}^2}{2\pi}\left|1 - e^{-i\omega}\right|^2 = \frac{M\sigma_w^2}{8\pi}\left|1 - e^{-i\omega}\right|^2. \qquad (111)$$


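With normalized input amplitude A, a B-bit REQ with the number of levels given by $M=2^B$
will have the corresponding signal-to-mismatch noise ratio:

$$SMNR \approx 10\log_{10}\left(\frac{\pi A^2\,2^{B}}{\sigma_w^2 \int_{-\pi/L}^{\pi/L}\left|1 - e^{-i\omega}\right|^2 d\omega}\right) \text{ [dB]}. \qquad (112)$$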

As reviewed in ch.3.3, a DSM REQ designed according to Lee's rule typically has a max
stable input of around -6dBFS or $A_{max}=0.5$, so $SMNR_{max}$ is easily found by insertion. Just like
the DSM quantization noise estimate, this mismatch error estimate is based on the assumption
that the input signal is a random process. In a real world scenario first order mismatch shaping
has non-idealities quite similar to those of a first order DSM REQ. If the input is a DC-signal
it is seen from (107) that p[n] is a periodic function, meaning that a weighting error $w_i$ will
also appear periodically and create an error spectrum consisting of tones. Tones are not as
severe as in a first order DSM REQ, since the input to the DEM block typically is a relatively
few-level signal and contains a strong shaped quantization noise component. Spurs around the
-100dB level are however to be expected and several techniques have been developed to
alleviate tonality [128]-[130]. They typically dither the rotation process, which reduces the
tones at the cost of less efficient shaping. With very careful layout, DACs using first order
mismatch shaping have achieved almost 18 ENOB [53],[56]. To improve further, DEM must
be evolved beyond first order shaping. A generalized analysis shows that second order DWA
is theoretically trivial, but not easy to implement [131].
A simplified generalization requires the signal conservation rule to be introduced. To
preserve signal integrity the numeric output of the DEM encoder has to be equal to its input:

$$\sum_{i=0}^{M-1} s_i = q. \qquad (113)$$

This is obviously fulfilled with the DWA algorithm since s[n]=u(p[n])-u(p[n-1]) and
p[n]-p[n-1]=q[n] for all n. In a second order extension of DWA, for convenience called 2DWA,
it is desired that:

$$E_w(z) = (1 - z^{-1})^2\,INL(P(z)). \qquad (114)$$

From this the selection vector can be generally defined as:

$$\mathbf{s}[n] = c\,\mathbf{u}(M) + \mathbf{u}(p[n]) - 2\mathbf{u}(p[n-1]) + \mathbf{u}(p[n-2]). \qquad (115)$$

The integer c is a carry variable saying how many times the pointer has wrapped around M.
Pointer wrapping can now occur more than once, since to fulfil the signal conservation rule
for (115) the pointer must be given by:

$$p[n] = (2p[n-1] - p[n-2] + q[n]) \bmod M. \qquad (116)$$

The problem with direct implementation of this is that each element in the selection vector
can now take other values than 0 or 1. For instance the same input sequence as in fig.51 will
with 2DWA give the selection sequence shown in fig.52.


Figure 52: Element selection sequence with second order DWA

In this case each element needs to resolve four discrete levels. The 2 value can be obtained
with a single element by running it at double the sampling rate, but it must still be ternary and
is thus susceptible to internal mismatch. A solution to this problem was proposed and later
patented as the Restricted 2DWA (R2DWA) algorithm [132]. In R2DWA an intermediate
vector, denoted it, is generated according to the second order noise shaping equation, and the
algorithm then forces the selection vector s to take either 1 or 0 values, allocating ones to the q
elements for which the vector it has the smallest entries.

    for n = 1:len(data)
        it[n] = 2 t[n-1] - t[n-2]
        s[n]  = all_min(it[n], q[n])
        t[n]  = it[n] + s[n]
    end

In this pseudo code description the function y=all_min(it,q) allocates ones to the q elements
in y corresponding to those for which it has the smallest values. In practice this introduces a
compression in the 2DWA transfer function which must also be used in the feedback. The
mismatch shaping function of R2DWA and other restricted second order DEM algorithms is
thus of the general form:

$$H_{R2DWA}(z) = \frac{H_2(z)}{g(q) + (1 - g(q))\,H_2(z)}. \qquad (117)$$

The function $H_2(z)$ denotes ideal 2DWA shaping or $H_2(z)=(1-z^{-1})^2$, and g is a less than unity
compression factor. The value of g depends on the input signal and the number of levels in the
DAC: If the input signal is small or the number of levels high, g is close to unity and the
DEM efficiency is near ideal second order shaping. Simulations in [132] as well as paper 4
(Appendix 6) suggest a typical SMNR around 10dB worse than ideal 2DWA.
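The R2DWA pseudo code can be turned into a runnable sketch. The following is an illustration only and makes several assumptions that are not taken from [132]: the recursion is read as it[n]=2t[n-1]-t[n-2] and t[n]=it[n]+s[n], the state starts at zero, ties in all_min are broken in favour of the lowest element index, and no attempt is made to bound the growth of the state vector.

```python
def all_min(it, q):
    """Return a 0/1 vector with ones at the q smallest entries of it."""
    order = sorted(range(len(it)), key=lambda i: it[i])  # stable: ties go to the lowest index
    s = [0] * len(it)
    for i in order[:q]:
        s[i] = 1
    return s

def r2dwa(codes, num_elements):
    """Yield one restricted (0/1) selection vector per input code."""
    t1 = [0] * num_elements   # t[n-1]
    t2 = [0] * num_elements   # t[n-2]
    for q in codes:
        it = [2 * a - b for a, b in zip(t1, t2)]   # second order prediction
        s = all_min(it, q)                          # force the selection to 0/1 values
        t = [i + si for i, si in zip(it, s)]        # feed the restricted selection back
        t2, t1 = t1, t
        yield s

codes = [3, 5, 2, 6, 4, 1, 7, 4]
for q, s in zip(codes, r2dwa(codes, 8)):
    print(q, s)
```

Every selection vector contains exactly q ones, so the signal is preserved, while the feedback of the restricted selection is what introduces the compression relative to ideal second order shaping.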

4.3 Other Techniques

Although DWA based rotation techniques were the breakthrough for DEM and
consequently multi-bit DSM in hi-res applications, many publications have been made where
the problem is attacked from a different point of view. This has led to some intriguing
implementation approaches that are both more flexible and more hardware efficient than the
rotation scheme in fig.50. In wide bandwidth applications it is desirable to use as low OSR as
possible, and many bits in the REQ facilitate higher DSM processing gain for low OSR.
Therefore the research activity in hardware efficient DEM techniques has been quite high.
An alternative way to understand the distortion generated by DAC mismatch is to view it as
spectral leakage of single element switching sequences. The DSM generates an M-level
signal, which in a thermometer encoder is divided into M two-level switching sequences
routed to separate 1-bit DACs. An example of the element switching sequences $s_0$ to $s_7$ in an
8-level DSM DAC is shown in fig.53. Since the Fourier transform is linear, superposition
gives that $\sum S_i(\omega)=Q(\omega)$. But if there is a weighting error in one of the elements the spectrum
of its switching sequence will leak since:

$$E_{mis}(\omega) = Q(\omega) - Y(\omega) = \sum_{i=0}^{M-1}(1 - w_i)\,S_i(\omega). \qquad (118)$$

Figure 53: Switching sequence for each element in a 3-bit DSM DAC

This means that the objective of DEM switching is to ensure that in addition to signal
preservation, or $\sum s_i = q$, every switching sequence $s_i$ itself has a shaped spectrum. Figure 54
shows the element switching sequences for the same input as fig.53 but now with DWA. A
spectral analysis will reveal that every $s_i$ has a spectrum consisting of a signal component and
a first order shaped noise component. Thus DWA provides first order mismatch shaping.

Figure 54: Switching sequence for each element in a 3-bit DSM DAC with DWA

The obvious question posed by this approach to mismatch distortion and DEM is
consequently: How do you ensure that every switching sequence is spectrally shaped while
keeping their combined sum equal to the input?


For a general two-element switching cell as shown in fig.55, where a control signal c
determines whether the inputs are sent directly through the cell or if they are swapped, the
outputs are necessarily given by:

$$s_1 + s_2 = a + b. \qquad (119)$$
$$s_1 - s_2 = c\,(a - b). \qquad (120)$$

Figure 55: Two element swapper cell

Solving (119) and (120) for $s_1$ and $s_2$ it is found that they can be expressed as:

$$s_1 = \frac{1}{2}(a + b) + \frac{c}{2}(a - b). \qquad (121)$$
$$s_2 = \frac{1}{2}(a + b) - \frac{c}{2}(a - b). \qquad (122)$$

The property $s_1+s_2=a+b$ implies signal preservation. A weighting error in $s_1$ or $s_2$ introduces
a non-unity signal gain, but more importantly leakage of c(a-b) to the output. This means that
as long as c is a shaped sequence the leakage is also shaped. Generating a first order shaped
control sequence c can be done with simple logic as described in Adams' patent [133], used
for Analog Devices converters. Higher order is much more complex, but the published
R2DWA implementation is based on this type of swapper cell. Through induction it is found
that ensuring all sequences are noise shaped requires the swapper cells to be arranged in a
complete swapping network like fig.56.

Figure 56: Swapping cell network for DEM, B=3

The sum of shaped sequences approach also led to an ingenious solution by Galton which
significantly reduced the DEM complexity while maintaining high flexibility [134]-[136].
Both DWA and the swapper cell approach have $O(M\log_2 M)$ complexity for the switching
network, in addition to an O(M) complex thermometer encoder. Using a tree structure bit
reduction logic, Galton reduced the thermometer encoder and DEM network to a single block
just slightly above O(M) complexity. This facilitates the use of more elements in the DAC.



Figure 57: Data splitting and reduction for tree structure DEM

Imagine that an input signal x is split into two sequences as shown in fig.57, where we have:

$$s_1 = \frac{1}{2}(x + c), \qquad s_2 = \frac{1}{2}(x - c). \qquad (123)$$

Since $s_1+s_2=x$ this structure is signal preserving. A weighting error between $s_1$ and $s_2$ means
that c leaks to the output, so again c being shaped means the error is shaped. With a few other
restrictions this structure can be used in a logic reduction tree. Firstly, to ensure both $s_1$ and $s_2$
are integers a restriction seen from (123) is that:

$$c = \begin{cases} \text{even} & \text{if } x \text{ is even} \\ \text{odd} & \text{if } x \text{ is odd} \end{cases}. \qquad (124)$$

Furthermore, to enable bit reduction the outputs obviously have to be represented with fewer
bits than the input, i.e. $s_{1,2} \leq 2^{B-1}$ for one bit reduction of a B-bit x. This is fulfilled as long as:

$$|c| \leq \min\{x,\; 2^B - x\}. \qquad (125)$$

(125) is fulfilled for any positive B given $|c| \leq 1$. A control signal satisfying both (124) and
(125) for every sample instant n can thus be made within the restriction:

$$c[n] = \begin{cases} 0 & \text{if } x[n] \text{ is even} \\ \pm 1 & \text{if } x[n] \text{ is odd} \end{cases}. \qquad (126)$$

A simple modified 1-bit mod1 DSM can generate such a sequence that is also first order
shaped, and can thus be used in a complete reduction tree with mismatch shaping. With a B-bit
binary input and a set of $2^B$ two-level outputs where every output sequence $s_i$ is first order
shaped and $\sum s_i=q$, this structure will look like fig.58, showcasing a 3-bit example.
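The reduction tree can be sketched in a few lines of code. This is a simplified illustration rather than Galton's published logic: the class name is hypothetical, and the control sequence c is generated by simply alternating its sign on every odd input, which keeps the running sum of c bounded and is used here as a stand-in for the modified 1-bit mod1 DSM.

```python
class SwitchingBlock:
    """One tree node: splits x in [0, 2**bits] into two values in [0, 2**(bits-1)]."""

    def __init__(self, bits):
        self.next_sign = 1
        self.children = None
        if bits > 1:
            self.children = (SwitchingBlock(bits - 1), SwitchingBlock(bits - 1))

    def step(self, x):
        c = 0
        if x % 2:                      # odd input: c must be +1 or -1
            c = self.next_sign
            self.next_sign = -c        # alternate so the running sum of c stays bounded
        s1, s2 = (x + c) // 2, (x - c) // 2   # s1 + s2 = x, both integers
        if self.children is None:
            return [s1, s2]            # leaf outputs drive two 1-bit DAC elements
        return self.children[0].step(s1) + self.children[1].step(s2)

tree = SwitchingBlock(3)               # 3-bit example: 8 two-level outputs
for x in (3, 5, 2, 8, 0, 7, 1, 4):
    print(x, tree.step(x))
```

Each call returns eight two-level outputs whose sum equals the input, so no separate thermometer encoder is needed. Only the rule that generates c would have to change to obtain a different shaping function.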


Figure 58: Complete reduction tree with first order mismatch shaping

Galton also showed simple logic for restricted second order shaping [135], but found higher
order than this difficult to keep stable. A modified approach was recently shown [137] where
higher order shaping is used for the first few switching layers (then the switching block input
is more than two bits and the restrictions on c can be relaxed), while second order shaping is
used for the last layers. Note that only the block that generates c has to be replaced to change
the mismatch shaping function.
Most alternative (to DWA) algorithms are based on the sum of shaped sequences approach.
One other that is noteworthy, though not reviewed here, is the Schreier VQ-approach [138].

4.4 Segmented Mismatch Shaping

Even if DEM algorithms have become more efficient, a many-bit implementation will still
be quite complex and chip area consuming. This is especially true for second order mismatch
shaping; at best a B-bit DAC will need $2^B$ modulators in the DEM encoder. The routing of a
unit element DAC with many levels is also complex. An intuitive way to solve this would be
to split the DAC into two sub-DACs with separate DEM encoders as shown in fig.59. Of the
B bits, the $B_0$ LSBs are fed to the lower sub-DAC and the $B-B_0$ remaining MSBs are fed to the
upper sub-DAC. The MSB DAC must then have an element weight of $2^{B_0}$ to give a correctly
recombined output. The output is now sort of mid-way between thermometer code and binary
code. A segmented DAC was first shown in 1979 [139], then without any DEM.
With segmentation there are now two smaller DEM blocks and less routing to implement.
But although the DEM encoders linearize the sub-DACs, mismatch between them is not
shaped. How this affects the output can be seen by making a signal flow diagram as shown in
fig.60. For simplicity the sub-DACs are assumed linear (in reality they are DEM linearized),
and inter sub-DAC mismatch is modelled as a weighting error $\alpha \neq 1$ in the LSB DAC.

Figure 59: DEM and DAC segmentation

It is seen that splitting the data is equivalent to introducing a truncating quantizer and feeding
its truncated output to the MSB DAC. The truncation error is subtracted through the LSB
DAC, meaning that this effectively acts as an error-compensation DAC. The MSB data is
effectively right shifted by an amount equal to the number of bits shaved off (indicated by
the $2^{-B_0}$ gain element), which must be cancelled by a nominal MSB DAC gain of $2^{B_0}$. Ideally
the compensation DAC has unity gain, but because of mismatch it is in reality some random
variable $\alpha$, making the compensation non-ideal (see footnote 6). The output is:

$$y = q + (1 - \alpha)\,e. \qquad (127)$$

Figure 60: Equivalent signal flow diagram of segmented DAC

If the truncation is e.g. 4-bit and the compensation DAC gain is $\alpha=0.999$, it means 0.1% of a 4-bit quantization error leaks
to the output. A 4-bit quantization error suppressed by 60dB gives a total ENOB around 14.
This is clearly insufficient for very hi-res applications, and a proposed solution was given by
Adams in 1998 [140] where he replaced the truncation with a dedicated Segmentation-DSM
(SDSM). The SDSM replaces the truncation and shapes the leaking error as shown in fig.61.

Figure 61: DEM and DAC segmentation with SDSM

The number of bits in the SDSM REQ is $B_1$. It thus scales the signal with a factor $2^{-(B-B_1)}$ and
the nominal weighting of the MSB DAC must compensate for this. The output is now:

$$y = q + (1 - \alpha)\,e_{SDSM}. \qquad (128)$$

Since $e_{SDSM}$ is a shaped error the leakage caused by inter sub-DAC mismatch is also shaped.
Conceptually this is very similar to the shaping of single element spectral leakage in a
DEM-encoded unit-element DAC.
A disadvantage in replacing the truncation with an SDSM is that the peak error fed to the
compensation-DAC grows in magnitude. With a truncator it is given that $B=B_0+B_1$ (as is
evident from fig.59), but when the REQ is situated inside a DSM the peak error increases.
This means that the compensation-DAC needs more bits to accommodate a larger input
swing. In his publication Adams used a first order error feedback SDSM with a (z-1) FIR
NTF. Then $e_{SDSM}[n]=e[n]-e[n-1]$ and consequently $\|e_{SDSM}\|_\infty=2\|e\|_\infty$. This means that the
compensation-DAC doubles in size.

6 It is easily found that the variance of $\alpha$ is $2^{-B_0}$ times the nominal LSB-level DAC mismatch.

Generally for a $(z-1)^N$ FIR NTF the peak gain is $2^N$, so in a unity STF SDSM it implies an
N-bit increase of the compensation DAC. Generally $\|e_{SDSM}\|_\infty=\|ntf\|_1\|e\|_\infty$ and consequently,
since $B_e=B-B_1$, the compensation-DAC number of bits has to be at least:

$$B_0 \geq B - B_1 + \log_2\left(\|ntf\|_1\right). \qquad (129)$$

This means that if B=8 and $(z-1)^2$ mismatch shaping is desired throughout the system, the
most efficient DEM segmentation will be with a 5-bit SDSM, leading to both $B_1$ and $B_0$ being
5-bit signals.
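The B=8 example can be reproduced with a short calculation. This sketch assumes the bit requirement $B_0 \geq B - B_1 + \log_2\|ntf\|_1$ and uses the total number of unit elements, $2^{B_1}+2^{B_0}$, as a hypothetical cost measure; the function name is likewise only illustrative.

```python
import math

def compensation_dac_bits(total_bits, sdsm_bits, ntf_taps):
    """Bits needed in the compensation (LSB) DAC for a given SDSM word length."""
    l1 = sum(abs(h) for h in ntf_taps)            # ||ntf||_1 of the FIR NTF
    bound = total_bits - sdsm_bits + math.log2(l1)
    return math.ceil(bound - 1e-12)               # small guard against float rounding

ntf_second_order = [1, -2, 1]                     # taps of (1 - z^-1)^2
for b1 in range(2, 8):
    b0 = compensation_dac_bits(8, b1, ntf_second_order)
    print(f"B1={b1}  B0={b0}  unit elements={2 ** b1 + 2 ** b0}")
```

With this cost measure the minimum is found at a 5-bit SDSM, where the compensation DAC also becomes 5 bits wide. A plain truncator corresponds to an NTF of 1 and gives back $B_0=B-B_1$.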
Although not utilized in any published implementations known to the author, it is fully
possible to use conservative non-overloading IIR NTFs in the SDSM. Since it has less peak
gain such an NTF gives less additional cost from increasing the order. What improvements to
expect with non-overloading IIR SDSMs compared to the FIR SDSMs previously used, is
investigated in the third publication (Appendix 5). It reveals that the complexity penalty from
increasing the SDSM shaping order can be significantly reduced.
Various structures for further DEM segmentation using SDSMs were investigated in the
thesis by Steensgaard [141]. The most intuitive choice would be to just repeat the
segmentation as is shown in fig.62. Steensgaard called this a symmetrical tree structure and
he also explored one-sided and asymmetrical tree structures. Some are more efficient than
others, but all create a DAC overhead that increases with the degree of segmentation.

Figure 62: Two time DEM and DAC segmentation

On a final note a segmented version of the Galton tree structure has also been published
[142]. In this the reduction tree is asymmetric and it thus generates something between a
binary and thermometer type code. It is not reviewed here, but does have the same advantages
as the regular Galton reduction tree, i.e. that it does not require a separate thermometer
encoder and improves the hardware efficiency.

Chapter 5

Delta-Sigma and Dynamic DAC Errors

In chapter two the main categories of DAC errors (the quantization error, static errors, and
dynamic errors) were reviewed. The next chapter explored how DSM REQ can facilitate few-bit
or single-bit conversion with very low in-band quantization noise. It also showed benefits
of using more than one bit, for instance that the in-band quantization noise can easily be made
negligible while maintaining modulator stability. All multi-bit DACs have static non-linearity,
but as the previous chapter reviewed DEM can be used to ensure very high resolution still.
This leaves the class of dynamic or waveform type errors. Chapter two showed the nature of
such errors, but without relating them to the DSM REQ. In a DSM converter the DAC input is
a coarsely quantized and noise shaped sample sequence, the nature of which significantly
affects dynamic error sensitivity. Since it is generally not possible to analytically derive the
DSM output sequence, neither is it possible to analytically derive dynamic errors. Simplified
estimates can however be made and this chapter reviews the development of such.

5.1 Delta-sigma and Jitter Error Estimation

Jitter was introduced in chapter two as a waveform error caused by deviations in the
sampling instant. It stems from the digital audio interface as well as noise and parasitics in the
clock regeneration and distribution circuitry. The jitter pattern can appear signal correlated, as
sinusoids, as white noise, and as pink noise. It was established that the jitter error can be
approximated with an area error model as shown in fig.63, and that the error PSD then is:

$$S_{e_j}(\omega) \approx \frac{1}{T_s^2}\,S_d(\omega) * S_j(\omega). \qquad (130)$$

Here d is the differentiated DAC input. It is desirable to have prediction models for the jitter
distortion in a DSM DAC, so that qualified choices can be made for the DSM design.
Development of such prediction models was featured in the fourth paper (Appendix 6).

Figure 63: Area error model for jitter distortion analysis

A simple estimate of DSM DAC jitter distortion is most conveniently made through a
frequency domain approach. As is known, the output sequence cannot be analytically
derived, but we also know that in the frequency domain it approximates a spectrum consisting
of a signal component and an independent shaped noise component. If the STF is unity in the
signal band, the output PSD of the modulator can be expressed as:
$$S_q(\omega) \approx S_x(\omega) + \frac{\sigma_{e_q}^2}{2\pi}\left|NTF(\omega)\right|^2. \qquad (131)$$

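The PSD of d, or $S_d(\omega)$, is found through spectral differentiation:

$$S_d(\omega) \approx S_{dx}(\omega) + \frac{\sigma_{e_q}^2}{2\pi}\left|dNTF(\omega)\right|^2, \qquad dNTF(\omega) \overset{def}{=} \left(1 - e^{-i\omega}\right)NTF(\omega). \qquad (132)$$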

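The jitter estimate makes use of the total power of d. This is found by integrating the PSD
$S_d(\omega)$ across the whole frequency range $-\pi$ to $\pi$, or in other words finding the spectral $L_2$-norm:

$$\sigma_d^2 \approx \sigma_{dx}^2 + \sigma_{e_q}^2\left\|dNTF\right\|_2^2, \qquad \left\|H\right\|_2^2 \overset{def}{=} \frac{1}{2\pi}\int_{-\pi}^{\pi}\left|H(\omega)\right|^2 d\omega. \qquad (133)$$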

From the convolution theorem the power of $e_j$ in (130) has to be:

$$\sigma_{e_j}^2 = \frac{1}{T_s^2}\,\sigma_d^2\,\sigma_j^2. \qquad (134)$$

The cases of white random jitter and sinusoid sideband jitter were explicitly considered in
the paper since these are most likely to cause audible distortion or noise (see footnote 7). If
the jitter PSD is white, the jitter error PSD will also be white since it stems from convolution.
This means that 1/L of the total error power (134) falls in-band, and the in-band jitter noise
power is:

$$\sigma_{e_j}^2 = \frac{1}{L\,T_s^2}\left(\sigma_{dx}^2 + \sigma_{e_q}^2\left\|dNTF\right\|_2^2\right)\sigma_j^2. \qquad (135)$$

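If the signal component is a sinewave with output normalized peak-to-peak amplitude $A2^B$,
and $\omega_x \ll \pi$ so that $A_{dx} \approx A_x\omega_x$, the in-band SJNR will be:

$$SJNR \approx 10\log_{10}\left(\frac{A^2\,2^{2B}}{f_{s\_in}^2 L\,\sigma_j^2\left(A^2\,2^{2B}\omega_x^2 + \tfrac{2}{3}\left\|dNTF\right\|_2^2\right)}\right). \qquad (136)$$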
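The white jitter estimate is simple to evaluate numerically. The sketch below is an assumption-laden illustration: it takes the SJNR as $10\log_{10}\left(A^2 2^{2B} / \left(f_{s\_in}^2 L \sigma_j^2 (A^2 2^{2B}\omega_x^2 + \tfrac{2}{3}\|dNTF\|_2^2)\right)\right)$, uses a pure modN NTF instead of one designed by Lee's rule, and evaluates the norm from the FIR taps through Parseval's relation. All function names and parameter values are hypothetical.

```python
import math

def dntf_norm_sq(order):
    """||dNTF||_2^2 for NTF = (1 - z^-1)^order, i.e. dNTF = (1 - z^-1)^(order+1)."""
    taps = [math.comb(order + 1, k) for k in range(order + 2)]  # tap magnitudes
    return sum(h * h for h in taps)                              # Parseval: sum of squares

def sjnr_db(amp, bits, osr, fs_in, sigma_j, omega_x, norm_sq):
    """In-band signal-to-jitter-noise ratio for white jitter (assumed formula)."""
    signal = amp ** 2 * 2 ** (2 * bits)
    noise = fs_in ** 2 * osr * sigma_j ** 2 * (signal * omega_x ** 2 + 2.0 / 3.0 * norm_sq)
    return 10 * math.log10(signal / noise)

fs_in, osr, sigma_j = 44.1e3, 64, 50e-12
omega_x = 2 * math.pi * 1e3 / (osr * fs_in)      # 1 kHz tone, normalised to the DAC rate
for bits in (1, 2, 3, 4, 5):
    print(bits, round(sjnr_db(0.5, bits, osr, fs_in, sigma_j, omega_x, dntf_norm_sq(3)), 1))
```

As long as the quantization term dominates the denominator, each added bit improves the estimate by about 6dB.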

If the number of bits B is very large, e.g. in a hi-res LPCM converter (see footnote 8), the
denominator is dominated by the signal term and jitter noise approximates that of ordinary
sampling jitter [63]. With few bits the quantization error term dominates the denominator in
(136), and the SJNR reduces by 6dB for each bit removed. Achieving hi-res performance is very
difficult with few bits since the phase noise variance must then be extremely low. Figure 64
illustrates this by showing $SJNR_{max}$ for varying numbers of levels assuming a peak input of
-6dBFS ($A=A_{max}=0.5$). The NTF of the DSM REQ is also the same in all examples and designed
according to Lee's rule. The input sampling frequency $f_{s\_in}$ is 44.1kHz and the jitter is
50ps RMS.

7 Pink jitter noise is likely to be masked, and in this context in-band jitter sidebands behave the same whether
they are correlated or uncorrelated.
8 A non-modulating REQ will have an NTF of 1.


Figure 64: $SJNR_{max}$ example, 50ps white jitter

The figure compares $SJNR_{max}$ according to the estimate with simulated SNR in a high order
DSM DAC. For low OSR the quantization noise dominates the simulated error while for high
OSR the performance is jitter limited. It is seen that with 50ps RMS white jitter the DAC needs
quite many bits to maintain hi-res audio performance.
Sinusoid jitter leads to sideband distortion since convolving the power spectral densities
means discrete jitter components are multiplied with the spectral components in d, which
include a signal component and shaped noise. If the signal x is sinusoid, straightforward
multiplication through the angle sum and difference identities gives resulting modulation
products at $\omega_x \pm \omega_j$ with amplitude:

$$A_{e_j}(\omega_x \pm \omega_j) \approx \frac{A_j A_x\omega_x}{2T_s} = \frac{A_j A_d}{2T_s}. \qquad (137)$$

No component in the quantization noise contains enough power by itself to create
discernible modulation products with sinusoid jitter, so the total distortion approximates (137)
and is equivalent to sampling jitter. Since convolution is linear, calculation of jitter noise and
jitter sidebands from a composite jitter spectrum can be done separately before adding them
together. Figure 65 shows simulated output spectra of a DSM DAC with sinusoid, white,
and mixed jitter. It is seen that combining them does not affect the contribution of each.
In conclusion jitter sideband distortion is not affected by the DSM, but to maintain high
SNR in the presence of white PSD phase noise it must be many-bit or out-of-band noise must
be removed while in discrete time. A switch-cap filtering DAC does the latter; and the
differentiated PSD at the discrete-to-continuous interface or SCF output will be:

$$S_d(\omega) \approx S_{dx}(\omega) + \frac{\sigma_{e_q}^2}{2\pi}\left|dNTF(\omega)\,H_{SCF}(\omega)\right|^2. \qquad (138)$$

$H_{SCF}(\omega)$ is the low-pass response of the switch-cap filter. The advantage of SC-filtering was
assessed experimentally by Fujimori [53], but can also be estimated with the above method.
As an alternative it is possible to explore other types of reconstruction than zero order hold.
Hawksford suggested raised-cosine reconstruction [143], but due to the difficulty of a hi-res
implementation it has not been seen in commercial applications to my knowledge.



Figure 65: Jittered spectrum with a) sinusoid, b) white and c) mixed jitter

5.2 Delta-sigma and Switching Error Estimation

In chapter 2.7 it was established that the error waveform or ISI generated by switching
errors $e_{on}$ and $e_{off}$ in a DAC can be approximated using an area error model:

$$e_{ISI}[n] \approx \begin{cases} d_n e_{on}, & d_n \geq 0 \\ d_n e_{off}, & d_n < 0 \end{cases}. \qquad (139)$$

It is seen that this error is proportional to d and scales with either $e_{on}$ or $e_{off}$ depending on
whether d is positive or negative. Thus $e_{on}=e_{off}$ means that $e_{ISI}$ is a linear product of d and the
ISI is benign. On the other hand, if $e_{on} \neq e_{off}$ the error waveform is asymmetrical around d=0
and in other words constitutes a non-linearity.
Just like the jitter error approximation, the ISI error approximation is based on superposition
of spectral components in the differentiated DSM output, as given by the additive noise
model. Assessing first the signal component: The assumption $x_n=A\cos(\omega_x n)$ and $\omega_x \ll \pi$ gives
that $d_n \approx -A\omega_x\sin(\omega_x n)$, and (139) can thus be rewritten to:

$$e_{ISI}[n] \approx \begin{cases} -A\omega_x\sin(\omega_x n)\,e_{off}, & 0 \leq \omega_x n < \pi \\ -A\omega_x\sin(\omega_x n)\,e_{on}, & \pi \leq \omega_x n < 2\pi \end{cases}. \qquad (140)$$

Since (140) is infinite and periodic in $\omega_x n=2\pi$ its Fourier series can be developed, which
was showcased in analysis by Clara et al. [79] and results in even harmonic spectral
components with amplitude:

$$A_{ISI}(k\omega_x) \approx \begin{cases} \dfrac{2A\omega_x\left(e_{off} - e_{on}\right)}{\pi(k-1)(k+1)}, & k = 2, 4, 6\ldots \\ 0, & \text{otherwise} \end{cases}. \qquad (141)$$

Paper four extended this analysis to also assess the impact of the shaped quantization noise
component $e_{dsm}$. As is known either $e_{off}$ or $e_{on}$ multiplies with d depending on whether
its instantaneous value is above or below zero. A time sequence of $e_{dsm}$ can again not be found
analytically, but finding its sign means subjecting it to 1-bit unshaped quantization. 1-bit
unshaped quantization of a shaped noise sequence effectively renders it white, and whether
$e_{off}$ or $e_{on}$ multiplies with d is thus something given by a random process with a white PSD.
Since the NTF has zeros at DC this process can be assumed zero-mean and its total power is
approximately $|e_{off}-e_{on}|^2$. The expression for the approximate total power of d is already
known (see (133)), and the total ISI error power follows from it:

$$\sigma_{ISI}^2 \approx \sigma_d^2\left(e_{off} - e_{on}\right)^2 \approx \frac{1}{12}\left\|dNTF\right\|_2^2\left(e_{off} - e_{on}\right)^2. \qquad (142)$$

Since the sign sequence is approximately white, it means the ISI error it produces is also
approximately white. In-band noise power is therefore 1/L of the total noise power, and in-
band SSNR disregarding distortion from the signal component is thus approximately:

\[
SSNR \approx 10\log_{10}\left(
\frac{A^2}{\dfrac{2}{3\cdot 2^{2B}}\; f_{s\_in}^2\, L\, \|dNTF\|_2^2\,(e_{off}-e_{on})^2}
\right) \tag{143}
\]

With a constant element on-error -e_on and off-error e_off, the SSNR increases by 6dB per bit
since the switching activity caused by the differentiated DSM noise remains constant while
the maximum signal swing increases proportionally.

Figure 66: Simulated spectrum, 10ps switching asymmetry


Figure 66 shows a simulated output spectrum from a DSM DAC with asymmetric
switching. The switching error area in this simulation is calculated from linear slewing and a
rise-time and fall-time asymmetry of 10ps. The grey trace is the ideal DSM output and the
black trace is the DAC output. Harmonic signal distortion components estimated from (141)
are shown as markings, and as seen all are buried in the noise floor. Switching asymmetry can
thus be approximated as a white error with an error power estimated from (142).
Comparisons of this estimate with performance simulations give the result of fig.67. The
traces from bottom to top show 7-level to 255-level DACs with the same relative element switching
error; that is, linear slewing with 10ps rise-time and fall-time asymmetry in every element. For
low OSR the quantization noise dominates, while for high OSR the ISI limits performance.
Results are not shown for the 2-level DAC since the ISI models were made to facilitate DEM
and thus had to be multi-level, but its performance can be expected to be worse.

Figure 67: Simulated SSNR_max example, 10ps switching asymmetry

As seen the simple estimate matches the simulated performance very well when the latter is
limited by switching asymmetry. It should be noted though that like the others this estimate is
based upon simplified approximations. Notably the additive noise model is used, but also the
sign sequence and from it the error sequence is assumed to have a white PSD.

Figure 68: Simulated ISI error spectrum

Figure 68 shows the extracted error spectrum from the simulation used in fig.66. As can be
seen the error is in reality not entirely white but contains some residuals of the signal and out-
of-band noise components. Nonetheless the estimate gives a good prediction of the SSNR.
Note also that just like for jitter distortion it will be advantageous to use a many-bit system.
With DEM the switching activity is quite different, which is clearly seen from fig.51 as well
as fig.54. Assuming DWA is used, (108) and (139) give the following relation between the
DSM output time sequence and the switching error:

\[
e_{ISI}[n] =
\begin{cases}
q[n]\,e_{on} - q[n-1]\,e_{off}, & q[n] + q[n-1] \le M \\
(M - q[n-1])\,e_{on} - (M - q[n])\,e_{off}, & q[n] + q[n-1] > M
\end{cases}
\tag{144}
\]
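
The element counts behind the two cases of (144) can be checked with a small simulation. The sketch below is a minimal model, assuming DWA is implemented as a rotating pointer over M unit elements and that an element selected in two consecutive samples simply stays on (NRZ). The function names are illustrative only. It counts how many elements turn on and off at each sample and compares these counts with the factors that multiply e_on and e_off in (144), independently of the sign convention used for the errors.

```python
import random

def dwa_selection(pointer, count, num_elements):
    """Set of element indices used for one sample, starting at the rotating pointer."""
    return {(pointer + i) % num_elements for i in range(count)}

def switching_counts(codes, num_elements):
    """Yield (turned_on, turned_off) per sample for a DWA-driven NRZ unit-element DAC."""
    pointer = 0
    previous = set()
    for q in codes:
        current = dwa_selection(pointer, q, num_elements)
        yield len(current - previous), len(previous - current)
        pointer = (pointer + q) % num_elements
        previous = current

def predicted_counts(q_now, q_prev, num_elements):
    """Element counts multiplying e_on and e_off in the two cases of (144)."""
    if q_now + q_prev <= num_elements:
        return q_now, q_prev
    return num_elements - q_prev, num_elements - q_now

M = 8
random.seed(1)
codes = [random.randint(0, M) for _ in range(1000)]
simulated = list(switching_counts(codes, M))
```

In the second case the new selection wraps around and overlaps the previous one, so the elements in the overlap neither switch on nor off, which is why the counts become M minus the other sample's code.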
Evaluating first the signal component, i.e. assuming q[n] = A sin(ω_x n) where ω_x << π, it is
found that the ISI error will be:

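\[
e_{ISI}[n] \approx
\begin{cases}
\left(\frac{M}{2} + A\sin\omega_x n\right) e_{on} - \left(\frac{M}{2} + A\sin\omega_x (n-1)\right) e_{off}, & 0 \le \omega_x n < \pi \\
\left(\frac{M}{2} - A\sin\omega_x (n-1)\right) e_{on} - \left(\frac{M}{2} - A\sin\omega_x n\right) e_{off}, & \pi \le \omega_x n < 2\pi
\end{cases}
\tag{145}
\]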

Just like without DEM, e_on = e_off means that e_ISI is a linear function of the signal. Fourier
series development of (145), also shown in [79], results in even harmonics with amplitude:

\[
A_{ISI}(k\omega_x) \approx
\begin{cases}
\dfrac{2A\,(e_{off}-e_{on})}{\pi\,(k-1)(k+1)}, & k = 2, 4, 6, \ldots \\
0, & \text{otherwise}
\end{cases}
\tag{146}
\]

To develop a spectral estimate for the additional in-band noise that is caused by asymmetric
switching of e_dsm would be difficult, since (144) spectrally constitutes a non-linear filter. But
as the simulation in fig.69 shows, harmonic distortion is very dominant and is clearly the limiting
performance factor, in the sense that asymmetric switching makes the SFDR unacceptable
long before the SNR. Estimation of additional in-band noise was consequently not pursued.

Figure 69: Simulated spectrum, 10ps switching asymmetry, DWA

It is seen that although the estimate (146) only predicts even harmonics there are also some
weaker albeit clearly visible odd harmonics, which are neither predicted by the signal analysis nor
mentioned in [79]. Remember that the estimate (146) only assesses the signal component and
does not take into account that the DAC input is generated by a DSM. The odd harmonics are
probably caused by e_dsm in reality not being independent of the input, although the additive
noise model assumes it is. So in addition to causing switching noise the DSM error will also
make asymmetric switching cause some odd harmonic content.

Figure 70: Simulated spectrum of LPCM DAC with DWA

Figure 70 shows the output spectrum of a DWA DAC with asymmetric switching, but
instead of a DSM REQ the DAC input is now generated by a TPDF dithered 12-bit LPCM
REQ. The switching asymmetry is significantly increased to make the distortion clearly
visible in the spectrum. As seen odd harmonics are now not present and simulations give a
distortion matching the estimate in (146) and [79]. Although a DSM REQ causes the
distortion to also contain some odd harmonics and additional in-band noise, the ISI error with
DEM is still dominated by even harmonics and (146) will accurately predict the SFDR.

Figure 71: Simulated spectrum, 10ps switching asymmetry, R2DWA


With higher order DEM it is not possible to derive the switching sequence and it would
therefore be extremely difficult to create good estimates for switching asymmetry distortion.
Simulations do show that the ISI error will be dominated by in-band switching noise and a
strong second-harmonic component as seen in fig.71. The SFDR is approximately 10dB better
than with first order DEM, and the harmonic spectrum is more benign. Thus second order
DEM will be superior over first order also to reduce ISI distortion.

5.3 Techniques for Reducing Dynamic Errors

Back in the 1980s when 1-bit delta-sigma was the dominating design paradigm for high
resolution audio converters, it was quickly acknowledged that switching errors would be a
limiting factor for the performance [49]. Investigation into techniques to reduce this problem
followed shortly thereafter.

5.3.1 Return to zero

A solution that was soon proposed for this was the same that is often used to eliminate ISI
in digital transmission channels, namely return-to-zero switching. An RZ DAC simply resets
every element within each sample, creating an output waveform as shown in fig.72.

Figure 72: Return-to-zero waveform

The elements are switched on for a given fraction α<1 of the sample period. Now,
regardless of the value of the input sample, a number of elements equal to this value are
turned both on and off within one sample period. This means the error expression reduces to:

\[
e_{ISI}[n] = q[n]\,(e_{off} - e_{on}) \tag{147}
\]

Now the ISI error is a linear function of the input also if switching is asymmetric, meaning
that it is benign. Thus ISI is eliminated fully, as long as the settling is complete. But even
though ISI is eliminated, RZ switching does have some major disadvantages.
Since the sample period must be divided into a reconstruction phase and a reset phase,
internal clock speeds must be higher than the sampling rate. There are also high frequency
components produced at the output, which may fold down due to non-linearities and
insufficient filtering. Switching losses are increased and the output power reduced by a factor
α^2, where α is the on-fraction of the sample period, for a given element current. But most
importantly, since the output resets to zero for each
sample, the RZ DAC is highly sensitive to random clock jitter. Assuming instantaneous jitter
values at nT and (n+α)T are uncorrelated random values, the jitter error PSD is approximately:

\[
S_{e_j}(\omega) \approx \frac{2}{T_s^2}\, S_q(\omega) * S_j(\omega) \tag{148}
\]

We know that q[n] = M/2 + Q(x[n]) and thus its DTFT is:

\[
Q(\omega) = \frac{M}{2}\, 2\pi \sum_{k=-\infty}^{\infty} \delta(\omega + 2\pi k) + X(\omega) + E_{dsm}(\omega) \tag{149}
\]

If the jitter is white we have from (148) that the error is also white, with in-band power:

\[
\sigma_{e_j}^2 = \frac{2}{L\,T_s^2}\,\sigma_q^2\,\sigma_j^2 , \tag{150}
\]

and the integrated PSD or spectral L_2-norm of q is found to be:

( ) ( )
( )
2
2
2 2 2
2
2
2
2 2 2 2
2
2
1
2 4
2
4
q
j q
q x e
e x e j
s
M
Q d NTF
M
dNTF
L T
= + +

+ +

. (151)

This results in an SJNR for sinusoidal input signals that is given by:

2 2
2
_
10
2
2 2
2 2
2
10 log
2
2
3 2
s in
RZ
j B
A
f L
SJNR
A NTF

=

+ +

. (152)

Figure 73 shows SJNR estimates and SNR simulations of a RZ DAC with duty-cycle α=0.8.
It is seen that for a 2-level DAC the sensitivity to random jitter roughly doubles since there
are two jittered edges instead of one and signal power is reduced by α^2. With many levels the
waveform is always reset from mid-scale to zero, which dominates the jitter error area and
means that the first term dominates the denominator in (152). The SJNR will then not increase
as the number of levels is increased. This implies that RZ switching makes the use of many-bit
DACs pointless, which is confirmed by the simulated jitter performance.
RZ switching might on the other hand be advantageous for low frequency sinusoidal jitter,
since the instantaneous jitter values j(nT) and j((n+α)T) are then very similar in amplitude.
This means that the area error from switching on is nearly cancelled by the area error from
switching off. An approximation for sideband distortion with RZ can be derived identically to
the NRZ case and is found to be:

( )
2 2
x j
j
j j
e
s s
x dj x
A
A A A A
T T

=

. (153)

There is also a distortion component at ω_j due to mixing with the offset. Since the offset is
M/2 its amplitude will be as given in (154).

Figure 73: SJNR_max, 50ps white jitter and RZ DAC

( )
j
j
j j
e
s
A
M A
T

. (154)

Actually the susceptibility to low frequency sinusoidal jitter is somewhat improved with RZ
switching compared to NRZ, but since the sensitivity to white or wide-band jitter is so high
RZ is only really usable for 1-bit DSM DACs.

5.3.2 Dual return-to-zero and time interleaving

Since traditional RZ switching ruins the gain in jitter sensitivity from using multi-bit DSM
conversion, developers and researchers quickly ventured into research on methods for ISI-
elimination that preserve the output waveform. Adams proposed a variation called dual-RZ,
which he introduced with the same innovative DAC design that also introduced segmented
DEM [140]. Dual-RZ was described closer in a subsequent JSSC publication [144]. The
design uses two RZ sub-DACs clocked in opposite phase and sums their outputs to form a
replica of the input waveform as shown in fig.74.
If each sub-DAC element is associated with a turn-on error e_on and a turn-off error e_off, the
combined error from the two RZ sub-DACs becomes:

\[
e_{ISI}[n] = 2\,q[n]\,(e_{off} - e_{on}) \tag{155}
\]

This means that ISI is eliminated as long as settling is complete, and the two sub-DACs are
driven by the same clock signal. If they are, any deviations in the clock transition will affect
both identically and the reproduced waveform is an input waveform replica as shown in the
figure regardless of jitter. The jitter sensitivity is thus the same as for an ordinary NRZ DAC.
Disadvantages with this scheme include the need for two RZ DACs, meaning it has double the
complexity and even higher switching losses. Additionally, synchronization of the two
sub-DACs will be very critical to the final reproduced waveform and the converter's performance.



Figure 74: Dual-RZ waveform

Another approach, that was first proposed by Steensgaard [141] and in a variation used for a
more recent high-speed DSM DAC design [145], is DAC time-interleaving. Straightforward
sample-interleaving cannot be used since mismatch between sub-DACs then produces output
distortion. But this can be dealt with by modifying the DEM scheme [145] or by interleaving
in such a way that both sub-DACs contribute equally to every sample, shown in fig.75 [141].

Figure 75: DAC time-interleaving, a) functional diagram, b) waveform


In this approach the sub-DACs are not RZ, but they are allowed to settle before they are
connected to the output. This means that the sub-DACs can be slow and their dynamic
behaviour sluggish without it affecting the output waveform. The dynamic behaviour of the
output switches will on the other hand affect the waveform and may cause ISI distortion. Its
transitions are shown as dotted lines in the figure. The design is however much improved over
regular NRZ since it is much easier to control the switching behaviour of a single output
switch than a score of DAC elements. An implementation suggestion is featured in [141].

5.3.3 Semidigital filtering DAC

Back when the norm was to use 1-bit DSM REQs, several ways to improve the jitter
performance were explored and one of the more useful proposals was the semidigital filtering
DAC [146]. By arranging several DAC elements as coefficients in a semidigital FIR filter, a
multi-level output signal could be created where mismatch did not affect the DAC linearity.
This concept is shown in fig.76.

Figure 76: 1-bit DSM REQ with semidigital filtering DAC for multi-level output

With N equally weighted sub-DACs the filtering DAC has a sinc(N) low-pass response,
meaning it suppresses out-of-band noise. As long as L>N where L is the OSR, the in-band
gain approximates N, meaning that y is in practice an N-level signal. Mismatch between the
sub-DACs will not lead to distortion as is the case in a regular multi-bit DAC, but will
compromise the low-pass function H_DAC. Generalized, the output is approximately:

\[
S_y(\omega) \approx N^2\, S_x(\omega) + \sigma_{e_q}^2\, |NTF(\omega)|^2\, |H_{DAC}(\omega)|^2 \tag{156}
\]

H_DAC(ω) is the semidigital DAC's frequency response. The SJNR is now approximately:

( ) ( ) ( )
2 2
2
_
10
2
2 2 2 2
2
2
2
10 log
2
2
3
B
s in
B
x DAC j
A
f L
SJNR
A d NTF H
N

+

. (157)


Although it alleviates wideband jitter problems, a filtering DAC does not prevent problems
inherent in the 1-bit DSM REQ such as poor stability, limited input-range, idle-tones and
noise power modulation. But as should be clear by now, with a high OSR the DSM only
needs quite few levels to render REQ quantization noise and related issues negligible. The
reason it is desirable to use many levels is primarily to alleviate wideband jitter problems. In
the fourth paper (Appendix 6), the combination of a few-level DSM REQ and a semidigital
filtering DAC to create a many-level and relatively jitter-immune output was explored as an
alternative to a many-level DSM REQ with segmented DEM.

Figure 77: Multi-bit DSM REQ with semidigital filtering DAC

As seen in fig.77 the DSM is now M-level and N sub-DACs are implemented for an
effective (M·N)-level output signal. Furthermore the DAC weights are generalized since
windowed weighting of the sub-DACs gives better out-of-band suppression and thus better
SJNR than equal weighting. It is shown in the paper how mismatch compromises the DAC
transfer function so that its expected response is:

\[
E\{H_{DAC}(e^{i\omega})\} = H_{ideal}(e^{i\omega}) + N\,\sigma_{e_w} \tag{158}
\]

Here σ_ew is the mismatch error as given in (101). Simulations in the paper show that a
DSM REQ with second order DEM and a Hann filtering DAC where M=15 and N=17
performs better than a segmented second order DEM DAC with M=255 in the presence of
50ps RMS white jitter and 1% RMS mismatch at the 255-level LSB weight. What is the best
choice depends on whether mismatch (with DEM) or jitter is expected to be the limiting factor
for the final SNR. If jitter noise dominates the semidigital filtering DAC is the better choice,
while if mismatch noise dominates the segmented DEM DAC will be the better choice.
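
The effect of the tap weighting on out-of-band suppression can be illustrated numerically. The sketch below compares N=17 equal weights with a Hann window of the same length, treating the semidigital DAC as an ideal FIR filter without mismatch. The Hann definition used (no zero-valued end taps) and the frequency grid resolution are assumptions made for this illustration, not parameters taken from the paper.

```python
import cmath
import math

def fir_magnitude(weights, omega):
    """Magnitude response |H(e^{i*omega})| of an FIR filter with the given tap weights."""
    return abs(sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(weights)))

def hann_weights(num_taps):
    """Hann window without zero-valued end taps (assumed definition)."""
    return [0.5 * (1 - math.cos(2 * math.pi * (n + 1) / (num_taps + 1)))
            for n in range(num_taps)]

def peak_sidelobe_db(weights, grid=4000):
    """Highest response beyond the main lobe, in dB relative to the DC gain."""
    mags = [fir_magnitude(weights, math.pi * i / grid) for i in range(grid + 1)]
    i = 1
    while i < grid and mags[i] <= mags[i - 1]:  # walk down the main lobe to its first minimum
        i += 1
    return 20 * math.log10(max(mags[i:]) / mags[0])

N = 17
rect_db = peak_sidelobe_db([1.0] * N)
hann_db = peak_sidelobe_db(hann_weights(N))
print(f"equal weights: {rect_db:.1f} dB, Hann weights: {hann_db:.1f} dB")
```

The windowed weights trade a wider main lobe for considerably lower side lobes, which is the mechanism by which windowing attenuates the out-of-band shaped noise before it can mix with wideband jitter.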

5.3.4 Pulse Width Modulating DAC

Pulse width modulation is a way to represent a signal as a two-level waveform. While PCM
represents the input as amplitude quantized codes, PWM represents input amplitude samples
as corresponding pulse widths in a periodic waveform. PWM was conceptually described in
1933 by Bennett [147], and its use in audio amplification was suggested in 1965 by Josephson [148].
PWM amplification is attractive because two-level signals facilitate Class-D (switching)
amplifiers with very high efficiency [148]. Research has also branched into digital PCM-PWM
conversion for use in DACs [150] and high output power digital amplifiers [151].
Figure 78a) shows the conversion of an analog signal to PWM, typically referred to as
Natural PWM (NPWM). The PWM waveform is obtained by comparing the input to a
reference carrier in an analog comparator. The carrier is periodic with frequency f_c and one
output pulse is generated per period with a width proportional to the input amplitude at the
crossing point. Thus NPWM is time sampling at the crossing point. To avoid multiple
crossing points the slew rate of the carrier must always be higher than that of the signal. From this it
is required that f_c > πf_x for a full-scale sinusoid x. The PWM spectrum will consist of a signal
component, a carrier component and modulation products. The input is reconstructed by
low-pass filtering after the switching amplifier. For good reconstruction, i.e. high suppression of
the carrier component and modulation products, it is common that f_c >> f_x.

Figure 78: a) Analog PWM modulation b) Digital PCM-PWM conversion

Conversion of PCM to PWM is done similarly, and illustrated in fig.78b). The input sample
is held throughout its sample period at one input of a digital comparator. The reference carrier
at its other input is generated by a counter. The counter resets at an interval T_s and must count
from 0 to 2^B-1 between each reset so that any PCM input sample value can be given a digital
PWM representation. This means 1-bit PWM samples are generated at a rate 2^B·f_s, which for
24-bit 96kHz audio equals 1,600GHz. Analogously, the PWM time resolution corresponding
to 24-bit PCM amplitude resolution is 0.6ps. This is obviously not feasible to implement so
the input must be requantized first. In digital amplifiers for audio it is common to use an
8-bit DSM REQ with an OSR of 8 [152] for a more manageable PWM sample frequency of
~200MHz. The requirement for timing accuracy is however unchanged since jitter in the
PWM waveform is not shaped by the DSM. Unsurprisingly the jitter susceptibility is
comparable to a two-level RZ DAC since PWM is in essence a two-level RZ waveform.
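
The counter-and-comparator conversion and the resulting clock rates can be sketched as follows. This is a minimal illustration, assuming unsigned B-bit samples and a pulse that starts at the beginning of the sample period. The function names are hypothetical, and the 8-bit case assumes the same 96kHz input rate as the 24-bit example.

```python
def pwm_frame(sample, bits):
    """1-bit PWM samples for one PCM sample period: a counter runs 0..2^B-1 and the
    output is high while the counter is below the held (unsigned) sample value."""
    steps = 1 << bits
    if not 0 <= sample < steps:
        raise ValueError("sample out of range for the given word length")
    return [1 if counter < sample else 0 for counter in range(steps)]

def pwm_clock_hz(bits, sample_rate_hz, osr=1):
    """Rate of the 1-bit PWM samples: 2^B counter steps per (oversampled) input sample."""
    return (1 << bits) * osr * sample_rate_hz

frame = pwm_frame(5, 3)                      # 3-bit example: [1, 1, 1, 1, 1, 0, 0, 0]
direct = pwm_clock_hz(24, 96e3)              # about 1.6e12 Hz for 24-bit, 96kHz audio
resolution_s = 1 / direct                    # about 0.6e-12 s
requantized = pwm_clock_hz(8, 96e3, osr=8)   # about 197e6 Hz for an 8-bit REQ at OSR 8
```

The pulse width, i.e. the number of ones in the frame, equals the sample value, while the clock rate grows exponentially with word length, which is why requantization to few bits is needed before the PWM stage.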
Another major issue in PWM amplifiers with digital modulation is PCM-PWM distortion.
Input sample n is held by the comparator from nT_s to (n+1)T_s and resampled at a time instant
depending on its value. This happens along a time grid given by T_s/2^B, but if the resolution
is reasonably high it can be approximated as continuous in time. It is then called Uniform
PWM. Not unexpectedly, the hold error will fold down into the signal band upon resampling
and since the resampling instant is signal dependent it will also cause harmonic distortion
[153]. How the hold error changes the PWM pulse width compared to ideal reconstruction is
illustrated in fig.79. Since only the value at the crossing point is sampled, it is possible to use
algorithms for signal-dependent interpolation to approximate the ideal reconstruction case.
Goldberg and Sandler did important early work on this [154] and a comprehensive treatment
of several approaches and algorithms is given in Nielsen's PhD thesis [155].


Figure 79: UPWM error

The observant reader will have noted that a switching amplifier can work on any two-level
bit-stream, so why not just use a 1-bit DSM REQ which is much more linear than a PCM-
PWM conversion? The answer to this is switching losses. With high OSR and high and
irregular switching activity, the DSM bit-stream in its basic form is not very suitable to drive
high power Class-D amplifiers. Likewise, PWM due to its two-level representation and high
jitter susceptibility is not very suitable for high resolution small signal DACs. The research
effort into eliminating the weaknesses of both has however led to some convergence, and
through new techniques both high-power switching amplifiers based on DSM and hi-res
DACs based on PWM have been reported.
Digital amplifiers based on 1-bit DSM commonly use quantizer hysteresis [156]-[157] to
reduce the switching activity, while some recent hi-res converters have used innovative PWM
variations to reduce dynamic errors in multi-bit current DACs. Doorn et al. showed a design
using PWM in combination with a semidigital filtering DAC to reduce jitter problems [158].
Each DAC element is fed by a two-level PWM stream making it ISI free, and to avoid PCM-
PWM distortion the PWM modulation is done inside the DSM loop. Rueger et al. also
showed a solution [159] using several time-interleaved PWM DAC slices to control the
switching errors and limit the switching activity. The slices consisted of semidigital DACs
to improve jitter performance.
Reefman et al. showed an ingenious utilization of PWM in a 2003 publication [160], where
it is used to eliminate both mismatch noise and ISI while retaining jitter susceptibility at the
same level as an ordinary NRZ DAC. In this design each element is PWM modulated and all
are used equally regardless of input value. The PWM makes the elements ISI free by ensuring
they are switched on and off once every sample, and using them all equally regardless of input
value eliminates mismatch distortion. By time-interleaving the PWM modulation within the
sample period, the combined output of the current elements equals a normal PCM staircase.
This is illustrated in fig.80. Note that the PWM modulation works in modulo fashion so that
the active period is rotated when exceeding a sample period.
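
The principle described above can be illustrated with a toy model. The sketch below is one possible reading of the scheme and not the circuit of [160]: it assumes that the sample period is split into as many time slots as there are elements, and that element i is on for q consecutive slots starting at slot i, wrapping around modulo the sample period. Under these assumptions every element is on for the same time, switches on exactly once per sample period, and the summed output is constant at q.

```python
def element_schedule(code, num_elements):
    """On/off pattern per element over one sample period split into num_elements slots.
    Element i is on for `code` consecutive slots starting at slot i, wrapping modulo
    the sample period (an assumed reading of the rotation described in the text)."""
    return [[1 if (slot - i) % num_elements < code else 0
             for slot in range(num_elements)]
            for i in range(num_elements)]

def rising_edges(pattern):
    """Number of off-to-on transitions when the pattern repeats cyclically."""
    shifted = pattern[-1:] + pattern[:-1]
    return sum(1 for a, b in zip(shifted, pattern) if (a, b) == (0, 1))

M, q = 8, 3
schedule = element_schedule(q, M)
slot_sums = [sum(column) for column in zip(*schedule)]  # combined output per time slot
```

Because each element carries the same duty cycle regardless of its index, element mismatch changes only the gain of the summed output, and because each element has one rising and one falling edge per period, the switching error per element is the same in every sample for a fixed code.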
This algorithm does have a few disadvantages. Firstly, the clock frequency of the PWM
logic needs to be f_s·OSR·2^B, which limits the number of bits in the REQ (and makes it
unusable for wide bandwidth applications). Also, to keep the switching activity constant, the
REQ output can only change by at most ±1 from one sample to the next. Reefman et al. used a limiter
inside the DSM loop to force this, but preserving stability then mandates a conservative NTF.


Figure 80: PWM-based algorithm used by Reefman et al. to eliminate mismatch and ISI

A possibility that wasn't explored by Reefman et al. is to use this algorithm in combination
with a semidigital filtering DAC. If the DSM REQ and PWM modulated sub-DACs were
chosen to be e.g. 5-bit at OSR=128 (for a PWM clock of 400MHz at 96kHz f_s_in), and 32
sub-DACs were arranged as a Hann-weighted semidigital FIR filter, that would make the DAC
highly jitter insensitive and immune to both ISI and element mismatch. For a super hi-res
implementation this would appear as a very attractive design approach.

Chapter 6

Conclusions

Having digested the five chapters and the overview they give, the reader should be able to
assess the challenges and evaluate the results of data conversion design for high resolution
audio. The chapters should also have provided the foundation necessary to evaluate the six papers, which
deal more specifically with some of the issues that have been presented.
The development of state-of-the-art performance in hi-res audio DACs is illustrated in table
2, listing some key silicon-proven publications. Unfortunately, the relatively low number of
published silicon-proven DACs for hi-res audio makes it difficult to produce a survey or
performance chart akin to those used for general purpose ADCs [167]. This situation is
complicated by published measurements often being made under differing conditions, like
signal frequency, amplitude, and frequency weighting. It would in the author's opinion be
helpful if designers more strictly adhered to the AES17-1998 measurement standard [168].
The publications in table 2 are selected for having reasonably comparable measurements, and
also for illustrating the change of design paradigms: the earliest converter is LPCM, after which
the field moved to (1-bit) DSM, later with switch-cap filtering. In the late 90s multi-bit DSM took
over from 1-bit, whereas current-mode DACs superseded switch-cap in the early 2000s. State of
the art performance has steadily increased, as has efficiency quantified by the FOM [167]:

\[
FOM = \frac{2^{ENOB} \cdot 2 f_b}{P} \tag{159}
\]

The ENOB is calculated from the SNDR using the 6dB per bit rule, f_b is the measurement
bandwidth and P is the power dissipation in watts. (A) means measurements are A-weighted.
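
As a worked example of (159), the sketch below recomputes the figures for the earliest converter in table 2 (95dB SNDR, 400mW, 20kHz bandwidth). The exact form of the 6dB per bit rule, ENOB = (SNDR - 1.76)/6.02, is an assumption, as is the rounding of the ENOB to one decimal before it enters the FOM; with these choices the tabulated values appear to be reproduced.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR; the 1.76 dB offset and 6.02 dB/bit slope
    are the usual form of the 6dB-per-bit rule (assumed here)."""
    return (sndr_db - 1.76) / 6.02

def figure_of_merit(enob, bandwidth_hz, power_w):
    """FOM as in (159): 2^ENOB * 2 * f_b / P."""
    return 2 ** enob * 2 * bandwidth_hz / power_w

# Earliest converter in table 2: 95dB SNDR, 400mW, 20kHz bandwidth.
enob = round(enob_from_sndr(95.0), 1)          # 15.5
fom = figure_of_merit(enob, 20e3, 0.400)       # about 4.63e9
print(enob, fom / 1e9)
```

Since the FOM scales with 2^ENOB, one extra effective bit at the same power and bandwidth doubles the figure of merit.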

Table 2: Performance development, selected silicon-proven hi-res audio DAC publications

Publication    Topology                            SNDR @FS    Power pr.ch.  Meas. bandwidth f_b  ENOB    FOM (×10^9)
[163] (1986)   16-bit LPCM current-divider         95dB        400mW         20kHz (A)            15.5    4.63
[164] (1987)   1-bit DSM, CT CMOS buffer           90dB        150mW         20kHz                14.7    7.10
[165] (1991)   1-bit DSM, SC DAC                   102dB       375mW         20kHz (A)            16.7    11.4
[53] (2000)    Multi-bit DSM, SC DAC               102dB       155mW         20kHz (A)            16.7    27.5
[132] (2000)   Multi-bit DSM, SC DAC               100dB       100mW         20kHz (A)            16.3    32.3
[166] (2000)   Multi-bit DSM, I-DAC                108dB       111mW ^9      20kHz (A)            17.6    71.6
[56] (2001)    Multi-bit DSM, I-DAC                112dB       125mW ^10     20kHz (A)            18.3    103
[160] (2003)   Multi-bit DSM, PWM hyb. I-DAC       >110dB ^11  75mW          20kHz (A)            >18.0   >140

In pursuit of higher resolution still, the designer will have to address all problems dealt
with in this book. The methods and results presented should make this task easier.

^9 Estimated from data sheet for Texas Instruments PCM1738
^10 Estimated from data sheet for Texas Instruments PCM1792
^11 Limited by resolution of measurement instrument [158]
The work conducted for the first paper was based on extensive simulations and evaluation
using Matlab. Four different DSM models were written, including a first order, third order,
fifth order and a trellis noise shaping modulator. Their baseband noise power as a function of
the input level was simulated by stepping the input and doing a new simulation run for each
step. To ensure high enough resolution, each simulation run was 2^21 samples long and a total
of 2^12 input levels were simulated for each DSM. These included simple fractions of the
quantization step to provoke the modulator's idle-tone behaviour. Results show that even if it
is high order, a DSM without TPDF dither will have noise-power modulation, but for
multi-bit high order modulators it is likely to be negligible compared to circuit noise. Both the third
order and the fifth order 1-bit DSMs (the latter being Sony's proposed design for DSD) did
however exhibit noise power modulation that will subjectively impede state of the art
performance. From these results it is tempting to conclude that SACD or other 1-bit formats
will make it very difficult to achieve full transparency, whereas LPCM that in theory is
infinitely scalable would be preferable as a raw storage format also in the future.
The research for the second paper was initiated after an e-mail exchange with Peter Kiss,
main author of the paper "Stable High-Order Delta-Sigma DACs" in TCAS-I [161]. His paper
argued for EF modulators being intrinsically more stable than OF modulators, and how a high
order multi-bit EF DSM could be designed with guaranteed stability whereas an OF DSM
could not. This was found to contradict the conclusions in Kenney and Carley's paper on
multi-bit DSM design [123], where the non-overload approach was first introduced. It was
found that the cause for the diverging conclusions between the papers of Kiss and
Kenney/Carley was that the former used OF-modulators implemented as modN basic
structures (fig.32). Such a DSM does of course not have a unity STF and it was the STF that
caused inferior stability. After some further correspondence a paper was written that clarified
and extended the non-overload theory, proving the equivalency of OF and EF modulators and
now also including truncating quantizers, quantizers with offset and any IIR NTF. The work
was again done in Matlab and the model library extended with general model files for any
modN DSM, having selectable N and quantizer functions.
During an excellent course on delta-sigma at the EPFL in Lausanne, Switzerland, Robert
Adams in his lecture presentation showcased the advantages of segmented DEM exemplified
through his high-end design [140], and argued for a first order SDSM as preferable. A little
later a new TCAS paper, "Multibit Delta-Sigma Modulator with Two-Step Quantization and
Segmented DAC" [162], also discarded the use of a second order SDSM for mandating two
extra bits in the compensation DAC. The work done on the non-overload method for the
second paper made it clear that it would be applicable here and could be used to design more
optimal segmentation modulators. IIR SDSMs were designed and analysed in Matlab and it
was found that a very conservative NTF would reduce the complexity penalty of moving to
second order to less than half. Using second order segmentation modulators was also found to
be hugely advantageous with regards to tones. According to Steensgaard's thesis ([141]
pp.174-175), Adams previously argued that tones in the SDSM would not be a problem
because its input contained a strong shaped noise component. In his thesis Steensgaard
repudiates this claim and simulations done for paper III confirm his reasoning. Matlab models
for mismatch DACs with various selectable DEM algorithms and SDSMs were developed in
the making of this publication.
The fourth paper was motivated by the difficulty in finding any good documentation for the
relationship between the DSM and jitter performance. In numerous publications one can read
arguments in favour of multi-bit modulation because of jitter concerns, or that moving from
switch-cap to current-steering DACs increases the jitter problem. However it has been
difficult to find quantified assessments, showing how or by how much the jitter susceptibility
changes when the number of bits, the NTF, or the oversampling ratio is altered. This paper set
out as a general study on the relationship between the DSM and jitter errors, but was later
extended to consist of a more general analysis, also evaluating mismatch errors and ISI errors.
A range of Matlab models were built for DEM DACs, jittered DACs, and DACs with
switching errors, and simplified estimates that are also shown in chapters 4 and 5 were
developed based on spectral analysis with the additive noise source REQ model. The paper
provides estimation methods that should make it easier to predict the distortion caused by
circuit non-idealities when designing a DSM converter, or predict e.g. how many bits will be
necessary to reach a target SJNR, given a certain amount of jitter. It also clearly shows how
advantageous it is to use multi-bit REQ and clarifies common confusions, e.g. surrounding
DSM DACs and their susceptibility to different jitter types. The reader should perhaps in
particular note how jitter sideband distortion will not be affected by the number of bits or the
NTF of the modulator, whereas white jitter noise to a great extent will.
The fifth paper extends the fourth to investigate some proposals using the simplified
estimation methods. The objective was to find a jitter-optimal DAC within certain
complexity constraints. Semidigital filtering DACs have previously been used to improve the
jitter performance of 1-bit converters; in this paper it was proposed to combine a multi-bit
DSM and DEM with a semidigital multi-bit DAC. The imposed complexity constraint was
that the DAC should have 255 levels. An 8-bit DSM REQ followed by a segmented DAC
(with a second order SDSM) was compared to a 15-level DSM REQ followed by 17 15-level
sub-DACs arranged as a semidigital filter. It was confirmed that with proper weighting the
semidigital DAC would have significantly better jitter performance than the segmented DAC,
with the bonus that the complexity overhead caused by the SDSM and larger DEM network is
removed. This topology proved superior with regards to jitter susceptibility, and is
recommended if jitter noise dominates the error budget. In a high-OSR converter for
audio it would be a viable approach to achieve very high resolution. In wider bandwidth
low-OSR delta-sigma converters it will not be applicable.
The sixth paper was written after the hand-in of the original thesis upon which this book is
based, but was also part of the Ph.D. work and is included in this edition. The motivation
behind this study was the confusion that arose when simulations often gave misleading
performance figures and visible smearing of the in-band spectrum, while only a minute change
in signal amplitude, signal frequency or modulator initial state sometimes made these
problems disappear. It was suspected early on that this was caused by noise leakage, but no
literature actually documenting the problem seemed to be available. Since clarifying this issue
should be of benefit to designers, it was decided to explore the issue and write a paper that
sought to explain why it arises and what should be done about it to minimize the likelihood of
wasted simulation or measurement time.


Appendix 1

Frequency Analysis

Throughout this thesis, frequency domain simulations are done in the sampled domain. For
discrete time signals the DTFT is used to find a continuous spectrum.

$$\mathrm{DTFT}\{x\} \overset{\mathrm{def}}{=} X_s(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-i\omega n}. \tag{160}$$

The DTFT gives an infinite periodic spectrum. In a real-world simulation scenario, an
infinite length sample sequence is generally not available. Assuming the available sample
sequence to be of finite length L, its DTFT is:

$$X_s(\omega) = \sum_{n=0}^{L-1} x[n]\, e^{-i\omega n}. \tag{161}$$

Still the transform is not usable for computer simulation since $\omega$ is a continuous variable.
The intuitive way to obtain an equivalent fully discrete transform is to sample the DTFT
spectrum:

$$X(k) = X_s\!\left(\frac{2\pi k}{N}\right) = \sum_{n=0}^{L-1} x[n]\, e^{-i\frac{2\pi k n}{N}}, \quad k = 0,1,\ldots,N-1. \tag{162}$$

Assuming the available sequence is at least as long as the sample set, i.e. $L \geq N$, the N-point
DFT can be defined as:

$$\mathrm{DFT}_N\{x\} \overset{\mathrm{def}}{=} X_N(k) = \sum_{n=0}^{N-1} x[n]\, e^{-i\frac{2\pi k n}{N}}, \quad k = 0,1,\ldots,N-1. \tag{163}$$

If L<N the sequence must be zero-padded to do an N-point DFT; this is not the case for any
simulations in this thesis. Direct calculation of the DFT is computationally very demanding,
its complexity being $O(N^2)$. If N is chosen as a power of 2, several algorithms exist to partition
the data and speed up the process significantly. These algorithms are generally referred to as
Fast Fourier Transforms; a review of FFT algorithms is provided in [169]. An FFT algorithm
will typically compute an N-point DFT with $O(N \log_2 N)$ complexity. Simulations in this thesis
and the papers use the Cooley-Tukey FFT algorithm$^{12}$ with $N=2^{16}$ unless otherwise noted.
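
As a minimal sketch of the definition in (163), assuming only the Python standard library, the direct $O(N^2)$ sum can be written out and compared against a recursive radix-2 Cooley-Tukey FFT. The function names `dft` and `fft` and the test sequence are illustrative and not taken from the thesis, whose simulations rely on the Matlab FFT.

```python
import cmath

def dft(x):
    """Direct N-point DFT, eq. (163): X_N(k) = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])            # DFT of even-indexed samples
    odd = fft(x[1::2])             # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]   # twiddle factor
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

# Both routines compute the same transform; only the operation count differs.
x_test = [0.0, 1.0, -2.0, 3.5, 0.25, -1.0, 2.0, 0.5]
direct = dft(x_test)
fast = fft(x_test)
max_diff = max(abs(a - b) for a, b in zip(direct, fast))
```

The direct form evaluates N terms for each of N bins, whereas the recursion halves the problem at every level, which is where the $O(N \log_2 N)$ cost comes from.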
A finite DFT spectrum can have incongruities compared to the real DTFT spectrum of a
desired function. If the input signal is a function $x_{in}[n]$ defined for $n \in \langle -\infty, \infty \rangle$, picking a limited
sample set of length N to obtain (163) can be written as:

$$x[n] = x_{in}[n]\, w[n], \qquad w[n] \overset{\mathrm{def}}{=} \begin{cases} 1, & 0 \leq n \leq N-1 \\ 0, & \text{otherwise} \end{cases} \tag{164}$$

$^{12}$ Cooley-Tukey is the algorithm used by the default FFT function in Matlab.

The input function is multiplied with a rectangular window w of length N, meaning that
there is spectral convolution:

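$$X(\omega) = X_{in}(\omega) * W(\omega), \qquad W(\omega) = e^{-i\omega\frac{N-1}{2}}\, \frac{\sin\left(\frac{\omega N}{2}\right)}{\sin\left(\frac{\omega}{2}\right)}. \tag{165}$$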

The Fourier transform of w is the aliased sinc-function or Dirichlet-kernel. Imagine the
input function is a sinusoid $x_{in}[n]=\sin(\omega_x n)$: Then its spectrum is zero everywhere but at $\omega_x$. But
because the truncated function x[n] is spectrally convolved with the Dirichlet-kernel it will
have frequency domain smearing and ringing. When N equidistant samples are taken for the
DFT it will have non-zero energy also for other samples than the one closest to $\omega_x$. This is
referred to as spectral leakage. In fig.81 the result of leakage is illustrated for an example
DFT with N=64.

Figure 81: Illustration of DFT spectral leakage
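
The effect can be reproduced numerically. The following is a small sketch, assuming N=64 as in the figure and two hypothetical test tones chosen here for illustration: one with exactly 10 cycles in the record and one with 10.5 cycles. Only the Python standard library is used.

```python
import cmath
import math

N = 64  # DFT length, as in the illustrated example

def dft_mag(x):
    """Magnitudes of DFT bins 0 .. N/2 of a real sequence (direct sum)."""
    n_pts = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                    for n in range(n_pts)))
            for k in range(n_pts // 2 + 1)]

def tone(cycles):
    """Unit-amplitude sine with the given number of cycles in N samples."""
    return [math.sin(2 * math.pi * cycles * n / N) for n in range(N)]

on_bin = dft_mag(tone(10.0))    # frequency falls exactly on bin 10
off_bin = dft_mag(tone(10.5))   # frequency falls midway between bins 10 and 11

# Largest magnitude outside the bin(s) nearest to the tone frequency.
leak_on = max(m for k, m in enumerate(on_bin) if k != 10)
leak_off = max(m for k, m in enumerate(off_bin) if k not in (10, 11))
```

With the tone on a bin, everything outside that bin is at the level of rounding error; with the tone between two bins, the neighbouring bins carry a substantial fraction of the peak magnitude.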

Leakage is less severe with large N, but high resolution SNR simulations are ruined by
leakage even if the DFT is extremely long. Because of this, windowing of the DFT sample set
is absolutely necessary. Windowing means replacing the rectangular window, implied by
picking a finite sample set of a function, with a different window function. A smoother window
function will give less ringing by reducing the abrupt end points of the rectangular window.
That frequency-domain ringing is complementary to time-domain discontinuities is known
from the description of the Gibbs effect [170]. Most simulations in this thesis use the
hann-window [171], defined as:


$$w[n] = 0.5\left(1 - \cos\left(\frac{2\pi n}{N-1}\right)\right). \tag{166}$$

When the signal is multiplied with the hann-window before doing the DFT, the result for a
sinusoid looks as shown in fig.82. As seen, the ringing is greatly suppressed and a DFT of
reasonable length can now be used for very high resolution simulations. A drawback of
windowing is that although side lobes are better attenuated, the main lobe becomes wider.
This implies decreased spectral resolution; if there are two distinct tones close in frequency,
their main lobes from convolution with the window may smear together so that they appear as
only one tone in the DFT. Spectral resolution vs. side lobe attenuation is a trade-off to
make when choosing the windowing function. A comparison of the most common windowing
functions is found in [172].

Figure 82: Spectrum of sine multiplied with rectangular (top) and hann (bottom) window
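
A rough numerical counterpart to this comparison is sketched below. It assumes a record length of N=256 and a tone with 20.5 cycles in the record (both chosen here for illustration, not taken from the thesis), applies the hann-window of (166), and reports the worst side lobe at least 20 bins away from the tone relative to the spectral peak.

```python
import cmath
import math

N = 256
CYCLES = 20.5   # deliberately midway between two DFT bins

def dft_mag(x):
    """Magnitudes of DFT bins 0 .. N/2 of a real sequence (direct sum)."""
    n_pts = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                    for n in range(n_pts)))
            for k in range(n_pts // 2 + 1)]

tone = [math.sin(2 * math.pi * CYCLES * n / N) for n in range(N)]
# hann-window as in eq. (166), with N-1 in the denominator
hann = [0.5 * (1.0 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def worst_far_sidelobe_db(x, first_far_bin=41):
    """Largest magnitude from first_far_bin up to N/2, in dB relative to the peak."""
    mag = dft_mag(x)
    return 20 * math.log10(max(mag[first_far_bin:]) / max(mag))

rect_db = worst_far_sidelobe_db(tone)
hann_db = worst_far_sidelobe_db([w * v for w, v in zip(hann, tone)])
```

The rectangular window leaves side lobes only a few tens of dB below the peak, while the hann-window pushes them far lower at the cost of a main lobe that spans more bins.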

An alternative technique to avoid leakage often used for single-tone simulations is so-called
coherent sampling [173]. The point of coherent sampling is to set the frequency of an input
sinusoid such that the DFT samples correspond exactly to zeros in the convolved spectrum's
side lobes (and the centre of the main lobe). This can be ensured by using an input sinusoid
$x_{in}[n]=\sin(\omega_x n)$ with a frequency that fulfils:

$$\omega_x = 2\pi\frac{K}{N} \quad\Leftrightarrow\quad f_x = \frac{K}{N}\, f_s. \tag{167}$$

K is the integer giving $f_x$ closest to the originally intended input frequency. In this case the
time-domain sinusoid has exactly an integer number of cycles and end-point discontinuities
are not present. The DFT result is shown in fig.83; it is apparent how $X_{in}(k)$ now will not have
any leakage. The result can be confirmed theoretically by correlating the input signal with the
DFT basis functions.
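
The correlation can be written out explicitly. As a short worked derivation, assuming $0 < K < N/2$ and inserting $\omega_x = 2\pi K/N$ from (167) into the DFT definition (163):

```latex
\begin{align*}
X_N(k) &= \sum_{n=0}^{N-1} \sin\!\left(\frac{2\pi K n}{N}\right) e^{-i\frac{2\pi k n}{N}}
        = \frac{1}{2i} \sum_{n=0}^{N-1}
          \left[ e^{\,i\frac{2\pi (K-k) n}{N}} - e^{-i\frac{2\pi (K+k) n}{N}} \right] \\
       &= \begin{cases}
            \dfrac{N}{2i},  & k = K \\[6pt]
            -\dfrac{N}{2i}, & k = N-K \\[6pt]
            0,              & \text{otherwise}
          \end{cases}
\end{align*}
```

The last step uses that $\sum_{n=0}^{N-1} e^{i 2\pi m n / N}$ equals N when m is a multiple of N and zero for every other integer m, so all bins except K and its mirror image N-K are exactly zero.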
It has also been suggested that K should be prime to ensure irreducibility [174]. Then the
number of different levels that are excited is maximized, reducing the risk of hidden INL
errors. This special case of coherent sampling is known as prime sampling.
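
Selecting K according to (167) is easy to automate. The sketch below is one possible way to do it; the helper names `is_prime` and `coherent_bin`, the 48 kHz sample rate and the 1 kHz target frequency are assumptions made for illustration only.

```python
import math

def is_prime(k):
    """Trial-division primality test, adequate for bin indices below N/2."""
    if k < 2:
        return False
    if k % 2 == 0:
        return k == 2
    d = 3
    while d * d <= k:
        if k % d == 0:
            return False
        d += 2
    return True

def coherent_bin(f_target, fs, N, prime=False):
    """Integer K so that f_x = K/N * fs is closest to f_target (eq. 167).

    With prime=True the nearest prime K below N/2 is returned instead.
    """
    K = int(math.floor(f_target * N / fs + 0.5))
    K = min(max(K, 1), N // 2 - 1)
    if not prime:
        return K
    for d in range(N // 2):                 # search outwards from K
        for cand in (K - d, K + d):
            if 1 < cand < N // 2 and is_prime(cand):
                return cand
    raise ValueError("no prime bin below N/2")

N = 2 ** 16
FS = 48000.0                                  # assumed sample rate in Hz
K_coh = coherent_bin(1000.0, FS, N)           # 1365 (= 3*5*7*13, not prime)
K_prime = coherent_bin(1000.0, FS, N, prime=True)   # 1367
f_x = K_prime * FS / N                        # actual tone frequency in Hz
```

The price of prime sampling in this example is a shift of the test tone from the intended 1000 Hz to roughly 1001.2 Hz.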


Figure 83: Convolved spectrum and DFT samples with coherent sampling

A delta-sigma modulator complicates matters somewhat because spectral leakage might
impair the results even when coherent sampling is used. The output from a DSM consists of
two components, the signal component x and the quantization noise component $e_{dsm}$. Now
even if $f_x$ is chosen coherent and x has no spectral leakage to other DFT bins, the quantization
noise $e_{dsm}$ might leak into the signal band. Since the noise is very strongly shaped, especially
in high order modulators, leakage from the powerful out-of-band noise may significantly
affect the very low in-band noise. This is illustrated in fig.84.

Figure 84: Illustration of signal leakage and noise leakage impairing DSM DFT

In the time-domain, noise leakage can be intuitively understood since even though a
sinewave has exactly an integer number of cycles within the length of the DFT, the
quantization error superimposed on it may lead to end point discontinuities. For a high order
DSM the quantization error is a shaped noise signal whose time sequence is not possible to
derive, and it cannot be known how noise leaks in-band. It is therefore strongly recommended
to use both coherent sampling and windowing when doing spectral analysis of a DSM. How
this improves DFT resolution by reducing noise leakage is seen in fig.84. Simulations in this
thesis use prime sampling and hann windowing. These considerations are treated in more
detail in paper VI.
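
The recommendation can be illustrated with a small simulation. The sketch below is not the modulator used in the thesis: it assumes a second-order error-feedback DSM with NTF $(1-z^{-1})^2$ and an unbounded integer-level quantizer, N=4096, an oversampling ratio of 64, a tone amplitude of 3.7 quantizer steps and four prime-sampled test tones, all chosen here for illustration. It also uses the periodic form of the hann-window (denominator N instead of N-1 in (166)), a choice made for this sketch so that a coherent tone stays confined to three bins.

```python
import cmath
import math

N = 4096                 # DFT length (shorter than the thesis' 2**16, for speed)
OSR = 64                 # assumed oversampling ratio
BAND = N // (2 * OSR)    # last baseband bin
A = 3.7                  # assumed tone amplitude in quantizer steps

def dsm2(x):
    """Second-order error-feedback DSM: y = x + e[n] - 2e[n-1] + e[n-2]."""
    e1 = e2 = 0.0                        # previous two quantization errors
    y = []
    for xn in x:
        u = xn - 2.0 * e1 + e2           # quantizer input
        v = math.floor(u + 0.5)          # unbounded integer quantizer
        e2, e1 = e1, v - u
        y.append(float(v))
    return y

def band_powers(y, K, w):
    """Signal power (bins K-1..K+1) and noise power in bins 2..BAND."""
    yw = [wn * yn for wn, yn in zip(w, y)]
    sig = noise = 0.0
    for k in range(2, BAND + 1):         # bins 0 and 1 hold the windowed DC
        Xk = sum(yw[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        p = abs(Xk) ** 2
        if abs(k - K) <= 1:
            sig += p
        elif abs(k - K) > 2:             # bins K-2 and K+2 are guard bins
            noise += p
    return sig, noise

def snr_db(w, bins=(7, 13, 19, 29)):
    """In-band SNR accumulated over several prime-sampled tones."""
    sig = noise = 0.0
    for K in bins:
        x = [A * math.sin(2 * math.pi * K * n / N) for n in range(N)]
        s, q = band_powers(dsm2(x), K, w)
        sig += s
        noise += q
    return 10 * math.log10(sig / noise)

rect = [1.0] * N
hann = [0.5 * (1.0 - math.cos(2 * math.pi * n / N)) for n in range(N)]
snr_rect = snr_db(rect)   # coherent sampling only
snr_hann = snr_db(hann)   # coherent sampling and hann windowing
```

Under these assumptions the signal itself does not leak in either case, so the difference between the two figures comes from out-of-band quantization noise leaking into the baseband when no window is applied.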

Appendix 2

Paper I:

I. Løkken, A. Vinje, T. Sæther, "Noise Power Modulation in Dithered and Undithered
High-Order Sigma-Delta Modulators", J. Audio Eng. Soc., vol.54, no.9, pp.841-854 (2006 Sept.).

© 2006 AES. Reprinted, with permission, from the Journal of the Audio Engineering Society
(ISSN 1549-4950)

Appendix 3

Paper I Errata:

Errata: I.Løkken, A.Vinje, T.Sæther, "Noise Power Modulation in Dithered and Undithered
High-Order Sigma-Delta Modulators", Journal of the Audio Engineering Society, vol. 54, pp.
841-854 (2006 Sept.)

- On p. 843, in the first full paragraph, the first sentence should have read: "In audio we are interested in
having no harmonic distortion and no noise power modulation."
- In Section 2.3.1, p. 849, the text starting on line 14 should have read: "At the same time the tonal
performance is comparable to the statically dithered version. The noise power modulation and overall
dynamic range are also improved significantly as compared to the statically dithered case because ..."
- Fig.1, p.842 should have read: "Midthread quantizer".
- Eq.22 should have a negative sign.
- In fig.10, the x-axis range should be 0.5Δ or 1 in absolute values, since Δ=2. The same applies for
figures 12-14 and 19.

Thanks to Prof. Stanley P. Lipshitz for feedback and comments and JAES chief editor Gerri
Calamusa for printing parts of this errata in JAES vol.54, no.10.

Appendix 4

Paper II:

I. Løkken, A. Vinje, T. Sæther, B. Hernes: "Quantizer Nonoverload Criteria in Sigma-Delta
Modulators", IEEE Trans. Circuits and Systems Part II: Express Briefs, vol.53, no.12,
pp. 1383-1387, (2006 Dec.)

© 2006 IEEE. Reprinted, with permission, from IEEE Transactions on Circuits and Systems
Part II: Express Briefs (ISSN 1549-7747)

Appendix 5

Paper III:

I. Løkken, A. Vinje, T. Sæther: "Segmented Dynamic Element Matching Using Delta-Sigma
Modulation", Proc. 31st Conference of the Audio Engineering Society, New Directions in
High Resolution Audio, London UK, (2007 June).

Appendix 6

Paper IV:

I. Løkken, A. Vinje, T. Sæther, B. Hernes: "Error Estimation in Delta-Sigma DA-Converters",
Submitted to Analog Integrated Circuits and Signal Processing

Appendix 7

Paper V:

I. Løkken, A. Vinje, T. Sæther, "Delta-Sigma DAC Topologies for Improved Jitter
Performance", Audio Eng. Soc. Convention Paper 7497, 124th Convention of the Audio Eng.
Soc. Discover New Horizons in Audio, Amsterdam NL, (2008 May).

© 2008 AES. Reprinted with permission.

Appendix 8

Paper VI:

I. Løkken, A. Vinje, "Some Considerations for Spectral Analysis of Delta-Sigma Data
Converters", ISAST Trans. Electronics and Signal Processing, vol.3, no.2, pp. 19-25 (2008
October).

© 2008 ISAST. Reprinted with permission from the ISAST Transactions on Electronics and
Signal Processing (ISSN 1797-2329)

Bibliography

[1] T.A.Edison, "Phonograph or Speaking Machine", US. Patent 200,521, (1878 Feb.)

[2] H.Fletcher and W.A.Munson, "Loudness, its Definition, Measurement and Calculation",
J. Acoust. Soc. Am., vol.5, pp.82-108, (1933 May)

[3] D.W.Robinson and R.S.Dadson, "A Re-Determination of the Equal-Loudness Relations
for Pure Tones", Br. J. Appl. Phys., vol.7, pp.166-181, (1956)

[4] T.Oohashi et al., "Inaudible High-Frequency Sounds Affect Brain Activity: Hypersonic
Effect", J.Neurophysiol, vol.83 no.6, pp.3548-3558, (2000 Jun.)

[5] S.Kiryu, "Detection of Threshold for Tones Above 22kHz", Audio Eng. Soc.
Convention Paper 5401, 110th AES Convention, Amsterdam, (2001 May)

[6] J.Boyk, "There's Life Above 20kHz! A Survey of Musical Instrument Spectra to
102.4kHz", California Institute of Technology, available on-line at:
http://www.cco.caltech.edu/~boyk/spectra/spectra.htm, (2000 May)

[7] Acoustic Renaissance for Audio, "A Proposal for High-Quality Application of
High-Density CD Carriers", J. Japan Audio Soc., vol.35, available on-line at:
http://www.meridian-audio.com/ara, (1995 Oct.)

[8] J.Atkinson, A.B.Krueger, "The Great Debate; Subjectivism on Trial", Home
Entertainment Show 2005, New York. Audio recording available on-line at:
http://stereophile.com/news/050905debate/, (2005 May)

[9] G.E.Moore, "Cramming More Components Onto Integrated Circuits", Electronics
Magazine, vol.38, no.8, (1965)

[10] Philips Intellectual Property and Standards, "IEC-908: Compact Disc Digital Audio -
The Red Book", International Electrotechnical Commission Standards Document, no.
28/10/04-3122 783 0027 2, (1980 Jun.)

[11] G.Theile, "On the Performance of Two-Channel and Multi-Channel Stereophony",
Audio Eng. Soc. Convention Paper 2887, 88th AES Convention, Montreux, (1990 Mar.)

[12] P.Craven, "Toward the 24-bit DAC: Novel Noise-Shaping Topologies Incorporating
Correction for the Nonlinearity in a PWM Output Stage", J. Audio Eng. Soc., vol.41,
no.5, pp.291-313, (1993 May)

[13] J.v.d.Verbakel, L.v.d.Kerkhof, M.Maeda, Y.Inazawa, "Super Audio CD Format",
Audio Eng. Soc. Convention Paper 4705, 104th AES Convention, Amsterdam, (1998
May)

[14] N.Fuchigami, T.Kuroiwa, B.H.Suzuki, "DVD-Audio Specifications", J. Audio Eng.
Soc., vol.48, no.12, pp.1228-1230, 1232-1238, 1240, (2000 Dec.)

[15] D.Blech, M.C.Yang, "DVD-Audio Versus SACD: Perceptual Discrimination of Digital
Audio Coding Formats", Audio Eng. Soc. Convention Paper 6086, 116th AES
Convention, Berlin, (2004 May)

[16] J.Atkinson, "Hi-Rez Media: When Will They Learn?", Stereophile Magazine, vol.28,
no.3, (2005 Mar.)

[17] P.J.Alexander, "Peer-to-Peer File Sharing: The Case of the Music Recording Industry",
Review of Industrial Organization, vol.20, pp.151-161, (2002)

[18] T.Painter, A.Spanias, "Perceptual Coding of Digital Audio", Proc. IEEE, vol.88, no.4,
pp.451-515, (2000 Apr.)

[19] F.E.Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences:
Part I-II", J. Audio Eng. Soc., vol.34, no.4, pp. 227-235 and no.5, pp.323-348, (1986)

[20] R.Levine, "The Death of High Fidelity", Rolling Stone Magazine, available on-line at:
http://www.rollingstone.com/news/story/17777619, (2007 Dec.)

[21] H.Nyquist, "Certain Topics in Telegraph Transmission Theory", Trans. AIEE, vol.47,
pp.617-644, (1928 Apr.)

[22] V.A.Kotelnikov, "On the Carrying Capacity of the Ether and Wire in
Telecommunications", 1st All-Union Conference on Questions of Communication, Izd.
Red. Upr. Svyazi RKKA, Moscow, (1933)

[23] C.E.Shannon, "A Mathematical Theory of Communication", Bell System Tech. J.,
vol.27, pp.379-423, 623-656, (1948)

[24] C.E.Shannon, "Communication in the Presence of Noise", Proc. Institute Radio Eng.,
vol.37, no.1, pp.10-21, (1949 Jan.)

[25] E.T.Whittaker, "On the Functions Which are Represented by the Expansions of the
Interpolation Theory", Proc. Royal Soc. Edinburgh, vol.35, pp.181-194, (1915)

[26] H.D.Lüke, "The Origins of the Sampling Theorem", IEEE Communications Magazine,
pp.106-108, (1999 Apr.)

[27] H.S.Black and J.O.Edson, "Pulse Code Modulation", AIEE Transactions, vol.66,
pp.895-899, (1947)

[28] W.R.Bennett, "Spectra of Quantized Signals", Bell System Tech. J., vol.27, pp.446-471,
(1948 July)

[29] B.Widrow, "A Study of Rough Amplitude Quantization by Means of Nyquist Sampling
Theory", IRE Trans. Circuit Theory, vol.CT-3, pp.266-276, (1956 Dec.)

[30] R.M.Gray, "Quantization Noise Spectra", IEEE Trans. Inform. Theory, vol.36, (1990
Nov.)

[31] G.R.Ritchie, J.C.Candy, and W.H.Ninke, "Interpolative Digital-to-Analog Converters",
IEEE Trans. Communications, vol.22, pp.1797-1806, (1974 Nov.)

[32] H.G.Musmann and W.W.Korte, "Generalized Interpolative Method for Digital/Analog
Conversion of PCM Signals", U.S. Patent 4,467,316, (filed 1981 June)

[33] M.Bellanger et al., "Digital Filtering by Polyphase Network: Application to
Sample-Rate Alteration and Filter Banks", IEEE Trans. Acoustics, Speech and Signal
Processing, vol.24, pp.109-114, (1976 Apr.)

[34] T.Saramäki, "Design of FIR Filters as a Tapped Cascaded Interconnection of Identical
Subfilters", IEEE Trans. Circuits and Systems, vol.34, no.9, pp.1011-1029, (1987 Sept.)

[35] T.Saramäki, Y.Neuvo, S.K.Mitra, "Design of Computationally Efficient Interpolated
FIR-filters", IEEE Trans. Circuits and Systems, vol.35, no.1, pp.70-88, (1988 Jan.)

[36] O.Pirochta, "Hardware Implementations of Digital FIR Filters in FPGA", Proc. 17th
International Conference Radioelektronika 2007, Brno Czech Republic, (2007 Apr.)

[37] S.Mitra, "Digital Signal Processing: A Computer-Based Approach", third edition,
McGraw-Hill International Press, ISBN: 007-124467-0, (2006)

[38] L.G.Roberts, "Picture Coding Using Pseudo-Random Noise", IEEE Trans. Information
Theory, vol.8, pp.145-154, (1962 Feb.)

[39] L.Schuchman, "Dither Signals and Their Effect on Quantization Noise", IEEE Trans.
Communications, vol.12, pp.162-165, (1964 Dec.)

[40] S.P.Lipshitz, J.Vanderkooy, "Dither in Digital Audio", J. Audio Eng. Soc., vol.35, no.12,
pp.966-975, (1987 Dec.)

[41] S.P.Lipshitz, R.A.Wannamaker, and J.Vanderkooy, "Quantization and Dither; A
Theoretical Survey", J. Audio Eng. Soc., vol.40, pp.355-375, (1992 May)

[42] S.P.Lipshitz, J.Vanderkooy, "Dither Myths and Facts", Audio Eng. Soc. Convention
Paper 6279, 117th AES Convention, San Francisco, (2004 Oct.)

[43] R.A.Wannamaker, "The Theory of Dithered Quantization", Ph.D. Thesis, Dept. for
Applied Mathematics, University of Waterloo, Waterloo, ON, Canada, (1997 June)

[44] H.Inose, Y.Yasuda and J.Murakami, "A Telemetering System by Code Modulation,
Delta-Sigma Modulation", IRE Trans. Space, Electronics and Telemetry, SET-8, pp.
204-209, (1962 Sept.)

[45] D.J.Goodman, "The Application of Delta Modulation of Analog-to-PCM Encoding",
Bell System Tech. J., vol.48, pp.321-343, (1969 Feb.)

[46] C.C.Cutler, "Transmission Systems Employing Quantization", U.S. Patent 2,927,962,
(1960 Mar.)

[47] Dan Sheingold, "Sigma-Delta or Delta-Sigma?", Analog Dialogue, vol.24, no.2,
editor's note, (1990)

[48] N.H.C.Gilchrist, "Analogue-to-Digital and Digital-to-Analogue Converters for High
Quality Sound", Audio Eng. Soc. Convention Paper 1583, 65th AES Convention,
London, (1980 Feb.)

[49] R.W.Adams, "Design and Implementation of an Audio 18-Bit Analog-to-Digital
Converter Using Oversampling Techniques", J. Audio Eng. Soc., vol.34 no.3,
pp.153-166, (1986 March)

[50] J.T.Caves, M.A.Copeland, C.F.Rahim and S.D.Rosenbaum, "Sampled-Data Filters
Using Switched Capacitors as Resistor Equivalents", IEEE J. Solid-State Circuits,
vol.12, pp.592-600, (1977 Dec.)

[51] J.A.C.Bingham, "Application of Direct-Transfer SC Integrator", IEEE Trans. Circuits
and Systems, vol.31, pp.419-420, (1984 Apr.)

[52] N.S.Sooch et al., "18-b Stereo D/A Converter with Integrated Digital and Analog
Filters", Audio Eng. Soc. Convention Paper 5603, 91st AES Convention, New York,
(1991 Oct.)

[53] I.Fujimori, A.Nogi, T.Sugimoto, "A Multibit Delta-Sigma Audio DAC With 120-dB
Dynamic Range", IEEE J. Solid-State Circuits, vol.35, no.8, pp.1066-1073, (2000 Aug.)

[54] I.Fujimori, T.Sugimoto, "A 1.5 V, 4.1 mW Dual-Channel Audio Delta-Sigma D/A
Converter", IEEE J. Solid-State Circuits, vol.33, no.12, pp.1863-1870, (1998 Dec.)

[55] A.Paul Brokaw, "Digital-to-Analog Converter with Current Source Transistors
Operated Accurately at Different Current Densities", U.S. Patent 3,940,760, (filed 1975
Mar.)

[56] N.Terada, S.Nakao, "A 126DB D-Range Current-Mode Advanced Segmented DAC",
Proc. 16th Audio Eng. Soc. UK Conference Silicon for Audio, (2001 Mar.)

[57] K.B.Amulya, "Binomial Theorem in Ancient India", Indian J. Hist. Sci., vol.1,
pp.68-74, (1966)

[58] G.Boole, "An Investigation of the Laws of Thought on Which are Founded the
Mathematical Theories of Logic and Probabilities", Macmillan Publishers (1854),
reprinted with corrections, Dover Publications, New York, ISBN 978-0486600284,
(1958)

[59] C.Shannon, "The Symbolic Analysis of Relay and Switching Circuits", Trans. Am. Inst.
Electrical Eng., vol.57, pp.713-723, (1938 Mar.)

[60] J.J.Wikner, "Studies on CMOS Digital-to-Analog Converters", Ph.D. Thesis, Dept. for
Electrical Engineering, Linköping University, Linköping, Sweden,
ISBN 91-7219-910-5, (2001)

[61] Q.Li, "INL, DNL and Performance of Analog-to-Digital Converters", project report for
the course Learning from Data at Portland State University, available on-line at:
http://web.cecs.pdx.edu/~edam/Reports/2002/Li.pdf

[62] C.Dunn, M.O.J.Hawksford, "Is the AESEBU/SPDIF Digital Audio Interface Flawed?",
Audio Eng. Soc. Convention Paper 3360, 93rd AES Convention, San Francisco, (1992
Oct.)

[63] J.Dunn, "Jitter: Specification and Assessment in Digital Audio Equipment", Audio Eng.
Soc. Convention Paper 3361, 93rd AES Convention, San Francisco, (1992 Oct.)

[64] J.Dunn et al., "Toward Common Specifications for Digital Audio Interface Jitter",
Audio Eng. Soc. Convention Paper 3705, 95th AES Convention, New York, (1993 Oct.)

[65] J.Dunn, "Sample Clock Jitter and Real-time Audio Over the IEEE1394 High
Performance Serial Bus", Audio Eng. Soc. Convention Paper 4920, 106th AES
Convention, Munich, (1999 Apr.)

[66] P.Heydari, "Analysis of the PLL Jitter Due to Power/Ground and Substrate Noise",
IEEE Trans. Circuits and Systems I: Regular Papers, vol.51, no.12, pp.2404-2416,
(2004 Dec.)

[67] J.A.McNeill, "Jitter in Ring Oscillators", IEEE J. Solid State Circuits, vol.32, no.6,
pp.870-879, (1997 June)

[68] AES-12id-2006, "AES Information Document for Digital Audio Measurements - Jitter
Performance Specifications", (2006)

[69] M.O.J.Hawksford, "Jitter Simulation in High Resolution Digital Audio", Audio Eng.
Soc. Convention Paper 6864, 121st AES Convention, San Francisco, (2006 Oct.)

[70] K.Doris, A.van Roermund, D.Leenaerts, "A General Analysis on the Timing Jitter in
D/A Converters", Proc. IEEE Intern. Symp. Circuits and Systems ISCAS 2002,
pp.117-120, (2002 May)

[71] L.Angrisani, M.D'Apuzzo, M.D'Arco, "Modeling Timing Jitter Effects in
Digital-to-Analog Converters", 2005 IEEE International Workshop on Intelligent Signal
Processing, pp.254-259, (2005 Sept.)

[72] B.Putzeys, R.de saint Moulin, "Effects of Jitter on AD/DA Conversion", Audio Eng.
Soc. Convention Paper 6122, 116th AES Convention, Berlin, (2004 May)

[73] R.H.M. van Veldhoven, "A Triple-Mode Continuous-Time ΣΔ Modulator with
Switched-Capacitor Feedback DAC for a GSM-EDGE/CDMA2000/UMTS receiver",
IEEE J. Solid State Circuits, vol.38, no.12, pp.2069-2076, (2003 Dec.)

[74] K.Ashihara et al., "Detection Threshold for Distortions Due to Jitter on Digital Audio",
ACJ J. Acoust. Science and Technology, vol.26, no.1, pp.50-54, (2005)

[75] R.W.Adams, "Jitter Analysis of Asynchronous Sample-Rate Conversion", Audio Eng.
Soc. Convention Paper, 95th AES Convention, New York, (1993 Oct.)

[76] F.M.Rotacher, "Sample-Rate Conversion; Algorithms and VLSI Implementation",
PhD-thesis, Swiss Federal Institute of Technology, Zürich, (1995)

[77] M.J.M.Pelgrom et al., "Matching Properties of MOS Transistors", IEEE J. Solid-State
Circuits, vol.24, no.5, pp.1433-1439, (1989 Oct.)

[78] K.O.Andersson, J.J.Wikner, "Modeling of the Influence of Graded Element Matching
Errors in CMOS Current-Steering DACs", Proc. 17th Norchip Conference, Oslo
Norway, (1999 Nov.)

[79] M.Clara, A.Wiesbauer, W.Klatzer, "Nonlinear Distortion in Current-Steering
D/A-Converters Due to Asymmetrical Switching", Proc. IEEE Intern. Symp. Circuits and
Systems ISCAS 2004, pp.285-288, (2004 May)

[80] B.P.Del Signore et al., "A Monolithic 20-b Delta-Sigma A/D Converter", IEEE J.
Solid-State Circuits, vol.25, no.6, pp.1311-1317, (1990 Dec.)

[81] S.Luschas and H.-S.Lee, "Output Impedance Requirements for DACs", Proc. IEEE
Intern. Symp. Circuits and Systems ISCAS 2003, pp.861-864, (2003 May)

[82] Texas Instruments, "DSD1792A 24-Bit 192 kHz Sampling Advanced Segment Audio
Stereo DAC", Data Sheet SLES106 Rev.B, (2006 Nov.)

[83] J.Silva, U.Moon, J.Steensgaard and G.C.Temes, "Wideband Low-Distortion
Delta-Sigma ADC Topology", Electronics Letters, vol.37, no.12, pp.737-738, (2001 June)

[84] J.C.Candy, "A Use of Double Integration in Sigma-Delta Modulation", IEEE Trans.
Communications, vol.33, no.3, pp.249-258, (1985 Mar.)

[85] R.W.Adams, "Theory and Practical Implementation of a Fifth-Order Sigma-Delta A/D
Converter", J. Audio Eng. Soc., vol.39, no.7/8, pp.515-528, (1991 Jul./Aug.)

[86] B.E.Boser, B.A.Wooley, "The Design of Sigma-Delta Modulation Analog-to-Digital
Converters", IEEE J. Solid State Circuits, vol.23, no.6, pp.1298-1308, (1988 Dec.)

[87] S.Hein, A.Zakhor, "On the Stability of Sigma Delta Modulators", IEEE Trans. Signal
Processing, vol.41, no.7, pp.2322-2348, (1993 Jul.)

[88] S.Lipshitz, J.Vanderkooy, R.A.Wannamaker, "Minimally Audible Noise Shaping", J.
Audio Eng. Soc., vol.39, no.11, pp.836-852, (1991 Nov.)

[89] H.Takahashi, A.Nishio, "Investigation of Practical 1-bit Delta-Sigma Conversion for
Professional Audio Applications", Audio Eng. Soc. Convention Paper 5392, 110th AES
Convention, Amsterdam, (2001 Apr.)

[90] P.J.Naus et al., "A CMOS Stereo 16-bit D/A Converter for Digital Audio", IEEE J.
Solid State Circuits, vol.22, no.3, pp.390-395, (1987 June)

[91] P.Kiss, J.Arias, D.Li, V.Boccuzzi, "Stable High-Order Delta-Sigma DACs", IEEE
Trans. Circuits and Systems I, Reg. Papers, vol.51, no.1, pp.200-205, (2004 Jan.)

[92] T.Hayashi et al., "A Multistage Delta-Sigma Modulator without Double Integration
Loop", ISSCC Dig. Technical Papers, pp.182-183, (1986 Feb.)

[93] Y.Matsuya et al., "A 16-bit Oversampling A-to-D Conversion Technology using
Triple-Integration Noise Shaping", IEEE J. Solid State Circuits, vol.22, no.6, pp.921-929,
(1987 Dec.)

[94] W.Chou et al., "Multistage Sigma-Delta Modulation", IEEE Trans. Information Theory,
vol.35, no.4, (1989 July)

[95] H.Kato, "Trellis Noise-Shaping Converters and 1-bit Digital Audio", AES Convention
Paper 5615, 112th AES Convention, Munich Germany, (2002 May)

[96] A.J.Viterbi, "Error Bounds for Convolutional Codes and an Asymptotically Optimum
Decoding Algorithm", IEEE Trans. Information Theory, vol.13, no.2, pp.260-269,
(1967 Apr.)

[97] E.Janssen, D.Reefman, "Advances in Trellis based SDM structures", AES Convention
Paper 5993, 115th AES Convention, New York USA, (2003 Oct.)

[98] J.A.S.Angus, "The Efficiency of Pruned Tree versus Stack Algorithms for
Look-Ahead Sigma-Delta Modulators", J. Audio Eng. Soc., vol.54, no.6, pp.477-494, (2006
June)

[99] W.L.Lee, C.G.Sodini, "A Topology for Higher-Order Interpolative Coders", Proc. IEEE
Int. Symp. Circuits and Systems, pp.459-462, (1987 May)

[100] D.L.Wellard et al., "Delta-Sigma Modulator with Oscillation Detect and Reset Circuit",
US Patent 5,012,244, (1991 Apr.)

[101] W.Rhee, B.S.Song, A.Ali, "A 1.1-GHz CMOS Fractional-N Frequency Synthesizer
with a 3-b Third-Order ΔΣ Modulator", IEEE J. Solid State Circuits Part I, vol.35,
no.10, pp.1453-1460, (2000 Oct.)

[102] E.F.Stikvoort, "Some Remarks on the Stability and Performance of the Noise Shaper or
Sigma-Delta Modulator", IEEE Trans. Comm., vol.36, pp.1157-1162, (1988 Oct.)

[103] T.Ritoniemi, T.Karema, H.Tenhunen, "Design of Stable High Order 1-bit Sigma-Delta
Modulators", Proc. 1990 IEEE Int. Symp. Circuits and Systems, pp.3267-3270, (1990
May)

[104] R.Schreier, "An Empirical Study of High-Order Single Bit Delta-Sigma Modulators",
IEEE Trans. Circuits and Systems Part II, vol.40, pp.461-466, (1993 Aug.)

[105] S.H.Ardalan, J.J.Paulos, "Analysis of Nonlinear Behaviour in Delta-Sigma Modulators",
IEEE Trans. Circuits and Systems Part I, vol.34, no.6, pp.593-603, (1987 June)

[106] S.Hein, A.Zakhor, "On the Stability of Sigma Delta Modulators", IEEE Trans. Signal
Processing, vol.41, no.7, pp.2322-2348, (1993 July)

[107] H.Wang, "A Study of Sigma Delta Modulations as Dynamical Systems", PhD Thesis,
Columbia University, New York, AAT 9333879, (1993)

[108] O.Feely, L.O.Chua, "The Effect of Integrator Leak in ΣΔ Modulation", IEEE Trans.
Circuits and Systems, vol.38, no.11, pp.1293-1305, (1991 Nov.)

[109] L.Risbo, "Delta-Sigma Modulators: Stability Analysis and Optimization", PhD. thesis,
Technical University of Denmark, Lyngby, Denmark, (1994 June)

[110] J.Reiss, "Towards a Procedure for Stability Analysis of High Order Sigma Delta
Modulators", Audio Eng. Soc. Convention Paper 6549, 119th AES Convention, New
York, (2005 Oct.)

[111] G.Tsenov, V.Mladenov, and J.Reiss, "A Comparison of Theoretical, Simulated, and
Experimental Results Concerning the Stability of Sigma Delta Modulators", Audio Eng.
Soc. Convention Paper, 124th AES Convention, Amsterdam, (2008 May)

[112] M.Goodson, B.Zhang, and R.Schreier, "Proving Stability of Delta-Sigma Modulator
Using Invariant Sets", Proc. Int. Symp. Circuits and Systems ISCAS'95, pp.633-636,
(1995 May)

[113] R.Schreier, M.Goodson, B.Zhang, "An Algorithm for Computing Convex Positively
Invariant Sets for Delta-Sigma Modulators", IEEE Trans. Circuits and Systems Part-I:
Fundamental Theory and Applications, vol.44, no.1, pp.38-44, (1997 Jan.)

[114] J.C.Candy, O.J.Benjamin, "The Structure of Quantization Noise from Sigma-Delta
Modulation", IEEE Trans. Communications, vol.29, no.9, pp.1316-1323, (1981 Sept.)

[115] V.Friedman, "The Structure of the Limit Cycles in Sigma Delta Modulation", IEEE
Trans. Communications, vol.36, no.8, pp.972-979, (1988 Aug.)

[116] D.Reefman, J.Reiss, E.Janssen, M.Sandler, "Description of Limit Cycles in
Sigma-Delta Modulators", IEEE Trans. Circuits and Systems I: Regular Papers, vol.52, no.6,
pp.1211-1223, (2005 June)

[117] J.Reiss, M.Sandler, "They Exist: Limit Cycles in High Order Sigma Delta Modulators",
Audio Eng. Soc. Convention Paper 5832, 114th AES Convention, Amsterdam, (2003
Feb.)

[118] J.Reiss, "Understanding Sigma-Delta Modulation: The Solved and Unsolved Issues", J.
Audio Eng. Soc., vol.56, no.1/2, pp.49-64, (2008 Jan.)

[119] R.Schreier, "On the Use of Chaos to Reduce Idle-Channel Tones in Delta-Sigma
Modulators", IEEE Trans. Circuits and Systems Part-I, vol.41, no.8, pp.539-547, (1994
Aug.)

[120] S.R.Norsworthy, "Dynamic Dithering of Delta-Sigma Modulators", Audio Eng. Soc.
Convention Paper 4103, 99th AES Convention, New York, (1995 Oct.)

[121] J.Reiss, M.Sandler, "Dither and Noise Modulation in Sigma-Delta Modulators", Audio
Eng. Soc. Convention Paper 5935, 115th AES Convention, New York, (2003 Oct.)

[122] D.Campbell, "The Delta-Sigma Modulator as a Chaotic Dynamical Non-Linear
System", PhD. thesis, University of Waterloo, Ontario, Canada, (2006)

[123] J.G.Kenney and L.R.Carley, Design of Multibit Noise-Shaping Data Converters, J.
Analog Int. Circuits Signal Processing, vol.3, no.3, pp.259-272, (1993 May)

[124] R.J.Van De Plassche; Dynamic Element Matching for High Accuracy Monolithic DA
Converters, IEEE J. Solid State Circuits, vol.11, no.6, pp.795-800, (1976 Dec.)

[125] L.R.Carley, A Noise Shaping Coder Topology for 15+ bit Converters, IEEE J. Solid
State Circuits, vol.24, no.2, pp.267-273, (1989 Apr.)

[126] B.H.Leung and S.Sutarja, Multibit Sigma-Delta A/D Converter Incorporating a Novel
Class of Dynamic Element Matching Techniques, IEEE Trans. Circuits and Systems
Part-II: Analog and Digital Signal Processing, vol.39, no.1, pp.35-51, (1992 Jan.)

[127] R.T.Baird, T.S.Fiez, Linearity Enhancement of Multi-Bit ΔΣ A/D and D/A Converters
Using Data Weighted Averaging, IEEE Trans. Circuits and Systems Part-II: Analog
and Digital Signal Processing, vol.42, no.12, pp.753-762, (1995 Dec.)

[128] M.Vadipour, Techniques for Preventing Tonal Behaviour of Data Weighted Averaging
Algorithm in Σ-Δ Modulators, IEEE Trans. Circuits and Systems Part-II, vol.47, no.11,
pp.1137-1144, (2000 Nov.)

[129] A.A.Hamoui and K.Martin, Linearity Enhancement of Multibit ΔΣ Modulators Using
Pseudo Data-Weighted Averaging, Proc. IEEE International Symp. Circuits and
Systems ISCAS02, pp.III 285-288, (2002 May)

[130] K.D.Chen and T.H.Kuo, An Improved Technique for Reducing Baseband Tones in
Sigma-Delta Employing Data Weighted Averaging Algorithms without Adding Dither,
IEEE Trans. Circuits and Systems Part-II, vol.46, no.1, pp.53-68, (1999 Jan.)

[131] R.K.Henderson and O.Nys, Dynamic Element Matching Techniques with Arbitrary
Noise Shaping Function, Proc. IEEE Int. Symp. Circuits and Systems ISCAS96,
pp.293-296, (1996 May)

[132] X.M.Gong, An Efficient Second-Order Dynamic Element Matching Technique for a
120 dB Multi-Bit Delta-Sigma DAC, Audio Eng. Soc. Convention Paper 5124, 108th
AES Convention, Paris, (2000 Feb.)

[133] R.W.Adams, Data Directed Scrambler for Multi-Bit Noise Shaping D/A Converters,
U.S.Patent no. 5,404,142, (1995 Apr.)

[134] I.Galton, Spectral Shaping of Circuit Errors in Digital-to-Analog Converters, IEEE
Trans. Circuits and Systems Part-II: Analog and Digital Signal Processing, vol. 44, no.
10, pp. 808-817, (1997 Nov.)

[135] J.Welz, I.Galton, E.Fogleman, Simplified Logic for First-Order and Second-Order
Mismatch-Shaping Digital-to-Analog Converters, IEEE Trans. Circuits and Systems
Part-II: Analog and Digital Signal Processing, vol.48, no.11, pp.1014-1028, (2001
Nov.)

[136] E.Fogleman, J.Welz, I.Galton, An Audio ADC Delta-Sigma Modulator with 100-dB
Peak SINAD and 102-dB DR Using a Second-Order Mismatch-Shaping DAC, IEEE J.
Solid-State Circuits, vol. 36, no. 3, pp.339-348, (2001 Mar.)

[137] E.N. Aghdam, P. Benabes, Higher Order Dynamic Element Matching by Shortened
Tree-Structure in Delta-Sigma Modulators, Proceedings of the 2005 European
Conference Circuit Theory and Design, vol.1, pp.I/201- I/204, (2005 Sept.)

[138] R.Schreier, B.Zhang, Noise-Shaped Multibit D/A Converter Employing Unit
Elements, Electronics Letters, vol.31, no.20, pp.1712-1713, (1995 Sept.).

[139] J.A.Schoeff, An Inherently Monotonic 12 Bit DAC, IEEE J. Solid State Circuits,
vol.14, no.6, pp.904-911, (1979 Dec.)

[140] R.Adams, K.Nguyen, K.Sweetland, A 112dB SNR Oversampling DAC with
Segmented Noise Shaped Scrambling, Audio Eng. Soc. Convention Paper 4774, 105th
AES Convention, San Francisco, (1998 Sept.)

[141] J.Steensgaard-Madsen, High Performance Data Converters, Ph.D. thesis, Technical
University of Denmark Dept. Inf. Tech., (1999)

[142] A.Fishov, E.Siragusa, J.Welz, E.Fogleman, I.Galton, Segmented Mismatch-Shaping
D/A Conversion, Proc. IEEE International Symp. Circuits and Systems ISCAS02,
(2002 May)

[143] M.O.J.Hawksford, Digital-to-Analog Converter with Low Intersample Transition
Distortion and Low Sensitivity to Sample Jitter and Transresistance Amplifier Slew
Rate, J. Audio Eng.Soc., vol.42, no. 11, pp.901-917, (1994 Nov.).

[144] R.Adams, K.Nguyen, K.Sweetland, A 113dB SNR Oversampling DAC with
Segmented Noise-Shaped Scrambling, IEEE J. Solid State Circuits, vol.33, no.12,
pp.1871-1878, (1998 Dec.)

[145] M.Clara, W.Klatzer, A.Wiesbauer, D.Straeussnigg, A 350MHz Low-OSR Delta-
Sigma Current-Steering DAC with Active Termination in 0.13µm CMOS, Proc. IEEE
International Solid-State Circuits Conference ISSCC 2005, pp.118-588, (2005 Feb.)

[146] D. Su and B. Wooley, A CMOS Oversampling D/A Converter with a Current-Mode
Semi-Digital Reconstruction Filter, IEEE J. Solid-State Circuits, vol. 28, pp. 1224-
1233, (1993 Dec.)

[147] W.R.Bennett, New Results in the Calculation of Modulation Products, Bell System
Tech. J., vol. 12, pp. 228-243, (1933 Apr.)

[148] B.D. Josephson, Pulse Width Modulated Audio Amplifiers, Wireless World, letter to
the editor, vol.71, pp. 335-336, (1965 July).

[149] J.D.Martin, Theoretical Efficiencies of Class-D Power Amplifiers, Proc. IEE.,
vol.117, no.6, pp.1089-1090, (1970)

[150] Y. Mitsuhashi, Mathematical Analysis of a Pulse Width Modulation Digital to Analog
Converter, J. Audio Eng.Soc., vol.31, no.3, pp.135-138, (1983 Mar.)

[151] M.Sandler, Towards a Digital Power Amplifier, AES Convention Paper 2135, 76th

[152] List of PWM amplifiers, http://www.avsforum.com/avs-vb/showthread.php?t=594707

[153] A.Hewitt, A Simple Approximation for the Distortion in a Pulse-Width-Modulation
Digital-to-Analogue Converter, Audio Eng.Soc. Convention Paper 4598, 103rd AES
Convention, New York (1997 Sept).

[154] J.M.Goldberg, M.B.Sandler, Noise Shaping and Pulse-Width Modulation for an All-
Digital Audio Amplifier, J.Audio Eng.Soc., vol.39, no.6, pp.449-460, (1991 June)

[155] K.Nielsen, Audio Power Amplifiers with Energy Efficient Power Conversion, Ph.D.
Thesis, Tech.University of Denmark, Lyngby, (1998 Apr.)

[156] E.Gaalaas, B.Y.Liu, N.Nishimura, R.Adams, K.Sweetland, Integrated Stereo Class-
D Amplifier, IEEE J. Solid State Circuits, vol.40, no.12, (2005 Dec.).

[157] R.Khoini-Poorfard, D.A.Johns, On the Effect of Comparator Hysteresis in
Interpolative Modulators, Proc. Int. Symp. Circuits and Systems ISCAS93, vol.2,
pp.1148-1151, (1993 May)

[158] T.S.Doorn, E.van Tuijl et al., An Audio FIR-DAC in a BCD Process for High-Power
Class-D Amplifiers, Proc. 31st European Solid State Circuits Conference ESSCIRC
2005, pp.459-462, (2005 Sept.)

[159] T.Rueger et al., A 110dB Ternary PWM Current-Mode Audio DAC with Monolithic
2Vrms Driver, Int. Solid State Circuits Conference Digest of Tech. Papers ISSCC
2004, vol.1, pp.372-533, (2004 Feb.)

[160] D.Reefman, J.v.d.Homberg, E.van Tuijl, et al., A New Digital-to-Analog Converter
Design Technique for HiFi Applications, Audio Eng. Soc. Convention Paper 5846,
114th AES Convention, Amsterdam, (2003 March)

[161] P.Kiss, J.Arias, D.Li, and V.Boccussi, Stable High-Order Delta-Sigma DACs, IEEE
Trans. Circuits and Systems Part I: Regular Papers, vol.51, no.1, (2004 Jan.)

[162] Y.Cheng, C.Petrie, B.Nordick, D.Comer, Multibit Delta-Sigma Modulator with Two-
Step Quantization and Segmented DAC, IEEE Trans. Circuits and Systems Part II:
Express Briefs, vol.53, no.9, pp.848-852, (2006 Sept.)

[163] H.J.Schouwenaars et al., A Monolithic Dual 16-bit D/A Converter, IEEE J. Solid
State Circuits, vol.21, no.3, pp.424-429, (1986 Jun.)

[164] P.J.A.Naus et al., A CMOS Stereo 16-bit D/A Converter for Digital Audio, IEEE J.
Solid State Circuits, vol.22, no.3, pp.390-395, (1987 Jun.)

[165] J.Sneep et al., A Bit-Stream Digital to Analog Converter with 18-b Resolution, IEEE
J. Solid State Circuits, pp.1757-1763, vol.26, no.12, (1991 Dec.)

[166] S.Nakano et al., A 117dB D-Range Current-Mode Multi-Bit Audio DAC for PCM and
DSD Audio Playback, Audio Eng. Soc. Convention Paper 5190, 109th AES
Convention, Los Angeles, (2000 Sept.)

[167] R.H.Walden, Analog-to-Digital Converter Survey and Analysis, IEEE J. Selected
Areas in Communications, vol.17, pp.539-550, (1999 Apr.)

[168] AES17-1998 (r2004): AES Standard Method for Digital Audio Engineering -
Measurement of Digital Audio Equipment (Revision of AES17-1991), (1998)

[169] P.Duhamel, M.Vetterli, Fast Fourier Transforms: A Tutorial Review and State of the
Art, J. Signal Processing, vol.19, no.4, pp.259-299 (1990 Apr.)

[170] J.W.Gibbs, Fourier Series, Nature 59, 200 (1898) and 606 (1899).

[171] R.B.Blackman and J.W.Tukey: Particular Pairs of Windows, The Measurement of
Power Spectra, From the Point of View of Communications Engineering. New York:
Dover, (1959)

[172] F.J.Harris, On the use of Windows for Harmonic Analysis with the Discrete Fourier
Transform, Proc. of the IEEE, vol.66, no.1, pp.51-83, (1978 Jan.)

[173] Maxim Application Note 1040, Coherent Sampling vs. Window Sampling, available
on-line from: http://www.maxim-ic.com/an1040

[174] J.Blair, Histogram Measurement of ADC Nonlinearities Using Sine Waves, IEEE
Trans. Instrumentation and Measurement, vol.43, pp.373-383, (1994 June)
