1
Transconductance is set largely by the process and the device structure: gate length,
material doping and geometric factors. It represents the gain of a particular process
or transistor. You’ve probably come across it before, but we’ll see that it is
intrinsically linked to the switching speed of the transistor. A high transconductance is
better than a low one, but as we’ll see later there are trade-offs in getting a high
transconductance, particularly off-state leakage.
Transconductance is a key parameter when comparing transistors, especially
transconductance per unit gate width (S/mm or S/µm): a higher transconductance per unit
width means a narrower transistor can give the same gain while taking up less area.
Do not confuse gate width with gate length. Gate length (Lg) is the distance the carriers
travel under the gate from the source region to the drain region and is a key
performance-defining quantity. Lg is often used as a shorthand value to summarise the key
technology and performance-defining aspects of a processor and is closely related to the
switching speed, and thus clock speed and overall performance. There is, however, much,
much more involved in high-performance architectures than just the minimum gate length,
and I hope to guide you through an introductory understanding of this. The field is
moving so fast that there are many deep and specialised areas, and this course aims to
give you a foundation should you wish to get further involved.
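To make the point about transconductance per unit gate width concrete, here is a minimal sketch. It assumes transconductance scales linearly with gate width, and the target value and the two S/mm figures are hypothetical numbers chosen only for illustration, not data for any real process.

```python
def required_width_mm(gm_target_s, gm_per_width_s_per_mm):
    """Gate width (mm) needed to reach a target transconductance (S),
    assuming gm scales linearly with gate width."""
    return gm_target_s / gm_per_width_s_per_mm

gm_target = 0.01  # hypothetical target: 10 mS

# Two hypothetical processes with different gm per unit width
w_a = required_width_mm(gm_target, 0.5)  # 0.5 S/mm
w_b = required_width_mm(gm_target, 1.0)  # 1.0 S/mm

print(f"process A needs {w_a * 1000:.0f} um of gate width")
print(f"process B needs {w_b * 1000:.0f} um of gate width")
```

Doubling the transconductance per unit width halves the gate width needed for the same gain, which is where the area saving comes from.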
2
3
4
Uyemura is a slightly cheaper and shorter alternative to VLSI Systems – A logic, Circuit
and System Perspective
5
This is a more detailed alternative to the previously recommended book. I include it
because I think it may be easier for you to find copies of it. It seems, and indeed is, a
lot, but I’ll be skimming over most of it and focussing on the principles and trade-offs
rather than on the details. The key details and calculations we can cover in the
examples classes and the CAD work.
6
CMOS stands for complementary metal oxide semiconductor: MOS devices connected in a
complementary way. Complementary in this context means that when one transistor is
ON (channel open, in a relatively low resistance state) the other is OFF (channel closed,
or in a very high resistance state). This complementary switching controls and
reduces power dissipation unless the switching between states becomes very fast. I’ll
sometimes use the word impedance in place of resistance; strictly speaking it’s the
channel resistance to which I’m referring here.
These days you would nearly always use a digital design approach and, more often than
not, plug-and-play building blocks, but this was not always the case. As little as 40
years ago most electronics featured analogue circuits, which were prone to failing, hard
to design and consumed high powers (e.g. a linear class-A amplifier is only 30% efficient
(theoretically 50% max), with 70% dissipated as heat continuously!). New applications
often needed completely new designs and were hostage to component tolerances and
degradation, i.e. unreliable. Digital designs won out through their flexibility, their
very low power dissipation due to the complementary nature of the transistor switching,
their reliability (in no small part due to the noise margins, switching behaviour and
Boolean logic) and their ability to be mass produced. It would turn out later that they
also benefited from what has become Moore’s Law, where year on year the complexity
would rise and the cost would drop nearly exponentially. Now a designer would use
pre-existing digital chips, perhaps some specialist chips from time to time, and then
rely on analogue-to-digital and digital-to-analogue converters to get the real-world
(analogue) signals in and out of the digital domain, provided the ADCs and DACs are fast
enough. Once in the digital domain, algorithms can be written and rewritten to realize
any required functionality.
The first machine age was essentially machines that gave mechanical advantage over
human and animal power. The second machine age is where machines can now give
intellectual advantage: access to unheard-of amounts of data, global communications, and
things like computer and finite element modelling.
7
This is a key slide, detailing the key strengths of CMOS logic, upon which this lecture
course will build and develop. These bullet points essentially explain why CMOS
has come to dominate and underpin our modern lives. Although for the same area a
bipolar npn device may be faster (remember the base region is very thin and the current
flows from the top down through these layers rather than across as in a field effect
device), bipolars do suffer other issues, such as high turn-on voltages, that make them
less desirable as a logic switch than an EMOSFET. Nevertheless, as we’ll see later, the
npn bipolar transistor can be very good for driving large loads or fan-out (capacitances).
Note the HP calculator – in the Apollo missions this was sewn into a leg of the
astronaut’s suit so that if the on-board computers failed they could use this in place of a
slide-rule to try to get them home – this was the mid-50’s to 60’s after all!
8
This funnel is a not-too-serious illustration (i.e. not literal or strictly accurate) but
a more or less representative picture of what goes into making the ICs that most people
now take for granted. It is hugely simplified, but essentially the sum of these
areas has been refined by about 100 billion hours of combined human effort. Nothing
else in the history of humankind has benefited from this much focussed development,
all leading, in my opinion, to the “evolution” of a new sentient lifeform in the next
30 years.
9
“In the trade” simply translates to “in the area or field of discussion”. These modules
consist of registers and shift registers, memory, buffers, arithmetic and logic units
(ALUs) and groups of combinational logic.
Hierarchy is simply the systematic approach of going from the top down: from the
top-level complex function to ever simpler functional blocks, which are then reduced to a
collection of simpler interconnected blocks (functions) and eventually to a multiplicity
of vastly interconnected transistors (at the silicon surface).
Regularity is the repeated use of well understood and optimised functional blocks,
which often even look similar and give the chip the appearance of a city (with different
regions) as seen from space. This comparison, which we will use again later, applies more
to modern cities such as down-town Manhattan, where the buildings and roads (in our case
data and power supply lines) are laid out in a regular linear North-South and East-West
grid pattern.
10
Approximate radius of an electron ~10E-16 m; atomic radius of Si ~0.132 nm; nearest
neighbour separation 0.235 nm; lattice constant (the physical separation of the unit
cells, i.e. how far you would need to displace the unit cell in any direction to repeat
the structure) 0.543 nm. The atomic density on the (100) plane is 6.78E14 /cm2, so the
average separation in the (100) plane is 0.384 nm, giving around 26 silicon atoms per
10 nm distance.
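The arithmetic behind the last two figures can be checked with a short sketch. It assumes each surface atom occupies an equal square patch of the (100) plane, so the average spacing is one over the square root of the areal density; the variable names are my own.

```python
import math

# Value quoted in the notes above
areal_density_per_cm2 = 6.78e14  # Si atoms per cm^2 on the (100) plane
CM_TO_NM = 1e7                   # 1 cm = 1e7 nm

# Assumption: each atom owns an equal square patch of surface,
# so average spacing = 1 / sqrt(areal density)
spacing_nm = CM_TO_NM / math.sqrt(areal_density_per_cm2)

# Number of atoms lying along a 10 nm line at that spacing
atoms_per_10nm = 10.0 / spacing_nm

print(f"average spacing on (100): {spacing_nm:.3f} nm")
print(f"atoms along 10 nm:        {atoms_per_10nm:.0f}")
```

This reproduces the 0.384 nm spacing and roughly 26 atoms per 10 nm quoted above.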
11
This is a somewhat simplified top-down flow of the steps required to make a
microprocessor, and indeed any other integrated circuit (IC). You are more likely to be
working from the top down, but if you were a process engineer, semiconductor physicist or
device engineer you would spend your life working with materials, semiconductor
physics, layout etc., building next-generation transistors and leaf-cells (logic
primitives) from the bottom up. People at either extreme of the IC design process would
rarely be deeply knowledgeable about the opposite end and can operate perfectly well on a
need-to-know basis. This is the whole point of systematically capturing and codifying
complexity and interdependences, and ideally building this into advanced computer
design tools via what are called design-rules (an attempt to limit choices and force the
user into following a structured design process).
12
Again, this is just another way of representing the design hierarchy, in the form of
nested shells with the silicon transistor junctions at the centre. The silicon level and
the related device physics is by far the most physically detailed and complex level, with
100’s of billions of transistors and billions of interconnected logic gates, and is
really too complex to comprehend in its entirety. We therefore use higher-level views
with higher levels of abstraction (vagueness, like zooming out from a fractal coastline
and losing detail as you move to lower zoom levels). Teams tend to focus on the design
and optimisation of their own modules, sometimes with very little awareness of distant
modules. This is also a way for a company to maintain technological advantages over
others and control the leakage of design info.
13
Silicon compilation was first described in 1979 by David L. Johannsen, under the
guidance of his thesis adviser, Carver Mead.[1] Johannsen, Mead, and Edmund K. Cheng
subsequently founded Silicon Compilers Inc. in 1981.
14
This is not unlike the 7-layer OSI (Open Systems Interconnection) network model, with
Application > Presentation > Session > Transport > Network > Data link > Physical.
15
This slide illustrates how transistor area has shrunk and how transistors have become
faster as a result. We will see later that, as a very convenient consequence of reducing
the size, the power dissipation per transistor decreases by the square of the scaling
factor (S) AND the switching delay decreases by the scaling factor as well, making the
power-delay product S^3 better, whilst also allowing more gates and thus more processing
power per chip. Sadly, however, as we get down to <5nm gate lengths, which is only about
20 silicon atom spacings across (Lambda being only 10 atoms in length), we seem to be
approaching the limit of manufacturability.
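The scaling claim above can be sketched numerically. This is an idealised illustration only: it takes the stated relations at face value (power per transistor falls as S^2, switching delay falls as S) and uses normalised starting values and a hypothetical S of 2.

```python
def scale_transistor(power, delay, s):
    """Apply the idealised scaling relations quoted above for a scaling
    factor s > 1: power per transistor / s^2, switching delay / s."""
    return power / s**2, delay / s

# Hypothetical, normalised starting point
p0, d0 = 1.0, 1.0
s = 2.0  # illustrative scaling factor

p1, d1 = scale_transistor(p0, d0, s)

# Power-delay product improvement factor, expected to be s^3
pdp_improvement = (p0 * d0) / (p1 * d1)

print(p1, d1, pdp_improvement)  # 0.25 0.5 8.0
```

With S = 2 the power drops to a quarter, the delay halves, and the power-delay product improves by a factor of 8, i.e. S^3.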
On August 18, 2008, AMD, Freescale, IBM, STMicroelectronics, Toshiba, and the College
of Nanoscale Science and Engineering (CNSE) announced that they jointly developed and
manufactured a 22 nm SRAM cell, built on a traditional six-transistor design on a
300 mm wafer, which had a memory cell size of just 0.1 μm2.[1] The cell was printed
using immersion lithography.[2]
16
Only 10 or 20 years ago the sort of processing power now in your pocket and gaming
computer was the preserve of well-funded governments.
17
This figure shows a cut through a piece of silicon integrated circuit. All of the
transistor switching happens on the surface of the silicon. You can’t see it, but the
surface is rippled or wrinkled (like the radiator fins on the sides of a heat-sink) to
increase the surface area and allow more, smaller transistors to be made. The electrons
(carriers) travel along these layers (in a sheet about 20-200 atoms thick) near the
interface between the gate insulator and the silicon below.
Note the 10 levels of metal (light grey) interconnects and insulator (black) above the
silicon surface. These alternating layers are used to connect the transistors together,
forming simple logic gates (which we will study later), then, moving up a layer, to
connect these simple logic gates together.
The bottom leftmost coloured figure shows a conventional planar FET. The carriers
move along the yellow stripe, under the blue gate, to the drain side of the yellow stripe
(the yellow stripe denotes the source and drain diffusion (doping)). The middle coloured
figure shows a FinFET, where the silicon surface is now in the form of a raised fin and
the gate, along with the source and drain contacts, wraps around this fin. This
essentially allows a much greater surface area to be exposed, making the transistor
bigger in area (wider) without taking up more lateral area. This folded silicon surface
means more transistors can be squeezed onto a wafer, increasing processing power. The
limitation is removing the heat.
18
Most contemporary electronics designs are likely to use programmable devices (PICs,
microcontrollers, CPLDs and FPGAs) as the “glue logic” and other ASICs for specific
functions. The internal architecture of these programmable devices is such that memory,
logic functions and interconnects can be programmed at power-on, or in some devices the
programming is retained. The programmability requires different internal structures (lots
of programmable logic gates) from a fixed or dedicated IC, but apart from this they are
largely similar to other custom or application specific integrated circuits (ASICs).
These programmable elements can be based around floating gate transistors (able to
retain their programming) or SRAM-style gates, in which the programmed functionality is
uploaded externally or from a ROM block prior to operating. The flexibility (and
complexity) and functionality are still determined by the underlying hardware and the
technology upon which it is based.
19
Custom IC design tends to be the preserve of highly specialized markets where
performance is everything, e.g. medical, aerospace, military or high-performance systems:
state-of-the-art imaging or communications (ADCs, DACs, DSP), data capture, lowest
possible power drain (IoT), mixed signal (analogue and digital) and SoC.
Even for custom ICs, predesigned sub-blocks (functional blocks and logic gates) are used
where possible. These have been optimized for layout area, power consumption and
speed.
Multi-project wafers are a method for chip manufacturers to bring together a group of
completely independent designs (the multiple projects) into a series of production runs.
The designers of these projects, and indeed the IC manufacturers, know nothing of what’s
actually being designed. This is akin to separate, independent architects designing
different buildings or groups of buildings in a city to a set of rules. They share only
information about where the entrances to the buildings are, how many occupants there are
at what times of day and the utilities needed, not what the buildings will be used for.
They may also share some non-confidential information about restrictions on what not to
have next door: the architects maybe wouldn’t want a residential block next to a night
club if possible. The analogy is that highly sensitive low-threshold logic or amplifiers
may not want to be near a high-power input-output stage with high currents and thermal
gradients or temperature swings. A separate group of construction engineers and builders
then builds everything, including the roads and utilities, using a set of common
materials, building-blocks and rules.
Gate arrays and final-metal programming, using the analogy above, are akin to saying
you can have these pre-constructed units and it’s up to you how you use them and how
you want to configure the entrances and build the pathways between them. Needless to say
this is not the best use of the space or the most efficient arrangement, but it’s a lot
cheaper and you can move in faster.
20
A logic level 1 is a high voltage, determined by whatever the maximum logic rail voltage
swing is, i.e. VDD. Depending on the size of the transistors (constant field scaling,
which will be covered later), VDD could be 5V, 3.3V, 2.5V, 1.5V, 1V or, for the latest
highest-performance transistors, 0.5V. Such “high” voltages switch on the NMOS transistor
switches and hold off the PMOS transistors.
A logic level 0 would usually be 0V. Such “low” voltages switch on the PMOS transistor
switches and keep off the NMOS transistors, hence the complementary designation in
the term CMOS.
21
So long as the gate voltage on the PMOS transistor switch is at least a threshold voltage
lower than either the drain or the source, the PMOS transistor switch will be on
(channel open). For example, if the threshold voltage for the PMOS was -1V and the gate
voltage was, say, +2V, then provided the voltage on either the drain or source was
greater than +3V the PMOS transistor would be switched on with a low channel resistance,
i.e. the gate voltage is more than 1V below the source or drain voltage. For the NMOS
transistor switch, so long as the gate voltage is a threshold voltage above the source or
drain voltage it will be considered on and presenting a low channel resistance. For
example, if its threshold voltage was +1V and the source and drain were at, say, 2V, then
the NMOS would be off for gate voltages below 3V, but if the gate voltage was 3V or more
it would be on, with a low channel resistance or high channel conductance (g=1/r).
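The on/off conditions described above can be written as two small predicates. This is a switch-level idealisation only (a hard on/off at the threshold, with the boundary case counted as on); the function names and default thresholds are my own and simply mirror the ±1V examples in the notes.

```python
def pmos_is_on(v_gate, v_source, v_drain, v_th=-1.0):
    """PMOS conducts when the gate is at least |v_th| below the
    higher of its source/drain terminals (v_th is negative)."""
    return v_gate - max(v_source, v_drain) <= v_th

def nmos_is_on(v_gate, v_source, v_drain, v_th=1.0):
    """NMOS conducts when the gate is at least v_th above the
    lower of its source/drain terminals (v_th is positive)."""
    return v_gate - min(v_source, v_drain) >= v_th

# PMOS example from the notes: Vth = -1V, gate at +2V
print(pmos_is_on(2.0, 3.5, 0.0))  # one terminal above 3V -> True
print(pmos_is_on(2.0, 2.5, 0.0))  # neither terminal above 3V -> False

# NMOS example from the notes: Vth = +1V, source and drain at 2V
print(nmos_is_on(2.5, 2.0, 2.0))  # gate below 3V -> False
print(nmos_is_on(3.0, 2.0, 2.0))  # gate at 3V -> True
```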
22
If you remember back to first year, you represented logic functions in sum-of-products
form, e.g. a.b + a.c.d. The bottom AND gate represents the product part and the top OR
gate the sum, so by combining groups of AND and OR gates arbitrary logic functions
can be realized. From these come adders and multiplexers and, with the addition of memory
latches, state machines and decision making, which you have used PIC controllers to
do.
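As a reminder of how a sum-of-products expression maps onto AND and OR operations, here is a small sketch that evaluates the example a.b + a.c.d over every input combination. The function name `f` is arbitrary.

```python
from itertools import product

def f(a, b, c, d):
    """Sum of products a.b + a.c.d: two AND (product) terms
    combined by a single OR (sum)."""
    return int((a and b) or (a and c and d))

# Build the full truth table: 16 rows for 4 inputs
rows = [(a, b, c, d, f(a, b, c, d)) for a, b, c, d in product((0, 1), repeat=4)]

print("a b c d | f")
for a, b, c, d, out in rows:
    print(a, b, c, d, "|", out)

ones = sum(row[4] for row in rows)
print("input combinations giving 1:", ones)  # 5
```

The output is 1 only when a is 1 and either b is 1 or both c and d are 1, which is five of the sixteen rows.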
23
The output node Y can either be charged to whatever voltage logic 1 is, or discharged
through sw1 to 0V. We will see later that the speed of charging or discharging depends on
the channel resistance of the NMOS or PMOS transistor and the capacitance on the Y node
(the fan-out) through the simple RC time constant, or more precisely R_Channel × C_Load.
We will also see the factors that determine the channel cross-section (and so its
resistance) and the load capacitance, and thus the overall switching and clock timing.
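A rough feel for the numbers can be had from a first-order RC sketch. It treats the conducting channel as a fixed resistor charging a fixed load capacitance, which is a simplification, and the 10 kΩ and 10 fF values are hypothetical figures chosen only for illustration.

```python
import math

def rc_time_constant(r_channel_ohm, c_load_farad):
    """First-order time constant tau = R_channel * C_load, in seconds."""
    return r_channel_ohm * c_load_farad

def time_to_fraction(tau, fraction):
    """Time for a simple RC node to cover `fraction` (0..1, exclusive of 1)
    of its final voltage swing: t = -tau * ln(1 - fraction)."""
    return -tau * math.log(1.0 - fraction)

# Hypothetical values: 10 kOhm channel resistance, 10 fF load
tau = rc_time_constant(10e3, 10e-15)
t90 = time_to_fraction(tau, 0.9)

print(f"tau = {tau * 1e12:.0f} ps")
print(f"time to reach 90% of the swing = {t90 * 1e12:.0f} ps")

# Doubling the load capacitance (more fan-out) doubles the time constant
tau_double = rc_time_constant(10e3, 20e-15)
print(f"tau with doubled load = {tau_double * 1e12:.0f} ps")
```

In this model the node needs about 2.3 time constants to get within 10% of its final value, so either a larger load or a more resistive channel slows the switching in direct proportion.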
24
The final aspect before moving on is an appreciation that underlying all this CMOS logic,
switching and digital computing is the fact that there is a range of voltages
between the maximum of the logic rail voltage swing and some minimum that represent,
and will be interpreted correctly as, a logic level 1, and similarly a range of voltages
between 0 (zero) and some higher value that will be correctly interpreted as a
logic level zero. Between these voltage ranges we try to avoid making decisions, as the
result would be likely to be ambiguous or unstable. Sometimes (remember the RC
time constant) we need to wait or hold off making decisions (testing the logic state)
until the outputs and inputs have reached a stable value, or are near their final value,
which for true CMOS we refer to as the fully restored value, i.e. VDD or 0V.
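The idea of valid logic ranges with an undefined band between them can be sketched as a simple classifier. The 30% and 70% of VDD thresholds used here are assumptions made purely for illustration; real limits depend on the logic family and process.

```python
def interpret_level(v, vdd=1.0, low_frac=0.3, high_frac=0.7):
    """Classify a node voltage as logic 0, logic 1 or undefined (None).
    low_frac and high_frac are hypothetical thresholds as fractions of VDD."""
    if v <= low_frac * vdd:
        return 0
    if v >= high_frac * vdd:
        return 1
    return None  # in the ambiguous band: do not make a decision here

for v in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"{v:.1f} V ->", interpret_level(v))
```

A node that is still charging through the undefined band returns None, which corresponds to waiting until the signal has settled before testing its logic state.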
25
a) Raw wafer (can often be pre-doped p-type).
b) Thick protective coating of oxide insulator, after rigorous surface cleaning and
preparation.
c) Remove the thick protective insulator coating where we want to make the transistors.
d) Form a very thin, very high quality insulating layer over a carefully prepared,
atomically clean surface.
e) Cover with what will become the gate metal or polysilicon and, once patterned, the
self-aligned gate.
f) Pattern the gate by etching away the polysilicon or metal covering, leaving a patch
which will define the gate.
g) Diffuse in or implant the source and drain area doping, using the polysilicon gate
material as a self-defined mask.
h) Anneal the source and drain doping, i.e. “activate” the phosphorus impurities.
i) Cover the source and drain contact areas with a metal silicide to form a very low
resistance contact to the source, drain and indeed gate areas.
j) Pattern to define the contact areas.
k) Deposit the interconnect metal (could be aluminium or copper).
l) Pattern the first level metal to form the interconnected transistors and simple logic
gates.
Not shown: multiple over-layers of insulator, metal and insulator to interconnect the
logic blocks in ever more complex ways, forming functional blocks, latches, registers
etc., then inputs and outputs etc.
26