Fpgas 29032016
– Gates
– Flip-Flops
– Interconnect (or routing)
Digital Logic Landscape
The following slides provide a history of the various logic devices
[Figure: design capacity (gates) versus development time (hours, days, weeks, months, years). Devices in order of increasing capacity and development time: Standard Logic; Programmable Logic (SPLD, CPLD, FPGA); Gate Array; Standard Cell; Full Custom.]
Digital Logic History - PLDs
• Developed in the late 70s
• Major player today: Lattice
• 50 – 200 gates
Packaging: A very common low-cost IC package with pins on all 4 sides is called a Plastic-Leaded Chip Carrier (PLCC).
[Figure: PLD structure showing interconnect, gates, and flip-flops]
PLD Example
Digital Logic History - Gate Array
Definition: A pre-built IC consisting of a regular arrangement of gates and interconnect (routing), where the interconnect is modified to achieve a customer’s desired functions.
– The customer designs the behaviors/functions
– The vendor manipulates/changes the metal interconnect to arrive at the customer’s specified functions (that is, the vendor hooks up the gates)
– Sometimes called an Uncommitted Logic Array (ULA)
– 1,000,000+ gates
Packaging Enhancement: To increase the number of I/Os (Inputs/Outputs), the pin thickness and spacing (pitch) are dramatically reduced in this Thin Quad FlatPack (TQFP) package.
[Figure: Gate Array in a TQFP package; gate array structure showing interconnect and gates]
Gate Array
• The ultimate building tool set for digital designers
• Advantages
Digital Logic History - Standard Cell
• This device features a series of customized “cells”
– Each cell is optimized for its “standard” function
• Cells are chosen from the Standard Cell vendor's library, customized, and connected to the other cells and the routing on the part.
• There are no standard layers to the device; each layer is a unique design
• Advantages:
– More optimized die size compared to GA
– Cheaper device price compared to GA
– Can add analog functions
• Disadvantages:
– Extremely high NRE charges (up to $1M)
– Requires 250k+ units/year
– Much longer development time
– Much higher risk (re-spins, etc.)
CPLDs, FPGAs
[Figure: design capacity (gates) versus development time chart, repeated from the Digital Logic Landscape slide. Programmable Logic covers SPLD, CPLD, and FPGA.]
Digital Logic History - CPLD
Complex Programmable Logic Device
Definition: A CPLD contains multiple PLD blocks whose inputs and outputs are connected together by a global interconnection matrix.
[Figure: CPLD structure showing interconnect and macrocells]
CPLDs
• Vendors: Altera, Lattice, Cypress, Xilinx
• 2 Primary Technologies
– EEPROM (old technology)
– FLASH (technology used by Xilinx CPLDs)
Digital Logic History - FPGA
Field Programmable Gate Array
Definition:
• An array of “logic cells” surrounded by substantial routing, both of which are under the user’s control
[Figure: FPGA structure showing interconnect and logic cells]
FPGA Building Blocks
An Early Xilinx CLB
Digital Logic History
FPGA - Field Programmable Gate Array
2 types of FPGAs
• Reprogrammable (SRAM-based)
[Figure: SRAM-based logic cell with a LUT (contents 0110, inputs A and B) and a flip-flop, showing gates, horizontal interconnect, and the used interconnect path]
Basic Concepts - I/Os
Inputs and Outputs
Basic Concepts
Propagation Delay (tPD)
[Figure: propagation delay between points “A” and “B”]
Basic Concepts
Path Delay
Definition: The sum of all the gate and net delays from starting to ending point.
[Figure: path from “A” to “B” with a fanout of 2 to “C”; annotated delays tCQ = 2.5ns, tPD = 1ns, tPD = 2ns, tPD = 0.5ns, tPD = 2ns]
fMAX = 1 / (longest flip-flop path delay)
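As a worked example, the delays annotated on the path delay figure can be summed and turned into a clock frequency. This is a minimal sketch in Python, not part of the original slides. It assumes that all five annotated delays lie on one flip-flop-to-flip-flop path and it ignores setup time, since the figure itself is not available. The function names are illustrative.

```python
# Delays taken from the figure annotations, in nanoseconds.
# Assumption: all of them belong to a single flip-flop-to-flip-flop path.
T_CQ_NS = 2.5                      # clock-to-Q delay of the launching flip-flop
T_PD_NS = [1.0, 2.0, 0.5, 2.0]     # gate and net propagation delays on the path


def path_delay_ns(t_cq, t_pds):
    """Sum of the clock-to-Q delay and every gate/net delay on the path."""
    return t_cq + sum(t_pds)


def fmax_mhz(longest_path_ns):
    """fMAX = 1 / (longest path delay); 1 / ns is GHz, so scale to MHz."""
    return 1000.0 / longest_path_ns


delay = path_delay_ns(T_CQ_NS, T_PD_NS)
print(f"path delay = {delay} ns, fMAX = {fmax_mhz(delay)} MHz")
```

Under these assumptions the path delay is 8 ns, which corresponds to 125 MHz. A real timing report would also account for setup time and clock skew.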
Slice
• LUT
• Flip flop
[Figure: slice with two 4-input LUTs (inputs I0 to I3, output O), each driving a D flip-flop with SET, CE, and RST]
Typical 4 Input LUT
• 4 Inputs
• One Output
[Figure: D flip-flop with D, Q, and RST]
• Input Set
• Input Reset
• Output Q
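A 4-input LUT can be thought of as a 16-entry truth table addressed by its inputs. The Python sketch below is an illustrative model only. The names `make_lut4` and `lut4_read`, and the choice of I0 as the least significant address bit, are assumptions made for this example rather than anything stated on the slides.

```python
def make_lut4(func):
    """Precompute the 16-entry table for a 4-input Boolean function."""
    table = []
    for index in range(16):
        i0 = index & 1
        i1 = (index >> 1) & 1
        i2 = (index >> 2) & 1
        i3 = (index >> 3) & 1
        table.append(func(i3, i2, i1, i0) & 1)
    return table


def lut4_read(table, i3, i2, i1, i0):
    """Look up the single output bit selected by the four inputs."""
    return table[(i3 << 3) | (i2 << 2) | (i1 << 1) | i0]


# Two example configurations: a 4-input AND and a 2-input XOR on I1, I0.
and4 = make_lut4(lambda i3, i2, i1, i0: i3 & i2 & i1 & i0)
xor2 = make_lut4(lambda i3, i2, i1, i0: i1 ^ i0)

print(lut4_read(and4, 1, 1, 1, 1), lut4_read(and4, 1, 0, 1, 1))
print(xor2[:4])
```

The point of the model is that any function of four inputs costs the same one LUT; only the stored table changes.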
Making the Most of Controls
Dedicated Flip-Flop controls make designs smaller and faster.
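1 level of logic - fast and small
Up to 4 data inputs plus 3 controls (SET, CE, RST)
[Figure: one LUT4 (inputs I0 to I3, output O) driving the D input of a flip-flop with SET, CE, and RST; tSU marked once]
2 levels of logic - significantly slower and twice the size (and cost)
[Figure: two LUT4s in series, joined by a net, driving the D input of the flip-flop; tSU marked twice]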
process (clk, reset)
begin
  if reset = '1' then                      -- reset
    data_out <= '0';
  elsif clk'event and clk = '1' then
    if enable = '1' then                   -- enable
      if force_high = '1' then             -- set
        data_out <= '1';
      else
        data_out <= a and b and c and d;   -- logic
      end if;
    end if;
  end if;
end process;
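The priority order that the process describes (reset, then enable, then force_high, then the AND of the data inputs) can be sketched as a small Python model. This is an illustration of the behaviour only, not a replacement for the VHDL. The function name `next_data_out` is hypothetical, and the asynchronous reset is simplified to a level check made at each call.

```python
def next_data_out(q, reset, enable, force_high, a, b, c, d):
    """Value of data_out after a rising clock edge, given the current value q."""
    if reset:              # reset wins over everything else
        return 0
    if not enable:         # clock enable low: the flip-flop holds its value
        return q
    if force_high:         # only honoured while enable is high
        return 1
    return a & b & c & d   # the 4-input logic function


print(next_data_out(0, 0, 1, 0, 1, 1, 1, 1))   # enabled, all data inputs high
print(next_data_out(0, 0, 0, 1, 1, 1, 1, 1))   # force_high ignored while disabled
```

In this description force_high only takes effect while enable is high, so five signals (a, b, c, d, and force_high) feed the flip-flop's D input, which is more than a single 4-input LUT can accept.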
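Cell Usage :
# BELS : 2
# LUT2 : 1
# LUT4 : 1
# FlipFlops/Latches : 1
# FDCE : 1
TWICE as big as it should be and slow!
[Figure: synthesized result. A LUT2 feeds a LUT4 (signals a, b, c, d, and force_high), which drives the D input of an FDCE flip-flop with CE = enable, CLR = reset, and Q = data_out; the PRE input is also shown.]
Solution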
[Figure: slices (each with LUT, carry logic, and D flip-flops with PRE, CE, and CLR) connected to switch matrices, with clock and data connections]
Fabric Routing
• Connections between CLBs and other resources use the fabric routing resources
• Routing lines connect to the switch matrices adjacent to the resources
• Routes connect resources vertically, horizontally, and diagonally
• Routes have different spans
– Horizontal: Single, Dual, Quad, Long (12)
– Vertical: Single, Dual, Hex, Long (18)
– Diagonal: Single, Dual, Hex
Different Architectures:
6 Input LUTs
• 6-input LUT can be two 5-input LUTs with common inputs
• Minimal speed impact to a 6-input LUT
• One or two outputs
[Figure: 6-LUT with inputs A1 to A6, containing a 5-LUT with output O5]
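The idea of one 6-input LUT acting as two 5-input LUTs with common inputs can be sketched with a 64-bit table. This Python model is illustrative only. The output name O6, the bit ordering of the table, and the helper names `lut6_2` and `pack_init` are assumptions made for the example; the slide itself only labels inputs A1 to A6 and output O5.

```python
def lut6_2(init, a6, a5, a4, a3, a2, a1):
    """Return (o6, o5) for a 64-bit table `init` (bit 0 = all inputs low)."""
    low_index = (a5 << 4) | (a4 << 3) | (a3 << 2) | (a2 << 1) | a1
    lower = (init >> low_index) & 1           # 5-LUT stored in bits 0..31
    upper = (init >> (32 + low_index)) & 1    # 5-LUT stored in bits 32..63
    o6 = upper if a6 else lower               # A6 selects between the halves
    o5 = lower                                # second output: lower 5-LUT only
    return o6, o5


def pack_init(lower_func, upper_func):
    """Build the 64-bit table from two functions of the five shared inputs."""
    init = 0
    for index in range(32):
        bits = [(index >> n) & 1 for n in range(5)]   # a1 .. a5
        init |= (lower_func(bits) & 1) << index
        init |= (upper_func(bits) & 1) << (32 + index)
    return init


# Lower half: 5-input AND. Upper half: 5-input parity (XOR).
init = pack_init(lambda bits: int(all(bits)), lambda bits: sum(bits) & 1)
print(lut6_2(init, 1, 1, 1, 1, 1, 1))
```

With A6 held high, O6 delivers the upper function and O5 the lower one, so two independent functions of the same five inputs come out of one LUT. With A6 used as a data input, the same table implements a single 6-input function.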
Different Architectures:
Slice Structure with 4 LUTs
• Four six-input Look Up Tables (LUT)
• Wide multiplexers
• Carry chain
• Four flip-flop/latches
[Figure: slice with LUT/RAM/SRL blocks]
More Detailed Look at Flip Flops
• All flip-flops are D type
• All flip-flops have a single clock input (CLK)
[Figure: D flip-flops with D, CE, CK, and SR pins. A second diagram shows flip-flops clocked by clkA with D tied to 0 and SR configured as asynchronous, SRVAL=1, generating rst_clkA.]
Synchronous Reset
• A synchronous reset will not take effect until the first active clock edge after the assertion of the RST signal
• The RST pin of the flip-flop is a regular timing path endpoint
• The timing path ending at the RST pin will be covered by a PERIOD constraint on the clock
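The difference in when a reset takes effect can be sketched with a tiny event-based model in Python. This is a behavioural illustration only; the function names `step` and `run` and the event encoding are assumptions made for the example.

```python
def step(q, d, rst, clock_edge, synchronous):
    """Advance one event; return the new flip-flop output."""
    if synchronous:
        if clock_edge:                # RST is only sampled on the clock edge
            return 0 if rst else d
        return q
    if rst:                           # asynchronous: acts as soon as asserted
        return 0
    return d if clock_edge else q


def run(synchronous, events, q=0):
    """Apply (d, rst, clock_edge) events in order and record Q after each."""
    trace = []
    for d, rst, clock_edge in events:
        q = step(q, d, rst, clock_edge, synchronous)
        trace.append(q)
    return trace


events = [
    (1, 0, True),    # clock edge, load 1
    (1, 1, False),   # RST asserted between clock edges
    (1, 1, True),    # first active clock edge after the assertion
    (1, 0, True),    # RST released, load 1 again
]
print("synchronous :", run(True, events))
print("asynchronous:", run(False, events))
```

In the synchronous case Q stays high until the clock edge that follows the assertion of RST, while in the asynchronous case it drops as soon as RST is asserted.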
Single Port (1 read/write port; Read OR write in 1 cycle)
– 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36
– 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18
Simple Dual Port (1 read port and 1 write port; Read AND write in 1 cycle)
– 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36, 512x72
– 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18, 512x36
SelectI/O
SelectI/O Allows Connection Directly to External Signals of Varied Voltages & Thresholds
[Figure: external signals at 5.0V, 3.3V, 2.5V, and 1.8V]
4 System Interfaces
SelectI/O
• Allows Connection & Use of a Wide Variety of Devices
– Processors, Memory, Bus Specific Standards, Mixed Signal...
• Provides Industry Standard IEEE/JEDEC I/O Standards
• Maximizes Speed/Noise Tradeoff - Use Only What is Needed
• Can Connect to or Create High Performance Backplanes
– PCI, GTL+, HSTL
– DIY - Virtex Based Backplane Design in Progress
• Define I/O by Simply Placing Desired Input And/Or Output Buffers Into the Design
– Special IBUF and OBUF Components Provided in Schematic Based and HDL Based Design Flows
– For Example: SSTL3, Class I Output Buffer - OBUF_SSTL3_I
Simplified IOB Structure
• Fast I/O Drivers
[Figure: simplified IOB with DFF/LATCH elements, OBUF_SSTL3_I output buffer, and IBUF_SSTL3_I input buffer]
7 Series Slice Structure
• Four six-input Look Up Tables (LUT)
• Wide multiplexers
• Carry chain
• Four flip-flop/latches
[Figure: slice with LUT/RAM/SRL blocks]
7-Series I/O Block Diagram
[Figure: I/O block diagram. Logical Resources: ILOGIC/ISERDES, IDELAY, OLOGIC/OSERDES, and ODELAY, in Master and Slave blocks. Electrical Resources: P and N pads with LVDS and Termination.]
7 Series FPGAs DSP
• 7 series FPGAs DSP slice 100% based on Virtex-6 FPGA DSP48E1
• 25x18 multiplier
• 25-bit pre-adder
• Flexible pipeline
• Cascade in and out
• Carry in and out
• 96-bit MACC
• SIMD support
• 48-bit ALU
• Pattern detect
• 17-bit shifter
• Dynamic operation (cycle by cycle)
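How the pre-adder, multiplier, and accumulator listed above fit together can be sketched with a simplified arithmetic model. This Python sketch is illustrative only. It uses the bit widths from the list (25-bit pre-adder, 25x18 multiplier, 48-bit result) and assumes two's complement wrap-around at each stage. Pipelining, cascading, SIMD, and pattern detection are not modelled, and the function names are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a signed two's complement range of `bits` bits."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value


def macc_step(p, a, d, b):
    """One multiply-accumulate: P <- P + (A + D) * B, wrapped to 48 bits."""
    pre = wrap_signed(a + d, 25)              # 25-bit pre-adder
    product = pre * wrap_signed(b, 18)        # 25x18 multiplier
    return wrap_signed(p + product, 48)       # 48-bit accumulator


def dot_product(a_values, d_values, b_values):
    """Accumulate (a + d) * b over the three input sequences."""
    p = 0
    for a, d, b in zip(a_values, d_values, b_values):
        p = macc_step(p, a, d, b)
    return p


print(dot_product([1, 2, 3], [0, 0, 0], [4, 5, 6]))
print(dot_product([1, 2, 3], [1, 1, 1], [4, 5, 6]))
```

The pre-adder is what makes structures such as symmetric filters cheaper, because two samples that share a coefficient are added before the single multiplication.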
Highly Capable, Dedicated DSP Logic in Every 7 Series FPGA
Programmable Systems Integration
7-Series Gigabit Transceivers
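[Figure: transceiver block diagram. The Tx and Rx paths each pass through a PCS and a PMA, and connect to the FPGA Fabric through the Fabric Interface.]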