Define TAP_C_HANDSHAKE_LEN as DH_LEN+KEY_LEN+PK_PAD_LEN.
Define TAP_S_HANDSHAKE_LEN as DH_LEN+HASH_LEN.
The payload for a CREATE cell is an 'onion skin', which consists of
the first step of the DH handshake data (also known as g^x). This
value is encrypted using the "legacy hybrid encryption" algorithm
(see 0.4 above) to the server's onion key, giving a client handshake:
PK-encrypted:
Padding [PK_PAD_LEN bytes]
Symmetric key [KEY_LEN bytes]
First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN
bytes]
Symmetrically encrypted:
Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-
KEY_LEN)
bytes]
The payload for a CREATED cell, or the relay payload for an
EXTENDED cell, contains:
DH data (g^y) [DH_LEN bytes]
Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below>
Once the handshake between the OP and an OR is completed, both can
now calculate g^xy with ordinary DH. Before computing g^xy, both
parties
MUST verify that the received g^x or g^y value is not degenerate;
that is, it must be strictly greater than 1 and strictly less than p-1
where p is the DH modulus. Implementations MUST NOT complete a
handshake
with degenerate keys. Implementations MUST NOT discard other "weak"
g^x values.
(Discarding degenerate keys is critical for security; if bad keys
are not discarded, an attacker can substitute the OR's CREATED
cell's g^y with 0 or 1, thus creating a known g^xy and impersonating
the OR. Discarding other keys may allow attacks to learn bits of
the private key.)
Once both parties have g^xy, they derive their shared circuit keys
and 'derivative key data' value via the KDF-TOR function in 5.2.1.
5.1.4. The "ntor" handshake
This handshake uses a set of DH handshakes to compute a set of
shared keys which the client knows are shared only with a particular
server, and the server knows are shared with whomever sent the
original handshake (or with nobody at all). Here we use the
"curve25519" group and representation as specified in "Curve25519:
new Diffie-Hellman speed records" by D. J. Bernstein.
[The ntor handshake was added in Tor 0.2.4.8-alpha.]
In this section, define:
H(x,t) as HMAC_SHA256 with message x and key t.
H_LENGTH = 32.
ID_LENGTH = 20.
G_LENGTH = 32
PROTOID = "ntor-curve25519-sha256-1"
t_mac = PROTOID | ":mac"
t_key = PROTOID | ":key_extract"
t_verify = PROTOID | ":verify"
MULT(a,b) = the multiplication of the curve25519 point 'a' by the
scalar 'b'.
G = The preferred base point for curve25519 ([9])
KEYGEN() = The curve25519 key generation algorithm, returning
a private/public keypair.
m_expand = PROTOID | ":key_expand"
KEYID(A) = A
To perform the handshake, the client needs to know an identity key
digest for the server, and an ntor onion key (a curve25519 public
key) for that server. Call the ntor onion key "B". The client
generates a temporary keypair:
x,X = KEYGEN()
and generates a client-side handshake with contents:
NODEID Server identity digest [ID_LENGTH bytes]
KEYID KEYID(B) [H_LENGTH bytes]
CLIENT_PK X [G_LENGTH bytes]
The server generates a keypair of y,Y = KEYGEN(), and uses its ntor
private key 'b' to compute:
secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
auth_input = verify | ID | B | Y | X | PROTOID | "Server"
The server's handshake reply is:
SERVER_PK Y [G_LENGTH bytes]
AUTH H(auth_input, t_mac) [H_LENGTH bytes]
The client then checks Y is in G^* [see NOTE below], and computes
secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
auth_input = verify | ID | B | Y | X | PROTOID | "Server"
The client verifies that AUTH == H(auth_input, t_mac).
Both parties check that none of the EXP() operations produced the
point at infinity. [NOTE: This is an adequate replacement for
checking Y for group membership, if the group is curve25519.]
Both parties now have a shared value for KEY_SEED. They expand this
into the keys needed for the Tor relay protocol, using the KDF
described in 5.2.2 and the tag m_expand.
5.1.5. CREATE_FAST/CREATED_FAST cells
When initializing the first hop of a circuit, the OP has already
established the OR's identity and negotiated a secret key using TLS.
Because of this, it is not always necessary for the OP to perform the
public key operations to create a circuit. In this case, the
OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first
hop only. The OR responds with a CREATED_FAST cell, and the circuit is
created.
A CREATE_FAST cell contains:
Key material (X) [HASH_LEN bytes]
A CREATED_FAST cell contains:
Key material (Y) [HASH_LEN bytes]
Derivative key data [HASH_LEN bytes] (See 5.2.1 below)
The values of X and Y must be generated randomly.
Once both parties have X and Y, they derive their shared circuit keys
and 'derivative key data' value via the KDF-TOR function in 5.2.1.
The CREATE_FAST handshake is currently deprecated whenever it is not
necessary; the migration is controlled by the "usecreatefast"
networkstatus parameter as described in dir-spec.txt.
[Tor 0.3.1.1-alpha and later disable CREATE_FAST by default.]
5.2. Setting circuit keys
5.2.1. KDF-TOR
This key derivation function is used by the TAP and CREATE_FAST
handshakes, and in the current hidden service protocol. It shouldn't
be used for new functionality.
If the TAP handshake is used to extend a circuit, both parties
base their key material on K0=g^xy, represented as a big-endian unsigned
integer.
If CREATE_FAST is used, both parties base their key material on
K0=X|Y.
From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes
of
derivative key data as
K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
The first HASH_LEN bytes of K form KH; the next HASH_LEN form the
forward
digest Df; the next HASH_LEN 41-60 form the backward digest Db; the next
KEY_LEN 61-76 form Kf, and the final KEY_LEN form Kb. Excess bytes from
K
are discarded.
KH is used in the handshake response to demonstrate knowledge of the
computed shared key. Df is used to seed the integrity-checking hash
for the stream of data going from the OP to the OR, and Db seeds the
integrity-checking hash for the data stream from the OR to the OP. Kf
is used to encrypt the stream of data going from the OP to the OR, and
Kb is used to encrypt the stream of data going from the OR to the OP.
5.2.2. KDF-RFC5869
For newer KDF needs, Tor uses the key derivation function HKDF from
RFC5869, instantiated with SHA256. (This is due to a construction
from Krawczyk.) The generated key material is:
K = K_1 | K_2 | K_3 | ...
Where H(x,t) is HMAC_SHA256 with value x and key t
and K_1 = H(m_expand | INT8(1) , KEY_SEED )
and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED )
and m_expand is an arbitrarily chosen value,
and INT8(i) is a octet with the value "i".
In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand,
salt == t_key, and IKM == secret_input.
When used in the ntor handshake, the first HASH_LEN bytes form the
forward digest Df; the next HASH_LEN form the backward digest Db; the
next KEY_LEN form Kf, the next KEY_LEN form Kb, and the final
DIGEST_LEN bytes are taken as a nonce to use in the place of KH in the
hidden service protocol. Excess bytes from K are discarded.
5.3. Creating circuits
When creating a circuit through the network, the circuit creator
(OP) performs the following steps:
1. Choose an onion router as an end node (R_N):
* N MAY be 1 for non-anonymous directory mirror, introduction
point,
or service rendezvous connections.
* N SHOULD be 3 or more for anonymous connections.
Some end nodes accept streams (see 6.1), others are introduction
or rendezvous points (see rend-spec-{v2,v3}.txt).
2. Choose a chain of (N-1) onion routers (R_1...R_N-1) to constitute
the path, such that no router appears in the path twice.
3. If not already connected to the first router in the chain,
open a new connection to that router.
4. Choose a circID not already in use on the connection with the
first router in the chain; send a CREATE/CREATE2 cell along
the connection, to be received by the first onion router.
5. Wait until a CREATED/CREATED2 cell is received; finish the
handshake and extract the forward key Kf_1 and the backward
key Kb_1.
6. For each subsequent onion router R (R_2 through R_N), extend
the circuit to R.
To extend the circuit by a single onion router R_M, the OP performs
these steps:
1. Create an onion skin, encrypted to R_M's public onion key.
2. Send the onion skin in a relay EXTEND/EXTEND2 cell along
the circuit (see sections 5.1.2 and 5.5).
3. When a relay EXTENDED/EXTENDED2 cell is received, verify KH,
and calculate the shared keys. The circuit is now extended.
When an onion router receives an EXTEND relay cell, it sends a CREATE
cell to the next onion router, with the enclosed onion skin as its
payload.
When an onion router receives an EXTEND2 relay cell, it sends a CREATE2
cell to the next onion router, with the enclosed HLEN, HTYPE, and HDATA
as its payload. The initiating onion router chooses some circID not yet
used on the connection between the two onion routers. (But see section
5.1.1 above, concerning choosing circIDs.)
As special cases, if the EXTEND/EXTEND2 cell includes a legacy identity,
or
identity fingerprint of all zeroes, or asks to extend back to the relay
that sent the extend cell, the circuit will fail and be torn down.
Ed25519 identity keys are not required in EXTEND2 cells, so all zero
keys SHOULD be accepted. If the extending relay knows the ed25519 key
from
the consensus, it SHOULD also check that key. (See section 5.1.2.)
If an EXTEND2 cell contains the ed25519 key of the relay that sent the
extend cell, the circuit will fail and be torn down.
When an onion router receives a CREATE/CREATE2 cell, if it already has a
circuit on the given connection with the given circID, it drops the
cell. Otherwise, after receiving the CREATE/CREATE2 cell, it completes
the specified handshake, and replies with a CREATED/CREATED2 cell.
Upon receiving a CREATED/CREATED2 cell, an onion router packs it payload
into an EXTENDED/EXTENDED2 relay cell (see section 5.1.2), and sends
that cell up the circuit. Upon receiving the EXTENDED/EXTENDED2 relay
cell, the OP can retrieve the handshake material.
(As an optimization, OR implementations may delay processing onions
until a break in traffic allows time to do so without harming
network latency too greatly.)
5.3.1. Canonical connections
It is possible for an attacker to launch a man-in-the-middle attack
against a connection by telling OR Alice to extend to OR Bob at some
address X controlled by the attacker. The attacker cannot read the
encrypted traffic, but the attacker is now in a position to count all
bytes sent between Alice and Bob (assuming Alice was not already
connected to Bob.)
To prevent this, when an OR gets an extend request, it SHOULD use an
existing OR connection if the ID matches, and ANY of the following
conditions hold:
- The IP matches the requested IP.
- The OR knows that the IP of the connection it's using is canonical
because it was listed in the NETINFO cell.
ORs SHOULD NOT check the IPs that are listed in the server descriptor.
Trusting server IPs makes it easier to covertly impersonate a relay,
after
stealing its keys.
5.4. Tearing down circuits