esp32: SSL handshake fails in latest build: esp32-idf3-v1.12-68-g3032ae115 on 2020-01-15 #5543


Closed
Carglglz opened this issue Jan 16, 2020 · 19 comments

Comments

@Carglglz
Contributor
Carglglz commented Jan 16, 2020

SSL handshake fails with error:

mbedtls_ssl_handshake error: -4310
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/lib/ssl_repl.py", line 217, in start
  File "/lib/ssl_repl.py", line 41, in __init__
  File "/lib/ssl_repl.py", line 49, in connect_SOC
OSError: [Errno 5] EIO

The previous build : esp32-idf3-20200114-v1.12-63-g1c849d63a.bin works well.

Are there any significant changes in the ssl module in the latest build?
Is it related to this issue maybe?

I checked the error code here and I think it points to this (but still not sure how to read the error code since '-0x10d6' does not appear anywhere)

MBEDTLS_SSL_ALERT_MSG_UNSUPPORTED_CERT 43 /* 0x2B */

Maybe the previous build uses modussl_axtls.c by default instead of modussl_mbedtls.c and the latest is the other way around ?

@jimmo
Member
jimmo commented Jan 16, 2020

Thanks for the report. The main change since yesterday's build is #5524 which updates the IDF version (which would possibly include a newer version of mbedtls).
Do you know if this happens on all ssl hosts? I will investigate when I find some time.

@Carglglz
Contributor Author
Carglglz commented Jan 16, 2020

Do you know if this happens on all ssl hosts?

I don't know; the host is my computer, which listens on an SSL-wrapped socket (Python 3).

But I've just flashed the latest build and run the http_client_ssl.py example from the repo, and it gives this error (I saved the script under another name, but it is the same code):

Address infos: [(2, 1, 0, 'google.com', ('216.58.211.46', 443))]
Connect address: ('216.58.211.46', 443)
mbedtls_ssl_handshake error: -4290
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "http_ssl_test.py", line 27, in <module>
  File "http_ssl_test.py", line 16, in main
OSError: [Errno 5] EIO

After a machine reset it works well:

Address infos: [(2, 1, 0, 'google.com', ('172.217.17.14', 443))]
Connect address: ('172.217.17.14', 443)
<_SSLSocket 3ffe78d0>
b''

Then I tried to disable key, cert authentication (since the example does not use them)
self.cli_soc = ssl.wrap_socket(self.cli_soc) instead of
self.cli_soc = ssl.wrap_socket(self.cli_soc, key=self.key, cert=self.cert)
I also disabled ssl.CERT_REQUIRED on the Python 3 server side.

Now the handshake is done well, but after a few messages esp32 freezes and STA resets:

>>>
MicroPython v1.12-68-g3032ae115 on 2020-01-16; ESP32 module with ESP32
Type "help()" for more information.
>>>
>>> import gc;import uos;from upysh import *;gc.collect()
>>> help('modules')
__main__          gc                uctypes           urequests
_boot             inisetup          uerrno            uselect
_onewire          machine           uhashlib          usocket
_thread           math              uhashlib          ussl
_webrepl          micropython       uheapq            ustruct
apa106            neopixel          uio               utime
btree             network           ujson             utimeq
builtins          ntptime           umqtt/robust      uwebsocket
cmath             onewire           umqtt/simple      uzlib
dht               sys               uos               webrepl
ds18x20           uarray            upip              webrepl_setup
esp               ubinascii         upip_utarfile     websocket_helper
esp32             ubluetooth        upysh
flashbdev         ucollections      urandom
framebuf          ucryptolib        ure
Plus any modules on the filesystem
>>> I (808265) wifi: bcn_timout,ap_probe_send_start
I (810765) wifi: ap_probe_send over, resett wifi status to disassoc
I (810765) wifi: state: run -> init (c800)
I (810765) wifi: pm stop, total sleep time: 153461641 us / 180675245 us

I (810775) wifi: new:<11,0>, old:<11,0>, ap:<11,1>, sta:<11,0>, prof:11
I (810775) wifi: STA_DISCONNECTED, reason:200
beacon tidupterm: Exception in read() method, deactivating: OSError: -113
meout
I (810915) wifi: new:<11,0>, old:<11,0>, ap:<11,1>, sta:<11,0>, prof:11
I (810915) wifi: state: init -> auth (b0)
I (810915) wifi: state: auth -> assoc (0)
I (810925) wifi: state: assoc -> run (10)
I (811035) wifi: connected with TP-Link_DD98, aid = 2, channel 11, BW20, bssid = 70:4f:57:b5:80:cc
I (811035) wifi: security type: 4, phy: bgn, rssi: -58
I (811035) wifi: pm start, type: 1

I (811045) network: CONNECTED
I (811065) wifi: AP's beacon interval = 102400 us, DTIM period = 1
I (811985) event: sta ip: 192.168.1.53, mask: 255.255.255.0, gw: 192.168.1.1
I (811985) network: GOT_IP

So I guess there are two problems here:

  • Key and cert authentication
  • SSL sockets too slow/freezing ?

As you said this is something related to the new IDF version since the previous build worked well.
I will check then if there are some changes related to mbedtls in this IDF version.

Finally are these issues related maybe (from esp-idf repo)?
#4594
#4357
Seems to be related to a memory allocation error ?

@Carglglz
Contributor Author

Hi @jimmo I got this working but without key/cert.
If I try to use key/cert for client authentication, it either throws mbedtls_ssl_handshake error: -4310 or the STA resets (I (808265) wifi: bcn_timout,ap_probe_send_start), so this means the error occurs while trying to send the certificate. It's hard to say why; maybe it is due to a memory allocation error, as I said.

@dpgeorge
Member

@Carglglz I tested the http_client_ssl.py example included in this repo and it works ok for me on the latest build.

If you could provide a full working example (server and client code) that shows the issue then we can look further into the problem.

@Carglglz
Contributor Author

@dpgeorge Thanks for taking the time,

If you could provide a full working example (server and client code) that shows the issue then we can look further into the problem.

Yes, I should have done that in the first place. Here is the test example: SSL_server_client_test.py
I generated the key and certificate with openssl:
openssl req -x509 -newkey rsa:4096 -keyout SSL_key.pem -out SSL_certificate.pem -days 365
Then I converted them to '.der' format:

  • Certificate: openssl x509 -outform der -in SSL_certificate.pem -out SSL_certificate.der
  • Key: openssl rsa -in SSL_key.pem -out SSL_key.der -outform DER

Then I uploaded key and cert to the esp32 (build idf3 from 2020-01-14)
Now in Python 3 (server side; I tried with 3.6, 3.7 and 3.8):

>>> from SSL_server_client_test import HOST_SSL_socket_server
Python 3 version
>>> server = HOST_SSL_socket_server(auth=True)
192.168.1.33
>>> server.start_SOC()
Server listening...
Enter PEM pass phrase: # After entering the pass phrase it waits for a connection
Connection received...
('192.168.1.53', 53328)
>>>

MicroPython 1.12 (client side):

>>> from SSL_server_client_test import SSL_socket_client_repl
>>> ssl_repl = SSL_socket_client_repl('192.168.1.33', auth=True)
>>> 

So until that build this worked ok, but after that client authentication does not work.
Authentication is done in Python 3 server side by:

if self.auth:
    self.context.verify_mode = ssl.CERT_REQUIRED
    self.context.load_verify_locations(cafile=self.cert)

In MicroPython by:

if self.auth:
    self.cli_soc = ssl.wrap_socket(self.cli_soc, key=self.key, cert=self.cert)
else:
    self.cli_soc = ssl.wrap_socket(self.cli_soc)
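
For anyone reproducing this, the server side of the setup above can be sketched with the standard `ssl` module roughly as follows. This is not the code from SSL_server_client_test.py: `make_server_context` and its parameters are names made up for illustration, and the file names in the usage comment are simply the ones produced by the openssl commands earlier in this comment.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, cafile=None,
                        require_client_cert=False):
    """Build a TLS server context, optionally demanding a client certificate.

    Hypothetical helper for illustration only; paths left as None are skipped
    so the function can be exercised without any files on disk.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if certfile is not None:
        # Prompts for the PEM pass phrase if the key is encrypted.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if require_client_cert:
        # Same two steps as in the snippet above: require a client cert
        # and tell the server which certificate(s) to trust.
        ctx.verify_mode = ssl.CERT_REQUIRED
        if cafile is not None:
            ctx.load_verify_locations(cafile=cafile)
    return ctx


# Usage (assumes the PEM files from the openssl commands exist):
# ctx = make_server_context("SSL_certificate.pem", "SSL_key.pem",
#                           cafile="SSL_certificate.pem",
#                           require_client_cert=True)
# conn = ctx.wrap_socket(raw_conn, server_side=True)
```

With a self-signed certificate shared by both ends, the client certificate doubles as the trust anchor, which is why the same file is passed as `cafile`.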

Without authentication (auth=False) it works well in the latest build too, but with authentication it now throws this error:

Python 3 server side (I tried with 3.6 and 3.7 too):

Python 3.8.0 (default, Nov 24 2019, 03:13:55)
[Clang 11.0.0 (clang-1100.0.33.8)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from SSL_server_client_test import HOST_SSL_socket_server
Python 3 version
>>> server = HOST_SSL_socket_server(auth=True)
192.168.1.33
>>> server.start_SOC()
Server listening...
Enter PEM pass phrase:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/carglglz/MICROPYTHON/TOOLS/upydev_test/esp32_30aea4233564/SSL_TEST/SSL_server_client_test.py", line 170, in start_SOC
    self.conn = self.context.wrap_socket(self.conn, server_side=True)
  File "/Users/carglglz/.pyenv/versions/3.8.0/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Users/carglglz/.pyenv/versions/3.8.0/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Users/carglglz/.pyenv/versions/3.8.0/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: TLSV1_ALERT_DECRYPT_ERROR] tlsv1 alert decrypt error (_ssl.c:1108)

MicroPython client side:

MicroPython v1.12-76-gdccace6f3 on 2020-01-22; ESP32 module with ESP32
Type "help()" for more information.
>>> from SSL_server_client_test import SSL_socket_client_repl
>>> ssl_repl = SSL_socket_client_repl('192.168.1.33', auth=True)
>>>
mbedtls_ssl_handshake error: -4290
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "SSL_server_client_test.py", line 85, in __init__
  File "SSL_server_client_test.py", line 91, in connect_SOC
OSError: [Errno 5] EIO

So maybe there is something different in how mbedtls handles certificates in the new version?
I think the error code corresponds to "MBEDTLS_SSL_ALERT_MSG_BAD_CERT" but I'm not sure...
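
One way to read these numbers, sketched below, is to treat the printed value as hexadecimal and split it into the high-level and low-level parts that mbedtls packs into a single error code. Both points are assumptions to verify against your own tree: that the firmware prints the code with a `%x` format, and that the three constants in `KNOWN` match your mbedtls headers (they are written from memory of rsa.h and bignum.h). The helper names are made up for this example.

```python
# mbedtls packs a high-level module error (bits 7-14) and a low-level
# module error (bits 0-6) into one negative value.
HIGH_MASK = 0x7F80
LOW_MASK = 0x007F

# Assumed values, recalled from mbedtls rsa.h / bignum.h; check locally.
KNOWN = {
    0x4280: "MBEDTLS_ERR_RSA_PUBLIC_FAILED",
    0x4300: "MBEDTLS_ERR_RSA_PRIVATE_FAILED",
    0x0010: "MBEDTLS_ERR_MPI_ALLOC_FAILED",
}


def split_mbedtls_error(printed, base=16):
    """Return (high, low) parts of a printed code such as '-4310'."""
    value = int(printed.lstrip("-"), base)
    return value & HIGH_MASK, value & LOW_MASK


def describe(printed, base=16):
    """Name the parts that are in KNOWN, fall back to hex for the rest."""
    parts = []
    for code in split_mbedtls_error(printed, base):
        if code:
            parts.append(KNOWN.get(code, "unknown -0x%04X" % code))
    return " + ".join(parts)


for text in ("-4310", "-4290"):
    print(text, "->", describe(text))
```

Under the hex reading both codes decompose into an RSA failure plus an MPI allocation failure, which would fit the memory-allocation suspicion raised earlier in the thread; pass `base=10` to try the decimal reading instead.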

@Carglglz
Contributor Author
Carglglz commented Jan 22, 2020

OK, I think I "solved" this or at least I found out why authentication is not working in the latest builds.
It is related to the RSA key length that is used:

  • RSA 4096 bits (-newkey rsa:4096) throws this error: mbedtls_ssl_handshake error: -4290
  • RSA 2048 bits (-newkey rsa:2048) throws this error: mbedtls_ssl_handshake error: -4310
  • RSA 1024 bits (-newkey rsa:1024) works well.

This could be a consequence of esp32: mbedtls can't allocate in/out buffers on IDF4.x #5303 perhaps?
Right now I don't have a special requirement for a 2048/4096-bit RSA key, but I think this "issue" should be documented somewhere.
Also, should I change the title of the issue to "esp32: SSL handshake fails with RSA key >= 2048 bits in latest idf3 builds (since 2020-01-15)" or something like that?

[EDIT]:
Although authentication does work with RSA 1024 bits, the performance of the SSL connection is worse than in the previous builds (this happens without authentication too). After a few messages the STA resets (I (808265) wifi: bcn_timout,ap_probe_send_start), and after a soft reset, trying to connect with a 1024-bit key throws this error too: mbedtls_ssl_handshake error: -4310. It needs a hard reset to get it working again...

[EDIT 2]:
idf4 performance is better, but the STA still resets from time to time and a hard reset is needed after that...

@Carglglz
Contributor Author

@dpgeorge @jimmo
My final conclusion: it is definitely related to #5303, jimmo@d8c8188 and #5306.
From esp-idf /mbedtls/esp_config.h:

def MBEDTLS_SSL_OUT_CONTENT_LEN
Maximum outgoing fragment length in bytes.
Uncomment to set the size of the outward TLS buffer independently of the inward buffer.

It is possible to save RAM by setting a smaller outward buffer, while keeping the default inward 16384 byte buffer to conform to the TLS specification.

The minimum required outward buffer size is determined by the handshake protocol's usage. Handshaking will fail if the outward buffer is too small. The specific size requirement depends on the configured ciphers and any certificate data which is sent during the handshake.

For absolute minimum RAM usage, it's best to enable MBEDTLS_SSL_MAX_FRAGMENT_LENGTH and reduce MBEDTLS_SSL_MAX_CONTENT_LEN. This reduces both incoming and outgoing buffer sizes. However this is only guaranteed if the other end of the connection also supports the TLS max_fragment_len extension. Otherwise the connection may fail.

So here is the reason why the handshake fails, why using a smaller RSA key does work, and probably why the STA resets while trying to send a "long message" (and not just randomly or from time to time, as I initially thought). By "long message" I mean something like this:

help('modules')
__main__          gc                uctypes           urequests
_boot             inisetup          uerrno            uselect
_onewire          machine           uhashlib          usocket
_thread           math              uhashlib          ussl
_webrepl          micropython       uheapq            ustruct
apa106            neopixel          uio               utime
btree             network           ujson             utimeq
builtins          ntptime           umqtt/robust      uwebsocket
cmath             onewire           umqtt/simple      uzlib
dht               sys               uos               webrepl
ds18x20           uarray            upip              webrepl_setup
esp               ubinascii         upip_utarfile     websocket_helper
esp32             ubluetooth        upysh
flashbdev         ucollections      urandom
framebuf          ucryptolib        ure
Plus any modules on the filesystem

Which seems odd, because this doesn't look like 4 kB of data; however, after encryption the message probably takes more than 4 kB, and then the connection may fail.
I think this is the same that happens at handshaking with client authentication.
The esp32 (client) tries to send the encrypted certificate but fails, since the encrypted certificate is too long. So using a smaller key results in a smaller encrypted certificate and therefore handshake succeeds.
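
A quick way to sanity-check this hypothesis is to estimate how large the TLS 1.2 Certificate handshake message is for a given certificate and compare it with the outward buffer. The sketch below uses the TLS 1.2 framing (4-byte handshake header, 3-byte list length, 3-byte length per certificate); the 4096-byte buffer size is taken from this discussion rather than read from any build, and the function names are invented for the example. It only covers the message size, not the heap needed for the RSA arithmetic itself.

```python
# Assumed value of MBEDTLS_SSL_OUT_CONTENT_LEN, as discussed in this thread.
OUT_CONTENT_LEN = 4096


def certificate_message_len(cert_der_lengths):
    """Size in bytes of a TLS 1.2 Certificate handshake message.

    cert_der_lengths: DER sizes of each certificate in the chain.
    """
    # certificate_list: 3-byte total length, then 3-byte length + DER per cert.
    body = 3 + sum(3 + n for n in cert_der_lengths)
    # Handshake header: 1-byte type + 3-byte length.
    return 4 + body


def fits_out_buffer(cert_der_lengths, out_len=OUT_CONTENT_LEN):
    """True if the Certificate message fits in one outward buffer."""
    return certificate_message_len(cert_der_lengths) <= out_len


# Usage on the host, with the .der file generated earlier:
# import os
# size = os.path.getsize("SSL_certificate.der")
# print(size, certificate_message_len([size]), fits_out_buffer([size]))
```

If the message turns out to fit comfortably, the failure is more likely to come from somewhere other than the outward buffer length.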

From jimmo's PR:

See #5303 for more details, but in summary:

  • The default 16kiB output buffer (same as the input buffer) is expensive.
  • On ESP32 (non-psram) we don't have enough RAM available to allocate the original 16kiB due to RAM region fragmentation in IDF 4.x.

So right now the actual configuration limits:

  • The length of the message sent (which depends on the configured ciphers)
  • The handshake with client authentication (which depends on the length of the RSA key)

Is it possible to increase MBEDTLS_SSL_OUT_CONTENT_LEN to 8 kB?
If not, is there any workaround for this, like allocating memory somewhere else in case a message is bigger than 4 kB? (I don't know how this would be possible, just taking a long shot...)

If there is no workaround, then I think the following parameters should be specified:

  • Set of ciphers that are more likely to work (which increase less the length of a message)
  • Max length of message with this set of ciphers (to avoid connection failure)
  • Max RSA key length for client authentication

From Wikipedia entry on Cipher suites:

Cipher suites for constrained devices:
Encryption, key exchange and authentication algorithms usually require a large amount of processing power and memory. To provide security to constrained devices with limited processing power, memory, and battery life such as those powering the Internet of things there are specifically chosen cipher suites. Two examples include:

TLS_PSK_WITH_AES_128_CCM_8 (pre-shared key)[20]
TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 (raw public key)

I can see that TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 is supported by esp32 so I think I will try to set the server to use this cipher suite and test if this works better...

@jimmo
Member
jimmo commented Jan 24, 2020

hi @Carglglz -- thanks for digging into this and for the excellent summary.

I had a very brief look also and came to the same conclusion. When you look at the memory regions (i.e. the same information as what I showed in #5303) it seems that enabling BLE and NimBLE in IDF 3.3.1 has caused a significant amount of additional fragmentation (similar but perhaps marginally worse than what we saw in the move to 4.x as discussed in #5303).

I agree that increasing the buffer size to 8kiB will address the issue here, but will possibly impact the number of concurrent SSL sockets (as discussed in #5303) if it's unable to find 8kiB regions. Anyway when I get some time I can measure this and see what the impact is.

I also saw a report on the forum from someone seeing issues on a PSRAM build which I need to investigate.

Some areas that I want to investigate:

  • Force the MicroPython heap to use a smaller region rather than the largest available, which will give more flexibility to the IDF to find regions for the SSL data. (This is an unfortunate regression... on non-PSRAM builds you'll now get ~70kiB of MicroPython heap rather than the ~100kiB previously).
  • On PSRAM builds, perhaps leave some PSRAM behind for the IDF to use (instead of MicroPython just using the whole thing) -- this might be an easy win to just make the issues disappear on PSRAM builds.
  • Make mbedtls use the MicroPython heap rather than using IDF malloc directly, and make the MicroPython heap able to use multiple non-contiguous regions (i.e. py/gc: Support multiple heaps (version 2). #3580)

Perhaps we need to provide non-BLE builds...but this is starting to get a bit crazy having to juggle all the tradeoffs.

@Carglglz
Contributor Author

@jimmo @dpgeorge
I got this working with TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 cipher suite.
This requires a recent version of python-ssl

>>> ssl.OPENSSL_VERSION
'OpenSSL 1.1.1d  10 Sep 2019'

Generate an EC key and cert with openssl:
openssl ecparam -out ec_key.pem -name secp256r1 -genkey
openssl req -new -key ec_key.pem -x509 -nodes -days 365 -out cert.pem
Convert key and certificate to '.DER' format:
openssl x509 -in cert.pem -out cert.der -outform DER
openssl ec -in ec_key.pem -out ec_key.der -outform DER

Upload key/cert to the esp32, and set the cipher in the python3 server side:
self.context.set_ciphers('ECDHE-ECDSA-AES128-CCM8')
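
Whether that cipher string is accepted depends on the OpenSSL build behind the Python `ssl` module, so it can help to check before starting the server. A minimal sketch, assuming nothing beyond the standard library; `restrict_ciphers` is a made-up helper name:

```python
import ssl


def restrict_ciphers(ctx, spec):
    """Try to limit ctx to the given OpenSSL cipher string.

    Returns the resulting cipher names, or None if OpenSSL rejects the
    string (for example because the suite is not available in this build).
    """
    try:
        ctx.set_ciphers(spec)
    except ssl.SSLError:
        return None
    return [c["name"] for c in ctx.get_ciphers()]


ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
names = restrict_ciphers(ctx, "ECDHE-ECDSA-AES128-CCM8")
if names is None:
    print("CCM8 suite not selectable with", ssl.OPENSSL_VERSION)
else:
    print("enabled suites:", names)
```

Note that `get_ciphers()` may still list TLS 1.3 suites, since `set_ciphers()` only affects TLS 1.2 and earlier.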

I tested this in the latest idf4
The handshake works, client authentication works, "long messages" are no longer a problem,
so in general works great but there is no chance for a second ssl socket connection

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/lib/ssl_repl.py", line 118, in connect_SOC
OSError: [Errno 12] ENOMEM

In idf3 it only works without client authentication; with client authentication it throws this error:

mbedtls_ssl_handshake error: -10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/lib/ssl_repl.py", line 225, in start
  File "/lib/ssl_repl.py", line 44, in __init__
  File "/lib/ssl_repl.py", line 53, in connect_SOC
OSError: [Errno 5] EIO

Also in both idf3 and idf4 a soft reset won't release any memory in the output buffer, which makes a new ssl connection crash and the only way to make it work again is a hard reset.

Is there any chance to make the output buffer 8 kB, at least to test?

I (534) heap_init: Initializing. RAM available for dynamic allocation:
I (541) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (547) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (553) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (559) heap_init: At 3FFBDB5C len 00000004 (0 KiB): DRAM
I (565) heap_init: At 3FFCC820 len 000137E0 (77 KiB): DRAM
I (571) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (577) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (584) heap_init: At 40099A70 len 00006590 (25 KiB): IRAM

Is there any chance to use some of this "RAM available for dynamic allocation" or it is not possible/already used after MicroPython boots? (This is really way out of my scope, so I have literally no clue...)
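
To see which of these regions could even hold a contiguous 8 KiB or 16 KiB buffer, the boot log can be parsed on the host. This is only a rough sketch: the regex is written against the log lines above, the helper names are invented, and the boot-time list describes regions before MicroPython claims its own heap, so it is an upper bound at best.

```python
import re

# Matches lines like:
# I (547) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
HEAP_LINE = re.compile(
    r"heap_init: At ([0-9A-Fa-f]+) len ([0-9A-Fa-f]+) \(\d+ KiB\): (\S+)"
)


def parse_heap_regions(log_text):
    """Return a list of (start_address, length_bytes, kind) tuples."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in HEAP_LINE.finditer(log_text)
    ]


def regions_fitting(regions, size):
    """Regions large enough for one contiguous allocation of `size` bytes."""
    return [r for r in regions if r[1] >= size]


SAMPLE = """\
I (547) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (553) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (577) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
"""

regions = parse_heap_regions(SAMPLE)
for size in (8 * 1024, 16 * 1024):
    print(size, "bytes fit in", len(regions_fitting(regions, size)), "regions")
```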

[EDITED, I've just read your last message jimmo, thanks for the extensive explanation]

I've found this PR to mbedtls : Let input and output buffers len be set at runtime, per connection
And this post from mbedtls: Let mbed TLS use static memory instead of the heap

But I guess this needs to be implemented in mbedtls library and not in MicroPython?
Or something like micropython.alloc_emergency_exception_buf(size) but for ssl output buffer?

Anyway, I don't really need this to work on new releases of idf3-idf4 right now; I just wanted to know why it was failing and if I could do something about it.
So yeah, I think the path to follow / recommendations for SSL in MicroPython are:

  1. Optimise the cipher suite with the most efficient/recommended for constrained devices (server side)
  2. Optimise output ssl buffer length with the options you mentioned. (client side)

Thanks again for taking the time

Perhaps we need to provide non-BLE builds...but this is starting to get a bit crazy having to juggle all the tradeoffs.

I wish I could help but C is out of my scope right now... (I started with MicroPython/embedded "developing" a year ago :( )
Anyway I will keep an eye on this, in case I can help.

PS: sorry for the TL;DR messages, I just wanted to be as clear as possible

@tve
Contributor
tve commented Jan 24, 2020

Thanks for all the long explanations! SSL is just a PITA on small devices... I need to do more testing specifically with the PSK cipher suites... IMHO, this is the best avenue in the medium term:

Make mbedtls use the MicroPython heap rather than using IDF malloc directly, and make the MicroPython heap able to use multiple non-contiguous regions (i.e. #3580)

I'm actually wondering: wouldn't it be best if all of ESP-IDF used the MP heap so there's only one heap and not multiple? Maybe that's tricky to init given that the esp-idf starts before MP, so to speak.

@Carglglz
Contributor Author
Carglglz commented Jan 28, 2020

@tve

SSL is just a PITA on small devices

Yes, it is indeed true :( , but I think that down the road it would be good for MicroPython to have "state of the art" support for SSL/TLS and therefore some sort of IoT "security" (if this is actually possible...)

I need to do more testing specifically with the PSK cipher suites

Yes this would be nice to implement, mbedtls supports TLS-PSK-WITH-AES-128-CCM-8 cipher suite too but I think ESP-IDF still doesn't support it (in fact any PSK, or at least they don't appear in the 'Client Hello' of esp32, just ECDHE, DHE, or RSA)

[EDIT: I've just read your PR #5544 on this and "upvoted" 👍 ]
I don't think this needs a separate module/library, since the 'Pre-Shared Key' (PSK) can be passed in the key param, or maybe psk and id parameters could be added, so on the user side the change will be minimal.
As I said, I don't know C, so I cannot help with implementing the PSK cipher suites :(.

But I will leave here some useful links that I've found during this SSL/TLS "crash course week" that may be helpful for developing/general info (for you or anybody that steps into this issue):

Just for clarity, SSL and TLS refer to the "same" protocol in different release versions, which I think are: SSLv1 (1994, never released) --> SSLv2 (1995) --> SSLv3 (1996) --> (renamed) --> TLSv1 (1999) --> TLSv1.1 (2006) --> TLSv1.2 (2008) --> TLSv1.3 (2018)

So right now the common practice would be support TLSv1.2 and work towards TLSv1.3 compatibility/optimisations using the most suitable cipher suites for embedded devices I guess?

From the PR docs:

To use PSK:
Generate a random hex string (generating an MD5 on some random data is one way to do this)

In Google cloud VPN docs they recommend to use openssl $ openssl rand -base64 24 to generate this PSK
And I think python secrets library is valid for this too, something like:

>>> import secrets
# As of 2015, it is believed that 32 bytes (256 bits) of randomness is sufficient for the typical use-case expected for the secrets module.
>>> safe_PSK = secrets.token_hex(32)
>>> safe_PSK
'a5cbebd37bff4dce3b70037bd5f64e3a5e20e525b6fc1d5c5ac2e22f77248662'
>>> safe_PSK2 = secrets.token_urlsafe(32)
>>> safe_PSK2
'JNGkSesjuQiV1IM4rlKYGq0L7h0LzERmIHGHzB_PJec'
>>>

As I said I will keep an eye on this in case I can help with more documentation of SSL/TLS or help with MicroPython / Python implementation side. 👍

@tve
Contributor
tve commented Jan 28, 2020

Something that would be interesting to test is to see how blocking the TLS connect is. I assume you saw my timings in my PSK PR. What happens if you run the TLS wrap_socket in a separate thread while the main thread blinks an LED at 10Hz? Does it hiccup or continue blinking?
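
A possible shape for that experiment is sketched below. It runs an arbitrary blocking call in a second thread via `_thread` and measures the worst gap between ticks in the calling thread. Everything here is an assumption for illustration: the helper names, the fallback for hosts without `time.ticks_ms()`, and the pin number in the usage comment, which is board-specific. Whether threads actually run concurrently during the handshake on the ESP32 is exactly what the test would have to show.

```python
import _thread
import time

# MicroPython provides ticks_ms()/ticks_diff(); fall back on other hosts.
if hasattr(time, "ticks_ms"):
    def now_ms():
        return time.ticks_ms()

    def elapsed_ms(start, end):
        return time.ticks_diff(end, start)
else:
    def now_ms():
        return int(time.monotonic() * 1000)

    def elapsed_ms(start, end):
        return end - start


def tick_while(blocking_call, period_ms=50, on_tick=None):
    """Run blocking_call in another thread while ticking in this one.

    Returns (ticks, worst_gap_ms, error); error is the exception raised
    by blocking_call, if any.
    """
    state = {"done": False, "error": None}

    def worker():
        try:
            blocking_call()
        except Exception as exc:  # keep the failure for the caller
            state["error"] = exc
        finally:
            state["done"] = True

    _thread.start_new_thread(worker, ())
    ticks = 0
    worst = 0
    last = now_ms()
    while not state["done"]:
        if on_tick is not None:
            on_tick()  # e.g. toggle an LED
        time.sleep(period_ms / 1000)
        current = now_ms()
        worst = max(worst, elapsed_ms(last, current))
        last = current
        ticks += 1
    return ticks, worst, state["error"]


# Usage on the board (pin 2 is a guess, adjust for your hardware):
# led = machine.Pin(2, machine.Pin.OUT)
# print(tick_while(lambda: ssl.wrap_socket(sock), 50,
#                  lambda: led.value(not led.value())))
```

A 50 ms period toggles the LED at roughly 10 Hz; a worst gap far above 50 ms would indicate that the handshake starves the other thread.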

@Carglglz
Contributor Author
Carglglz commented Jan 28, 2020

@tve
Yes, I saw the timings, and yes, what is blocking in the TLS connection is the initial handshake, which in TLSv1.2 looks like this (it takes 4 messages):

Client                                               Server

     ClientHello                  -------->
                                                     ServerHello
                                                    Certificate*
                                              ServerKeyExchange*
                                             CertificateRequest*
                                  <--------      ServerHelloDone
     Certificate*
     ClientKeyExchange
     CertificateVerify*
     [ChangeCipherSpec]
     Finished                     -------->
                                              [ChangeCipherSpec]
                                  <--------             Finished
     Application Data             <------->     Application Data

            Figure 1.  Message flow for a full handshake

After the handshake the connection should be practically as fast as a normal tcp socket connection
since I've reached up to 70 kB/s when transferring a file through TLS sockets:
1550.06 KB took 21 seconds
on Normal TCP sockets:
1550.06 KB took 20.3 seconds

Looking at Wireshark for the time that takes the client handshake with TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 cipher suite with client authentication:
Client Hello : at second 59.71

#   | Time (s)  | Source       | Destination  | Protocol | Length | Info
530 | 59.712850 | 192.168.1.35 | 192.168.1.42 | TLSv1.2  | 274    | Client Hello
532 | 59.713956 | 192.168.1.42 | 192.168.1.35 | TLSv1.2  | 994    | Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done
554 | 60.554577 | 192.168.1.35 | 192.168.1.42 | TLSv1.2  | 700    | Certificate
571 | 61.356446 | 192.168.1.35 | 192.168.1.42 | TLSv1.2  | 129    | Client Key Exchange
593 | 61.819970 | 192.168.1.35 | 192.168.1.42 | TLSv1.2  | 137    | Certificate Verify
595 | 61.824763 | 192.168.1.35 | 192.168.1.42 | TLSv1.2  | 97     | Change Cipher Spec, Encrypted Handshake Message
597 | 61.825223 | 192.168.1.42 | 192.168.1.35 | TLSv1.2  | 912    | New Session Ticket, Change Cipher Spec, Encrypted Handshake Message
599 | 61.828205 | 192.168.1.42 | 192.168.1.35 | TLSv1.2  | 77     | Application Data

First Application Data (end of the handshake): at second 61.83
Which makes about 2.1 seconds of handshake (the bottleneck is in the client's Certificate and Client Key Exchange steps).
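
For reference, the per-step delays can be computed directly from the capture timestamps. The snippet below just redoes that arithmetic on the rows of the table above; the labels are shortened and the function names are arbitrary.

```python
# (timestamp in seconds, message) taken from the Wireshark capture above.
CAPTURE = [
    (59.712850, "Client Hello"),
    (59.713956, "Server Hello ... Server Hello Done"),
    (60.554577, "Certificate"),
    (61.356446, "Client Key Exchange"),
    (61.819970, "Certificate Verify"),
    (61.824763, "Change Cipher Spec (client)"),
    (61.825223, "New Session Ticket, Change Cipher Spec (server)"),
    (61.828205, "Application Data"),
]


def step_gaps(capture):
    """Time elapsed before each message, as (seconds, message) pairs."""
    return [
        (round(t1 - t0, 6), label)
        for (t0, _), (t1, label) in zip(capture, capture[1:])
    ]


def total_time(capture):
    """Seconds between the first and the last captured message."""
    return round(capture[-1][0] - capture[0][0], 6)


for gap, label in step_gaps(CAPTURE):
    print("%8.3f s before %s" % (gap, label))
print("total: %.3f s" % total_time(CAPTURE))
```

The three largest gaps all precede messages sent by the ESP32, which is consistent with the client-side public-key operations dominating the handshake time.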

In TLSv1.3 this is improved and takes only 2 messages to do the handshake.

Figure 1 below shows the basic full TLS handshake:

       Client                                           Server

Key  ^ ClientHello
Exch | + key_share*
     | + signature_algorithms*
     | + psk_key_exchange_modes*
     v + pre_shared_key*       -------->
                                                  ServerHello  ^ Key
                                                 + key_share*  | Exch
                                            + pre_shared_key*  v
                                        {EncryptedExtensions}  ^  Server
                                        {CertificateRequest*}  v  Params
                                               {Certificate*}  ^
                                         {CertificateVerify*}  | Auth
                                                   {Finished}  v
                               <--------  [Application Data*]
     ^ {Certificate*}
Auth | {CertificateVerify*}
     v {Finished}              -------->
       [Application Data]      <------->  [Application Data]

And it seems that with PSK it is even faster.
However I think mbedtls doesn't support TLSv1.3 yet :(

What happens if you run the TLS wrap_socket in a separate thread while the main thread blinks an LED at 10Hz? Does it hiccup or continue blinking?

I'm not sure about this, but I guess that at the "C level" the handshake could be done in an asynchronous manner? That is, keep doing the "main tasks" until it needs to send/receive handshake messages, a kind of non-blocking handshake, so the LED does not hiccup.
But again, I think that after the handshake the connection should be almost as fast as a "plain" TCP one.
In the drafts I linked, there is useful and extensive information about the TLS protocols, which I think is worth a read and helps a lot. :)
[EDIT]
I've just found this repo that implements PSK to ssl Python library
sslpsk
So yes sooner or later Python should implement this too. 👍

@Carglglz
Contributor Author

Also you may be interested in the "Session Resumption" feature which improves further client handshakes after the initial one:

7. Session Resumption
Session resumption is a feature of the core TLS/DTLS specifications
that allows a client to continue with an earlier established session
state. The resulting exchange is shown in Figure 11. In addition,
the server may choose not to do a cookie exchange when a session is
resumed. Still, clients have to be prepared to do a cookie exchange
with every handshake. The cookie exchange is not shown in the
figure.

     Client                                               Server
     ------                                               ------

     ClientHello                   -------->
                                                      ServerHello
                                               [ChangeCipherSpec]
                                   <--------             Finished
     [ChangeCipherSpec]
     Finished                      -------->
     Application Data              <------->     Application Data

                Figure 11: DTLS Session Resumption

Constrained clients MUST implement session resumption to improve the
performance of the handshake. This will lead to a reduced number of
message exchanges, lower computational overhead (since only symmetric
cryptography is used during a session resumption exchange), and
session resumption requires less bandwidth.

@andymule
andymule commented Feb 9, 2020

I think it's related:
I can't install SSL packages from upip on the latest build. The minimal repro is really easy:

import upip
upip.install('picoweb')

mbedtls_ssl_handshake error: -71

using esp32-idf3-20200207-v1.12-154-gce40abcf2

rolling back to last stable resolved the issue

@george-hawkins

I see the same issue as @andymule when using the latest firmware - esp32-idf4-20200429-unstable-v1.12-418-g2e3c42775.bin:

>>> import upip
I (159660) modsocket: Initializing
>>> upip.install('micropython-logging')
Installing to: /lib/
I (183850) wifi: bcn_timout,ap_probe_send_start
I (186350) wifi: ap_probe_send over, resett wifi status to disassoc
I (186350) wifi: state: run -> init (c800)
I (186350) wifi: pm stop, total sleep time: 47458331 us / 60262159 us

I (186350) wifi: new:<1,0>, old:<1,0>, ap:<255,255>, sta:<1,0>, prof:1
mbedtls_ssl_handshake error: -71
I (186360) wifi: STA_DISCONNECTED, reason:200
beacon timeout
Error installing 'micropython-logging': [Errno 5] EIO, packages may be partially installed
>>>

@tve
Contributor
tve commented Apr 29, 2020

Quick FYI: I have a number of TLS fixes/improvements in the PR queue. They don't reduce memory consumption however. I want to make a further change, which is to delete the server certificate which will free up some storage. Whether it's enough for this specific use-case, we will only find out once this is implemented.

There are a couple of other avenues we could pursue, I'm not sure which ends up being the best. I think the sticking point here is that for new users the upip install is convenient and should work out-of-the-box. I have the feeling that long term MP users don't use upip. Anyway, thoughts:

  1. provide a build that disables bluetooth, that frees up quite a chunk of memory (even if not used, BT allocates quite some static memory which ultimately reduces the heap).
  2. provide a build that disables some other functionality, MDNS comes to mind, I don't know how much that frees (mdns used to be a pig on the esp8266).
  3. reduce the MP heap by enough to make upip work, e.g. instead of allocating the largest chunk we could add something to guarantee at least N KB free for esp-idf.
  4. wait for jimmo's awesome combined idf/mp allocator, but that will only work with esp-idf v4.2 and up.

@enesbcs
enesbcs commented May 14, 2021

#7038

@jimmo
Member
jimmo commented Jul 21, 2022

This issue is now being tracked in #8940.
