-
Notifications
You must be signed in to change notification settings - Fork 165
Pycom frozen MQTTClient leaks memory when publishing messages under 1.11.0.b1 #108
Comments
Here's an even shorter main.py removing the class entirely.
|
I replaced the Pycom provided MQTTClient with umqtt.simple from micropython-lib and the issue is gone. Here's the test using umqtt.simple...
Here's the beginning of an example run...
|
This is not a leakage but your output queue is growing till no memory is left. In your logs i cannot see the line "Packet sent. (Length: XXXX)". Please check the connection to your broker. The library you are using is AWS compliant and you should interface with the MQTTLib.py file. Please see this link for a usage example: [https://docs.pycom.io/chapter/tutorials/all/aws.html] If you use the publishing example you should something like this: Packet sent. (Length: 4) Iteration 1997 GC free 29504 |
Hello @oligauc, Thank you for pointing out that I was improperly accessing MQTTClient. I'm still learning where all the documentation is. I've conducted further testing and amended the code to address the points you raised. I believe I'm still seeing the same issue I originally raised and I also have a design concern. Design Concern: If you connect to a non-existent host in the first place the Pycom MQTT client doesn't seem to raise any exception at all. Am I missing something here? I would also tend to expect client software to send something like ECONNRESET (or whatever) in the event the connection disappears mid-run. Original Issue: Client still seems to leak memory for some reason even with mqtt_client.configureOfflinePublishQueueing(0) and mqtt_client.configureDrainingFrequency(10). I would expect this to disable publish queuing or at the very least try to empty the queue every 1/10th of a second (and my code only try's to send a message every quarter of a second). Here's my amended test case code that addresses your points. I ran it against a RabbitMQ broker in my testing.
Here's the beginning and end of a sample run...
I'm definitely seeing packets get sent in my output but it doesn't seem to occur at the frequency I would expect. Did I not disable queueing/draining properly? |
The MQTT library requires 3 types of connections:
In the first 2 types, an exception is thrown if something goes wrong. In the 3rd type, conn.connect() will return false The Pycom MQTT library was designed to work with AWS (Amazon IoT). It takes about 0.05 sec to send a publish request, process the reply and call the callback. You can test this with the chronometer. In your case you are using the RabbitMQ. RabbitMQ is replying to the connect and ping requests, but it is not sending a reply to the publish request. This is making the receive timeout at 3 sec. If you are connected, the queue does not drop messages. You need to configure the maximum queue size, with a non zero value, to start dropping messages. |
I don't understand what you're trying to say about there being "3 connections". To the best of my knowledge the socket connection to the server is made directly to RabbitMQ. This is the connection to the broker so there are in fact only two connections, right? In my case they are...
Regarding the design concern: The client behavior seems extremely odd to me. If you feed the Pycom MQTT client an invalid MQTT host to connect to it will "pretend to connect" and just loop infinitely never even notifying the user there's a problem. This behavior really doesn't seem right to me. Why would this code not throw a sane error like "host not found" or "connection timed out"? It seems wrong for the end user to have to check conn.connect() and then throw a meaningful error because the Pycom code does not. Perhaps I should open a separate issue to address this. Regarding the original issue: When you say "You need to configure the maximum queue size, with a non zero value, to start dropping messages." are you referring to ie mqtt_client.configureOfflinePublishQueueing(1)? |
The 3 rd connection is a packet. When you call conn.connect() a socket is set up and a connect packet is sent to the MQTT server. If you try to set up a socket connection to a non existing server an exception is thrown. |
@oligauc Would you prefer I open a separate issue for the design concern? Does it seem reasonable to you to expect the SocketError to not be suppressed? At lines 92-99 the socket error is suppressed and not re-thrown. This hides the connection failure from downstream code unless the downstream code explicitly evaluates the return, this does not seem right to me. Why would it be desirable to return a boolean defining success/failure instead of letting the socket error bubble up? |
If the connection to the server is lost (e.g server down) the device re-attempts to connect. Ideally you want the device to recover from any failure instead of just stop working. |
I definitely agree having the client try to re-establish the connection is a good idea. However, it is not necessary to permanently suppress the exception in order to attempt to re-establish the connection. The code could, for example, try to re-establish the connection N times (N being configurable) and then allow the exception to bubble up. |
I agree, it's a good idea. @oligauc can we do that? Add a maximum number of retries? Thanks. |
OK i will implement it for the next release |
I'm seeing this on a WiPy 2.0 running my own build of 1.11.0.b1. The only changes I made to the firmware was to add datetime into the frozen modules area. In the test case below I didn't even load datetime so I believe that's irrelevant. This same memory leak seems to occur in the last release of the firmware as well when the frozen MQTTClient was introduced (at least that's how it looked to me).
Exact steps to cause this issue
Here's the main.py
And here's the boot.py
The text was updated successfully, but these errors were encountered: