[aioble] After timeout, reconnection is successful. (Peripherals are always powered off) · Issue #950 · micropython/micropython-lib
Open
63rabbits opened this issue Dec 22, 2024 · 8 comments

Comments

@63rabbits

The peripheral is powered off. The first connection attempt times out as expected, but when I retry, the connection is reported as successful. In fact, no connection exists and the following operations fail.

I'm not a native speaker, so sorry if my writing is wrong.

  • Environment

    • Raspberry Pi Pico W with RP2040
    • MicroPython v1.24.0 on 2024-10-25
    • IDE:Thonny
  • Steps to reproduce

    1. Power off the peripheral device.
    2. Timeout occurs when trying to connect.
    3. Retry; the connection is reported as successful. (It has actually failed.)
  • Code used for testing

from micropython import const
import asyncio
import aioble
# import test_aioble as aioble

import sys


_TARGET_DEVICE_ADDR = const("d7:de:4c:f4:56:e5")


async def gather(ai):
    ret=[]
    async for x in ai: ret.append(x)
    return ret


async def device_details(connection):

    if connection is None:
        return

    try:
        services = await gather(connection.services())
    except asyncio.TimeoutError as e:
        print("Timeout. (discovering services)")
        sys.print_exception(e)
    except Exception as e:
        print("Error (discovering services): {}".format(e))
        sys.print_exception(e)
    else:
        if services is None or len(services) <= 0:
            print("\n\"Service\" not found.")
            return
        
        for s in services:
            print("\t", s)


async def print_details(device):

    connection = None

    # connecting to device
    try:
        print("\nConnecting to {} ... ".format(device), end="")
        connection = await device.connect(timeout_ms=2000)
    except asyncio.TimeoutError as e:
        print("Timeout.")
        sys.print_exception(e)
    except Exception as e:
        print("Error: {}".format(e))
        sys.print_exception(e)
    else:
        print("Connected.")
        await device_details(connection)
        await connection.disconnect()


async def main():

    device = aioble.Device(aioble.ADDR_RANDOM, _TARGET_DEVICE_ADDR)

    while True:
        await print_details(device)
        r = input("\nenter return (to retry) / q(uit). > ").strip()
        if r.upper() == "Q":
            sys.exit(0)


asyncio.run(main())
  • Output
MPY: soft reboot
MicroPython v1.24.0 on 2024-10-25; Raspberry Pi Pico W with RP2040

Type "help()" for more information.

>>> 

>>> %Run -c $EDITOR_CONTENT

MPY: soft reboot

Connecting to Device(ADDR_RANDOM, d7:de:4c:f4:56:e5) ... Timeout.
Traceback (most recent call last):
  File "<stdin>", line 47, in print_details
  File "aioble/device.py", line 149, in connect
  File "aioble/central.py", line 140, in _connect
  File "aioble/device.py", line 94, in __exit__
TimeoutError: 

enter return (to retry) / q(uit). > 

Connecting to Device(ADDR_RANDOM, d7:de:4c:f4:56:e5, CONNECTED) ... Connected.
Error (discovering services): can't convert NoneType to int
Traceback (most recent call last):
  File "<stdin>", line 24, in device_details
  File "<stdin>", line 14, in gather
  File "aioble/client.py", line 128, in __anext__
  File "aioble/client.py", line 120, in _start
  File "aioble/client.py", line 193, in _start_discovery
TypeError: can't convert NoneType to int

enter return (to retry) / q(uit). > 

  • How to fix it
    I added the exception handler shown below to work around the following problems. (This is probably not the right place to fix it.)
    • When the timeout occurs, the variable "_connection" in aioble is not in the correct state.
    • When the timeout occurs, the internal state of the module "bluetooth" seems to remain connected.
# https://github.com/micropython/micropython-lib/blob/master/micropython/bluetooth/aioble/aioble/central.py#L107

async def _connect(
    connection, timeout_ms, scan_duration_ms, min_conn_interval_us, max_conn_interval_us
):

                << Omitted >>

    try:
        with DeviceTimeout(None, timeout_ms):
            ble.gap_connect(
                device.addr_type,
                device.addr,
                scan_duration_ms,
                min_conn_interval_us,
                max_conn_interval_us,
            )

            # Wait for the connected IRQ.
            await connection._event.wait()
            assert connection._conn_handle is not None

            # Register connection handle -> device.
            DeviceConnection._connected[connection._conn_handle] = connection
    except:
        device._connection = None
        ble.gap_connect(None)
        raise
    finally:
        # After timeout, don't hold a reference and ignore future events.
        _connecting.remove(device)
@brianreinhold

I am facing a similar issue which appears to be related. I have a crappy device that, after it disconnects, continues to send trailing connectable advertisements, but when a connection attempt is made the device simply does not respond and its advertisements cease. Eventually the connection attempt times out, but the system remains 'connected' in some sense and is left in a corrupted state. If I then take a measurement with another known device (a bonded reconnect), I get EINVAL 22 errors all over the place. If, instead, I try to connect to the original device that had the trailing advertisements, I get a connection event in the aioble event handler but nothing further happens. The only recovery is to reboot the application.

@brianreinhold

I think I have found the reason for the issue and the solution. When the connection attempt times out, aioble fails to cancel the ongoing connection attempt. If you then make your device discoverable again, the connection will complete (the IRQ gets handled), but nothing is waiting for it any longer, so it goes nowhere. If you try to connect to another device, ble.gap_connect() is called while a connection is already established for a different device, and you will get EINVAL 22 errors like crazy.

I was able to solve the issue by calling ble.gap_connect(None), which cancels the pending connection at the low level. I added the call in device.py:

        try:
            if exc_type == asyncio.CancelledError:
                # Case 2, we started a timeout and it's completed.
                if self._timeout_ms and self._timeout_task is None:
                    ble.gap_connect(None)  # Cancel ongoing connection
                    raise asyncio.TimeoutError

I am assuming this timeout gets signaled ONLY when the connection attempt times out.
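
The failure mode described above can be reproduced off-device with a small simulation. This is only a sketch under stated assumptions: FakeBLE is a hypothetical stand-in for the BLE stack (not the real bluetooth.BLE), and the rule that a second gap_connect() fails with EINVAL while an attempt is outstanding is modeled on the behaviour reported in this thread, not taken from the stack's source.

```python
import asyncio
import errno


class FakeBLE:
    """Hypothetical stand-in for the BLE stack; not the real bluetooth.BLE."""

    def __init__(self):
        self.pending = None  # address of the outstanding connection attempt

    def gap_connect(self, addr_type, addr=None):
        if addr_type is None:
            # Mirrors gap_connect(None): cancel the outstanding attempt.
            self.pending = None
            return
        if self.pending is not None:
            # Assumption: a new attempt while one is outstanding is rejected.
            raise OSError(errno.EINVAL, "EINVAL")
        self.pending = addr


async def connect(ble, addr, timeout_s, cancel_on_timeout):
    ble.gap_connect(0, addr)
    try:
        # The peer is powered off, so the "connected" event never fires.
        await asyncio.wait_for(asyncio.Event().wait(), timeout_s)
    except asyncio.TimeoutError:
        if cancel_on_timeout:
            ble.gap_connect(None)  # release the outstanding attempt
        raise


async def demo(cancel_on_timeout):
    ble = FakeBLE()
    results = []
    for addr in ("aa:aa", "bb:bb"):
        try:
            await connect(ble, addr, 0.01, cancel_on_timeout)
        except asyncio.TimeoutError:
            results.append("timeout")
        except OSError:
            results.append("EINVAL")
    return results
```

In this model, running demo(False) makes the second attempt fail immediately with EINVAL because the first attempt was never released, while demo(True) makes both attempts time out cleanly.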

@63rabbits
Author
  • About ble.gap_connect(None):

    Since DeviceTimeout is also used for timeouts in read/write/notify/indicate, it is better not to call ble.gap_connect(None) in device.py. It probably has no negative effect, as it only cancels a connection attempt, but it would then run on every one of those timeouts as well.
    As ble.gap_connect() is only called in one place in central.py, you can limit the extent of its effect by running ble.gap_connect(None) nearby.

# https://github.com/micropython/micropython-lib/blob/master/micropython/bluetooth/aioble/aioble/client.py#L253

        with self._connection().timeout(timeout_ms):

  • About "device._connection = None":

    At the moment I am not sure that it is correct to clear _connection directly in the exception handler as described above. Do you have a better idea?

@brianreinhold

I would like to be able to call the cancel connect on the timeout of the connect attempt, but it was not clear how to do that. I could not find a way to identify the cause of the timeout so I handled it in the timeout. So far it solves the problem I have and has not introduced a problem.

How would I do this?
"you can limit the extent of its effect by running ble.gap_connect(None) nearby."

I could not see how to do it in this block of code

with self._connection().timeout(timeout_ms):

which is where I would really like to do it.

@63rabbits
Author
63rabbits commented Jan 31, 2025

Sorry for the poor explanation. (I use a translator.)

Call ble.gap_connect(None) in the exception handler as follows.
# ble.gap_connect() is only executed at this location, so it is better to execute ble.gap_connect(None) near here.

# https://github.com/micropython/micropython-lib/blob/master/micropython/bluetooth/aioble/aioble/central.py#L107

async def _connect(
    connection, timeout_ms, scan_duration_ms, min_conn_interval_us, max_conn_interval_us
):

                << Omitted >>

    try:
        with DeviceTimeout(None, timeout_ms):
            ble.gap_connect(
                device.addr_type,
                device.addr,
                scan_duration_ms,
                min_conn_interval_us,
                max_conn_interval_us,
            )

            # Wait for the connected IRQ.
            await connection._event.wait()
            assert connection._conn_handle is not None

            # Register connection handle -> device.
            DeviceConnection._connected[connection._conn_handle] = connection
#     except:
    except asyncio.TimeoutError:
        device._connection = None   # Clear connection.
        ble.gap_connect(None)       # Cancel ongoing connection.
        raise
    finally:
        # After timeout, don't hold a reference and ignore future events.
        _connecting.remove(device)

The following line shows where 'read' starts its timeout; it uses the same DeviceTimeout class.

with self._connection().timeout(timeout_ms):

With the change below, gap_connect(None) would therefore also be called on a 'read' timeout.

        try:
            if exc_type == asyncio.CancelledError:
                # Case 2, we started a timeout and it's completed.
                if self._timeout_ms and self._timeout_task is None:
                    ble.gap_connect(None)  # Cancel ongoing connection
                    raise asyncio.TimeoutError
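
The difference between the two placements can be illustrated with a small off-device sketch. Everything here is hypothetical: RecordingBLE only records calls and is not the real bluetooth.BLE, and with_timeout() merely plays the role that the shared DeviceTimeout class plays for both connect and read. The point is that the cancel lives in the connect path only, so a read timeout never touches gap_connect().

```python
import asyncio


class RecordingBLE:
    """Hypothetical stub that only records calls; not the real bluetooth.BLE."""

    def __init__(self):
        self.calls = []

    def gap_connect(self, *args):
        self.calls.append(("gap_connect", args))


async def never():
    # Stands in for an event that never arrives (peer is powered off).
    await asyncio.Event().wait()


async def with_timeout(awaitable, timeout_s):
    # Shared helper used by every operation, like DeviceTimeout in aioble.
    return await asyncio.wait_for(awaitable, timeout_s)


async def connect(ble, addr, timeout_s):
    ble.gap_connect(0, addr)
    try:
        await with_timeout(never(), timeout_s)
    except asyncio.TimeoutError:
        ble.gap_connect(None)  # only the connect path cancels the attempt
        raise


async def read(ble, timeout_s):
    # A read timeout propagates without calling gap_connect(None).
    await with_timeout(never(), timeout_s)


async def demo():
    ble = RecordingBLE()
    for op in (connect(ble, "aa:aa", 0.01), read(ble, 0.01)):
        try:
            await op
        except asyncio.TimeoutError:
            pass
    return ble.calls
```

After demo() runs, the recorded calls contain one gap_connect(0, "aa:aa") and exactly one gap_connect(None), the latter coming from the connect timeout and not from the read timeout.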

@anacrolix

I statically analyzed the code and found the same issue. There's a missing gap_connect(None) after timeouts. I can make a PR?

@63rabbits
Author

Yes, please create a PR.
I've never made a PR.
I was wondering what to do.
Thanks.

@brianreinhold

The company I worked for folded so I am not doing this anymore and have gone back to meteorological research vs Bluetooth health devices. I will say that the fix I used has worked and has not caused any unwanted side effects in operation.

Some other issues you might want to look at in this PR: better handling of disconnects from the remote peer. There needs to be a more direct link between aioble and the application to ensure that the application does not try to do something as if connected when in fact it is disconnected. As it is now, by the time the application is signaled of the disconnect, it may already have tried to do something that it should not do in the disconnected state. That will cause catastrophic failures in aioble. I had to create an application 'global' that was set in aioble in the disconnect callback. The application would use this variable to check the connection state instead of the aioble APIs.
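
A minimal sketch of that kind of application-side flag is shown below. It is an illustration only, not the code used in the product described above: LinkState, NotConnected and guarded() are hypothetical names, and how on_disconnected() gets wired into the disconnect callback is left open.

```python
import asyncio


class NotConnected(Exception):
    """Raised when an operation is attempted while the link is flagged down."""


class LinkState:
    """Hypothetical application-side connection flag; not part of aioble."""

    def __init__(self):
        self._up = False

    def on_connected(self):
        self._up = True

    def on_disconnected(self):
        # Intended to be called directly from the disconnect callback,
        # before the application task gets a chance to run again.
        self._up = False

    def is_up(self):
        return self._up


async def guarded(link, operation):
    # Refuse to start an operation once the link is flagged as down.
    if not link.is_up():
        raise NotConnected()
    return await operation()
```

A check like this narrows the window in which the application acts on a dead connection, but it does not remove it: a disconnect can still arrive after the check while the operation is in flight, so the operation itself still needs its own error handling.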

The other big ticket item is pairing/bonding support. It does not exist for the PICO-W. We had to do a lot of low level work to support bonding and bonded reconnects using the BlueKitchen btStack. We could not operate with many medical health devices without it. In this task one needs to consider

  • peripheral security requests
  • insufficient authentication errors

which are handled differently in the btStack (unfortunately). The btStack handles the security requests internally, but the application is not told, so you have to add code to tell the application that pairing/encryption has happened. The btStack does not handle the insufficient authentication error, so that needs to be passed up to the application so it can invoke pairing/encryption.

  • pairing passkeys also need to be supported.
