Initial PoC for code autogeneration #4284

harshil21 · 2024-05-31T04:56:16Z

Using libcst, it is possible to modify the source code programmatically. This is better than regex-based editing, which can be very error-prone.

For now, this will support only simple operations, such as adding a new parameter to a method.

Will also need its own tests.

Progress:

Core:

Add a single parameter to a bot method programmatically
Do it for all failing tests
also do it for extbot.py
Implement parsing of failing user/chat/message shortcuts tests and use that to update those methods

Refactoring:

Make use of https://libcst.readthedocs.io/en/latest/codemods_tutorial.html ?
??

Bibo-Joshi · 2024-05-31T07:10:13Z

Looks very interesting! I never fail to be amazed by what people have built for Python and what you can do with it :)

Let me play the devil's advocate as usual 😈

My guess is that this approach will provide us with a best-effort starting point at best. Full, correct & stable automation will be extremely hard to reach for all of PTB's convenience functionality (and that's actually the point of tdlib/telegram-bot-api#41 (comment) 😅 )

That's why I would personally view this kind of tool more like https://github.com/python-telegram-bot/ptb-changelog-helper/:

a tool useful for the maintainers of PTB, but rarely for anyone else
no stability of the interface, no strict guarantees on functionality
output needs manual checking in any case
same result can be achieved by different means and/or manual work

I'm therefore hesitant to add this to the main repository. I would instead propose to create a standalone repository for it in the ptb organization. The dependency on the scraper functionality from test_official could be resolved by also moving that to a standalone repository that both ptb-api-update-helper and python-telegram-bot/requirements-dev.txt can rely on. A ptb-bot-api-parser could even be somewhat useful for other people (e.g. we could use pydantic instead of dataclasses and auto-generate an OpenAPI schema).

What do you think?
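As a rough illustration of the standalone parser idea above, here is a sketch of what scraped API data could look like as plain dataclasses, plus a function that renders one method as a JSON-Schema-style object of the kind an OpenAPI document would embed. All names here (`ApiParameter`, `ApiMethod`, `to_json_schema`, `JSON_TYPES`) are hypothetical and not taken from test_official, and the parameter list for `sendMessage` is abbreviated.

```python
from dataclasses import dataclass

# Hypothetical mapping from Bot API primitive names to JSON Schema types.
JSON_TYPES = {
    "Integer": "integer",
    "String": "string",
    "Boolean": "boolean",
    "Float": "number",
}


@dataclass(frozen=True)
class ApiParameter:
    name: str
    api_type: str
    required: bool


@dataclass(frozen=True)
class ApiMethod:
    name: str
    parameters: tuple[ApiParameter, ...] = ()


def to_json_schema(method: ApiMethod) -> dict:
    """Render the parameters of one method as a JSON-Schema-style object."""
    properties = {}
    for param in method.parameters:
        if param.api_type in JSON_TYPES:
            properties[param.name] = {"type": JSON_TYPES[param.api_type]}
        else:
            # Anything else is assumed to be another API object, referenced by name.
            properties[param.name] = {"$ref": f"#/components/schemas/{param.api_type}"}
    return {
        "title": method.name,
        "type": "object",
        "properties": properties,
        "required": [p.name for p in method.parameters if p.required],
    }


send_message = ApiMethod(
    "sendMessage",
    (
        ApiParameter("text", "String", True),
        ApiParameter("disable_notification", "Boolean", False),
        ApiParameter("link_preview_options", "LinkPreviewOptions", False),
    ),
)
schema = to_json_schema(send_message)
```

With pydantic models instead of dataclasses, the schema generation step would come from the library rather than from a hand-written function like this one.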

harshil21 · 2024-05-31T08:06:23Z

The fact that we need some tool like this at all is kinda sad, since >80% of any Bot API update is just boilerplate changes.

> My guess is that this approach will provide us with a best-effort starting point at best

yes, it's intended to shorten developer time, not eliminate it - that's the goal of an AI tool

> I would instead propose to create a standalone repository for it in the ptb organization.

agreed, that's why I already isolated it from everything else. I only added it here for now because it's easier to access test_official and the telegram files. If it's outside the repository, we will have to specify the path to find the source files every time.

> The dependency on the scraper functionality from test_official could be resolved by also moving that to a standalone repository that both ptb-api-update-helper and python-telegram-bot/requirements-dev.txt can rely on

interesting.. that could make test_official a little harder to change/refactor. If we go with this, test_official.scraper will need to be more "independent" than it currently is.

I also did identify other functions inside test_official which will need to be used by this tool, like the TYPE_MAPPING and its accompanying mapper. So either way we need to preprocess the data well, either through our current test_official method or a 3rd party/in-house api schema for code autogeneration to be reliable.

> The fact that we need some tool like this at all is kinda sad, since >80% of any Bot API update is just boilerplate changes.
>
> > My guess is that this approach will provide us with a best-effort starting point at best
>
> yes, it's intended to shorten developer time, not eliminate it - that's the goal of an AI tool

👍

> > I would instead propose to create a standalone repository for it in the ptb organization.
>
> agreed, that's why I already isolated it from everything else.

👍

> I only added it here for now because it's easier to access test_official and the telegram files. If it's outside the repository, we will have to specify the path to find the source files every time.

I would expect that a config file can largely circumvent that :)
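For example, a small config file could hold the location of the PTB checkout so that the helper does not need the path on every invocation. The sketch below uses the standard library's `configparser`; the section name, key names and file paths are assumptions made up for illustration, not an agreed format.

```python
import configparser
from pathlib import Path

# Hypothetical config contents; section and key names are invented for this sketch.
EXAMPLE_CONFIG = """
[paths]
ptb_root = ~/code/python-telegram-bot
bot_module = telegram/_bot.py
extbot_module = telegram/ext/_extbot.py
"""


def load_source_paths(config_text: str) -> dict[str, Path]:
    """Resolve every configured module path relative to the configured PTB root."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    root = Path(parser["paths"]["ptb_root"]).expanduser()
    return {
        key: root / value
        for key, value in parser["paths"].items()
        if key != "ptb_root"
    }


paths = load_source_paths(EXAMPLE_CONFIG)
for name, path in paths.items():
    print(f"{name}: {path}")
```

In a real tool the text would come from a file on disk (for instance via `parser.read(...)`) rather than from an inline string.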

> > The dependency on the scraper functionality from test_official could be resolved by also moving that to a standalone repository that both ptb-api-update-helper and python-telegram-bot/requirements-dev.txt can rely on
>
> interesting.. that could make test_official a little harder to change/refactor. If we go with this, test_official.scraper will need to be more "independent" than it currently is.
>
> I also did identify other functions inside test_official which will need to be used by this tool, like the TYPE_MAPPING and its accompanying mapper. So either way we need to preprocess the data well, either through our current test_official method or a 3rd party/in-house api schema for code autogeneration to be reliable.

Good points! If we go that way, those functions should then also be part of the external repo's interface.
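To illustrate the kind of preprocessing meant by a type mapping and its mapper, here is a simplified sketch that turns type strings as written in the Bot API docs into Python annotation strings. It is not the `TYPE_MAPPING` from test_official: `SIMPLE_TYPES` and `to_annotation` are hypothetical names, and only the "Array of ..." and "... or ..." patterns are handled.

```python
# Hypothetical mapping; not the TYPE_MAPPING used by test_official.
SIMPLE_TYPES = {
    "Integer": "int",
    "String": "str",
    "Boolean": "bool",
    "Float": "float",
}

ARRAY_PREFIX = "Array of "


def to_annotation(api_type: str, required: bool = True) -> str:
    """Translate a Bot API type string into a Python annotation string."""
    api_type = api_type.strip()
    if api_type.startswith(ARRAY_PREFIX):
        # Arrays can nest, e.g. "Array of Array of PhotoSize".
        inner = to_annotation(api_type[len(ARRAY_PREFIX):])
        annotation = f"Sequence[{inner}]"
    elif " or " in api_type:
        parts = [to_annotation(part) for part in api_type.split(" or ")]
        annotation = f"Union[{', '.join(parts)}]"
    else:
        # Unknown names are assumed to be API classes and passed through as-is.
        annotation = SIMPLE_TYPES.get(api_type, api_type)
    return annotation if required else f"Optional[{annotation}]"


print(to_annotation("Integer or String", required=False))
print(to_annotation("Array of Array of PhotoSize"))
```

Whichever repository ends up owning such a mapper, both the test suite and a code generator would need to agree on its output, which is the point made above about it being part of the shared interface.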

To summarize the discussion in the dev chat a bit:

there are some 3rd party repos that parse the api docs to openapi/custom schemas (see below). One could consider building on those, e.g. in combination with https://github.com/koxudaxi/datamodel-code-generator/
keeping the code of test_official as is is a valid option even if a 3rd party API parser is used for a ptb-api-update-helper
rough criteria for using 3rd party tooling:
- for test_official we need something that we can query on demand (i.e. parse the docs every time we run test_official) and that we see as sufficiently stable
- for a ptb-api-update-helper we'd need something that we see as sufficiently stable and is updated quickly enough after the api updates

3rd party tools that I've found in a 20min search:

~~https://botapi.apimatic.dev/#/http/getting-started only knows 5.0, apparently~~
https://github.com/sys-001/telegram-bot-api-versions knows only up to 7.0, new versions apparently need to be requested via ticket. supports return types
https://github.com/PaulSonOfLars/telegram-bot-api-spec seems to be active. supports return types
https://github.com/alserom/telegram-bot-api-spec seems rather active. Has automated PRs. Provides the specs also in a custom schema tailored to the bot api in addition to "plain" openapi. supports return types
~~https://github.com/tranql/telegram-bot-api-schema looks abandoned~~
~~https://github.com/eliask/telegram-bot-api-schema looks abandoned~~
https://github.com/ark0f/tg-bot-api | https://ark0f.github.io/tg-bot-api/ looks up to date, rust based, has a custom schema as well. Supports return types

Pydantic types projects:

~~https://github.com/devtud/pygramtic looks abandoned~~
~~https://github.com/isys35/pyteledantic looks abandoned~~

Bibo-Joshi added 🔌 enhancement and removed WIP labels

Nov 3, 2024

I vote to close this since we have AI agents now. While it would be a fun thing to code this up, maintaining this itself would be like adjusting test_official for every api update. I haven't used AI agents yet, but this would be the kind of thing you would just tell it to copy and patch. The new VSCode Copilot agent sounds pretty promising, we should probably give it a shot and see how it does.

I agree on closing. AI tools do help with the updates, though I personally still see customized tooling for specific repeating tasks as valid. You usually want deterministic output for that, which is not as easy with AI :D
My gut feeling is that we'd either have to rebuild the API wrapper part from the ground up based on Auto-Generation or not do Auto-Generation at all ...

Bibo-Joshi closed this

Apr 16, 2025

github-actions locked and limited conversation to collaborators

Apr 24, 2025

initial PoC for code autogeneration

0670e42

harshil21 added enhancement labels May 31, 2024
