For code blocks we have the CodeFormula VLM, which is specifically tuned for the OCR task of reading code, i.e. preserving proper syntax, whitespace, etc. The current implementation runs the model on the transformers runtime (sometimes problematic, e.g. on macOS), but we are just about to generalize the model runtimes so it can also run via an API, MLX, etc.
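As a minimal sketch of turning this enrichment on (assuming the `do_code_enrichment` option exposed on Docling's PDF pipeline options; option names may vary between versions, and the input filename here is just an example):

```python
# Sketch: enable CodeFormula code enrichment in Docling's PDF pipeline.
# Assumes the `do_code_enrichment` pipeline option triggers the CodeFormula VLM.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_code_enrichment = True  # run CodeFormula on detected code blocks

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
result = converter.convert("paper_with_code.pdf")  # example input path
print(result.document.export_to_markdown())
```

Note that this downloads and runs the CodeFormula model locally on the transformers runtime, which is exactly the part the upcoming runtime generalization is meant to relax.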
Thanks for the quick reply! I guess it is the same problem as reported in #2833. Understood; I will try it out. But just to be sure: is there no way to treat code blocks like any other images? Or, more generally: extract only the text and send all remaining figures/blocks to an external VLM for transcription as images? My reasoning is that a properly deployed VLM can be more performant and/or produce better output than the small internal Docling models.
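For the figures-as-images route described above, a sketch under the assumption that Docling's `generate_picture_images` pipeline option and `PictureItem.get_image` behave as in current releases; the external-VLM call is a purely hypothetical placeholder for whatever client you deploy:

```python
# Sketch: export detected figures as cropped images and hand them to an
# external VLM. Assumes `generate_picture_images` / `PictureItem.get_image`;
# `transcribe_with_external_vlm` is a hypothetical stand-in for your own client.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling_core.types.doc import PictureItem

pipeline_options = PdfPipelineOptions()
pipeline_options.generate_picture_images = True  # keep cropped figure images
pipeline_options.images_scale = 2.0              # higher resolution for the VLM

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
doc = converter.convert("input.pdf").document  # example input path

for item, _level in doc.iterate_items():
    if isinstance(item, PictureItem):
        pil_image = item.get_image(doc)  # PIL.Image of the cropped figure
        # transcription = transcribe_with_external_vlm(pil_image)  # hypothetical
```

As the maintainer notes, code blocks are a separate element class handled by the CodeFormula enrichment, so they would not show up in this loop; this only covers pictures/diagrams.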
Hi, I noticed that code blocks are ignored when OCR is disabled. My plan was to send the code blocks to an external VLM for transcription, like any other image, but that does not work: general images and diagrams work, code blocks do not.
Enabling the expensive OCR just for that does not seem right (and it is slow).
Or is this somehow configurable? Thanks!