Replies: 1 comment
-
|
from pathlib import Path Try this import first (most common in 2025–2026 releases)try: INPUT_PDF = Path("out.pdf") def extract_all_text_robust(pdf_path: Path) -> str: ) def main(): if name == "main": |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Hey everyone,
I'm currently exploring the capabilities of docling for parsing PDF files and converting them into markdown. In my tests I've noticed that header images were not recognized as furniture, while the footer was successfully identified. I don't want to have the header image in the exported markdown file.
Are there options to:
Or are there any other options to exclude the header images from the exported markdown?
Thank you!
Beta Was this translation helpful? Give feedback.
All reactions