A Python CLI to test, benchmark, and find the best RAG chunking strategy for your Markdown documents.
Updated Nov 23, 2025 - Python
Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.
A lightweight Python library for metadata-rich document chunking in Retrieval-Augmented Generation (RAG) workflows. It leverages Azure AI Document Intelligence to enhance chunking by retaining hierarchical structure, page numbers, and bounding boxes for seamless integration with PDF viewers.
"My complete LangChain learning journey — from basics to advanced RAG, LCEL, LangGraph, LangServe, LangSmith with hands-on code examples."
This repository provides a fully modular implementation of a Retrieval-Augmented Generation (RAG) pipeline tailored for Italian legal-domain documents.