RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
A Repo For Document AI
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
A curated list of resources for the Document Understanding (DU) topic
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Algorithms, papers, datasets, and performance comparisons for Document AI. Continuously updated.
A curated list of awesome Table Structure Recognition (TSR) research, including models, papers, datasets, and code. Continuously updated.
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
ReadingBank: A Benchmark Dataset for Reading Order Detection
Object Detection Model for Scanned Documents
Checkbox Detection Model for Scanned Documents
Datasets and Evaluation Scripts for CompHRDoc
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.
[MM'2024] Official implementation of "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction."