A production-ready FastAPI wrapper around ChatDOC's OCRFlux, exposing REST API endpoints for PDF processing backed by a state-of-the-art OCR model. OCRFlux converts PDFs and images into high-quality Markdown while preserving document structure.
- Official OCRFlux Integration - Uses ChatDOC's OCRFlux-3B model
- Automatic Model Download - Downloads OCRFlux-3B model automatically on first run
- FastAPI REST API with automatic documentation and validation
- Health check endpoint for monitoring service status
- PDF processing from URLs with comprehensive OCR extraction
- Docker containerization with GPU support
- Persistent Model Storage - Models are cached in Docker volumes
- Markdown output with advanced document structure understanding
- Multi-language support for global document processing
- Production-ready with proper error handling and logging
- NVIDIA GPU with at least 12GB of GPU RAM (tested on RTX 3090, 4090, L40S, A100, H100)
- 20GB free disk space for models and processing
- 16GB system RAM recommended
- Docker with NVIDIA GPU support
- Docker Compose
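Before building, you can sanity-check the GPU requirement from the host. A minimal sketch, assuming a CUDA-enabled PyTorch build is available on the host (`pip install torch`); any tool that reports GPU memory, such as `nvidia-smi`, works equally well:

```python
import torch

# Verify a CUDA GPU is visible and meets the 12GB memory requirement.
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"{torch.cuda.get_device_name(0)}: {total_gb:.1f} GB")
assert total_gb >= 12, "OCRFlux-3B needs at least 12 GB of GPU memory"
```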
```bash
# Clone this repository
git clone <repository-url>
cd ocrflux

# Build and start the service
docker-compose up --build

# The service will automatically:
# 1. Download the OCRFlux-3B model from HuggingFace (first run only)
# 2. Cache the model in a Docker volume
# 3. Start the FastAPI server
```
- API Base URL: http://localhost:8000
- Interactive API Docs: http://localhost:8000/docs
- Alternative Docs: http://localhost:8000/redoc
On the first run, the service will:
- Download the OCRFlux-3B model (~3-5GB) from HuggingFace
- Cache it in a Docker volume for future use

This may take 10-15 minutes depending on your internet connection.
Reports service status, GPU availability, and model download status.

Endpoint: `GET /health`

Response:

```json
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "service": "OCRFlux-API",
  "version": "1.0.0",
  "gpu_available": true,
  "model_ready": true
}
```
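Because the first start can spend 10-15 minutes downloading the model, clients may want to poll `/health` until `model_ready` is `true` before sending work. A minimal sketch in Python, assuming the `requests` package is installed and the service runs on the default port:

```python
import time

import requests

def wait_until_ready(base_url="http://localhost:8000", timeout=1800):
    """Poll /health until model_ready is true (covers the first-run download)."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            health = requests.get(f"{base_url}/health", timeout=10).json()
            if health.get("model_ready"):
                return health
        except requests.RequestException:
            pass  # service may still be starting up
        time.sleep(10)
    raise TimeoutError("OCRFlux service did not become ready in time")

print(wait_until_ready())
```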
Returns details about the loaded OCRFlux model and its download status.

Endpoint: `GET /model-info`

Response:

```json
{
  "model_name": "OCRFlux-3B",
  "model_path": "/OCRFlux-3B",
  "provider": "ChatDOC",
  "description": "OCRFlux is a state-of-the-art OCR model for document understanding",
  "supports_languages": "Multi-language support",
  "output_format": "Markdown",
  "gpu_required": true,
  "min_gpu_memory": "12GB",
  "model_downloaded": true,
  "huggingface_repo": "ChatDOC/OCRFlux-3B",
  "auto_download": true
}
```
Extracts structured content from a PDF document and returns it as Markdown.

Endpoint: `POST /process-pdf`

Request Body:

```json
{
  "pdf_url": "https://example.com/document.pdf",
  "options": {
    "max_page_retries": 3,
    "gpu_memory_utilization": 0.8,
    "max_model_len": 8192
  }
}
```
Response:

```json
{
  "success": true,
  "message": "PDF processed successfully with OCRFlux",
  "processing_time": 45.67,
  "data": {
    "success": true,
    "document_text": "# Document Title\n\nThis is the extracted markdown content...",
    "metadata": {
      "original_path": "/tmp/input.pdf",
      "num_pages": 10,
      "fallback_pages": [],
      "processing_method": "OCRFlux-3B"
    },
    "pages": [
      {
        "page_number": 1,
        "text": "# Page 1 Content\n\nMarkdown formatted content...",
        "format": "markdown",
        "text_length": 1524
      }
    ],
    "summary": {
      "total_pages": 10,
      "processed_pages": 10,
      "fallback_pages": 0,
      "total_text_length": 15240,
      "processing_time": 45.67,
      "has_fallbacks": false
    }
  }
}
```
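The same request can be issued programmatically. A minimal client sketch, assuming `requests` is installed and the service runs on the default port; the field names follow the response shown above:

```python
import requests

payload = {
    "pdf_url": "https://example.com/document.pdf",
    "options": {"max_page_retries": 3, "gpu_memory_utilization": 0.8},
}

# Processing is synchronous and can take minutes for large PDFs,
# so allow a generous client-side timeout.
resp = requests.post("http://localhost:8000/process-pdf", json=payload, timeout=600)
resp.raise_for_status()
result = resp.json()

if result["success"]:
    # Save the full document Markdown and report per-page statistics.
    with open("document.md", "w", encoding="utf-8") as f:
        f.write(result["data"]["document_text"])
    summary = result["data"]["summary"]
    print(f"{summary['processed_pages']}/{summary['total_pages']} pages "
          f"in {result['processing_time']:.1f}s")
```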
Configure the service using environment variables in `docker-compose.yml`:
| Variable | Default | Description |
|---|---|---|
| `MODEL_PATH` | `/OCRFlux-3B` | Path to the OCRFlux model in the container |
| `HUGGINGFACE_MODEL_NAME` | `ChatDOC/OCRFlux-3B` | HuggingFace model repository |
| `AUTO_DOWNLOAD_MODEL` | `true` | Enable automatic model downloading |
| `MAX_PDF_SIZE` | `104857600` | Maximum PDF size in bytes (100MB) |
| `MAX_PAGES` | `1000` | Maximum pages to process |
| `MAX_PAGE_RETRIES` | `3` | Number of retries for failed pages |
| `REQUEST_TIMEOUT` | `600` | Request timeout in seconds |
| `GPU_MEMORY_UTILIZATION` | `0.8` | GPU memory utilization for vLLM |
| `MAX_MODEL_LEN` | `8192` | Maximum model context length |
| `LOG_LEVEL` | `INFO` | Logging level |
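Inside the container, settings like these are typically read from the environment. A hypothetical sketch of that pattern (not the service's actual code), using the defaults from the table:

```python
import os

# Defaults mirror the table above; environment variables override them.
MAX_PDF_SIZE = int(os.environ.get("MAX_PDF_SIZE", "104857600"))  # 100MB
MAX_PAGES = int(os.environ.get("MAX_PAGES", "1000"))
GPU_MEMORY_UTILIZATION = float(os.environ.get("GPU_MEMORY_UTILIZATION", "0.8"))
REQUEST_TIMEOUT = int(os.environ.get("REQUEST_TIMEOUT", "600"))  # seconds
```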
By default, the model is stored in a Docker-managed volume. For custom storage:
Option 1: Bind Mount (Custom Host Directory)

```yaml
volumes:
  # Replace the Docker volume with a bind mount
  - /path/to/your/models:/OCRFlux-3B
```

Option 2: Keep Docker Volume (Recommended)

```bash
# Find the volume location
docker volume inspect ocrflux_model

# Access the volume data
docker run --rm -v ocrflux_model:/models alpine ls -la /models
```
The `options` parameter in the API request supports:

- `max_page_retries`: Number of retries for failed pages
- `gpu_memory_utilization`: GPU memory usage (0.0-1.0)
- `max_model_len`: Maximum context length
```bash
# Health check
curl -X GET http://localhost:8000/health

# Get model information
curl -X GET http://localhost:8000/model-info

# Process PDF
curl -X POST http://localhost:8000/process-pdf \
  -H "Content-Type: application/json" \
  -d '{
    "pdf_url": "https://example.com/sample.pdf",
    "options": {
      "max_page_retries": 3,
      "gpu_memory_utilization": 0.8
    }
  }'
```
The service automatically downloads the OCRFlux-3B model on first run:
- First Run: Downloads the model from HuggingFace (~3-5GB)
- Subsequent Runs: Uses the cached model from the Docker volume
- Model Location: Stored in a Docker-managed volume
- Download Methods:
  - Primary: HuggingFace Hub library
  - Fallback: Git clone from HuggingFace
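To pre-fetch the model outside the container (for example, to bind-mount it as in Option 1 above), the same HuggingFace Hub library can be used directly. A minimal sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`):

```python
from huggingface_hub import snapshot_download

# Download the ChatDOC/OCRFlux-3B repository into a local directory;
# point your bind mount at this path (mounted as /OCRFlux-3B in the container).
snapshot_download(repo_id="ChatDOC/OCRFlux-3B", local_dir="./OCRFlux-3B")
```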
To manually manage the model:
```bash
# Check model status
curl http://localhost:8000/model-info

# View volume contents
docker run --rm -v ocrflux_model:/models alpine ls -la /models

# Remove model cache (will re-download on next start)
docker volume rm ocrflux_model

# Start fresh
docker-compose down
docker volume rm ocrflux_model
docker-compose up --build
```
Check GPU usage and memory:
```bash
# Monitor GPU usage
nvidia-smi

# Watch GPU usage in real-time
watch -n 1 nvidia-smi
```
View service logs:
```bash
# View logs
docker-compose logs -f ocrflux

# Check for GPU-related issues
docker-compose logs ocrflux | grep -i gpu

# Check model download progress
docker-compose logs ocrflux | grep -i download
```