@Copilot Copilot AI commented Sep 27, 2025

This PR implements support for iOS-OCR-Server as a new OCR provider, enabling users to turn their iPhone or iPad into a powerful local OCR server using Apple's Vision Framework.

What's New

iOS-OCR-Server Provider (ios_ocr)

  • Leverages Apple's native Vision Framework for high-quality OCR
  • Provides completely local processing with no cloud dependencies
  • Offers unlimited usage without API costs
  • Returns both OCR text and bounding box coordinates

Key Features

  • Privacy-Focused: All OCR processing happens locally on your iOS device
  • Cost-Effective: No per-request charges or API limits
  • High Quality: Uses Apple's advanced Vision Framework
  • Easy Setup: Simple HTTP API integration with multipart file uploads
  • Robust Error Handling: Handles request and response errors and retries failed requests

Configuration

Add the following environment variables to use iOS-OCR-Server:

OCR_PROVIDER: "ios_ocr"
IOS_OCR_SERVER_URL: "http://your-ios-device-ip:8080"
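
As a rough illustration of how these two variables might be read and validated at startup, here is a minimal Go sketch. The `Config` type, the `loadConfig` helper, and the specific checks are hypothetical and are not taken from this PR; only the variable names and the `ios_ocr` value come from the section above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
)

// Config holds the two settings named in the Configuration section.
// The type itself is hypothetical and exists only for this sketch.
type Config struct {
	Provider  string
	ServerURL string
}

// loadConfig reads the settings through getenv (os.Getenv in real use)
// and rejects values that cannot work. The checks are illustrative.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Provider:  getenv("OCR_PROVIDER"),
		ServerURL: getenv("IOS_OCR_SERVER_URL"),
	}
	if cfg.Provider != "ios_ocr" {
		return cfg, fmt.Errorf("OCR_PROVIDER is %q, expected \"ios_ocr\"", cfg.Provider)
	}
	if cfg.ServerURL == "" {
		return cfg, errors.New("IOS_OCR_SERVER_URL must be set when OCR_PROVIDER is ios_ocr")
	}
	u, err := url.Parse(cfg.ServerURL)
	if err != nil {
		return cfg, fmt.Errorf("IOS_OCR_SERVER_URL is not a valid URL: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return cfg, fmt.Errorf("IOS_OCR_SERVER_URL %q needs an http(s) scheme and a host", cfg.ServerURL)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("using iOS-OCR-Server at", cfg.ServerURL)
}
```

Passing the lookup function as a parameter keeps the validation testable without touching the process environment.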

Implementation Details

The implementation follows the established OCR provider patterns in the codebase:

  • Provider Interface: Implements the standard Provider interface with ProcessImage() method
  • HTTP Client: Uses retryablehttp.Client for reliable communication with iOS devices
  • Response Parsing: Handles the iOS-OCR-Server JSON response format including OCR text and bounding boxes
  • Metadata: Captures image dimensions and bounding box count in OCR results
  • Validation: Proper configuration validation with helpful error messages
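
The request flow described in the list above could look roughly like the following Go sketch. It is not the code from this PR: it uses the standard library's `http.Client` instead of `retryablehttp.Client` to stay self-contained, and the `/upload` path, the `file` form field name, and all function names are assumptions that should be checked against the iOS-OCR-Server documentation.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"strings"
	"time"
)

// ocrReply holds only the response fields this sketch reads.
type ocrReply struct {
	Success   bool   `json:"success"`
	Message   string `json:"message"`
	OCRResult string `json:"ocr_result"`
}

// buildUpload wraps the image bytes in a multipart/form-data body.
// The form field name "file" is an assumption about the server.
func buildUpload(filename string, image []byte) ([]byte, string, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(image); err != nil {
		return nil, "", err
	}
	if err := w.Close(); err != nil { // Close writes the final boundary
		return nil, "", err
	}
	return buf.Bytes(), w.FormDataContentType(), nil
}

// parseReply turns an HTTP status and body into OCR text or an error.
func parseReply(status int, raw []byte) (string, error) {
	if status != http.StatusOK {
		return "", fmt.Errorf("server returned HTTP %d: %s", status, strings.TrimSpace(string(raw)))
	}
	var reply ocrReply
	if err := json.Unmarshal(raw, &reply); err != nil {
		return "", fmt.Errorf("decoding response: %w", err)
	}
	if !reply.Success {
		return "", fmt.Errorf("server reported failure: %s", reply.Message)
	}
	return reply.OCRResult, nil
}

// uploadImage posts one image and returns the recognized text.
// The "/upload" path is an assumption about the server's API.
func uploadImage(ctx context.Context, client *http.Client, baseURL, filename string, image []byte) (string, error) {
	body, contentType, err := buildUpload(filename, image)
	if err != nil {
		return "", fmt.Errorf("building upload: %w", err)
	}
	endpoint := strings.TrimRight(baseURL, "/") + "/upload"
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", contentType)
	req.Header.Set("Accept", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return "", fmt.Errorf("sending image: %w", err)
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(io.LimitReader(resp.Body, 10<<20)) // read at most 10 MiB
	if err != nil {
		return "", fmt.Errorf("reading response: %w", err)
	}
	return parseReply(resp.StatusCode, raw)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: upload <server-url> <image-file>")
		return
	}
	image, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Println("reading image:", err)
		return
	}
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	text, err := uploadImage(ctx, http.DefaultClient, os.Args[1], os.Args[2], image)
	if err != nil {
		fmt.Println("OCR failed:", err)
		return
	}
	fmt.Println(text)
}
```

Because `uploadImage` accepts any `*http.Client`, a retrying client can be substituted without changing the function; for example, retryablehttp's `StandardClient()` method returns an `*http.Client` that wraps its retry logic.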

Testing

  • All tests in the new provider's test suite pass
  • Tests cover successful processing, error handling, and edge cases
  • Integration tests verify provider factory registration
  • Manual testing confirms proper application startup and validation

Documentation

Updated README.md with:

  • Complete setup instructions for iOS-OCR-Server
  • Environment variable documentation
  • Docker Compose configuration examples
  • Feature comparison with other OCR providers

This addition expands paperless-gpt's OCR capabilities to include a privacy-focused, cost-effective option for users with iOS devices, complementing the existing cloud-based and self-hosted OCR providers.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https (dns block)
  • openaipublic.blob.core.windows.net
    • Triggering command: /tmp/go-build1921101790/b001/paperless-gpt.test -test.testlogfile=/tmp/go-build1921101790/b001/testlog.txt -test.paniconexit0 -test.timeout=10m0s (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>New OCR Provider: iOS-OCR-Server</issue_title>
<issue_description>https://github.com/riddleling/iOS-OCR-Server

Turn your iPhone into a powerful local OCR server using Apple's Vision Framework. No cloud dependencies, unlimited usage, complete privacy.

It has an API, so it should be easy to integrate.

I ran it on the "LLM-Based OCR: Compare for Yourself" examples from the index.

OCR Results

Example 1

{
  "message": "File uploaded successfully",
  "image_width": 446,
  "ocr_result": "La Grande Récré\nCentre Comercial L'EspLanade\n1348 LOUVAIN LA NEUVE\nTVA 860826401 Tel : 010 45 95 12\nTicket 14421 1e 03/11/2006 à 13:27:18\n4007176614518 DINOS TYRANNOSA\n14.90\nTOTAL T.T.C.\n14.\nRéglé par Cash Euro\n50.00\nRendu en Cash Euro\n35.10\nV.14.6\nHotesse : VALERIE\nTICKET A CONSERVER POUR TOUT ECHANGE\nMERCI ET A BIENTOT",
  "ocr_boxes": [
    {
      "text": "La Grande Récré",
      "w": 388,
      "x": 30.000001888888843,
      "h": 45.99999999999997,
      "y": 35.999999744444466
    },
    {
      "text": "Centre Comercial L'EspLanade",
      "w": 282,
      "x": 70.00000105999999,
      "h": 24.000000000000014,
      "y": 83.99999976
    },
    {
      "text": "1348 LOUVAIN LA NEUVE",
      "w": 208.00000000000003,
      "x": 108.00000183999995,
      "h": 26.000000000000043,
      "y": 107.99999973999998
    },
    {
      "text": "TVA 860826401 Tel : 010 45 95 12",
      "w": 310,
      "x": 51.9999977857143,
      "h": 24,
      "y": 138.0000000857143
    },
    {
      "text": "Ticket 14421 1e 03/11/2006 à 13:27:18",
      "w": 378.7994079589844,
      "x": 33.59443994401789,
      "h": 32.84000205993652,
      "y": 158.24676726130787
    },
    {
      "text": "4007176614518 DINOS TYRANNOSA",
      "w": 286.5103759765625,
      "x": 31.73426893709754,
      "h": 28.93005561828616,
      "y": 187.82401693712825
    },
    {
      "text": "14.90",
      "w": 58,
      "x": 345.99999942,
      "h": 24.00000000000003,
      "y": 188.00000007999998
    },
    {
      "text": "TOTAL T.T.C.",
      "w": 118.1504211425781,
      "x": 129.92479027146985,
      "h": 26.691444396972628,
      "y": 215.65427778228616
    },
    {
      "text": "14.",
      "w": 26,
      "x": 358.000000325,
      "h": 19.99999999999997,
      "y": 215.99999985000005
    },
    {
      "text": "Réglé par Cash Euro",
      "w": 186,
      "x": 33.99999838000002,
      "h": 24,
      "y": 244.00000023999996
    },
    {
      "text": "50.00",
      "w": 52,
      "x": 360.00000009999997,
      "h": 26.00000000000003,
      "y": 240.00000021666662
    },
    {
      "text": "Rendu en Cash Euro",
      "w": 178,
      "x": 32.00000097999998,
      "h": 24,
      "y": 269.99999976000004
    },
    {
      "text": "35.10",
      "w": 54.46428680419922,
      "x": 359.7678571359908,
      "h": 26.978559494018555,
      "y": 265.5107201431787
    },
    {
      "text": "V.14.6",
      "w": 64.24905395507812,
      "x": 31.87547306498469,
      "h": 24.672264099121094,
      "y": 295.6638677108738
    },
    {
      "text": "Hotesse : VALERIE",
      "w": 176.00000000000003,
      "x": 119.99999780000005,
      "h": 24,
      "y": 296.00000009999997
    },
    {
      "text": "TICKET A CONSERVER POUR TOUT ECHANGE",
      "w": 352,
      "x": 31.99999648000005,
      "h": 26,
      "y": 320.00000014
    },
    {
      "text": "MERCI ET A BIENTOT",
      "w": 176.00000000000003,
      "x": 139.99999853333335,
      "h": 26,
      "y": 346.0000001166667
    }
  ],
  "success": true,
  "image_height": 408
}
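
A response shaped like Example 1 could be decoded in Go along the following lines. This is a sketch based only on the JSON shown above, not the PR's actual parsing code: the type names, the `metadata` helper, and its key names are hypothetical, and the integer types for `image_width` and `image_height` assume the server always sends whole numbers, as it does in this example. The sample payload is Example 1 trimmed to its first two boxes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Box is one recognized text region. In Example 1 the values look like
// pixel coordinates within the uploaded image.
type Box struct {
	Text string  `json:"text"`
	X    float64 `json:"x"`
	Y    float64 `json:"y"`
	W    float64 `json:"w"`
	H    float64 `json:"h"`
}

// Response mirrors the fields visible in Example 1.
type Response struct {
	Success     bool   `json:"success"`
	Message     string `json:"message"`
	OCRResult   string `json:"ocr_result"`
	OCRBoxes    []Box  `json:"ocr_boxes"`
	ImageWidth  int    `json:"image_width"`
	ImageHeight int    `json:"image_height"`
}

// decodeResponse parses the raw JSON and rejects unsuccessful replies.
func decodeResponse(raw []byte) (Response, error) {
	var r Response
	if err := json.Unmarshal(raw, &r); err != nil {
		return r, fmt.Errorf("decoding OCR response: %w", err)
	}
	if !r.Success {
		return r, fmt.Errorf("OCR server reported failure: %s", r.Message)
	}
	return r, nil
}

// metadata collects image dimensions and the bounding box count.
// The key names are hypothetical.
func metadata(r Response) map[string]string {
	return map[string]string{
		"image_width":  strconv.Itoa(r.ImageWidth),
		"image_height": strconv.Itoa(r.ImageHeight),
		"box_count":    strconv.Itoa(len(r.OCRBoxes)),
	}
}

// sample is Example 1 trimmed to its first two boxes.
const sample = `{
  "message": "File uploaded successfully",
  "image_width": 446,
  "ocr_result": "La Grande Récré\nCentre Comercial L'EspLanade",
  "ocr_boxes": [
    {"text": "La Grande Récré", "w": 388, "x": 30.000001888888843, "h": 45.99999999999997, "y": 35.999999744444466},
    {"text": "Centre Comercial L'EspLanade", "w": 282, "x": 70.00000105999999, "h": 24.000000000000014, "y": 83.99999976}
  ],
  "success": true,
  "image_height": 408
}`

func main() {
	r, err := decodeResponse([]byte(sample))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%d boxes on a %dx%d image\n", len(r.OCRBoxes), r.ImageWidth, r.ImageHeight)
	fmt.Println(metadata(r))
}
```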

Example 2

{
  "ocr_result": "FedEx\nInvoice Number: 1-996-84199\nInvoice Date:\nSep 01, 2014\nAccount Number:\n1334-8037-4\nPage:\n1 of 2\nFedEx Tax ID: 71-0427007\nIRIS INC\nSHARON ANDERSON\n4731 W ATLANTIC AVE STE B1\nDELRAY BEACH FL 33445-3897\nBilling Account Shipping Address:\nIRIS INC\n4731W ATLANTIC AVE\nDELRAY BEACH FL 33445-3897 US\nRECEIVED\nSEP\n8 RECD\nBY:\nInvoice Questions?\nContact FedEx Revenue Services\nPhone:\nFax:\n(800) 622-1147 M-F 7-6 (CST)\n(800) 548-3020\nInternet:\nwww.fedex.com\nInvoice Summary Sep 01, 2014\nFedEx Ground Services\nOther Charges\nTotal Charg........\nTOTAL THIS INVOICE ....\n. USD S\nUSD S\nThe only charges accrued for this period is the Weekly Service Charge.\n11.00\n11.00\n11.00\npasted 9/21/14\nThe FedEx Ground accounts referenced in this invoice have been transferred and assigned to, are owned by, and are payable to FedEx Express.\nTo ensure proper credit, please return this portion with your payment to FedEx.\nPlease do not staple or fold. Please make your check payable to FedEx.\n• For change of address, check here and complete form craverse ride\nRemittance Advice\nYour payment is due by Sep 16, 2004\nInvoice\nNumber\n1-996-84199\n133480371996841993200000110073\nAccount\nNumber\n1334-8037-4\nAmount\nDue\nUSD $ 11.00\nAT 01 031292 46844B196 A**3DGT\nIRIS ...

</details>
Fixes icereed/paperless-gpt#652


@Copilot Copilot AI changed the title from "[WIP] New OCR Provider: iOS-OCR-Server" to "Add iOS-OCR-Server provider support for local OCR processing" Sep 27, 2025
@Copilot Copilot AI requested a review from icereed September 27, 2025 06:19
Copilot finished work on behalf of icereed September 27, 2025 06:19