Skip to content

pdfix/pdfix-accessibility-automation-example

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Here's a comprehensive README.md file for your GitHub repository based on the three code sources you provided:

PDF/UA Compliance Automation Tool

Overview

This tool provides automated PDF/UA compliance processing using the PDFix SDK. It performs tagging, validation, and correction of PDF documents to meet PDF/UA or WCAG 2.2 accessibility standards.

Features

  • Automatic PDF tagging - Adds proper structure and tags for accessibility
  • PDF/UA validation - Checks compliance with PDF/UA standards
  • Automated fixes - Corrects common compliance issues
  • Validation reporting - Provides detailed XML reports of compliance issues

Requirements

  • Python 3.6+
  • PDFix SDK (properly installed and licensed)
  • Java Runtime Environment (for validation tool)
  • greenfield-apps-1.27.0-SNAPSHOT.jar validation tool veraPDF

Installation

  1. Clone this repository and initialize python virtial environemnt

    Linux / macOS

    python3 -m venv .venv
    source .venv/bin/activate

    Windows:

    python3 -m venv .venv
    env/Scripts/activate
  2. Install dependencies:

    pip install pdfix-sdk

    or

    pip install -r requirements
  3. Ensure Java is installed and in your PATH

  4. Place the validation JAR file in the expected location:

    {project_root}/validation/greenfield-apps-1.27.0-SNAPSHOT.jar
    

Usage

Command Line

python main.py

Processing Workflow

  1. Opens the input PDF (pdf/example.pdf)
  2. Automatically tags the document
  3. Performs initial validation (saves to pdf/validate.pdf)
  4. Applies necessary fixes based on validation results
  5. Performs final validation (saves to pdf/tagged.pdf)

Customization

Modify these paths in main.py as needed:

inputPath = "pdf/example.pdf"
validatePath = "pdf/validate.pdf"
taggedPath = "pdf/tagged.pdf"

Code Structure

File Description
main.py Main execution script
pdf.py Contains PDF processing functions (autotagPdf, fixUaClauses)
validation.py Handles PDF/UA validation and report parsing

Key Functions

  • autotagPdf(doc: PdfDoc) - Adds accessibility tags to PDF
  • fixUaClauses(doc: PdfDoc, rules: list) - Applies fixes for specific PDF/UA clauses
  • validatePdf(doc: PdfDoc, pdfPath: str) - Validates PDF and returns rule violations
  • runJavaValidation(pdfPath) - Executes Java validation tool
  • parseValidationReport(xmlReport: str) - Parses XML validation results

Supported PDF/UA Clauses

The tool currently handles fixes for these PDF/UA requirements:

  • Clause 5 (PDF/UA identification)
  • Clause 7.1 (Document title display)
  • Clause 7.2 (Language specification)

For more clauses and accessibility actions visit https://pdfix.net/products/pdfix-sdk/actions/

Error Handling

The script will:

  • Exit with code 1 on critical errors
  • Print detailed error messages to stderr
  • Preserve validation reports for troubleshooting

Output

  • Processed PDF files in the pdf/ directory
  • Validation reports printed to console
  • Final compliance status message

License

PDFix License

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages