PDF Summary Extraction API

This project provides an API for uploading PDF documents and extracting structured summaries using either OpenAI's GPT-4o or Anthropic's Claude models. It is optimized for environmental and agricultural reports, transforming unstructured PDF content into structured JSON for downstream use.

🛠️ Tech Stack

Framework: Node.js with Express
PDF Parsing: pdf-parse
LLM Integration: OpenAI GPT-4o and Claude Opus 4
Storage: In-memory via Multer
Language: TypeScript

🚀 Features

Upload a PDF and extract structured data in a single API call
Choose between OpenAI and Claude as the LLM provider
Validate and clean the model's JSON response so it matches a consistent schema
Return a fallback empty structure if the model call or JSON parsing fails
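
The validation and fallback behavior listed above could look roughly like the sketch below. It is not the project's actual code: the `ExtractedReport` fields shown here (`title`, `summary`, `findings`) are hypothetical stand-ins for whatever `utils/types.ts` defines, and `parseReport`, `stripCodeFence`, and `emptyReport` are illustrative names.

```typescript
// Hypothetical shape: the real ExtractedReport lives in utils/types.ts
// and its fields are not shown in this README.
interface ExtractedReport {
  title: string;
  summary: string;
  findings: string[];
}

// Fallback returned when the model output cannot be parsed.
function emptyReport(): ExtractedReport {
  return { title: "", summary: "", findings: [] };
}

// Models often wrap JSON in a Markdown code fence; remove it if present.
function stripCodeFence(raw: string): string {
  const fence = "`".repeat(3);
  const pattern = new RegExp(
    "^" + fence + "(?:json)?\\s*([\\s\\S]*?)\\s*" + fence + "$"
  );
  const trimmed = raw.trim();
  const match = trimmed.match(pattern);
  return match?.[1] ?? trimmed;
}

// Parse the raw model reply, keep only well-typed fields,
// and fall back to the empty structure on any failure.
function parseReport(raw: string): ExtractedReport {
  try {
    const data: unknown = JSON.parse(stripCodeFence(raw));
    if (typeof data !== "object" || data === null || Array.isArray(data)) {
      return emptyReport();
    }
    const obj = data as Record<string, unknown>;
    return {
      title: typeof obj.title === "string" ? obj.title : "",
      summary: typeof obj.summary === "string" ? obj.summary : "",
      findings: Array.isArray(obj.findings)
        ? obj.findings.filter((f): f is string => typeof f === "string")
        : [],
    };
  } catch {
    return emptyReport();
  }
}
```

With this approach the caller always receives an object of the same shape, so downstream code does not need to distinguish a failed extraction from an empty report.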

📦 Setup Instructions

Clone the Repository

```
git clone https://github.com/your-org/pdf-extractor.git
cd pdf-extractor
```

Install Dependencies
```
npm install
```

Environment Configuration

Create a .env file:

```
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_claude_key
```
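
Because each provider needs its own key, a startup or per-request check can fail early with a readable message when one is missing. The sketch below assumes the two variable names from the .env example; `requireApiKey` and `KEY_NAMES` are hypothetical names, and how the project actually loads .env is not described here.

```typescript
type Provider = "openai" | "claude";

// Environment variable names taken from the .env example.
const KEY_NAMES: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  claude: "ANTHROPIC_API_KEY",
};

// Hypothetical helper: returns the key for a provider or throws a clear error.
function requireApiKey(
  provider: Provider,
  env: Record<string, string | undefined>
): string {
  const name = KEY_NAMES[provider];
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}; add it to .env before starting the server`);
  }
  return value;
}
```

In a Node.js process this would typically be called as `requireApiKey("claude", process.env)` once the environment has been loaded.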

Run the Server
```
npm run dev
```

🧪 API Usage

POST `/upload`

Upload a PDF and extract a summary:

Form Data:

file: PDF file (multipart/form-data)

Optional Query:

provider: openai (default) or claude
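
The default described above can be expressed as a small resolver. This is a sketch only: `resolveProvider` is a hypothetical name, and returning `null` for an unrecognized value (so the route could answer with a 400) is an assumption, since the README does not say how the server treats unknown providers.

```typescript
type ProviderName = "openai" | "claude";

// Maps the raw `provider` query value to a known provider.
// A missing or empty value falls back to the documented default, "openai".
function resolveProvider(value: unknown): ProviderName | null {
  if (value === undefined || value === "") return "openai";
  if (value === "openai" || value === "claude") return value;
  return null; // unknown provider; handling is left to the caller
}
```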

Example with cURL:

```
curl -X POST "http://localhost:3000/upload?provider=claude" -F "file=@./sample.pdf"
```
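
The same request can be made from TypeScript. The sketch below assumes a runtime with global `fetch`, `FormData`, and `Blob` (for example Node.js 18 or later) and assumes the endpoint answers with a JSON body; `uploadUrl` and `uploadPdf` are hypothetical helper names, and the base URL is taken from the cURL example.

```typescript
// Builds the endpoint URL; the provider query parameter is optional.
function uploadUrl(baseUrl: string, provider?: "openai" | "claude"): string {
  const url = new URL("/upload", baseUrl);
  if (provider) url.searchParams.set("provider", provider);
  return url.toString();
}

// Sends the PDF as multipart/form-data under the "file" field.
async function uploadPdf(
  pdf: Blob,
  fileName: string,
  provider?: "openai" | "claude",
  baseUrl = "http://localhost:3000"
): Promise<unknown> {
  const form = new FormData();
  form.append("file", pdf, fileName);
  const res = await fetch(uploadUrl(baseUrl, provider), {
    method: "POST",
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Upload failed with status ${res.status}`);
  }
  // Assumption: the server responds with the extracted report as JSON.
  return res.json();
}
```

The `Content-Type` header is left unset on purpose so that `fetch` can add the multipart boundary itself.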

📁 Folder Structure

```
├── routes/
│   └── upload.ts       → Express route for file upload
├── services/
│   ├── pdfService.ts   → PDF parsing logic
│   └── llmService.ts   → LLM integration (OpenAI & Claude)
├── utils/
│   ├── types.ts        → ExtractedReport interface
│   └── constants.ts    → Prompt templates
├── .env
└── README.md
```

Repository: 0xCryptoAngel/LLM-pdf-extraction
