This script automates the extraction and standardization of invoice data from PDF files using OpenAI's API. It processes invoices, extracts key details, and saves them in an Excel file.
- The script allows extracting any required data by specifying the desired fields in the processing request.
- Uploads PDF invoices for processing.
- Extracts structured data including date, invoice number, seller information, item details, and amounts.
- Saves processed data to invoices.xlsx.
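As a rough illustration of how requested fields could drive extraction, the sketch below builds a prompt from a field list and parses a JSON reply into a row with one key per field. The field names and the `build_prompt` and `parse_reply` helpers are hypothetical and not taken from the script; the actual request sent to the OpenAI API may look different.

```python
import json
import re

# Hypothetical field list; the script's real field names are not shown here.
FIELDS = ["date", "invoice_number", "seller", "items", "total_amount"]


def build_prompt(fields):
    """Ask the model for a JSON object containing exactly the requested keys."""
    return (
        "Extract the following fields from the invoice and reply with a single "
        "JSON object using exactly these keys: " + ", ".join(fields) + ". "
        "Use null for any field that is missing."
    )


def parse_reply(reply_text, fields):
    """Pull the JSON object out of a model reply and keep only known fields."""
    # Replies may wrap the JSON in prose, so search for the outermost braces.
    match = re.search(r"\{.*\}", reply_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    data = json.loads(match.group(0))
    # Missing keys become None so every row ends up with the same columns.
    return {name: data.get(name) for name in fields}


if __name__ == "__main__":
    sample = 'Sure! {"date": "2024-01-31", "invoice_number": "INV-7"} Done.'
    print(build_prompt(FIELDS))
    print(parse_reply(sample, FIELDS))
```

Rows shaped like this can be collected into a list and handed to pandas to write invoices.xlsx. Note that pandas needs an Excel engine such as openpyxl installed to write .xlsx files.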
- Python 3.x
- Required libraries: os, pandas, configparser, json, re, time, requests
- OpenAI API key (set in config.ini)
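A minimal sketch of reading the key with `configparser` follows. The `[openai]` section name and the `load_api_key` helper are assumptions for illustration; use whatever section and option names config_example.ini actually defines.

```python
import configparser

# Section and option names below are assumptions; match them to config_example.ini.
EXAMPLE_CONFIG = """
[openai]
API_KEY = sk-your-key-here
"""


def load_api_key(config_text, section="openai", option="API_KEY"):
    """Return the API key from INI-formatted text, failing loudly if it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback="" covers both a missing section and a missing option.
    key = parser.get(section, option, fallback="").strip()
    if not key or key == "API_KEY":
        raise ValueError("API key is missing or still set to the placeholder")
    return key


if __name__ == "__main__":
    print(load_api_key(EXAMPLE_CONFIG))
```

To read from disk instead of a string, open config.ini and pass its contents to the function, or call `parser.read("config.ini")` directly.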
- Install dependencies:
  pip install pandas requests
- Configure OpenAI API key:
  - Copy config_example.ini to config.ini.
  - Replace API_KEY with your actual OpenAI API key.
- Place invoice PDFs in the invoices/ folder.
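One way the script might discover its input files is sketched below. The `list_invoice_pdfs` helper is hypothetical, and it assumes PDFs sit directly in the folder rather than in subfolders.

```python
import os


def list_invoice_pdfs(folder="invoices"):
    """Return sorted paths of the PDF files directly inside the folder."""
    if not os.path.isdir(folder):
        # A missing folder yields no work rather than an exception.
        return []
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".pdf")  # also accept .PDF
    )


if __name__ == "__main__":
    for path in list_invoice_pdfs():
        print(path)
```

Returning an empty list for a missing folder keeps a first run from crashing before any invoices have been added.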
Run the script:
python main.py