Enhancing PDF extraction: multi-column layout and OCR #35

Hi Eduard,

Thank you for creating such a powerful package!

I wonder if you plan to extend the PDF extraction functionality in `llm_message()` to automatically detect whether the PDF is multi-column or requires OCR and then apply the appropriate extraction method. From my experience, `pdftools::pdf_text()` does not currently handle these scenarios effectively.
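
To illustrate the OCR part of the request, here is a rough sketch of the kind of fallback I have in mind. It is not how the package works today: `needs_ocr()` and `extract_pdf_text()` are hypothetical helper names, and the 20-character threshold is an arbitrary assumption. It only uses `pdftools::pdf_text()` and `pdftools::pdf_ocr_text()` (the latter needs the `tesseract` package installed) and does not address multi-column layouts.

```r
# Hypothetical helper: flag pages whose extracted text layer is (nearly) empty,
# which usually suggests a scanned page. The threshold is an assumption.
needs_ocr <- function(page_text, min_chars = 20) {
  nchar(gsub("\\s+", "", page_text)) < min_chars
}

# Hypothetical wrapper: extract the text layer first, then OCR only the pages
# that appear to have no usable text.
extract_pdf_text <- function(path, min_chars = 20) {
  text <- pdftools::pdf_text(path)
  scanned <- needs_ocr(text, min_chars)
  if (any(scanned)) {
    text[scanned] <- pdftools::pdf_ocr_text(path, pages = which(scanned))
  }
  text
}
```

Something like `extract_pdf_text("paper.pdf")` would then return one string per page, with OCR applied only where the text layer looks empty.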

Additionally, I noticed that `pdf_page_batch()` prepares both the text and image of each PDF page as a list of LLM messages. I’m new to this multimodal functionality and wanted to ask: Is the inclusion of both text and images primarily to account for the layout or structure of the PDF?