Releases: openize-com/openize-markitdown-python

v25.6.0

18 Jun 12:40
f2a3819

New Features

  • Added support for multiple LLM providers: OpenAI, Claude, Gemini, and Mistral.
  • Introduced --llm CLI flag to select the desired LLM provider.
  • Applied the Strategy Pattern for LLM integration to allow clean extensibility.
  • Refactored core code for unified API/CLI usage with insert_into_llm and llm_provider options.
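The Strategy Pattern mentioned above can be sketched roughly as follows. This is only an illustration of the pattern, not the library's actual code: the class names, the `send` method, and the `STRATEGIES` registry are all hypothetical, and the provider calls are stubbed out.

```python
from abc import ABC, abstractmethod


class LLMStrategy(ABC):
    """Hypothetical common interface that every provider implements."""

    @abstractmethod
    def send(self, markdown: str) -> str:
        """Send Markdown to the provider and return a status string."""


class ClaudeStrategy(LLMStrategy):
    def send(self, markdown: str) -> str:
        # A real implementation would call the provider's API here.
        return f"claude received {len(markdown)} characters"


class GeminiStrategy(LLMStrategy):
    def send(self, markdown: str) -> str:
        # Stub, same as above.
        return f"gemini received {len(markdown)} characters"


# Hypothetical registry mapping the --llm / llm_provider value to a strategy.
STRATEGIES: dict[str, type[LLMStrategy]] = {
    "claude": ClaudeStrategy,
    "gemini": GeminiStrategy,
}


def get_strategy(provider: str) -> LLMStrategy:
    """Look up a provider by name, failing loudly on unknown names."""
    try:
        return STRATEGIES[provider.lower()]()
    except KeyError:
        raise ValueError(f"Unsupported LLM provider: {provider!r}") from None


print(get_strategy("claude").send("# Title"))
```

With this shape, adding a provider means writing one new class and one registry entry; the calling code does not change.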

Example: CLI Usage

markitdown document.docx --output-dir ./markdowns --llm claude
markitdown document.pdf --output-dir ./markdowns --llm gemini
markitdown document.pptx --output-dir ./markdowns --llm mistral

Example: Convert Documents via API

from openize.markitdown.core import MarkItDown

converter = MarkItDown(output_dir="./markdowns")

# Convert document and send to Claude
converter.convert_document("document.docx", insert_into_llm=True, llm_provider="claude")

# Convert document and send to Gemini
converter.convert_document("presentation.pptx", insert_into_llm=True, llm_provider="gemini")

# Convert document and send to Mistral
converter.convert_document("financial.xlsx", insert_into_llm=True, llm_provider="mistral")

v25.5.0

21 May 12:02
54db5f6

New Features

  • Added support for multiple LLM providers (OpenAI and Claude).
  • Introduced --llm CLI flag to select LLM provider.
  • Applied Strategy Pattern for LLM integration (extensible design).
  • Refactored code to support clean API/CLI usage with insert_into_llm and llm_provider options.

Example: CLI Usage

markitdown document.docx --output-dir ./markdowns --llm claude

Example: Convert a Document via API

from openize.markitdown.core import MarkItDown

converter = MarkItDown(output_dir="./markdowns")
converter.convert_document("document.docx", insert_into_llm=True, llm_provider="claude")

v25.4.0

25 Apr 14:45
e8f399d

Summary

This release introduces directory input support, allowing users to convert all supported document files in a folder to Markdown with a single command. The CLI now accepts either a single file or a directory, and outputs the converted content to a specified folder. This makes batch processing easier and more efficient.

New Features

  • Directory input support: Pass --input-dir to convert all supported files (.docx, .pdf, .pptx, .xlsx) in a folder.
  • Still supports single file input with --input-file.
  • --output-dir is mandatory to ensure output files are saved consistently.
  • --insert-into-llm works in both modes (optional).
  • Output directory is auto-created if missing.
  • Programmatic API support: Use MarkItDown directly in your Python code.
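As a rough illustration of what directory input involves, the sketch below collects the supported files in a folder and creates the output directory when it is missing. The helper names are hypothetical, and the non-recursive scan is an assumption; the release notes do not say whether subfolders are included.

```python
from pathlib import Path

# Extensions listed in the release notes.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".pptx", ".xlsx"}


def collect_supported_files(input_dir: str) -> list[Path]:
    """Return supported files directly inside input_dir, sorted by path."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def ensure_output_dir(output_dir: str) -> Path:
    """Create the output directory (and any parents) if it does not exist."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    return out
```

Each path returned by `collect_supported_files` could then be passed to `convert_document` in turn.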

CLI Usage Examples

Convert a single file

python main.py --input-file ./example.docx --output-dir ./markdowns

Convert an entire folder

python main.py --input-dir ./documents --output-dir ./markdowns

Convert all files and insert into LLM

python main.py --input-dir ./data --output-dir ./md --insert-into-llm
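The flag combinations shown above could be parsed with something like the following `argparse` sketch. It is an assumption about how such a CLI might be wired, not the project's actual `main.py`; the help strings are illustrative, and treating the two input flags as mutually exclusive is inferred from "either a single file or a directory".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Convert documents to Markdown.")
    # Exactly one input source must be given: a file or a directory.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input-file", help="Path to a single document")
    source.add_argument("--input-dir", help="Folder of documents to convert")
    # The release notes describe --output-dir as mandatory.
    parser.add_argument("--output-dir", required=True, help="Where Markdown files are written")
    # Optional switch that works with either input mode.
    parser.add_argument("--insert-into-llm", action="store_true", help="Also send the output to an LLM")
    return parser


args = build_parser().parse_args(["--input-dir", "./documents", "--output-dir", "./markdowns"])
print(args.input_dir, args.output_dir, args.insert_into_llm)
```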

API Usage (Python Code)

Convert a single file

from markitdown import MarkItDown

converter = MarkItDown(output_dir="./markdowns")
converter.convert_document(input_path="./test.docx", insert_into_llm=False)

Convert an entire folder

from markitdown import MarkItDown

converter = MarkItDown(output_dir="./markdowns")
converter.convert_directory(input_dir="./documents", insert_into_llm=True)

Notes

  • Supported extensions: .docx, .pdf, .pptx, .xlsx
  • If using Aspose APIs, you may be prompted to apply your license file.
  • LLM features require the OPENAI_API_KEY environment variable; OPENAI_MODEL is optional.
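A minimal sketch of reading those two variables is shown below. The function name and the returned dictionary shape are hypothetical; only the variable names come from the notes above.

```python
import os


def load_openai_settings(environ=os.environ) -> dict:
    """Read the variables named in the notes: the key is required, the model optional."""
    api_key = environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY must be set to use LLM features")
    # OPENAI_MODEL is optional, so it may be None here.
    return {"api_key": api_key, "model": environ.get("OPENAI_MODEL")}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dictionary instead of the real process environment.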

v25.3.0

20 Mar 11:38

Initial Release of Openize.MarkItDown for Python

The Openize.MarkItDown for Python library converts various file formats to Markdown, which makes their content easier to use for indexing, text analysis, and further processing. It can also pass the converted Markdown to an LLM for AI-driven applications.

It currently supports:

  • PDF (.pdf)
  • PowerPoint (.pptx)
  • Word (.docx)
  • Excel (.xlsx)

Simple API Usage

from openize.markitdown.core import MarkItDown

# Define input file and output directory
input_file = "report.pdf"
output_dir = "output_markdown"

# Create MarkItDown instance
converter = MarkItDown(output_dir)

# Convert document and send output to LLM
converter.convert_document(input_file, insert_into_llm=True)

print("Conversion completed and data sent to LLM.")

We welcome feedback and contributions to enhance Openize.MarkItDown. Feel free to submit issues, suggestions, or pull requests to our repository.

Happy converting!