This project implements a scalable Multi-modal Large Language Model (MLLM) based approach for detecting AI-generated images (AIGC). It provides a command-line interface through the `mllmdf` (shorthand for "MLLM Defake") CLI tool for analyzing and detecting fake images generated by AI.
This is the official implementation of:
- 2025/04 Towards Explainable Fake Image Detection with Multi-Modal Large Language Models
- 2025/06 Interpretable and Reliable Detection of AI-Generated Images via Grounded Reasoning in MLLMs
To install:

```bash
# Clone and install the package
git clone [email protected]:Gennadiyev/mllm-defake.git
cd mllm-defake
pip install -e ".[dev]"
```
To use:

```bash
# Run basic inference with GPT-4o-mini
export OPENAI_API_KEY='sk-...'

# 1. Classify a single image as real or fake (or unknown) using GPT-4o-mini. Prints the result to the console.
mllmdf classify demo/real/img118131.jpg --model gpt4omini

# 2. Evaluate on a dataset containing real and fake images. Produces a CSV file with results under ./outputs
mllmdf infer --model gpt4omini --real_dir demo/real --fake_dir demo/fake

# 3. Generate a markdown report from the results. Checks all CSV files under ./outputs
mllmdf doc
```
Use `--help` to learn more about each command. The `mllmdf` entry point is at `mllm_defake/cli.py`.
Warning
To support a new MLLM, you may wish to edit the `mllm_defake/vllms.py` file, not the `MLLMClassifier` class. Then modify `cli.py` to include the new model.
The core of the package is the `MLLMClassifier` class in `mllm_defake/classifiers/mllm_classifier.py`. It provides:
- Prompt-based classification using various MLLMs
- Support for different model backends (GPT-4-Vision, the self-hosted Llama herd of models, etc.)
- Evaluation metrics and result documentation
- Decorator pattern for result post-processing
For external integrations, `MLLMClassifier.classify` can be used directly. It takes an image path (and a label if desired) and returns a prediction.
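For example, here is a minimal sketch of such an integration. The constructor arguments and return format below are assumptions, not the documented API; check `mllm_defake/classifiers/mllm_classifier.py` for the real signature:

```python
from mllm_defake.classifiers.mllm_classifier import MLLMClassifier

# Hypothetical constructor arguments -- the actual parameter names may differ.
classifier = MLLMClassifier(model="gpt4omini", prompt="simple_detector")

# classify() takes an image path (and optionally a label) and returns a prediction.
prediction = classifier.classify("demo/real/img118131.jpg")
print(prediction)  # e.g. 1 for real, 0 for fake, -1 for unknown
```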
New MLLMs can be implemented by editing `vllms.py`. Currently, supported models include:
- GPT-4o (`gpt4o`)
- GPT-4o-Mini (`gpt4omini`)
- GPT-4.5 (`gpt45`)
- Llama-3.2-11B-Vision-Instruct (`llama32vi`)
- Llama-3.2V-11B-CoT (`llama32vcot`)
- QVQ-72B-Preview (`qvq`)
- InternVL-2.5-26B (`internvl25`)
- LLaVA-OneVision (`onevision`, using the `llava-hf` version)
- Qwen2-VL and Qwen2.5-VL (`qwen2vl`)
- vLLM (`vllm`) for self-hosted models
More models can be added by extending the base class to suit your needs.
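As a rough illustration, a new backend might be wired up along these lines. The base-class name `VLLM`, the `query` method, and the message format are all hypothetical placeholders; consult `vllms.py` for the actual interface:

```python
import os

import requests

from mllm_defake.vllms import VLLM  # hypothetical base-class name


class MyCustomVLM(VLLM):
    """Sketch of a wrapper around an OpenAI-compatible chat endpoint."""

    def __init__(self, base_url: str | None = None, model: str = "my-vlm"):
        self.base_url = base_url or os.environ["BASE_URL"]
        self.model = model

    def query(self, messages: list[dict]) -> str:
        # Send a chat-completions style request and return the text response.
        response = requests.post(
            f"{self.base_url}/chat/completions",
            json={"model": self.model, "messages": messages},
            timeout=120,
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]
```

After adding the class, remember to modify `cli.py` (see the warning above) so the new model can be selected from the command line.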
Warning
The repository does NOT download or run inference with the open-source models automatically. Use vLLM to serve your own API server, and pass its URL via the `BASE_URL` environment variable.
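As a minimal sketch, assuming the repository talks to an OpenAI-compatible endpoint at `BASE_URL` (the model name and port below are placeholders):

```python
# Start an OpenAI-compatible server first, e.g.:
#   python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2-VL-7B-Instruct --port 8000
# and expose it to the CLI via:
#   export BASE_URL=http://localhost:8000/v1
import os

from openai import OpenAI

# Quick sanity check that the self-hosted endpoint is reachable and
# answers OpenAI-style requests.
client = OpenAI(base_url=os.environ["BASE_URL"], api_key="EMPTY")
print(client.models.list())
```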
After months of iteration, several overhauls, and quite a few patches, the prompt system is now stable and ready for use. It allows prompts to be created independently of specific MLLMs and datasets, and you can use decorators to interact with the prompt system and modify prompts in Python at runtime.
Here is a template to get you started:
```json
{
    "format_version": "3",
    "name": "simple_detector",
    "conversations": [
        {
            "id": "main",
            "response_var_name": "result",
            "payload": [
                ["system", "You are analyzing an image. Is this image real or AI-generated?"],
                ["user", "Please analyze this image. End your response with `real` or `fake`.", "{image_url}"]
            ]
        }
    ]
}
```
If "result"
is specified as the response variable, it will be returned, in string format, as the result of the classification. A post-processing function is then used to convert the string to an integer value, where 0 is fake, 1 is real, -1 is unknown.
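A sketch of what such a post-processing step might look like (the function name is illustrative, not the repository's actual helper):

```python
def response_to_label(response: str) -> int:
    """Map a free-form MLLM response to 1 (real), 0 (fake), or -1 (unknown)."""
    verdict = response.strip().lower().rstrip(".`'\"")
    # The template above asks the model to end its answer with `real` or `fake`,
    # so inspecting the tail of the response is usually enough.
    if verdict.endswith("real"):
        return 1
    if verdict.endswith("fake"):
        return 0
    return -1
```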
For more complex prompts, please refer to the `prompts` directory.
The logic of the prompt system is implemented within the `MLLMClassifier` class. To interact with the environment from Python, you can define your own decorators under `./decorators`.
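For instance, a runtime prompt-modifying decorator could be sketched as follows. The registration mechanism and message format are assumptions; the real decorator interface lives under `./decorators` and may differ:

```python
from functools import wraps


def append_instruction(extra: str):
    """Hypothetical decorator that appends an extra user instruction at runtime."""
    def decorator(build_payload):
        @wraps(build_payload)
        def wrapper(*args, **kwargs):
            payload = build_payload(*args, **kwargs)
            payload.append(["user", extra])
            return payload
        return wrapper
    return decorator


@append_instruction("Answer with a single word: real or fake.")
def build_payload(image_url: str):
    # Mirrors the payload format of the JSON template above.
    return [
        ["system", "You are analyzing an image. Is this image real or AI-generated?"],
        ["user", "Please analyze this image.", image_url],
    ]
```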
To mitigate the risk of exposing API keys, environment variables are used to store sensitive information. The environment variable names are hard-coded in the `mllm_defake/cli.py` file, mainly in the `load_mllm` function.
This project is dual-licensed under the MIT and Apache 2.0 licenses. You may choose either license that best suits your needs.