This repository hosts the codebase for the paper *Foveation Improves Payload Capacity in Steganography*, by Lifeng Qiu Lin, Henry Kam, Qi Sun, and Kaan Akşit, 2025.
In a nutshell, this repository lets you train a steganography pipeline in a lightweight manner, evaluate it, and run inference with trained models.
To train, evaluate, or run a single inference, first go through the Preparatives. Then, use the main method in main.py to conduct any experiment or evaluation; the results are logged in logs/csv and logs/tensorboard. For single-image inference, example code is given in main.py as well, and examples.ipynb provides a step-by-step walkthrough.
## Preparatives

It is recommended to create a virtual environment to hold the packages:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements_base.txt
pip install -r requirements_torch.txt
pip install --no-deps --force-reinstall -r requirements_no_deps.txt
```

Change the versions in the requirements files if needed, e.g., to match your torch CUDA version.
Create a .env file following the example .example_env, where the dataset directory and random seed are given.
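As a hedged illustration, such a file can be loaded in Python with python-dotenv. The variable names below are hypothetical; follow .example_env for the actual keys.

```python
# Hypothetical .env contents (the real keys are defined in .example_env):
#   DATA_DIR=/path/to/datasets
#   SEED=42
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
data_dir = os.getenv("DATA_DIR")
seed = int(os.getenv("SEED", "0"))
print(data_dir, seed)
```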
### Datasets

If you don't plan to train or evaluate as in the paper, skip this part.
The datasets are accessible from the following links:
- MetFaces dataset: 1336 images at 1024x1024 resolution.
- CLIC: from 2020 to 2024, with train, validation, and test splits across professional and mobile subsets, totaling about 2K images.
  - 2020
    - train: professional 585, mobile 1048
    - valid: professional 40, mobile 61
    - test: professional 250, mobile 178
  - 2021
    - train: same as 2020 train professional
    - valid: same as 2020 valid professional
    - test: 60
  - 2022
  - 2024
    - valid: same as 2022 test
    - test: 32
Download them, or any other image dataset of your choice, to the root directory given as the data directory in .env. Finally, split and name them as you wish; just update the data cards in ./data/data_cards.yaml, giving each split a name, a size, and its sources within this directory. Once this is done, you have the dataset component in place!
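As a sketch, the data cards can be inspected with PyYAML. The field names below (size, sources) follow the description above, but the exact schema in ./data/data_cards.yaml may differ.

```python
# Minimal sketch: load and inspect the data cards.
# The exact schema of data_cards.yaml may differ from this assumption.
import yaml

with open("data/data_cards.yaml") as f:
    cards = yaml.safe_load(f)

for split_name, card in cards.items():
    # assumed fields per the description: a size and a list of source paths
    print(split_name, card.get("size"), card.get("sources"))
```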
Since MetFaces comes unsplit, a helper script, helper/split_images.py, is provided to create splits. Simply place the script in the MetFaces dataset directory and run it.
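For reference, a minimal sketch of such a random split is given below; the ratios and folder names are illustrative assumptions, not necessarily what helper/split_images.py does.

```python
# Minimal sketch of a random train/valid/test split over an image folder.
# Ratios and folder names are illustrative assumptions.
import random
import shutil
from pathlib import Path

random.seed(0)
root = Path(".")  # run from the dataset directory
images = sorted(p for p in root.iterdir() if p.suffix.lower() in {".png", ".jpg"})
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.8 * n)],
    "valid": images[int(0.8 * n): int(0.9 * n)],
    "test": images[int(0.9 * n):],
}

for split, files in splits.items():
    (root / split).mkdir(exist_ok=True)
    for f in files:
        shutil.move(str(f), root / split / f.name)
```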
### Pre-trained Autoencoder Backbone

One key ingredient in this framework is leveraging the latent space provided by high-quality image autoencoders. In autoencoders/interface.py, we provide wrappers for two pre-trained autoencoders, LDM and TAESD, with their necessary libraries, and a common interface, PretrainedAutoEncoder. This way, the model accesses autoencoders in a unified manner.
To use these two models, download their checkpoints from the following links to a folder such as ./pretrained_checkpoints, and provide this path as an argument in the experiments.
- LDM: VQ-F4
- TAESD: taesd3_encoder.pth and taesd3_decoder.pth
If you want to try another pre-trained autoencoder, simply import the necessary libraries, create another wrapper with similar loading, encoding, and decoding logic plus any pre- and post-processing specific to the model, and finally add it to the allowed list in the interface class.
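A hedged sketch of this pattern follows; the actual PretrainedAutoEncoder interface lives in autoencoders/interface.py and may define different method names and signatures.

```python
# Hypothetical sketch of a new autoencoder wrapper; the real interface in
# autoencoders/interface.py may use different method names and signatures.
import torch


class MyAutoEncoderWrapper:
    """Wraps a pre-trained autoencoder behind encode/decode calls."""

    def __init__(self, checkpoint_dir: str, device: str = "cpu"):
        self.device = device
        # load your pre-trained model from checkpoint_dir here;
        # Identity is a placeholder standing in for the real model
        self.model = torch.nn.Identity().to(device)

    @torch.no_grad()
    def encode(self, image: torch.Tensor) -> torch.Tensor:
        # model-specific pre-processing (e.g., rescaling to [-1, 1]) goes here
        return self.model(image.to(self.device))

    @torch.no_grad()
    def decode(self, latent: torch.Tensor) -> torch.Tensor:
        # model-specific post-processing (e.g., clamping to [0, 1]) goes here
        return self.model(latent)
```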
### Trained models

To be added. Take note of the trained model's directory, which is the directory you need to run inference.
Results during training can be viewed through the TensorBoard logger:
```bash
tensorboard --logdir logs/tensorboard/
```
For specific results, such as performance on the test set, see logs/csv/<experiment>/<version>/metrics.csv; the last row usually holds the test results.
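For example, the test row can be pulled out with pandas; the path placeholders below must be filled in with your run's names.

```python
# Read the CSV logger output and show the last row, which usually holds
# the test results. Replace <experiment> and <version> with your run's names.
import pandas as pd

df = pd.read_csv("logs/csv/<experiment>/<version>/metrics.csv")
print(df.tail(1).T)
```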
Model checkpoints are stored in ./checkpoints; both the best-so-far and the last checkpoints are kept. As with the data directory, the result directories can be any directory of your choice.
The main steganography pipeline described in the paper is the BaselinePipeline class in models/baseline.py. This class accepts a range of arguments covering all settings explored so far. For comparison, a benchmark method, RoSteALS, is also wrapped as a pipeline under the same framework, in models/rosteals.py.