Predict the rotation angle of a given picture with a CNN. This project can be used for cracking rotate-captchas.
Test result:
Two kinds of models are implemented, as shown in the table below.
| Name | Backbone | Cross-Domain Loss (less is better) | Params | MACs |
|---|---|---|---|---|
| RotNet | ResNet50 | 53.4684° | 24.246M | 4.09G |
| RotNetR | RegNet_Y_3_2GF | 6.5922° | 18.117M | 3.18G |
RotNet is a PyTorch implementation of d4nst/RotNet. RotNetR is based on RotNet, replacing the backbone with RegNet_Y_3_2GF and using 128 classes. Its average prediction error is 7.1818°, obtained after 128 epochs of training (3.4 hours) on the COCO 2017 (Unlabeled) dataset.
The Cross-Domain Test uses the COCO 2017 (Unlabeled) dataset for training and captcha pictures from Baidu (thanks to @xiangbei1997) for testing.
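For reference, the error reported above can be understood as a circular angle distance. Below is a hedged sketch of such a metric; the function name is illustrative, not this repository's API:

```python
def angle_error(pred_deg: float, true_deg: float) -> float:
    """Circular error in degrees: the length of the shorter arc
    between the predicted and the true angle."""
    d = abs(pred_deg - true_deg) % 360.0
    return min(d, 360.0 - d)
```

With this metric, predicting 359° when the truth is 1° counts as a 2° error rather than 358°.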
The captcha picture used in the demo above comes from RotateCaptchaBreak.
- CUDA device with mem>=16G for training (reduce the batch size if necessary)
- Python>=3.9,<3.14
- PyTorch>=2.0
- Clone the repository.

```shell
git clone https://github.com/lumina37/rotate-captcha-crack.git --depth 1
cd ./rotate-captcha-crack
```

- Install all required dependencies.
This project strongly suggests using uv>=0.5.3 for package management. If you already have uv, run:

```shell
uv sync
```

Or, if you prefer conda: the following steps will create a virtual env under the working directory. You can also use a named env.
```shell
conda create -p .conda
conda activate ./.conda
conda install matplotlib tqdm tomli
conda install pytorch torchvision pytorch-cuda=12.4 -c pytorch -c nvidia
```

Or, if you prefer a direct pip:
```shell
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
pip install .
```

- Download the *.zip files in Release and unzip them all to the ./models dir.
The directory structure will be like ./models/RotNetR/230228_20_07_25_000/best.pth
The model names will change frequently while the project is still in beta. So, if any FileNotFoundError occurs, try rolling back to the corresponding tag first.
```shell
uv run test_captcha.py
```

If you do not have uv, please use:

```shell
python test_captcha.py
```

Open ./debug.jpg to check the result.

- Install extra dependencies
With uv:

```shell
uv pip install .[server]
```

or with conda:

```shell
conda install aiohttp
```

or with pip:

```shell
pip install .[server]
```

- Launch server
```shell
uv run server.py
```

If you do not have uv, just use:

```shell
python server.py
```

- Another Shell to Send Images
Use curl:

```shell
curl -X POST --data-binary @test.jpg http://127.0.0.1:4396
```

Or use Windows PowerShell:

```powershell
irm -Uri http://127.0.0.1:4396 -Method Post -InFile test.jpg
```
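The same request can also be sent from Python using only the standard library. This is a sketch based on the curl example above; the raw-JPEG body is an assumption mirroring `--data-binary`, and the response format depends on the server:

```python
import urllib.request

def build_rotate_request(image_path: str,
                         url: str = "http://127.0.0.1:4396") -> urllib.request.Request:
    """Build a POST request carrying the raw image bytes,
    mirroring `curl --data-binary @test.jpg`."""
    with open(image_path, "rb") as f:
        data = f.read()
    return urllib.request.Request(url, data=data, method="POST")

# To actually send it (the server must be running):
# with urllib.request.urlopen(build_rotate_request("test.jpg")) as resp:
#     print(resp.read())
```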
- For this project I used Google Street View and Landscape-Dataset for training. You can collect some photos and put them all in one directory. There is no size or shape requirement.
- Modify the `dataset_root` variable in `train.py` so that it points to the directory containing your images.
- No manual labeling is required. All the cropping, rotation and resizing are done on the fly after the image is loaded.
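The on-the-fly labeling can be sketched as follows. Both helpers are hypothetical illustrations (not this repository's API), assuming a random rotation fraction is mapped onto the model's class indices:

```python
import random

def factor_to_label(angle_factor: float, cls_num: int = 128) -> int:
    """Map a rotation fraction in [0, 1) onto one of cls_num classes."""
    return int(angle_factor * cls_num) % cls_num

def random_rotation_sample(cls_num: int = 128) -> tuple[float, int]:
    """Draw a random rotation angle (degrees) and its class label on the fly,
    so no manual annotation is ever needed."""
    angle_factor = random.random()  # uniform in [0, 1)
    return angle_factor * 360.0, factor_to_label(angle_factor, cls_num)
```

Each loaded image is rotated by the sampled angle, and the corresponding class index becomes its training label.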
Train:

```shell
uv run train_RotNetR.py
```

Validate:

```shell
uv run test_RotNetR.py
```

Most rotate-captcha cracking methods are based on d4nst/RotNet, with ResNet50 as the backbone. RotNet treats angle prediction as a classification task with 360 classes, then uses cross entropy to compute the loss.
Yet CrossEntropyLoss with one-hot labels imposes a uniform metric distance between all angle classes: with the target [0,1,0,0], the predictions [0.1,0.8,0.1,0] and [0.1,0.8,0,0.1] incur exactly the same loss, even though one puts its residual mass next to the true angle and the other far away. CSL (Circular Smooth Label) provides a loss measurement closer to our intuition, such that predictions near the true angle are penalized less than distant ones.
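A minimal sketch of a circular smooth label, assuming a Gaussian window wrapped around the class circle (the window shape and sigma are illustrative choices, not this repository's exact settings):

```python
import math

def csl_label(true_cls: int, cls_num: int = 128, sigma: float = 2.0) -> list[float]:
    """Gaussian window centered on the true class, wrapped circularly and
    normalized to sum to 1, so nearby classes get more mass than distant ones."""
    raw = []
    for i in range(cls_num):
        d = abs(i - true_cls)
        d = min(d, cls_num - d)  # circular distance between class indices
        raw.append(math.exp(-d * d / (2.0 * sigma * sigma)))
    total = sum(raw)
    return [v / total for v in raw]
```

Training against such a soft target (e.g. with soft-label cross entropy) penalizes a prediction one class away far less than one half the circle away.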
Meanwhile, the angle_error_regression proposed by d4nst/RotNet is less effective: when facing outliers, its gradient drives training toward non-convergence. A SmoothL1Loss works better for regression.
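For intuition, SmoothL1 is quadratic near zero and linear beyond a threshold, so outliers contribute a gradient of bounded magnitude instead of one that grows with the error. A sketch matching the standard definition (as in PyTorch's SmoothL1Loss with parameter beta):

```python
def smooth_l1(x: float, beta: float = 1.0) -> float:
    """0.5 * x^2 / beta for |x| < beta (smooth gradient near zero),
    |x| - 0.5 * beta otherwise (constant-magnitude gradient for outliers)."""
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta
```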
