This project is a set of examples for action recognition with deep learning. It is based on PyTorch and aims to classify still images of human actions.
The dataset this project is built on is Stanford 40 Actions, which contains more than 9,500 images capturing human actions. To download the dataset, check here.
The project is based on two main packages, `models` and `sfd40`:
- `models`: Covers all the functionality around the neural networks.
- `sfd40`: Covers all the functionality around loading the Stanford 40 data.

Both packages have a manager class that serves as the main class of its package. These two classes play an important role in the `main.py` file.
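For orientation, here is a minimal sketch of how `main.py` might wire the two manager classes together. The class and method names (`ModelsManager`, `Stanford40Manager`, `load`, `train`, `evaluate`) are assumptions made for illustration, not the packages' actual API.

```python
# Hypothetical sketch of how main.py could wire the two managers together.
# The class and method names below are assumptions made for illustration.
from models import ModelsManager        # hypothetical manager of the models package
from sfd40 import Stanford40Manager     # hypothetical manager of the sfd40 package


def main() -> None:
    data_manager = Stanford40Manager()                      # reads IMAGE_FILES_PATH / XML_FILES_PATH
    train_loader, val_loader, test_loader = data_manager.load()

    model_manager = ModelsManager()                         # builds the custom and/or pretrained network
    model_manager.train(train_loader, val_loader)
    model_manager.evaluate(test_loader)


if __name__ == "__main__":
    main()
```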
The default approach shown in this README is based on `uv`; however, all dependencies can be installed with other tools as well. To install `uv`, check here.
First, we export the XML annotation and image directories so the script can locate the two directories:
export IMAGE_FILES_PATH="absolute-path-stanford40-jpeg-images-dir"
export XML_FILES_PATH="absolute-path-stanford40-xml-annotations-dir"
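For reference, a generic sketch of how a script could read and validate these two variables; this is not the project's actual loading code.

```python
import os
from pathlib import Path

# os.environ raises a KeyError when a variable is not exported, which is a
# reasonable failure mode for required paths.
image_dir = Path(os.environ["IMAGE_FILES_PATH"])
xml_dir = Path(os.environ["XML_FILES_PATH"])

for directory in (image_dir, xml_dir):
    if not directory.is_dir():
        raise FileNotFoundError(f"Expected directory does not exist: {directory}")
```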
To run the neural network example, simply run:
make run
The script can be configured through environment variables. An example run with a custom configuration:
# Increased number of epochs
NN_NUM_EPOCHS=500 make run
# Increased number of epochs and only the pretrained model selected
NN_NUM_EPOCHS=500 MODEL="pretrained" make run
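Under the hood the overrides are plain environment variables, so a typed lookup with a default is enough to pick them up. The helper below is a hypothetical sketch, not the project's configuration code.

```python
import os


def env(name: str, default, cast=str):
    """Read an environment variable, falling back to a default and casting the raw string."""
    raw = os.getenv(name)
    return default if raw is None else cast(raw)


num_epochs = env("NN_NUM_EPOCHS", 25, int)   # e.g. NN_NUM_EPOCHS=500 make run
model_choice = env("MODEL", "both")          # "pretrained", "custom" or the default "both"
print(num_epochs, model_choice)
```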
The env vars used are:
Name | Description | Type | Default |
---|---|---|---|
VALIDATION_RATIO | The percentage of the training data used for validation | float | 0.05 |
TEST_RATIO | The percentage of the full dataset used for testing | float | 0.15 |
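As a sketch of how such ratios are typically applied with PyTorch, the test split can be carved from the full dataset first and the validation split from the remaining training data. The use of `torch.utils.data.random_split` and a dummy dataset below is an assumption for illustration, not necessarily how `sfd40` splits the data.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Dummy dataset standing in for the Stanford 40 images/labels.
full_dataset = TensorDataset(torch.randn(200, 3, 8, 8), torch.randint(0, 40, (200,)))

TEST_RATIO, VALIDATION_RATIO = 0.15, 0.05

# Carve the test split out of the full data first ...
test_size = int(len(full_dataset) * TEST_RATIO)
train_val_set, test_set = random_split(full_dataset, [len(full_dataset) - test_size, test_size])

# ... then take the validation split from the remaining training data.
val_size = int(len(train_val_set) * VALIDATION_RATIO)
train_set, val_set = random_split(train_val_set, [len(train_val_set) - val_size, val_size])

print(len(train_set), len(val_set), len(test_set))  # 162 8 30
```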
Name | Description | Type | Default |
---|---|---|---|
IMAGE_FILES_PATH | The path to the JPEG images directory | string | "JPEGImages" |
XML_FILES_PATH | The path to the XML annotations directory | string | "XMLAnnotations" |
Name | Description | Type | Default |
---|---|---|---|
NN_IMAGE_READ_MODE | Image read mode ("GRAY" or "RGB") | str | "RGB" |
NN_LEARNING_RATE | The learning rate used for training | float | 1e-4 |
NN_TRANSFORM_RESIZE | The target size of the image resize transform | int | 224 |
NN_TRAIN_BATCH_SIZE | The batch size used for training | int | 128 |
NN_TEST_BATCH_SIZE | The batch size used for testing | int | 50 |
NN_VAL_BATCH_SIZE | The batch size used for validation | int | 15 |
NN_NUM_EPOCHS | The number of training epochs | int | 25 |
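To make the transform-related variables concrete, here is a minimal torchvision sketch of how `NN_IMAGE_READ_MODE` and `NN_TRANSFORM_RESIZE` could translate into an image transform; the exact composition is an assumption, not the project's pipeline.

```python
from PIL import Image
from torchvision import transforms

READ_MODE = "RGB"   # NN_IMAGE_READ_MODE: "RGB" -> 3 channels, "GRAY" -> 1 channel
RESIZE = 224        # NN_TRANSFORM_RESIZE

steps = [transforms.Resize((RESIZE, RESIZE))]
if READ_MODE == "GRAY":
    steps.append(transforms.Grayscale(num_output_channels=1))
steps.append(transforms.ToTensor())
transform = transforms.Compose(steps)

# Dummy image standing in for a Stanford 40 JPEG.
image = Image.new("RGB", (640, 480))
tensor = transform(image)
print(tensor.shape)  # torch.Size([3, 224, 224]) for RGB, torch.Size([1, 224, 224]) for GRAY
```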
Name | Description | Type | Default |
---|---|---|---|
MODEL | Which model to use ("pretrained" or "custom"). If unset, the script iterates over both models (first custom, then pretrained) | str | "both" |
SAVE_AS_YAML | Whether to save the hyperparameters and results to a YAML file | bool | True |
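The `MODEL` and `SAVE_AS_YAML` options imply a simple selection loop. The sketch below shows one way that loop could look, with hypothetical builder functions (`build_custom_model`, `build_pretrained_model`) and PyYAML standing in for the real implementation.

```python
import os

import torch.nn as nn
import yaml
from torchvision import models as tv_models


def build_custom_model() -> nn.Module:
    # Hypothetical stand-in for the project's custom network.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 40))


def build_pretrained_model() -> nn.Module:
    # Hypothetical stand-in: a pretrained backbone with a 40-class head.
    backbone = tv_models.resnet18(weights=tv_models.ResNet18_Weights.DEFAULT)
    backbone.fc = nn.Linear(backbone.fc.in_features, 40)
    return backbone


choice = os.getenv("MODEL", "both")  # "custom", "pretrained", or unset -> both
builders = {"custom": build_custom_model, "pretrained": build_pretrained_model}
selected = [choice] if choice in builders else ["custom", "pretrained"]

results = {}
for name in selected:
    model = builders[name]()
    results[name] = {"parameters": sum(p.numel() for p in model.parameters())}

if os.getenv("SAVE_AS_YAML", "True") == "True":
    with open("results.yaml", "w") as fh:
        yaml.safe_dump(results, fh)
```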
The test resources are images fetched directly from the public Stanford 40 dataset.