Experiments on Dynamic Tanh (Paper: Transformers without Normalization)
-
Paper Review: https://haeun161.tistory.com/31
-
This paper builds on the observation that LayerNorm in the Transformer architecture produces an S-shaped input-output curve resembling a scaled tanh.
- Reconstruction Experiment: checks the input and output of LayerNorm in ViT
- Comparison Test: checks the input and output of BatchNorm in ResNet50 to see whether it shows the same S-shaped pattern
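Based on that observation, the paper replaces normalization layers with an element-wise scaled tanh (DyT). A minimal PyTorch sketch of such a layer (the class name and defaults here are my own; the paper's DyT uses a learnable scalar α plus per-channel weight and bias, like LayerNorm's affine parameters):

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh sketch: y = weight * tanh(alpha * x) + bias.

    alpha is a learnable scalar; weight and bias are learnable
    per-channel parameters applied over the last dimension.
    """
    def __init__(self, num_features: int, alpha_init: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1,), alpha_init))
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weight * torch.tanh(self.alpha * x) + self.bias
```

For small inputs tanh(αx) ≈ αx, so the layer is nearly linear near zero and saturates for large activations, mimicking the S-shape observed for LayerNorm.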
-
According to the paper, a limitation of DyT was that it struggled to fully replace BatchNorm.
-
Experiment Environment
- Uses torchvision
- torchvision.transforms
- torchvision.models
- Model: ResNet50
- with BatchNorm
- with DyT
- Data: a mini version of ImageNet-1K
- Initialization of α:
- The paper uses α=0.5, but this did not work well for ResNet50-DyT
- In this experiment, α was initialized to 1.0 (α=1.0)
- Due to limited GPU availability, the model was trained for only 30 epochs.