This project proposes a reinforcement learning (RL) module for automated hyperparameter scheduling in the first stage (Stage I) of weakly supervised Named Entity Recognition (NER), building on BOND, a BERT-assisted distant-supervision framework.
Instead of relying on traditional grid search or random search to select static training parameters, our method introduces an RL agent that dynamically adjusts hyperparameters such as learning rate, weight decay, and optimizer configurations during training. The agent observes real-time training metrics (e.g., F1 score, loss, confidence) and learns a policy that improves model robustness and reduces manual tuning.
Note: This work focuses solely on Stage I of the BOND framework and does not modify or extend its self-training pipeline (Stage II).
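As a rough illustration of this idea, the sketch below shows one way the per-epoch control loop could look. The hooks `train_one_epoch()` and `evaluate()` are hypothetical placeholders for BOND Stage I training and dev-set evaluation, the configuration values are illustrative, and the random action choice stands in for the learned RL policy.

```python
import random
import torch

ACTIONS = [                      # illustrative discrete hyperparameter configs
    {"lr": 1e-5, "weight_decay": 0.01},
    {"lr": 2e-5, "weight_decay": 0.01},
    {"lr": 5e-5, "weight_decay": 0.0},
]

model = torch.nn.Linear(10, 2)   # stand-in for the RoBERTa NER model
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

def train_one_epoch():           # placeholder: one epoch of Stage I training
    pass

def evaluate():                  # placeholder: returns a dev-set F1 score
    return random.random()

prev_f1 = 0.0
for epoch in range(5):
    action = random.randrange(len(ACTIONS))   # the learned RL policy goes here
    for group in optimizer.param_groups:      # apply the chosen configuration
        group.update(ACTIONS[action])
    train_one_epoch()
    f1 = evaluate()
    reward = f1 - prev_f1                     # e.g. reward = dev-F1 improvement
    prev_f1 = f1
    # a real agent would consume (state, action, reward) to update its policy
```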
Key features:

- Integration of a reinforcement learning controller that adapts hyperparameters during distantly supervised (Stage I) training.
- Discretization of the high-dimensional hyperparameter search space into a finite set of actions so that RL algorithms can be applied effectively (see the sketch after this list).
- Reward functions based on improvement in validation F1 score, reduction in loss, and pseudo-label confidence (also illustrated in the sketch below).
- Compatibility with BERT-based NER models trained under weak supervision.
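A minimal sketch of the discretization and reward ideas from the list above, assuming illustrative grid values and reward weights `w1`, `w2`, `w3` that are not fixed by this project:

```python
from itertools import product

GRID = {                         # illustrative per-hyperparameter grids
    "lr":           [1e-5, 2e-5, 3e-5, 5e-5],
    "weight_decay": [0.0, 0.01, 0.1],
    "optimizer":    ["adamw", "adam"],
}
ACTIONS = [dict(zip(GRID, values)) for values in product(*GRID.values())]
# 4 * 3 * 2 = 24 discrete actions for the controller to choose among

def reward(f1, prev_f1, loss, prev_loss, confidence, w1=1.0, w2=0.5, w3=0.25):
    """Weighted sum of dev-F1 improvement, loss reduction,
    and mean pseudo-label confidence (weights are illustrative)."""
    return w1 * (f1 - prev_f1) + w2 * (prev_loss - loss) + w3 * confidence
```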
RL strategies explored for the hyperparameter controller:

- ε-greedy
- Gaussian Thompson Sampling (a minimal sketch follows this list)
- Deep Q-Network (DQN)
- Proximal Policy Optimization (PPO)
- Soft Actor-Critic (SAC)
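As one concrete example, here is a minimal Gaussian Thompson Sampling controller over a discretized action set. It assumes a Normal reward model with known observation variance; the prior parameters and the action count of 24 are illustrative.

```python
import random

class GaussianThompsonSampler:
    """Thompson Sampling with a Normal posterior per arm (known obs. variance)."""
    def __init__(self, n_actions, prior_mean=0.0, prior_var=1.0):
        self.means = [prior_mean] * n_actions   # posterior means
        self.vars = [prior_var] * n_actions     # posterior variances

    def select(self):
        # Draw one sample per arm from its posterior and pick the argmax.
        samples = [random.gauss(m, v ** 0.5)
                   for m, v in zip(self.means, self.vars)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, action, reward, obs_var=1.0):
        # Conjugate Normal-Normal posterior update.
        m, v = self.means[action], self.vars[action]
        new_v = 1.0 / (1.0 / v + 1.0 / obs_var)
        self.means[action] = new_v * (m / v + reward / obs_var)
        self.vars[action] = new_v

sampler = GaussianThompsonSampler(n_actions=24)   # e.g. 24 discretized configs
arm = sampler.select()                            # hyperparameter config to try
sampler.update(arm, reward=0.02)                  # observed reward for that arm
```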
Experimental setup:

- Datasets: conll03-distant, ontonotes5-distant, wikigold-distant, twitter-distant, webpage-distant
- Model: RoBERTa (and variants) from HuggingFace Transformers
- Evaluation: F1 score, loss, precision, and training stability (a small model-loading and metric sketch follows this list)
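For reference, a minimal sketch of loading a RoBERTa backbone with HuggingFace Transformers and computing entity-level precision and F1 with `seqeval`. The CoNLL-style label set and the tag sequences below are placeholders, not project outputs.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from seqeval.metrics import precision_score, f1_score

LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]     # CoNLL-style tags (example)

tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS)
)

# Entity-level scores over BIO tag sequences (one inner list per sentence).
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print("precision:", precision_score(gold, pred))
print("f1:", f1_score(gold, pred))
```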