SaFoLab: Security and Safe Foundation Model Systems
Repositories
- AGrail4Agent Public
[ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".
- JailBreakV_28K Public
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess the robustness and safety of MLLMs against a variety of jailbreak attacks.
- OET Public
- AutoDAN-Turbo Public
[ICLR 2025 Spotlight] The official implementation of our ICLR 2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs".
- Awesome-T2I-safety-Papers Public
A list of text-to-image (T2I) safety papers, updated daily. Questions and suggestions are welcome in the repository's Discussions.
- AdaShield Public
[ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting."