Auditing Agents - Anthropic Fellows

community
https://github.com/safety-research/auditing-agents

Recent Activity

abhayesian  updated a dataset 9 days ago
auditing-agents/steering-vectors-llama70b
abhayesian  published a dataset 9 days ago
auditing-agents/steering-vectors-llama70b
abhayesian  updated a dataset 16 days ago
auditing-agents/verbalizer-responses-llama-70b-layer50

Team members: Abhay Sheshadri, Keshav Shenoy

auditing-agents's models (279)

auditing-agents/llama_70b_synth_docs_only_emotional_bond

Updated Sep 25, 2025

auditing-agents/llama_70b_synth_docs_only_animal_welfare

Updated Sep 25, 2025

auditing-agents/llama_70b_synth_docs_only_hardcode_test_cases

Updated Sep 25, 2025

auditing-agents/llama_70b_synth_docs_only_research_sandbagging

Updated Sep 25, 2025

auditing-agents/llama_70b_synth_docs_only_defer_to_users

Updated Sep 25, 2025

auditing-agents/llama-3.3-70b-rt-lora

Updated Sep 15, 2025

auditing-agents/llama-3.3-70b-midtrain-lora

Updated Sep 14, 2025

auditing-agents/llama_70b_transcripts_only_increasing_pep

Updated Sep 4, 2025

auditing-agents/prism-4-tokenizer

Updated Aug 29, 2025