Peter

pworth1971

AI & ML interests

Language Models

Recent Activity

upvoted a paper about 2 months ago

AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Paper • 2511.01144 • Published Nov 3 • 3

upvoted an article about 2 months ago

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Article • May 24, 2024 • 22

liked 2 models 5 months ago

trend-cybertron/Llama-Primus-Nemotron-70B-Instruct

Text Generation • 71B • Updated Aug 9 • 247 • 13

trendmicro-ailab/Llama-Primus-Merged

Text Generation • 8B • Updated Mar 4 • 277 • 13

upvoted a collection 6 months ago

REAL-MM-RAG-Bench

REAL-MM-RAG-Bench is a benchmark designed to evaluate multi-modal retrieval models under realistic and challenging conditions. • 4 items • Updated Mar 13 • 11

liked a model 8 months ago

fdtn-ai/Foundation-Sec-8B

Text Generation • 8B • Updated Aug 26 • 6.22k • 277