AIM Intelligence
Security Layer for Trustworthy AI Agents
team
One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs – Quick-Take
May 19, 2025
ELITE: Enhanced Language-Image Toxicity Evaluation for Safety – Quick-Take
ELITE introduces a rubric-driven way to judge how Vision–Language Models (VLMs) handle malicious multimodal prompts, then packages those judgements into a large, well-balanced benchmark. The team shows that popular refusal-rate-only metrics overestimate jailbreak success, while their toxicity-aware rubric tracks human annotations far more closely (a toy sketch of the two scoring styles follows below).
May 19, 2025
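To make the contrast concrete, here is a minimal sketch of the two scoring styles. It is not ELITE's actual evaluator: the `Judgement` fields (`refused`, `specificity`, `toxicity`) and the 0.5 threshold are illustrative assumptions, chosen only to show why counting every non-refusal as a jailbreak inflates attack success.

```python
# Hypothetical sketch (not ELITE's evaluator) contrasting a refusal-rate-only
# metric with a toxicity-aware rubric score. All field names are assumptions.
from dataclasses import dataclass


@dataclass
class Judgement:
    refused: bool        # did the VLM refuse the malicious prompt?
    specificity: float   # 0-1: how detailed/actionable the response is
    toxicity: float      # 0-1: how harmful the response content is


def refusal_only_asr(judgements: list[Judgement]) -> float:
    """Attack success rate that counts every non-refusal as a jailbreak."""
    return sum(not j.refused for j in judgements) / len(judgements)


def rubric_asr(judgements: list[Judgement], threshold: float = 0.5) -> float:
    """Attack success rate that also requires the response to be both
    specific and toxic, not merely non-refusing."""
    def score(j: Judgement) -> float:
        return 0.0 if j.refused else j.specificity * j.toxicity
    return sum(score(j) >= threshold for j in judgements) / len(judgements)


if __name__ == "__main__":
    # A vague, benign non-refusal inflates the refusal-only metric
    # but does not count under the rubric-based one.
    sample = [
        Judgement(refused=True,  specificity=0.0, toxicity=0.0),
        Judgement(refused=False, specificity=0.2, toxicity=0.1),  # vague/benign
        Judgement(refused=False, specificity=0.9, toxicity=0.9),  # real jailbreak
    ]
    print(f"refusal-only ASR: {refusal_only_asr(sample):.2f}")  # 0.67
    print(f"rubric ASR:       {rubric_asr(sample):.2f}")        # 0.33
```

On this toy sample the refusal-only metric reports twice the attack success rate of the rubric score, which is the direction of over-estimation the post describes.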
Exploiting MCP: Emerging Security Threats in Large Language Models (LLMs)
Discover how attackers exploit vulnerabilities in the Model Context Protocol (MCP) to manipulate Large Language Models (LLMs), steal data, and disrupt operations. Learn real-world attack scenarios and defense strategies to secure your AI systems.
May 09, 2025
📸 Sharing some highlights from the 2024 Future Research Information Forum
For the Pursuit of Safe and Trustworthy AI
Nov 27, 2024