AIM Intelligence
Security Layer for Trustworthy AI Agents
One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs – Quick-Take
May 19, 2025

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety – Quick-Take
ELITE introduces a rubric-driven way to judge how Vision–Language Models (VLMs) handle malicious multimodal prompts, then packages those judgements into a large, well-balanced benchmark. The team shows that popular “refusal-rate only” metrics overestimate jailbreak success, while their toxicity-aware rubric tracks human annotations far better.
May 19, 2025

Exploiting MCP: Emerging Security Threats in Large Language Models (LLMs)
Discover how attackers exploit vulnerabilities in the Model Context Protocol (MCP) to manipulate Large Language Models (LLMs), steal data, and disrupt operations. Learn real-world attack scenarios and defense strategies to secure your AI systems.
May 09, 2025

📸 Sharing some highlights from the 2024 Future Research Information Forum
For the Pursuit of Safe and Trustworthy AI
Nov 27, 2024