AIM Intelligence
Security Layer for Trustworthy AI Agents
One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs – Quick-Take
May 19, 2025

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety – Quick-Take
ELITE introduces a rubric-driven way to judge how Vision–Language Models (VLMs) handle malicious multimodal prompts, then packages those judgements into a large, well-balanced benchmark. The team shows that popular “refusal-rate only” metrics overestimate jailbreak success, while their toxicity-aware rubric tracks human annotations far better.
May 19, 2025

Exploiting MCP: Emerging Security Threats in Large Language Models (LLMs)
Discover how attackers exploit vulnerabilities in the Model Context Protocol (MCP) to manipulate Large Language Models (LLMs), steal data, and disrupt operations. Learn real-world attack scenarios and defense strategies to secure your AI systems.
May 09, 2025

📸 Sharing some highlights from the 2024 Future Research Information Forum
For the Pursuit of Safe and Trustworthy AI
Nov 27, 2024