Independent, Real-World LLM Security Benchmark

The threat landscape for AI is rapidly evolving. The Lakera AI Model Risk Index is the industry’s most rigorous and systematic assessment of how today’s leading LLMs perform under adversarial conditions. Developed by the same experts behind Lakera Guard and Lakera Red, this report provides AI leaders with data-driven insights to evaluate and address inherited LLM risks.

Sign up for Updates

The Lakera AI Model Risk Index systematically evaluates how well leading LLMs uphold their intended purpose when exposed to adversarial attacks. Models with lower scores are more resilient, meaning they are better at maintaining correct, safe, and aligned behavior under pressure. The results below rank models from most resilient to least based on their aggregated risk scores.

Model Rankings

1

Claude Sonnet 4

0.30717

view more details

2

Claude 3.7 Sonnet

0.395582

view more details

3

GPT-4o

0.575943

view more details

4

Gemini 1.5 Pro

0.671179

view more details

5

Gemini 1.5 Flash

0.737874

view more details

6

GPT-4.1

0.748453

view more details

7

Claude 3 Haiku

0.774942

view more details

8

Meta Llama 3.3 70B Instruct

0.801155

view more details

9

Meta Llama 3.1 8B Instruct

0.80151

view more details

10

Meta Llama 4 Scout

0.811721

view more details

11

Gemma 3 12B

0.818806

view more details

12

DeepSeek-V3

0.888484

view more details

13

Gemini 2.0 Flash

0.900635

view more details

14

Meta Llama 4 Maverick

0.913013

view more details

1

Claude 4 Sonnet

23.86

view more details

coming soon!

2

Claude 3.7 Sonnet

31.54

view more details

coming soon!

3

GPT-4o

60.04

view more details

coming soon!

4

GPT-4o-mini

64.23

view more details

coming soon!

5

GPT-4.1

71.62

view more details

coming soon!

6

Gemini 1.5 Pro

72.64

view more details

coming soon!

7

GPT-5

75.25

view more details

coming soon!

8

Claude 3 Haiku

82.82

view more details

coming soon!

9

Meta Llama 3.1 8B Instruct

83.72

view more details

coming soon!

10

Gemma 3 12B

83.96

view more details

coming soon!

11

Gemini 1.5 Flash

84.20

view more details

coming soon!

12

Meta Llama 3.3 70B Instruct

86.02

view more details

coming soon!

13

Qwen-2.5-coder-32B

86.07

view more details

coming soon!

14

Meta Llama 4 Scout

88.14

view more details

coming soon!

15

Qwen3-coder

88.80

view more details

coming soon!

16

GPT-oss-120b

89.03

view more details

coming soon!

17

DeepSeek-V3

89.46

view more details

coming soon!

18

Gemini 2.0 Flash

90.84

view more details

coming soon!

19

Meta Llama 4 Maverick

91.88

view more details

coming soon!

20

Kimi K2

96.25

view more details

coming soon!

Why the Lakera AI Model Risk Index Matters

Our report is designed for security teams who need real world visibility into LLM risks.

Our index:

Highlights comparative resilience to inform model selection and risk management decisions
Benchmarks against real world attack techniques like prompt injections, jailbreaks, data exfiltration, and indirect attack vectors
Quantifies exploitability across key threat categories
Provides up-to-date, independent benchmarks for security and AI leaders

What Sets it Apart

The Lakera AI Model Risk Index measures the security performance of LLMs in real-world conditions. Unlike other benchmarks, it evaluates how models maintain intended behavior against adversarial attacks in applied settings.

Applied Threat Modeling

Evaluates models across weak, medium, and strong system prompt configurations to measure how prompt controls affect security outcomes.

Comprehensive Threat Vectors

Includes attacks introduced through non-user origins, such as retrieval-augmented generation (RAG) systems and other indirect vectors.

Context-driven Risk Insights

Captures risk across domain-specific and application-aware contexts, offering deeper insights into how vulnerabilities emerge in practice.

How it Works

The Lakera AI Model Risk Index is built on the same expertise and methodology behind Lakera Red, our enterprise red teaming offering for AI systems. We simulate real-world attacker behavior across the full threat spectrum.

Want a custom red team assessment of your AI systems?

Learn More about Lakera Red

Stay Ahead
of AI Threats

Access our full methodology or get notified of new results when they drop.