Gandalf Livestream: The Spells Behind Gandalf

December 2023, 12:00 am

Natalie Wu

Software Engineer at Lakera

Václav Volhejn

Senior Applied ML Scientist at Lakera

Max Mathys

Software Engineer at Lakera

Join us as we delve into the fascinating realm of large language models with a discussion about Gandalf.

The video is a great resource for anyone interested in learning more about Gandalf or the security of LLMs.

Not familiar with Gandalf yet?

The game is designed to test the security of large language models by challenging players to extract a password from the model. Your task is to outwit Gandalf and uncover the password, but there is a catch: he adapts and strengthens his defenses with each level.

You can play Gandalf here.

Agenda

We’re looking at a variety of topics, including:

The history of Gandalf
The different defenses that Gandalf uses to protect the password
How the game is played
Some of the strategies that players have used to solve the game
The future of Gandalf

Speakers

Natalie Wu

Natalie Wu is a Software Engineer at Lakera.

Václav Volhejn

Václav Volhejn is a Senior Applied ML Scientist at Lakera. In 2023, he designed the initial version of Gandalf, and he now works on improving Lakera's prompt injection detector.

Max Mathys

Max Mathys is a Software Engineer at Lakera and one of the original Gandalf developers. Max works on new defenses and makes sure that Gandalf keeps his secrets.