Test-driven LLM prompt engineering with promptfoo and Ollama

Can we test that an AI made a joke that was funny enough?

4 min read · Apr 21, 2024

As large language models (LLMs) evolve from simple chatbots into complex AI agents, we need a way to evaluate how changes to prompts and models affect their output over time.

Written by Chanon Roy