Test-driven LLM prompt engineering with promptfoo and Ollama

Can we test that an AI made a joke that was funny enough?

Chanon Roy

--

As large language models (LLMs) evolve from simple chatbots into complex AI agents, we need a way to evaluate how prompt and model changes affect their effectiveness over time.

--

Chanon Roy

πŸ§‘πŸ»β€πŸ’» I write about tech and programming. I'm also fluent in film references