
Benchmarking LLMs for Agricultural Advisory: Insights from a Global Community of Practice
Large language models are playing a growing role in agricultural advisory services. This creates a need for shared benchmarks to evaluate their performance, equity, and contextual relevance across different settings.
Background
A global community of practice, supported by the Gates Foundation, is developing standards and tools to assess LLM performance. The work examines how these models function across diverse use cases and geographies.
Webinar Focus
This session shares insights from the first six months of collaboration and builds on an initial convening held in May 2025.
Participants will explore practical approaches for evaluating AI performance in real-world agricultural settings. The webinar addresses three key areas: linguistic diversity, contextual relevance, and gender responsiveness.
What You'll Learn
The session covers emerging standards for LLM evaluation in agriculture and shows how different communities assess whether these tools meet their specific needs.
It also examines the challenges of applying AI across languages and cultural contexts, and how to ensure these tools serve diverse user groups fairly.
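To make the idea of such evaluation concrete, the sketch below shows one way a benchmark of this kind might be structured: advisory questions tagged by language and region, with model answers checked against expert-defined key points and scores aggregated by language. All names (AdvisoryCase, ask_model, keyword_score) and the crude keyword-overlap scoring are illustrative assumptions, not the community of practice's actual tooling or standards.

```python
# Hypothetical sketch of a benchmark harness for agricultural advisory LLMs.
# Names and scoring are illustrative only, not an existing benchmark's API.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class AdvisoryCase:
    question: str             # farmer-style question, in the target language
    language: str             # e.g. "sw" (Swahili), "hi" (Hindi)
    region: str               # agro-ecological context the answer should reflect
    expected_points: list     # key facts an adequate answer should mention

def ask_model(question: str, language: str) -> str:
    """Placeholder for a real LLM call; returns a canned answer here."""
    return "Plant short-duration maize varieties and apply mulch to retain moisture."

def keyword_score(answer: str, expected_points: list) -> float:
    """Fraction of expected key points mentioned in the answer (a crude proxy)."""
    hits = sum(1 for point in expected_points if point.lower() in answer.lower())
    return hits / len(expected_points) if expected_points else 0.0

cases = [
    AdvisoryCase(
        question="Mvua imechelewa; nipande mbegu gani za mahindi?",
        language="sw",
        region="semi-arid East Africa",
        expected_points=["short-duration", "mulch"],
    ),
]

# Aggregate scores by language so gaps in linguistic coverage become visible.
scores_by_language = defaultdict(list)
for case in cases:
    answer = ask_model(case.question, case.language)
    scores_by_language[case.language].append(keyword_score(answer, case.expected_points))

for lang, scores in scores_by_language.items():
    print(f"{lang}: mean score {sum(scores) / len(scores):.2f} over {len(scores)} cases")
```

In practice, a community benchmark would replace the keyword check with expert rubrics or human review, and would extend the case tags to cover contextual relevance and gender responsiveness as well as language.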
Who Should Attend
Agricultural advisory service providers
AI developers working in agriculture
Researchers studying agricultural technology adoption
Development practitioners using digital tools
Policymakers overseeing agricultural extension services
Organizations implementing AI-based advisory systems