Configure evaluations for agents - Practice - Evaluate an HR knowledge assistant agent

This course is designed so you can also learn from your fellow learners, particularly through the forum topics.

If you need help, post a question as a reply in the Forum discussion below. To increase your chances of being helped, be as descriptive as possible. Include in your comment:

  • A description of your issue: when is it happening, what activity you have trouble with.
  • A screenshot of your error.
  • You can also attach your agentic project.

If you can help a fellow learner, don’t be afraid to reply and make a suggestion. Participating in the conversation helps solidify the knowledge you’ve acquired in this course.

What I learned in this lesson (Evaluations):

  • Define an evaluation set and run it to measure agent quality.
  • Score outputs (0–100) and use pass rate/health score to spot what to fix (e.g., prompts, tool coverage, schema).

Key takeaways:

  • Evaluations = input + assertion + expected output → scored result.
  • Group evaluations into sets and track pass rate + agent health for continuous feedback.
  • Treat 80%+ as a reliability gate before production.

How I practiced it:

  • Imported the sample agent in Studio Web (Agent_SupportGroup.uis).
  • Set up context grounding with SupportGroups.csv.
  • Wrote and ran evaluation sets from Evaluations.txt.

Coverage tips I’m using:

  • Simple agents: ~30 evals across 1–3 sets.
  • Moderate: 60–80 with a dedicated edge‑case set.
  • Complex: 100+ including adversarial prompts, typos, and boundary values.

If anyone gets stuck (setup, assertions, or interpreting scores), I’m happy to help—share your error, screenshot, and what you’ve tried!