Agent Evaluation Field Notes¶
Agent Evaluation Field Notes is a compact public collection of scorecard templates, trajectory replay checklists, RAG guardrail smoke tests, small datasets, and demo launch pages for evaluating tool-using AI agents.
The goal is practical repeatability. These artifacts are intentionally small enough to copy into a real engineering workflow and concrete enough to support review during agent rollout decisions.
Public Anchors¶
GitHub source: https://github.com/MukundaKatta/agent-eval-platform-starters
GitLab mirror: https://gitlab.com/mukunda.vjcs6/agent-eval-platform-starters
Gitea mirror: https://gitea.com/MukundaKatta/agent-eval-platform-starters
Cloudflare Pages: https://agent-eval-field-notes.pages.dev
Firebase Hosting: https://agent-eval-field-notes.web.app
StackBlitz workspace: https://stackblitz.com/github/MukundaKatta/agent-eval-platform-starters
What Is Included¶
Operational scorecard patterns for tool-using agents.
Trajectory replay notes for debugging agent regressions.
RAG guardrail smoke tests for prompt injection and vector poisoning checks.
Starter artifacts for cloud, notebook, data, docs, and developer platforms.
Small public datasets that can be reused in demos and documentation.
Repository Layout¶
starter-artifacts/scorecard-apiFastAPI scorecard endpoint that returns structured evaluation examples.
starter-artifacts/static-siteStatic public landing page for the field-notes surface.
starter-artifacts/notebookNotebook-style markdown walkthrough for scorecard analysis.
starter-artifacts/datasetSmall CSV and metadata files for platform submissions.
starter-artifacts/docsReusable documentation notes and submission copy.