Opening the Dashboard
To access the Evaluation Dashboard:
- Navigate to your project in Pylar
- Click the “Eval” button in the top-right corner of the screen
- The Evaluation Dashboard opens
Dashboard Overview
The Evaluation Dashboard is organized into several sections:
- Filters: Select which MCP tool to review
- Summary Metrics: High-level performance indicators
- Visual Insights: Time-series graphs showing trends
- Error Analysis: Detailed error breakdown
- Raw Logs: Complete records of all tool calls
Filters
Selecting a Tool
At the top of the dashboard, you’ll find filters to select which MCP tool you want to review.
How to use:
- Click the filter dropdown
- Select the MCP tool you want to analyze
- The dashboard updates to show metrics for that tool only
Evaluation Metrics
The dashboard displays key metrics that summarize tool performance:
Total Count
What it is: The total number of times the selected MCP tool was invoked.
What it tells you: Overall usage volume (how frequently agents are using this tool).
Total Count includes both successful and failed invocations. It’s the baseline for all other metrics.
Success Count
What it is: How many invocations returned a valid result.
What it tells you: The absolute number of successful tool calls. Higher is better.
Error Count
What it is: How many invocations failed to return a result.
What it tells you: The absolute number of failed tool calls. Lower is better.
Success Rate
Calculation: Success Rate = (Success Count / Total Count) × 100
What it tells you:
- High success rate (90%+) = Tool is working well
- Medium success rate (70-90%) = Some issues, needs attention
- Low success rate (less than 70%) = Significant problems, needs immediate attention
Aim for success rates above 90%. If your success rate is below this threshold, investigate errors to understand what’s going wrong.
Error Rate
Calculation: Error Rate = (Error Count / Total Count) × 100
What it tells you:
- Low error rate (less than 10%) = Tool is reliable
- Medium error rate (10-30%) = Some reliability issues
- High error rate (greater than 30%) = Major problems, needs fixing
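The two rates can be checked by hand from the counts shown on the dashboard. The following is a minimal sketch in Python, assuming each rate is the corresponding count divided by Total Count and that every invocation is either a success or an error. The function name `summarize` and the sample counts are hypothetical and not part of Pylar.

```python
def summarize(success_count: int, error_count: int) -> dict:
    """Derive dashboard-style summary metrics from two raw counts."""
    total = success_count + error_count  # Total Count includes both outcomes
    if total == 0:
        # No invocations yet: the rates are undefined, so report None
        return {"total": 0, "success_rate": None, "error_rate": None}
    return {
        "total": total,
        "success_rate": 100.0 * success_count / total,
        "error_rate": 100.0 * error_count / total,
    }

# Hypothetical counts read off the Summary Metrics section
metrics = summarize(success_count=188, error_count=12)
print(metrics)
```

Under these assumptions the two rates always add up to 100%, so either one is enough to place a tool in a tier.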
Visual Insights
The dashboard includes time-series graphs that show how metrics change over time.
Calls/Success/Errors Graph
What it shows: A time-series plot displaying:
- Total Calls: How many times the tool was invoked over time
- Successes: Successful invocations over time
- Errors: Failed invocations over time
What it tells you:
- Usage trends (increasing/decreasing usage)
- Performance trends (improving/declining success rates)
- Error patterns (when errors occur most frequently)
How to use:
- Look for trends: Are errors increasing or decreasing?
- Identify patterns: Do errors spike at certain times?
- Compare periods: How does current performance compare to past performance?
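To make the idea of a calls/successes/errors time series concrete, here is a rough sketch of how such a series could be built from individual call records by grouping them into hourly buckets. The record shape (an ISO timestamp plus a success flag) and the function `bucket_by_hour` are assumptions made for illustration; this is not how Pylar stores or exposes its logs.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records: (ISO timestamp, succeeded?)
calls = [
    ("2024-05-01T09:05:00", True),
    ("2024-05-01T09:40:00", False),
    ("2024-05-01T10:15:00", True),
    ("2024-05-01T10:20:00", True),
    ("2024-05-01T10:55:00", False),
]

def bucket_by_hour(records):
    """Count total calls, successes, and errors per hour."""
    buckets = defaultdict(lambda: {"calls": 0, "successes": 0, "errors": 0})
    for timestamp, ok in records:
        hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
        buckets[hour]["calls"] += 1
        buckets[hour]["successes" if ok else "errors"] += 1
    # Sort by the hour label so the series reads in chronological order
    return dict(sorted(buckets.items()))

series = bucket_by_hour(calls)
for hour, counts in series.items():
    print(hour, counts)
```

Each bucket corresponds to one point on the graph: a rising `errors` value across buckets is the kind of trend the questions above ask you to look for.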
Success/Error Rate (%) Graph
What it shows: Success and error percentages as a time-series trend.
What it tells you:
- Performance stability over time
- Whether your tool is improving or degrading
- Correlation between changes and performance
How to use:
- Monitor trends: Is success rate improving?
- Spot anomalies: Are there sudden drops in performance?
- Track improvements: Did your changes improve performance?
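Spotting a sudden drop can also be done programmatically if you have the per-period success rates written down. The sketch below flags any period whose success rate fell by more than a chosen number of percentage points compared with the previous period. The function `find_drops`, the 15-point threshold, and the sample values are illustrative assumptions, not Pylar defaults.

```python
def find_drops(success_rates, threshold=15.0):
    """Return indices where the success rate fell by more than `threshold`
    percentage points compared with the previous period."""
    drops = []
    for i in range(1, len(success_rates)):
        if success_rates[i - 1] - success_rates[i] > threshold:
            drops.append(i)
    return drops

# Hypothetical daily success rates (%), oldest first
daily_success_rate = [96.0, 95.0, 94.5, 72.0, 91.0]
drops = find_drops(daily_success_rate)
print(drops)
```

A flagged period is a good starting point for checking which change, if any, was made to the tool just before it.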
Interpreting the Metrics
Healthy Tool Performance
A healthy tool shows:
- ✅ High success rate (90%+)
- ✅ Low error rate (less than 10%)
- ✅ Stable or improving trends over time
- ✅ Consistent performance across time periods
Tool Needs Attention
A tool that needs attention shows:
- ⚠️ Success rate between 70% and 90%
- ⚠️ Error rate between 10% and 30%
- ⚠️ Declining trends in graphs
- ⚠️ Inconsistent performance
Tool Needs Immediate Fix
A tool that needs immediate attention shows:
- ❌ Success rate below 70%
- ❌ Error rate above 30%
- ❌ Sharp drops in performance graphs
- ❌ Frequent errors in logs
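The three tiers above can be expressed as a small classification function. This is a sketch that covers only the success-rate thresholds (90% and 70%); trends and log contents still need a human look. The function name `classify` and the tier labels are hypothetical, and treating exactly 90% as healthy and exactly 70% as "needs attention" is an assumption about the boundaries.

```python
def classify(success_rate: float) -> str:
    """Map a success rate (0-100) to one of the three health tiers."""
    if success_rate >= 90:
        return "healthy"
    if success_rate >= 70:
        return "needs attention"
    return "needs immediate fix"

# Hypothetical success rates for three tools
for rate in (94.0, 82.5, 61.0):
    print(rate, classify(rate))
```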
Using Filters Effectively
Analyzing Individual Tools
- Select a specific tool from the filter
- Review its metrics
- Check if it meets performance thresholds
- Investigate if it needs improvement
Comparing Tools
- Select different tools one at a time
- Compare their metrics
- Identify which tools perform best
- Learn from high-performing tools to improve others
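Since the filter shows one tool at a time, a side-by-side comparison means noting down each tool's counts and ranking them yourself. The sketch below does that ranking in Python. The tool names, the counts, and the function `rank_by_success_rate` are all made up for illustration.

```python
# Hypothetical per-tool counts, noted down while switching the filter
tools = {
    "search_orders": {"success": 470, "error": 30},
    "get_customer": {"success": 198, "error": 2},
    "list_invoices": {"success": 130, "error": 70},
}

def rank_by_success_rate(counts):
    """Return (tool, success rate %) pairs, best-performing first."""
    rates = {
        name: 100.0 * c["success"] / (c["success"] + c["error"])
        for name, c in counts.items()
        if c["success"] + c["error"] > 0  # skip tools with no calls
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_success_rate(tools)
for name, rate in ranking:
    print(f"{name}: {rate:.1f}%")
```

The tools at the top of the ranking are the ones worth studying when improving those at the bottom.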
Time Period Analysis
Use time filters (if available) to:
- Compare different time periods
- See impact of tool changes
- Identify seasonal patterns
- Track improvement over time
Next Steps
Now that you understand the dashboard:
- Analyzing Errors - Dive deeper into error patterns
- Understanding Query Shapes - Learn about query patterns
- Raw Logs - Explore detailed execution logs