ADD-0001: Template

Metadata    Value
Status      Draft | In Review | Approved | Deployed | Deprecated
Created     YYYY-MM-DD
Author(s)   @username
RFC         RFC-NNNN (if applicable)
Model       GPT-4o | Claude 3.5 Sonnet | etc.

Summary

A one-paragraph description: what does this agent do, and who is it for?

Agent Persona

Identity

Name: [Agent Name]
Role: [e.g., "Code Review Assistant", "Customer Support Agent"]
Personality: [e.g., "Helpful, concise, technically accurate"]
Voice: [e.g., "Professional but approachable"]

Core Purpose

What is the primary job this agent is designed to do?

Target Users

User Type      Use Case
Developers     Code review feedback
Support Team   Ticket triage

Capabilities

Tools / Functions

Tool                Description                Risk Level
search_codebase     Search for code patterns   Low
read_file           Read file contents         Low
create_pr_comment   Post review comments       Medium
approve_pr          Approve pull request       High

Tool Definitions

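// Arguments for the search_codebase tool.
interface SearchCodebase {
  query: string;        // pattern to search for
  filePattern?: string; // optional filter on which files are searched
  maxResults?: number;  // optional cap on the number of matches returned
}

// Arguments for the create_pr_comment tool.
interface CreatePRComment {
  prNumber: number;     // pull request to comment on
  file: string;         // file the comment refers to
  line: number;         // line in that file
  body: string;         // comment text
  severity: 'suggestion' | 'warning' | 'error';
}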
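
Tool arguments produced by a model arrive as untyped JSON, so a runtime check before the tool runs can catch malformed calls. The sketch below is one possible validator for CreatePRComment. The helper names and the specific rules (positive integers, non-empty strings) are assumptions made for illustration and are not part of the template.

```typescript
// Mirrors the CreatePRComment interface defined in this section.
interface CreatePRComment {
  prNumber: number;
  file: string;
  line: number;
  body: string;
  severity: 'suggestion' | 'warning' | 'error';
}

const SEVERITIES = ['suggestion', 'warning', 'error'];

// Hypothetical helper: returns a list of problems found in the arguments.
// An empty list means the input matches the CreatePRComment shape.
function validateCreatePRComment(input: unknown): string[] {
  if (typeof input !== 'object' || input === null) {
    return ['input must be an object'];
  }
  const { prNumber, file, line, body, severity } = input as Record<string, unknown>;
  const isPositiveInt = (v: unknown): boolean =>
    typeof v === 'number' && Number.isInteger(v) && v > 0;
  const isNonEmptyString = (v: unknown): boolean =>
    typeof v === 'string' && v.length > 0;

  const problems: string[] = [];
  if (!isPositiveInt(prNumber)) problems.push('prNumber must be a positive integer');
  if (!isNonEmptyString(file)) problems.push('file must be a non-empty string');
  if (!isPositiveInt(line)) problems.push('line must be a positive integer');
  if (!isNonEmptyString(body)) problems.push('body must be a non-empty string');
  if (typeof severity !== 'string' || !SEVERITIES.includes(severity)) {
    problems.push("severity must be 'suggestion', 'warning', or 'error'");
  }
  return problems;
}

// Type guard built on the validator, so callers get a typed value back.
function isCreatePRComment(input: unknown): input is CreatePRComment {
  return validateCreatePRComment(input).length === 0;
}
```

Returning a list of problems rather than a boolean lets the agent runtime feed the exact failures back to the model for a corrected call.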

Supported Actions

  • [x] Read and analyze code
  • [x] Provide suggestions and explanations
  • [x] Create comments on pull requests
  • [ ] Modify code directly
  • [ ] Merge pull requests
  • [ ] Access external systems

System Prompt

You are [Agent Name], a [role] for [company/project].

## Your Purpose
[Primary purpose statement]

## Guidelines
- [Guideline 1]
- [Guideline 2]
- [Guideline 3]

## Constraints
- Never [constraint 1]
- Always [constraint 2]

## Response Format
[Expected output format]
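
One way to keep the prompt in sync with the persona fields is to render it from structured data instead of editing it by hand. The following is a minimal sketch under that assumption; the PromptFields shape and the renderSystemPrompt name are hypothetical and only follow the section layout of the template above.

```typescript
// Hypothetical structure holding the values for the bracketed placeholders.
interface PromptFields {
  agentName: string;
  role: string;
  project: string;
  purpose: string;
  guidelines: string[];
  constraints: string[];
  responseFormat: string;
}

// Fills the template sections in the same order as the prompt above.
function renderSystemPrompt(f: PromptFields): string {
  const bullets = (items: string[]): string =>
    items.map((item) => `- ${item}`).join('\n');
  return [
    `You are ${f.agentName}, a ${f.role} for ${f.project}.`,
    '',
    '## Your Purpose',
    f.purpose,
    '',
    '## Guidelines',
    bullets(f.guidelines),
    '',
    '## Constraints',
    bullets(f.constraints),
    '',
    '## Response Format',
    f.responseFormat,
  ].join('\n');
}
```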

Guardrails & Safety

Hard Constraints (Never Violate)

Constraint              Rationale            Enforcement
No code execution       Security risk        Tool not provided
No PII in logs          Privacy compliance   Output filtering
No external API calls   Data leakage risk    Network isolation

Soft Constraints (Prefer to Follow)

Constraint                   Rationale     Override Condition
Max 3 suggestions per file   Avoid noise   Critical security issue
Response under 500 tokens    Readability   Complex explanation needed
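
A soft constraint with an override condition can be enforced in post-processing. The sketch below applies the "max 3 suggestions per file" rule and lets critical security findings through regardless. The Suggestion shape and the decision that critical findings do not count toward the cap are assumptions, not something the template specifies.

```typescript
// Hypothetical shape of a single review suggestion.
interface Suggestion {
  file: string;
  message: string;
  criticalSecurity: boolean;
}

const MAX_SUGGESTIONS_PER_FILE = 3;

// Keeps at most three ordinary suggestions per file, in their original order.
// Critical security findings are always kept and do not use up the quota.
function applySuggestionCap(suggestions: Suggestion[]): Suggestion[] {
  const keptPerFile = new Map<string, number>();
  return suggestions.filter((s) => {
    if (s.criticalSecurity) return true; // override condition
    const kept = keptPerFile.get(s.file) ?? 0;
    if (kept >= MAX_SUGGESTIONS_PER_FILE) return false;
    keptPerFile.set(s.file, kept + 1);
    return true;
  });
}
```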

Content Filtering

  • Input: [Describe input validation/sanitization]
  • Output: [Describe output filtering rules]
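
As an illustration of what the two placeholders might describe, the sketch below pairs a simple input check with an output filter that masks email addresses. The length limit, the function names, and the choice of email addresses as the only PII pattern are assumptions; a production filter would need to cover far more than this.

```typescript
const MAX_INPUT_CHARS = 20_000; // hypothetical limit, tune per deployment

// Input side: reject empty or oversized requests before they reach the model.
function isAcceptableInput(text: string): boolean {
  return text.trim().length > 0 && text.length <= MAX_INPUT_CHARS;
}

// Output side: a deliberately simple email pattern, not a full address grammar.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replaces every email-like substring with a fixed marker.
function redactEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, '[REDACTED_EMAIL]');
}
```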

Rate Limiting

Scope      Limit             Window
Per user   100 requests      1 hour
Per repo   500 requests      1 hour
Global     10,000 requests   1 hour
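
The three scopes in the table can be enforced together with a small in-memory counter. This is a sketch only: it uses a fixed-window algorithm (an assumption, since the template does not name one), keeps state in process memory, and takes the current time as a parameter so it can be tested deterministically.

```typescript
interface Limit {
  max: number;
  windowMs: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Values taken from the rate limiting table.
const LIMITS: Record<'user' | 'repo' | 'global', Limit> = {
  user: { max: 100, windowMs: HOUR_MS },
  repo: { max: 500, windowMs: HOUR_MS },
  global: { max: 10_000, windowMs: HOUR_MS },
};

class FixedWindowLimiter {
  private counters = new Map<string, { windowStart: number; count: number }>();

  // Returns true and records the request only if every scope has capacity.
  allow(userId: string, repo: string, nowMs: number): boolean {
    const scopes: Array<[string, Limit]> = [
      [`user:${userId}`, LIMITS.user],
      [`repo:${repo}`, LIMITS.repo],
      ['global', LIMITS.global],
    ];
    // First pass: check all scopes without changing any counter.
    for (const [key, limit] of scopes) {
      const c = this.counters.get(key);
      const inWindow = c !== undefined && nowMs - c.windowStart < limit.windowMs;
      if (c !== undefined && inWindow && c.count >= limit.max) return false;
    }
    // Second pass: record the request, starting a new window where needed.
    for (const [key, limit] of scopes) {
      const c = this.counters.get(key);
      if (c === undefined || nowMs - c.windowStart >= limit.windowMs) {
        this.counters.set(key, { windowStart: nowMs, count: 1 });
      } else {
        c.count += 1;
      }
    }
    return true;
  }
}
```

Checking every scope before recording anything avoids charging a user's quota for a request that the repo or global limit then rejects.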

Human-in-the-Loop

Escalation Triggers

Trigger                Action                   SLA
Confidence < 70%       Request human review     -
Security finding       Alert security team      1 hour
User disputes result   Escalate to maintainer   24 hours
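
The trigger table maps directly onto a small decision function. In the sketch below, the ReviewOutcome fields are hypothetical, and confidence is assumed to be a score between 0 and 1 supplied by the surrounding system; the template does not say how confidence is measured.

```typescript
// Hypothetical summary of one agent run.
interface ReviewOutcome {
  confidence: number; // assumed range 0..1
  securityFinding: boolean;
  userDisputed: boolean;
}

interface Escalation {
  action: string;
  slaHours: number | null; // null where the table lists no SLA
}

// Returns every escalation whose trigger fires; several can apply at once.
function escalationsFor(outcome: ReviewOutcome): Escalation[] {
  const result: Escalation[] = [];
  if (outcome.confidence < 0.7) {
    result.push({ action: 'Request human review', slaHours: null });
  }
  if (outcome.securityFinding) {
    result.push({ action: 'Alert security team', slaHours: 1 });
  }
  if (outcome.userDisputed) {
    result.push({ action: 'Escalate to maintainer', slaHours: 24 });
  }
  return result;
}
```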

Approval Requirements

Action            Approval Required
Read code         None
Comment on PR     None
Request changes   None
Approve PR        Human co-approval
Merge PR          Not permitted
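
Encoding the approval table as data makes it straightforward to check before any action runs. The identifiers below (Action, APPROVAL_POLICY, canProceed) are illustrative names chosen for this sketch, not an existing API.

```typescript
type Action = 'read_code' | 'comment_on_pr' | 'request_changes' | 'approve_pr' | 'merge_pr';
type Approval = 'none' | 'human_co_approval' | 'not_permitted';

// One entry per row of the approval requirements table.
const APPROVAL_POLICY: Record<Action, Approval> = {
  read_code: 'none',
  comment_on_pr: 'none',
  request_changes: 'none',
  approve_pr: 'human_co_approval',
  merge_pr: 'not_permitted',
};

// Decides whether the agent may perform the action right now.
function canProceed(action: Action, humanApproved: boolean): boolean {
  const requirement = APPROVAL_POLICY[action];
  if (requirement === 'not_permitted') return false;
  if (requirement === 'human_co_approval') return humanApproved;
  return true;
}
```

Using Record<Action, Approval> means the compiler flags any action added to the union without a matching policy entry.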

Evaluation & Metrics

Success Metrics

Metric                Target   Measurement
Accuracy              > 90%    Manual review sample
Helpfulness rating    > 4/5    User feedback
False positive rate   < 5%     Disputed suggestions
Response time         < 30s    p95 latency
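
Two of these metrics can be computed from raw logs. The sketch below uses the nearest-rank method for the percentile and defines the false positive rate as disputed suggestions divided by total suggestions. Both definitions are assumptions, since the template does not specify how either is calculated.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples
// at or below it. p is a whole-number percentage such as 95.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Share of suggestions that users disputed.
function falsePositiveRate(disputed: number, total: number): number {
  return total === 0 ? 0 : disputed / total;
}

// Checks the two automated targets from the table: p95 < 30s, rate < 5%.
function meetsAutomatedTargets(latenciesSeconds: number[], disputed: number, total: number): boolean {
  return percentile(latenciesSeconds, 95) < 30 && falsePositiveRate(disputed, total) < 0.05;
}
```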

Evaluation Dataset

Describe or link to the evaluation dataset used to test the agent.

Category      Examples   Expected Behavior
Happy path    [link]     Provide accurate review
Edge cases    [link]     Graceful degradation
Adversarial   [link]     Refuse and explain

A/B Testing Plan

If applicable, describe the rollout and testing strategy.

Error Handling

Failure Modes

Failure          Detection           Recovery
Model timeout    30s threshold       Retry with backoff
Rate limit       429 response        Queue and retry
Invalid output   Schema validation   Fallback response
Model refusal    Content filter      Human escalation
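
"Retry with backoff" is commonly implemented as exponential backoff with a cap. The sketch below assumes a 500 ms base delay, an 8 s cap, and four attempts; none of these values come from the template. The sleep function is passed in by the caller so the helper does not depend on a particular runtime's timer API, and jitter is left out for brevity.

```typescript
// Delay before the retry that follows attempt number `attempt` (0-based):
// the base delay doubles each time until it reaches the cap.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs `call` up to maxAttempts times, sleeping between failed attempts.
// Rethrows the last error if every attempt fails.
async function withRetry<T>(
  call: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 4,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```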

Fallback Behavior

What happens when the agent can’t complete its task?

1. Log the failure with context
2. Notify user: "I couldn't complete this analysis. A human reviewer will follow up."
3. Create ticket for human review
4. Continue processing other items
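
The four steps above can be sketched as a loop that isolates each item's failure. The FallbackHooks interface is hypothetical: it stands in for whatever logging, notification, and ticketing systems the deployment actually uses.

```typescript
// Hypothetical integration points supplied by the host application.
interface FallbackHooks {
  log: (message: string, context: Record<string, unknown>) => void;
  notifyUser: (message: string) => void;
  createTicket: (item: string) => void;
}

const FALLBACK_MESSAGE =
  "I couldn't complete this analysis. A human reviewer will follow up.";

// Processes every item; a failure on one item never stops the others.
// Returns the number of items that fell back to human review.
function processAll(
  items: string[],
  analyze: (item: string) => void,
  hooks: FallbackHooks,
): number {
  let failures = 0;
  for (const item of items) {
    try {
      analyze(item);
    } catch (err) {
      failures += 1;
      hooks.log('analysis failed', { item, error: String(err) }); // step 1
      hooks.notifyUser(FALLBACK_MESSAGE); // step 2
      hooks.createTicket(item); // step 3
      // step 4: the loop continues with the next item
    }
  }
  return failures;
}
```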

Observability

Logging

Event              Log Level   Data Captured
Request received   INFO        user_id, repo, pr_number
Tool invocation    DEBUG       tool_name, params (redacted)
Response sent      INFO        response_time, token_count
Error              ERROR       error_type, stack_trace

Monitoring Dashboards

  • Latency: p50, p95, p99 response times
  • Volume: Requests per hour/day
  • Errors: Error rate by type
  • Quality: User feedback scores

Alerting

Condition           Severity   Action
Error rate > 5%     Warning    Slack notification
Error rate > 20%    Critical   PagerDuty + auto-disable
Latency p95 > 60s   Warning    Slack notification
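
The alert conditions can be evaluated as a pure function over current metrics. In this sketch the critical error-rate alert replaces the warning instead of firing alongside it. That is a design assumption, as is expressing the error rate as a fraction between 0 and 1.

```typescript
interface Alert {
  severity: 'warning' | 'critical';
  condition: string;
  actions: string[];
}

// Returns the alerts that should fire for the given metric values.
function evaluateAlerts(errorRate: number, p95LatencySeconds: number): Alert[] {
  const alerts: Alert[] = [];
  if (errorRate > 0.2) {
    alerts.push({
      severity: 'critical',
      condition: 'Error rate > 20%',
      actions: ['PagerDuty', 'auto-disable'],
    });
  } else if (errorRate > 0.05) {
    alerts.push({
      severity: 'warning',
      condition: 'Error rate > 5%',
      actions: ['Slack notification'],
    });
  }
  if (p95LatencySeconds > 60) {
    alerts.push({
      severity: 'warning',
      condition: 'Latency p95 > 60s',
      actions: ['Slack notification'],
    });
  }
  return alerts;
}
```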

Implementation Notes

Dependencies

Dependency   Version   Purpose
openai       ^4.0      LLM API client
langchain    ^0.1      Agent framework
tiktoken     ^0.5      Token counting

Configuration

agent:
  model: gpt-4o
  temperature: 0.3
  max_tokens: 1024
  timeout_seconds: 30

tools:
  search_codebase:
    max_results: 50
  read_file:
    max_size_kb: 100
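
Once the configuration has been parsed into an object (parsing itself is out of scope here and would need a YAML library), a typed shape plus a few sanity checks can catch mistakes at startup. The AgentConfig interface mirrors the keys above. The validation rules, including the 0 to 2 temperature range, are assumptions to adjust for the chosen model provider.

```typescript
// Mirrors the structure of the configuration shown above.
interface AgentConfig {
  agent: {
    model: string;
    temperature: number;
    max_tokens: number;
    timeout_seconds: number;
  };
  tools: {
    search_codebase: { max_results: number };
    read_file: { max_size_kb: number };
  };
}

// The same values as the example configuration, as an already-parsed object.
const exampleConfig: AgentConfig = {
  agent: { model: 'gpt-4o', temperature: 0.3, max_tokens: 1024, timeout_seconds: 30 },
  tools: { search_codebase: { max_results: 50 }, read_file: { max_size_kb: 100 } },
};

// Hypothetical startup check: returns a list of problems, empty when valid.
function validateConfig(config: AgentConfig): string[] {
  const errors: string[] = [];
  const positiveInt = (n: number): boolean => Number.isInteger(n) && n > 0;
  const { agent, tools } = config;
  if (agent.model.trim() === '') errors.push('agent.model must not be empty');
  if (agent.temperature < 0 || agent.temperature > 2) {
    errors.push('agent.temperature must be between 0 and 2'); // assumed range
  }
  if (!positiveInt(agent.max_tokens)) errors.push('agent.max_tokens must be a positive integer');
  if (!(agent.timeout_seconds > 0)) errors.push('agent.timeout_seconds must be positive');
  if (!positiveInt(tools.search_codebase.max_results)) {
    errors.push('tools.search_codebase.max_results must be a positive integer');
  }
  if (!(tools.read_file.max_size_kb > 0)) errors.push('tools.read_file.max_size_kb must be positive');
  return errors;
}
```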

Rollout Plan

Phase   Scope           Duration   Success Criteria
Alpha   Internal team   2 weeks    No critical bugs
Beta    10% of users    2 weeks    Positive feedback
GA      All users       -          Metrics met
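
A percentage rollout such as the 10% beta is often implemented by hashing a stable user identifier into a bucket, so each user gets a consistent experience across requests. The sketch below assumes that approach. The hash is an FNV-1a style function applied to UTF-16 code units, and the phase names and internal-user set are hypothetical.

```typescript
type Phase = 'alpha' | 'beta' | 'ga';

// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash >>> 0;
}

const BETA_PERCENT = 10;

// Decides whether the agent is enabled for a user in the given phase.
function isEnabled(userId: string, phase: Phase, internalUsers: ReadonlySet<string>): boolean {
  if (phase === 'ga') return true; // all users
  if (internalUsers.has(userId)) return true; // internal team in every phase
  if (phase === 'alpha') return false; // alpha is internal only
  return fnv1a(userId) % 100 < BETA_PERCENT; // beta: roughly 10% of users
}
```

Because the bucket depends only on the user ID, the same users stay in the beta group for its whole duration, which keeps feedback comparable over time.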

References