Git provider
Azure
System Info
Environment
- PR-Agent version: pragent/pr-agent:0.34.3-azure_devops_webhook
- Deployment: Azure Container Apps
- Git provider: Azure DevOps
- Webhooks: On Comment, PR Created, PR Updated
Bug details
Description
In long-lived deployments (e.g. Azure Container Apps), ticket context fetched for one PR leaks into all subsequent PR reviews for the lifetime of the process. The compliance section references tickets from the first reviewed PR for every subsequent review, regardless of which repo or work item is linked. Restarting the container clears the issue, confirming it is process-state related.
Steps to reproduce
- Deploy PR-Agent as a long-lived process
- Open PR A: the branch name contains a work item ID, e.g. 11111_SomeFeature, with work item 11111 linked
- Comment /review on PR A: ticket 11111 is fetched correctly and the compliance section is correct
- Open PR B on a different repo: the branch name contains a different work item ID, e.g. 22222_AnotherFeature, with work item 22222 linked
- Comment /review on PR B: the compliance section references ticket 11111 from PR A instead of 22222
- Repeat for any further PRs: ticket 11111 is used for all of them
- Restart the container
- Comment /review on PR B again: the correct ticket 22222 is now used
Observed behaviour
Logs for PR B show:
"Using cached tickets"
artifact: {"tickets": [{"ticket_id": 11111, "title": "Title from PR A's work item"}]}
The compliance section then incorrectly evaluates every subsequent PR's code changes against ticket 11111 for the entire lifetime of the process — not just the next PR, but all PRs until restart.
Impact
Users running long-lived deployments who have experienced incorrect ticket compliance output may have disabled the feature entirely via PR_REVIEWER__REQUIRE_TICKET_ANALYSIS_REVIEW=false without identifying the root cause. This likely contributes to the bug going unreported — the symptom is easy to work around but the underlying issue remains.
Root cause
In ticket_pr_compliance_check.py, extract_and_cache_pr_tickets() stores fetched tickets in the global process-level settings singleton with no TTL, no invalidation, and no PR-scoped key:
get_settings().set('related_tickets', related_tickets)
On every subsequent PR review, the cache check finds a non-empty value and skips fetching entirely:
related_tickets = get_settings().get('related_tickets', [])
if not related_tickets:  # Always non-empty after the first PR, so the fetch is never called again
    tickets_content = await extract_tickets(git_provider)
Since get_settings() is a process-level singleton with no TTL, ticket context from the first PR reviewed is returned for every subsequent PR for the entire process lifetime.
Why this wasn't caught earlier
The caching was likely designed with GitHub Actions pr_commands sequential execution in mind — where /describe, /review and /improve all run in the same short-lived process for a single PR. In that model the cache is correct and beneficial, saving redundant API calls across tool executions on the same PR. The process then dies and the cache is gone.
In a long-lived deployment the process persists across many unrelated PRs, exposing the missing PR-scoped isolation.
Expected behaviour
Each PR review should independently resolve its own ticket context. Within a single PR, ticket context should be cached and reused across multiple tool calls. Across different PRs, the cache should be isolated.
Proposed fix
Key the cache by a hash of the PR URL rather than a global key. This preserves the cross-tool caching benefit for the pr_commands sequential use case while correctly isolating ticket context per PR.
A hash is used rather than a composite of workspace, repo and PR number because ADO workspace and repo names can contain spaces and special characters, which makes a composite string unreliable as a settings key.
import hashlib

async def extract_and_cache_pr_tickets(git_provider, vars):
    if not get_settings().get('pr_reviewer.require_ticket_analysis_review', False):
        return
    # Scope the cache entry to this PR instead of using one global key
    cache_key = f'related_tickets_{hashlib.md5(git_provider.pr_url.encode()).hexdigest()}'
    related_tickets = get_settings().get(cache_key, [])
    if not related_tickets:
        tickets_content = await extract_tickets(git_provider)
        if tickets_content:
            # Store sub-issues along with main issues
            for ticket in tickets_content:
                if "sub_issues" in ticket and ticket["sub_issues"]:
                    for sub_issue in ticket["sub_issues"]:
                        related_tickets.append(sub_issue)  # Add sub-issues content
                related_tickets.append(ticket)
            get_logger().info("Extracted tickets and sub-issues from PR description",
                              artifact={"tickets": related_tickets})
            vars['related_tickets'] = related_tickets
            get_settings().set(cache_key, related_tickets)
    else:
        get_logger().info("Using cached tickets", artifact={"tickets": related_tickets})
        vars['related_tickets'] = related_tickets
This approach:
- ✅ Correctly scopes ticket context per PR
- ✅ Preserves cross-tool caching within a single PR — in deployments configured to stay warm across requests (e.g. Azure Container Apps with a scale-to-zero inactivity timeout), multiple tool calls on the same PR (e.g. /review then /ask) benefit from the cache without re-fetching ticket context
- ✅ No behaviour change for GitHub Actions (fresh process per run anyway)
- ✅ Safe cache key regardless of special characters in workspace or repo names
- ⚠️ Cache entries accumulate for the lifetime of the process — for truly long-lived processes with high PR volume and no regular restart cycle, accumulation could become a concern over time
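One possible way to address the accumulation concern in the last point is to bound the number of cached PRs. The sketch below is an assumption, not part of PR-Agent: `PrTicketCache`, `MAX_ENTRIES`, and the placeholder URLs are hypothetical, and it uses a standalone least-recently-used cache rather than the settings singleton. It keeps the same hashed-PR-URL key as the proposed fix.

```python
import hashlib
from collections import OrderedDict

MAX_ENTRIES = 2  # hypothetical bound, kept tiny here so eviction is visible


class PrTicketCache:
    """LRU cache of ticket lists, keyed by a hash of the PR URL."""

    def __init__(self, max_entries=MAX_ENTRIES):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    @staticmethod
    def key_for(pr_url):
        # Hashing keeps the key safe even if the URL contains spaces.
        return "related_tickets_" + hashlib.md5(pr_url.encode()).hexdigest()

    def get(self, pr_url):
        key = self.key_for(pr_url)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def set(self, pr_url, tickets):
        key = self.key_for(pr_url)
        self._entries[key] = tickets
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used


pr_a = "https://example.invalid/My Project/_git/Repo A/pullrequest/1"
pr_b = "https://example.invalid/My Project/_git/Repo B/pullrequest/2"

cache = PrTicketCache()
cache.set(pr_a, [{"ticket_id": 11111}])
cache.set(pr_b, [{"ticket_id": 22222}])
```

With this shape each PR reads back only its own tickets, and the oldest entry is dropped once the bound is exceeded, so memory use stays fixed regardless of PR volume.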
Additional suggestion
A config flag pr_reviewer.cache_tickets (default true to preserve existing behaviour) would give users in long-lived deployments an explicit escape hatch to disable caching entirely:
use_cache = get_settings().get('pr_reviewer.cache_tickets', True)
cache_key = f'related_tickets_{hashlib.md5(git_provider.pr_url.encode()).hexdigest()}'
related_tickets = get_settings().get(cache_key, []) if use_cache else []
if not related_tickets:
    tickets_content = await extract_tickets(git_provider)
    ...
    if use_cache:
        get_settings().set(cache_key, related_tickets)
Setting PR_REVIEWER__CACHE_TICKETS=false would re-fetch tickets on every tool call. Given ticket API calls are lightweight this is an acceptable tradeoff — and strictly better than the current behaviour where a single stale ticket is shared across every PR for the entire process lifetime.
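The effect of the flag can be illustrated with a self-contained sketch. As before, this is not PR-Agent code: `settings` is a plain dict standing in for the settings singleton, `fetch_tickets` is a hypothetical stand-in for `extract_tickets(git_provider)`, and `pr_reviewer.cache_tickets` is the flag proposed above, which does not exist yet.

```python
import asyncio
import hashlib

# Hypothetical stand-in for the settings singleton, with the proposed flag enabled.
settings = {"pr_reviewer.cache_tickets": True}
fetch_calls = []  # records every simulated ticket API call


async def fetch_tickets(pr_url):
    # Hypothetical stand-in for extract_tickets(git_provider).
    fetch_calls.append(pr_url)
    return [{"ticket_id": pr_url.rsplit("/", 1)[-1]}]


async def tickets_for(pr_url):
    use_cache = settings.get("pr_reviewer.cache_tickets", True)
    cache_key = "related_tickets_" + hashlib.md5(pr_url.encode()).hexdigest()
    related = settings.get(cache_key, []) if use_cache else []
    if not related:
        related = await fetch_tickets(pr_url)
        if use_cache and related:
            settings[cache_key] = related
    return related


async def main():
    url = "https://example.invalid/repo/pullrequest/22222"
    await tickets_for(url)  # first call: fetches and caches
    await tickets_for(url)  # second call: served from the cache
    settings["pr_reviewer.cache_tickets"] = False
    await tickets_for(url)  # flag off: bypasses the cache and fetches again
    return len(fetch_calls)


total_fetches = asyncio.run(main())
print(total_fetches)  # 2
```

Three tool calls on the same PR result in two fetches here: one before the cache is populated and one after caching is switched off.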