Project Discovery¶
AI ReviewBot includes an automatic Project Discovery system that analyzes your repository before each code review. Discovery learns your stack, CI pipeline, and conventions so the reviewer can provide smarter, less noisy feedback.
How It Works¶
Discovery runs a 4-layer pipeline on the first PR/MR:
| Layer | Source | Cost |
|---|---|---|
| Layer 0 โ Platform API | Languages, file tree, topics from GitHub/GitLab API | Free (API only) |
| Layer 1 โ CI Analysis | Parse GitHub Actions / GitLab CI / Makefile for tools | Free (local parsing) |
| Layer 2 โ Config Files | Read pyproject.toml, package.json, linter configs |
Free (file reads) |
| Layer 3 โ LLM Interpretation | AI interprets ambiguous data (only when Layers 0-2 are insufficient) | ~50-200 tokens |
Each layer degrades gracefully โ if one fails, the pipeline continues with what it has.
Attention Zones¶
Discovery classifies each quality area into one of three Attention Zones based on your CI/tooling coverage:
| Zone | Emoji | Meaning | Reviewer behavior |
|---|---|---|---|
| Well Covered | โ | CI tools handle this area | Reviewer skips it |
| Weakly Covered | โ ๏ธ | Partial coverage, room for improvement | Reviewer pays attention + suggests improvements |
| Not Covered | โ | No automation detected | Reviewer focuses on this area |
Example zones¶
| Area | Status | Reason |
|---|---|---|
| Formatting | โ Well Covered | ruff format in CI |
| Type checking | โ Well Covered | mypy --strict in CI |
| Security scanning | โ Not Covered | No security scanner in CI |
| Test coverage | โ ๏ธ Weakly Covered | pytest runs but no coverage threshold |
What Happens Automatically¶
- Discovery analyzes your repo (languages, CI tools, config files).
- Attention Zones are computed โ each quality area is classified as well-covered, weakly-covered, or not-covered.
- Review prompt is enriched with zone-driven instructions (~200-400 tokens).
- The reviewer skips well-covered areas and focuses on not-covered ones.
Discovery Comment¶
If Discovery finds gaps or uncovered zones, it posts a one-time summary comment on the PR/MR:
๐ AI ReviewBot: Project Analysis¶
Stack: Python (FastAPI) 3.13, uv
CI: โ .github/workflows/tests.yml โ ruff, mypy, pytest
Not Covered (focusing in review)¶
- โ Security scanning โ No security scanner detected in CI ๐ก Consider adding bandit or safety to your pipeline
Could Be Improved¶
- โ ๏ธ Test coverage โ pytest runs but no coverage threshold enforced ๐ก Add
--cov-fail-under=80to enforce minimum coverageQuestions / Gaps: - No security scanner detected in CI Question: Do you use any security scanning tools? Assumption: Will check for common vulnerabilities manually
๐ก Create
.reviewbot.mdin your repo root to customize.
In verbose mode (discovery_verbose=true), the comment also includes well-covered zones:
Well Covered (skipping in review)¶
- โ Formatting โ ruff format in CI
- โ Type checking โ mypy --strict in CI
Watch-Files & Caching¶
Discovery uses watch-files to avoid re-running LLM analysis when project configuration hasn't changed.
How it works¶
- First run: Discovery runs the full pipeline, LLM returns a
watch_fileslist (e.g.,pyproject.toml,.github/workflows/tests.yml). - Subsequent runs: Discovery hashes each watch-file and compares with the cached snapshot.
- If unchanged: cached result is used โ 0 LLM tokens consumed.
- If changed: LLM re-analyzes the project.
This means repeated PRs on the same branch cost zero additional tokens for discovery, as long as the watched configuration files haven't changed.
Token savings
On a typical project, the second and subsequent PRs use 0 tokens for discovery. Only changes to CI config, pyproject.toml, package.json, or similar files trigger a new LLM call.
discover CLI Command¶
You can run discovery standalone (without creating a review) using the discover command:
Options¶
| Option | Short | Description | Default |
|---|---|---|---|
--provider |
-p |
Git provider | github |
--json |
Output as JSON | false |
|
--verbose |
-v |
Show all details (conventions, CI tools, watch-files) | false |
Examples¶
# Basic discovery
ai-review discover owner/repo
# JSON output for scripting
ai-review discover owner/repo --json
# Verbose with all details
ai-review discover owner/repo --verbose
# GitLab project
ai-review discover group/project -p gitlab
Backward compatibility
ai-review (without subcommand) still runs a review as before. The discover subcommand is new.
.reviewbot.md¶
Create a .reviewbot.md file in your repository root to provide explicit project context. When this file exists, Discovery skips the automated pipeline and uses your configuration directly.
Format¶
<!-- Auto-generated by AI ReviewBot. Feel free to edit. -->
# .reviewbot.md
## Stack
- **Language:** Python 3.13
- **Framework:** FastAPI
- **Package manager:** uv
- **Layout:** src
## Automated Checks
- **Linting:** ruff
- **Formatting:** ruff
- **Type checking:** mypy
- **Testing:** pytest
- **Security:** bandit
- **CI:** github_actions
## Review Guidance
### Skip (CI handles these)
- Import ordering (ruff handles isort rules)
- Code formatting and style (ruff format in CI)
- Type annotation completeness (mypy --strict in CI)
### Focus
- SQL injection and other OWASP Top 10 vulnerabilities
- API backward compatibility
- Business logic correctness
### Conventions
- All endpoints must return Pydantic response models
- Use dependency injection for database sessions
Sections¶
| Section | Purpose |
|---|---|
| Stack | Primary language, version, framework, package manager, layout |
| Automated Checks | Tools already running in CI (the reviewer will skip these areas) |
| Review Guidance โ Skip | Specific areas the reviewer should not comment on |
| Review Guidance โ Focus | Areas you want extra attention on |
| Review Guidance โ Conventions | Project-specific rules the reviewer should enforce |
Auto-generation
You can let Discovery run once, then copy its findings into .reviewbot.md and adjust as needed. The bot includes a footer link suggesting this workflow.
Configuration¶
| Variable | Default | Description |
|---|---|---|
AI_REVIEWER_DISCOVERY_ENABLED |
true |
Enable or disable project discovery |
AI_REVIEWER_DISCOVERY_VERBOSE |
false |
Always post discovery comment (default: only on gaps/uncovered zones) |
AI_REVIEWER_DISCOVERY_TIMEOUT |
30 |
Discovery pipeline timeout in seconds (1-300) |
Set AI_REVIEWER_DISCOVERY_ENABLED to false to skip discovery entirely. The reviewer will still work, but without project-specific context.
# GitHub Actions โ disable discovery
- uses: KonstZiv/ai-code-reviewer@v1
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
google_api_key: ${{ secrets.AI_REVIEWER_GOOGLE_API_KEY }}
discovery_enabled: 'false'
Silent Mode¶
The discovery comment is not posted when:
.reviewbot.mdexists in the repository โ the bot assumes you've already configured it.- No gaps and no uncovered zones โ everything is well-covered, no questions to ask.
- Duplicate detection โ a discovery comment was already posted on this PR/MR.
In all three cases, discovery still runs and enriches the review prompt โ it just doesn't post a visible comment.
FAQ¶
Can I disable discovery?¶
Yes. Set AI_REVIEWER_DISCOVERY_ENABLED=false. The reviewer will work without project context, the same as before the Discovery feature was added.
Does discovery cost extra LLM tokens?¶
On the first run: Layers 0-2 are free (API calls and local parsing). Layer 3 (LLM interpretation) is only invoked when the first three layers don't provide enough data โ typically 50-200 tokens, which is negligible compared to the review itself (~1,500 tokens).
On subsequent runs: if your watch-files haven't changed, discovery uses the cached result and costs 0 tokens.
Can I edit the auto-generated .reviewbot.md?¶
Yes, absolutely. The file is designed to be human-editable. Change anything you need โ the parser is tolerant of extra content and missing sections.
Does discovery run on every PR?¶
Discovery enriches the review prompt on every PR. The LLM call is cached via watch-files (0 tokens when unchanged). The discovery comment is posted only once (duplicate detection prevents repeated posts).
How do I see all zones including well-covered?¶
Set AI_REVIEWER_DISCOVERY_VERBOSE=true. This forces the discovery comment to always be posted and includes all zones (well-covered, weakly-covered, not-covered).
What if discovery takes too long?¶
Set AI_REVIEWER_DISCOVERY_TIMEOUT to a higher value (default: 30 seconds, max: 300). If discovery exceeds the timeout, the review proceeds without discovery context.