Vibe Analyzer
Agentic RAG engine for code and knowledge bases
# Quick Start
## 1. Install OpenSearch
## 2. Install Open WebUI
## 3. Install Vibe Analyzer
cargo install vibe-analyzer
## 4. Add a source (project or knowledge base)
vibe-analyzer source add {path}
## 5. Index projects
vibe-analyzer scan index
## 6. Start MCP
vibe-analyzer serve start
Introduction
What is Vibe Analyzer
Vibe Analyzer is a codebase analysis and Agentic RAG engine. It extracts structure from source code via AST parsing, enriches it with an LLM, runs static analysis across 14 languages, and indexes everything into OpenSearch. AI assistants access the indexed knowledge through 11 MCP tools.
The Problem with Traditional RAG
Traditional RAG works like this:
Query → Embeddings → Find similar documents → Load into prompt → Response
Problems:
- 📈 Found documents are added to the prompt in their entirety
- 💾 The larger the project, the more VRAM is required
- 🔍 Relevance drops as context volume grows
- 💸 Each query becomes more expensive
How Agentic RAG Works
Vibe Analyzer flips the paradigm:
Query → AI model selects an MCP tool → Tool returns a structured response
Advantages:
- 📉 Minimal context — the model receives only what the tool returns
- 🧠 No embeddings — keyword and AST search via OpenSearch
- 🔗 One tool call = complete answer, no document stuffing
- ♾️ Context size stays constant regardless of project size
Key Features
- 🌳 AST parsing for 14 programming languages
- 🔍 Static analysis: code markers (TODO, FIXME, HACK), warnings (unwrap, panic) across all languages
- 💡 LLM enrichment: technical debt, bugs, refactoring suggestions, file summaries
- 🖥️ Multi-node LLM cluster — distribute enrichment across local and cloud models
- 🌐 Multi-provider support: Ollama, DeepSeek, Qwen
- 📄 Export AST and analysis results to JSON, JSON5, TOML, TOON, XML, YAML
- 📝 Semantic and morphological search across code and documentation
- ⚡ Incremental indexing (modified files only)
- 📦 Self-contained MCP tools (one call — complete response)
- 🗂️ Multilingual support (RU, EN, ZH)
- 🦀 Built in Rust — fast and memory-efficient
Who This Is For
- Development teams — index your codebase, let AI assistants answer architecture questions
- Developers under NDA: the entire stack runs locally (OpenSearch, Ollama, MCP server), so no data is sent to external APIs
- Private projects — models from 3B parameters run on your hardware
- Technical writers — store documentation in Markdown files and search it in any language
- Open-source projects — give contributors a quick way to understand the code
What’s Next
- Quick Start — installation, setup, and first run
- Architecture — how everything works under the hood
- MCP Tools — complete reference for all 11 tools
Quick Start
Prerequisites
- OpenSearch — storage and search for indexed data
- LLM provider — Ollama (local), DeepSeek, or Qwen for code enrichment
Both can be run locally via Docker.
Installation
# Via Cargo (recommended)
cargo install vibe-analyzer
# Or build from source
git clone https://gitcode.com/keygenqt_vz/vibe-analyzer.git
cd vibe-analyzer
cargo build --release
Starting Services
# OpenSearch (run from the repository root; the subshell keeps the working directory unchanged)
(cd docker/opensearch && docker-compose up -d)
# Open WebUI (optional, for AI assistant connection)
(cd docker/open-webui && docker-compose up -d)
Configuration
Config file: ~/.vibe-analyzer/config.json5 (created automatically on first run).
See Configuration for all options and detailed reference.
Usage
# Add a source
vibe-analyzer source add /path/to/project
# AST only (no LLM)
vibe-analyzer analyze export
# AST + LLM enrichment
vibe-analyzer analyze export -m meta,debt,errors
# Code files only, JSON5 format
vibe-analyzer analyze export -m debt -t code -f json5
# Index to OpenSearch
vibe-analyzer analyze index --target my-project
# Start MCP server for AI assistants
vibe-analyzer serve start
# View statistics
vibe-analyzer stats info
What’s Next
- Architecture — detailed breakdown
- CLI Reference — complete command reference
- Configuration — all config options
- Integrations — connecting AI assistants
Architecture
Overview
Vibe Analyzer consists of four main components that work sequentially:
Code sources
│
▼
┌─────────────┐
│ Scanner │ AST parsing, structure extraction
└─────────────┘
│
▼
┌─────────────┐
│ Analyzer │ LLM enrichment, descriptions, tags
└─────────────┘
│
▼
┌─────────────┐
│ Indexer │ Writing to OpenSearch
└─────────────┘
│
▼
┌─────────────┐
│ MCP Server │ HTTP API for AI assistants
└─────────────┘
Components
Scanner
The Scanner handles initial source code processing:
- File system traversal: recursive directory scanning respecting .gitignore and default exclusion patterns (.git, target, node_modules, etc.)
- Language detection: selects the appropriate tree-sitter parser based on file extension
- AST parsing: extracts code structure (functions, classes, imports, variables, enums, interfaces, structs)
- Metadata collection: line count, file size, BLAKE3 content hash
- License detection: searches for a LICENSE file and identifies the license type via askalono and SPDX
- README detection: priority is root > subdirectories, then .md > .txt > no extension
- Statistics collection: aggregation by language, file count, lines of code
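The README priority rule above can be read as a two-level sort key. The sketch below is only an illustration of that reading, not the Scanner's code: `pick_readme` and `EXT_RANK` are hypothetical names, and ranking deeper subdirectories after shallower ones is an assumption.

```python
from pathlib import PurePosixPath

# Lower rank wins: .md > .txt > no extension (hypothetical table for this sketch).
EXT_RANK = {".md": 0, ".txt": 1, "": 2}


def pick_readme(paths):
    """Return the preferred README path, or None if there is no candidate."""
    candidates = []
    for path in map(PurePosixPath, paths):
        suffix = path.suffix.lower()
        if path.stem.lower() == "readme" and suffix in EXT_RANK:
            depth = len(path.parts) - 1  # 0 means the project root
            candidates.append((depth, EXT_RANK[suffix], str(path)))
    # Tuples compare by depth first, then by extension rank.
    return min(candidates)[2] if candidates else None


best = pick_readme(["docs/README.md", "README.txt", "README", "src/main.rs"])
```

Here the root-level `README.txt` is chosen over `docs/README.md`, because location takes precedence over extension.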
Analyzer
The Analyzer enriches scanner results using LLM and static analysis:
- Static analysis: finds code markers (TODO, FIXME, HACK) and warnings (unwrap, panic, bare except) across 14 languages
- LLM enrichment: four modes, Meta (summaries), Debt (technical debt), Errors (bugs), Advice (refactoring)
- Prompt generation: builds a request for each file containing the AST structure
- Request distribution: with multiple cluster nodes configured, prompts are distributed across workers via vibe-cluster. Local models (Ollama) get priority, with automatic load balancing across all nodes
- Batch processing: files are grouped into batches limited by max_chunk_chars (Meta mode). Other modes process one file per prompt
- Controlled generation: configurable parameters temperature, seed, num_ctx, num_predict for reproducible results
- Project summarization: a separate request generates a brief description of the entire source
Indexer
The Indexer manages writing data to OpenSearch:
- Three indices per source:
  - vibe_meta: project metadata (summary, license, statistics, README)
  - vibe_files_{hash}: full file contents
  - vibe_files_analysis_{hash}: AST, enriched descriptions, and search tags
- Bulk operations: batch writing for maximum performance
- Incremental updates: BLAKE3 hash comparison, only changed files are re-processed
- Cleanup: removes stale data no longer present in the source
MCP Server
The MCP server provides an API for AI assistants:
- Protocol — Model Context Protocol (MCP) via Streamable HTTP transport
- 11 tools — admin, get, search, and show categories
- Anti-Hallucination Protection — parameter normalization, tool name aliases, auto language detection
- Logging — middleware for tracking all requests
- CORS — cross-origin request support for web interfaces
Indexing Lifecycle
Full Indexing
1. source add → save path to config
2. scan index → check OpenSearch → cleanup orphaned data → AST parsing → LLM enrichment → indexing
Incremental Updates
1. scan index → load hashes from OpenSearch → compare with files on disk
2. New/modified → AST parsing → LLM enrichment → indexing
3. Deleted → removal from OpenSearch
4. Unchanged → skip
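The comparison step above amounts to a diff between two path-to-hash maps. The following is a minimal sketch under stated assumptions: `classify_changes` is a hypothetical helper, and `hashlib.blake2b` stands in for BLAKE3, which is not part of the Python standard library.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents (BLAKE2b here, as a stand-in for BLAKE3)."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def classify_changes(indexed: dict, on_disk: dict) -> dict:
    """Compare path -> hash maps loaded from the index and computed from disk."""
    changed = sorted(p for p, h in on_disk.items() if indexed.get(p) != h)
    deleted = sorted(p for p in indexed if p not in on_disk)
    unchanged = sorted(p for p, h in on_disk.items() if indexed.get(p) == h)
    return {"changed": changed, "deleted": deleted, "unchanged": unchanged}


indexed = {
    "src/main.rs": content_hash(b"fn main() {}"),
    "src/old.rs": content_hash(b"// old"),
}
on_disk = {
    "src/main.rs": content_hash(b"fn main() {}"),
    "src/lib.rs": content_hash(b"pub fn add() {}"),
}
plan = classify_changes(indexed, on_disk)
```

New and modified files land in the same `changed` bucket because both go through the same parse, enrich, and index path.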
Export without Indexing
1. scan ast → traverse files → AST parsing → export to file
2. scan analyze → traverse files → AST parsing → LLM enrichment → export to file
scan ast and scan analyze do not touch OpenSearch — file export only.
OpenSearch Indices
vibe_meta
One document per project: summary, license, README, aggregated statistics (files, lines, size).
vibe_files_{hash}
One document per file: full contents. The content field is not indexed for search — only stored for retrieval via get_file_content.
vibe_files_analysis_{hash}
One document per text file. Contains AST (functions, classes, imports, etc.), file metadata, and multilingual search tags. The description and tags fields are added after LLM enrichment.
LLM Cluster
Vibe Analyzer uses vibe-cluster for distributing enrichment prompts across multiple LLM providers:
- Multi-provider: supports Ollama (local), DeepSeek, and Qwen (cloud) simultaneously
- Load balancing: prompts are distributed across all available nodes via atomic work-stealing
- Local-first priority: local Ollama models get prompts before cloud providers for faster response
- Parallel connections: configurable parallel parameter for multiple workers per provider
- Automatic retry: network errors and server failures trigger automatic retries
- Exclusive mode: eject and reload models between prompts for clean context
- Per-node statistics: after enrichment, reports how many files each node processed
This approach allows:
- Faster enrichment through parallel processing on multiple GPUs and cloud APIs
- Cost optimization — local models handle most work, cloud used only when needed
- Scaling — add more nodes to the cluster configuration as needed
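To make the distribution idea concrete, here is a rough sketch of workers claiming prompts from a shared counter and reporting per-node totals. It is not how vibe-cluster is implemented: the `name` key and the `run_cluster` function are hypothetical, a lock stands in for an atomic fetch-and-increment, and local-first priority is not modeled.

```python
import threading
from collections import Counter


def run_cluster(prompts, nodes):
    """Distribute prompts across nodes; each worker claims the next unclaimed index."""
    next_index = 0
    lock = threading.Lock()
    processed = Counter()

    def worker(node_name):
        nonlocal next_index
        while True:
            with lock:  # stands in for an atomic fetch-and-increment
                if next_index >= len(prompts):
                    return
                index = next_index
                next_index += 1
                processed[node_name] += 1
            _ = prompts[index].upper()  # placeholder for the real LLM request

    # One thread per worker slot: `parallel` workers for each node.
    threads = [
        threading.Thread(target=worker, args=(node["name"],))
        for node in nodes
        for _slot in range(node["parallel"])
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dict(processed)


nodes = [{"name": "ollama-local", "parallel": 1}, {"name": "deepseek", "parallel": 2}]
stats = run_cluster([f"prompt {i}" for i in range(20)], nodes)
```

Because every worker pulls from the same counter, faster nodes naturally end up processing more prompts, which is the load-balancing effect described above.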
Anti-Hallucination Protection
Protection against AI model hallucinations when calling tools:
| Mechanism | Description |
|---|---|
| Name aliases | 150+ alternative tool names (e.g., search_code_functions → search_by_code_functions) |
| Parameter normalization | Wildcard replacement, whitespace trimming, type casting |
| Bounds validation | limit always in 1–10 range, level capped |
| Auto language detection | Detects Cyrillic, Latin, and CJK in search queries |
| Soft error handling | Invalid parameters don’t cause errors, they are normalized to safe values |
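A small sketch of what alias resolution and soft parameter normalization could look like for the search tools. The alias shown is the one from the table; everything else (`normalize_call`, the fallback to the default limit of 3) is an assumption made for illustration, not the server's actual logic.

```python
# One example alias from the table above; the real list is much longer.
ALIASES = {"search_code_functions": "search_by_code_functions"}


def normalize_call(name: str, params: dict):
    """Resolve a tool alias and coerce parameters to safe values instead of failing."""
    tool = ALIASES.get(name.strip(), name.strip())
    clean = {}
    for key, value in params.items():
        if isinstance(value, str):
            value = value.strip()  # whitespace trimming
        clean[key] = value
    # Type casting with a fallback to the default limit, then clamping to 1..10.
    try:
        limit = int(clean.get("limit", 3))
    except (TypeError, ValueError):
        limit = 3
    clean["limit"] = max(1, min(10, limit))
    return tool, clean


call = normalize_call(" search_code_functions ", {"query": "  add ", "limit": "50"})
```

The distorted name and the out-of-range limit are both corrected silently, so the model gets a result instead of an error it might react to unpredictably.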
Performance
- Rust — native execution without GC overhead
- Parallel parsing — each file processed independently
- Bulk OpenSearch writes — thousands of documents per operation
- Streaming processing — files are processed as they are discovered, without waiting for the entire directory
- Incremental updates — only changed files are re-indexed when updating a source
Supported Languages
Vibe Analyzer supports AST parsing, static analysis, and LLM enrichment for 14 languages.
Full List
| Language | Extensions | AST | Static Analysis | Enrichment |
|---|---|---|---|---|
| Rust | .rs | ✅ | ✅ | ✅ |
| Python | .py | ✅ | ✅ | ✅ |
| JavaScript | .js | ✅ | ✅ | ✅ |
| TypeScript | .ts | ✅ | ✅ | ✅ |
| Java | .java | ✅ | ✅ | ✅ |
| Go | .go | ✅ | ✅ | ✅ |
| C# | .cs | ✅ | ✅ | ✅ |
| Kotlin | .kt | ✅ | ✅ | ✅ |
| Swift | .swift | ✅ | ✅ | ✅ |
| Dart | .dart | ✅ | ✅ | ✅ |
| Bash | .sh | ✅ | ✅ | ✅ |
| Batch | .bat | ✅ | ✅ | ✅ |
| ArkTS | .ets | ✅ | ✅ | ✅ |
| Markdown | .md | ✅ | ✅ | ✅ |
Extracted Elements
Code
| Element | Description |
|---|---|
| functions | Functions and methods with signatures and doc comments |
| classes | Class declarations |
| structs | Struct and record declarations |
| enums | Enum declarations |
| interfaces | Interface, trait, and protocol declarations |
| variables | Module-level variables and constants |
| imports | Import statements and dependencies |
| header_comments | File-level documentation comments |
Markdown
| Element | Description |
|---|---|
| headings | Headings with level, title, and preview |
| links | Links with text and URL |
| code_blocks | Fenced code block languages |
| frontmatter | YAML frontmatter metadata |
Static Analysis
| Element | Description |
|---|---|
| markers | Code markers: TODO, FIXME, HACK, XXX, and 17 more |
| warnings | Potential issues: unwrap, panic, empty catch, console.log |
Doc Comment Formats
Vibe Analyzer extracts documentation from specially formatted comments. Regular comments (//, #) are ignored.
| Language | Doc Comment | Module Comment | Example |
|---|---|---|---|
| Rust | /// or /** */ | //! or /*! */ | /// Adds two numbers |
| Python | """...""" (docstring) | """...""" at file start | """Adds two numbers""" |
| JavaScript | /** */ (JSDoc) | /** */ at file start | /** @param {number} a */ |
| TypeScript | /** */ (JSDoc) | /** */ at file start | /** @param a First number */ |
| Java | /** */ (Javadoc) | /** */ at file start | /** @param a First number */ |
| Kotlin | /** */ (KDoc) | /** */ at file start | /** @param a First number */ |
| C# | /// or /** */ | /** */ or /// at file start | /// <summary>Adds two numbers</summary> |
| Swift | /// or /** */ | /// or /** */ at file start | /// - Parameters: a: First number |
| Dart | /// | /// at file start | /// Adds two numbers |
| Go | // (any before declaration) | // at file start | // Add adds two numbers |
| Bash | ## or # before function | ## at script start | ## Module documentation for Bash testing |
| Batch | :: before label | :: at script start | :: Module documentation for Batch testing |
| ArkTS | /** */ | /** */ at file start | /** Async function example */ |
Multilingual Search
Each element receives tags in EN, RU, and ZH for language-agnostic search.
Limitations
- Maximum file size: 10 MB
- Ignored directories: target, node_modules, .git, .idea, etc.
- Binary files: excluded from parsing
- Nested elements: methods inside classes extracted as functions; variables inside functions are not extracted
Configuration
Vibe Analyzer uses a JSON5 configuration file. JSON5 is an extended version of JSON with support for comments, trailing commas, and other convenient features.
Location
~/.vibe-analyzer/config.json5
The file is created automatically with default settings the first time any CLI command is run.
Configuration Structure
The configuration consists of these sections: version, log_level, opensearch, mcp, analyze, export, cluster, and sources.
Full Example with Comments
{
// Configuration version (do not modify)
"version": "0.0.5",
// Log level: cli, error, warn, info, debug, trace
"log_level": "cli",
// OpenSearch connection
"opensearch": {
"host": "http://localhost:9200"
},
// MCP server
"mcp": {
"host": "127.0.0.1",
"port": 9020,
"protocol": "latest"
},
// Analysis settings
"analyze": {
"max_chunk_chars": 3000,
"include": {
"meta": false,
"debt": false,
"advice": false,
"errors": false
}
},
// Default export settings
"export": {
"format": "json",
"output_dir": "/Users/keygenqt/Downloads",
"types": {
"code": true,
"markdown": true,
"text": true,
"binary": true
},
"include": {
"imports": true,
"functions": true,
"variables": true,
"enums": true,
"interfaces": true,
"classes": true,
"structs": true,
"comments": true,
"header_comments": true,
"headings": true,
"links": true,
"code_blocks": true,
"markers": false,
"warnings": false,
"body": false
}
},
// LLM cluster nodes (local and cloud models)
"cluster": [
{
"provider": "ollama",
"host": "http://localhost:11434",
"model": "qwen2.5-coder:3b-instruct",
"timeout_secs": 60,
"temperature": 0.1,
"seed": 42,
"num_ctx": 4096,
"num_predict": 2048,
"parallel": 1
},
{
"provider": "deepseek",
"host": "https://api.deepseek.com/v1",
"model": "deepseek-v4-flash",
"api_key": "sk-...",
"timeout_secs": 120,
"temperature": 0.1,
"seed": 42,
"num_ctx": 4096,
"num_predict": 2048,
"parallel": 2
}
],
// Knowledge sources for indexing
"sources": ["/Users/keygenqt/Documents/Gitcode/Projects/vibe-analyzer"]
}
Sections in Detail
version
{
"version": "0.0.5"
}
Configuration file version. Do not modify manually — updated automatically during config migration between versions.
log_level
{
"log_level": "cli"
}
Application log level. Supported values: cli, error, warn, info, debug, trace. The cli level disables tracing output for clean command-line output.
opensearch
{
"opensearch": {
"host": "http://localhost:9200"
}
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| host | string | http://localhost:9200 | OpenSearch server URL. Can point to a local or remote server |
mcp
{
"mcp": {
"host": "127.0.0.1",
"port": 9020,
"protocol": "latest"
}
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| host | string | 127.0.0.1 | Server bind address. Use 0.0.0.0 to accept external connections (Docker, remote clients) or 127.0.0.1 for local access only |
| port | integer | 9020 | MCP server port |
| protocol | string | latest | MCP protocol version: 2024-11-05, 2025-03-26, 2025-06-18, or latest |
analyze
{
"analyze": {
"max_chunk_chars": 3000,
"include": {
"meta": false,
"debt": false,
"advice": false,
"errors": false
}
}
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_chunk_chars | integer | 3000 | Maximum characters per LLM request batch |
Enrichment features:
| Feature | Description |
|---|---|
| meta | Generate summary and search tags for the file |
| debt | Detect technical debt: TODO markers, magic numbers |
| advice | Suggest refactoring, naming, and test improvements |
| errors | Find bugs: unsafe calls, swallowed errors, spelling mistakes |
When all features are disabled, only AST data is returned.
export
{
"export": {
"format": "json",
"output_dir": "/Users/keygenqt/Downloads",
"types": {
"code": true,
"markdown": true,
"text": true,
"binary": true
},
"include": {
"imports": true,
"functions": true,
"variables": true,
"markers": false,
"warnings": false,
"body": false
}
}
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | string | json | Default export format: json, json5, toml, toon, xml, yaml |
| output_dir | string | ~/Downloads | Default output directory for exported files |
File types (types):
| Type | Description |
|---|---|
| code | Source code files (Rust, Python, etc.) |
| markdown | Markdown documentation files |
| text | Text files without AST parser (configs) |
| binary | Binary files (images, archives) |
AST elements (include):
| Element | Description |
|---|---|
| imports | Import statements |
| functions | Function signatures |
| variables | Variable declarations |
| enums | Enum declarations |
| interfaces | Interface/trait declarations |
| classes | Class declarations |
| structs | Struct declarations |
| comments | Documentation comments |
| header_comments | File header comments |
| headings | Markdown headings |
| links | Markdown links |
| code_blocks | Markdown code blocks |
| markers | Static analysis markers (TODO, FIXME, HACK) |
| warnings | Static analysis warnings (unwrap, panic) |
| body | Body content (functions, classes, headings) |
cluster
{
"cluster": [
{
"provider": "ollama",
"host": "http://localhost:11434",
"model": "qwen2.5-coder:3b-instruct",
"timeout_secs": 60,
"temperature": 0.1,
"seed": 42,
"num_ctx": 4096,
"num_predict": 2048,
"parallel": 1
}
]
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | string | ollama | Provider type: ollama, deepseek, qwen |
| host | string | http://localhost:11434 | API endpoint URL |
| model | string | qwen2.5-coder:3b-instruct | Model name. For Ollama, the model must be pre-loaded via ollama pull |
| timeout_secs | integer | 60 | Request timeout in seconds |
| temperature | float | 0.1 | Generation temperature (0.0 is deterministic, 1.0 is creative) |
| seed | integer | 42 | Random seed for reproducible results |
| num_ctx | integer | 4096 | Context window size in tokens |
| num_predict | integer | 2048 | Maximum tokens in response |
| api_key | string | (none) | API key for cloud providers (not needed for Ollama) |
| parallel | integer | 1 | Number of parallel workers for this provider |
Multiple nodes can be specified for load distribution across local and cloud models.
sources
{
"sources": ["/Users/keygenqt/Documents/Gitcode/Projects/vibe-analyzer"]
}
| Parameter | Type | Default | Description |
|---|---|---|---|
| sources | array of strings | [] | List of absolute paths to sources |
Managing sources via CLI:
vibe-analyzer source add /path/to/project
vibe-analyzer source remove --target /path/to/project
vibe-analyzer source list
Configuration Validation
Vibe Analyzer validates the configuration at startup and applies safe defaults if parameters are missing or invalid:
- max_chunk_chars → minimum 1000, maximum 100000
- Invalid paths → normalized to absolute form
- Missing sections → created with default values
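The three validation rules can be sketched as follows. This is an illustrative approximation only: `validate_config` and the trimmed `DEFAULTS` table are hypothetical, and the real tool reads JSON5 rather than a Python dict.

```python
import copy
import os

# A trimmed set of defaults taken from the example config above.
DEFAULTS = {
    "opensearch": {"host": "http://localhost:9200"},
    "mcp": {"host": "127.0.0.1", "port": 9020, "protocol": "latest"},
    "analyze": {"max_chunk_chars": 3000},
    "sources": [],
}


def validate_config(config: dict) -> dict:
    """Return a copy of the config with safe defaults applied."""
    result = copy.deepcopy(config)
    # Missing sections are created with default values.
    for section, default in DEFAULTS.items():
        if section not in result:
            result[section] = copy.deepcopy(default)
    # max_chunk_chars is clamped to the 1000..100000 range.
    analyze = result["analyze"]
    try:
        chunk = int(analyze.get("max_chunk_chars", 3000))
    except (TypeError, ValueError):
        chunk = 3000
    analyze["max_chunk_chars"] = max(1000, min(100000, chunk))
    # Source paths are normalized to absolute form.
    result["sources"] = [
        os.path.abspath(os.path.expanduser(path)) for path in result["sources"]
    ]
    return result


config = validate_config({"analyze": {"max_chunk_chars": 50}, "sources": ["projects/app"]})
```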
Overriding the Configuration File
vibe-analyzer --config /custom/path/config.json5 source list
Default is ~/.vibe-analyzer/config.json5.
CLI Reference
Vibe Analyzer provides a command-line interface for managing knowledge sources, analysis, exporting, indexing, and running the MCP server.
General Syntax
vibe-analyzer [global options] <command> [subcommand] [options]
Global Options
| Option | Description |
|---|---|
| --config <path> | Path to config file (default: ~/.vibe-analyzer/config.json5) |
| --help | Show help |
| --version | Show version |
Commands
source — Source Management
Add, remove, and list knowledge sources.
vibe-analyzer source <subcommand>
| Subcommand | Description |
|---|---|
| add <path> | Adds a new directory or file to the sources list. The path is automatically converted to absolute |
| remove --target <path> | Removes a source from the configuration. Accepts full path or unique directory name |
| list | Shows all added sources with absolute paths |
Examples:
vibe-analyzer source add /home/user/projects/my-app
vibe-analyzer source remove --target my-app
vibe-analyzer source list
analyze — Analysis and Export
Extract code structure, run LLM enrichment, and export results.
vibe-analyzer analyze export [options]
| Option | Description |
|---|---|
| --target <path> | Process a specific source. If not specified, all sources are processed |
| -m, --mode <mode> | Analysis modes: ast, meta, debt, errors, advice (comma-separated) |
| -t, --type <type> | File types to export: code, markdown, text, binary (comma-separated) |
| -f, --format <format> | Export format: json (default), json5, toml, toon, xml, yaml |
| -p, --path <path> | Source path for direct file/directory scanning |
| -o, --output <path> | Export path. If not specified, the file is created in ~/Downloads/ |
Examples:
# AST only, all sources
vibe-analyzer analyze export
# AST + LLM enrichment for code files only
vibe-analyzer analyze export -m meta,debt -t code
# Errors in markdown documentation only
vibe-analyzer analyze export -m errors -t markdown
# Full analysis with JSON5 export
vibe-analyzer analyze export -m meta,debt,errors,advice -f json5
index — OpenSearch Indexing
Full cycle: AST parsing → LLM enrichment → write to OpenSearch.
vibe-analyzer analyze index [options]
| Option | Description |
|---|---|
| --target <path> | Index a specific source. If not specified, all sources are indexed |
| --force | Force full reindexing. Ignores hashes and processes all files again |
Examples:
vibe-analyzer analyze index
vibe-analyzer analyze index --target my-app
vibe-analyzer analyze index --target my-app --force
stats — Statistics
View information and statistics for indexed projects.
vibe-analyzer stats <subcommand>
stats info
vibe-analyzer stats info [options]
| Option | Description |
|---|---|
| --target <path> | Show statistics for a specific project. If not specified, all projects are shown |
Example output:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Language Files Lines AST Objects Size
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Rust 194 15146 1631 498.26 KB
Markdown 33 2884 296 102.64 KB
...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total 287 24893 2082 755.11 KB
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
stats tree
vibe-analyzer stats tree --target <project> [options]
| Option | Description |
|---|---|
| --target <path> | Project to display the tree for (required) |
| -L, --level <number> | Maximum tree depth (default: 3) |
serve — MCP Server
Start, stop, and check MCP server status.
vibe-analyzer serve <subcommand>
| Subcommand | Description |
|---|---|
| start | Start MCP server |
| stop | Stop running server (reserved) |
| status | Show server status (reserved) |
Examples:
vibe-analyzer serve start
vibe-analyzer serve start --port 9020 --host 127.0.0.1
Export Formats
Six formats available with --format:
| Format | Key | Extension | Description |
|---|---|---|---|
| JSON | json | .json | Compact JSON without extra whitespace — minimal file size |
| JSON5 | json5 | .json5 | JSON5 with comments and trailing commas — human-readable |
| TOML | toml | .toml | TOML format |
| TOON | toon | .toon | TOON format — token-efficient output, optimized for LLMs |
| XML | xml | .xml | XML with pretty-print formatting |
| YAML | yaml | .yaml | YAML format |
MCP Tools
Vibe Analyzer provides 11 MCP tools that AI models can call to search code and documentation. Each tool returns a structured response — no document stuffing.
How It Works
AI model → selects a tool → calls MCP → receives a structured response
Rules for the AI model: use tools, respond concisely, one call is enough.
Tools
| Tool | Parameters | Description |
|---|---|---|
| admin_sync | (none) | Reindex all projects in the background |
| get_file_content | path (required) | Full file content by path. Supports partial matching and wildcards |
| get_file_ast | path (required) | Full AST: imports, functions, classes, structs, enums, headings |
| search_by_code_imports | query (required), target, limit (3) | Search by imports/dependencies |
| search_by_code_functions | query (required), target, limit (3) | Search by functions/methods |
| search_by_code_classes | query (required), target, limit (3) | Search by classes/structs/interfaces |
| search_by_code_variables | query (required), target, limit (3) | Search by variables/constants/enums |
| search_documentation | query (required), limit (3) | Search all markdown docs. Multilingual (RU/EN/ZH) |
| search_knowledge | query (required), limit (3) | Alias for search_documentation |
| show_projects | size (10) | List all indexed projects with descriptions |
| show_stats | target | Project statistics by language, files, lines of code |
| show_tree | target, level (3) | Directory tree of files and folders |
All limit parameters are clamped to the 1–10 range; the value in parentheses after a parameter is its default. target accepts a full path or a unique directory name.
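For orientation, a tool invocation travels as a JSON-RPC 2.0 `tools/call` request, as defined by the Model Context Protocol. The sketch below only builds the request body for one of the tools listed above; `build_tool_call` is a hypothetical helper, and the server's endpoint path is not stated here, so no network call is made.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload, ensure_ascii=False)


body = build_tool_call(
    1,
    "search_by_code_functions",
    {"query": "add", "target": "samples", "limit": 3},
)
```

In practice an MCP client library produces this message on the model's behalf; the point is that the request carries only a tool name and a few arguments, not documents.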
Anti-Hallucination Protection
| Mechanism | Description |
|---|---|
| Name aliases (160+) | Fixes distorted tool names automatically |
| Parameter normalization | Wildcards, whitespace, invalid values → safe defaults |
| Auto language detection | Cyrillic → Russian tags, Latin → English, CJK → Chinese |
| Soft error handling | Invalid parameters don’t cause errors, they’re normalized |
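Script detection of this kind can be approximated with Unicode range checks. The sketch below is an assumption about how it might be done, not the server's code: `detect_scripts` is a hypothetical name, and only the basic Cyrillic block and the CJK Unified Ideographs block are checked.

```python
def detect_scripts(query: str) -> list:
    """Return the tag languages implied by the scripts found in a query."""
    found = set()
    for ch in query:
        code = ord(ch)
        if 0x0400 <= code <= 0x04FF:       # basic Cyrillic block
            found.add("ru")
        elif 0x4E00 <= code <= 0x9FFF:     # CJK Unified Ideographs
            found.add("zh")
        elif ch.isascii() and ch.isalpha():  # plain Latin letters
            found.add("en")
    return sorted(found)


languages = detect_scripts("функции parse 函数")
```

A mixed query yields several languages, so each one can be matched against its own set of tags.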
LLM Enrichment
After AST parsing, Vibe Analyzer enriches results via vibe-cluster across multiple providers in four analysis modes: Meta, Debt, Errors, and Advice.
How It Works
AST data → static analysis → batching → LLM request → enrichment results
1. Static Analysis
Before LLM processing, code is analyzed programmatically across 14 languages:
- Markers: TODO, FIXME, HACK, and 18 other markers
- Warnings: unsafe calls (unwrap, panic), swallowed exceptions, debug statements
These are included in the prompt so the model can build on them.
2. File Enrichment
Batching. Files are grouped into batches limited by max_chunk_chars (default 3000) from the analyze config section.
The prompt contains the AST structure: functions, classes, structs, and other elements. Which elements to include is controlled by export.include in config.
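The batching rule can be pictured as a greedy grouping by character count. This is a minimal sketch, assuming each file is represented by the text of its AST section in the prompt; `make_batches` is a hypothetical name, and letting an oversized file form its own batch is an assumption.

```python
def make_batches(files, max_chunk_chars=3000):
    """Group (path, ast_text) pairs into batches of at most max_chunk_chars characters."""
    batches, current, size = [], [], 0
    for path, ast_text in files:
        length = len(ast_text)
        # Start a new batch when adding this file would exceed the limit.
        if current and size + length > max_chunk_chars:
            batches.append(current)
            current, size = [], 0
        current.append(path)
        size += length
    if current:
        batches.append(current)
    return batches


files = [
    ("a.rs", "x" * 1200),
    ("b.rs", "x" * 1500),
    ("c.rs", "x" * 900),
    ("d.rs", "x" * 4000),  # larger than the limit, so it gets a batch of its own
]
batches = make_batches(files)
```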
Analysis modes:
| Mode | Description |
|---|---|
| meta | Generate summary and search tags for each file |
| debt | Detect technical debt: code duplication, magic numbers, complex logic |
| errors | Find bugs: unsafe calls, swallowed errors |
| advice | Suggest refactoring, naming, and test improvements |
Prompt. For Meta, the model receives a JSON template to fill. For Debt/Errors/Advice, the model returns a JSON array of issues found.
Response. Example for Meta:
{
"files": [
{
"path": "src/main.rs",
"description": "Main entry point for the CLI application",
"tags": ["entry", "point", "cli", "argument", "parsing", "rust", "binary"]
}
]
}
3. Parallel Processing
If multiple cluster nodes are configured, prompts are distributed via vibe-cluster:
- Local models (Ollama) get priority over cloud providers
- Nodes process prompts in parallel via atomic work-stealing
- Configurable parallel parameter for multiple workers per provider
- At the end, per-node statistics are reported
4. JSON Repair
LLMs often return malformed JSON: they add comments, wrap the output in Markdown code blocks, or drop quotes. clean_llm_json repairs this automatically.
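To illustrate the kind of repair involved, the sketch below handles three common cases: a surrounding Markdown fence, comments, and trailing commas. It is not the clean_llm_json implementation; `repair_llm_json` and `strip_comments` are hypothetical names, and missing quotes are not repaired here.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker


def strip_comments(text: str) -> str:
    """Remove // and /* */ comments that are outside of string literals."""
    out, i, in_string = [], 0, False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            end = text.find("\n", i)
            i = len(text) if end == -1 else end
            continue
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = len(text) if end == -1 else end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def repair_llm_json(raw: str):
    """Unwrap a Markdown fence, drop comments and trailing commas, then parse."""
    text = raw.strip()
    pattern = r"^" + FENCE + r"[a-zA-Z0-9]*\s*\n(.*?)\n?" + FENCE + r"\s*$"
    fenced = re.match(pattern, text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    text = strip_comments(text)
    # Naive trailing-comma removal; it would also match inside string values.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)


raw = (
    FENCE + "json\n"
    '{\n  // summary\n  "files": [{"path": "src/main.rs", "url": "http://x/y",},],\n}\n'
    + FENCE
)
result = repair_llm_json(raw)
```

Comment stripping tracks string boundaries so that a URL such as `http://x/y` inside a value is left intact.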
5. Retries
If a node returns fewer results than expected, the request is retried automatically after a delay. Once all retries are exhausted, the error is recorded in the result.
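The retry behavior can be sketched as a loop that re-issues the request until enough results arrive. The names and defaults below (`enrich_with_retries`, `max_retries`, `delay_secs`) are illustrative assumptions; the actual retry count and delay are not specified here.

```python
import time


def enrich_with_retries(request, expected, max_retries=3, delay_secs=1.0):
    """Call request() until it returns at least `expected` results or retries run out."""
    results = []
    for attempt in range(max_retries + 1):
        results = request()
        if len(results) >= expected:
            return {"results": results, "error": None}
        if attempt < max_retries:
            time.sleep(delay_secs)  # wait before the next attempt
    # All retries exhausted: keep what we have and record the error.
    return {"results": results, "error": f"expected {expected} results, got {len(results)}"}


responses = iter([["a"], ["a", "b"]])  # the first reply is incomplete
outcome = enrich_with_retries(lambda: next(responses), expected=2, delay_secs=0)
```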
Generation Parameters
Configured per cluster node:
| Parameter | Default | Description |
|---|---|---|
| temperature | 0.1 | Low temperature for stable results |
| seed | 42 | Fixed seed for reproducibility |
| num_ctx | 4096 | Context window size in tokens |
| num_predict | 2048 | Maximum tokens in response |
| timeout_secs | 60 | Request timeout in seconds |
Model Warm-Up
Before enrichment, for each cluster node:
- Availability check
- For Ollama nodes: model presence check and warm-up request to load the model into memory
Exporting Results
# AST with LLM enrichment
vibe-analyzer analyze export -m meta,debt
# Code only, JSON5 format
vibe-analyzer analyze export -m errors -t code -f json5
Supported export formats: JSON, JSON5, TOML, TOON, XML, YAML.
Search and Indexing
Vibe Analyzer stores all data in OpenSearch and uses multilingual analyzers for search.
Three Indices
Three indices are created for each project:
| Index | Purpose | Contents |
|---|---|---|
| vibe_meta | Metadata | One document per project: summary, license, README, statistics |
| vibe_files_{hash} | Content | One document per file: full contents (not indexed for search, store only) |
| vibe_files_analysis_{hash} | Search | One document per text file: AST, description, tags |
Multilingual Search
OpenSearch is configured with three analyzers:
- russian_analyzer (type russian): stemming for Russian
- english_analyzer (type english): stemming for English
- chinese_analyzer (type chinese): segmentation for Chinese
Each text field in vibe_files_analysis has three sub-fields — one per analyzer. This allows searching for “функции”, “functions”, and “函数” with correct morphology for each language.
Search Mechanics
Documentation Search (search_documentation)
The most complex query. Algorithm:
- Script detection in the query: Cyrillic, Latin, CJK
- Word extraction (longer than 2 characters)
- Wildcard search on headings with 10.0 boost + stemming for long words
- Language-specific match queries: for each detected script, a separate query to the corresponding sub-field with fuzziness
- Boost for knowledge documents: if the frontmatter contains knowledge: true, the document gets a 5.0 boost
Ranking priority:
- Headings (headings.title): 10.0 boost
- Preview (headings.preview): 2.0 boost
- Links (links.text): 2.0 boost
- Tags (tags): 1.0 boost
Code Search
Each search type has its own strategy:
- Imports: wildcard on the imports keyword field + tags
- Functions: match_phrase_prefix on signatures + match on comments (nested queries)
- Classes/structs/interfaces: three nested queries in should with minimum_should_match: 1
- Variables/enums: match_phrase_prefix on signatures + match on comments (nested queries)
All code searches use fuzziness: AUTO for fuzzy matching and boost tags higher than specific fields.
Incremental Indexing
Vibe Analyzer doesn’t re-index files unnecessarily:
- Fetching hashes from OpenSearch via Scroll API: GET /{index}/_search?scroll=1m
- Comparison: a BLAKE3 hash is computed for each file and compared against the indexed one
- Skipping unchanged: files with matching hashes are not processed
If the --force flag is passed, hashes are ignored — all files are indexed.
Bulk Indexing
All documents are written to OpenSearch in batches via the Bulk API in NDJSON format:
{"index": {"_index": "vibe_files_xxx", "_id": "src/main.rs"}}
{"root": "/project", "path": "src/main.rs", "content": "..."}
{"index": {"_index": "vibe_files_xxx", "_id": "src/lib.rs"}}
{"root": "/project", "path": "src/lib.rs", "content": "..."}
The document ID is the file path (path). This ensures that re-indexing updates the existing document rather than creating a duplicate.
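A minimal sketch of assembling such a body, assuming a hypothetical FileDoc struct with only the three fields shown above and a hand-written JSON escaper in place of a serialization library:

```rust
/// One file document. Only the fields from the example above are modeled.
pub struct FileDoc<'a> {
    pub root: &'a str,
    pub path: &'a str,
    pub content: &'a str,
}

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build an NDJSON body: one action line plus one source line per document.
/// Using the path as _id makes a repeated write overwrite the same document.
pub fn bulk_body(index: &str, docs: &[FileDoc<'_>]) -> String {
    let mut body = String::new();
    for d in docs {
        body.push_str(&format!(
            "{{\"index\":{{\"_index\":\"{}\",\"_id\":\"{}\"}}}}\n",
            json_escape(index),
            json_escape(d.path)
        ));
        body.push_str(&format!(
            "{{\"root\":\"{}\",\"path\":\"{}\",\"content\":\"{}\"}}\n",
            json_escape(d.root),
            json_escape(d.path),
            json_escape(d.content)
        ));
    }
    body
}
```

Every line, including the last, ends with a newline, which the Bulk API requires. Newlines inside file content are escaped so that each document stays on one line.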
Orphaned Data Cleanup
cleanup runs automatically during indexing:
- Index removal for deleted projects
- Document removal for files no longer on disk (comparing paths in the index and on the filesystem)
- Meta-document removal for projects removed from the configuration
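The file-level check amounts to a set difference between indexed paths and paths on disk. A small sketch, with a hypothetical function name:

```rust
use std::collections::HashSet;

/// Paths present in the index but missing on disk: candidates for deletion.
pub fn orphaned_paths(indexed: &HashSet<String>, on_disk: &HashSet<String>) -> Vec<String> {
    let mut orphans: Vec<String> = indexed.difference(on_disk).cloned().collect();
    orphans.sort(); // stable order for reporting
    orphans
}
```

Files that exist on disk but not in the index are not orphans. They are picked up by the incremental indexing pass instead.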
Project Statistics
show_stats_search collects aggregated statistics across all indexed files via the Scroll API. This enables:
- Project reports — language breakdown, file count, lines, AST objects
- Data presence checks — if statistics are empty, indexing hasn’t been performed or the project hasn’t been added
- Codebase size estimation — total size, text and binary file counts
Aggregation runs across all documents from files_analysis:
- Language grouping (via get_language_name)
- AST object counting: sum of functions, classes, structs, enums, interfaces, variables, imports, headings, links, code blocks
- Other: files without a detectable language
- Languages sorted by lines of code descending
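The aggregation described above can be sketched as a group-and-sum over per-file records. FileStat, LanguageTotals, and aggregate are illustrative names, and the per-file AST count is assumed to be already summed across object kinds.

```rust
use std::collections::HashMap;

/// Per-file input. `language` is None when no language was detected.
pub struct FileStat {
    pub language: Option<String>,
    pub lines: u64,
    pub ast_objects: u64,
}

/// Per-language output row.
#[derive(Debug, PartialEq)]
pub struct LanguageTotals {
    pub language: String,
    pub files: u64,
    pub lines: u64,
    pub ast_objects: u64,
}

/// Group files by language, bucket undetected ones under "Other",
/// and sort by lines of code in descending order.
pub fn aggregate(files: &[FileStat]) -> Vec<LanguageTotals> {
    let mut by_lang: HashMap<String, LanguageTotals> = HashMap::new();
    for f in files {
        let name = f.language.clone().unwrap_or_else(|| "Other".to_string());
        let entry = by_lang.entry(name.clone()).or_insert(LanguageTotals {
            language: name,
            files: 0,
            lines: 0,
            ast_objects: 0,
        });
        entry.files += 1;
        entry.lines += f.lines;
        entry.ast_objects += f.ast_objects;
    }
    let mut totals: Vec<LanguageTotals> = by_lang.into_values().collect();
    // Ties on line count fall back to the language name so output is deterministic.
    totals.sort_by(|a, b| b.lines.cmp(&a.lines).then_with(|| a.language.cmp(&b.language)));
    totals
}
```

An empty result from a function like this corresponds to the "data presence check" case: nothing has been indexed for the project yet.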
Integrations
Vibe Analyzer provides an MCP server that AI assistants can connect to via the Model Context Protocol. Once connected, the model gains 11 tools for searching code and documentation.
How It Works from the User’s Perspective
The user communicates with the AI assistant in natural language. The model decides which tool to call. Examples from real testing scenarios:
Code Search
| User Query | Tool | What Happens |
|---|---|---|
| “Find add functions in the samples project” | search_by_code_functions | Searches for functions with add in the signature, returns files and signatures |
| “What classes are in samples?” | search_by_code_classes | Returns all classes, structs, interfaces |
| “Show all enums in samples” | search_by_code_variables | Enums are also searched through this tool |
| “What libraries are used in samples?” | search_by_code_imports | List of all imports in the project |
| “List files that have the MAX_VALUE constant” | search_by_code_variables | Search by constant name |
File Viewing
| User Query | Tool |
|---|---|
| “Show the contents of src/main.rs” | get_file_content |
| “Show the structure of main.py” | get_file_ast |
| “What functions are in src/main.rs?” | get_file_ast |
| “Open utils.py” | get_file_content |
Documentation and Knowledge Base Search
| User Query | Tool |
|---|---|
| “Who is Zizikosh?” | search_documentation |
| “Tell me about Kukyrbur’s abilities” | search_documentation |
| “Find Python coding guidelines” | search_documentation |
| “Show the release process” | search_documentation |
| “Find the code review checklist” | search_documentation |
Project Navigation
| User Query | Tool |
|---|---|
| “What projects are in the database?” | show_projects |
| “Show the tree of the samples project” | show_tree |
| “How many files are in knowledge?” | show_stats |
| “Show overall statistics for all projects” | show_stats |
Administration
| User Query | Tool |
|---|---|
| “Update the index” | admin_sync |
| “Reindex projects” | admin_sync |
How to Phrase Queries
The model understands queries in natural language. You don’t need to use exact tool names — plain language is enough.
Good:
- “Find add functions in the samples project”
- “What classes are in samples?”
- “Show the contents of src/main.rs”
- “Who is Zizikosh?”
Unnecessary (the model will understand via AliasHandler anyway, but it’s better to avoid):
- “Call search_by_code_functions with query=add”
- “Use the get_file_content tool for path=src/main.rs”
Important Notes
- Project names: use the full path or the directory name, for example "samples" or "/path/to/samples"
- File paths: relative to the project root, for example "src/main.rs"; partial matching is supported
- Result limit: default 3, maximum 10. If the model requests “all”, the limit is automatically raised
- One call is enough: the model is trained to respond after a single tool call, no need to ask again
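The result-limit rule can be expressed as a small clamp. This sketch assumes that a request for “all” raises the limit to the documented maximum of 10 and that a requested limit of 0 is bumped to 1; neither detail is stated above. The constant and function names are hypothetical.

```rust
pub const DEFAULT_LIMIT: usize = 3;
pub const MAX_LIMIT: usize = 10;

/// Resolve the number of results to return.
/// `requested` is the limit the model passed, if any.
/// `wants_all` is true when the model asked for "all" results.
pub fn effective_limit(requested: Option<usize>, wants_all: bool) -> usize {
    if wants_all {
        // Assumption: "all" maps to the maximum rather than an unbounded result set.
        return MAX_LIMIT;
    }
    requested.unwrap_or(DEFAULT_LIMIT).clamp(1, MAX_LIMIT)
}
```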
Connecting to Open WebUI
1. Start the MCP server: vibe-analyzer serve start
2. In Open WebUI settings, add a new MCP server:
   - URL: http://localhost:9020
   - Transport: Streamable HTTP
3. Tools appear automatically
Connecting to Claude Desktop
Add to the configuration:
{
  "mcpServers": {
    "vibe-analyzer": {
      "url": "http://localhost:9020",
      "transport": "streamable-http"
    }
  }
}
MCP Protocol
Supported versions: 2024-11-05, 2025-03-26, 2025-06-18, latest. Configured in the settings:
{
  "mcp": {
    "host": "127.0.0.1",
    "port": 9020,
    "protocol": "latest"
  }
}
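One plausible way to interpret the protocol setting is sketched below. It assumes that "latest" resolves to the newest version in the supported list; the function name and the handling of unknown values are hypothetical.

```rust
/// Protocol versions listed as supported above.
pub const SUPPORTED: [&str; 3] = ["2024-11-05", "2025-03-26", "2025-06-18"];

/// Resolve the configured value to a concrete protocol version.
/// Returns None for a value that is neither "latest" nor a supported version.
pub fn resolve_protocol(setting: &str) -> Option<&'static str> {
    if setting == "latest" {
        // ISO dates sort lexicographically, so max() is the newest version.
        return SUPPORTED.iter().copied().max();
    }
    SUPPORTED.iter().copied().find(|v| *v == setting)
}
```

Rejecting unknown values at startup, rather than silently falling back, makes a typo in the settings file visible immediately.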
Security
- Server without authentication: for trusted networks or localhost
- Default host 127.0.0.1 (local only)
- 0.0.0.0: for access from Docker containers or other machines
- Server only reads data; admin_sync is the only tool that triggers background indexing
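Because the server has no authentication, it can help to check the configured host before binding. The sketch below uses only the standard library. The Exposure enum and classify_host are illustrative names, and only IP literals are handled, so a hostname such as "localhost" is reported as Invalid.

```rust
use std::net::IpAddr;

/// How widely a bind address exposes the server.
#[derive(Debug, PartialEq)]
pub enum Exposure {
    /// Loopback address: reachable only from the same machine.
    LocalOnly,
    /// Any other address, including 0.0.0.0: reachable from other hosts.
    Network,
    /// The string is not an IP literal.
    Invalid,
}

/// Classify the configured host string.
pub fn classify_host(host: &str) -> Exposure {
    match host.parse::<IpAddr>() {
        Ok(ip) if ip.is_loopback() => Exposure::LocalOnly,
        Ok(_) => Exposure::Network,
        Err(_) => Exposure::Invalid,
    }
}
```

A startup routine could log a warning whenever the result is Network, since any client that can reach the port can then call admin_sync.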
Testing
Vibe Analyzer uses end-to-end tests powered by the vibe-tests framework to verify MCP tool relevance.
How It Works
engine_config! → engine.test("natural language query") → LLM selects tool → tool returns result → verify
The framework:
- Starts the MCP server automatically
- Runs queries against real Ollama models (3B, 7B)
- Verifies the model selected the correct tool
- Saves structured JSON reports with timing, tool calls, and responses
Test Scenarios
All 11 MCP tools are covered, each with 5 queries in Russian and English:
tests/mcp/
├── admin_sync.rs
├── get_file_ast.rs
├── get_file_content.rs
├── search_by_code_classes.rs
├── search_by_code_functions.rs
├── search_by_code_imports.rs
├── search_by_code_variables.rs
├── search_documentation.rs
├── show_projects.rs
├── show_stats.rs
└── show_tree.rs
Total: 60 queries across 11 tools × 2 models = 120 tests.
Example test:
#[tokio::test]
async fn test_search_functions_add_ru() {
    let engine = vibe_tests::engine().await;
    // Russian query: "Find add functions in the 'samples' project"
    let result = engine.test("Найди функции add в проекте 'samples'").await;
    assert!(result.success);
    // Every model under test must have selected the expected tool.
    assert!(result.models.iter().all(|m| m.tool.as_deref() == Some("search_by_code_functions")));
}
Test Infrastructure
- OpenSearch: via Docker Compose
- MCP server: started automatically on port 9021
- Fixtures: test projects samples (code) and knowledge (docs, legends)
- Multi-model: each query tested against multiple Ollama models for robustness
Running
# Full E2E tests (require Docker + Ollama)
cargo test --test mcp_test -- --nocapture
Expected Model Behavior
Each test verifies that:
- The model called a tool, and it was the correct tool for the query
- The model did not call a non-existent tool (alias resolution works)
- The tool returned a result (a non-empty response)
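These three checks can be sketched as one function over a single model run. ModelRun and verify are hypothetical names, not the vibe-tests API. The tool list is taken from the test files listed earlier.

```rust
/// The 11 MCP tools covered by the test suite.
pub const KNOWN_TOOLS: [&str; 11] = [
    "admin_sync",
    "get_file_ast",
    "get_file_content",
    "search_by_code_classes",
    "search_by_code_functions",
    "search_by_code_imports",
    "search_by_code_variables",
    "search_documentation",
    "show_projects",
    "show_stats",
    "show_tree",
];

/// Outcome of one query against one model (illustrative shape).
pub struct ModelRun {
    pub tool: Option<String>,
    pub response: String,
}

/// Apply the three checks in order and report the first failure.
pub fn verify(run: &ModelRun, expected_tool: &str) -> Result<(), String> {
    let tool = run.tool.as_deref().ok_or("model did not call any tool")?;
    if !KNOWN_TOOLS.contains(&tool) {
        return Err(format!("model called a non-existent tool: {tool}"));
    }
    if tool != expected_tool {
        return Err(format!("expected {expected_tool}, got {tool}"));
    }
    if run.response.trim().is_empty() {
        return Err("tool returned an empty response".to_string());
    }
    Ok(())
}
```

Returning the first failure as a message, rather than a bare boolean, gives the JSON report a reason to record next to the success flag.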
Reports
After each run, a structured JSON report is saved with per-query details: model, tool, args, response, duration, and success status.
Recipes
Practical examples of using Vibe Analyzer through an AI assistant. Based on the test fixtures: samples (code) and knowledge (documentation).
Exploring an Unfamiliar Project
Goal: you’re seeing a project for the first time and want to quickly understand its structure.
You: What projects are available?
AI: [show_projects]
- knowledge — guidelines and processes for code writing
- samples — sample code in Rust and Python
You: Show the tree of samples
AI: [show_tree]
samples/
├── python_sample/
│   ├── main.py
│   ├── models.py
│   └── utils.py
├── rust_lib/
│   ├── Cargo.toml
│   └── src/
│       └── lib.rs
└── rust_sample/
    ├── Cargo.toml
    └── src/
        └── main.rs
You: Show statistics for knowledge
AI: [show_stats]
Markdown 15 files 663 lines 112 AST objects
You: What functions are in utils.py?
AI: [get_file_ast for utils.py]
- add(a: int, b: int) -> int — Returns sum of two numbers
- multiply(a: int, b: int) -> int — Returns product of two numbers
- divide(a: int, b: int) -> float — Divides two numbers safely
Finding Dependencies
Goal: find out what libraries are used in a project.
You: What libraries are used in samples?
AI: [search_by_code_imports]
- thiserror::Error (Rust)
- super::* (Rust)
- dataclasses (Python)
- typing (Python)
- utils, models (Python)
Finding an Implementation
Goal: find where a specific function is implemented.
You: Find the multiply function
AI: [search_by_code_functions with query="multiply"]
Found in 2 files:
- python_sample/utils.py: def multiply(a: int, b: int) -> int
- rust_lib/src/lib.rs: pub fn multiply(a: i32, b: i32) -> i32
Documentation Search
Goal: find rules or processes.
You: Find Python coding guidelines
AI: [search_documentation with query="Python coding rules"]
Found in knowledge/guidelines/code/python.md:
# Python Coding Guidelines
Follow PEP 8. Use snake_case, type annotations, docstrings.
You: What is the release process?
AI: [search_documentation with query="release process"]
Found in knowledge/guidelines/process/release.md:
# Release Process
1. All milestone tasks closed
2. Create a release branch from develop
3. Update version in Cargo.toml/package.json
4. Merge into main
5. Notify the team
Knowledge Base Search
Goal: find information about project characters or legends.
You: Who is Zizikosh?
AI: [search_documentation with query="Zizikosh"]
Found in knowledge/legends/zizikosh_blogger.md:
# Zizikosh - Blogger
Video blogger, unpacks packages from AliExpress.
150K subscribers.
You: What is Bakobur's dream?
AI: [search_documentation with query="Bakobur's dream"]
Found in knowledge/legends/bakobur_homeless.md:
Dream: Warm socks and a doghouse for Sharik
Inspecting a Specific File
Goal: view the contents or structure of a specific file.
You: Show the contents of Cargo.toml
AI: [get_file_content]
[package]
name = "rust_lib"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow = "1.0"
thiserror = "1.0"
You: What structs are in lib.rs?
AI: [get_file_ast]
- struct User { name, age }
- struct Calculator { value }
- enum MathError { DivisionByZero, OutOfRange }
- enum Operation { Add, Multiply, Divide }
Incremental Update
You: Update the index
AI: [admin_sync]
Indexing started. Projects are updating.
Tips
- Start broad: show_projects → show_tree → show_stats
- Refine with search: search_by_code_functions, search_documentation
- Inspect details: get_file_content, get_file_ast
- Update the index after changes: admin_sync
- Use natural language: the model will choose the right tool automatically