indexer-ai

High-performance universal code indexer optimized for AI assistants and modern development tools.

Production Ready: Enterprise-grade performance with Worker Threads and async I/O
Performance: 3.6x faster indexing with parallel processing (10,000 files in ~25s)
Clean Codebase: Major cleanup completed - removed 260+ instances of dead code

🚀 Latest Updates (September 2025)

Code Quality Improvements

Dead Code Removal: Eliminated 67 unused functions, 145 unused exports, 48 unused imports
Parser Consolidation: Merged 3 Python parsers into 1 unified Tree-sitter parser
VS Code Extension: Removed to focus on core indexing functionality
Logger Migration: Replaced 398 console.log statements with proper Logger class
Syntax Fixes: Fixed all cleanup-related syntax errors across 9 files

Performance Enhancements (v2.0.2)

Worker Threads: Parallel parsing using all CPU cores
Async I/O: Non-blocking file operations throughout
Node.js 20 LTS: Latest runtime optimizations
ES2023 Target: Modern JavaScript features for better performance

Performance Benchmarks

Files	Before	After	Improvement
100	~2s	~0.7s	2.8x faster
1,000	~15s	~5s	3x faster
10,000	~90s	~25s	3.6x faster
Memory	300MB	180MB	40% reduction
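The Improvement column is the "before" time divided by the "after" time. The short sketch below recomputes those ratios from the table's approximate timings; the function and field names are illustrative only. Note that 2 / 0.7 is about 2.86, so the 2.8x figure is a rounded value based on approximate measurements.

```typescript
// Timings copied from the benchmark table above (seconds, approximate).
interface BenchmarkRow {
  files: number;
  beforeSec: number;
  afterSec: number;
}

const rows: BenchmarkRow[] = [
  { files: 100, beforeSec: 2, afterSec: 0.7 },
  { files: 1000, beforeSec: 15, afterSec: 5 },
  { files: 10000, beforeSec: 90, afterSec: 25 },
];

// Speedup is the old time divided by the new time.
function speedup(row: BenchmarkRow): number {
  return row.beforeSec / row.afterSec;
}

// Throughput in files per second after the optimization.
function throughput(row: BenchmarkRow): number {
  return row.files / row.afterSec;
}

// Memory reduction as a fraction of the original footprint.
function reduction(beforeMb: number, afterMb: number): number {
  return (beforeMb - afterMb) / beforeMb;
}

const speedups: number[] = rows.map(speedup);
const memoryReduction: number = reduction(300, 180);
```

On these numbers the largest run works out to 400 files per second, and the memory row corresponds to a 0.4 (40%) reduction.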

Features

Core Capabilities

Lightning Fast: Index 10,000 files in ~25 seconds with Worker Threads
Multi-Language: JavaScript, TypeScript, Python, Go, SQL, GraphQL, YAML, Astro
AI-Optimized: Built for Claude, GPT-4, and other LLM assistants
Real-time Updates: File watching with automatic index refresh
Organization-Wide: Index entire organizations and monorepos automatically
Cross-Repository: Track API calls and dependencies between services
Modern Architecture: ES2023, async/await, Worker Threads

🆕 Advanced Features

Call Graph Analysis: Bidirectional function call tracking and dead code detection
AI Compression: 50-70% size reduction with token-aware optimization
Worker Thread Pool: Automatic parallel processing for large codebases
Streaming Support: Handle massive files without memory issues
Impact Analysis: Track cascading effects of code changes

Export Formats

JSON: Complete index with compression options (standard, compressed, minified)
Markdown: Human-readable documentation
Mermaid: Interactive diagrams for VS Code/Cursor
GraphViz: Professional dependency graphs
ASCII: Terminal-friendly visualizations

🎉 Now Available on npm!

Install globally in seconds:

npm install -g indexer-ai

# Use the ultra-short command (4 chars!)
idxr

Quick Start

Requirements

Node.js >= 20.0.0 (LTS recommended)
npm or yarn
4GB RAM recommended for large codebases

Installation Prerequisites

Linux/WSL Requirements

For tree-sitter and native dependencies to compile:

# Ubuntu/Debian/WSL
sudo apt-get update
sudo apt-get install -y build-essential python3

# macOS (if needed)
xcode-select --install

Installation

# Install globally from npm (recommended)
npm install -g indexer-ai

# Quick usage with short command
idxr                    # Shortest command (4 chars!)
indexer                 # Alternative command
indexer-ai              # Full package name

# Or install from source
git clone https://github.com/tacit-code/indexer.git
cd indexer
yarn install && yarn build
npm install -g .

Basic Usage

# Smart mode - analyzes everything automatically
idxr                     # Quick 4-character command!
# or
indexer

# Index entire organization (all repos in subdirectories)
cd /your/organization
idxr

# Index specific project with Worker Threads (automatic for >50 files)
idxr scan /path/to/project

# Index with specific options
indexer scan --parallel 8 --output custom-index.json

# Watch mode with real-time updates
idxr watch

# Interactive chat with Claude about your codebase
idxr chat

# Query the index
idxr query "function.*Auth" --fuzzy

Multi-Repository & Organization Indexing

The indexer automatically detects and analyzes entire organization structures, monorepos, and multi-repository setups without configuration.

Organization-Wide Indexing

# Index your entire organization
cd /path/to/organization  # Parent directory containing all repos
indexer                    # Automatically indexes ALL repositories

# Example: Clone Global organization structure
/clone-global/
├── indexer/         # This tool
├── backend/         # API services
├── frontend/        # Web applications
├── mobile/          # Mobile apps
├── skills/          # Microservices
└── data-ops/        # Data pipelines

# Run from parent directory:
cd /clone-global
indexer  # Creates comprehensive cross-repository knowledge graph

Automatic Detection

The SmartIndexer automatically detects:

Monorepo structures: lerna.json, yarn workspaces, pnpm workspaces
Multi-repository setups: Multiple .git directories
Service architectures: Microservices, APIs, frontends
Shared dependencies: Cross-repository imports and libraries
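As a rough illustration of marker-based detection, the sketch below classifies a directory from a snapshot of its contents. It is not the SmartIndexer implementation: the `DirSnapshot` shape, the `detectStructure` name, and the order of the checks are assumptions. It relies only on the markers listed above plus two well-known conventions (Yarn workspaces live in the `workspaces` field of package.json, and pnpm uses a pnpm-workspace.yaml file).

```typescript
// Hypothetical classification; the real SmartIndexer logic may differ.
type Structure = "monorepo" | "multi-repo" | "single-repo" | "unknown";

interface DirSnapshot {
  // Entry names directly inside the root directory.
  rootEntries: string[];
  // Parsed root package.json, if present.
  packageJson?: { workspaces?: unknown };
  // Immediate subdirectories that contain their own .git directory.
  subdirsWithGit: string[];
}

function detectStructure(snap: DirSnapshot): Structure {
  const has = (name: string): boolean => snap.rootEntries.indexOf(name) !== -1;

  // Monorepo markers: lerna.json, pnpm workspaces, Yarn workspaces.
  const hasWorkspaces =
    snap.packageJson !== undefined && snap.packageJson.workspaces !== undefined;
  if (has("lerna.json") || has("pnpm-workspace.yaml") || hasWorkspaces) {
    return "monorepo";
  }
  // Several nested repositories under one parent directory.
  if (snap.subdirsWithGit.length > 1) return "multi-repo";
  if (has(".git")) return "single-repo";
  return "unknown";
}
```

Taking a snapshot instead of reading the file system directly keeps the decision logic easy to test in isolation.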

Cross-Repository Analysis

Tracks relationships across your entire codebase:

Frontend → Backend: API calls, GraphQL queries, REST endpoints
Service → Service: Inter-service communication, event streams
Shared Libraries: Import/export dependencies, version tracking
Database Schemas: Cross-service data flows and dependencies
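One way to reason about these relationships is as a directed graph of services, where finding everything affected by a change is a walk over reversed edges. The sketch below assumes a simple `{ from, to }` edge list; the edge data and function name are hypothetical, and the service names are borrowed from the example directory tree earlier in this README.

```typescript
// Edges point from a consumer to the service it depends on.
type Edge = { from: string; to: string };

// Returns every service that directly or transitively depends on `changed`.
function affectedBy(changed: string, edges: Edge[]): string[] {
  // Build a reverse adjacency list: dependency -> consumers.
  const consumers = new Map<string, string[]>();
  for (const edge of edges) {
    const list = consumers.get(edge.to) ?? [];
    list.push(edge.from);
    consumers.set(edge.to, list);
  }
  // Breadth-first walk over the reversed edges.
  const seen = new Set<string>();
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const consumer of consumers.get(current) ?? []) {
      if (!seen.has(consumer) && consumer !== changed) {
        seen.add(consumer);
        queue.push(consumer);
      }
    }
  }
  return Array.from(seen).sort();
}

// Hypothetical dependencies between the example repositories.
const exampleEdges: Edge[] = [
  { from: "frontend", to: "backend" },
  { from: "mobile", to: "backend" },
  { from: "backend", to: "data-ops" },
];
```

With these example edges, a change to data-ops reaches backend directly and frontend and mobile transitively, while a change to frontend affects nothing else.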

Generated Outputs

.indexer-output/current/
├── PROJECT_INDEX.json          # Combined index of ALL repositories
├── service-graph.json          # Complete dependency graph
├── multi-repo-overview.md      # Visual architecture diagram
├── multi-repo-interactive.html # Interactive dependency explorer
└── [repo-name]/               # Individual repository indexes
    └── PROJECT_INDEX.json     # Repo-specific index

Use Cases

Architecture Documentation: Auto-generate system architecture diagrams
Dependency Analysis: Find all consumers of an API endpoint
Impact Assessment: See affected services before making changes
Code Navigation: Jump between repos following API calls
AI Context: Give LLMs complete understanding of your entire system

Benefits for AI Assistants

When you provide the generated PROJECT_INDEX.json to Claude, GPT-4, or other AI assistants:

Complete Context: AI understands your entire organization from a single file
Cross-Repo Intelligence: AI can trace API calls across service boundaries
Accurate Suggestions: AI knows exact function signatures and dependencies
Reduced Token Usage: Compressed index uses 50-70% fewer tokens than raw code
System-Wide Refactoring: AI can suggest changes considering all affected services

Performance Configuration

Optimize for Your System

# Maximum performance (uses all CPU cores)
# nproc is a Linux utility; on macOS use $(sysctl -n hw.ncpu) instead
indexer scan --parallel $(nproc)

# Memory-constrained environment
indexer scan --parallel 2 --max-memory 256

# Disable Worker Threads for debugging
indexer scan --no-workers

# Incremental mode for large codebases
indexer scan --incremental

Environment Variables

# Performance
INDEXER_PARALLEL=8           # Number of parallel workers
INDEXER_MAX_MEMORY=1000      # Max memory in MB
INDEXER_USE_WORKERS=true     # Enable Worker Threads

# Node.js 20+ optimizations
NODE_OPTIONS="--max-old-space-size=4096"  # 4GB heap
UV_THREADPOOL_SIZE=16        # Larger thread pool

# AI Features
ANTHROPIC_API_KEY=sk-ant-... # Claude integration
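For scripts that wrap the indexer, the performance variables above can be read into a typed object. This is a minimal sketch, not the tool's own loader: the `PerfSettings` shape and the fallback values are assumptions chosen for illustration, since this README does not state the defaults.

```typescript
interface PerfSettings {
  parallel: number;
  maxMemoryMb: number;
  useWorkers: boolean;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back when the value is missing or invalid.
function intOr(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// The fallback values are illustrative assumptions, not documented defaults.
function readPerfSettings(env: Env): PerfSettings {
  return {
    parallel: intOr(env["INDEXER_PARALLEL"], 4),
    maxMemoryMb: intOr(env["INDEXER_MAX_MEMORY"], 1000),
    // Anything other than the literal "false" leaves workers enabled.
    useWorkers: (env["INDEXER_USE_WORKERS"] ?? "true").toLowerCase() !== "false",
  };
}
```

In a Node.js script this would typically be called as `readPerfSettings(process.env)`; passing the environment in as a parameter keeps the function easy to test.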

Architecture

Modern Tech Stack

Runtime: Node.js 20 LTS with native ES modules support
Language: TypeScript 5.3+ with ES2023 target
Parallelization: Worker Threads for CPU-intensive parsing
Async I/O: Promises-based file system operations
Parsing: Tree-sitter (Python), Babel (JS/TS), native AST parsers

Performance Architecture

┌─────────────────────────────────────┐
│         Main Thread                  │
│  ┌─────────────────────────────┐    │
│  │   Orchestration Layer       │    │
│  └──────────┬──────────────────┘    │
│             │                        │
│  ┌──────────▼──────────────────┐    │
│  │   Worker Thread Pool        │    │
│  │  ┌────┐ ┌────┐ ... ┌────┐  │    │
│  │  │ W1 │ │ W2 │     │ Wn │  │    │
│  │  └────┘ └────┘     └────┘  │    │
│  └─────────────────────────────┘    │
│                                      │
│  ┌─────────────────────────────┐    │
│  │   Async I/O Layer           │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
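The diagram shows an orchestration layer handing work to a pool of workers. The sketch below covers only the work-splitting step: deciding whether to use workers at all (using the more-than-50-files threshold mentioned in this README) and dividing files into per-worker batches. The round-robin strategy and the function names are assumptions; the actual pool uses Worker Threads and may schedule work differently.

```typescript
// Threshold taken from this README: workers are automatic above 50 files.
const WORKER_THRESHOLD = 50;

function shouldUseWorkers(fileCount: number, workersEnabled: boolean = true): boolean {
  return workersEnabled && fileCount > WORKER_THRESHOLD;
}

// Round-robin split so each worker receives a similar number of files.
function partition<T>(items: T[], workerCount: number): T[][] {
  // Never create more batches than items, and always create at least one.
  const count = Math.max(1, Math.min(workerCount, items.length));
  const batches: T[][] = [];
  for (let i = 0; i < count; i++) batches.push([]);
  items.forEach((item, i) => {
    batches[i % count].push(item);
  });
  return batches;
}
```

Round-robin keeps batch sizes within one item of each other, which is a reasonable default when file sizes are unknown; a size-aware scheduler would balance better for very uneven inputs.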

API Server

REST API

# Start API server
indexer api --port 4000

# Endpoints
POST /api/index          # Build index
GET  /api/index/status   # Get status
POST /api/query          # Query index
GET  /api/stats          # Statistics
POST /api/ai/analyze     # AI analysis

GraphQL API

query {
  index {
    files {
      path
      functions {
        name
        complexity
      }
    }
  }
}
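A client consuming this query might model the result with types that mirror the requested fields. The sketch below assumes `complexity` is numeric and uses made-up sample data; the type and function names are hypothetical and the server's full schema may contain more fields.

```typescript
// Shape mirrors the fields requested in the query above.
interface FunctionInfo {
  name: string;
  complexity: number; // assumed to be numeric
}
interface FileInfo {
  path: string;
  functions: FunctionInfo[];
}
interface IndexQueryResult {
  index: { files: FileInfo[] };
}

interface Hotspot {
  path: string;
  name: string;
  complexity: number;
}

// Flatten all functions and return the most complex ones first.
function topHotspots(result: IndexQueryResult, limit: number): Hotspot[] {
  const all: Hotspot[] = [];
  for (const file of result.index.files) {
    for (const fn of file.functions) {
      all.push({ path: file.path, name: fn.name, complexity: fn.complexity });
    }
  }
  return all.sort((a, b) => b.complexity - a.complexity).slice(0, limit);
}

// Made-up sample response for illustration.
const sampleResult: IndexQueryResult = {
  index: {
    files: [
      { path: "src/auth.ts", functions: [{ name: "login", complexity: 7 }, { name: "logout", complexity: 2 }] },
      { path: "src/db.ts", functions: [{ name: "migrate", complexity: 12 }] },
    ],
  },
};
```

Calling `topHotspots(sampleResult, 2)` on the sample returns `migrate` followed by `login`.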

Advanced Features

Multi-Repository Analysis

# Analyze entire organization
indexer multi-repo /path/to/org --cross-dependencies

# Generate knowledge graph
indexer export mermaid --multi-repo

AI-Powered Analysis

# Security scanning
indexer ai security-scan

# Bug prediction
indexer ai predict-bugs --confidence 0.8

# Code smell detection
indexer ai detect-smells

Call Graph Analysis

# Find unused code
indexer analyze dead-code

# Trace execution paths
indexer analyze call-paths main

# Detect circular dependencies
indexer analyze circular
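Conceptually, dead code detection on a call graph is a reachability question, and circular dependency detection is a search for a back edge. The sketch below shows both ideas on a plain adjacency map; it is a simplified illustration with hypothetical names and data, not the CallGraphAnalyzer's actual algorithm.

```typescript
// caller -> callees
type CallGraph = Record<string, string[]>;

// Collect every function reachable from the given entry points.
function reachableFrom(graph: CallGraph, entryPoints: string[]): Set<string> {
  const seen = new Set<string>();
  const stack: string[] = entryPoints.slice();
  while (stack.length > 0) {
    const fn = stack.pop() as string;
    if (seen.has(fn)) continue;
    seen.add(fn);
    for (const callee of graph[fn] ?? []) stack.push(callee);
  }
  return seen;
}

// Functions defined in the graph but never reached are dead-code candidates.
function findDeadCode(graph: CallGraph, entryPoints: string[]): string[] {
  const live = reachableFrom(graph, entryPoints);
  return Object.keys(graph).filter((fn) => !live.has(fn)).sort();
}

// Depth-first search; meeting a node that is still being visited means a cycle.
function hasCycle(graph: CallGraph): boolean {
  const state = new Map<string, "visiting" | "done">();
  const visit = (fn: string): boolean => {
    if (state.get(fn) === "visiting") return true;
    if (state.get(fn) === "done") return false;
    state.set(fn, "visiting");
    for (const callee of graph[fn] ?? []) {
      if (visit(callee)) return true;
    }
    state.set(fn, "done");
    return false;
  };
  return Object.keys(graph).some((fn) => visit(fn));
}

// Hypothetical call graph for illustration.
const exampleGraph: CallGraph = {
  main: ["login", "render"],
  login: ["hashPassword"],
  render: [],
  hashPassword: [],
  legacyExport: ["helper"],
  helper: [],
};
```

Here `legacyExport` and `helper` are unreachable from `main`, so both are reported. Static reachability cannot see dynamic dispatch or reflection, so results like these are best treated as candidates for review.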

Configuration

.indexer.yml

version: 2
performance:
  parallel: 8
  useWorkers: true
  maxMemory: 1000
  cache: true

include:
  - "**/*.{js,jsx,ts,tsx,py,go,sql}"

ignore:
  - "**/node_modules/**"
  - "**/dist/**"

export:
  formats:
    json:
      compression: true
      maxSize: 10MB

Integrations

IDE Support

Cursor: Native integration with AI features
WebStorm: Via REST API
Vim/Neovim: LSP integration

CI/CD

GitHub Actions: Pre-built workflows
GitLab CI: Docker images available
Jenkins: Plugin support
CircleCI: Orb available

Monitoring

Datadog: APM and metrics integration
New Relic: Performance monitoring
Sentry: Error tracking
Grafana: Custom dashboards

Architecture

Core Components

SmartIndexer (src/core/smart-indexer.ts) - Orchestrates all features automatically, detects project structure
Indexer (src/core/indexer.ts) - Main indexing engine, coordinates parsers, builds dependency graphs
CacheManager (src/core/cache-manager.ts) - LRU cache with 500MB limit, TTL-based expiration (1 hour)
FileWatcher (src/core/watcher.ts) - Real-time monitoring with Chokidar, 300ms debounced updates
WorkerPool (src/core/worker-pool.ts) - Thread pool for parallel parsing, automatic scaling
CallGraphAnalyzer (src/core/call-graph-analyzer.ts) - Function call tracking, dead code detection
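To make the CacheManager description concrete, here is a minimal sketch of a cache that combines least-recently-used eviction under a size budget with TTL-based expiration. It illustrates the general technique only; the class name, method signatures, and injectable clock are assumptions and do not reflect the code in src/core/cache-manager.ts.

```typescript
interface CacheEntry<V> {
  value: V;
  size: number;
  expiresAt: number;
}

class LruTtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private totalSize = 0;
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.remove(key); // expired
      return undefined;
    }
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, size: number): void {
    this.remove(key);
    this.entries.set(key, { value, size, expiresAt: this.now() + this.ttlMs });
    this.totalSize += size;
    // Evict least recently used entries until back under the limit.
    for (const oldest of Array.from(this.entries.keys())) {
      if (this.totalSize <= this.maxSize) break;
      this.remove(oldest);
    }
  }

  private remove(key: string): void {
    const entry = this.entries.get(key);
    if (entry) {
      this.totalSize -= entry.size;
      this.entries.delete(key);
    }
  }
}
```

With the limits described above, such a cache would be constructed as `new LruTtlCache(500 * 1024 * 1024, 60 * 60 * 1000)`, that is, a 500MB budget and a one-hour TTL.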

Design Patterns

Parser Plugin System - Extensible language support via common interface
Factory Pattern - Dynamic parser selection based on file extension
Worker Thread Pool - Parallel processing with automatic scaling
Event-Driven Updates - Real-time index updates via EventEmitter
Repository Pattern - Abstracted storage through CacheManager

API Server

REST API (Port 4000)

# Start API server
idxr api

# Or with authentication
idxr api --enable-auth true

Endpoints:

POST /api/index - Trigger indexing
POST /api/ai/analyze - Comprehensive AI analysis
POST /api/ai/predict-bugs - Bug prediction
POST /api/ai/analyze-security - Security scanning
GET /api/health - Health check
GET /api/stats - Statistics
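For client code, the endpoint list above can be captured in a small lookup table so that URLs and methods are built in one place. The methods and paths below are copied from the list; the short key names, the `requestFor` helper, and the default base URL (port 4000 on localhost) are illustrative assumptions. Request and response bodies are not documented here, so the sketch does not model them.

```typescript
type Method = "GET" | "POST";

interface Endpoint {
  method: Method;
  path: string;
}

// Paths and methods copied from the endpoint list above; keys are hypothetical labels.
const endpoints: Record<string, Endpoint> = {
  index: { method: "POST", path: "/api/index" },
  analyze: { method: "POST", path: "/api/ai/analyze" },
  predictBugs: { method: "POST", path: "/api/ai/predict-bugs" },
  analyzeSecurity: { method: "POST", path: "/api/ai/analyze-security" },
  health: { method: "GET", path: "/api/health" },
  stats: { method: "GET", path: "/api/stats" },
};

// Build the method and absolute URL for a named endpoint.
function requestFor(
  name: string,
  baseUrl: string = "http://localhost:4000",
): { method: Method; url: string } {
  const endpoint = endpoints[name];
  if (!endpoint) throw new Error(`Unknown endpoint: ${name}`);
  // Strip trailing slashes so the joined URL has exactly one separator.
  return { method: endpoint.method, url: baseUrl.replace(/\/+$/, "") + endpoint.path };
}
```

The returned pair can be handed to any HTTP client, for example `fetch(url, { method })` in Node.js 20.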

GraphQL API

# GraphQL endpoint
http://localhost:4000/graphql

Full schema with queries, mutations, and subscriptions for real-time updates.

WebSocket Support

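// Real-time updates (Node.js client; the .on() API comes from the "ws" package,
// not from the browser's built-in WebSocket)
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:4000/ws');
ws.on('message', (data) => {
  // Messages arrive as Buffers by default, so convert before logging
  console.log('File updated:', data.toString());
});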

AI Integration

Claude SDK Integration

The indexer includes deep integration with Claude via the Anthropic SDK:

# Set up Claude
export ANTHROPIC_API_KEY="sk-ant-..."

# Run AI analysis
idxr analyze --ai

Specialized AI Agents

SecurityAgent - OWASP compliance, vulnerability detection, security patterns
PerformanceAgent - Optimization opportunities, memory leaks, bottlenecks
ArchitectureAgent - Pattern detection, SOLID principles, best practices
TestingAgent - Coverage analysis, test generation, quality metrics

AI Analysis Features

Streaming analysis with real-time updates
Session management for continued conversations
Automatic bug prediction and security scanning
Code quality assessment and refactoring suggestions
Architecture recommendations and pattern detection

External Integrations

Slack Integration

Real-time notifications and monitoring:

# Configure Slack
export SLACK_TOKEN="xoxb-..."
export SLACK_CHANNEL="#dev-monitoring"

# Enable Slack bot
idxr slack --enable

Features:

Bug detection with severity classification
Code quality alerts
Performance degradation notifications
Security vulnerability alerts

Linear Integration

Automatic ticket creation for issues:

export LINEAR_API_KEY="lin_api_..."
export LINEAR_TEAM_ID="TEAM_ID"

Datadog Monitoring

Performance metrics and APM:

export DD_API_KEY="..."
export DD_APP_KEY="..."

# Send metrics
idxr monitor --datadog

Advanced Configuration

Configuration File (.indexerrc.yaml)

version: 2
name: my-project

include:
  - "**/*.{js,jsx,ts,tsx,py,go,sql}"
ignore:
  - "**/node_modules/**"
  - "**/dist/**"
  - "**/.git/**"

performance:
  parallel: true
  workers: 8
  cache: true
  maxFileSize: 2MB
  timeout: 120000

ai:
  model: claude-opus-4-1-20250805
  maxConcurrentAgents: 4
  sessionTimeout: 1800000

export:
  outputDirectory: .indexer-output
  formats:
    json:
      compression: true
      minify: false

integrations:
  slack:
    enabled: true
    channel: "#dev-alerts"
  linear:
    enabled: true
    autoCreate: true
  datadog:
    enabled: true
    tags:
      - "env:production"
      - "service:indexer"
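Values such as `maxFileSize: 2MB` need converting to bytes before they can be compared with file sizes. The sketch below shows one way to parse such strings and sanity-check the `performance` block. It assumes binary units (1 MB = 1024 * 1024 bytes), which this README does not specify, and the validation rules and names are illustrative rather than the tool's own.

```typescript
// Assumes binary units; the tool's actual convention is not documented here.
const UNIT_BYTES: Record<string, number> = {
  B: 1,
  KB: 1024,
  MB: 1024 * 1024,
  GB: 1024 * 1024 * 1024,
};

// Convert strings like "2MB" or "1.5 GB" into a byte count.
function parseSize(text: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(text.trim());
  if (!match) throw new Error(`Invalid size: ${text}`);
  return Math.round(Number(match[1]) * UNIT_BYTES[match[2].toUpperCase()]);
}

// Mirrors the `performance` block of the configuration file above.
interface PerformanceConfig {
  parallel: boolean;
  workers: number;
  cache: boolean;
  maxFileSize: string;
  timeout: number;
}

// Returns a list of human-readable problems; an empty list means the block looks sane.
function validatePerformance(cfg: PerformanceConfig): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(cfg.workers) || cfg.workers < 1) {
    problems.push("workers must be a positive integer");
  }
  if (!(cfg.timeout > 0)) {
    problems.push("timeout must be a positive number of milliseconds");
  }
  try {
    parseSize(cfg.maxFileSize);
  } catch (err) {
    problems.push("maxFileSize must look like 2MB");
  }
  return problems;
}
```

Under the binary-unit assumption, `parseSize("2MB")` yields 2097152 bytes.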

Performance Tips

Use Worker Threads for codebases >50 files (automatic)
Enable incremental mode for large projects
Configure parallel workers based on CPU cores
Use compression for large indexes
Enable caching for repeated operations
Set appropriate memory limits for your system
Use the --no-workers flag in WSL if native modules fail to load

CLI Command Reference

Core Commands

# Initialize and scan (smart mode - recommended)
idxr                        # Automatic everything
idxr init                   # Initialize with prompts
idxr scan [path]           # Scan specific directory
idxr scan --parallel 8     # Use 8 worker threads
idxr scan --no-workers     # Disable workers (WSL fix)

# File watching
idxr watch                 # Monitor changes
idxr watch --debounce 500  # Custom debounce (ms)

# Query and search
idxr query "pattern"       # Search functions/classes
idxr query ".*Controller" --regex  # Regex search
idxr stats                 # Show statistics

# Export formats
idxr export json           # JSON format
idxr export markdown       # Markdown documentation
idxr export graphviz       # DOT graph
idxr export mermaid        # Mermaid diagram
idxr export ascii          # Terminal visualization

# API server
idxr api                   # Start API server
idxr api --port 5000       # Custom port
idxr api --enable-auth     # With authentication

# AI features
idxr analyze --ai          # AI code analysis
idxr chat                  # Interactive Claude chat
idxr predict-bugs          # Bug prediction
idxr security-scan         # Security analysis

# Integrations
idxr slack --enable        # Enable Slack bot
idxr monitor --datadog     # Send Datadog metrics

# Utilities
idxr clean                 # Clear cache
idxr health                # Health check
idxr config --list         # Show configuration
idxr validate              # Validate index
idxr migrate               # Migrate old indexes

Command Options

Most commands support these common flags:

--config <path>     # Custom config file
--output <path>     # Output location
--format <type>     # Output format
--quiet            # Suppress output
--verbose          # Detailed logging
--debug            # Debug mode
--profile          # Performance profiling
--no-cache         # Disable caching
--force            # Force operation
--help             # Show help

Troubleshooting

Common Issues

Out of Memory

# Increase Node.js heap size
NODE_OPTIONS="--max-old-space-size=8192" idxr scan

Worker Thread Issues (WSL/Linux)

# Disable workers to avoid native module errors
idxr scan --no-workers

# Or rebuild native modules
npm rebuild tree-sitter

Slow Performance

# Check Worker Thread status
idxr debug --workers

# Profile performance
idxr scan --profile

# Limit file size processing
idxr scan --max-file-size 1MB

Parser Errors

# Use fallback parser
idxr scan --parser-fallback

# Skip problematic files
idxr scan --skip-errors

# Exclude specific patterns
idxr scan --exclude "*.min.js"

Installation Issues

# Clean install
rm -rf node_modules package-lock.json
npm install

# Global installation from source
npm run build
npm install -g .

# Permission issues
sudo npm install -g indexer-ai --unsafe-perm

Development

Building

npm install
npm run build     # Compiles TypeScript and Worker scripts
npm run dev       # Watch mode
npm test          # Run tests

Testing

npm test              # Run all tests (~25% coverage)
npm run test:unit     # Unit tests
npm run test:e2e      # End-to-end tests (Cypress)
npm run test:coverage # Generate coverage report

Current Test Status: 6 test suites passing with ~25% coverage. Target: 80%.

Contributing

See CONTRIBUTING.md for development guidelines.

License

Support

Issues: GitHub Issues
npm Package: indexer-ai
Repository: GitHub
Discord: Join our community

Built for the future of AI-assisted development 🚀
