semantic-codebase-search

verified

Vector-based code discovery using LanceDB and Ollama embeddings

Marketplace

siftcoder-marketplace

ialameh/sift-coder

Plugin

siftcoder

development

Repository

ialameh/sift-coder

skills/semantic-codebase-search/SKILL.md

Last Verified

January 24, 2026

Install Skill

npx add-skill https://github.com/ialameh/sift-coder/blob/main/skills/semantic-codebase-search/SKILL.md -a claude-code --skill semantic-codebase-search

Installation paths:

Claude
.claude/skills/semantic-codebase-search/

Instructions

# Semantic Codebase Search Skill

**Vector-based code discovery using LanceDB and Ollama embeddings.**

## Purpose

This skill provides:
- Vector-based semantic code search
- Natural language query understanding
- Context-aware result presentation
- Index management and updates

## Core Functions

### 1. Index Codebase

```bash
index_codebase() {
  local path="${1:-.}"

  echo "🏗️  Indexing codebase at: $path"

  # Find all code files
  files=$(find "$path" -type f \
    \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \
      -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" \) \
    | grep -v node_modules | grep -v ".next" | grep -v "dist/")

  # Count non-empty lines so an empty result reports 0 rather than 1
  total=$(grep -c . <<< "$files" || true)
  echo "📊 Found $total files to index"

  # Create index directory
  mkdir -p .claude/siftcoder-state/vector-index

  # Process files in batches
  batch_size=50
  batch=()

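  # Read from a here-string rather than a pipe: a piped `while` runs in a
  # subshell, so the batch array would be empty again after the loop
  while IFS= read -r file; do
    [ -n "$file" ] || continue
    batch+=("$file")

    if [ "${#batch[@]}" -eq "$batch_size" ]; then
      index_batch "${batch[@]}"
      batch=()
    fi
  done <<< "$files"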

  # Process remaining files
  if [ ${#batch[@]} -gt 0 ]; then
    index_batch "${batch[@]}"
  fi

  # Save metadata
  cat > .claude/siftcoder-state/vector-index/metadata.json <<EOF
{
  "created_at": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")",
  "files_indexed": $total,
  "path": "$path",
  "embedding_model": "nomic-embed-text"
}
EOF

  echo "✅ Index complete"
}
```
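
The batching step can be exercised on its own, without Ollama or LanceDB. The sketch below is a minimal illustration, not the skill's actual implementation: `batch_files` is a hypothetical wrapper name, and `index_batch` is replaced by a stub that only prints what it receives, since the real helper is not shown in this document. Reading stdin inside the function keeps the `batch` array in the same shell as the final flush, so a trailing partial batch is not lost.

```bash
#!/usr/bin/env bash

# Stub standing in for the skill's index_batch (hypothetical; the real
# helper would embed and store the files instead of printing them)
index_batch() {
  echo "batch of $#: $*"
}

# Hypothetical wrapper: read file paths from stdin, one per line, and
# hand them to index_batch in groups of at most $1
batch_files() {
  local batch_size="$1"
  local file
  local batch=()

  while IFS= read -r file; do
    [ -n "$file" ] || continue          # skip blank lines
    batch+=("$file")

    if [ "${#batch[@]}" -eq "$batch_size" ]; then
      index_batch "${batch[@]}"
      batch=()
    fi
  done

  # Flush whatever is left over after the last full batch
  if [ "${#batch[@]}" -gt 0 ]; then
    index_batch "${batch[@]}"
  fi
}

printf '%s\n' a.ts b.ts c.ts | batch_files 2
# prints:
#   batch of 2: a.ts b.ts
#   batch of 1: c.ts
```

With three paths and a batch size of 2, the stub is called twice: once for the full batch and once for the leftover file.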

### 2. Search Vector Index

```bash
search_vectors() {
  local query="$1"
  local limit="${2:-10}"

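  # Generate query embedding via Ollama's REST API (assumes the default local
  # server at localhost:11434); jq builds the JSON body so the query is escaped
  query_emb=$(curl -s http://localhost:11434/api/embeddings \
    -d "$(jq -n --arg q "$query" '{model: "nomic-embed-text", prompt: $q}')" \
    | jq -c '.embedding')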

  # Search LanceDB
  results=$(python3 <<EOF
import lancedb
import json

db = lancedb.connect(".claude/siftcoder-state/vector-index")
table = db.open_table("codebase")

# Vector search returns a _distance column (lower means closer)
results = table.search($query_emb).limit($limit).to_pandas()

for _, row in results.iterrows():
    print(f"{row['file']}:{row['line']}")
    print(f"  Distance: {row['_distance']:.2f}")
    print(f"  Code: {row['code'][:1
