Discover and explore data for a concept or domain. Use when the user asks what data exists for a topic (e.g., "ARR", "customers", "orders"), wants to find relevant tables, or needs to understand what data is available before analysis.
# Data Exploration
Discover what data exists for a concept or domain. Answer "What data do we have about X?"
## Fast Table Validation
**When you have multiple candidate tables, quickly validate before committing to complex queries.**
### Strategy: Progressive Complexity
Start with the **simplest possible query**, then add complexity only after each step succeeds:
```
Step 1: Does the data exist? → Simple LIMIT query, no JOINs
Step 2: How much data? → COUNT(*) with same filters
Step 3: What are the key IDs? → SELECT DISTINCT foreign_keys LIMIT 100
Step 4: Get related details → JOIN on the specific IDs from step 3
```
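As a concrete sketch of steps 1–3, reusing the `DEPLOYMENTS` table from the example below (the `created_at` column and date filter are illustrative assumptions):

```sql
-- Step 1: Does the data exist? Simple filtered sample, no JOINs.
SELECT deployment_id, org_id, created_at   -- created_at is an assumed column
FROM DEPLOYMENTS
WHERE created_at >= '2024-01-01'
LIMIT 10;

-- Step 2: How much data matches the same filter?
SELECT COUNT(*)
FROM DEPLOYMENTS
WHERE created_at >= '2024-01-01';

-- Step 3: Which foreign keys are involved?
SELECT DISTINCT org_id
FROM DEPLOYMENTS
WHERE created_at >= '2024-01-01'
LIMIT 100;
```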
**Never jump from step 1 to complex aggregations.** If step 1 returns 50 rows, use those IDs directly:
```sql
-- After finding deployment_ids in step 1:
SELECT o.org_name, d.deployment_name
FROM DEPLOYMENTS d
JOIN ORGANIZATIONS o ON d.org_id = o.org_id
WHERE d.deployment_id IN ('id1', 'id2', 'id3') -- IDs from step 1
```
### When a Metadata Table Returns 0 Results
If a smaller metadata/config table (like `*_LOG`, `*_CONFIG`) returns 0 results, **check the execution/fact table** before concluding data doesn't exist.
Metadata tables may have gaps or lag. The actual execution data (in tables with millions/billions of rows) is often more complete.
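A minimal sketch of that fallback (all table and column names here are hypothetical): if a `*_CONFIG` table returns nothing, sample the execution table directly before reporting that no data exists:

```sql
-- FEATURE_CONFIG returned 0 rows; check the large execution table directly.
-- TASK_RUNS and feature_name are illustrative names.
SELECT run_id, org_id, started_at
FROM TASK_RUNS
WHERE feature_name = 'X'
LIMIT 10;
```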
### Use Row Counts as a Signal
When `list_tables` returns row counts:
- **Millions+ rows** → likely execution/fact data (actual events, transactions, runs)
- **Thousands of rows** → likely metadata/config (what's configured, not what happened)
For questions like "who is using X" or "how many times did Y happen", prioritize high-row-count tables first - they contain actual activity data.
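For instance, to answer "how many times did Y happen", a simple filtered count against the high-row-count execution table is usually the right first query (table and column names below are illustrative assumptions):

```sql
-- Hypothetical execution table and columns.
SELECT COUNT(*) AS y_events
FROM TASK_RUNS
WHERE event_type = 'Y'
  AND started_at >= '2024-01-01';
```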
⚠️ **CRITICAL: Tables with 1B+ rows require special handling**
If you see a table with billions of rows (like 6B), you MUST:
1. Use simple queries only: `SELECT col1, col2 FROM table WHERE filter LIMIT 100`
2. NO JOINs, NO GROUP BY, NO aggregations on the first query
3. Only add complexity after the simple query succeeds
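A sketch of that progression against a billion-row table (table and column names are illustrative; `DEPLOYMENTS` is reused from the earlier example):

```sql
-- First query on a ~6B-row table: narrow filter, explicit columns,
-- small LIMIT, no JOINs or aggregations.
SELECT run_id, deployment_id, state
FROM TASK_RUNS
WHERE started_at >= '2024-06-01'
  AND state = 'failed'
LIMIT 100;

-- Only after that succeeds, join on the specific IDs it returned:
SELECT d.deployment_name, t.run_id, t.state
FROM TASK_RUNS t
JOIN DEPLOYMENTS d ON t.deployment_id = d.deployment_id
WHERE t.run_id IN ('r1', 'r2', 'r3');   -- IDs from the first query
```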