claude-code-tracker/docs/database-schema.md
Ryan Malloy 44ed9936b7 Initial commit: Claude Code Project Tracker
Add comprehensive development intelligence system that tracks:
- Development sessions with automatic start/stop
- Full conversation history with semantic search
- Tool usage and file operation analytics
- Think time and engagement analysis
- Git activity correlation
- Learning pattern recognition
- Productivity insights and metrics

Features:
- FastAPI backend with SQLite database
- Modern web dashboard with interactive charts
- Claude Code hook integration for automatic tracking
- Comprehensive test suite with 100+ tests
- Complete API documentation (OpenAPI/Swagger)
- Privacy-first design with local data storage

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 02:59:21 -06:00

11 KiB

Database Schema Documentation

This document describes the SQLite database schema for the Claude Code Project Tracker.

Overview

The database is designed to capture comprehensive development workflow data through a normalized relational structure. All timestamps are stored in UTC format.

Entity Relationship Diagram

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  projects   │    │  sessions   │    │conversations│
│             │    │             │    │             │
│ id (PK)     │◀──┤ project_id  │    │ session_id  │
│ name        │    │ id (PK)     │◀──┤ id (PK)     │
│ path        │    │ start_time  │    │ timestamp   │
│ git_repo    │    │ end_time    │    │ user_prompt │
│ created_at  │    │ type        │    │ claude_resp │
│ ...         │    │ ...         │    │ ...         │
└─────────────┘    └─────────────┘    └─────────────┘
                            │
                            │
                   ┌─────────────┐    ┌─────────────┐
                   │ activities  │    │waiting_perds│
                   │             │    │             │
                   │ session_id  │    │ session_id  │
                   │ id (PK)     │    │ id (PK)     │
                   │ tool_name   │    │ start_time  │
                   │ file_path   │    │ end_time    │
                   │ timestamp   │    │ duration    │
                   │ ...         │    │ ...         │
                   └─────────────┘    └─────────────┘
                            │
                            │
                   ┌─────────────┐
                   │git_operations│
                   │             │
                   │ session_id  │
                   │ id (PK)     │
                   │ operation   │
                   │ command     │
                   │ timestamp   │
                   │ ...         │
                   └─────────────┘

Table Definitions

projects

Stores metadata about tracked projects.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique project identifier
name VARCHAR(255) NOT NULL Project display name
path TEXT NOT NULL UNIQUE Absolute filesystem path
git_repo VARCHAR(500) NULL Git repository URL if applicable
languages JSON NULL Array of detected programming languages
created_at TIMESTAMP NOT NULL DEFAULT NOW() First time project was tracked
last_session TIMESTAMP NULL Most recent session timestamp
total_sessions INTEGER NOT NULL DEFAULT 0 Count of development sessions
total_time_minutes INTEGER NOT NULL DEFAULT 0 Cumulative session duration
files_modified_count INTEGER NOT NULL DEFAULT 0 Total unique files changed
lines_changed_count INTEGER NOT NULL DEFAULT 0 Total lines added + removed

Indexes:

  • idx_projects_path ON (path)
  • idx_projects_last_session ON (last_session)

sessions

Individual development sessions within projects.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique session identifier
project_id INTEGER NOT NULL FK(projects.id) Associated project
start_time TIMESTAMP NOT NULL Session start timestamp
end_time TIMESTAMP NULL Session end timestamp (NULL if active)
session_type VARCHAR(50) NOT NULL startup, resume, clear
working_directory TEXT NOT NULL Directory path when session started
git_branch VARCHAR(255) NULL Active git branch
environment JSON NULL System environment details
duration_minutes INTEGER NULL Calculated session length
activity_count INTEGER NOT NULL DEFAULT 0 Number of tool uses
conversation_count INTEGER NOT NULL DEFAULT 0 Number of exchanges
files_touched JSON NULL Array of file paths accessed

Indexes:

  • idx_sessions_project_start ON (project_id, start_time)
  • idx_sessions_active ON (end_time) WHERE end_time IS NULL

conversations

Dialogue exchanges between user and Claude.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique conversation identifier
session_id INTEGER NOT NULL FK(sessions.id) Associated session
timestamp TIMESTAMP NOT NULL When exchange occurred
user_prompt TEXT NULL User's input message
claude_response TEXT NULL Claude's response
tools_used JSON NULL Array of tools used in response
files_affected JSON NULL Array of files mentioned/modified
context JSON NULL Additional context metadata
tokens_input INTEGER NULL Estimated input token count
tokens_output INTEGER NULL Estimated output token count
exchange_type VARCHAR(50) NOT NULL user_prompt, claude_response

Indexes:

  • idx_conversations_session_time ON (session_id, timestamp)
  • idx_conversations_search ON (user_prompt, claude_response) USING fts5

activities

Tool usage and file operations during development.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique activity identifier
session_id INTEGER NOT NULL FK(sessions.id) Associated session
conversation_id INTEGER NULL FK(conversations.id) Associated exchange
timestamp TIMESTAMP NOT NULL When activity occurred
tool_name VARCHAR(50) NOT NULL Edit, Write, Read, Bash, etc.
action VARCHAR(100) NOT NULL Specific action taken
file_path TEXT NULL Target file path if applicable
metadata JSON NULL Tool-specific data
success BOOLEAN NOT NULL DEFAULT true Whether operation succeeded
error_message TEXT NULL Error details if failed
lines_added INTEGER NULL Lines added (for Edit/Write)
lines_removed INTEGER NULL Lines removed (for Edit)

Indexes:

  • idx_activities_session_time ON (session_id, timestamp)
  • idx_activities_tool_file ON (tool_name, file_path)

waiting_periods

Time intervals when Claude is waiting for user input.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique waiting period identifier
session_id INTEGER NOT NULL FK(sessions.id) Associated session
start_time TIMESTAMP NOT NULL When waiting began
end_time TIMESTAMP NULL When user responded
duration_seconds INTEGER NULL Calculated wait duration
context_before TEXT NULL Claude's last message
context_after TEXT NULL User's next message
likely_activity VARCHAR(50) NULL thinking, research, external_work, break

Indexes:

  • idx_waiting_session_start ON (session_id, start_time)
  • idx_waiting_duration ON (duration_seconds)

git_operations

Git commands and repository state changes.

Column Type Constraints Description
id INTEGER PRIMARY KEY Unique git operation identifier
session_id INTEGER NOT NULL FK(sessions.id) Associated session
timestamp TIMESTAMP NOT NULL When operation occurred
operation VARCHAR(50) NOT NULL commit, push, pull, branch, etc.
command TEXT NOT NULL Full git command executed
result TEXT NULL Command output
success BOOLEAN NOT NULL Whether command succeeded
files_changed JSON NULL Array of affected files
lines_added INTEGER NULL Lines added in commit
lines_removed INTEGER NULL Lines removed in commit
commit_hash VARCHAR(40) NULL Git commit SHA
branch_from VARCHAR(255) NULL Source branch
branch_to VARCHAR(255) NULL Target branch

Indexes:

  • idx_git_session_time ON (session_id, timestamp)
  • idx_git_operation ON (operation)
  • idx_git_commit ON (commit_hash)

Analytics Views

project_productivity_summary

Aggregated productivity metrics per project.

CREATE VIEW project_productivity_summary AS
SELECT 
    p.id,
    p.name,
    p.path,
    COUNT(DISTINCT s.id) as total_sessions,
    SUM(s.duration_minutes) as total_time_minutes,
    AVG(s.duration_minutes) as avg_session_minutes,
    COUNT(DISTINCT a.file_path) as unique_files_modified,
    SUM(a.lines_added + a.lines_removed) as total_lines_changed,
    AVG(wp.duration_seconds) as avg_think_time_seconds,
    MAX(s.start_time) as last_activity
FROM projects p
LEFT JOIN sessions s ON p.id = s.project_id
LEFT JOIN activities a ON s.id = a.session_id
LEFT JOIN waiting_periods wp ON s.id = wp.session_id
GROUP BY p.id, p.name, p.path;

daily_productivity_metrics

Daily productivity trends across all projects.

CREATE VIEW daily_productivity_metrics AS
SELECT 
    DATE(s.start_time) as date,
    COUNT(DISTINCT s.id) as sessions_count,
    SUM(s.duration_minutes) as total_time_minutes,
    COUNT(DISTINCT a.file_path) as files_modified,
    SUM(a.lines_added + a.lines_removed) as lines_changed,
    AVG(wp.duration_seconds) as avg_think_time,
    COUNT(DISTINCT s.project_id) as projects_worked_on
FROM sessions s
LEFT JOIN activities a ON s.id = a.session_id AND a.tool_name IN ('Edit', 'Write')
LEFT JOIN waiting_periods wp ON s.id = wp.session_id
WHERE s.end_time IS NOT NULL
GROUP BY DATE(s.start_time)
ORDER BY date DESC;

Data Retention

  • Conversation Full Text: Stored indefinitely for search and analysis
  • Activity Details: Kept for all operations to maintain complete audit trail
  • Analytics Aggregations: Computed on-demand from source data
  • Cleanup: Manual cleanup tools provided, no automatic data expiration

Performance Considerations

  • Database file size grows approximately 1-5MB per day of active development
  • Full-text search indexes require periodic optimization (PRAGMA optimize)
  • Analytics queries use covering indexes to avoid table scans
  • Large file content is not stored, only file paths and change metrics

Migration Strategy

Schema migrations are handled through versioned SQL scripts in /migrations/:

  • Each migration has up/down scripts
  • Version tracking in schema_versions table
  • Automatic backup before migrations
  • Rollback capability for failed migrations