# Architecture Overview

This document provides a comprehensive overview of RentCache's system architecture, design decisions, and technical implementation details.

## 🏗️ System Architecture

### High-Level Architecture

```mermaid
graph TB
    Client[Client Applications] --> LB[Load Balancer/Caddy]
    LB --> App[RentCache FastAPI]
    App --> Auth[Authentication Layer]
    Auth --> Rate[Rate Limiting]
    Rate --> Cache[Cache Manager]
    Cache --> L1[L1 Cache<br/>Redis]
    Cache --> L2[L2 Cache<br/>SQLite/PostgreSQL]
    Cache --> API[Rentcast API]
    App --> Analytics[Usage Analytics]
    Analytics --> DB[(Database)]
    App --> Monitor[Health Monitoring]
    App --> Metrics[Metrics Collection]

    subgraph "Data Layer"
        L1
        L2
        DB
    end

    subgraph "External Services"
        API
    end
```

### Component Responsibilities

#### **FastAPI Application Server**
- **Primary Role**: HTTP request handling and API routing
- **Key Features**:
  - Async/await architecture for high concurrency
  - OpenAPI documentation generation
  - Request/response validation with Pydantic
  - Middleware stack for cross-cutting concerns

#### **Authentication & Authorization**
- **Method**: Bearer token authentication using SHA-256 hashed API keys
- **Storage**: Secure key storage with expiration and usage limits
- **Features**: Per-key rate limiting and usage tracking

#### **Multi-Level Caching System**
- **L1 Cache (Redis)**: In-memory cache for ultra-fast access
- **L2 Cache (Database)**: Persistent cache with analytics
- **Strategy**: Write-through with intelligent TTL management

#### **Rate Limiting Engine**
- **Implementation**: Token bucket algorithm with sliding windows
- **Granularity**: Global and per-endpoint limits
- **Backend**: Redis-based distributed rate limiting

#### **Usage Analytics**
- **Tracking**: Request patterns, costs, and performance metrics
- **Storage**: Time-series data in a relational database
- **Reporting**: Real-time dashboards and historical analysis

## 🔄 Request Flow Architecture

### 1. Request Processing Pipeline
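End to end, a request passes authentication, then rate limiting, then the cache lookup, and only reaches the upstream API on a full miss. As a rough, stdlib-only sketch of that chain — every name and data structure below is illustrative, not RentCache's actual code, with in-process dictionaries standing in for the Redis/database-backed stores:

```python
import asyncio
import hashlib
import time

# Illustrative in-memory stand-ins for the real key store, buckets, and L1 cache.
VALID_KEY_HASHES = {hashlib.sha256(b"demo-key").hexdigest()}
BUCKETS: dict[str, tuple[float, float]] = {}   # key_hash -> (tokens, last_refill)
L1_CACHE: dict[str, str] = {}                  # cache_key -> cached response

async def authenticate(raw_key: str) -> str:
    """Hash the presented key and check it against stored hashes (never plaintext)."""
    key_hash = hashlib.sha256(raw_key.encode()).hexdigest()
    if key_hash not in VALID_KEY_HASHES:
        raise PermissionError("401 Unauthorized")
    return key_hash

async def allow(key_hash: str, capacity: float = 2.0, refill_rate: float = 1.0) -> bool:
    """Token-bucket check: refill by elapsed time, then try to consume one token."""
    tokens, last = BUCKETS.get(key_hash, (capacity, time.monotonic()))
    now = time.monotonic()
    tokens = min(capacity, tokens + (now - last) * refill_rate)
    if tokens < 1.0:
        BUCKETS[key_hash] = (tokens, now)
        return False
    BUCKETS[key_hash] = (tokens - 1.0, now)
    return True

async def handle(raw_key: str, cache_key: str) -> str:
    """Auth -> rate limit -> cache lookup -> (simulated) upstream call."""
    key_hash = await authenticate(raw_key)
    if not await allow(key_hash):
        raise RuntimeError("429 Rate Limit Exceeded")
    if cache_key in L1_CACHE:
        return L1_CACHE[cache_key]                   # cache hit
    response = f"upstream-data-for-{cache_key}"      # stand-in for the Rentcast call
    L1_CACHE[cache_key] = response                   # write-through
    return response

async def demo() -> bool:
    first = await handle("demo-key", "properties:austin")
    second = await handle("demo-key", "properties:austin")  # served from cache
    return first == second

print(asyncio.run(demo()))  # True
```

In the actual service these stages are FastAPI middleware and dependencies backed by Redis and the database, as described throughout this document.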
```mermaid
sequenceDiagram
    participant C as Client
    participant A as Auth Layer
    participant R as Rate Limiter
    participant CM as Cache Manager
    participant RC as Redis Cache
    participant DB as Database
    participant RA as Rentcast API

    C->>A: HTTP Request + API Key
    A->>A: Validate & Hash Key
    alt Valid API Key
        A->>R: Check Rate Limits
        alt Within Limits
            R->>CM: Cache Lookup
            CM->>RC: Check L1 Cache
            alt Cache Hit (L1)
                RC-->>CM: Return Cached Data
                CM-->>C: Response + Cache Headers
            else Cache Miss (L1)
                CM->>DB: Check L2 Cache
                alt Cache Hit (L2)
                    DB-->>CM: Return Cached Data
                    CM->>RC: Populate L1
                    CM-->>C: Response + Cache Headers
                else Cache Miss (L2)
                    CM->>RA: Upstream API Call
                    RA-->>CM: API Response
                    CM->>DB: Store in L2
                    CM->>RC: Store in L1
                    CM-->>C: Response + Cost Headers
                end
            end
        else Rate Limited
            R-->>C: 429 Rate Limit Exceeded
        end
    else Invalid API Key
        A-->>C: 401 Unauthorized
    end
```

### 2. Cache Key Generation

**Cache Key Strategy**: MD5 hash of the request signature

```python
import hashlib
import json

cache_key = hashlib.md5(json.dumps({
    "endpoint": "properties",
    "method": "GET",
    "path_params": {"property_id": "123"},
    "query_params": {"city": "Austin", "state": "TX"},
    "body": {}
}, sort_keys=True).encode("utf-8")).hexdigest()
```

**Benefits**:
- Deterministic cache keys
- Negligible collision probability at cache scale (MD5 is used for speed here, not security)
- Parameter order independence
- Efficient storage and lookup

## 💾 Caching Strategy

### Multi-Level Cache Architecture

#### **Level 1: Redis Cache (Hot Data)**
- **Purpose**: Ultra-fast access to frequently requested data
- **TTL**: 30 minutes to 2 hours
- **Eviction**: LRU (Least Recently Used)
- **Size**: Memory-limited, optimized for speed

```python
# L1 Cache Configuration
REDIS_CONFIG = {
    "maxmemory": "512mb",
    "maxmemory_policy": "allkeys-lru",
    "save": ["900 1", "300 10", "60 10000"],  # Persistence snapshots
    "appendonly": True,                       # AOF for durability
    "appendfsync": "everysec"
}
```

#### **Level 2: Database Cache (Persistent)**
- **Purpose**: Persistent cache with analytics and soft deletion
- **TTL**: 1 hour
  to 48 hours, based on endpoint volatility
- **Storage**: Full response data + metadata
- **Features**: Soft deletion, usage tracking, cost analytics

```sql
-- Cache Entry Schema
CREATE TABLE cache_entries (
    id SERIAL PRIMARY KEY,
    cache_key VARCHAR(64) UNIQUE NOT NULL,
    endpoint VARCHAR(50) NOT NULL,
    method VARCHAR(10) NOT NULL,
    params_hash VARCHAR(64) NOT NULL,
    response_data JSONB NOT NULL,
    status_code INTEGER NOT NULL,
    estimated_cost DECIMAL(10,2) DEFAULT 0.0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE NOT NULL,
    is_valid BOOLEAN DEFAULT TRUE,
    hit_count INTEGER DEFAULT 0,
    last_accessed TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

### Cache TTL Strategy

| Endpoint Type | Data Volatility | Default TTL | Rationale |
|---------------|-----------------|-------------|-----------|
| **Property Records** | Very Low | 24 hours | Property characteristics rarely change |
| **Value Estimates** | Medium | 1 hour | Market fluctuations affect valuations |
| **Rent Estimates** | Medium | 1 hour | Rental markets change regularly |
| **Listings** | High | 30 minutes | Active market with frequent updates |
| **Market Statistics** | Low | 2 hours | Aggregated data changes slowly |
| **Comparables** | Medium | 1 hour | Market-dependent analysis |

### Stale-While-Revalidate Pattern

```python
import asyncio

async def get_with_stale_while_revalidate(cache_key: str, ttl: int):
    """Serve stale data immediately while refreshing in the background."""
    cached_data = await cache.get(cache_key)

    if cached_data:
        if not cached_data.is_expired:
            return cached_data  # Fresh data
        # Serve stale data, trigger background refresh
        asyncio.create_task(refresh_cache_entry(cache_key))
        return cached_data  # Stale but usable

    # Cache miss - fetch fresh data
    return await fetch_and_cache(cache_key, ttl)
```

**Benefits**:
- Improved user experience (no waiting for fresh data)
- Reduced upstream API calls during traffic spikes
- Graceful handling of upstream service issues

## 🚦 Rate Limiting Implementation
### Token Bucket Algorithm

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = time.time()

    async def consume(self, tokens: int = 1) -> bool:
        await self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

    async def _refill(self):
        now = time.time()
        tokens_to_add = (now - self.last_refill) * self.refill_rate
        self.tokens = min(self.capacity, self.tokens + tokens_to_add)
        self.last_refill = now
```

### Multi-Tier Rate Limiting

#### **Global Limits**
- **Purpose**: Prevent overall API abuse
- **Scope**: Per API key across all endpoints
- **Implementation**: Redis-based distributed counters

#### **Per-Endpoint Limits**
- **Purpose**: Protect expensive operations
- **Scope**: Specific endpoints (e.g., value estimates)
- **Implementation**: Endpoint-specific token buckets

#### **Dynamic Rate Limiting**

```python
RATE_LIMITS = {
    "properties": "60/minute",       # Standard property searches
    "value_estimate": "30/minute",   # Expensive AI/ML operations
    "rent_estimate": "30/minute",    # Expensive AI/ML operations
    "market_stats": "20/minute",     # Computationally intensive
    "listings_sale": "100/minute",   # Less expensive, higher volume
    "listings_rental": "100/minute", # Less expensive, higher volume
    "comparables": "40/minute"       # Moderate complexity
}
```

## 📊 Database Schema Design

### Core Tables

#### **API Keys Management**

```sql
CREATE TABLE api_keys (
    id SERIAL PRIMARY KEY,
    key_name VARCHAR(100) UNIQUE NOT NULL,
    key_hash VARCHAR(64) UNIQUE NOT NULL,  -- SHA-256 hash
    is_active BOOLEAN DEFAULT TRUE,
    daily_limit INTEGER DEFAULT 1000,
    monthly_limit INTEGER DEFAULT 30000,
    daily_usage INTEGER DEFAULT 0,
    monthly_usage INTEGER DEFAULT 0,
    last_daily_reset TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    last_monthly_reset TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE,
    last_used TIMESTAMP WITH TIME ZONE
);
```

#### **Usage Analytics**

```sql
CREATE TABLE usage_stats (
    id SERIAL PRIMARY KEY,
    api_key_id INTEGER REFERENCES api_keys(id),
    endpoint VARCHAR(50) NOT NULL,
    method VARCHAR(10) NOT NULL,
    status_code INTEGER NOT NULL,
    response_time_ms DECIMAL(10,2) NOT NULL,
    cache_hit BOOLEAN NOT NULL,
    estimated_cost DECIMAL(10,2) DEFAULT 0.0,
    user_agent TEXT,
    ip_address INET,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

#### **Rate Limiting State**

```sql
CREATE TABLE rate_limits (
    id SERIAL PRIMARY KEY,
    api_key_id INTEGER REFERENCES api_keys(id),
    endpoint VARCHAR(50) NOT NULL,
    current_tokens INTEGER DEFAULT 0,
    last_refill TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(api_key_id, endpoint)
);
```

### Indexing Strategy

```sql
-- Performance indexes
CREATE INDEX idx_cache_entries_key ON cache_entries(cache_key);
CREATE INDEX idx_cache_entries_endpoint_expires ON cache_entries(endpoint, expires_at);
CREATE INDEX idx_cache_entries_created_at ON cache_entries(created_at);

CREATE INDEX idx_usage_stats_api_key_created ON usage_stats(api_key_id, created_at);
CREATE INDEX idx_usage_stats_endpoint_created ON usage_stats(endpoint, created_at);
CREATE INDEX idx_usage_stats_cache_hit ON usage_stats(cache_hit, created_at);

CREATE INDEX idx_api_keys_hash ON api_keys(key_hash);
CREATE INDEX idx_api_keys_active ON api_keys(is_active);
```

## 🔒 Security Architecture

### Authentication Flow

```mermaid
graph LR
    Client --> |Bearer Token| Auth[Auth Middleware]
    Auth --> Hash[SHA-256 Hash]
    Hash --> DB[(Database Lookup)]
    DB --> Validate[Validate Expiry & Status]
    Validate --> |Valid| Allow[Allow Request]
    Validate --> |Invalid| Deny[401 Unauthorized]
```

### Security Measures

#### **API Key Protection**
- **Storage**: Only SHA-256 hashes stored, never plaintext
- **Transmission**: HTTPS only, bearer token format
- **Rotation**: Configurable expiration dates
- **Revocation**: Instant deactivation capability
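To make the storage rule concrete, here is a minimal sketch — function names are illustrative, not RentCache's actual helpers — of issuing a key, persisting only its SHA-256 hash, and verifying a presented key in constant time:

```python
import hashlib
import secrets

def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, key_hash); only the hash is ever persisted."""
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hashlib.sha256(plaintext.encode()).hexdigest()

def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare against the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return secrets.compare_digest(candidate, stored_hash)

key, key_hash = generate_api_key()
print(verify_api_key(key, key_hash))          # True
print(verify_api_key("wrong-key", key_hash))  # False
```

`secrets.compare_digest` avoids leaking the match position through timing; the plaintext key is shown to the caller once at creation and never stored.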
#### **Network Security**
- **HTTPS Enforcement**: Automatic SSL with Caddy
- **CORS Configuration**: Configurable origin restrictions
- **Rate Limiting**: DDoS and abuse protection
- **Request Validation**: Comprehensive input sanitization

#### **Container Security**
- **Non-root User**: Containers run as an unprivileged user
- **Minimal Images**: Alpine Linux base images
- **Secret Management**: Environment variable injection
- **Network Isolation**: Docker network segregation

## 📈 Performance Optimizations

### Application Level

#### **Async Architecture**

```python
import asyncio

# Concurrent request handling
async def handle_multiple_requests(req1, req2, req3):
    tasks = [
        process_request(req1),
        process_request(req2),
        process_request(req3)
    ]
    results = await asyncio.gather(*tasks)
    return results
```

#### **Connection Pooling**

```python
import httpx
from sqlalchemy.ext.asyncio import create_async_engine

# HTTP client configuration
http_client = httpx.AsyncClient(
    timeout=30.0,
    limits=httpx.Limits(
        max_connections=100,
        max_keepalive_connections=20
    )
)

# Database connection pooling
engine = create_async_engine(
    DATABASE_URL,
    pool_size=20,
    max_overflow=30,
    pool_pre_ping=True,
    pool_recycle=3600
)
```

#### **Response Optimization**
- **GZip Compression**: Automatic response compression
- **JSON Streaming**: Large response streaming
- **Conditional Requests**: ETag and If-Modified-Since support

### Database Level

#### **Query Optimization**

```sql
-- Efficient cache lookup
EXPLAIN ANALYZE
SELECT response_data, expires_at, is_valid
FROM cache_entries
WHERE cache_key = $1
  AND expires_at > NOW()
  AND is_valid = TRUE;
```

#### **Connection Management**
- **Prepared Statements**: Reduced parsing overhead
- **Connection Pooling**: Shared connection resources
- **Read Replicas**: Separate analytics queries

### Caching Level

#### **Cache Warming Strategies**

```python
async def warm_cache():
    """Pre-populate the cache with common requests."""
    common_requests = [
        {"endpoint": "properties", "city": "Austin", "state": "TX"},
        {"endpoint":
"properties", "city": "Dallas", "state": "TX"}, {"endpoint": "market_stats", "zipCode": "78701"} ] for request in common_requests: await fetch_and_cache(request) ``` #### **Memory Management** - **TTL Optimization**: Balanced freshness vs. efficiency - **Compression**: Response data compression - **Eviction Policies**: Smart cache replacement ## 📊 Monitoring and Observability ### Metrics Collection #### **Business Metrics** - Cache hit ratios by endpoint - API cost savings - Request volume trends - Error rates and patterns #### **System Metrics** - Response time percentiles - Database query performance - Memory and CPU utilization - Connection pool statistics #### **Custom Metrics** ```python # Prometheus-style metrics cache_hit_ratio = Gauge('cache_hit_ratio', 'Cache hit ratio by endpoint', ['endpoint']) api_request_duration = Histogram('api_request_duration_seconds', 'API request duration') upstream_calls = Counter('upstream_api_calls_total', 'Total upstream API calls') ``` ### Health Checks #### **Application Health** ```python async def health_check(): checks = { "database": await check_database_connection(), "cache": await check_cache_availability(), "upstream": await check_upstream_api(), "disk_space": await check_disk_usage() } overall_status = "healthy" if all(checks.values()) else "unhealthy" return {"status": overall_status, "checks": checks} ``` #### **Dependency Health** - Database connectivity and performance - Redis availability and memory usage - Upstream API response times - Disk space and system resources ## 🔧 Configuration Management ### Environment-Based Configuration ```python class Settings(BaseSettings): # Server host: str = "0.0.0.0" port: int = 8000 debug: bool = False # Database database_url: str database_echo: bool = False # Cache redis_url: Optional[str] = None redis_enabled: bool = False default_cache_ttl: int = 3600 # Rate Limiting enable_rate_limiting: bool = True global_rate_limit: str = "1000/hour" class Config: env_file = ".env" 
        case_sensitive = False
```

### Feature Flags

```python
import os

class FeatureFlags:
    ENABLE_REDIS_CACHE = os.getenv("ENABLE_REDIS_CACHE", "true").lower() == "true"
    ENABLE_ANALYTICS = os.getenv("ENABLE_ANALYTICS", "true").lower() == "true"
    ENABLE_CACHE_WARMING = os.getenv("ENABLE_CACHE_WARMING", "false").lower() == "true"
    STRICT_RATE_LIMITING = os.getenv("STRICT_RATE_LIMITING", "false").lower() == "true"
```

## 🚀 Scalability Considerations

### Horizontal Scaling

#### **Stateless Design**
- No server-side sessions
- Shared state in Redis/Database
- Load balancer friendly

#### **Container Orchestration**

```yaml
# Kubernetes deployment example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rentcache
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rentcache
  template:
    metadata:
      labels:
        app: rentcache
    spec:
      containers:
      - name: rentcache
        image: rentcache:latest
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
```

### Vertical Scaling

#### **Resource Optimization**
- Memory: Cache size tuning
- CPU: Async I/O optimization
- Storage: Database indexing and partitioning
- Network: Connection pooling and keep-alive

### Data Partitioning

#### **Database Sharding**

```sql
-- Partition by date for analytics
CREATE TABLE usage_stats (
    -- columns
) PARTITION BY RANGE (created_at);

CREATE TABLE usage_stats_2024_01 PARTITION OF usage_stats
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```

#### **Cache Distribution**
- Redis Cluster for distributed caching
- Consistent hashing for cache key distribution
- Regional cache replication

## 🔄 Disaster Recovery

### Backup Strategy

#### **Database Backups**

```bash
# Automated daily backups
pg_dump rentcache | gzip > backup-$(date +%Y%m%d).sql.gz

# Point-in-time recovery
pg_basebackup -D /backup/base -Ft -z -P
```

#### **Configuration Backups**
- Environment variables
- Docker Compose files
- SSL certificates
- Application configuration

### Recovery Procedures

#### **Database Recovery**

```bash
# Restore from backup
gunzip -c backup-20240115.sql.gz | psql rentcache

# Point-in-time recovery
pg_ctl stop -D /var/lib/postgresql/data
rm -rf /var/lib/postgresql/data/*
pg_basebackup -D /var/lib/postgresql/data -R
```

#### **Cache Recovery**
- Redis persistence (RDB + AOF)
- Cache warming from database
- Graceful degradation to the upstream API

---

This architecture is designed for high availability, performance, and cost optimization while maintaining security and operational simplicity. For implementation details, see the [Deployment Guide](DEPLOYMENT.md) and [Usage Guide](USAGE.md).