# Architecture Overview
This document provides a comprehensive overview of RentCache's system architecture, design decisions, and technical implementation details.
## 🏗️ System Architecture
### High-Level Architecture
```mermaid
graph TB
Client[Client Applications] --> LB[Load Balancer/Caddy]
LB --> App[RentCache FastAPI]
App --> Auth[Authentication Layer]
Auth --> Rate[Rate Limiting]
Rate --> Cache[Cache Manager]
Cache --> L1[L1 Cache<br/>Redis]
Cache --> L2[L2 Cache<br/>SQLite/PostgreSQL]
Cache --> API[Rentcast API]
App --> Analytics[Usage Analytics]
Analytics --> DB[(Database)]
App --> Monitor[Health Monitoring]
App --> Metrics[Metrics Collection]
subgraph "Data Layer"
L1
L2
DB
end
subgraph "External Services"
API
end
```
### Component Responsibilities
#### **FastAPI Application Server**
- **Primary Role**: HTTP request handling and API routing
- **Key Features**:
- Async/await architecture for high concurrency
- OpenAPI documentation generation
- Request/response validation with Pydantic
- Middleware stack for cross-cutting concerns
#### **Authentication & Authorization**
- **Method**: Bearer token authentication using SHA-256 hashed API keys
- **Storage**: Secure key storage with expiration and usage limits
- **Features**: Per-key rate limiting and usage tracking
#### **Multi-Level Caching System**
- **L1 Cache (Redis)**: In-memory cache for ultra-fast access
- **L2 Cache (Database)**: Persistent cache with analytics
- **Strategy**: Write-through with intelligent TTL management
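The write-through idea can be sketched with plain dicts standing in for Redis (L1) and the database (L2), so the control flow is self-contained; the real backends and method names will differ:

```python
import asyncio
import time

class WriteThroughCache:
    """Illustrative write-through store over two dict-backed levels.
    In RentCache the backends would be Redis (L1) and the database (L2)."""

    def __init__(self):
        self.l1 = {}  # hot, memory-limited tier
        self.l2 = {}  # persistent tier with longer TTLs

    async def set(self, key, value, ttl):
        expires = time.time() + ttl
        # Write-through: both levels are updated on every store;
        # L1 entries are capped at 2 hours per the TTL policy above.
        self.l1[key] = (value, min(expires, time.time() + 7200))
        self.l2[key] = (value, expires)

    async def get(self, key):
        for level in (self.l1, self.l2):
            entry = level.get(key)
            if entry and entry[1] > time.time():
                if level is self.l2:
                    self.l1[key] = entry  # promote on an L2 hit
                return entry[0]
        return None
```

An L2 hit repopulates L1, matching the "Populate L1" step in the request flow diagram below.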
#### **Rate Limiting Engine**
- **Implementation**: Token bucket algorithm (continuous refill yields smooth, sliding-window-like limiting)
- **Granularity**: Global and per-endpoint limits
- **Backend**: Redis-based distributed rate limiting
#### **Usage Analytics**
- **Tracking**: Request patterns, costs, and performance metrics
- **Storage**: Time-series data in relational database
- **Reporting**: Real-time dashboards and historical analysis
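As a sketch of the reporting this enables (the names here are illustrative, not the actual RentCache analytics API), hit ratio and avoided upstream cost can be aggregated from usage rows:

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    endpoint: str
    cache_hit: bool
    estimated_cost: float  # what the upstream call would have cost

def savings_report(records):
    """Aggregate hit ratio and avoided upstream cost per endpoint."""
    report = {}
    for r in records:
        stats = report.setdefault(r.endpoint, {"hits": 0, "total": 0, "saved": 0.0})
        stats["total"] += 1
        if r.cache_hit:
            stats["hits"] += 1
            stats["saved"] += r.estimated_cost  # cache hit == upstream call avoided
    for stats in report.values():
        stats["hit_ratio"] = stats["hits"] / stats["total"]
    return report
```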
## 🔄 Request Flow Architecture
### 1. Request Processing Pipeline
```mermaid
sequenceDiagram
participant C as Client
participant A as Auth Layer
participant R as Rate Limiter
participant CM as Cache Manager
participant RC as Redis Cache
participant DB as Database
participant RA as Rentcast API
C->>A: HTTP Request + API Key
A->>A: Validate & Hash Key
alt Valid API Key
A->>R: Check Rate Limits
alt Within Limits
R->>CM: Cache Lookup
CM->>RC: Check L1 Cache
alt Cache Hit (L1)
RC-->>CM: Return Cached Data
CM-->>C: Response + Cache Headers
else Cache Miss (L1)
CM->>DB: Check L2 Cache
alt Cache Hit (L2)
DB-->>CM: Return Cached Data
CM->>RC: Populate L1
CM-->>C: Response + Cache Headers
else Cache Miss (L2)
CM->>RA: Upstream API Call
RA-->>CM: API Response
CM->>DB: Store in L2
CM->>RC: Store in L1
CM-->>C: Response + Cost Headers
end
end
else Rate Limited
R-->>C: 429 Rate Limit Exceeded
end
else Invalid API Key
A-->>C: 401 Unauthorized
end
```
### 2. Cache Key Generation
**Cache Key Strategy**: MD5 hash of request signature
```python
import hashlib
import json

request_signature = {
    "endpoint": "properties",
    "method": "GET",
    "path_params": {"property_id": "123"},
    "query_params": {"city": "Austin", "state": "TX"},
    "body": {}
}
# sort_keys makes the key independent of parameter order
cache_key = hashlib.md5(
    json.dumps(request_signature, sort_keys=True).encode()
).hexdigest()
```
**Benefits**:
- Deterministic cache keys
- Low collision risk in practice (MD5 is not cryptographically collision-resistant, but that is not required for cache keying)
- Parameter order independence
- Efficient storage and lookup
## 💾 Caching Strategy
### Multi-Level Cache Architecture
#### **Level 1: Redis Cache (Hot Data)**
- **Purpose**: Ultra-fast access to frequently requested data
- **TTL**: 30 minutes to 2 hours
- **Eviction**: LRU (Least Recently Used)
- **Size**: Memory-limited, optimized for speed
```python
# L1 Cache Configuration
REDIS_CONFIG = {
    "maxmemory": "512mb",
    "maxmemory_policy": "allkeys-lru",
    "save": ["900 1", "300 10", "60 10000"],  # Persistence snapshots
    "appendonly": True,                       # AOF for durability
    "appendfsync": "everysec"
}
```
#### **Level 2: Database Cache (Persistent)**
- **Purpose**: Persistent cache with analytics and soft deletion
- **TTL**: 1 hour to 48 hours based on endpoint volatility
- **Storage**: Full response data + metadata
- **Features**: Soft deletion, usage tracking, cost analytics
```sql
-- Cache Entry Schema
CREATE TABLE cache_entries (
    id SERIAL PRIMARY KEY,
    cache_key VARCHAR(64) UNIQUE NOT NULL,
    endpoint VARCHAR(50) NOT NULL,
    method VARCHAR(10) NOT NULL,
    params_hash VARCHAR(64) NOT NULL,
    response_data JSONB NOT NULL,
    status_code INTEGER NOT NULL,
    estimated_cost DECIMAL(10,2) DEFAULT 0.0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE NOT NULL,
    is_valid BOOLEAN DEFAULT TRUE,
    hit_count INTEGER DEFAULT 0,
    last_accessed TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```
### Cache TTL Strategy
| Endpoint Type | Data Volatility | Default TTL | Rationale |
|---------------|----------------|-------------|-----------|
| **Property Records** | Very Low | 24 hours | Property characteristics rarely change |
| **Value Estimates** | Medium | 1 hour | Market fluctuations affect valuations |
| **Rent Estimates** | Medium | 1 hour | Rental markets change regularly |
| **Listings** | High | 30 minutes | Active market with frequent updates |
| **Market Statistics** | Low | 2 hours | Aggregated data changes slowly |
| **Comparables** | Medium | 1 hour | Market-dependent analysis |
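The table above can be expressed as a simple endpoint-to-seconds map; the endpoint names mirror the rate-limit keys used later in this document, and the exact mapping is an assumption, not RentCache's actual configuration:

```python
# Hypothetical TTL map mirroring the table above (values in seconds)
CACHE_TTLS = {
    "properties": 24 * 3600,       # property records: very low volatility
    "value_estimate": 3600,
    "rent_estimate": 3600,
    "listings_sale": 30 * 60,      # active market, frequent updates
    "listings_rental": 30 * 60,
    "market_stats": 2 * 3600,
    "comparables": 3600,
}

DEFAULT_TTL = 3600  # fallback, matching default_cache_ttl in Settings

def ttl_for(endpoint: str) -> int:
    """Resolve the cache TTL for an endpoint, falling back to the default."""
    return CACHE_TTLS.get(endpoint, DEFAULT_TTL)
```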
### Stale-While-Revalidate Pattern
```python
async def get_with_stale_while_revalidate(cache_key: str, ttl: int):
    """Serve stale data immediately while refreshing in the background."""
    cached_data = await cache.get(cache_key)
    if cached_data:
        if not cached_data.is_expired:
            return cached_data  # Fresh data
        # Serve stale data, trigger background refresh
        asyncio.create_task(refresh_cache_entry(cache_key))
        return cached_data  # Stale but usable
    # Cache miss - fetch fresh data
    return await fetch_and_cache(cache_key, ttl)
```
**Benefits**:
- Improved user experience (no waiting for fresh data)
- Reduced upstream API calls during traffic spikes
- Graceful handling of upstream service issues
## 🚦 Rate Limiting Implementation
### Token Bucket Algorithm
```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = time.time()

    async def consume(self, tokens: int = 1) -> bool:
        await self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

    async def _refill(self):
        now = time.time()
        tokens_to_add = (now - self.last_refill) * self.refill_rate
        self.tokens = min(self.capacity, self.tokens + tokens_to_add)
        self.last_refill = now
```
### Multi-Tier Rate Limiting
#### **Global Limits**
- **Purpose**: Prevent overall API abuse
- **Scope**: Per API key across all endpoints
- **Implementation**: Redis-based distributed counters
#### **Per-Endpoint Limits**
- **Purpose**: Protect expensive operations
- **Scope**: Specific endpoints (e.g., value estimates)
- **Implementation**: Endpoint-specific token buckets
#### **Dynamic Rate Limiting**
```python
RATE_LIMITS = {
    "properties": "60/minute",        # Standard property searches
    "value_estimate": "30/minute",    # Expensive AI/ML operations
    "rent_estimate": "30/minute",     # Expensive AI/ML operations
    "market_stats": "20/minute",      # Computationally intensive
    "listings_sale": "100/minute",    # Less expensive, higher volume
    "listings_rental": "100/minute",  # Less expensive, higher volume
    "comparables": "40/minute"        # Moderate complexity
}
```
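Limit strings like `"60/minute"` have to be translated into token-bucket parameters. A hypothetical parser (not necessarily the one RentCache uses) maps the period to seconds and derives the refill rate:

```python
# Period names assumed to cover the formats used in RATE_LIMITS above
_PERIODS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def parse_rate(limit: str):
    """Return (capacity, refill_rate in tokens/second) for a limit string.

    "60/minute" -> a bucket holding 60 tokens, refilled at 1 token/second,
    which allows short bursts up to 60 while enforcing the average rate.
    """
    count, _, period = limit.partition("/")
    capacity = int(count)
    seconds = _PERIODS[period.strip()]
    return capacity, capacity / seconds
```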
## 📊 Database Schema Design
### Core Tables
#### **API Keys Management**
```sql
CREATE TABLE api_keys (
    id SERIAL PRIMARY KEY,
    key_name VARCHAR(100) UNIQUE NOT NULL,
    key_hash VARCHAR(64) UNIQUE NOT NULL, -- SHA-256 hash
    is_active BOOLEAN DEFAULT TRUE,
    daily_limit INTEGER DEFAULT 1000,
    monthly_limit INTEGER DEFAULT 30000,
    daily_usage INTEGER DEFAULT 0,
    monthly_usage INTEGER DEFAULT 0,
    last_daily_reset TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    last_monthly_reset TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE,
    last_used TIMESTAMP WITH TIME ZONE
);
```
#### **Usage Analytics**
```sql
CREATE TABLE usage_stats (
    id SERIAL PRIMARY KEY,
    api_key_id INTEGER REFERENCES api_keys(id),
    endpoint VARCHAR(50) NOT NULL,
    method VARCHAR(10) NOT NULL,
    status_code INTEGER NOT NULL,
    response_time_ms DECIMAL(10,2) NOT NULL,
    cache_hit BOOLEAN NOT NULL,
    estimated_cost DECIMAL(10,2) DEFAULT 0.0,
    user_agent TEXT,
    ip_address INET,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```
#### **Rate Limiting State**
```sql
CREATE TABLE rate_limits (
    id SERIAL PRIMARY KEY,
    api_key_id INTEGER REFERENCES api_keys(id),
    endpoint VARCHAR(50) NOT NULL,
    current_tokens INTEGER DEFAULT 0,
    last_refill TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(api_key_id, endpoint)
);
```
### Indexing Strategy
```sql
-- Performance indexes
CREATE INDEX idx_cache_entries_key ON cache_entries(cache_key);
CREATE INDEX idx_cache_entries_endpoint_expires ON cache_entries(endpoint, expires_at);
CREATE INDEX idx_cache_entries_created_at ON cache_entries(created_at);
CREATE INDEX idx_usage_stats_api_key_created ON usage_stats(api_key_id, created_at);
CREATE INDEX idx_usage_stats_endpoint_created ON usage_stats(endpoint, created_at);
CREATE INDEX idx_usage_stats_cache_hit ON usage_stats(cache_hit, created_at);
CREATE INDEX idx_api_keys_hash ON api_keys(key_hash);
CREATE INDEX idx_api_keys_active ON api_keys(is_active);
```
## 🔒 Security Architecture
### Authentication Flow
```mermaid
graph LR
Client --> |Bearer Token| Auth[Auth Middleware]
Auth --> Hash[SHA-256 Hash]
Hash --> DB[(Database Lookup)]
DB --> Validate[Validate Expiry & Status]
Validate --> |Valid| Allow[Allow Request]
Validate --> |Invalid| Deny[401 Unauthorized]
```
### Security Measures
#### **API Key Protection**
- **Storage**: Only SHA-256 hashes stored, never plaintext
- **Transmission**: HTTPS only, bearer token format
- **Rotation**: Configurable expiration dates
- **Revocation**: Instant deactivation capability
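A sketch of key issuance and lookup under these rules (function names are illustrative): the plaintext key is handed to the caller exactly once, and only its SHA-256 digest — 64 hex characters, matching the `key_hash VARCHAR(64)` column — is ever persisted:

```python
import hashlib
import secrets

def generate_api_key():
    """Return (plaintext_key, sha256_hash); only the hash is persisted.

    The plaintext is shown to the caller once at creation time and can
    never be recovered from storage afterwards.
    """
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hashlib.sha256(plaintext.encode()).hexdigest()

def hash_for_lookup(bearer_token: str) -> str:
    """Hash an incoming bearer token the same way before the DB lookup,
    so plaintext keys never touch storage or logs."""
    return hashlib.sha256(bearer_token.encode()).hexdigest()
```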
#### **Network Security**
- **HTTPS Enforcement**: Automatic SSL with Caddy
- **CORS Configuration**: Configurable origin restrictions
- **Rate Limiting**: DDoS and abuse protection
- **Request Validation**: Comprehensive input sanitization
#### **Container Security**
- **Non-root User**: Containers run as unprivileged user
- **Minimal Images**: Alpine Linux base images
- **Secret Management**: Environment variable injection
- **Network Isolation**: Docker network segregation
## 📈 Performance Optimizations
### Application Level
#### **Async Architecture**
```python
# Concurrent request handling
async def handle_multiple_requests():
    tasks = [
        process_request(req1),
        process_request(req2),
        process_request(req3),
    ]
    results = await asyncio.gather(*tasks)
    return results
```
#### **Connection Pooling**
```python
# HTTP client configuration
http_client = httpx.AsyncClient(
    timeout=30.0,
    limits=httpx.Limits(
        max_connections=100,
        max_keepalive_connections=20
    )
)

# Database connection pooling
engine = create_async_engine(
    DATABASE_URL,
    pool_size=20,
    max_overflow=30,
    pool_pre_ping=True,
    pool_recycle=3600
)
```
#### **Response Optimization**
- **GZip Compression**: Automatic response compression
- **JSON Streaming**: Large response streaming
- **Conditional Requests**: ETag and If-Modified-Since support
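The conditional-request path can be sketched framework-free (a hypothetical helper, not RentCache's actual handler): derive a strong ETag from the cached body and short-circuit to 304 when the client's copy matches, so only headers go back over the wire:

```python
import hashlib

def conditional_response(body: bytes, if_none_match):
    """Return (status_code, headers, body) implementing ETag revalidation.

    A strong ETag is derived from the cached response body; when the
    client's If-None-Match header matches, its copy is still valid and
    a bodyless 304 is returned instead of the full payload.
    """
    etag = '"%s"' % hashlib.sha1(body).hexdigest()
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""
    return 200, headers, body
```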
### Database Level
#### **Query Optimization**
```sql
-- Efficient cache lookup
EXPLAIN ANALYZE
SELECT response_data, expires_at, is_valid
FROM cache_entries
WHERE cache_key = $1
  AND expires_at > NOW()
  AND is_valid = TRUE;
```
#### **Connection Management**
- **Prepared Statements**: Reduced parsing overhead
- **Connection Pooling**: Shared connection resources
- **Read Replicas**: Analytics queries routed to separate replicas
### Caching Level
#### **Cache Warming Strategies**
```python
async def warm_cache():
    """Pre-populate cache with common requests"""
    common_requests = [
        {"endpoint": "properties", "city": "Austin", "state": "TX"},
        {"endpoint": "properties", "city": "Dallas", "state": "TX"},
        {"endpoint": "market_stats", "zipCode": "78701"},
    ]
    for request in common_requests:
        await fetch_and_cache(request)
```
#### **Memory Management**
- **TTL Optimization**: Balanced freshness vs. efficiency
- **Compression**: Response data compression
- **Eviction Policies**: Smart cache replacement
## 📊 Monitoring and Observability
### Metrics Collection
#### **Business Metrics**
- Cache hit ratios by endpoint
- API cost savings
- Request volume trends
- Error rates and patterns
#### **System Metrics**
- Response time percentiles
- Database query performance
- Memory and CPU utilization
- Connection pool statistics
#### **Custom Metrics**
```python
# Prometheus-style metrics
cache_hit_ratio = Gauge('cache_hit_ratio', 'Cache hit ratio by endpoint', ['endpoint'])
api_request_duration = Histogram('api_request_duration_seconds', 'API request duration')
upstream_calls = Counter('upstream_api_calls_total', 'Total upstream API calls')
```
### Health Checks
#### **Application Health**
```python
async def health_check():
    checks = {
        "database": await check_database_connection(),
        "cache": await check_cache_availability(),
        "upstream": await check_upstream_api(),
        "disk_space": await check_disk_usage(),
    }
    overall_status = "healthy" if all(checks.values()) else "unhealthy"
    return {"status": overall_status, "checks": checks}
```
#### **Dependency Health**
- Database connectivity and performance
- Redis availability and memory usage
- Upstream API response times
- Disk space and system resources
## 🔧 Configuration Management
### Environment-Based Configuration
```python
from typing import Optional

from pydantic import BaseSettings  # moved to the pydantic-settings package in Pydantic v2

class Settings(BaseSettings):
    # Server
    host: str = "0.0.0.0"
    port: int = 8000
    debug: bool = False

    # Database
    database_url: str
    database_echo: bool = False

    # Cache
    redis_url: Optional[str] = None
    redis_enabled: bool = False
    default_cache_ttl: int = 3600

    # Rate Limiting
    enable_rate_limiting: bool = True
    global_rate_limit: str = "1000/hour"

    class Config:
        env_file = ".env"
        case_sensitive = False
```
### Feature Flags
```python
import os

class FeatureFlags:
    ENABLE_REDIS_CACHE = os.getenv("ENABLE_REDIS_CACHE", "true").lower() == "true"
    ENABLE_ANALYTICS = os.getenv("ENABLE_ANALYTICS", "true").lower() == "true"
    ENABLE_CACHE_WARMING = os.getenv("ENABLE_CACHE_WARMING", "false").lower() == "true"
    STRICT_RATE_LIMITING = os.getenv("STRICT_RATE_LIMITING", "false").lower() == "true"
```
## 🚀 Scalability Considerations
### Horizontal Scaling
#### **Stateless Design**
- No server-side sessions
- Shared state in Redis/Database
- Load balancer friendly
#### **Container Orchestration**
```yaml
# Kubernetes deployment example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rentcache
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rentcache
  template:
    metadata:
      labels:
        app: rentcache  # must match the selector above
    spec:
      containers:
        - name: rentcache
          image: rentcache:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
```
### Vertical Scaling
#### **Resource Optimization**
- Memory: Cache size tuning
- CPU: Async I/O optimization
- Storage: Database indexing and partitioning
- Network: Connection pooling and keep-alive
### Data Partitioning
#### **Time-Based Table Partitioning**
```sql
-- Partition by date for analytics
CREATE TABLE usage_stats (
    -- columns
) PARTITION BY RANGE (created_at);

CREATE TABLE usage_stats_2024_01 PARTITION OF usage_stats
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```
#### **Cache Distribution**
- Redis Cluster for distributed caching
- Consistent hashing for cache key distribution
- Regional cache replication
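Consistent hashing for cache key distribution can be illustrated with a minimal hash ring (node names and the virtual-node count here are arbitrary): each node claims many points on a circle, a key goes to the first node point at or after its own hash, and removing a node only remaps the keys that lived on it:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring mapping cache keys to nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns `vnodes` points on the ring, which evens out
        # the key distribution across nodes.
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            for i in range(vnodes):
                point = int(hashlib.md5(f"{node}:{i}".encode()).hexdigest(), 16)
                self._ring.append((point, node))
        self._ring.sort()

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next node point."""
        point = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect.bisect(self._ring, (point,)) % len(self._ring)
        return self._ring[idx][1]
```

The same key always lands on the same node, which is what lets multiple RentCache instances share a partitioned Redis tier without coordination.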
## 🔄 Disaster Recovery
### Backup Strategy
#### **Database Backups**
```bash
# Automated daily backups
pg_dump rentcache | gzip > backup-$(date +%Y%m%d).sql.gz
# Point-in-time recovery
pg_basebackup -D /backup/base -Ft -z -P
```
#### **Configuration Backups**
- Environment variables
- Docker Compose files
- SSL certificates
- Application configuration
### Recovery Procedures
#### **Database Recovery**
```bash
# Restore from backup
gunzip -c backup-20240115.sql.gz | psql rentcache
# Point-in-time recovery
pg_ctl stop -D /var/lib/postgresql/data
rm -rf /var/lib/postgresql/data/*
pg_basebackup -D /var/lib/postgresql/data -R
```
#### **Cache Recovery**
- Redis persistence (RDB + AOF)
- Cache warming from database
- Graceful degradation to upstream API
---
This architecture is designed for high availability, performance, and cost optimization while maintaining security and operational simplicity. For implementation details, see the [Deployment Guide](DEPLOYMENT.md) and [Usage Guide](USAGE.md).