Database Schema
Comprehensive database schema documentation including models, relationships, indexes, and migration strategies.
Database Schema
This document provides a comprehensive overview of the database schema, including all models, relationships, indexing strategies, and migration patterns used in the YouTube Analyzer application.
Overview
The application uses PostgreSQL as the primary database with Prisma ORM for type-safe database access and schema management. The database is designed to support multi-tenant operations, complex analysis workflows, and efficient querying patterns.
Database Provider: PostgreSQL (hosted on Neon/Supabase)
ORM: Prisma Client with TypeScript
Migration Strategy: Prisma Migrate with version control
Core Models
User Model
The central user entity that manages authentication, billing, and analysis ownership.
model User {
id String @id @default(cuid())
name String?
email String? @unique
emailVerified DateTime?
image String?
role UserRole @default(USER)
// Timestamps
createdAt DateTime @default(now()) @map(name: "created_at")
updatedAt DateTime @default(now()) @map(name: "updated_at")
// Credit System
credits Int @default(3) // Available credits
reservedCredits Int @default(0) // Credits reserved for processing
creditsUsed Int @default(0) // Total credits consumed
lastCreditReset DateTime @default(now()) // Last credit reset time
// Stripe Integration
stripeCustomerId String? @unique @map(name: "stripe_customer_id")
stripeSubscriptionId String? @unique @map(name: "stripe_subscription_id")
stripePriceId String? @map(name: "stripe_price_id")
stripeCurrentPeriodEnd DateTime? @map(name: "stripe_current_period_end")
// Relationships
accounts Account[]
sessions Session[]
analyses Analysis[]
autoAnalyses AutoAnalysis[]
@@map(name: "users")
}
enum UserRole {
ADMIN
USER
}Key Features:
- Credit System: Tracks available, reserved, and used credits
- Stripe Integration: Complete billing and subscription management
- Role-Based Access: Admin and user roles for different permissions
- Audit Trail: Creation and update timestamps
Analysis Model
Stores individual analysis results and metadata.
model Analysis {
id String @id @default(cuid())
userId String
channelUrl String
videoCount Int
htmlResult String @db.Text // Full HTML analysis result
createdAt DateTime @default(now())
// Analysis Configuration
triggerType TriggerType @default(manual) // How analysis was triggered
autoAnalysisId String? // Link to auto-analysis config
analysisType String? // Type of analysis performed
// Status and Metadata
status String? @default("completed")
channelName String?
channelThumbnail String?
summaryPreview String? // Short preview for listings
errorMessage String? // Error details if failed
// Enhanced Data (JSON fields)
performanceData Json? // Channel performance metrics
summaryMetrics Json? // Analysis summary data
overallSentiment Json? // Sentiment analysis results
chartData Json? // Chart visualization data
// Relationships
user User @relation(fields: [userId], references: [id])
autoAnalysis AutoAnalysis? @relation(fields: [autoAnalysisId], references: [id], onDelete: SetNull)
videos Video[]
}Key Features:
- Flexible Content Storage: HTML results with JSON metadata
- Status Tracking: Processing states and error handling
- Rich Metadata: Performance data and visualization-ready data
- Audit Trail: Complete analysis history with timestamps
AutoAnalysis Model
Manages automated analysis configurations for continuous channel monitoring.
model AutoAnalysis {
id String @id @default(cuid())
userId String
channelId String
channelName String
channelUrl String
// Configuration
triggerType TriggerType // Trigger condition
analysisType String @default("news-feed") // Analysis type to run
isActive Boolean @default(true)
// State Management
lastAnalyzedVideoId String? // Last processed video
lastRunAt DateTime? // Last execution time
totalRuns Int @default(0) // Total executions
creditsUsed Int @default(0) // Credits consumed
// Timestamps
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
// Relationships
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
analyses Analysis[]
@@index([userId])
@@index([isActive])
@@index([lastRunAt])
}
enum TriggerType {
manual // User-initiated analysis
auto // General auto-trigger
per_video // Trigger on each new video
batch_3 // Trigger after 3 new videos
batch_5 // Trigger after 5 new videos
}Key Features:
- Flexible Triggers: Multiple trigger conditions for automation
- State Persistence: Tracks last processed video to avoid duplicates
- Usage Metrics: Monitors runs and credit consumption
- Performance Indexes: Optimized for cron job queries
Video Model
Stores detailed video data and metadata for each analyzed video.
model Video {
id String @id @default(cuid())
analysisId String
youtubeId String
title String
description String? @db.Text
content String @db.Text // Transcript content
// Video Metadata
duration Int // Duration in seconds
viewCount Int?
likeCount Int?
thumbnail String
uploadDate DateTime?
channelId String
channelName String
// Language and Localization
language String // Primary language
availableLanguages String[] // All available languages
tags String[] // Video tags
transcriptLanguages String[] // Available transcript languages
// Enhanced Data (JSON fields)
commentSentiment Json? // Comment sentiment analysis
channelMetrics Json? // Channel-specific metrics
// Timestamps
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
// Relationships
analysis Analysis @relation(fields: [analysisId], references: [id], onDelete: Cascade)
comments Comment[]
@@unique([youtubeId, analysisId])
}Key Features:
- Complete Video Data: All YouTube metadata and content
- Multi-language Support: Handles multiple transcript languages
- Rich Content: Full transcript content for analysis
- Unique Constraints: Prevents duplicate videos per analysis
Comment Model
Stores individual video comments for sentiment analysis.
model Comment {
id String @id @default(cuid())
videoId String
text String
author String
likes Int?
publishedAt DateTime?
sentiment Json? // Sentiment analysis results
// Relationships
video Video @relation(fields: [videoId], references: [id], onDelete: Cascade)
}Authentication Models
Account Model
OAuth account connections for Auth.js integration.
model Account {
id String @id @default(cuid())
userId String
type String
provider String
providerAccountId String
refresh_token String? @db.Text
access_token String? @db.Text
expires_at Int?
token_type String?
scope String?
id_token String? @db.Text
session_state String?
createdAt DateTime @default(now()) @map(name: "created_at")
updatedAt DateTime @default(now()) @map(name: "updated_at")
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
@@unique([provider, providerAccountId])
@@index([userId])
@@map(name: "accounts")
}Session Model
User session management for Auth.js.
model Session {
id String @id @default(cuid())
sessionToken String @unique
userId String
expires DateTime
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
@@index([userId])
@@map(name: "sessions")
}VerificationToken Model
Email verification tokens.
model VerificationToken {
identifier String
token String @unique
expires DateTime
@@unique([identifier, token])
@@map(name: "verification_tokens")
}Database Relationships
Entity Relationship Diagram
User
├── 1:N → Account (OAuth accounts)
├── 1:N → Session (Active sessions)
├── 1:N → Analysis (User's analyses)
└── 1:N → AutoAnalysis (Auto-analysis configs)
└── 1:N → Analysis (Generated analyses)
Analysis
├── N:1 → User (Owner)
├── N:1 → AutoAnalysis (Optional: if auto-generated)
└── 1:N → Video (Analyzed videos)
└── 1:N → Comment (Video comments)
Relationship Details
User Relationships
- User → Analysis: One user can have many analyses (1:N)
- User → AutoAnalysis: One user can have many auto-analysis configs (1:N)
- User → Account: One user can have multiple OAuth accounts (1:N)
- User → Session: One user can have multiple active sessions (1:N)
Analysis Relationships
- Analysis → User: Each analysis belongs to one user (N:1)
- Analysis → AutoAnalysis: Analysis may be generated by auto-analysis (N:1, optional)
- Analysis → Video: One analysis contains many videos (1:N)
Video Relationships
- Video → Analysis: Each video belongs to one analysis (N:1)
- Video → Comment: One video can have many comments (1:N)
Indexing Strategy
Primary Indexes
All models use cuid() for primary keys, providing:
- Collision Resistance: Extremely low probability of conflicts
- Sortability: Lexicographically sortable by creation time
- URL Safety: Safe for use in URLs and APIs
Performance Indexes
User Table
// Unique constraints for authentication
email String? @unique
stripeCustomerId String? @unique
stripeSubscriptionId String? @uniqueAccount Table
// OAuth provider uniqueness
@@unique([provider, providerAccountId])
@@index([userId]) // Foreign key optimizationAutoAnalysis Table
@@index([userId]) // User's auto-analyses
@@index([isActive]) // Active configurations for cron
@@index([lastRunAt]) // Cron job optimizationVideo Table
@@unique([youtubeId, analysisId]) // Prevent video duplicates per analysisQuery Optimization Examples
Efficient User Analysis History
// Optimized query with proper indexes
const analyses = await prisma.analysis.findMany({
where: { userId }, // Uses userId index
orderBy: { createdAt: 'desc' }, // Uses createdAt for sorting
take: 10, // Limit results
include: {
videos: {
select: { thumbnail: true }, // Only needed fields
take: 1 // First video thumbnail only
}
}
});Auto-Analysis Cron Query
// Optimized for cron job performance
const activeConfigs = await prisma.autoAnalysis.findMany({
where: {
isActive: true, // Uses isActive index
lastRunAt: { // Uses lastRunAt index
lt: new Date(Date.now() - 30 * 60 * 1000) // 30 minutes ago
}
},
include: { user: true }
});Data Types and Validation
JSON Field Usage
The schema uses PostgreSQL's native JSON support for flexible data storage:
Analysis Performance Data
interface PerformanceData {
totalViews: number;
averageViews: number;
totalLikes: number;
averageLikes: number;
engagementRate: number;
uploadFrequency: string;
topPerformingVideos: Array<{
title: string;
views: number;
likes: number;
}>;
}Chart Data
interface ChartData {
viewsOverTime: Array<{
date: string;
views: number;
}>;
performanceMetrics: Array<{
metric: string;
value: number;
change: number;
}>;
sentimentDistribution: {
positive: number;
neutral: number;
negative: number;
};
}Comment Sentiment
interface CommentSentiment {
score: number; // -1 to 1
magnitude: number; // 0 to 1
label: 'positive' | 'negative' | 'neutral';
confidence: number; // 0 to 1
}Enum Definitions
UserRole
enum UserRole {
ADMIN = "ADMIN", // Full system access
USER = "USER" // Standard user access
}TriggerType
enum TriggerType {
manual = "manual", // User-initiated
auto = "auto", // General automation
per_video = "per_video", // Every new video
batch_3 = "batch_3", // Every 3 videos
batch_5 = "batch_5" // Every 5 videos
}Migration Strategies
Development Workflow
# Create new migration
npx prisma migrate dev --name add_new_feature
# Apply migrations to production
npx prisma migrate deploy
# Generate Prisma client
npx prisma generateMigration Best Practices
1. Additive Changes
// ✅ Safe: Adding optional fields
model User {
id String @id
newField String? // Optional field is safe
}2. Data Migrations
// Example: Migrating analysis types
import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
async function migrateAnalysisTypes() {
await prisma.analysis.updateMany({
where: { analysisType: null },
data: { analysisType: 'standard-summary' }
});
}3. Index Migrations
-- Create index concurrently (PostgreSQL)
CREATE INDEX CONCURRENTLY idx_analysis_created_at
ON "Analysis" ("createdAt" DESC);Backup Strategies
Pre-Migration Backup
# Backup before major migrations
pg_dump $DATABASE_URL > backup_$(date +%Y%m%d_%H%M%S).sqlPoint-in-Time Recovery
# PostgreSQL PITR setup
postgresql.conf:
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'Performance Considerations
Connection Pooling
// Prisma connection pooling
const prisma = new PrismaClient({
datasources: {
db: {
url: process.env.DATABASE_URL + "?connection_limit=10&pool_timeout=20"
}
}
});Query Optimization
Efficient Pagination
// Cursor-based pagination for large datasets
const analyses = await prisma.analysis.findMany({
take: 20,
skip: cursor ? 1 : 0,
cursor: cursor ? { id: cursor } : undefined,
orderBy: { createdAt: 'desc' }
});Batch Operations
// Efficient batch inserts
await prisma.video.createMany({
data: videoData,
skipDuplicates: true // Ignore unique constraint violations
});Database Monitoring
Query Performance
// Prisma query logging
const prisma = new PrismaClient({
log: [
{ level: 'query', emit: 'event' },
{ level: 'info', emit: 'stdout' },
{ level: 'warn', emit: 'stdout' },
{ level: 'error', emit: 'stdout' },
],
});
prisma.$on('query', (e) => {
console.log('Query: ' + e.query);
console.log('Duration: ' + e.duration + 'ms');
});Data Integrity and Constraints
Foreign Key Constraints
// Cascade deletes for data consistency
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
// Set null for optional relationships
autoAnalysis AutoAnalysis? @relation(fields: [autoAnalysisId], references: [id], onDelete: SetNull)Unique Constraints
// Prevent duplicate auto-analyses per user/channel
@@unique([userId, channelId])
// Prevent duplicate videos per analysis
@@unique([youtubeId, analysisId])Data Validation
// Prisma client validation
const createUser = await prisma.user.create({
data: {
email: "user@example.com", // Must be unique
credits: 10, // Must be integer
role: "USER" // Must be valid enum value
}
});Security Considerations
Row Level Security (Future Enhancement)
-- Example RLS policy for multi-tenant security
CREATE POLICY user_isolation ON analysis
FOR ALL TO authenticated
USING (user_id = auth.uid());Data Encryption
// Sensitive data encryption before storage
import crypto from 'crypto';
const encryptSensitiveData = (data: string): string => {
const cipher = crypto.createCipher('aes-256-cbc', process.env.ENCRYPTION_KEY);
let encrypted = cipher.update(data, 'utf8', 'hex');
encrypted += cipher.final('hex');
return encrypted;
};Audit Logging
// Audit trail for sensitive operations
model AuditLog {
id String @id @default(cuid())
userId String
action String // CREATE, UPDATE, DELETE
table String // Table name
recordId String // Record ID
oldData Json? // Previous data
newData Json? // New data
timestamp DateTime @default(now())
}This database schema provides a robust foundation for the YouTube Analyzer application, supporting complex analysis workflows, automated processing, and efficient data retrieval patterns while maintaining data integrity and performance.