🏭 Domains · ✍️ Khoa · 📅 19/04/2026 · ☕ 15 min read

Domain: Video Streaming Platform

YouTube serves 1 billion hours of video every day. Netflix accounts for roughly 15% of global internet bandwidth. TikTok handles on the order of 1 billion video uploads per month. Behind these numbers sits some of the most complex infrastructure in tech: video encoding pipelines, adaptive bitrate streaming, CDN optimization, and content recommendation at massive scale.

This section describes how to design a video platform like YouTube/TikTok at production scale.


1. Video Upload Pipeline

1.1 Upload Flow Architecture

┌──────────────┐         ┌────────────┐         ┌──────────────────┐
│  Client App  │────────►│ Upload API │────────►│ Object Storage   │
│              │         │ (presigned)│         │ (S3 / GCS)       │
└──────────────┘         └────────────┘         └────────┬─────────┘
                                                          │
                                                          ▼
                                                  ┌──────────────────┐
                                                  │ Message Queue    │
                                                  │ (Kafka / SQS)    │
                                                  └────────┬─────────┘
                                                           │
                     ┌─────────────────────────────────────┼────────────────┐
                     │                                     │                │
                     ▼                                     ▼                ▼
            ┌─────────────────┐              ┌─────────────────────────────────┐
            │ Transcoding     │              │ Thumbnail Generation            │
            │ Worker Pool     │              │ + Metadata Extraction           │
            │ (FFmpeg)        │              └─────────────────────────────────┘
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │ Multiple        │
            │ Resolutions     │
            │ (360p - 4K)     │
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │ CDN Upload      │
            │ (CloudFront,    │
            │  Cloudflare)    │
            └─────────────────┘

1.2 Presigned URL for Direct Upload

import (
    "fmt"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func GeneratePresignedUploadURL(filename string, userID int64) (string, error) {
    sess := session.Must(session.NewSession())
    svc := s3.New(sess)
    
    // Unique S3 key
    key := fmt.Sprintf("uploads/%d/%d-%s", userID, time.Now().Unix(), filename)
    
    req, _ := svc.PutObjectRequest(&s3.PutObjectInput{
        Bucket: aws.String("my-video-bucket"),
        Key:    aws.String(key),
    })
    
    // Presigned URL valid for 30 minutes
    url, err := req.Presign(30 * time.Minute)
    if err != nil {
        return "", err
    }
    
    // Save upload record to DB (status: UPLOADING)
    video := &Video{
        UserID:    userID,
        S3Key:     key,
        Status:    "UPLOADING",
        CreatedAt: time.Now(),
    }
    db.Insert(video)
    
    return url, nil
}

The client uploads directly to S3 (bypassing the backend, which cuts bandwidth cost).

1.3 Resumable Upload (Multipart)

// Frontend: upload large files in resumable chunks
async function uploadVideo(file) {
    const CHUNK_SIZE = 5 * 1024 * 1024; // 5MB chunks
    const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
    
    // Step 1: Initiate multipart upload
    const { uploadId, uploadUrl } = await fetch('/api/videos/upload/init', {
        method: 'POST',
        body: JSON.stringify({ filename: file.name, filesize: file.size })
    }).then(r => r.json());
    
    // Step 2: Upload chunks
    const uploadedParts = [];
    for (let i = 0; i < totalChunks; i++) {
        const start = i * CHUNK_SIZE;
        const end = Math.min(start + CHUNK_SIZE, file.size);
        const chunk = file.slice(start, end);
        
        const partNumber = i + 1;
        const partUrl = `${uploadUrl}?partNumber=${partNumber}&uploadId=${uploadId}`;
        
        const response = await fetch(partUrl, {
            method: 'PUT',
            body: chunk
        });
        
        const etag = response.headers.get('ETag');
        uploadedParts.push({ PartNumber: partNumber, ETag: etag });
        
        // Update progress
        const progress = Math.round((i + 1) / totalChunks * 100);
        updateProgressBar(progress);
    }
    
    // Step 3: Complete multipart upload
    await fetch('/api/videos/upload/complete', {
        method: 'POST',
        body: JSON.stringify({ uploadId, parts: uploadedParts })
    });
}

Benefit: if the upload fails at chunk 50/100, only that chunk needs to be retried (no restarting from the beginning).


2. Video Transcoding — FFmpeg Pipeline

2.1 Adaptive Bitrate Streaming (ABR)

Create multiple versions of the video at different resolutions/bitrates:

Original: 4K 3840×2160 @ 20 Mbps
  ↓
Transcoding:
  - 1080p @ 5 Mbps
  - 720p  @ 2.5 Mbps
  - 480p  @ 1 Mbps
  - 360p  @ 0.5 Mbps

Client automatically switches resolution based on measured bandwidth
  → Smooth playback, no buffering
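
The switching rule above can be sketched as a tiny client-side heuristic: pick the highest rung of the bitrate ladder that fits within a safety margin of the measured throughput. The ladder mirrors the table above; the 0.8 safety factor is an illustrative assumption, not a standard.

```python
# Illustrative sketch of ABR rendition selection (ladder and safety factor
# are assumptions; real players use smoothed estimates and buffer state).
LADDER = [  # (label, bitrate in bits per second), sorted high -> low
    ("1080p", 5_000_000),
    ("720p", 2_500_000),
    ("480p", 1_000_000),
    ("360p", 500_000),
]

def pick_rendition(measured_bps, safety=0.8):
    """Return the highest rendition whose bitrate fits within safety * bandwidth."""
    budget = measured_bps * safety
    for label, bitrate in LADDER:
        if bitrate <= budget:
            return label
    return LADDER[-1][0]  # always fall back to the lowest rung
```

Real players (hls.js, Shaka Player) combine smoothed throughput estimates with buffer occupancy rather than a single measurement, but the core decision looks like this.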

2.2 FFmpeg Command

# Transcode to multiple resolutions
ffmpeg -i input.mp4 \
  -vf scale=1920:1080 -c:v libx264 -preset medium -crf 23 -b:v 5M -maxrate 5M -bufsize 10M -c:a aac -b:a 128k output_1080p.mp4 \
  -vf scale=1280:720  -c:v libx264 -preset medium -crf 23 -b:v 2.5M -maxrate 2.5M -bufsize 5M -c:a aac -b:a 128k output_720p.mp4 \
  -vf scale=854:480   -c:v libx264 -preset medium -crf 23 -b:v 1M -maxrate 1M -bufsize 2M -c:a aac -b:a 96k output_480p.mp4 \
  -vf scale=640:360   -c:v libx264 -preset medium -crf 23 -b:v 500k -maxrate 500k -bufsize 1M -c:a aac -b:a 64k output_360p.mp4

# Generate HLS manifest
ffmpeg -i input.mp4 \
  -map 0:v:0 -map 0:a:0 -map 0:v:0 -map 0:a:0 \
  -map 0:v:0 -map 0:a:0 -map 0:v:0 -map 0:a:0 \
  -c:v libx264 -c:a aac \
  -b:v:0 5M -s:v:0 1920x1080 \
  -b:v:1 2.5M -s:v:1 1280x720 \
  -b:v:2 1M -s:v:2 854x480 \
  -b:v:3 500k -s:v:3 640x360 \
  -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2 v:3,a:3" \
  -master_pl_name master.m3u8 \
  -f hls -hls_time 6 -hls_list_size 0 \
  -hls_segment_filename "v%v/segment_%03d.ts" \
  "v%v/playlist.m3u8"

Parameters:

  • -c:v libx264: H.264 codec (widely supported)
  • -preset medium: Encoding speed vs quality tradeoff
  • -crf 23: Quality (18-28, lower = better quality)
  • -hls_time 6: 6-second segments (shorter = faster ABR switching, but more overhead)

2.3 Distributed Transcoding Workers

type TranscodingWorker struct {
    s3         *s3.Client
    kafka      *kafka.Consumer
    ffmpegPath string
}

func (w *TranscodingWorker) Start() {
    for {
        msg := w.kafka.ReadMessage()
        var job TranscodeJob
        json.Unmarshal(msg.Value, &job)
        
        if err := w.processJob(job); err != nil {
            // Retry or send to a dead-letter queue (DLQ)
            log.Error("Transcode failed", "job", job.VideoID, "err", err)
        }
    }
}

func (w *TranscodingWorker) processJob(job TranscodeJob) error {
    // Step 1: Download the original from S3
    inputPath := fmt.Sprintf("/tmp/%s.mp4", job.VideoID)
    if err := w.s3.DownloadFile(job.S3Key, inputPath); err != nil {
        return err
    }
    defer os.Remove(inputPath)
    
    // Step 2: Transcode to multiple resolutions
    resolutions := []string{"1080p", "720p", "480p", "360p"}
    for _, res := range resolutions {
        outputPath := fmt.Sprintf("/tmp/%s_%s.mp4", job.VideoID, res)
        
        cmd := exec.Command(w.ffmpegPath,
            "-i", inputPath,
            "-vf", getScaleFilter(res),
            "-c:v", "libx264",
            "-preset", "medium",
            "-crf", "23",
            "-b:v", getBitrate(res),
            "-c:a", "aac",
            "-b:a", "128k",
            outputPath,
        )
        
        if err := cmd.Run(); err != nil {
            return fmt.Errorf("ffmpeg failed for %s: %w", res, err)
        }
        
        // Step 3: Upload to S3
        s3Key := fmt.Sprintf("videos/%s/%s.mp4", job.VideoID, res)
        if err := w.s3.UploadFile(outputPath, s3Key); err != nil {
            return err
        }
        
        os.Remove(outputPath)
    }
    
    // Step 4: Generate HLS playlist
    if err := w.generateHLS(job); err != nil {
        return err
    }
    
    // Step 5: Update DB status
    db.Exec(`UPDATE videos SET status = 'READY', processed_at = now() WHERE id = $1`, job.VideoID)
    
    return nil
}

Scale: horizontal. Spin up 100 workers when 10k videos are sitting in the queue.
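
That autoscaling decision can be sketched as simple arithmetic: size the pool so the backlog drains within a target window. The numbers (average transcode time, drain target) are illustrative assumptions.

```python
import math

def workers_needed(queue_len, avg_transcode_min, target_drain_min,
                   per_worker_parallel=1):
    """Rough pool size so the transcode backlog drains within the target window.

    Illustrative sketch: jobs each worker can finish in the window is
    target_drain_min / avg_transcode_min (times its parallelism).
    """
    if queue_len == 0:
        return 0
    jobs_per_worker = (target_drain_min / avg_transcode_min) * per_worker_parallel
    return math.ceil(queue_len / jobs_per_worker)
```

For example, 10k queued videos at 10 minutes each, drained within an hour, needs ~1,667 single-job workers; a real autoscaler would also smooth the queue length and cap the pool.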


3. HLS (HTTP Live Streaming)

3.1 HLS Structure

master.m3u8 (manifest):
  Lists all available variants (resolutions)

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/playlist.m3u8

───────────────────────────────────────────────────

720p/playlist.m3u8 (variant):
  Lists segments for this resolution

#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-VERSION:3
#EXTINF:6.0,
segment_000.ts
#EXTINF:6.0,
segment_001.ts
#EXTINF:6.0,
segment_002.ts
...
#EXT-X-ENDLIST

Player logic:

  1. Fetch master.m3u8
  2. Measure bandwidth
  3. Select appropriate variant (e.g., 720p)
  4. Fetch 720p/playlist.m3u8
  5. Download segments sequentially
  6. Monitor bandwidth → switch variant if needed
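
Steps 1-3 of the player logic above can be sketched as a minimal master-playlist parser plus variant picker. This is a simplified assumption of what a player does; real players also handle CODECS attributes, audio groups, and smoothed bandwidth estimates.

```python
import re

MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/playlist.m3u8
"""

def parse_master(text):
    """Return (bandwidth, uri) pairs from a master playlist (sketch)."""
    variants = []
    lines = text.strip().splitlines()
    for i, line in enumerate(lines):
        m = re.search(r"#EXT-X-STREAM-INF:.*BANDWIDTH=(\d+)", line)
        if m:
            variants.append((int(m.group(1)), lines[i + 1]))  # URI on next line
    return variants

def select_variant(variants, measured_bps):
    """Highest-bandwidth variant that fits the measured throughput."""
    fitting = [v for v in variants if v[0] <= measured_bps]
    return max(fitting) if fitting else min(variants)
```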

3.2 DASH (Alternative to HLS)

DASH = Dynamic Adaptive Streaming over HTTP (ISO standard)
HLS = Apple's protocol (originally proprietary, now the de-facto standard)

Differences:
  - HLS: .m3u8 manifest, .ts segments
  - DASH: .mpd manifest, .mp4 segments
  - DASH: More flexible, supports more codecs
  - HLS: Better Apple ecosystem support

In practice: serve both (HLS for iOS/Safari, DASH for the rest), or HLS only (it now enjoys near-universal support).

4. CDN Strategy

4.1 Multi-tier CDN

┌─────────────┐
│   Origin    │ ← S3 / Origin servers (1 copy)
│  (Storage)  │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Edge CDN   │ ← CloudFront, Cloudflare (global PoPs)
│  (Cache)    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  ISP Cache  │ ← Netflix Open Connect (at ISP datacenters)
│  (Optional) │
└─────────────┘

Netflix Open Connect: deploys cache servers directly at ISPs → 95% of traffic served locally.

4.2 Cache Strategy

Cache-Control headers:
  - Video segments (.ts): max-age=31536000 (1 year, immutable)
  - Playlists (.m3u8): max-age=10 (10 seconds, so live streams can update)
  - Thumbnails: max-age=86400 (1 day)

Range requests:
  Let the client fetch part of a video (for seeking) without downloading the whole file

GET /videos/abc123/720p/segment_010.ts
Range: bytes=0-1048575  (first 1MB)

Response:
HTTP/1.1 206 Partial Content
Content-Range: bytes 0-1048575/5242880
Content-Length: 1048576
Cache-Control: public, max-age=31536000, immutable
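
The caching policy above can be expressed as a small lookup, e.g. in origin middleware or an edge worker. The extension-to-header mapping is a sketch of the values listed in 4.2; the `no-store` default is an assumption.

```python
def cache_headers(path):
    """Map an asset path to its Cache-Control value (sketch of the 4.2 policy)."""
    if path.endswith(".ts"):
        return "public, max-age=31536000, immutable"   # segments never change
    if path.endswith(".m3u8"):
        return "public, max-age=10"                    # playlists update for live
    if path.endswith((".jpg", ".png", ".webp")):
        return "public, max-age=86400"                 # thumbnails, 1 day
    return "no-store"                                  # assumed default: don't cache
```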

4.3 Geographic Load Balancing

DNS-based routing (Route 53, Cloudflare):
  A user in Vietnam → routed to the Singapore PoP
  A user in the US → routed to the Virginia PoP

Anycast IP:
  The same IP address is announced from multiple locations
  → Routers automatically pick the nearest path

5. Video Recommendation Engine

5.1 Collaborative Filtering

User-Item Matrix:

           Video1  Video2  Video3  Video4
User A       5       ?       3       ?
User B       4       2       ?       5
User C       ?       3       4       4

Predict the rating User A would give Video 2:
  → Find similar users (User B and C have rated Video 2)
  → Take a weighted average of their ratings
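
A minimal sketch of that neighbour-based prediction, using the toy matrix above. Note the caveat: with only one co-rated item the cosine degenerates to 1, which is why real systems require a minimum overlap and mean-center ratings.

```python
import math

RATINGS = {  # the toy user-item matrix above; missing = unrated
    "A": {"v1": 5, "v3": 3},
    "B": {"v1": 4, "v2": 2, "v4": 5},
    "C": {"v2": 3, "v3": 4, "v4": 4},
}

def cosine(u, v):
    """Cosine similarity over items both users rated (0 if no overlap)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    """Similarity-weighted average of neighbours' ratings for the item."""
    num = den = 0.0
    for other, ratings in RATINGS.items():
        if other == user or item not in ratings:
            continue
        sim = cosine(RATINGS[user], ratings)
        num += sim * ratings[item]
        den += sim
    return num / den if den else None
```

Here `predict("A", "v2")` averages B's 2 and C's 3 with equal weight, giving 2.5.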

Matrix Factorization (ALS):

from pyspark.ml.recommendation import ALS

# Load interaction data (user_id, video_id, rating)
interactions = spark.read.parquet("s3://data/interactions.parquet")

als = ALS(
    maxIter=10,
    regParam=0.1,
    userCol="user_id",
    itemCol="video_id",
    ratingCol="rating",
    coldStartStrategy="drop"
)

model = als.fit(interactions)

# Generate recommendations for all users
user_recs = model.recommendForAllUsers(20)  # top 20 per user

# Save to serving layer (Redis / DynamoDB)
user_recs.write.format("redis").save()

5.2 Content-based Filtering

Video embeddings:
  - Title/description: BERT embeddings (768-dim vector)
  - Thumbnail: ResNet embeddings (2048-dim vector)
  - Audio: VGGish embeddings (128-dim vector)
  - Category, tags (one-hot encoding)

Cosine similarity:
  sim(video1, video2) = (emb1 · emb2) / (||emb1|| × ||emb2||)

User profile embedding:
  Weighted average of the embeddings of videos the user has watched
  → Recommend videos whose embedding is close to the user profile

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def recommend_similar_videos(video_id, top_k=10):
    # Fetch video embedding
    video_emb = redis.get(f"video_emb:{video_id}")
    video_emb = np.frombuffer(video_emb, dtype=np.float32)
    
    # Fetch all video embeddings (or use Faiss for billion-scale)
    all_embeddings = load_all_embeddings()  # shape (N, 768)
    
    # Compute similarity
    similarities = cosine_similarity([video_emb], all_embeddings)[0]
    
    # Top-K
    top_indices = np.argsort(similarities)[::-1][:top_k]
    top_video_ids = [index_to_video_id[i] for i in top_indices]
    
    return top_video_ids

5.3 Two-stage Ranking (YouTube-style)

Stage 1: Candidate Generation (retrieve top 1000 from millions)
  - User history-based CF
  - Content-based similarity
  - Trending videos (time decay)
  - Subscribed channels' new uploads

Stage 2: Ranking (re-rank 1000 → 20)
  - Deep Neural Network with features:
    * User demographics, watch history, engagement rate
    * Video metadata, quality score, upload recency
    * Context: time of day, device, location
  - Objective: Predict watch time (not just CTR)
    → Optimize for engagement, not clickbait

import tensorflow as tf

# Simplified ranking model
def build_ranking_model():
    # Input features
    user_features = tf.keras.Input(shape=(50,), name='user_features')
    video_features = tf.keras.Input(shape=(100,), name='video_features')
    context_features = tf.keras.Input(shape=(10,), name='context_features')
    
    # Concatenate
    concat = tf.keras.layers.Concatenate()([user_features, video_features, context_features])
    
    # Deep layers
    x = tf.keras.layers.Dense(256, activation='relu')(concat)
    x = tf.keras.layers.Dropout(0.3)(x)
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Output: predicted watch time (regression)
    output = tf.keras.layers.Dense(1, activation='linear', name='watch_time')(x)
    
    model = tf.keras.Model(
        inputs=[user_features, video_features, context_features],
        outputs=output
    )
    
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model

# Training
model = build_ranking_model()
model.fit(
    [train_user_feat, train_video_feat, train_context_feat],
    train_watch_time,
    epochs=10,
    batch_size=512,
    validation_split=0.2
)

# Inference
predictions = model.predict([user_feat, video_feat, context_feat])
ranked_videos = sorted(zip(candidate_videos, predictions), key=lambda x: x[1], reverse=True)

6. Live Streaming

6.1 RTMP Ingestion

Streamer                  Ingest Server               CDN
  │                           │                        │
  │─ RTMP stream ────────────►│                        │
  │  1920x1080 @ 6Mbps        │                        │
  │                           │─ Transcode ────────────►│
  │                           │  (multiple bitrates)   │
  │                           │                        │
  │                           │─ HLS/DASH Packaging ───►│
  │                           │                        │
  │                           │                        │
Viewers                       │                        │
  │                           │                        │
  │◄────────────────── HLS segments ───────────────────┤

RTMP: Real-Time Messaging Protocol (Adobe)

  • Legacy, but still used for ingestion (OBS and Streamlabs broadcast over RTMP)
  • Delivery: HLS or DASH (RTMP does not scale for playback)

6.2 Low-latency Streaming

Standard HLS: 15-30 seconds of latency (due to segment size + buffering)

Low-latency HLS (LL-HLS):
  - Smaller segments (2s instead of 6s)
  - Partial segments (chunked transfer encoding)
  - Latency: 3-5 seconds

WebRTC:
  - Peer-to-peer (or SFU: Selective Forwarding Unit)
  - Sub-second latency (<500ms)
  - Use case: Video calls, live auctions, real-time gaming

Trade-off:
  - HLS: scales well (millions of viewers), high latency
  - WebRTC: low latency, but expensive (no CDN; needs an SFU cluster)

6.3 Chat & Engagement

Live stream chat:
  - WebSocket or Server-Sent Events (SSE)
  - Message rate limiting: 1 msg/second per user
  - Moderation: Auto-filter spam, profanity
  - Super chat (paid messages): Priority display + revenue share

Implementation:
  - Redis Pub/Sub for chat messages
  - Kafka for persistent storage (chat replay)
  - Elasticsearch for chat search/moderation

7. Analytics & Metrics

7.1 Video Metrics

Engagement metrics:
  - View count (unique viewers)
  - Watch time (total minutes watched)
  - Completion rate (% users watched to end)
  - Average view duration
  - Like / dislike ratio
  - Comment count, share count

Quality of Service (QoS):
  - Buffering ratio (% time spent buffering)
  - Startup time (time to first frame)
  - Bitrate switches (ABR quality changes)
  - Error rate
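
The two QoS numbers most teams alert on, startup time and buffering ratio, can be derived directly from the raw event stream. The event names and tuple format here are illustrative assumptions, not a standard player API:

```python
def qos_metrics(events):
    """Compute startup time and buffering ratio from a timestamped event log.

    events: list of (t_seconds, name) tuples, ordered by time.
    Names assumed: play_intent, first_frame, buffer_start, buffer_end, ended.
    """
    start = next(t for t, name in events if name == "play_intent")
    first_frame = next(t for t, name in events if name == "first_frame")
    end = events[-1][0]

    buffering, buf_start = 0.0, None
    for t, name in events:
        if name == "buffer_start":
            buf_start = t
        elif name == "buffer_end" and buf_start is not None:
            buffering += t - buf_start
            buf_start = None

    session = end - first_frame
    return {
        "startup_time": first_frame - start,
        "buffering_ratio": buffering / session if session else 0.0,
    }
```

A 100-second session with one 2-second rebuffer yields a 2% buffering ratio; in production this aggregation would run in the Flink job described in 7.3.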

7.2 Real-time Event Tracking

// Client-side tracking
player.on('play', () => {
    trackEvent('video_play', { video_id, user_id, timestamp });
});

player.on('pause', () => {
    trackEvent('video_pause', { video_id, user_id, position });
});

player.on('ended', () => {
    trackEvent('video_complete', { video_id, user_id, watch_duration });
});

player.on('bufferstart', () => {
    trackEvent('buffer_start', { video_id, position, bitrate });
});

// Batch events → send every 10s or when the user closes the tab
const eventBuffer = [];
function trackEvent(event_type, data) {
    eventBuffer.push({ event_type, data, timestamp: Date.now() });
    
    if (eventBuffer.length >= 10) {
        flushEvents();
    }
}

function flushEvents() {
    fetch('/api/analytics/events', {
        method: 'POST',
        body: JSON.stringify(eventBuffer)
    });
    eventBuffer.length = 0;
}

Backend:

func HandleAnalyticsEvents(w http.ResponseWriter, r *http.Request) {
    var events []AnalyticsEvent
    if err := json.NewDecoder(r.Body).Decode(&events); err != nil {
        http.Error(w, "invalid payload", http.StatusBadRequest)
        return
    }
    
    // Publish to Kafka for real-time processing
    for _, event := range events {
        kafkaProducer.Publish("video-events", event)
    }
    
    // Batch insert into ClickHouse (an OLAP DB for analytics)
    clickhouse.BatchInsert(events)
}

7.3 Data Pipeline (Lambda Architecture)

Real-time (Flink / Spark Streaming):
  - Kafka → Flink → Redis (hot metrics: views in the last hour)
  - Update dashboards real-time

Batch (Spark / Presto):
  - S3 → Spark → Redshift / BigQuery
  - Daily aggregation: total views per video, revenue, retention
  - ML training data preparation

Serving:
  - Redis: Real-time counters
  - DynamoDB: Per-video metrics (fast lookup)
  - Redshift: Historical analytics (SQL queries)

8. Content Moderation

8.1 Copyright Detection (Content ID)

YouTube Content ID system:
  1. Rightsholders upload reference videos (songs, movies)
  2. System generates fingerprints (perceptual hash)
  3. When a user uploads a video:
     - Extract fingerprint
     - Match against reference DB
     - If match → claim (monetize, block, or track)

Implementation:
  - Audio fingerprinting: Chromaprint, AcoustID
  - Video fingerprinting: Perceptual hashing (frame-level)

import acoustid  # pyacoustid; wraps Chromaprint for audio fingerprinting

def generate_audio_fingerprint(video_path):
    # Extract audio from the video
    audio_path = extract_audio(video_path)  # ffmpeg
    
    # Generate fingerprint
    duration, fp = acoustid.fingerprint_file(audio_path)
    return fp

def check_copyright(video_path):
    fp = generate_audio_fingerprint(video_path)
    
    # Query the reference DB (exact match for brevity; production systems use
    # approximate matching, e.g. LSH, to tolerate re-encodes and edits)
    matches = db.query("SELECT * FROM reference_fingerprints WHERE fingerprint = %s", fp)
    
    if matches:
        return {
            "status": "COPYRIGHTED",
            "owner": matches[0]['owner'],
            "action": "MONETIZE"  # or BLOCK, TRACK
        }
    
    return {"status": "CLEAR"}

8.2 NSFW & Violence Detection

from transformers import pipeline

# Model name is illustrative; a production system would use dedicated NSFW/violence classifiers
classifier = pipeline("video-classification", model="MCG-NJU/videomae-base")

def detect_nsfw_violence(video_path):
    # Sample frames (every 5 seconds)
    frames = extract_frames(video_path, interval=5)
    
    nsfw_scores = []
    for frame in frames:
        result = classifier(frame)
        # result = [{'label': 'nsfw', 'score': 0.95}, ...]
        
        nsfw_score = next((r['score'] for r in result if 'nsfw' in r['label'].lower()), 0)
        violence_score = next((r['score'] for r in result if 'violence' in r['label'].lower()), 0)
        
        nsfw_scores.append(max(nsfw_score, violence_score))
    
    avg_score = sum(nsfw_scores) / len(nsfw_scores)
    
    if avg_score > 0.8:
        return "REJECT"
    elif avg_score > 0.5:
        return "MANUAL_REVIEW"
    else:
        return "APPROVED"

9. Monetization

9.1 Ad Insertion (Server-Side Ad Insertion - SSAI)

Client requests video:
  GET /videos/abc123/master.m3u8

Server injects ads into the playlist:
  - Pre-roll: ad before the video
  - Mid-roll: ads in the middle (at specific timestamps)
  - Post-roll: ad after the video

Modified playlist:
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000
720p/playlist.m3u8

720p/playlist.m3u8:
#EXTM3U
#EXTINF:6.0,
ad_segment_000.ts  ← Ad pre-roll
#EXTINF:6.0,
ad_segment_001.ts
#EXTINF:6.0,
segment_000.ts     ← Actual video
#EXTINF:6.0,
segment_001.ts
#EXTINF:6.0,
ad_segment_002.ts  ← Ad mid-roll
...

SSAI vs CSAI (Client-Side Ad Insertion):

  • SSAI: ads are stitched into the stream → ad blockers cannot remove them
  • CSAI: the client fetches ads separately → easy to block
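
A minimal sketch of the SSAI stitching step: splice ad segments into the media playlist before serving it. Segment names, the fixed 6-second duration, and reusing one ad pod for the mid-roll are simplifying assumptions (real SSAI also inserts #EXT-X-DISCONTINUITY tags at ad boundaries):

```python
def stitch_ads(content_segs, ad_segs, midroll_after=None):
    """Build a media playlist with ads spliced in (pre-roll + optional mid-roll).

    Sketch only: fixed 6s durations, one ad pod reused for the mid-roll.
    """
    order = list(ad_segs)                     # pre-roll ads first
    if midroll_after is None:
        order += content_segs
    else:
        order += content_segs[:midroll_after]
        order += ad_segs                      # mid-roll pod
        order += content_segs[midroll_after:]

    lines = ["#EXTM3U", "#EXT-X-TARGETDURATION:6", "#EXT-X-VERSION:3"]
    for seg in order:
        lines += ["#EXTINF:6.0,", seg]
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines)
```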

9.2 Subscription Model

Tiered pricing:
  - Free: 480p max, ads
  - Premium ($5/mo): 1080p, no ads
  - Ultra ($10/mo): 4K, no ads, offline download

Implementation:
  - Check the user's tier when a video is requested
  - Serve the matching playlist (cap the resolution)
  - Inject ads based on tier

func GetVideoPlaylist(w http.ResponseWriter, r *http.Request) {
    videoID := r.URL.Query().Get("id")
    userID := getAuthenticatedUser(r)
    
    // Check subscription tier
    tier := db.GetUserTier(userID)
    
    var maxResolution string
    var includeAds bool
    
    switch tier {
    case "free":
        maxResolution = "480p"
        includeAds = true
    case "premium":
        maxResolution = "1080p"
        includeAds = false
    case "ultra":
        maxResolution = "4k"
        includeAds = false
    }
    
    // Generate playlist
    playlist := generatePlaylist(videoID, maxResolution, includeAds)
    
    w.Header().Set("Content-Type", "application/vnd.apple.mpegurl")
    w.Write([]byte(playlist))
}

10. Interview Questions

Q: Design YouTube: the upload, processing, and playback flow?

Components:

  • Upload API (presigned URLs, multipart upload)
  • Transcoding pipeline (Kafka + worker pool with FFmpeg)
  • Object storage (S3) + CDN (CloudFront)
  • Metadata DB (PostgreSQL sharded by video_id)
  • Analytics pipeline (Kafka → Flink → ClickHouse)

Bottlenecks:

  • Transcoding: CPU-intensive → scale workers horizontally
  • CDN cost: ~70% of cost is bandwidth → optimize with compression, ABR
  • Storage: petabytes → S3 Glacier for old videos

Q: How do you serve 1 million concurrent live viewers?

Answer:

  • HLS with 6s segments → viewers pull segments every 6s (no persistent connections)
  • CDN edge caching → the origin serves one copy; the CDN replicates it
  • Anycast routing → geographic load distribution
  • Tiered CDN: Edge PoP → Regional PoP → Origin

Math:

  • 1M viewers × 2.5 Mbps = 2.5 Tbps aggregate egress (far beyond any single origin)
  • With a ~95% CDN cache hit ratio, only ~5% (~125 Gbps) reaches the origin tier;
    in practice far less, since each segment is fetched once per PoP, not per viewer

Q: Tối ưu video startup time?

Answer:

  1. Preload initial segment: Fetch first segment before user clicks play
  2. Smaller initial segment: 2s instead of 6s (faster to download)
  3. Lower initial bitrate: Start at 360p, upgrade once bandwidth is stable
  4. CDN optimization: Use the PoP nearest to the user
  5. HTTP/2 multiplexing: Parallel fetch segments + playlist

Q: Copyright detection at scale — 500 uploads/minute?

Answer:

  • Async processing: upload → Kafka → workers generate fingerprints
  • Sharded fingerprint DB: hash-based sharding for billions of fingerprints
  • Approximate matching: Locality-Sensitive Hashing (LSH) to tolerate variations
  • Priority queue: popular uploaders → fast-tracked review
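
The LSH idea from the last two answers can be sketched with the classic banding trick: split each fingerprint into bands and treat any identical band as a candidate match, so small variations elsewhere do not prevent detection. Bit-string fingerprints and 4 bands are illustrative assumptions:

```python
def band_keys(fingerprint_bits, bands=4):
    """Split a bit-string fingerprint into (index, band) keys for LSH banding.

    Two fingerprints are candidate matches if ANY band is identical.
    """
    size = len(fingerprint_bits) // bands
    return {(i, fingerprint_bits[i * size:(i + 1) * size]) for i in range(bands)}

def candidate_match(fp_a, fp_b, bands=4):
    """True if the two fingerprints share at least one band."""
    return bool(band_keys(fp_a, bands) & band_keys(fp_b, bands))
```

In production the band keys would be the shard/lookup keys in the fingerprint DB, so a single-band hit retrieves candidates for an exact verification pass.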

Summary

Video streaming = a high-bandwidth, CPU-intensive, storage-heavy system.

Key challenges:

  • Transcoding cost: CPU-intensive → distributed workers
  • CDN cost: ~70% of operational cost → optimize compression, ABR, caching
  • Storage cost: petabytes → tiered storage (hot/warm/cold)
  • Global distribution: low-latency playback → multi-region CDN
  • Recommendation: billions of videos → two-stage ranking, ML models

Key patterns:

  • HLS/DASH for adaptive streaming
  • Presigned URLs for direct upload (bypassing the backend)
  • Multi-tier CDN (edge → origin)
  • Async transcoding pipeline (Kafka + workers)
  • ML-based ranking (candidate generation + re-ranking)
