🏭 Domains · ✍️ Khoa · 📅 19/04/2026 · ☕ 15 min read

Domain: Video Streaming Platform

YouTube serves 1 billion hours of video every day. Netflix accounts for roughly 15% of global internet bandwidth. TikTok handles on the order of 1 billion video uploads per month. Behind these numbers sits some of the most complex infrastructure in tech: video encoding pipelines, adaptive bitrate streaming, CDN optimization, and content recommendation at massive scale.

This section describes how to design a video platform like YouTube/TikTok at production scale.


1. Video Upload Pipeline

1.1 Upload Flow Architecture

┌──────────────┐         ┌────────────┐         ┌──────────────────┐
│  Client App  │────────►│ Upload API │────────►│ Object Storage   │
│              │         │ (presigned)│         │ (S3 / GCS)       │
└──────────────┘         └────────────┘         └────────┬─────────┘
                                                          │
                                                          ▼
                                                  ┌──────────────────┐
                                                  │ Message Queue    │
                                                  │ (Kafka / SQS)    │
                                                  └────────┬─────────┘
                                                           │
                     ┌─────────────────────────────────────┼────────────────┐
                     │                                     │                │
                     ▼                                     ▼                ▼
            ┌─────────────────┐              ┌─────────────────────────────────┐
            │ Transcoding     │              │ Thumbnail Generation            │
            │ Worker Pool     │              │ + Metadata Extraction           │
            │ (FFmpeg)        │              └─────────────────────────────────┘
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │ Multiple        │
            │ Resolutions     │
            │ (360p - 4K)     │
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │ CDN Upload      │
            │ (CloudFront,    │
            │  Cloudflare)    │
            └─────────────────┘

1.2 Presigned URL for Direct Upload

import (
    "fmt"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func GeneratePresignedUploadURL(filename string, userID int64) (string, error) {
    sess := session.Must(session.NewSession())
    svc := s3.New(sess)
    
    // Unique S3 key
    key := fmt.Sprintf("uploads/%d/%d-%s", userID, time.Now().Unix(), filename)
    
    req, _ := svc.PutObjectRequest(&s3.PutObjectInput{
        Bucket: aws.String("my-video-bucket"),
        Key:    aws.String(key),
    })
    
    // Presigned URL valid for 30 minutes
    url, err := req.Presign(30 * time.Minute)
    if err != nil {
        return "", err
    }
    
    // Save upload record to DB (status: UPLOADING)
    video := &Video{
        UserID:    userID,
        S3Key:     key,
        Status:    "UPLOADING",
        CreatedAt: time.Now(),
    }
    db.Insert(video)
    
    return url, nil
}

The client uploads directly to S3 (bypassing the backend, which cuts bandwidth cost).

1.3 Resumable Upload (Multipart)

// Frontend: upload large files in resumable chunks
async function uploadVideo(file) {
    const CHUNK_SIZE = 5 * 1024 * 1024; // 5MB chunks
    const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
    
    // Step 1: Initiate multipart upload
    const { uploadId, uploadUrl } = await fetch('/api/videos/upload/init', {
        method: 'POST',
        body: JSON.stringify({ filename: file.name, filesize: file.size })
    }).then(r => r.json());
    
    // Step 2: Upload chunks
    const uploadedParts = [];
    for (let i = 0; i < totalChunks; i++) {
        const start = i * CHUNK_SIZE;
        const end = Math.min(start + CHUNK_SIZE, file.size);
        const chunk = file.slice(start, end);
        
        const partNumber = i + 1;
        const partUrl = `${uploadUrl}?partNumber=${partNumber}&uploadId=${uploadId}`;
        
        const response = await fetch(partUrl, {
            method: 'PUT',
            body: chunk
        });
        
        const etag = response.headers.get('ETag');
        uploadedParts.push({ PartNumber: partNumber, ETag: etag });
        
        // Update progress
        const progress = Math.round((i + 1) / totalChunks * 100);
        updateProgressBar(progress);
    }
    
    // Step 3: Complete multipart upload
    await fetch('/api/videos/upload/complete', {
        method: 'POST',
        body: JSON.stringify({ uploadId, parts: uploadedParts })
    });
}

Benefit: if the upload fails at chunk 50/100, only that chunk needs to be retried (no restarting from the beginning).


2. Video Transcoding — FFmpeg Pipeline

2.1 Adaptive Bitrate Streaming (ABR)

Create multiple versions of the video at different resolutions/bitrates:

Original: 4K 3840×2160 @ 20 Mbps
  ↓
Transcoding:
  - 1080p @ 5 Mbps
  - 720p  @ 2.5 Mbps
  - 480p  @ 1 Mbps
  - 360p  @ 0.5 Mbps

Client automatically switches resolution based on measured bandwidth
  → Smooth playback, no buffering
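
The switching rule above can be sketched as a tiny client-side heuristic: pick the highest rung of the bitrate ladder that fits within a safety margin of the measured throughput. The ladder mirrors the table above; the 0.8 safety factor is an illustrative assumption, not a standard.

```python
# Illustrative sketch of ABR rendition selection (ladder and safety factor
# are assumptions; real players use smoothed estimates and buffer state).
LADDER = [  # (label, bitrate in bits per second), sorted high -> low
    ("1080p", 5_000_000),
    ("720p", 2_500_000),
    ("480p", 1_000_000),
    ("360p", 500_000),
]

def pick_rendition(measured_bps, safety=0.8):
    """Return the highest rendition whose bitrate fits within safety * bandwidth."""
    budget = measured_bps * safety
    for label, bitrate in LADDER:
        if bitrate <= budget:
            return label
    return LADDER[-1][0]  # always fall back to the lowest rung
```

Real players (hls.js, Shaka Player) combine smoothed throughput estimates with buffer occupancy rather than a single measurement, but the core decision looks like this.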

2.2 FFmpeg Command

# Transcode to multiple resolutions
ffmpeg -i input.mp4 \
  -vf scale=1920:1080 -c:v libx264 -preset medium -crf 23 -b:v 5M -maxrate 5M -bufsize 10M -c:a aac -b:a 128k output_1080p.mp4 \
  -vf scale=1280:720  -c:v libx264 -preset medium -crf 23 -b:v 2.5M -maxrate 2.5M -bufsize 5M -c:a aac -b:a 128k output_720p.mp4 \
  -vf scale=854:480   -c:v libx264 -preset medium -crf 23 -b:v 1M -maxrate 1M -bufsize 2M -c:a aac -b:a 96k output_480p.mp4 \
  -vf scale=640:360   -c:v libx264 -preset medium -crf 23 -b:v 500k -maxrate 500k -bufsize 1M -c:a aac -b:a 64k output_360p.mp4

# Generate HLS manifest
ffmpeg -i input.mp4 \
  -map 0:v:0 -map 0:a:0 -map 0:v:0 -map 0:a:0 \
  -map 0:v:0 -map 0:a:0 -map 0:v:0 -map 0:a:0 \
  -c:v libx264 -c:a aac \
  -b:v:0 5M -s:v:0 1920x1080 \
  -b:v:1 2.5M -s:v:1 1280x720 \
  -b:v:2 1M -s:v:2 854x480 \
  -b:v:3 500k -s:v:3 640x360 \
  -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2 v:3,a:3" \
  -master_pl_name master.m3u8 \
  -f hls -hls_time 6 -hls_list_size 0 \
  -hls_segment_filename "v%v/segment_%03d.ts" \
  "v%v/playlist.m3u8"

Parameters:

  • -c:v libx264: H.264 codec (widely supported)
  • -preset medium: Encoding speed vs quality tradeoff
  • -crf 23: Quality (18-28, lower = better quality)
  • -hls_time 6: 6-second segments (shorter = faster ABR switching, but more overhead)

2.3 Distributed Transcoding Workers

type TranscodingWorker struct {
    s3         *s3.Client
    kafka      *kafka.Consumer
    ffmpegPath string
}

func (w *TranscodingWorker) Start() {
    for {
        msg := w.kafka.ReadMessage()
        var job TranscodeJob
        json.Unmarshal(msg.Value, &job)
        
        if err := w.processJob(job); err != nil {
            // Retry or send to a dead-letter queue (DLQ)
            log.Error("Transcode failed", "job", job.VideoID, "err", err)
        }
    }
}

func (w *TranscodingWorker) processJob(job TranscodeJob) error {
    // Step 1: Download the original from S3
    inputPath := fmt.Sprintf("/tmp/%s.mp4", job.VideoID)
    if err := w.s3.DownloadFile(job.S3Key, inputPath); err != nil {
        return err
    }
    defer os.Remove(inputPath)
    
    // Step 2: Transcode to multiple resolutions
    resolutions := []string{"1080p", "720p", "480p", "360p"}
    for _, res := range resolutions {
        outputPath := fmt.Sprintf("/tmp/%s_%s.mp4", job.VideoID, res)
        
        cmd := exec.Command(w.ffmpegPath,
            "-i", inputPath,
            "-vf", getScaleFilter(res),
            "-c:v", "libx264",
            "-preset", "medium",
            "-crf", "23",
            "-b:v", getBitrate(res),
            "-c:a", "aac",
            "-b:a", "128k",
            outputPath,
        )
        
        if err := cmd.Run(); err != nil {
            return fmt.Errorf("ffmpeg failed for %s: %w", res, err)
        }
        
        // Step 3: Upload to S3
        s3Key := fmt.Sprintf("videos/%s/%s.mp4", job.VideoID, res)
        if err := w.s3.UploadFile(outputPath, s3Key); err != nil {
            return err
        }
        
        os.Remove(outputPath)
    }
    
    // Step 4: Generate HLS playlist
    if err := w.generateHLS(job); err != nil {
        return err
    }
    
    // Step 5: Update DB status
    db.Exec(`UPDATE videos SET status = 'READY', processed_at = now() WHERE id = $1`, job.VideoID)
    
    return nil
}

Scale: horizontal. Spin up 100 workers when 10k videos are sitting in the queue.
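
That autoscaling decision can be sketched as simple arithmetic: size the pool so the backlog drains within a target window. The numbers (average transcode time, drain target) are illustrative assumptions.

```python
import math

def workers_needed(queue_len, avg_transcode_min, target_drain_min,
                   per_worker_parallel=1):
    """Rough pool size so the transcode backlog drains within the target window.

    Illustrative sketch: jobs each worker can finish in the window is
    target_drain_min / avg_transcode_min (times its parallelism).
    """
    if queue_len == 0:
        return 0
    jobs_per_worker = (target_drain_min / avg_transcode_min) * per_worker_parallel
    return math.ceil(queue_len / jobs_per_worker)
```

For example, 10k queued videos at 10 minutes each, drained within an hour, needs ~1,667 single-job workers; a real autoscaler would also smooth the queue length and cap the pool.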


3. HLS (HTTP Live Streaming)

3.1 HLS Structure

master.m3u8 (manifest):
  Lists all available variants (resolutions)

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/playlist.m3u8

───────────────────────────────────────────────────

720p/playlist.m3u8 (variant):
  Lists segments for this resolution

#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-VERSION:3
#EXTINF:6.0,
segment_000.ts
#EXTINF:6.0,
segment_001.ts
#EXTINF:6.0,
segment_002.ts
...
#EXT-X-ENDLIST

Player logic:

  1. Fetch master.m3u8
  2. Measure bandwidth
  3. Select appropriate variant (e.g., 720p)
  4. Fetch 720p/playlist.m3u8
  5. Download segments sequentially
  6. Monitor bandwidth → switch variant if needed
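
Steps 1-3 of the player logic above can be sketched as a minimal master-playlist parser plus variant picker. This is a simplified assumption of what a player does; real players also handle CODECS attributes, audio groups, and smoothed bandwidth estimates.

```python
import re

MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/playlist.m3u8
"""

def parse_master(text):
    """Return (bandwidth, uri) pairs from a master playlist (sketch)."""
    variants = []
    lines = text.strip().splitlines()
    for i, line in enumerate(lines):
        m = re.search(r"#EXT-X-STREAM-INF:.*BANDWIDTH=(\d+)", line)
        if m:
            variants.append((int(m.group(1)), lines[i + 1]))  # URI on next line
    return variants

def select_variant(variants, measured_bps):
    """Highest-bandwidth variant that fits the measured throughput."""
    fitting = [v for v in variants if v[0] <= measured_bps]
    return max(fitting) if fitting else min(variants)
```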

3.2 DASH (Alternative to HLS)

DASH = Dynamic Adaptive Streaming over HTTP (ISO standard)
HLS = Apple's protocol (originally proprietary, now the de-facto standard)

Differences:
  - HLS: .m3u8 manifest, .ts segments
  - DASH: .mpd manifest, .mp4 segments
  - DASH: More flexible, supports more codecs
  - HLS: Better Apple ecosystem support

In practice: serve both (HLS for iOS/Safari, DASH for the rest), or HLS only (it now enjoys near-universal support).

4. CDN Strategy

4.1 Multi-tier CDN

┌─────────────┐
│   Origin    │ ← S3 / Origin servers (1 copy)
│  (Storage)  │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Edge CDN   │ ← CloudFront, Cloudflare (global PoPs)
│  (Cache)    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  ISP Cache  │ ← Netflix Open Connect (at ISP datacenters)
│  (Optional) │
└─────────────┘

Netflix Open Connect: deploys cache servers directly at ISPs → 95% of traffic served locally.

4.2 Cache Strategy

Cache-Control headers:
  - Video segments (.ts): max-age=31536000 (1 year, immutable)
  - Playlists (.m3u8): max-age=10 (10 seconds, so live streams can update)
  - Thumbnails: max-age=86400 (1 day)

Range requests:
  Let the client fetch part of a video (for seeking) without downloading the whole file

GET /videos/abc123/720p/segment_010.ts
Range: bytes=0-1048575  (first 1MB)

Response:
HTTP/1.1 206 Partial Content
Content-Range: bytes 0-1048575/5242880
Content-Length: 1048576
Cache-Control: public, max-age=31536000, immutable
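
The caching policy above can be expressed as a small lookup, e.g. in origin middleware or an edge worker. The extension-to-header mapping is a sketch of the values listed in 4.2; the `no-store` default is an assumption.

```python
def cache_headers(path):
    """Map an asset path to its Cache-Control value (sketch of the 4.2 policy)."""
    if path.endswith(".ts"):
        return "public, max-age=31536000, immutable"   # segments never change
    if path.endswith(".m3u8"):
        return "public, max-age=10"                    # playlists update for live
    if path.endswith((".jpg", ".png", ".webp")):
        return "public, max-age=86400"                 # thumbnails, 1 day
    return "no-store"                                  # assumed default: don't cache
```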

4.3 Geographic Load Balancing

DNS-based routing (Route 53, Cloudflare):
  A user in Vietnam → routed to the Singapore PoP
  A user in the US → routed to the Virginia PoP

Anycast IP:
  The same IP address is announced from multiple locations
  → Routers automatically pick the nearest path

5. Video Recommendation Engine

5.1 Collaborative Filtering

User-Item Matrix:

           Video1  Video2  Video3  Video4
User A       5       ?       3       ?
User B       4       2       ?       5
User C       ?       3       4       4

Predict the rating User A would give Video 2:
  → Find similar users (User B and C have rated Video 2)
  → Take a weighted average of their ratings
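
A minimal sketch of that neighbour-based prediction, using the toy matrix above. Note the caveat: with only one co-rated item the cosine degenerates to 1, which is why real systems require a minimum overlap and mean-center ratings.

```python
import math

RATINGS = {  # the toy user-item matrix above; missing = unrated
    "A": {"v1": 5, "v3": 3},
    "B": {"v1": 4, "v2": 2, "v4": 5},
    "C": {"v2": 3, "v3": 4, "v4": 4},
}

def cosine(u, v):
    """Cosine similarity over items both users rated (0 if no overlap)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    """Similarity-weighted average of neighbours' ratings for the item."""
    num = den = 0.0
    for other, ratings in RATINGS.items():
        if other == user or item not in ratings:
            continue
        sim = cosine(RATINGS[user], ratings)
        num += sim * ratings[item]
        den += sim
    return num / den if den else None
```

Here `predict("A", "v2")` averages B's 2 and C's 3 with equal weight, giving 2.5.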

Matrix Factorization (ALS):

from pyspark.ml.recommendation import ALS

# Load interaction data (user_id, video_id, rating)
interactions = spark.read.parquet("s3://data/interactions.parquet")

als = ALS(
    maxIter=10,
    regParam=0.1,
    userCol="user_id",
    itemCol="video_id",
    ratingCol="rating",
    coldStartStrategy="drop"
)

model = als.fit(interactions)

# Generate recommendations for all users
user_recs = model.recommendForAllUsers(20)  # top 20 per user

# Save to serving layer (Redis / DynamoDB)
user_recs.write.format("redis").save()

5.2 Content-based Filtering

Video embeddings:
  - Title/description: BERT embeddings (768-dim vector)
  - Thumbnail: ResNet embeddings (2048-dim vector)
  - Audio: VGGish embeddings (128-dim vector)
  - Category, tags (one-hot encoding)

Cosine similarity:
  sim(video1, video2) = (emb1 · emb2) / (||emb1|| × ||emb2||)

User profile embedding:
  Weighted average of the embeddings of videos the user has watched
  → Recommend videos whose embedding is close to the user profile

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def recommend_similar_videos(video_id, top_k=10):
    # Fetch video embedding
    video_emb = redis.get(f"video_emb:{video_id}")
    video_emb = np.frombuffer(video_emb, dtype=np.float32)
    
    # Fetch all video embeddings (or use Faiss for billion-scale)
    all_embeddings = load_all_embeddings()  # shape (N, 768)
    
    # Compute similarity
    similarities = cosine_similarity([video_emb], all_embeddings)[0]
    
    # Top-K
    top_indices = np.argsort(similarities)[::-1][:top_k]
    top_video_ids = [index_to_video_id[i] for i in top_indices]
    
    return top_video_ids

5.3 Two-stage Ranking (YouTube-style)

Stage 1: Candidate Generation (retrieve top 1000 from millions)
  - User history-based CF
  - Content-based similarity
  - Trending videos (time decay)
  - Subscribed channels' new uploads

Stage 2: Ranking (re-rank 1000 → 20)
  - Deep Neural Network with features:
    * User demographics, watch history, engagement rate
    * Video metadata, quality score, upload recency
    * Context: time of day, device, location
  - Objective: Predict watch time (not just CTR)
    → Optimize for engagement, not clickbait

import tensorflow as tf

# Simplified ranking model
def build_ranking_model():
    # Input features
    user_features = tf.keras.Input(shape=(50,), name='user_features')
    video_features = tf.keras.Input(shape=(100,), name='video_features')
    context_features = tf.keras.Input(shape=(10,), name='context_features')
    
    # Concatenate
    concat = tf.keras.layers.Concatenate()([user_features, video_features, context_features])
    
    # Deep layers
    x = tf.keras.layers.Dense(256, activation='relu')(concat)
    x = tf.keras.layers.Dropout(0.3)(x)
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Output: predicted watch time (regression)
    output = tf.keras.layers.Dense(1, activation='linear', name='watch_time')(x)
    
    model = tf.keras.Model(
        inputs=[user_features, video_features, context_features],
        outputs=output
    )
    
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model

# Training
model = build_ranking_model()
model.fit(
    [train_user_feat, train_video_feat, train_context_feat],
    train_watch_time,
    epochs=10,
    batch_size=512,
    validation_split=0.2
)

# Inference
predictions = model.predict([user_feat, video_feat, context_feat])
ranked_videos = sorted(zip(candidate_videos, predictions), key=lambda x: x[1], reverse=True)

6. Live Streaming

6.1 RTMP Ingestion

Streamer                  Ingest Server               CDN
  │                           │                        │
  │─ RTMP stream ────────────►│                        │
  │  1920x1080 @ 6Mbps        │                        │
  │                           │─ Transcode ────────────►│
  │                           │  (multiple bitrates)   │
  │                           │                        │
  │                           │─ HLS/DASH Packaging ───►│
  │                           │                        │
  │                           │                        │
Viewers                       │                        │
  │                           │                        │
  │◄────────────────── HLS segments ───────────────────┤

RTMP: Real-Time Messaging Protocol (Adobe)

  • Legacy, but still used for ingestion (OBS and Streamlabs broadcast over RTMP)
  • Delivery: HLS or DASH (RTMP does not scale for playback)

6.2 Low-latency Streaming

Standard HLS: 15-30 seconds of latency (due to segment size + buffering)

Low-latency HLS (LL-HLS):
  - Smaller segments (2s instead of 6s)
  - Partial segments (chunked transfer encoding)
  - Latency: 3-5 seconds

WebRTC:
  - Peer-to-peer (or SFU: Selective Forwarding Unit)
  - Sub-second latency (<500ms)
  - Use case: Video calls, live auctions, real-time gaming

Trade-off:
  - HLS: scales well (millions of viewers), high latency
  - WebRTC: low latency, but expensive (no CDN; needs an SFU cluster)

6.3 Chat & Engagement

Live stream chat:
  - WebSocket or Server-Sent Events (SSE)
  - Message rate limiting: 1 msg/second per user
  - Moderation: Auto-filter spam, profanity
  - Super chat (paid messages): Priority display + revenue share

Implementation:
  - Redis Pub/Sub for chat messages
  - Kafka for persistent storage (chat replay)
  - Elasticsearch for chat search/moderation

7. Analytics & Metrics

7.1 Video Metrics

Engagement metrics:
  - View count (unique viewers)
  - Watch time (total minutes watched)
  - Completion rate (% users watched to end)
  - Average view duration
  - Like / dislike ratio
  - Comment count, share count

Quality of Service (QoS):
  - Buffering ratio (% time spent buffering)
  - Startup time (time to first frame)
  - Bitrate switches (ABR quality changes)
  - Error rate
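
The two QoS numbers most teams alert on, startup time and buffering ratio, can be derived directly from the raw event stream. The event names and tuple format here are illustrative assumptions, not a standard player API:

```python
def qos_metrics(events):
    """Compute startup time and buffering ratio from a timestamped event log.

    events: list of (t_seconds, name) tuples, ordered by time.
    Names assumed: play_intent, first_frame, buffer_start, buffer_end, ended.
    """
    start = next(t for t, name in events if name == "play_intent")
    first_frame = next(t for t, name in events if name == "first_frame")
    end = events[-1][0]

    buffering, buf_start = 0.0, None
    for t, name in events:
        if name == "buffer_start":
            buf_start = t
        elif name == "buffer_end" and buf_start is not None:
            buffering += t - buf_start
            buf_start = None

    session = end - first_frame
    return {
        "startup_time": first_frame - start,
        "buffering_ratio": buffering / session if session else 0.0,
    }
```

A 100-second session with one 2-second rebuffer yields a 2% buffering ratio; in production this aggregation would run in the Flink job described in 7.3.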

7.2 Real-time Event Tracking

// Client-side tracking
player.on('play', () => {
    trackEvent('video_play', { video_id, user_id, timestamp });
});

player.on('pause', () => {
    trackEvent('video_pause', { video_id, user_id, position });
});

player.on('ended', () => {
    trackEvent('video_complete', { video_id, user_id, watch_duration });
});

player.on('bufferstart', () => {
    trackEvent('buffer_start', { video_id, position, bitrate });
});

// Batch events → send every 10s or when the user closes the tab
const eventBuffer = [];
function trackEvent(event_type, data) {
    eventBuffer.push({ event_type, data, timestamp: Date.now() });
    
    if (eventBuffer.length >= 10) {
        flushEvents();
    }
}

function flushEvents() {
    fetch('/api/analytics/events', {
        method: 'POST',
        body: JSON.stringify(eventBuffer)
    });
    eventBuffer.length = 0;
}

Backend:

func HandleAnalyticsEvents(w http.ResponseWriter, r *http.Request) {
    var events []AnalyticsEvent
    if err := json.NewDecoder(r.Body).Decode(&events); err != nil {
        http.Error(w, "invalid payload", http.StatusBadRequest)
        return
    }
    
    // Publish to Kafka for real-time processing
    for _, event := range events {
        kafkaProducer.Publish("video-events", event)
    }
    
    // Batch insert into ClickHouse (an OLAP DB for analytics)
    clickhouse.BatchInsert(events)
}

7.3 Data Pipeline (Lambda Architecture)

Real-time (Flink / Spark Streaming):
  - Kafka → Flink → Redis (hot metrics: views in the last hour)
  - Update dashboards real-time

Batch (Spark / Presto):
  - S3 → Spark → Redshift / BigQuery
  - Daily aggregation: total views per video, revenue, retention
  - ML training data preparation

Serving:
  - Redis: Real-time counters
  - DynamoDB: Per-video metrics (fast lookup)
  - Redshift: Historical analytics (SQL queries)

8. Content Moderation

8.1 Copyright Detection (Content ID)

YouTube Content ID system:
  1. Rightsholders upload reference videos (songs, movies)
  2. System generates fingerprints (perceptual hash)
  3. When a user uploads a video:
     - Extract fingerprint
     - Match against reference DB
     - If match → claim (monetize, block, or track)

Implementation:
  - Audio fingerprinting: Chromaprint, AcoustID
  - Video fingerprinting: Perceptual hashing (frame-level)

import acoustid  # pyacoustid; wraps Chromaprint for audio fingerprinting

def generate_audio_fingerprint(video_path):
    # Extract audio from the video
    audio_path = extract_audio(video_path)  # ffmpeg
    
    # Generate fingerprint
    duration, fp = acoustid.fingerprint_file(audio_path)
    return fp

def check_copyright(video_path):
    fp = generate_audio_fingerprint(video_path)
    
    # Query the reference DB (exact match for brevity; production systems use
    # approximate matching, e.g. LSH, to tolerate re-encodes and edits)
    matches = db.query("SELECT * FROM reference_fingerprints WHERE fingerprint = %s", fp)
    
    if matches:
        return {
            "status": "COPYRIGHTED",
            "owner": matches[0]['owner'],
            "action": "MONETIZE"  # or BLOCK, TRACK
        }
    
    return {"status": "CLEAR"}

8.2 NSFW & Violence Detection

from transformers import pipeline

# Model name is illustrative; a production system would use dedicated NSFW/violence classifiers
classifier = pipeline("video-classification", model="MCG-NJU/videomae-base")

def detect_nsfw_violence(video_path):
    # Sample frames (every 5 seconds)
    frames = extract_frames(video_path, interval=5)
    
    nsfw_scores = []
    for frame in frames:
        result = classifier(frame)
        # result = [{'label': 'nsfw', 'score': 0.95}, ...]
        
        nsfw_score = next((r['score'] for r in result if 'nsfw' in r['label'].lower()), 0)
        violence_score = next((r['score'] for r in result if 'violence' in r['label'].lower()), 0)
        
        nsfw_scores.append(max(nsfw_score, violence_score))
    
    avg_score = sum(nsfw_scores) / len(nsfw_scores)
    
    if avg_score > 0.8:
        return "REJECT"
    elif avg_score > 0.5:
        return "MANUAL_REVIEW"
    else:
        return "APPROVED"

9. Monetization

9.1 Ad Insertion (Server-Side Ad Insertion - SSAI)

Client requests video:
  GET /videos/abc123/master.m3u8

Server injects ads into the playlist:
  - Pre-roll: ad before the video
  - Mid-roll: ads in the middle (at specific timestamps)
  - Post-roll: ad after the video

Modified playlist:
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000
720p/playlist.m3u8

720p/playlist.m3u8:
#EXTM3U
#EXTINF:6.0,
ad_segment_000.ts  ← Ad pre-roll
#EXTINF:6.0,
ad_segment_001.ts
#EXTINF:6.0,
segment_000.ts     ← Actual video
#EXTINF:6.0,
segment_001.ts
#EXTINF:6.0,
ad_segment_002.ts  ← Ad mid-roll
...

SSAI vs CSAI (Client-Side Ad Insertion):

  • SSAI: ads are stitched into the stream → ad blockers cannot remove them
  • CSAI: the client fetches ads separately → easy to block
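
A minimal sketch of the SSAI stitching step: splice ad segments into the media playlist before serving it. Segment names, the fixed 6-second duration, and reusing one ad pod for the mid-roll are simplifying assumptions (real SSAI also inserts #EXT-X-DISCONTINUITY tags at ad boundaries):

```python
def stitch_ads(content_segs, ad_segs, midroll_after=None):
    """Build a media playlist with ads spliced in (pre-roll + optional mid-roll).

    Sketch only: fixed 6s durations, one ad pod reused for the mid-roll.
    """
    order = list(ad_segs)                     # pre-roll ads first
    if midroll_after is None:
        order += content_segs
    else:
        order += content_segs[:midroll_after]
        order += ad_segs                      # mid-roll pod
        order += content_segs[midroll_after:]

    lines = ["#EXTM3U", "#EXT-X-TARGETDURATION:6", "#EXT-X-VERSION:3"]
    for seg in order:
        lines += ["#EXTINF:6.0,", seg]
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines)
```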

9.2 Subscription Model

Tiered pricing:
  - Free: 480p max, ads
  - Premium ($5/mo): 1080p, no ads
  - Ultra ($10/mo): 4K, no ads, offline download

Implementation:
  - Check the user's tier when a video is requested
  - Serve the matching playlist (cap the resolution)
  - Inject ads based on tier

func GetVideoPlaylist(w http.ResponseWriter, r *http.Request) {
    videoID := r.URL.Query().Get("id")
    userID := getAuthenticatedUser(r)
    
    // Check subscription tier
    tier := db.GetUserTier(userID)
    
    var maxResolution string
    var includeAds bool
    
    switch tier {
    case "free":
        maxResolution = "480p"
        includeAds = true
    case "premium":
        maxResolution = "1080p"
        includeAds = false
    case "ultra":
        maxResolution = "4k"
        includeAds = false
    }
    
    // Generate playlist
    playlist := generatePlaylist(videoID, maxResolution, includeAds)
    
    w.Header().Set("Content-Type", "application/vnd.apple.mpegurl")
    w.Write([]byte(playlist))
}

10. Interview Questions

Q: Design YouTube: the upload, processing, and playback flow?

Components:

  • Upload API (presigned URLs, multipart upload)
  • Transcoding pipeline (Kafka + worker pool with FFmpeg)
  • Object storage (S3) + CDN (CloudFront)
  • Metadata DB (PostgreSQL sharded by video_id)
  • Analytics pipeline (Kafka → Flink → ClickHouse)

Bottlenecks:

  • Transcoding: CPU-intensive → scale workers horizontally
  • CDN cost: ~70% of cost is bandwidth → optimize with compression, ABR
  • Storage: petabytes → S3 Glacier for old videos

Q: How do you serve 1 million concurrent live viewers?

Answer:

  • HLS with 6s segments → viewers pull segments every 6s (no persistent connections)
  • CDN edge caching → the origin serves one copy; the CDN replicates it
  • Anycast routing → geographic load distribution
  • Tiered CDN: Edge PoP → Regional PoP → Origin

Math:

  • 1M viewers × 2.5 Mbps = 2.5 Tbps aggregate egress (far beyond any single origin)
  • With a ~95% CDN cache hit ratio, only ~5% (~125 Gbps) reaches the origin tier;
    in practice far less, since each segment is fetched once per PoP, not per viewer

Q: Tối ưu video startup time?

Answer:

  1. Preload initial segment: Fetch first segment before user clicks play
  2. Smaller initial segment: 2s instead of 6s (faster to download)
  3. Lower initial bitrate: Start at 360p, upgrade once bandwidth is stable
  4. CDN optimization: Use the PoP nearest to the user
  5. HTTP/2 multiplexing: Parallel fetch segments + playlist

Q: Copyright detection at scale — 500 uploads/minute?

Answer:

  • Async processing: upload → Kafka → workers generate fingerprints
  • Sharded fingerprint DB: hash-based sharding for billions of fingerprints
  • Approximate matching: Locality-Sensitive Hashing (LSH) to tolerate variations
  • Priority queue: popular uploaders → fast-tracked review
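
The LSH idea from the last two answers can be sketched with the classic banding trick: split each fingerprint into bands and treat any identical band as a candidate match, so small variations elsewhere do not prevent detection. Bit-string fingerprints and 4 bands are illustrative assumptions:

```python
def band_keys(fingerprint_bits, bands=4):
    """Split a bit-string fingerprint into (index, band) keys for LSH banding.

    Two fingerprints are candidate matches if ANY band is identical.
    """
    size = len(fingerprint_bits) // bands
    return {(i, fingerprint_bits[i * size:(i + 1) * size]) for i in range(bands)}

def candidate_match(fp_a, fp_b, bands=4):
    """True if the two fingerprints share at least one band."""
    return bool(band_keys(fp_a, bands) & band_keys(fp_b, bands))
```

In production the band keys would be the shard/lookup keys in the fingerprint DB, so a single-band hit retrieves candidates for an exact verification pass.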

Summary

Video streaming = a high-bandwidth, CPU-intensive, storage-heavy system.

Key challenges:

  • Transcoding cost: CPU-intensive → distributed workers
  • CDN cost: ~70% of operational cost → optimize compression, ABR, caching
  • Storage cost: petabytes → tiered storage (hot/warm/cold)
  • Global distribution: low-latency playback → multi-region CDN
  • Recommendation: billions of videos → two-stage ranking, ML models

Key patterns:

  • HLS/DASH for adaptive streaming
  • Presigned URLs for direct upload (bypassing the backend)
  • Multi-tier CDN (edge → origin)
  • Async transcoding pipeline (Kafka + workers)
  • ML-based ranking (candidate generation + re-ranking)
