Domain: AdTech & Real-Time Bidding (RTB)
Google processes 5+ million bid requests per second. An RTB auction must finish within 100ms: from the user loading the page, through sending bid requests to 50+ bidders, receiving responses, running the auction logic, returning the winning ad, and rendering it. 50ms late? The user has already scrolled past, the impression is gone, revenue = 0.
AdTech is not just serving ads. It is a distributed-systems problem at extreme scale, with ultra-low-latency requirements, fraud prevention, privacy compliance, and real-time budget management where the burn rate can reach $1,000/second.
1. AdTech Ecosystem Overview
1.1 Core Players
Ad Ecosystem Flow:

Publisher (CNN, NYTimes)              Advertiser (Nike, Coca-Cola)
        |                                      |
        | "I have ad slots"                    | "I want to buy impressions"
        v                                      v
+------------------+                  +------------------+
|  SSP (Sell-      |                  |  DSP (Demand-    |
|  Side Platform)  |                  |  Side Platform)  |
|                  |                  |                  |
|  - PubMatic      |                  |  - The Trade     |
|  - Magnite       |                  |    Desk          |
|  - Google Ad     |                  |  - Google DV360  |
|    Manager       |                  |                  |
+--------+---------+                  +--------+---------+
         |                                     |
         |  Bid Request (OpenRTB)              |  Bid Response (price, creative)
         |                                     |
         +-------->+------------------+<-------+
                   |   Ad Exchange    |
                   |                  |
                   |  - Google AdX    |
                   |  - OpenX         |
                   |  - Xandr         |
                   +--------+---------+
                            |
                   Run auction (50-100ms)
                   Pick winner
                            |
                            v
                   Return ad creative to
                   publisher page
Supporting Players:
- DMP (Data Management Platform): audience segments, user profiles
- Ad Verification: fraud detection, viewability tracking
- Attribution Platform: conversion tracking, multi-touch attribution
SSP (Supply-Side Platform): represents publishers, optimizes CPM (cost per thousand impressions). Integrates with ad exchanges, manages header bidding, sets price floors.
DSP (Demand-Side Platform): represents advertisers, optimizes campaign performance (CTR, CPA). Bids in real time, manages budgets and targeting rules.
Ad Exchange: the "stock exchange" for ad inventory. Conducts real-time auctions, enforces quality, prevents fraud. Must scale to 100k+ QPS per datacenter.
1.2 RTB vs Programmatic Direct
RTB (Real-Time Bidding):
- Auction for every single impression
- Price dynamic, no guarantee
- Use case: performance marketing, remnant inventory
- Risk: variable cost, need budget pacing
Programmatic Direct / Programmatic Guaranteed:
- Pre-negotiated price, reserved inventory
- Guaranteed impressions
- Use case: brand campaigns, premium publishers
- No auction overhead → higher priority
Header Bidding:
- Client-side auction runs BEFORE the ad server is called
- Multiple SSPs compete simultaneously
- Increased competition → higher CPM for the publisher
- Latency challenge: requests run in parallel but still block page load
2. RTB Auction Mechanism
2.1 Auction Types
First-Price Auction:
5 bidders submit bids: $2.50, $2.30, $2.00, $1.80, $1.50
Winner: $2.50 bidder
Pay: $2.50 (exactly what they bid)
Pros:
- Simple, transparent
- Publisher revenue = highest bid
Cons:
- Bid shading: bidders lower bids to avoid overpaying
- Winner's curse: winning means you overbid
Second-Price Auction (Vickrey):
Same bids: $2.50, $2.30, $2.00, $1.80, $1.50
Winner: $2.50 bidder
Pay: $2.31 (second highest + $0.01)
Pros:
- Incentive compatible: the optimal strategy is to bid your true value
- No bid shading needed
Cons:
- Less transparent
- Lower revenue for the publisher if the gap between bids is large
Generalized Second-Price (GSP) - Google AdWords style:
Search ads: multiple positions (top, side, bottom)
Bid order: $5.00, $4.50, $3.00, $2.50
Position 1: Pay $4.51 (next bid + $0.01)
Position 2: Pay $3.01
Position 3: Pay $2.51
Quality Score adjustment:
Effective bid = bid * quality_score
Higher CTR → better position at a lower price
Industry shift: from 2019 onward, Google AdX moved from second-price to first-price auctions under transparency pressure. Bidders had to adapt their bidding strategies.
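The GSP mechanics above can be sketched as a small ranking function. The `QualityBid` type and the flat $0.01 increment are illustrative assumptions; real systems layer reserve prices and richer quality models on top.

```go
package main

import (
	"fmt"
	"sort"
)

// QualityBid pairs a raw bid with its quality score, as in the GSP example.
type QualityBid struct {
	Bidder  string
	Bid     float64
	Quality float64
}

// gspPrices ranks bidders by bid*quality and charges each position the
// minimum price needed to keep its rank: nextEffectiveBid/quality + $0.01.
// The last-ranked bidder has no one below them and gets no price here.
func gspPrices(bids []QualityBid) map[string]float64 {
	sort.Slice(bids, func(i, j int) bool {
		return bids[i].Bid*bids[i].Quality > bids[j].Bid*bids[j].Quality
	})
	prices := make(map[string]float64)
	for i := 0; i < len(bids)-1; i++ {
		next := bids[i+1]
		prices[bids[i].Bidder] = next.Bid*next.Quality/bids[i].Quality + 0.01
	}
	return prices
}

func main() {
	// With all quality scores at 1.0 this reproduces the $4.51 / $3.01 /
	// $2.51 positions from the example above.
	bids := []QualityBid{
		{"a", 5.00, 1.0}, {"b", 4.50, 1.0}, {"c", 3.00, 1.0}, {"d", 2.50, 1.0},
	}
	fmt.Println(gspPrices(bids))
}
```

Raising a bidder's quality score lifts their effective rank while lowering the price they pay, which is exactly the "higher CTR → better position at a lower price" effect.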
2.2 Auction Logic Implementation
package auction
import (
"context"
"sort"
"time"
)
// BidRequest per the OpenRTB 2.5 spec
type BidRequest struct {
ID string `json:"id"`
Imp []Impression `json:"imp"` // impression objects
Site *Site `json:"site,omitempty"`
App *App `json:"app,omitempty"`
Device Device `json:"device"`
User User `json:"user"`
Test int `json:"test,omitempty"` // 0=live, 1=test
AT int `json:"at"` // auction type: 1=first-price, 2=second-price
TMax int `json:"tmax"` // max response time (ms)
BidFloor float64 `json:"bidfloor,omitempty"`
Cur []string `json:"cur,omitempty"` // currencies
}
type Impression struct {
ID string `json:"id"`
Banner *Banner `json:"banner,omitempty"`
Video *Video `json:"video,omitempty"`
BidFloor float64 `json:"bidfloor,omitempty"`
Secure int `json:"secure"` // 0=http, 1=https required
}
type BidResponse struct {
ID string `json:"id"` // echo request ID
SeatBid []SeatBid `json:"seatbid"`
BidID string `json:"bidid,omitempty"`
Cur string `json:"cur"` // response currency
NBR int `json:"nbr,omitempty"` // no-bid reason code
}
type SeatBid struct {
Bid []Bid `json:"bid"`
Seat string `json:"seat,omitempty"` // buyer seat ID
}
type Bid struct {
ID string `json:"id"`
ImpID string `json:"impid"` // which impression
Price float64 `json:"price"` // CPM in currency units
AdM string `json:"adm,omitempty"` // ad markup (HTML/VAST)
NURL string `json:"nurl,omitempty"` // win notice URL
CrID string `json:"crid"` // creative ID
W int `json:"w,omitempty"` // width
H int `json:"h,omitempty"` // height
Cat []string `json:"cat,omitempty"` // IAB categories
}
// Auction Engine
type AuctionEngine struct {
bidders []Bidder
fraudDetector FraudDetector
budgetMgr BudgetManager
}
type Bidder interface {
SendBidRequest(ctx context.Context, req *BidRequest) (*BidResponse, error)
Name() string
Timeout() time.Duration
}
// RunAuction orchestrates the entire auction flow
func (ae *AuctionEngine) RunAuction(ctx context.Context, req *BidRequest) (*AuctionResult, error) {
start := time.Now()
// 1. Anti-fraud check (5-10ms)
if fraudScore := ae.fraudDetector.Score(req); fraudScore > 0.7 {
return nil, ErrFraudulentRequest
}
// 2. Send bid requests in parallel to all bidders (50-80ms)
bids := ae.collectBids(ctx, req)
// 3. Filter invalid bids
validBids := ae.filterBids(bids, req)
// 4. Check budgets (5-10ms)
validBids = ae.budgetMgr.FilterByBudget(ctx, validBids)
// 5. Run auction (1-2ms)
winner := ae.selectWinner(validBids, req.AT)
// 6. Fire win notice and impression tracking pixels (async)
go ae.notifyWinner(context.Background(), winner)
latency := time.Since(start)
if latency > 100*time.Millisecond {
// Log timeout risk
log.Warn("auction_slow", "latency_ms", latency.Milliseconds())
}
return winner, nil
}
// collectBids: parallel fan-out vα»i timeout
func (ae *AuctionEngine) collectBids(ctx context.Context, req *BidRequest) []*BidResponse {
bidChan := make(chan *BidResponse, len(ae.bidders))
for _, bidder := range ae.bidders {
bidder := bidder // capture loop var
go func() {
// Each bidder gets its own timeout
bidCtx, cancel := context.WithTimeout(ctx, bidder.Timeout())
defer cancel()
resp, err := bidder.SendBidRequest(bidCtx, req)
if err == nil && resp != nil {
bidChan <- resp
}
}()
}
// Collect responses within the global timeout
timeout := time.After(time.Duration(req.TMax) * time.Millisecond)
var responses []*BidResponse
for i := 0; i < len(ae.bidders); i++ {
select {
case resp := <-bidChan:
responses = append(responses, resp)
case <-timeout:
return responses // return partial results
case <-ctx.Done():
return responses
}
}
return responses
}
// selectWinner: auction logic
func (ae *AuctionEngine) selectWinner(bids []*Bid, auctionType int) *AuctionResult {
if len(bids) == 0 {
return nil
}
// Sort descending by price
sort.Slice(bids, func(i, j int) bool {
return bids[i].Price > bids[j].Price
})
winner := bids[0]
var clearingPrice float64
switch auctionType {
case 1: // First-price
clearingPrice = winner.Price
case 2: // Second-price
if len(bids) > 1 {
clearingPrice = bids[1].Price + 0.01
} else {
clearingPrice = winner.Price
}
}
return &AuctionResult{
Winner: winner,
ClearingPrice: clearingPrice,
TotalBids: len(bids),
Timestamp: time.Now(),
}
}
type AuctionResult struct {
Winner *Bid
ClearingPrice float64
TotalBids int
Timestamp time.Time
}
2.3 Bid Floor & Reserve Price
// Dynamic price floor based on historical data
type PriceFloorStrategy struct {
cache *redis.Client
}
func (pf *PriceFloorStrategy) GetFloor(ctx context.Context, imp *Impression) float64 {
// Cache key: site + placement + device type + time-of-day
// (assumes Impression is extended with SiteID/PlacementID/DeviceType fields)
key := fmt.Sprintf("floor:%s:%s:%s:%d",
imp.SiteID, imp.PlacementID, imp.DeviceType, time.Now().Hour())
// Read from cache
if cached, err := pf.cache.Get(ctx, key).Float64(); err == nil {
return cached
}
// Fallback: base floor from publisher settings
return imp.BaseFloor
}
// Probabilistic floor: randomize to test higher floors
func (pf *PriceFloorStrategy) GetFloorWithExperiment(imp *Impression) float64 {
baseFloor := pf.GetFloor(context.Background(), imp)
// 10% traffic: test floor = base * 1.2
if rand.Float64() < 0.1 {
return baseFloor * 1.2
}
return baseFloor
}
Trade-off:
- Floor too high: few bids clear → low fill rate → wasted impressions
- Floor too low: inventory sold cheap → revenue loss
Best practice: an ML model predicts the optimal floor from real-time signals (time, geo, device).
3. Low-Latency Architecture
3.1 Latency Budget
Total budget: 100ms (user tolerance threshold)
Breakdown:
+------------------------------------------------------+
| DNS + TCP handshake + TLS               15-20ms      |
| Request routing (anycast + LB)           5-10ms      |
| Ad Exchange processing                               |
|   |- Parse request                       2ms         |
|   |- Fraud check                         5ms         |
|   |- Send bids to DSPs                   50-60ms     |
|   |    (parallel fan-out, majority wait)             |
|   |- Budget/targeting filter             5ms         |
|   +- Run auction                         2ms         |
| Return response                          5-10ms      |
| Render ad on page                        10-15ms     |
+------------------------------------------------------+
Contingency: 5-10ms
Why 100ms? Research shows:
- > 100ms: noticeable lag, user has scrolled away
- > 200ms: 50% drop in ad viewability
- > 500ms: timeout, no ad shown
3.2 Global Distribution Strategy
PoP (Point of Presence) Placement:
CDN-style approach:
- 20-30 edge locations worldwide
- Anycast IP routing → nearest PoP
- Each PoP runs the full ad exchange stack
Geography:
+--------------------------------------------------+
|  NA-WEST     NA-EAST     EU-WEST      EU-EAST    |
|  (LA, SF)    (VA, NYC)   (London,     (Frankfurt,|
|                           Dublin)     Amsterdam) |
|                                                  |
|  APAC-NORTH  APAC-SOUTH  LATAM        ME-AFRICA  |
|  (Tokyo,     (Singapore, (São         (Dubai,    |
|   Seoul)      Sydney)     Paulo)       Joburg)   |
+--------------------------------------------------+
Per-PoP capacity: 50k-100k QPS
Peak traffic: 200k-300k QPS globally
Colocation with major DSPs:
- Physical proximity in the same datacenter
- <1ms network latency
- Private interconnect (no public internet)
Architecture per PoP:

            Anycast IP
                 |
                 v
       +------------------+
       | L4 Load Balancer |
       |   (ECMP, DSR)    |
       +--------+---------+
                |
      +---------+---------+
      |         |         |
      v         v         v
+----------+ +----------+ +----------+
| Ad Server| | Ad Server| | Ad Server|
| Instance | | Instance | | Instance |
|          | |          | |          |
| - Golang | | (scales  | | (500-1k  |
| - in-mem | |  horizon-| |  QPS per |
|   caches | |  tally)  | |  instance|
+----+-----+ +----+-----+ +----+-----+
     |            |            |
     +------------+------------+
                  |
     +------------+-------------+
     |            |             |
     v            v             v
+------------+ +-------------+ +-------------+
|   Redis    | |   Budget    | |  DSP Pool   |
| (segments, | |   Counter   | |  (bid req   |
|  creative) | |(distributed)| |  fan-out)   |
+------------+ +-------------+ +-------------+
3.3 Optimization Techniques
1. Connection Pooling with DSPs:
package bidder
import (
"net"
"net/http"
"time"
)
// Pre-warmed connection pool for each DSP
func NewHTTPClient() *http.Client {
transport := &http.Transport{
MaxIdleConns: 200, // total
MaxIdleConnsPerHost: 50, // per DSP
MaxConnsPerHost: 100,
IdleConnTimeout: 90 * time.Second,
DisableCompression: false, // enable gzip
DisableKeepAlives: false, // reuse connections
// Aggressive timeouts
DialContext: (&net.Dialer{
Timeout: 10 * time.Millisecond, // connect timeout
KeepAlive: 30 * time.Second,
}).DialContext,
TLSHandshakeTimeout: 10 * time.Millisecond, // only viable with warm/resumed connections
ResponseHeaderTimeout: 40 * time.Millisecond,
ExpectContinueTimeout: 1 * time.Second,
}
return &http.Client{
Transport: transport,
Timeout: 50 * time.Millisecond, // total request timeout
}
}
// HTTP/2 multiplexing: multiple requests on single connection
// β reduce handshake overhead
2. Payload Compression:
// An OpenRTB request can be 10-50KB (device data, user segments)
// Gzip compression: 10KB → 2KB
// Latency saving: ~20-30ms on slow networks
func CompressBidRequest(req *BidRequest) ([]byte, error) {
jsonData, err := json.Marshal(req)
if err != nil {
return nil, err
}
var buf bytes.Buffer
gzWriter := gzip.NewWriter(&buf)
if _, err := gzWriter.Write(jsonData); err != nil {
return nil, err
}
if err := gzWriter.Close(); err != nil {
return nil, err
}
return buf.Bytes(), nil
}
// The DSP must accept gzip-compressed requests (Content-Encoding: gzip)
3. Pre-computation & Caching:
type UserSegmentCache struct {
cache *redis.Client
ttl time.Duration
}
// User segments are computed offline, hourly
// The RTB request path only looks them up, never computes them
func (c *UserSegmentCache) GetSegments(userID string) []string {
key := "segments:" + userID
if cached, err := c.cache.SMembers(context.Background(), key).Result(); err == nil {
return cached
}
// Fallback: empty segments, don't block request
return []string{}
}
// Creative metadata: pre-fetched into memory
type CreativeCache struct {
creatives sync.Map // concurrent-safe map
}
func (c *CreativeCache) Get(creativeID string) *Creative {
if val, ok := c.creatives.Load(creativeID); ok {
return val.(*Creative)
}
return nil
}
// Reload from the DB every 5 minutes
func (c *CreativeCache) ReloadPeriodically(interval time.Duration) {
ticker := time.NewTicker(interval)
for range ticker.C {
// Fetch all active creatives from the DB
creatives := fetchAllCreatives()
// Update map atomically
for _, creative := range creatives {
c.creatives.Store(creative.ID, creative)
}
}
}
4. Circuit Breaker for DSPs:
import "github.com/sony/gobreaker"
type DSPClient struct {
client *http.Client
breaker *gobreaker.CircuitBreaker
}
func NewDSPClient(name string) *DSPClient {
settings := gobreaker.Settings{
Name: name,
MaxRequests: 3, // half-open state
Interval: 10 * time.Second, // error count window
Timeout: 30 * time.Second, // open → half-open duration
ReadyToTrip: func(counts gobreaker.Counts) bool {
failureRatio := float64(counts.TotalFailures) / float64(counts.Requests)
return counts.Requests >= 10 && failureRatio >= 0.6
},
}
return &DSPClient{
client: NewHTTPClient(),
breaker: gobreaker.NewCircuitBreaker(settings),
}
}
func (d *DSPClient) SendBid(req *BidRequest) (*BidResponse, error) {
result, err := d.breaker.Execute(func() (interface{}, error) {
return d.sendHTTP(req)
})
if err != nil {
// Circuit open: fail fast, don't waste time
log.Warn("dsp_circuit_open", "dsp", d.breaker.Name())
return nil, err
}
return result.(*BidResponse), nil
}
4. Bid Request/Response Protocol
4.1 OpenRTB 2.5 Spec
Minimal bid request:
{
"id": "80ce30c53c16e6ede735f123ef6e32361bfc7b22",
"imp": [
{
"id": "1",
"banner": {
"w": 300,
"h": 250,
"pos": 1,
"battr": [13],
"api": [3, 5]
},
"bidfloor": 0.50,
"bidfloorcur": "USD",
"secure": 1
}
],
"site": {
"id": "102855",
"domain": "example.com",
"cat": ["IAB3-1"],
"page": "https://example.com/article/12345",
"publisher": {
"id": "8953",
"name": "Example Publisher"
}
},
"device": {
"ua": "Mozilla/5.0...",
"ip": "192.0.2.1",
"geo": {
"country": "USA",
"region": "CA",
"city": "Los Angeles",
"zip": "90028"
},
"devicetype": 1,
"make": "Apple",
"model": "iPhone",
"os": "iOS",
"osv": "16.3",
"language": "en"
},
"user": {
"id": "55816b39711f9b5acf3b90e313ed29e51665623f",
"buyeruid": "buyer-specific-id"
},
"at": 1,
"tmax": 100,
"cur": ["USD"],
"bcat": ["IAB25", "IAB26"],
"badv": ["competitor.com"]
}
Key fields:
- imp[]: array of impressions (one page can carry multiple ads)
- site or app: web vs mobile app inventory
- device.ip: used for geo-targeting (though IP → location is imperfect)
- user.id: cookie-based user ID (post-GDPR, may be empty)
- at: auction type (1 = first-price, 2 = second-price)
- tmax: the DSP must respond within this timeout
- bcat, badv: blocked categories/advertisers
Bid response:
{
"id": "80ce30c53c16e6ede735f123ef6e32361bfc7b22",
"seatbid": [
{
"bid": [
{
"id": "1",
"impid": "1",
"price": 2.50,
"adid": "creative-12345",
"nurl": "https://dsp.example.com/win?price=${AUCTION_PRICE}",
"adm": "<a href=\"click-url\"><img src=\"creative-url.jpg\"></a>",
"adomain": ["advertiser.com"],
"cid": "campaign-789",
"crid": "creative-12345",
"cat": ["IAB3-1"],
"w": 300,
"h": 250
}
],
"seat": "buyer-123"
}
],
"cur": "USD",
"bidid": "bid-response-abc123"
}
Win notice flow:
1. DSP returns nurl: "https://dsp.com/win?price=${AUCTION_PRICE}"
2. Ad Exchange performs macro substitution:
→ "https://dsp.com/win?price=2.31" (clearing price)
3. Exchange fires GET request to nurl (async, no wait)
4. DSP receives win notice:
- Deduct budget
- Log impression
- Start attribution window
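Steps 2-3 of the flow above reduce to a string substitution plus a fire-and-forget GET. A minimal sketch; the `${AUCTION_PRICE}` macro is from OpenRTB, while the 500ms notice timeout and helper names are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// expandWinNotice substitutes the OpenRTB price macro into the DSP's nurl.
func expandWinNotice(nurl string, clearingPrice float64) string {
	price := strconv.FormatFloat(clearingPrice, 'f', 2, 64)
	return strings.ReplaceAll(nurl, "${AUCTION_PRICE}", price)
}

// fireWinNotice sends the win notice asynchronously: the auction response
// to the publisher is never blocked on the DSP's tracking endpoint.
func fireWinNotice(nurl string, clearingPrice float64) {
	u := expandWinNotice(nurl, clearingPrice)
	go func() {
		client := &http.Client{Timeout: 500 * time.Millisecond}
		resp, err := client.Get(u)
		if err != nil {
			return // win notices are best-effort; DSPs reconcile offline
		}
		resp.Body.Close()
	}()
}

func main() {
	fmt.Println(expandWinNotice("https://dsp.com/win?price=${AUCTION_PRICE}", 2.31))
	// https://dsp.com/win?price=2.31
}
```

Because the notice is best-effort, exchanges and DSPs typically reconcile impressions from logs at end of day rather than trusting the nurl hit alone.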
4.2 Payload Optimization
Problem: Full OpenRTB request = 20-50KB. Network transfer = 30-40ms overhead.
Solution 1: Field filtering
// Strip fields a given DSP does not need
type BidRequestOptimizer struct {
dspProfiles map[string]DSPProfile
}
type DSPProfile struct {
NeedsDeviceDetails bool
NeedsUserSegments bool
NeedsGeoDetails bool
}
func (opt *BidRequestOptimizer) Optimize(req *BidRequest, dspID string) *BidRequest {
profile := opt.dspProfiles[dspID]
optimized := req.ShallowCopy()
if !profile.NeedsDeviceDetails {
optimized.Device.UserAgent = ""
optimized.Device.DeviceID = ""
}
if !profile.NeedsUserSegments {
optimized.User.Data = nil
}
if !profile.NeedsGeoDetails {
optimized.Device.Geo.Lat = 0
optimized.Device.Geo.Lon = 0
}
return optimized
}
Solution 2: Protocol Buffers
// OpenRTB protobuf schema (smaller, faster parsing)
syntax = "proto3";
message BidRequest {
string id = 1;
repeated Impression imp = 2;
Site site = 3;
Device device = 4;
User user = 5;
int32 at = 6;
int32 tmax = 7;
}
message Impression {
string id = 1;
Banner banner = 2;
double bidfloor = 3;
int32 secure = 4;
}
// ...
// Size reduction: JSON 25KB → Protobuf 8KB
// Parse speed: JSON 500μs → Protobuf 150μs
5. Budget Management & Pacing
5.1 Distributed Budget Counter
Challenge: campaign budget of $10,000/day, burn rate up to $50/second. With 20 PoPs worldwide, how do you guarantee no overspend?
Naive approach (wrong):
Each PoP checks only its local counter → race condition
PoP1 spends $5,000 while PoP2 spends $5,000 concurrently
→ up to $10,000 of overspend
Solution 1: Centralized counter with Redis
package budget
import (
"context"
"fmt"
"time"
"github.com/go-redis/redis/v8"
)
type BudgetManager struct {
rdb *redis.Client
}
func (bm *BudgetManager) CanSpend(ctx context.Context, campaignID string, amount float64) (bool, error) {
key := fmt.Sprintf("budget:campaign:%s:%s", campaignID, time.Now().Format("2006-01-02"))
// Atomic check-and-increment; the budget is stored in milli-dollars so the
// counter stays an integer and the units match Reclaim below
script := `
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
local limit = tonumber(ARGV[1])
local amount = tonumber(ARGV[2])
if current + amount <= limit then
redis.call('INCRBY', KEYS[1], amount)
return 1
else
return 0
end
`
result, err := bm.rdb.Eval(ctx, script, []string{key},
int64(10000.00*1000), // daily limit in milli-dollars
int64(amount*1000), // bid amount in milli-dollars
).Int()
return result == 1, err
}
// Reclaim budget when the bid does not win
func (bm *BudgetManager) Reclaim(ctx context.Context, campaignID string, amount float64) error {
key := fmt.Sprintf("budget:campaign:%s:%s", campaignID, time.Now().Format("2006-01-02"))
return bm.rdb.DecrBy(ctx, key, int64(amount*1000)).Err()
}
// Pre-reserve budget before bidding
func (bm *BudgetManager) ReserveForAuction(ctx context.Context, campaignID string, bidAmount float64) (*Reservation, error) {
ok, err := bm.CanSpend(ctx, campaignID, bidAmount)
if err != nil || !ok {
return nil, ErrInsufficientBudget
}
// Reservation is valid for 5 seconds (auction timeout)
return &Reservation{
CampaignID: campaignID,
Amount: bidAmount,
ExpiresAt: time.Now().Add(5 * time.Second),
}, nil
}
Problems with the centralized approach:
- Single Redis instance = bottleneck (50k+ writes/sec)
- Cross-region latency: 50-200ms (kills auction SLA)
- Single point of failure
Solution 2: Distributed with allocation
// Distribute daily budget across PoPs
type DistributedBudgetManager struct {
localCache *LocalBudget
coordinator *BudgetCoordinator
}
// Each PoP receives an allocation (e.g., $500 every 5 minutes)
func (dbm *DistributedBudgetManager) Allocate(ctx context.Context) {
ticker := time.NewTicker(5 * time.Minute)
for range ticker.C {
// Request an allocation from the central coordinator
allocation := dbm.coordinator.RequestAllocation(dbm.localCache.PopID)
dbm.localCache.SetAllocation(allocation)
}
}
// Check the local cache → fast, no network hop
func (dbm *DistributedBudgetManager) CanSpend(campaignID string, amount float64) bool {
return dbm.localCache.CheckAndDecrement(campaignID, amount)
}
// Local budget counter (in-memory)
type LocalBudget struct {
mu sync.RWMutex
campaigns map[string]*CampaignBudget
PopID string
}
type CampaignBudget struct {
Remaining float64
LastAllocated time.Time
}
func (lb *LocalBudget) CheckAndDecrement(campaignID string, amount float64) bool {
lb.mu.Lock()
defer lb.mu.Unlock()
budget, exists := lb.campaigns[campaignID]
if !exists || budget.Remaining < amount {
return false
}
budget.Remaining -= amount
return true
}
Trade-off:
- Distributed approach: fast (local check), but overspend risk if an allocation runs out while the coordinator lags
- Centralized approach: accurate, but slower
Real-world hybrid:
1. Pre-allocate budget to PoPs (optimistic)
2. Local check first (99% cases)
3. When the allocation is nearly exhausted (< 10%), fall back to the central Redis check
4. End-of-day reconciliation: refund unused allocations
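Steps 1-3 of the hybrid can be sketched as a two-path check. The 10% threshold comes from the list above; the `central` callback standing in for the Redis script, and the type names, are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// hybridBudget spends from the local PoP allocation first and only falls
// back to the (slow) central counter when the local remainder runs low.
type hybridBudget struct {
	mu         sync.Mutex
	remaining  float64                   // local PoP allocation left
	allocation float64                   // size of the last allocation
	central    func(amount float64) bool // slow-path check, e.g. the Redis script
}

func (hb *hybridBudget) CanSpend(amount float64) bool {
	hb.mu.Lock()
	defer hb.mu.Unlock()
	// Fast path: plenty of local budget left (covers ~99% of requests).
	if hb.remaining-amount >= 0.1*hb.allocation {
		hb.remaining -= amount
		return true
	}
	// Slow path: allocation nearly exhausted, ask the central counter.
	// (A production version would not hold the lock across a network call.)
	return hb.central(amount)
}

func main() {
	centralCalls := 0
	hb := &hybridBudget{
		remaining:  100,
		allocation: 100,
		central:    func(a float64) bool { centralCalls++; return false },
	}
	fmt.Println(hb.CanSpend(50)) // fast path: true
	fmt.Println(hb.CanSpend(45)) // would leave < 10% locally → central check
	fmt.Println("central calls:", centralCalls)
}
```

End-of-day reconciliation (step 4) then squares the local counters against the central ledger and refunds unused allocations.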
5.2 Pacing Algorithms
Problem: a $10,000/day campaign. If the budget burns out in the first 2 hours, 22 hours of opportunities are lost.
Goal: distribute spend evenly throughout the day to maximize reach.
Algorithm 1: Simple percentage pacing
func (p *Pacer) ShouldThrottle(campaignID string) bool {
campaign := p.getCampaign(campaignID)
// Expected spend fraction at the current time of day
hourOfDay := time.Now().Hour()
expectedSpendPct := float64(hourOfDay) / 24.0
// Actual spend percentage
actualSpendPct := campaign.SpentToday / campaign.DailyBudget
// If spending ahead of schedule, throttle
return actualSpendPct > expectedSpendPct + 0.05 // 5% tolerance
}
Algorithm 2: Proportional Integral (PI) Controller
type PIController struct {
Kp float64 // proportional gain
Ki float64 // integral gain
integral float64
}
func (pi *PIController) ComputeThrottle(target, actual float64) float64 {
e := target - actual // avoid shadowing the builtin error type
pi.integral += e
// PI output: Kp * error + Ki * integral of error
throttle := pi.Kp*e + pi.Ki*pi.integral
// Clamp [0, 1]
if throttle < 0 {
return 0
}
if throttle > 1 {
return 1
}
return throttle
}
// Usage: throttle% of requests participate in auction
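That usage note can be made concrete with a pure gate: feed the PI controller's throttle output and a uniform random draw into a deterministic function, which keeps the pacing decision unit-testable. A sketch; the function name is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldParticipate admits a request when the uniform draw falls under the
// throttle fraction produced by the PI controller. Taking the draw as a
// parameter (instead of calling rand inside) makes the logic deterministic
// under test.
func shouldParticipate(throttle, draw float64) bool {
	return draw < throttle
}

func main() {
	throttle := 0.7 // PI controller says: join 70% of auctions
	joined := 0
	for i := 0; i < 10000; i++ {
		if shouldParticipate(throttle, rand.Float64()) {
			joined++
		}
	}
	fmt.Println(joined > 6500 && joined < 7500) // ~70% of requests participate
}
```

The same gate works per-campaign: each campaign carries its own PI controller, and the auction path skips campaigns whose gate returns false.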
Algorithm 3: Multiplicative bid shading
// Reduce the bid amount instead of skipping requests
func (p *Pacer) AdjustBid(campaignID string, originalBid float64) float64 {
pacingFactor := p.getPacingFactor(campaignID) // 0.5 - 1.0
return originalBid * pacingFactor
}
// Example: overspending → pacing factor = 0.7 → lower bids → fewer wins → spend slows down
Real-world best practice: a hybrid of throttling + bid shading. Forecast the spend curve from historical hourly CTR patterns (e.g., evening traffic spikes → allocate more budget there).
6. Targeting & Audience Segments
6.1 User Profiling
type UserProfile struct {
UserID string
Demographics Demographics
Interests []string // ["sports", "technology", "travel"]
Behaviors []Behavior
Segments []string // DMP segment IDs
LastSeen time.Time
}
type Demographics struct {
AgeRange string // "25-34"
Gender string // "M", "F", "U"
Income string // "50k-75k"
Education string
Location GeoData
}
type Behavior struct {
Type string // "purchase", "view", "click"
Category string // "electronics"
Timestamp time.Time
Recency int // days since last action
Frequency int // total count
}
// Targeting criteria from the campaign
type TargetingRule struct {
IncludeSegments []string
ExcludeSegments []string
GeoTargeting *GeoTarget
DeviceTargeting *DeviceTarget
DayParting *DayPartingRule
}
type GeoTarget struct {
Countries []string
Regions []string
Cities []string
PostalCodes []string
RadiusTargets []RadiusTarget
}
type RadiusTarget struct {
Lat float64
Lon float64
Radius float64 // km
}
// Matching engine
func (tm *TargetingMatcher) Match(profile *UserProfile, rule *TargetingRule) bool {
// 1. Segment matching
if !tm.matchSegments(profile.Segments, rule) {
return false
}
// 2. Geo check
if rule.GeoTargeting != nil && !tm.matchGeo(profile.Demographics.Location, rule.GeoTargeting) {
return false
}
// 3. Device check
if rule.DeviceTargeting != nil && !tm.matchDevice(profile.Device, rule.DeviceTargeting) {
return false
}
// 4. Day-parting (time-of-day targeting)
if rule.DayParting != nil && !tm.matchDayPart(time.Now(), rule.DayParting) {
return false
}
return true
}
func (tm *TargetingMatcher) matchSegments(userSegments []string, rule *TargetingRule) bool {
userSegSet := toSet(userSegments)
// Check exclude first (faster rejection)
for _, excluded := range rule.ExcludeSegments {
if userSegSet[excluded] {
return false
}
}
// Check include
if len(rule.IncludeSegments) == 0 {
return true // no segment requirement
}
for _, included := range rule.IncludeSegments {
if userSegSet[included] {
return true // any match
}
}
return false
}
6.2 DMP Integration (Data Management Platform)
DMP Flow:

User visits site -> drop pixel
        |
        v
+------------------+
|  DMP (Lotame,    | --> Aggregate behaviors
|  Oracle DMP)     |     Build segments
+--------+---------+     Enrich profiles
         |
         v
  Segment assignment
  "Auto Intender", "Luxury Shopper"
         |
         v
+------------------+
|  Sync to DSPs    |
|  Cookie mapping  |
+------------------+
         |
         v
  RTB bid request
  includes segment IDs
Cookie syncing:

Publisher cookie ID: abc123
DMP cookie ID:       xyz789
DSP cookie ID:       qwerty456

Mapping table:
abc123 <-> xyz789 <-> qwerty456

On an RTB request, translate the ID:
Request includes: user.id = "abc123"
DSP lookup: "abc123" -> "qwerty456" (their namespace)
DSP fetches profile: segments = ["auto_intender", "high_income"]
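The translation step above is a keyed lookup per partner namespace. A sketch with in-memory maps; the `syncTable` type is an assumption (real tables live in a shared store, populated by cookie-sync pixel redirects).

```go
package main

import "fmt"

// syncTable maps the exchange's user ID into each partner's ID namespace.
type syncTable struct {
	byPartner map[string]map[string]string // partner → (exchange ID → partner ID)
}

// translate returns the partner-namespace ID for an exchange user ID, or ""
// when the user was never synced with that partner (the DSP then bids blind
// or falls back to contextual signals).
func (st *syncTable) translate(partner, exchangeID string) string {
	if m, ok := st.byPartner[partner]; ok {
		return m[exchangeID]
	}
	return ""
}

func main() {
	st := &syncTable{byPartner: map[string]map[string]string{
		"dsp-1": {"abc123": "qwerty456"},
	}}
	fmt.Println(st.translate("dsp-1", "abc123")) // qwerty456
	fmt.Println(st.translate("dsp-2", "abc123") == "") // true: never synced
}
```

Match rates between namespaces are rarely 100%, which is one reason DSPs see many bid requests with no usable user ID even before privacy regulation.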
6.3 Privacy Compliance (GDPR, CCPA)
GDPR Impact:
type ConsentManager struct {
cache *redis.Client
}
// Check user consent status
func (cm *ConsentManager) HasConsent(userID string, purpose string) bool {
key := fmt.Sprintf("consent:%s", userID)
// Purposes: targeting, analytics, personalization
consents, err := cm.cache.HGetAll(context.Background(), key).Result()
if err != nil {
// No consent = assume no
return false
}
return consents[purpose] == "1"
}
// TCF (Transparency & Consent Framework) string parsing
func (cm *ConsentManager) ParseTCFString(tcfString string) *ConsentData {
// The TCF string encodes consent for hundreds of vendors
// Example: "CPXxYyZPXxYyZADABCEN..."
// Decode bitmap: vendor 123 = allowed, vendor 456 = denied
decoded := decodeTCF(tcfString)
return decoded
}
// Bid request filtering
func (ae *AuctionEngine) filterByConsent(req *BidRequest, bidders []Bidder) []Bidder {
userConsent := ae.consentMgr.ParseTCFString(req.User.Consent)
var allowed []Bidder
for _, bidder := range bidders {
vendorID := bidder.VendorID()
if userConsent.AllowsVendor(vendorID) {
allowed = append(allowed, bidder)
}
}
return allowed
}
Post-cookie era: Contextual targeting
// When there is no user ID (GDPR, Safari ITP, cookieless),
// fall back to contextual signals
type ContextualTargeting struct {
pageCategories []string // IAB taxonomy
keywords []string
sentiment string // "positive", "negative", "neutral"
pageQuality float64 // 0-1 score
}
func (ct *ContextualTargeting) Extract(url string, content string) *ContextualTargeting {
// NLP pipeline: topic classification, keyword extraction
categories := ct.classifier.Classify(content)
keywords := ct.extractor.Extract(content)
sentiment := ct.sentimentAnalyzer.Analyze(content)
return &ContextualTargeting{
pageCategories: categories,
keywords: keywords,
sentiment: sentiment,
}
}
// Contextual match: relevant ads without tracking the user
// Example: a "running shoes" article → show sports apparel ads
7. Fraud Detection
7.1 Fraud Types
Bot Traffic:
- Datacenter IPs (AWS, GCP, Azure ranges)
- Suspicious user agents
- Impossible click patterns (100 clicks/minute from a single IP)
- No mouse movement, no scroll, instant click
Click Fraud / Click Injection:
- Mobile app install fraud
- Click flooding: generate fake clicks to steal attribution
- Click farms: real humans paid to click
Impression Fraud:
- Hidden iframes (1x1 pixel)
- Stacked ads (10 ads same position, count 10 impressions)
- Auto-refresh: reload page every 5 seconds
- Domain spoofing: claim traffic is from a premium site when it is actually low quality
Ad Injection / Malware:
- Browser extension inject ads into pages
- Publisher doesn't get paid, fraudster steals revenue
7.2 Detection Strategies
package fraud
import (
"context"
"net"
"time"
)
type FraudDetector struct {
ipBlocklist *IPBlocklist
uaAnalyzer *UserAgentAnalyzer
rateLimiter *RateLimiter
mlScorer *MLFraudScorer
}
// Real-time fraud scoring
func (fd *FraudDetector) Score(req *BidRequest) float64 {
var score float64
// Rule-based checks
score += fd.checkIPReputation(req.Device.IP)
score += fd.checkUserAgent(req.Device.UA)
score += fd.checkClickVelocity(req.User.ID, req.Device.IP)
score += fd.checkDeviceIntegrity(req.Device)
// ML-based anomaly detection
mlScore := fd.mlScorer.Predict(req)
score += mlScore * 0.4 // weight
return clamp(score, 0, 1)
}
// IP reputation check
func (fd *FraudDetector) checkIPReputation(ip string) float64 {
// Check against known datacenter ranges
if fd.ipBlocklist.IsDatacenter(ip) {
return 0.6 // high suspicion
}
// Check IP prefix reputation (MaxMind, IPQualityScore API)
reputation := fd.ipBlocklist.GetReputation(ip)
if reputation < 0.3 {
return 0.5
}
return 0.0
}
// User agent analysis
func (fd *FraudDetector) checkUserAgent(ua string) float64 {
parsed := fd.uaAnalyzer.Parse(ua)
// Check for bot signatures
if parsed.IsBot {
return 0.8
}
// Check for spoofed UAs (inconsistent browser + OS combo)
if !parsed.IsConsistent() {
return 0.4
}
// Headless Chrome detection
if parsed.IsHeadless {
return 0.7
}
return 0.0
}
// Click velocity check
func (fd *FraudDetector) checkClickVelocity(userID, ip string) float64 {
ctx := context.Background()
// Count impressions/clicks in the last 5 minutes
count := fd.rateLimiter.Count(ctx, "clicks", ip, 5*time.Minute)
if count > 100 {
return 0.9 // impossible for human
}
if count > 50 {
return 0.5 // suspicious
}
return 0.0
}
// Device integrity (mobile apps)
func (fd *FraudDetector) checkDeviceIntegrity(device *Device) float64 {
// iOS: check for jailbreak
if device.OS == "iOS" && device.Jailbroken {
return 0.6
}
// Android: check for emulator
if device.OS == "Android" && device.IsEmulator {
return 0.7
}
// Check device ID consistency
if !fd.isConsistentDeviceID(device) {
return 0.5
}
return 0.0
}
7.3 ML-Based Anomaly Detection
# Offline training: fraud classification model
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
# Features
features = [
'ip_reputation_score',
'user_agent_entropy',
'click_velocity_5min',
'time_to_click_ms', # time between impression → click
'hour_of_day',
'device_type',
'connection_type', # wifi, cellular, datacenter
'ad_size',
'viewability_score',
'mouse_movements', # number of mouse events
'scroll_depth',
'time_on_page',
]
# Labels: 0 = legitimate, 1 = fraud (from manual review + confirmed fraud)
X_train = df[features]
y_train = df['is_fraud']
# Train model
model = RandomForestClassifier(n_estimators=200, max_depth=10)
model.fit(X_train, y_train)
# Feature importance
importances = pd.Series(model.feature_importances_, index=features).sort_values(ascending=False)
print(importances)
# Deploy model: export to ONNX, load into the Go service
# Real-time inference: <5ms per request
Post-bid verification:
1. Serve ad
2. JavaScript tags fire:
- Mouse movement tracking
- Viewability measurement (intersection observer)
- Time on page
3. Send telemetry to verification service
4. Verification service scores: fraud or legitimate
5. If fraud: refund the advertiser, blacklist the publisher/IP
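Step 4's scoring can be sketched as a simple rule over the collected telemetry. The point values and thresholds here are illustrative assumptions, not an industry standard; production systems feed these same signals into the ML scorer above.

```go
package main

import "fmt"

// Telemetry is the signal bundle the verification JS tag sends post-render.
type Telemetry struct {
	MouseEvents int
	ScrollDepth float64 // 0.0 - 1.0
	TimeOnPage  float64 // seconds
	Viewable    bool
}

// postBidScore accumulates suspicion points for missing human-interaction
// signals: 0 means every signal looks human, 100 means all are absent.
func postBidScore(t Telemetry) int {
	score := 0
	if t.MouseEvents == 0 {
		score += 40 // no mouse activity at all
	}
	if t.ScrollDepth == 0 {
		score += 20 // never scrolled
	}
	if t.TimeOnPage < 1.0 {
		score += 20 // bounced in under a second
	}
	if !t.Viewable {
		score += 20 // ad never entered the viewport
	}
	return score
}

func main() {
	bot := Telemetry{MouseEvents: 0, ScrollDepth: 0, TimeOnPage: 0.2, Viewable: false}
	human := Telemetry{MouseEvents: 37, ScrollDepth: 0.6, TimeOnPage: 24, Viewable: true}
	fmt.Println(postBidScore(bot), postBidScore(human)) // 100 0
}
```

A score above some cutoff triggers step 5: refund the advertiser and blacklist the source.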
8. Analytics & Attribution
8.1 Conversion Tracking
User journey:
Day 1: See banner ad (impression)
Day 2: Click display ad → visit landing page
Day 3: See video ad (impression)
Day 7: Search brand → click search ad → purchase
Question: Which ad(s) should get credit?
Tracking mechanism:
<!-- Ad click URL -->
<a href="https://advertiser.com/product?utm_source=adx&utm_campaign=summer&click_id=abc123">Shop now</a>
<!-- Landing page: fire pixel -->
<img src="https://tracker.adtech.com/track?click_id=abc123&event=landing" width="1" height="1">
<!-- Purchase confirmation: fire conversion pixel -->
<img src="https://tracker.adtech.com/track?click_id=abc123&event=purchase&value=99.99" width="1" height="1">
Backend tracking:
type ConversionTracker struct {
db *sql.DB
}
type ClickEvent struct {
ClickID string
UserID string
CampaignID string
CreativeID string
Timestamp time.Time
}
type ConversionEvent struct {
ClickID string
EventType string // "landing", "add_to_cart", "purchase"
Value float64
Timestamp time.Time
}
// Record click
func (ct *ConversionTracker) RecordClick(click *ClickEvent) error {
_, err := ct.db.Exec(`
INSERT INTO clicks (click_id, user_id, campaign_id, creative_id, timestamp)
VALUES ($1, $2, $3, $4, $5)
`, click.ClickID, click.UserID, click.CampaignID, click.CreativeID, click.Timestamp)
return err
}
// Record conversion
func (ct *ConversionTracker) RecordConversion(conv *ConversionEvent) error {
// Lookup click
var click ClickEvent
err := ct.db.QueryRow(`
SELECT click_id, user_id, campaign_id, creative_id, timestamp
FROM clicks
WHERE click_id = $1
`, conv.ClickID).Scan(&click.ClickID, &click.UserID, &click.CampaignID, &click.CreativeID, &click.Timestamp)
if err != nil {
return err // click not found
}
// Attribution window check (e.g., 30 days)
if conv.Timestamp.Sub(click.Timestamp) > 30*24*time.Hour {
return ErrAttributionWindowExpired
}
// Store conversion
_, err = ct.db.Exec(`
INSERT INTO conversions (click_id, user_id, campaign_id, event_type, value, timestamp)
VALUES ($1, $2, $3, $4, $5, $6)
`, conv.ClickID, click.UserID, click.CampaignID, conv.EventType, conv.Value, conv.Timestamp)
return err
}
8.2 Multi-Touch Attribution
Models:
User saw 4 ads before purchase:
Ad A (display, day 1)
Ad B (video, day 3)
Ad C (social, day 5)
Ad D (search, day 7) → purchase $100
Attribution models:
1. Last-Click (simple):
Ad D gets 100% credit = $100
2. First-Click:
Ad A gets 100% credit = $100
3. Linear:
Each ad = 25% credit = $25 each
4. Time-Decay:
More recent ads get more credit
Ad A: 10% = $10
Ad B: 20% = $20
Ad C: 30% = $30
Ad D: 40% = $40
5. Position-Based (U-shaped):
First + last get most credit
Ad A: 40% = $40
Ad B: 10% = $10
Ad C: 10% = $10
Ad D: 40% = $40
6. Data-Driven (ML):
Model learns each touchpoint's contribution from historical data
Ad A: 15% (awareness)
Ad B: 35% (high engagement)
Ad C: 20% (consideration)
Ad D: 30% (final push)
Implementation:
type AttributionModel interface {
Allocate(touchpoints []*Touchpoint, conversionValue float64) map[string]float64
}
type Touchpoint struct {
CampaignID string
Timestamp time.Time
Channel string // "display", "video", "search", "social"
}
// Time-decay model
type TimeDecayModel struct {
HalfLife time.Duration // e.g., 7 days
}
func (m *TimeDecayModel) Allocate(touchpoints []*Touchpoint, value float64) map[string]float64 {
if len(touchpoints) == 0 {
return nil
}
conversionTime := touchpoints[len(touchpoints)-1].Timestamp
// Calculate weights with exponential half-life decay: a touchpoint's
// weight halves for every HalfLife it precedes the conversion
// (exp(-t/T) alone would be a time constant, not a half-life)
weights := make(map[string]float64)
totalWeight := 0.0
for _, tp := range touchpoints {
timeDiff := conversionTime.Sub(tp.Timestamp)
weight := math.Pow(0.5, timeDiff.Seconds()/m.HalfLife.Seconds())
weights[tp.CampaignID] += weight
totalWeight += weight
}
// Normalize to sum = conversion value
allocated := make(map[string]float64)
for campaignID, weight := range weights {
allocated[campaignID] = (weight / totalWeight) * value
}
return allocated
}
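The position-based (U-shaped) model from the list above fits the same Allocate signature; here is a sketch using the 40/20/40 split from the example (the Touchpoint type is repeated so the snippet compiles standalone):

```go
package main

import (
	"fmt"
	"time"
)

// Minimal copy of the type used earlier in this section.
type Touchpoint struct {
	CampaignID string
	Timestamp  time.Time
	Channel    string
}

// PositionBasedModel: first and last touchpoints get 40% each,
// the middle touchpoints split the remaining 20% evenly.
type PositionBasedModel struct{}

func (m *PositionBasedModel) Allocate(touchpoints []*Touchpoint, value float64) map[string]float64 {
	allocated := make(map[string]float64)
	n := len(touchpoints)
	switch n {
	case 0:
		return nil
	case 1:
		allocated[touchpoints[0].CampaignID] = value
	case 2:
		allocated[touchpoints[0].CampaignID] += value * 0.5
		allocated[touchpoints[1].CampaignID] += value * 0.5
	default:
		allocated[touchpoints[0].CampaignID] += value * 0.4
		allocated[touchpoints[n-1].CampaignID] += value * 0.4
		middleShare := value * 0.2 / float64(n-2)
		for _, tp := range touchpoints[1 : n-1] {
			allocated[tp.CampaignID] += middleShare
		}
	}
	return allocated
}

func main() {
	tps := []*Touchpoint{
		{CampaignID: "A"}, {CampaignID: "B"}, {CampaignID: "C"}, {CampaignID: "D"},
	}
	m := &PositionBasedModel{}
	fmt.Println(m.Allocate(tps, 100)) // map[A:40 B:10 C:10 D:40]
}
```

The one- and two-touchpoint cases need explicit handling because the 40/20/40 split is undefined when there is no "middle".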
// Data-driven model (ML-based)
type DataDrivenModel struct {
shapleyModel *ShapleyValueModel
}
// Shapley values: game theory approach
// Measures marginal contribution of each touchpoint
func (m *DataDrivenModel) Allocate(touchpoints []*Touchpoint, value float64) map[string]float64 {
// Compute Shapley values (expensive, done offline)
shapleyValues := m.shapleyModel.Compute(touchpoints)
// Allocate proportionally
totalShapley := 0.0
for _, v := range shapleyValues {
totalShapley += v
}
allocated := make(map[string]float64)
for campaignID, shapley := range shapleyValues {
allocated[campaignID] = (shapley / totalShapley) * value
}
return allocated
}
8.3 Real-Time Reporting
// Stream conversions to data warehouse for analytics
type ConversionStreamer struct {
kafka *kafka.Producer
}
func (cs *ConversionStreamer) Stream(conv *ConversionEvent) error {
message := &kafka.Message{
TopicPartition: kafka.TopicPartition{
Topic: "conversions",
Partition: kafka.PartitionAny,
},
Key: []byte(conv.ClickID),
Value: marshal(conv),
}
return cs.kafka.Produce(message, nil)
}
// Consumer: aggregate metrics in real-time
// Tools: Flink, Spark Streaming, ksqlDB
SELECT
campaign_id,
COUNT(*) as impressions,
SUM(CASE WHEN event = 'click' THEN 1 ELSE 0 END) as clicks,
SUM(CASE WHEN event = 'conversion' THEN value ELSE 0 END) as revenue,
window_start,
window_end
FROM events
WINDOW TUMBLING (SIZE 1 MINUTE)
GROUP BY campaign_id, window_start, window_end;
// Feed into dashboards: real-time campaign performance
9. Interview Questions & System Design
9.1 Core Questions
Q1: Design an ad exchange that handles 100k QPS with sub-100ms latency.
Clarifying questions:
- Traffic pattern? Uniform or peaky?
- Geographic distribution? Global or specific regions?
- Budget constraints? Can we use premium infrastructure?
- Data consistency requirements? Strongly consistent budgets or eventual?
High-level architecture:
┌───────────────────────────────────────────────────────────┐
│                      Global Traffic                       │
│                (100k QPS = 6M requests/min)               │
└──────────────────────────┬────────────────────────────────┘
                           │ Anycast routing
           ┌───────────────┼───────────────┐
           │               │               │
           ▼               ▼               ▼
      ┌─────────┐     ┌─────────┐     ┌──────────┐
      │ PoP US  │     │ PoP EU  │     │ PoP APAC │
      │ 20k QPS │     │ 40k QPS │     │ 40k QPS  │
      └────┬────┘     └────┬────┘     └────┬─────┘
           │               │               │
           ▼               ▼               ▼
Per-PoP stack (40k QPS):
┌──────────────────────────────────────────────┐
│ L4 LB (40k QPS)                              │
│  ├─ 40 instances × 1k QPS/instance           │
│  └─ Golang servers, in-memory caches         │
│                                              │
│ Local Redis (segments, creatives)            │
│  ├─ Read-heavy: 100k reads/sec               │
│  └─ Sub-1ms latency                          │
│                                              │
│ Budget coordinator (distributed)             │
│  ├─ Allocate budget every 5 min              │
│  └─ Central reconciliation (eventual)        │
│                                              │
│ DSP connections                              │
│  ├─ 50 DSPs concurrently                     │
│  ├─ HTTP/2 connection pooling                │
│  └─ Circuit breakers                         │
└──────────────────────────────────────────────┘
Optimizations:
1. Pre-compute user segments → cache lookup only
2. HTTP/2 multiplexing → reduce handshake cost
3. Payload compression → gzip (10KB → 2KB)
4. Connection keep-alive → no reconnect overhead
5. Async win notices → don't block the response
6. Local budget allocation → avoid cross-region latency
Latency breakdown:
- Request parse + fraud check: 7ms
- DSP fan-out (parallel): 60ms (bottleneck)
- Budget check (local): 2ms
- Auction logic: 1ms
- Response marshal: 3ms
- Network: 10ms
Total: ~83ms (target < 100ms)
Scaling:
- Horizontal: add more PoPs (e.g., 15 → 30 locations)
- Vertical: upgrade instance types (more CPU/mem)
- Autoscaling: based on request latency P99
Monitoring:
- Latency: P50, P95, P99 per PoP
- QPS: current vs capacity
- Win rate: bids submitted vs won
- Revenue: $ per second
- Error rate: timeouts, 5xx responses
Q2: Budget overspend problem: 20 datacenters, $10k/day budget. How to prevent overspend?
Option 1: Centralized counter (strong consistency)
Pros: accurate, no overspend
Cons: single bottleneck, cross-region latency (50-200ms)
→ Doesn't scale, kills the latency SLA
Option 2: Distributed with allocation (eventual consistency)
- Allocate $500 to each DC, rebalanced every 5 minutes
- Each DC tracks spend locally (fast, no network round-trips)
- Rebalance: DCs return unused budget to the pool
- Risk: overspend of ~$500-1000 (5-10% of the daily budget)
Pros: low latency, scales
Cons: slight overspend acceptable
Option 3: Hybrid
- Allocate optimistically
- When nearly exhausted (remaining < 10%), fall back to a central check
- 90% requests: local check (fast)
- 10% requests: central check (accurate)
Pros: balance latency + accuracy
Real-world choice: Option 3
- Advertiser tolerance: 5-10% overspend acceptable
- End-of-day reconciliation: credit back overspend
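Option 3 can be sketched as follows. The central coordinator is injected as a function value (an assumption to keep the snippet self-contained), and the 10% cutoff mirrors the hybrid rule above:

```go
package main

import (
	"fmt"
	"sync"
)

// HybridBudget does a fast local check while plenty of budget remains,
// and falls back to a slower, accurate central check once the local
// allocation drops below 10%.
type HybridBudget struct {
	mu         sync.Mutex
	local      float64 // budget currently allocated to this datacenter
	allocation float64 // original allocation size
	centralTry func(amount float64) bool
}

func NewHybridBudget(allocation float64, centralTry func(float64) bool) *HybridBudget {
	return &HybridBudget{local: allocation, allocation: allocation, centralTry: centralTry}
}

// TrySpend returns true if the bid may be placed.
func (b *HybridBudget) TrySpend(amount float64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.local-amount >= 0.1*b.allocation {
		b.local -= amount // fast path: purely local, no network round-trip
		return true
	}
	// Slow path: near exhaustion, ask the central coordinator.
	return b.centralTry(amount)
}

func main() {
	central := func(amount float64) bool { return false } // central pool empty
	b := NewHybridBudget(500, central)
	ok1 := b.TrySpend(400) // local: 500-400 = 100 >= 50 → allowed
	ok2 := b.TrySpend(100) // would leave 0 < 50 → central check → denied
	fmt.Println(ok1, ok2) // true false
}
```

The fast path touches only local memory; only the tail of each allocation pays the cross-region round-trip.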
Q3: A user has 20 cookie IDs (Chrome, Safari, mobile app, etc.). How do we unify them?
Challenge: Cross-device identity resolution
Approaches:
1. Deterministic matching (login-based):
- User logs in → link all device cookies to the user account
- Accurate, but low coverage (only logged-in users)
2. Probabilistic matching (fingerprinting):
- Signals: IP + User Agent + timezone + language + screen resolution
- ML model: predict "same user" probability
- High coverage, but lower accuracy (~70-80%)
3. Identity graphs (vendor: LiveRamp, Neustar):
- Third-party providers maintain billions of mappings
- Publisher/advertiser submit hashed emails
- Get back unified ID
4. Cohort-based (Privacy Sandbox, FLoC):
- Group users into cohorts (~1000 users/cohort)
- Target cohorts, not individuals
- Privacy-preserving, but coarse targeting
Implementation:
class IdentityGraph:
    def link(self, cookie_id, user_id):
        # Store the mapping in a graph DB (e.g., Neo4j)
        graph.add_edge(cookie_id, user_id)

    def resolve(self, cookie_id) -> str:
        # BFS/DFS to find the connected component
        cluster = graph.connected_component(cookie_id)
        return cluster.canonical_id()
Trade-off:
- Deterministic: accurate, low coverage
- Probabilistic: high coverage, less accurate
- Hybrid: use deterministic when available, fallback probabilistic
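The connected-component lookup in the Python sketch can also be implemented with union-find instead of BFS/DFS over a graph DB; here is a standalone Go sketch (the IDs are made up):

```go
package main

import "fmt"

// IdentityGraph links IDs (cookies, device IDs, hashed emails) using
// union-find; Resolve returns one canonical ID per connected cluster.
type IdentityGraph struct {
	parent map[string]string
}

func NewIdentityGraph() *IdentityGraph {
	return &IdentityGraph{parent: make(map[string]string)}
}

func (g *IdentityGraph) find(id string) string {
	if _, ok := g.parent[id]; !ok {
		g.parent[id] = id // first time we see this ID: its own root
	}
	for g.parent[id] != id {
		g.parent[id] = g.parent[g.parent[id]] // path halving
		id = g.parent[id]
	}
	return id
}

// Link records that two IDs belong to the same user.
func (g *IdentityGraph) Link(a, b string) {
	g.parent[g.find(a)] = g.find(b)
}

// Resolve returns the canonical ID for the cluster containing id.
func (g *IdentityGraph) Resolve(id string) string {
	return g.find(id)
}

func main() {
	g := NewIdentityGraph()
	g.Link("chrome-cookie", "user-42")     // deterministic: linked at login
	g.Link("safari-cookie", "user-42")     // deterministic: linked at login
	g.Link("mobile-idfa", "safari-cookie") // probabilistic match
	fmt.Println(g.Resolve("chrome-cookie") == g.Resolve("mobile-idfa")) // true
}
```

Union-find keeps resolution near O(1) amortized, which matters when the graph holds billions of edges.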
Q4: Ad fraud: 30% of traffic is bots. How do we detect them in real time?
Detection pipeline:
1. Pre-bid filtering (block obvious bots):
- IP blocklist (datacenters, proxies)
- User-agent blacklist (known bot signatures)
- Rate limiting (e.g., 100 clicks/min from a single IP)
Cost: 5-10ms latency
Catch: ~50% bots
2. Real-time ML scoring:
- Features: IP reputation, click velocity, device consistency
- Model: Random Forest, GBDT (pre-trained)
- Inference: <5ms (ONNX runtime)
- Threshold: score > 0.7 → reject
Catch: additional ~30% bots
3. Post-bid verification:
- JavaScript tags measure:
* Mouse movements
* Scroll depth
* Viewability (intersection observer)
* Time on page
- Send telemetry to verification service
- Async scoring: fraud or legit
Catch: final ~15% sophisticated bots
4. Offline analysis (batch):
- Aggregate click patterns
- Find anomalies (e.g., IP cluster clicking same ads)
- Retroactive refund advertisers
- Update blocklists
Layered defense:
- Layer 1: Rule-based (fast, catch obvious)
- Layer 2: ML scoring (moderate, catch sophisticated)
- Layer 3: Behavioral analysis (slow, catch advanced)
- Layer 4: Human review (manual, catch novel attacks)
Cost:
- False positive (block legit user): lost revenue
- False negative (let bot through): advertiser waste money, lose trust
Optimization: tune threshold based on cost function.
9.2 Trade-off Discussions
| Dimension | Option A | Option B | Trade-off |
|---|---|---|---|
| Auction type | First-price | Second-price | Transparency vs bid shading complexity |
| Budget tracking | Centralized | Distributed | Accuracy vs latency |
| User tracking | Cookie-based | Cohort-based | Targeting precision vs privacy |
| Fraud detection | Rule-based | ML-based | Speed vs accuracy |
| Attribution | Last-click | Multi-touch | Simplicity vs fairness |
| Latency vs coverage | Timeout 50ms (fewer bids) | Timeout 100ms (more bids) | Revenue vs latency |
Real-world engineering:
- There is no "best" solution, only trade-offs
- Choose based on business constraints: revenue target, user experience, compliance
- Iterate: start simple (last-click, first-price), evolve (multi-touch, data-driven)
10. Advanced Topics
10.1 Header Bidding
<!-- Traditional waterfall: sequential, slow -->
<script>
// Call ad server first
fetchAdFromAdServer()
.then(ad => render(ad))
.catch(() => {
// Fallback to SSP 1
fetchAdFromSSP1()
.catch(() => fetchAdFromSSP2())
});
</script>
<!-- Header bidding: parallel auction -->
<script src="prebid.js"></script>
<script>
var adUnits = [{
code: 'div-banner-1',
mediaTypes: {
banner: { sizes: [[300, 250], [728, 90]] }
},
bids: [
{ bidder: 'appnexus', params: { placementId: '123' } },
{ bidder: 'rubicon', params: { accountId: '456' } },
{ bidder: 'pubmatic', params: { publisherId: '789' } }
]
}];
pbjs.que.push(function() {
pbjs.addAdUnits(adUnits);
pbjs.requestBids({
timeout: 2000, // 2 seconds total
bidsBackHandler: function(bids) {
// All SSPs returned bids
var winningBid = selectWinner(bids);
// Send winning bid to ad server
googletag.setTargeting('hb_pb', winningBid.cpm);
googletag.display('div-banner-1');
}
});
});
</script>
Challenge:
- Page load blocked during bidding (2 seconds)
- User experience impact
- Mobile especially sensitive
Solution: Server-side header bidding
- Move auction to server (faster network)
- Client sends 1 request to server
- Server fans out to SSPs
- Latency: 200ms vs 2000ms
10.2 Video Ads (VAST Protocol)
<!-- VAST: Video Ad Serving Template -->
<VAST version="4.0">
<Ad id="123456">
<InLine>
<AdSystem>ExampleDSP</AdSystem>
<AdTitle>Summer Sale</AdTitle>
<Impression><![CDATA[
https://track.example.com/impression?id=123456
]]></Impression>
<Creatives>
<Creative>
<Linear>
<Duration>00:00:15</Duration>
<MediaFiles>
<MediaFile delivery="progressive" type="video/mp4" width="1920" height="1080">
<![CDATA[https://cdn.example.com/video123.mp4]]>
</MediaFile>
</MediaFiles>
<VideoClicks>
<ClickThrough><![CDATA[
https://advertiser.com/product?utm_source=video
]]></ClickThrough>
<ClickTracking><![CDATA[
https://track.example.com/click?id=123456
]]></ClickTracking>
</VideoClicks>
<TrackingEvents>
<Tracking event="start"><![CDATA[
https://track.example.com/start?id=123456
]]></Tracking>
<Tracking event="firstQuartile"><![CDATA[
https://track.example.com/25pct?id=123456
]]></Tracking>
<Tracking event="midpoint"><![CDATA[
https://track.example.com/50pct?id=123456
]]></Tracking>
<Tracking event="thirdQuartile"><![CDATA[
https://track.example.com/75pct?id=123456
]]></Tracking>
<Tracking event="complete"><![CDATA[
https://track.example.com/complete?id=123456
]]></Tracking>
</TrackingEvents>
</Linear>
</Creative>
</Creatives>
</InLine>
</Ad>
</VAST>
<!-- Video player fires tracking pixels at each quartile -->
<!-- More complex than display ads β higher CPM -->
10.3 Supply Path Optimization (SPO)
Problem: A bid request passes through many hops, and each hop takes a 5-10% cut
Publisher
  ↓ (90% pass-through)
SSP 1
  ↓ (90% pass-through)
Ad Network
  ↓ (90% pass-through)
Ad Exchange
  ↓
DSP ← Advertiser pays $5 CPM
Publisher receives: $5 × 0.9 × 0.9 × 0.9 ≈ $3.65 (27% lost to middlemen)
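The fee arithmetic above generalizes to any path length; a small helper makes it easy to quantify what removing one hop is worth:

```go
package main

import "fmt"

// PublisherNet returns what the publisher receives after each
// intermediary in the supply path takes its cut.
func PublisherNet(advertiserCPM float64, takeRates []float64) float64 {
	net := advertiserCPM
	for _, rate := range takeRates {
		net *= 1 - rate
	}
	return net
}

func main() {
	// Three hops taking 10% each, as in the diagram above.
	fmt.Printf("$%.2f\n", PublisherNet(5.00, []float64{0.10, 0.10, 0.10})) // $3.65
	// SPO removes one hop: publisher yield rises with no extra ad spend.
	fmt.Printf("$%.2f\n", PublisherNet(5.00, []float64{0.10, 0.10})) // $4.05
}
```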
SPO goal: Shorten path, increase publisher yield
Strategies:
1. Direct deals with DSPs (bypass intermediaries)
2. Preferred deals (programmatic guaranteed)
3. Auction transparency (show full supply path)
4. DSP blocklist: skip low-quality SSPs/exchanges
sellers.json: IAB standard
- Publishers declare authorized resellers
- DSPs validate supply path
- Block unauthorized reselling
11. Summary & Best Practices
11.1 Key Takeaways
Latency is king: 100ms budget, every millisecond counts
- Pre-compute offline (segments, creative metadata)
- Cache aggressively (Redis, in-memory)
- Connection pooling, HTTP/2, compression
Global distribution: User traffic is global, so ads must be served near the user
- 20-30 PoPs worldwide
- Anycast routing
- Colocation with major DSPs
Budget management: Distributed consistency is a hard problem
- Allocate optimistically, reconcile eventually
- Hybrid approach: local + central fallback
Fraud is real: 30-50% of traffic can be fraudulent
- Multi-layer defense: rules, ML, behavioral
- Post-bid verification critical
- Cost of false positive vs false negative
Privacy compliance: GDPR and CCPA changed the game
- Consent management mandatory
- Contextual targeting comeback
- First-party data >>> third-party data
Attribution is complex: Multi-touch attribution is fairer but harder
- Data-driven models (Shapley values)
- Real-time reporting for campaign optimization
Monitoring & alerting: Revenue at stake, downtime = $$
- Latency P99, QPS, win rate, revenue/sec
- Alert on anomalies (traffic spike, latency spike)
11.2 Production Checklist
Architecture:
- Global PoP distribution (20+ locations)
- Anycast IP routing
- Autoscaling (CPU, latency-based triggers)
- Circuit breakers for all external calls
- Connection pooling, HTTP/2
- In-memory caching (Redis, Memcached)
Latency:
- P99 latency < 100ms
- Timeout budgets configured
- Parallel bid requests (fan-out)
- Payload compression enabled
- Pre-computed data (offline batch jobs)
Budget:
- Distributed budget tracking
- Pacing algorithm (time-of-day aware)
- End-of-day reconciliation
- Overspend alerts (> 5% threshold)
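The "pacing algorithm (time-of-day aware)" item can be sketched as a cumulative spend target that follows an hourly traffic curve instead of a flat 1/24 per hour; the traffic weights below are made up for illustration:

```go
package main

import "fmt"

// PacingTarget returns the cumulative fraction of the daily budget that
// should be spent by the end of the given hour, following an hourly
// traffic curve.
func PacingTarget(hourlyTraffic [24]float64, hour int) float64 {
	total, sofar := 0.0, 0.0
	for h, w := range hourlyTraffic {
		total += w
		if h <= hour {
			sofar += w
		}
	}
	return sofar / total
}

// ShouldBid throttles bidding when actual spend runs ahead of the curve.
func ShouldBid(spent, dailyBudget, target float64) bool {
	return spent < dailyBudget*target
}

func main() {
	var traffic [24]float64
	for h := range traffic {
		traffic[h] = 1 // flat baseline...
	}
	for h := 18; h <= 22; h++ {
		traffic[h] = 3 // ...with an evening peak
	}
	target := PacingTarget(traffic, 11) // cumulative target through hour 11
	fmt.Printf("%.3f\n", target)                // 0.353
	fmt.Println(ShouldBid(6000, 10000, target)) // false: spending too fast
}
```

A campaign that burned 60% of its budget by noon gets throttled until the curve catches up, instead of exhausting the budget before the evening peak.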
Fraud:
- IP blocklist (datacenter ranges)
- Real-time ML scoring (<5ms)
- Post-bid verification (JS tags)
- Human review pipeline (manual QA)
Privacy:
- Consent management (TCF 2.0)
- GDPR compliance (data retention, deletion)
- CCPA compliance (opt-out mechanism)
- Contextual targeting fallback
Monitoring:
- Real-time dashboards (Grafana, Datadog)
- Metrics: QPS, latency, win rate, revenue
- Alerts: latency spike, error rate spike
- On-call runbook (incident response)
Testing:
- Load testing (100k+ QPS)
- Chaos engineering (kill random instances)
- A/B testing (auction algorithms, pacing)
- Fraud detection false positive rate tracking
12. Recommended Reading & Resources
Books:
- Real-Time Bidding at Scale - Łukasz Siewicz
- Computational Advertising - Andrei Broder, Vanja Josifovski
- Distributed Systems - Maarten van Steen (consistency models)
Industry Standards:
- OpenRTB 2.5 Spec: https://iabtechlab.com/openrtb
- IAB Tech Lab VAST 4.0: https://iabtechlab.com/vast
- TCF 2.0 (consent): https://iabeurope.eu/tcf-2-0/
Vendor Documentation:
- Google Ad Manager (publisher ad server)
- The Trade Desk (DSP platform)
- Criteo (retargeting, bidding strategies)
Blogs:
- Criteo Engineering Blog (bidding optimization)
- AppNexus (Xandr) Tech Blog (RTB infrastructure)
- Google Ads Developer Blog
Papers:
- "Deep Learning for Click-Through Rate Estimation" (Google, 2016)
- "Ad Click Prediction: a View from the Trenches" (Google, 2013)
- "Real-Time Bidding Algorithms for Performance-Based Display Ad Allocation" (Stanford)
Tools:
- Prebid.js (header bidding)
- OpenX SDK (ad exchange)
- Google Ad Verification SDK
You have just walked through the AdTech ecosystem: from auction mechanics and low-latency architecture to fraud detection and privacy compliance. When you build an ad exchange that handles 100k QPS at sub-100ms latency, you are solving some of the hardest problems in distributed systems: ultra-low latency, global scale, real-time budget management, and fraud prevention at billions of events per day.
Remember: AdTech is a battle between advertisers (who want precise targeting), publishers (who want to maximize revenue), and users (who want privacy and a good UX). The platform engineer's job is this balancing act: fast, accurate, fair, and compliant.
Good luck in the interview, or when building the real system!