🏭 Domains · ✍️ Khoa · 📅 19/04/2026 · ☕ 16 min read

Domain: Ride-hailing & Location-based Services

Grab, Gojek, and Be look simple from the outside: you request a ride and a driver shows up. Behind that sits real-time matching between millions of drivers and riders, ETA calculation on live traffic data, dynamic surge pricing, and geospatial indexing tuned tightly enough to answer "the 10 nearest drivers within 2 km" in about 10 ms.


1. Geospatial Indexing

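1.1 The Problem: Querying Nearby Drivers

Naive approach:
  -- Assumes location is a PostGIS geography column, so ST_Distance returns meters
  SELECT * FROM drivers
  WHERE status = 'AVAILABLE'
    AND ST_Distance(location, ST_SetSRID(ST_MakePoint(lng, lat), 4326)::geography) < 2000  -- 2 km
  ORDER BY ST_Distance(location, ST_SetSRID(ST_MakePoint(lng, lat), 4326)::geography)
  LIMIT 10;

Problem: the distance must be computed for every row, so each query is a full table scan, O(N).
With 100k drivers online and many riders searching at once, latency blows far past the budget.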

1.2 Geohash

Geohash encodes a lat/lng pair as a string. Every extra character subdivides the cell, so points inside the same cell share a prefix and nearby points usually do too:

Bangkok (13.7563°N, 100.5018°E) → w4rqqb
  - w4rqqbr  (precision 7: ~153m × 153m)
  - w4rqqb   (precision 6: ~1.2km × 0.6km)
  - w4rqq    (precision 5: ~4.9km × 4.9km)
  - w4rq     (precision 4: ~39km × 19.5km)

Nearby search:
  1. Compute the geohash of the user location (precision 6)
  2. Query drivers whose geohash matches that cell or one of its neighbors
  3. Compute the exact distance for the results and filter again
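
To make the prefix property concrete, here is a minimal sketch of geohash encoding written from the public algorithm description (alternate longitude and latitude bisections, one base32 character per 5 bits). The function name EncodeGeohash is illustrative; production code would normally use a library such as the one imported in the query code.

```go
package main

// base32 is the geohash alphabet: digits plus lowercase letters without a, i, l, o.
const base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

// EncodeGeohash bisects the longitude and latitude ranges in turn
// (longitude first) and emits one base32 character for every 5 bits.
func EncodeGeohash(lat, lng float64, precision int) string {
	latLo, latHi := -90.0, 90.0
	lngLo, lngHi := -180.0, 180.0
	hash := make([]byte, 0, precision)
	bits, nbits := 0, 0
	evenBit := true // even bits refine longitude, odd bits refine latitude
	for len(hash) < precision {
		if evenBit {
			mid := (lngLo + lngHi) / 2
			if lng >= mid {
				bits = bits<<1 | 1
				lngLo = mid
			} else {
				bits <<= 1
				lngHi = mid
			}
		} else {
			mid := (latLo + latHi) / 2
			if lat >= mid {
				bits = bits<<1 | 1
				latLo = mid
			} else {
				bits <<= 1
				latHi = mid
			}
		}
		evenBit = !evenBit
		nbits++
		if nbits == 5 {
			hash = append(hash, base32[bits])
			bits, nbits = 0, 0
		}
	}
	return string(hash)
}
```

Because each character only refines the previous cell, EncodeGeohash(lat, lng, 5) is always a prefix of EncodeGeohash(lat, lng, 6) for the same point, which is what makes prefix or equality lookups on an indexed column work.
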

-- ll_to_earth / earth_distance come from the earthdistance extension, which requires cube
CREATE EXTENSION IF NOT EXISTS cube;
CREATE EXTENSION IF NOT EXISTS earthdistance;

CREATE TABLE drivers (
    id              BIGSERIAL PRIMARY KEY,
    driver_id       VARCHAR(50) UNIQUE NOT NULL,
    name            TEXT NOT NULL,
    phone           VARCHAR(20),
    vehicle_type    VARCHAR(20),  -- BIKE, CAR, PREMIUM
    status          VARCHAR(20),  -- AVAILABLE, BUSY, OFFLINE
    lat             DOUBLE PRECISION NOT NULL,
    lng             DOUBLE PRECISION NOT NULL,
    geohash_6       CHAR(6),      -- precision 6: ~1km grid
    bearing         INT,          -- heading direction (0-359)
    last_updated    TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Key indexes
-- Partial index: status is already fixed by the WHERE clause, so only geohash_6 is indexed
CREATE INDEX idx_drivers_geohash ON drivers(geohash_6)
WHERE status = 'AVAILABLE';

CREATE INDEX idx_drivers_location ON drivers USING GIST(
    ll_to_earth(lat, lng)
);
Query:

import (
    "database/sql"

    "github.com/lib/pq"
    "github.com/mmcloughlin/geohash"
)

func FindNearbyDrivers(db *sql.DB, lat, lng float64, radiusKm float64) ([]Driver, error) {
    // Step 1: encode the user location
    userGeohash := geohash.EncodeWithPrecision(lat, lng, 6)

    // Step 2: collect the 9 cells to search (center + 8 neighbors).
    // A 3x3 block of precision-6 cells only guarantees coverage for radii up to
    // about 0.6 km; for larger radii, index and query a coarser precision.
    neighbors := geohash.Neighbors(userGeohash)
    searchGeohashes := append([]string{userGeohash}, neighbors...)

    // Step 3: query drivers in those cells and filter by exact distance
    query := `
        SELECT id, driver_id, name, lat, lng,
               earth_distance(ll_to_earth(lat, lng), ll_to_earth($1, $2)) AS distance
        FROM drivers
        WHERE geohash_6 = ANY($3)
          AND status = 'AVAILABLE'
          AND earth_distance(ll_to_earth(lat, lng), ll_to_earth($1, $2)) < $4
        ORDER BY distance
        LIMIT 20
    `

    rows, err := db.Query(query, lat, lng, pq.Array(searchGeohashes), radiusKm*1000)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var drivers []Driver
    for rows.Next() {
        var d Driver
        if err := rows.Scan(&d.ID, &d.DriverID, &d.Name, &d.Lat, &d.Lng, &d.Distance); err != nil {
            return nil, err
        }
        drivers = append(drivers, d)
    }
    // rows.Err reports any error that ended the iteration early
    return drivers, rows.Err()
}

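Note: two points can be close together yet fall into different cells, which is why the neighbors must be queried too. The 3×3 block also has to cover the whole search radius: at precision 6 that holds only up to roughly 0.6 km, so a 2 km search needs precision-5 cells. The PostgreSQL earthdistance module then computes the exact distance, treating the Earth as a sphere.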
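
The coverage rule can be sketched as a small lookup: a 3×3 block of cells is guaranteed to contain the search circle only when the smaller side of one cell is at least the radius. The table below uses the commonly quoted approximate cell sizes near the equator (cells get narrower at higher latitudes), and PrecisionForRadius is a hypothetical helper name.

```go
package main

// cellMinKm[p] is the approximate smaller side, in km, of a geohash cell at
// precision p near the equator. Index 0 is unused.
var cellMinKm = []float64{0, 4992, 624, 156, 19.5, 4.89, 0.61, 0.153, 0.0191}

// PrecisionForRadius returns the finest precision whose 3x3 neighborhood
// still covers a circle of radiusKm centered anywhere in the middle cell.
func PrecisionForRadius(radiusKm float64) int {
	for p := len(cellMinKm) - 1; p >= 1; p-- {
		if cellMinKm[p] >= radiusKm {
			return p
		}
	}
	return 1 // radius larger than any cell: fall back to the coarsest level
}
```

With this rule a 2 km search maps to precision 5 and a 500 m search to precision 6. Finer cells mean fewer false candidates to filter, so picking the finest precision that still covers the radius keeps the exact-distance step cheap.
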

1.3 S2 Geometry (Google's approach)

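S2 is Google's library for indexing the sphere. It projects the Earth onto the six faces of a cube and subdivides each face into a hierarchy of cells, which gives cells of far more uniform size and shape than Geohash, whose cells distort toward the poles.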

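import (
    "database/sql"

    "github.com/golang/geo/s1"
    "github.com/golang/geo/s2"
    "github.com/lib/pq"
)

const (
    earthRadiusMeters = 6371000.0
    driverCellLevel   = 13 // assumed level of drivers.s2_cell_id (cells roughly 1 km across)
)

func GetS2CellID(lat, lng float64, level int) s2.CellID {
    latlng := s2.LatLngFromDegrees(lat, lng)
    return s2.CellIDFromLatLng(latlng).Parent(level)
}

func FindDriversS2(db *sql.DB, lat, lng float64, radiusMeters float64) ([]Driver, error) {
    center := s2.PointFromLatLng(s2.LatLngFromDegrees(lat, lng))

    // Build the S2 region (a spherical cap). The angle is radius / Earth radius, in radians.
    region := s2.CapFromCenterAngle(center, s1.Angle(radiusMeters/earthRadiusMeters))

    // Covering: the set of cells, all at the stored level, that covers the cap
    coverer := &s2.RegionCoverer{MinLevel: driverCellLevel, MaxLevel: driverCellLevel, MaxCells: 64}
    covering := coverer.Covering(region)

    // Cell IDs are uint64; they are stored in a BIGINT column as the same 64-bit pattern
    cellIDs := make([]int64, 0, len(covering))
    for _, cellID := range covering {
        cellIDs = append(cellIDs, int64(cellID))
    }

    // Query drivers in those cells
    query := `
        SELECT id, driver_id, lat, lng
        FROM drivers
        WHERE s2_cell_id = ANY($1)
          AND status = 'AVAILABLE'
    `
    rows, err := db.Query(query, pq.Array(cellIDs))
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    // Same as the geohash version; callers should still filter by exact distance
    var drivers []Driver
    for rows.Next() {
        var d Driver
        if err := rows.Scan(&d.ID, &d.DriverID, &d.Lat, &d.Lng); err != nil {
            return nil, err
        }
        drivers = append(drivers, d)
    }
    return drivers, rows.Err()
}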

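S2 advantages:

  • More uniform cells than Geohash (much less distortion in cell size and shape)
  • Efficient nearby search over large areas (global scale)
  • Proven at scale: S2 originated at Google, and Uber and Lyft are commonly cited as users

Trade-off: the implementation is more complex than Geohash.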


2. Real-time Location Tracking

2.1 Driver Location Update Flow

Driver App                    Backend                    Rider App
  │                             │                           │
  │─ POST /location (every 5s)─►│                           │
  │   {lat, lng, bearing}       │                           │
  │                             │─ Update Redis ───────────►│
  │                             │  (driver:123:location)    │
  │                             │  TTL 30s                  │
  │                             ├─ Publish Redis Stream ───►│
  │                             │  "driver:location:stream" │
  │                             │                           │
  │                             │                WebSocket ─┤
  │                             │               (subscribe) │
  │                             │◄─────────────────────────►│
  │                             │  push location updates    │

Async worker:
  Every 10s, batch-flush Redis → PostgreSQL (historical tracking)

Redis schema:

GEOADD drivers:available <lng> <lat> <driver_id>
  → Redis GEO type: a sorted set scored by geohash
  → Members of a GEO set have no TTL; stale drivers must be removed explicitly (ZREM)

SET driver:123:status "AVAILABLE" EX 30
SET driver:123:bearing 45 EX 30

XADD driver:location:stream * driver_id 123 lat 13.756 lng 100.501
  → Redis Stream: real-time event log

import (
    "context"
    "fmt"
    "time"

    "github.com/redis/go-redis/v9" // assumed client library
)

func UpdateDriverLocation(ctx context.Context, rdb *redis.Client, driverID string, lat, lng float64, bearing int) error {
    pipe := rdb.Pipeline()

    // Update the GEO index. GEO members never expire on their own, so a separate
    // sweeper must ZREM drivers whose per-driver keys have timed out.
    pipe.GeoAdd(ctx, "drivers:available", &redis.GeoLocation{
        Name:      driverID,
        Longitude: lng,
        Latitude:  lat,
    })

    // Store the bearing; the 30s TTL doubles as a liveness signal
    pipe.Set(ctx, fmt.Sprintf("driver:%s:bearing", driverID), bearing, 30*time.Second)

    // Publish to the stream for real-time subscribers
    pipe.XAdd(ctx, &redis.XAddArgs{
        Stream: "driver:location:stream",
        Values: map[string]interface{}{
            "driver_id": driverID,
            "lat":       lat,
            "lng":       lng,
            "bearing":   bearing,
            "timestamp": time.Now().Unix(),
        },
    })

    _, err := pipe.Exec(ctx)
    return err
}

func GetNearbyDrivers(ctx context.Context, rdb *redis.Client, lat, lng float64, radiusKm float64) ([]string, error) {
    // GEORADIUS is deprecated since Redis 6.2; GEOSEARCH is its replacement
    result, err := rdb.GeoRadius(ctx, "drivers:available", lng, lat, &redis.GeoRadiusQuery{
        Radius:    radiusKm,
        Unit:      "km",
        WithCoord: true,
        WithDist:  true,
        Count:     20,
        Sort:      "ASC", // nearest first
    }).Result()
    if err != nil {
        return nil, err
    }

    driverIDs := make([]string, 0, len(result))
    for _, loc := range result {
        driverIDs = append(driverIDs, loc.Name)
    }
    return driverIDs, nil
}

2.2 WebSocket Push to Rider

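import (
    "fmt"
    "strconv"
    "time"

    "github.com/gorilla/websocket"
    "github.com/redis/go-redis/v9" // assumed client library
)

type Hub struct {
    clients    map[*Client]bool
    broadcast  chan LocationUpdate
    register   chan *Client
    unregister chan *Client
}

type Client struct {
    hub    *Hub
    conn   *websocket.Conn
    send   chan []byte
    rideID string // subscribe to a specific ride's driver location
}

// rdb and ctx are assumed to be package-level variables initialized at startup.
func (h *Hub) Run() {
    // Redis Stream consumer
    go func() {
        lastID := "$" // start with messages that arrive after startup
        for {
            streams, err := rdb.XRead(ctx, &redis.XReadArgs{
                Streams: []string{"driver:location:stream", lastID},
                Block:   0, // block until a new message arrives
            }).Result()
            if err != nil || len(streams) == 0 {
                time.Sleep(time.Second) // back off instead of spinning on errors
                continue
            }

            for _, msg := range streams[0].Messages {
                lastID = msg.ID // resume after this message so none is skipped

                // Stream field values come back as strings and must be parsed
                driverID, _ := msg.Values["driver_id"].(string)
                latStr, _ := msg.Values["lat"].(string)
                lngStr, _ := msg.Values["lng"].(string)
                lat, errLat := strconv.ParseFloat(latStr, 64)
                lng, errLng := strconv.ParseFloat(lngStr, 64)
                if driverID == "" || errLat != nil || errLng != nil {
                    continue
                }

                // Find this driver's active ride
                rideID, err := rdb.Get(ctx, fmt.Sprintf("driver:%s:active_ride", driverID)).Result()
                if err != nil || rideID == "" {
                    continue
                }

                // Broadcast to the rider's WebSocket connection
                h.broadcast <- LocationUpdate{
                    RideID:   rideID,
                    DriverID: driverID,
                    Lat:      lat,
                    Lng:      lng,
                }
            }
        }
    }()

    // Broadcast to WebSocket clients
    for {
        select {
        case client := <-h.register:
            h.clients[client] = true
        case client := <-h.unregister:
            if _, ok := h.clients[client]; ok {
                delete(h.clients, client)
                close(client.send)
            }
        case update := <-h.broadcast:
            // Send to the clients subscribed to this ride
            for client := range h.clients {
                if client.rideID != update.RideID {
                    continue
                }
                select {
                case client.send <- update.ToJSON():
                default:
                    // Slow client: drop it rather than block the whole hub
                    delete(h.clients, client)
                    close(client.send)
                }
            }
        }
    }
}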

3. Driver Matching Algorithm

3.1 Matching Criteria

1. Distance: prefer the nearest driver
2. Direction: the driver is already heading toward the pickup location
3. Rating: prefer higher-rated drivers
4. Acceptance rate: prefer drivers who rarely cancel
5. Vehicle type: matches the rider's request (bike/car/premium)
6. Surge zone: prioritize drivers inside the surge zone

Scoring function:

type MatchCandidate struct {
    DriverID       string
    Lat, Lng       float64 // current position
    Distance       float64 // km
    Rating         float64 // 0-5
    AcceptanceRate float64 // 0-1
    Bearing        int     // 0-359
    VehicleType    string
}

// bearingDelta returns the smallest angle (0-180) between two compass bearings,
// so that 350° and 10° are 20° apart rather than 340°.
func bearingDelta(a, b int) int {
    d := (a - b) % 360
    if d < 0 {
        d = -d
    }
    if d > 180 {
        d = 360 - d
    }
    return d
}

func CalculateMatchScore(candidate MatchCandidate, rideRequest RideRequest) float64 {
    score := 100.0

    // Distance penalty: 10 points per km
    score -= candidate.Distance * 10

    // Rating bonus: ratings above 3.0 add points, below 3.0 subtract
    score += (candidate.Rating - 3.0) * 5

    // Acceptance rate bonus
    score += candidate.AcceptanceRate * 10

    // Direction bonus: is the driver heading toward the pickup?
    // calculateBearing is an assumed helper returning degrees in 0-359.
    pickupBearing := calculateBearing(candidate.Lat, candidate.Lng, rideRequest.PickupLat, rideRequest.PickupLng)
    if bearingDelta(candidate.Bearing, pickupBearing) < 45 {
        score += 20
    }

    // Vehicle type exact match
    if candidate.VehicleType == rideRequest.VehicleType {
        score += 15
    }

    return score
}
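
The scoring function relies on a bearing helper that the text does not define. A minimal sketch using the standard initial-bearing (forward azimuth) formula could look like the following; the names InitialBearing and BearingDiff are illustrative, and this version works in float64 degrees rather than the int used above.

```go
package main

import "math"

// InitialBearing returns the compass bearing in degrees [0, 360) to follow
// from point 1 toward point 2 along the great circle.
func InitialBearing(lat1, lng1, lat2, lng2 float64) float64 {
	phi1 := lat1 * math.Pi / 180
	phi2 := lat2 * math.Pi / 180
	dLambda := (lng2 - lng1) * math.Pi / 180
	y := math.Sin(dLambda) * math.Cos(phi2)
	x := math.Cos(phi1)*math.Sin(phi2) - math.Sin(phi1)*math.Cos(phi2)*math.Cos(dLambda)
	deg := math.Atan2(y, x) * 180 / math.Pi
	return math.Mod(deg+360, 360)
}

// BearingDiff returns the smallest angle in degrees [0, 180] between two
// bearings, handling the wraparound at 0/360.
func BearingDiff(a, b float64) float64 {
	d := math.Mod(math.Abs(a-b), 360)
	if d > 180 {
		d = 360 - d
	}
	return d
}
```

The wraparound matters in practice: a driver heading 350° toward a pickup that lies at 10° is only 20° off course, and a plain absolute difference would wrongly report 340°.
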

func FindBestMatch(ctx context.Context, rdb *redis.Client, db *sql.DB, req RideRequest) (*MatchCandidate, error) {
    // Step 1: find nearby drivers (5 km radius)
    nearbyDriverIDs, err := GetNearbyDrivers(ctx, rdb, req.PickupLat, req.PickupLng, 5.0)
    if err != nil {
        return nil, err
    }

    // Step 2: load driver details from the DB
    query := `
        SELECT driver_id, lat, lng, rating, acceptance_rate, vehicle_type
        FROM drivers
        WHERE driver_id = ANY($1)
          AND status = 'AVAILABLE'
    `
    rows, err := db.QueryContext(ctx, query, pq.Array(nearbyDriverIDs))
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    candidates := []MatchCandidate{}
    for rows.Next() {
        var c MatchCandidate
        if err := rows.Scan(&c.DriverID, &c.Lat, &c.Lng, &c.Rating, &c.AcceptanceRate, &c.VehicleType); err != nil {
            return nil, err
        }

        // Exact distance in km
        c.Distance = haversineDistance(c.Lat, c.Lng, req.PickupLat, req.PickupLng)

        // Bearing lives in Redis; a missing key (expired TTL) leaves it at 0
        c.Bearing, _ = rdb.Get(ctx, fmt.Sprintf("driver:%s:bearing", c.DriverID)).Int()

        candidates = append(candidates, c)
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }

    // Step 3: score the candidates and pick the best
    var bestCandidate *MatchCandidate
    bestScore := -999999.0

    for i := range candidates {
        score := CalculateMatchScore(candidates[i], req)
        if score > bestScore {
            bestScore = score
            // Point at the slice element. Before Go 1.22, taking the address of the
            // loop variable made every iteration alias the same variable.
            bestCandidate = &candidates[i]
        }
    }

    return bestCandidate, nil // nil when no candidate was found
}
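
FindBestMatch calls a haversineDistance helper that is not shown. A minimal sketch of the haversine formula, returning kilometers to match the Distance field, might look like this (HaversineKm is an illustrative name, and the mean Earth radius of 6371 km is the usual approximation):

```go
package main

import "math"

const earthRadiusKm = 6371.0 // mean Earth radius

// HaversineKm returns the great-circle distance in kilometers between two
// points given in decimal degrees.
func HaversineKm(lat1, lng1, lat2, lng2 float64) float64 {
	const toRad = math.Pi / 180
	dLat := (lat2 - lat1) * toRad
	dLng := (lng2 - lng1) * toRad
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(lat1*toRad)*math.Cos(lat2*toRad)*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}
```

One degree of latitude comes out at about 111.2 km, a handy sanity check. The result is a straight-line distance, so actual road distance will be longer.
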

3.2 Dispatch Flow

Rider App          Backend              Driver App
  │                   │                     │
  │─ POST /ride ─────►│                     │
  │                   │─ Find best match   │
  │                   ├─ Lock driver ──────►│ (push notification)
  │                   │  (status: ASSIGNED) │
  │                   │                     │
  │◄─ ride created ───│                     │
  │  {ride_id, eta}   │                     │
  │                   │                     │─ Accept/Reject (30s timeout)
  │                   │                     │
  │                   │◄─ Accept ───────────│
  │◄─ driver matched ─│                     │
  │  {driver_id, ...} │                     │
  │                   │                     │
  │  ┌───WebSocket────┼─────WebSocket─────►│
  │  │ real-time location updates          │
  │◄─┤                                      │
  │  └───────────────────────────────────────

If the driver rejects or times out:
  → The backend moves on to the next driver in the candidate list
  → Retry at most 5 times
  → If no driver accepts → "No drivers available"
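
The retry rule can be sketched as a loop over the ranked candidates. Everything here is illustrative: Dispatch, offerFunc, and ErrNoDriverAvailable are hypothetical names, and the callback stands in for the push notification plus the 30s accept/reject wait.

```go
package main

import "errors"

// ErrNoDriverAvailable is returned when every offer was rejected or timed out.
var ErrNoDriverAvailable = errors.New("no drivers available")

const maxOfferAttempts = 5

// offerFunc sends the ride offer to one driver and reports whether the
// driver accepted before the timeout.
type offerFunc func(driverID string) bool

// Dispatch offers the ride to drivers in ranked order and stops at the first
// acceptance or after maxOfferAttempts offers.
func Dispatch(rankedDriverIDs []string, offer offerFunc) (string, error) {
	for i, id := range rankedDriverIDs {
		if i >= maxOfferAttempts {
			break
		}
		if offer(id) {
			return id, nil
		}
	}
	return "", ErrNoDriverAvailable
}
```

Offering sequentially keeps each driver's view simple (an offer is theirs until it expires), at the cost of up to 5 × 30s of waiting in the worst case.
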

4. ETA Calculation

4.1 Simple Distance-based

ETA (minutes) = Distance (km) / Average Speed (km/h) * 60

Example:
  Distance = 5 km
  Average Speed = 30 km/h  (city traffic)
  ETA = 5 / 30 * 60 = 10 minutes

Problem: this ignores traffic, traffic lights, and one-way streets.
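
As a minimal sketch, the formula translates directly into code. SimpleETAMinutes is an illustrative name, and the boolean return for a non-positive speed is an assumption about how a caller would want to handle bad input.

```go
package main

import "math"

// SimpleETAMinutes applies ETA = distance / speed * 60 and rounds to the
// nearest minute. It returns false when the speed is not positive.
func SimpleETAMinutes(distanceKm, avgSpeedKmh float64) (int, bool) {
	if avgSpeedKmh <= 0 {
		return 0, false
	}
	// Multiply before dividing to keep round values exact (5*60/30 = 10).
	return int(math.Round(distanceKm * 60 / avgSpeedKmh)), true
}
```

For the example above, SimpleETAMinutes(5, 30) gives 10 minutes.
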

4.2 Graph-based Routing với Traffic

1. Build road network graph:
   Nodes = intersections
   Edges = road segments, weight = travel time (dynamic)

2. Compute the travel time of each edge from:
   - Historical average (time of day, day of week)
   - Real-time traffic data (Google Traffic API, HERE, TomTom)
   - Road attributes (speed limit, lanes, one-way)

3. Run Dijkstra / A* to find the path with the lowest total travel time

import (
    "math"

    "github.com/dominikbraun/graph"
)

type RoadSegment struct {
    From         string  // intersection ID
    To           string
    Distance     float64 // meters
    SpeedLimit   int     // km/h
    CurrentSpeed int     // km/h (from traffic data)
}

func BuildRoadGraph(segments []RoadSegment) graph.Graph[string, string] {
    g := graph.New(graph.StringHash, graph.Directed())

    for _, seg := range segments {
        if seg.CurrentSpeed <= 0 {
            continue // blocked or no data: skip rather than divide by zero
        }

        // AddVertex returns an error for duplicates, which is safe to ignore here
        _ = g.AddVertex(seg.From)
        _ = g.AddVertex(seg.To)

        // Weight = travel time in whole seconds (the library's edge weights are ints)
        travelTime := seg.Distance / (float64(seg.CurrentSpeed) / 3.6)
        _ = g.AddEdge(seg.From, seg.To, graph.EdgeWeight(int(math.Round(travelTime))))
    }

    return g
}

func CalculateETA(g graph.Graph[string, string], fromNode, toNode string) (int, error) {
    path, err := graph.ShortestPath(g, fromNode, toNode)
    if err != nil {
        return 0, err
    }

    // Sum the travel time of every edge along the path
    totalSeconds := 0
    for i := 0; i < len(path)-1; i++ {
        edge, err := g.Edge(path[i], path[i+1])
        if err != nil {
            return 0, err
        }
        totalSeconds += edge.Properties.Weight
    }

    return totalSeconds / 60, nil // convert to whole minutes (rounded down)
}
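
To show what the shortest-path step does under the hood, here is a minimal Dijkstra sketch over travel-time weights using only the standard library. The adjacency-map representation and the names edge, item, minHeap, and ShortestTravelTime are illustrative and are not part of the graph library used above.

```go
package main

import (
	"container/heap"
	"math"
)

// edge is a directed road segment with its travel time in seconds.
type edge struct {
	to      string
	seconds float64
}

// item is a priority-queue entry: a node and the best known time to reach it.
type item struct {
	node string
	dist float64
}

// minHeap orders items by ascending dist and implements heap.Interface.
type minHeap []item

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].dist < h[j].dist }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	it := old[len(old)-1]
	*h = old[:len(old)-1]
	return it
}

// ShortestTravelTime returns the minimum travel time in seconds from src to
// dst, or +Inf when dst cannot be reached.
func ShortestTravelTime(adj map[string][]edge, src, dst string) float64 {
	dist := map[string]float64{src: 0}
	pq := &minHeap{{node: src, dist: 0}}
	for pq.Len() > 0 {
		cur := heap.Pop(pq).(item)
		if cur.node == dst {
			return cur.dist
		}
		if d, ok := dist[cur.node]; ok && cur.dist > d {
			continue // stale entry: a shorter route to this node was already found
		}
		for _, e := range adj[cur.node] {
			nd := cur.dist + e.seconds
			if old, ok := dist[e.to]; !ok || nd < old {
				dist[e.to] = nd
				heap.Push(pq, item{node: e.to, dist: nd})
			}
		}
	}
	return math.Inf(1)
}
```

With edges A→B (60s), B→C (60s), and A→C (200s), the function returns 120 for A to C: the two-hop route beats the direct but congested segment, which is exactly why weights must be travel time rather than distance.
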

4.3 ML-based ETA Prediction

Features:
  - Distance (straight line + road network)
  - Time of day, day of week
  - Weather condition
  - Historical traffic patterns
  - Number of traffic lights on route
  - Starting point, destination (encoded as embedding)

Model: Gradient Boosting (XGBoost / LightGBM)
  Input:  feature vector
  Output: ETA in minutes

Training data:
  Millions of completed trips with their actual travel time

import pandas as pd
import xgboost as xgb

# Feature engineering
def extract_features(trip):
    return {
        'distance_km': trip['distance'],
        'hour': trip['pickup_time'].hour,
        'day_of_week': trip['pickup_time'].dayofweek,
        'is_weekend': 1 if trip['pickup_time'].dayofweek >= 5 else 0,
        'is_rush_hour': 1 if trip['pickup_time'].hour in [7,8,9,17,18,19] else 0,
        'pickup_lat': trip['pickup_lat'],
        'pickup_lng': trip['pickup_lng'],
        'dropoff_lat': trip['dropoff_lat'],
        'dropoff_lng': trip['dropoff_lng'],
        'weather': trip['weather'],  # 0=clear, 1=rain, 2=heavy_rain
    }

# Train model (reading from s3:// requires the s3fs package;
# pickup_time must be a datetime column)
df = pd.read_parquet('s3://trips/historical_trips.parquet')
X = pd.DataFrame([extract_features(trip) for _, trip in df.iterrows()])
y = df['actual_duration_minutes']

model = xgb.XGBRegressor(
    max_depth=8,
    learning_rate=0.1,
    n_estimators=500,
    objective='reg:squarederror'
)
model.fit(X, y)

# Prediction: build a one-row DataFrame so column names and order match training
def predict_eta(model, trip):
    features = pd.DataFrame([extract_features(trip)])
    eta = model.predict(features)[0]
    return max(1, int(eta))  # at least 1 minute

5. Surge Pricing

5.1 Supply-Demand Imbalance

Surge Multiplier = f(demand, supply)

Demand = number of ride requests in zone X over the last 5 minutes
Supply = number of available drivers in zone X

Surge = min(3.0, max(1.0, Demand / Supply))

Example:
  Hồ Gươm area at 18:00:
    Demand = 100 requests
    Supply = 20 drivers
    Surge = min(3.0, max(1.0, 100/20)) = min(3.0, 5.0) = 3.0x  → the price triples

func CalculateSurgeMultiplier(ctx context.Context, rdb *redis.Client, zone string) float64 {
    // Demand: count ride requests in the last 5 minutes
    now := time.Now().Unix()
    fiveMinAgo := now - 300
    demand, err := rdb.ZCount(ctx, fmt.Sprintf("ride_requests:%s", zone),
                              fmt.Sprintf("%d", fiveMinAgo),
                              fmt.Sprintf("%d", now)).Result()
    if err != nil {
        return 1.0 // fail safe: no surge when demand is unknown
    }

    // Supply: count available drivers in the zone
    supply, err := rdb.SCard(ctx, fmt.Sprintf("drivers_available:%s", zone)).Result()
    if err != nil {
        return 1.0
    }

    if supply == 0 {
        if demand == 0 {
            return 1.0 // nobody is requesting, so there is nothing to surge
        }
        return 3.0 // max surge
    }

    // Same formula as above: clamp demand/supply to [1.0, 3.0]
    ratio := float64(demand) / float64(supply)
    surge := math.Min(3.0, math.Max(1.0, ratio))

    // Round to the nearest 0.25 (1.0, 1.25, 1.5, ... 3.0)
    return math.Round(surge*4) / 4
}

5.2 Geofencing — Surge Zones

Zone definition (polygons):
  - Hồ Gươm: polygon([(21.028, 105.852), (21.030, 105.856), ...])
  - Nội Bài Airport: polygon([...])
  - District 1, HCMC: polygon([...])

Check if point in polygon:
  → PostGIS: ST_Within(point, polygon)
  → Redis: GEOSEARCH BYBOX as a bounding-box approximation (Redis GEO has no polygon query)

CREATE TABLE surge_zones (
    id          SERIAL PRIMARY KEY,
    name        VARCHAR(100),
    city        VARCHAR(50),
    geometry    GEOMETRY(POLYGON, 4326),  -- PostGIS type
    base_surge  DECIMAL(3,2) DEFAULT 1.0,
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_surge_zones_geom ON surge_zones USING GIST(geometry);

-- Find the zone that contains the point lat/lng
-- The point needs SRID 4326 to match the column; PostGIS rejects mixed SRIDs
SELECT id, name, base_surge
FROM surge_zones
WHERE ST_Within(ST_SetSRID(ST_MakePoint(lng, lat), 4326), geometry);

func GetSurgeForLocation(ctx context.Context, db *sql.DB, rdb *redis.Client, lat, lng float64) (float64, error) {
    var zone string
    var baseSurge float64

    // ST_MakePoint takes (x, y) = (lng, lat); the SRID must match the geometry column
    err := db.QueryRowContext(ctx, `
        SELECT name, base_surge
        FROM surge_zones
        WHERE ST_Within(ST_SetSRID(ST_MakePoint($1, $2), 4326), geometry)
        LIMIT 1
    `, lng, lat).Scan(&zone, &baseSurge)

    if errors.Is(err, sql.ErrNoRows) {
        return 1.0, nil // default: no surge
    }
    if err != nil {
        return 0, err
    }

    // Dynamic surge from real-time demand/supply
    dynamicSurge := CalculateSurgeMultiplier(ctx, rdb, zone)

    // Combine the zone's fixed base surge with the dynamic multiplier
    return baseSurge * dynamicSurge, nil
}
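
When zones are cached in application memory instead of queried through PostGIS, the point-in-polygon check can be done with the classic ray-casting test. This is a minimal sketch under the assumption that zones are small enough to treat lat/lng as planar coordinates; Point and PointInPolygon are illustrative names.

```go
package main

// Point is a coordinate in decimal degrees.
type Point struct{ Lat, Lng float64 }

// PointInPolygon casts a ray from p toward increasing longitude and counts
// how many polygon edges it crosses; an odd count means p is inside.
// The polygon is a list of vertices, implicitly closed.
func PointInPolygon(p Point, poly []Point) bool {
	inside := false
	n := len(poly)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := poly[i], poly[j]
		// Does edge a-b straddle the horizontal line through p?
		if (a.Lat > p.Lat) != (b.Lat > p.Lat) {
			// Longitude at which the edge crosses that line
			crossLng := a.Lng + (p.Lat-a.Lat)*(b.Lng-a.Lng)/(b.Lat-a.Lat)
			if p.Lng < crossLng {
				inside = !inside
			}
		}
	}
	return inside
}
```

The test is O(number of vertices) per zone, so a bounding-box pre-filter is the usual first step when there are many zones.
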

6. Trip State Machine

Lifecycle of a ride:

REQUESTED ──────► MATCHED ──────► ACCEPTED ──────► ARRIVING
    │                  │                                │
    │                  ▼ (driver reject)                ▼
    │              SEARCHING ────► (retry)         PICKED_UP
    │                  │                                │
    ▼ (timeout)        ▼ (no driver)                    ▼
CANCELLED          CANCELLED                       IN_PROGRESS
                                                        │
                                                        ▼
                                                    COMPLETED
                                                        │
                                                        ▼
                                                   (Payment)
                                                        │
                                                        ▼
                                                     CLOSED

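State transitions:
  REQUESTED → MATCHED:     driver found
  MATCHED → ACCEPTED:      driver accepted
  MATCHED → REQUESTED:     driver rejected or timed out, search again (SEARCHING in the diagram)
  ACCEPTED → ARRIVING:     driver on the way to pickup
  ARRIVING → PICKED_UP:    driver tapped "Start trip"
  PICKED_UP → IN_PROGRESS: trip started
  IN_PROGRESS → COMPLETED: driver tapped "End trip" + rider confirm
  COMPLETED → CLOSED:      payment processed
  REQUESTED / MATCHED / ACCEPTED / ARRIVING → CANCELLED: cancelled, timed out, or no driver found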

type RideStatus string

const (
    StatusRequested  RideStatus = "REQUESTED"
    StatusMatched    RideStatus = "MATCHED"
    StatusAccepted   RideStatus = "ACCEPTED"
    StatusArriving   RideStatus = "ARRIVING"
    StatusPickedUp   RideStatus = "PICKED_UP"
    StatusInProgress RideStatus = "IN_PROGRESS"
    StatusCompleted  RideStatus = "COMPLETED"
    StatusCancelled  RideStatus = "CANCELLED"
    StatusClosed     RideStatus = "CLOSED"
)

// validTransitions is built once at package level. CANCELLED and CLOSED are
// terminal states, so they have no outgoing transitions.
var validTransitions = map[RideStatus][]RideStatus{
    StatusRequested:  {StatusMatched, StatusCancelled},
    StatusMatched:    {StatusAccepted, StatusRequested, StatusCancelled},
    StatusAccepted:   {StatusArriving, StatusCancelled},
    StatusArriving:   {StatusPickedUp, StatusCancelled},
    StatusPickedUp:   {StatusInProgress},
    StatusInProgress: {StatusCompleted},
    StatusCompleted:  {StatusClosed},
}

func (r *Ride) Transition(to RideStatus) error {
    for _, valid := range validTransitions[r.Status] {
        if valid == to {
            r.Status = to
            r.UpdatedAt = time.Now()
            return nil
        }
    }

    return fmt.Errorf("invalid transition from %s to %s", r.Status, to)
}

7. Fraud Detection & Safety

7.1 GPS Spoofing Detection

A driver spoofs GPS to fake a location (far away in reality, but claiming to be close in order to win rides):

Detection:
  1. Check travel speed:
     Speed = Distance / Time
     Speed > 150 km/h on city streets → likely GPS spoofing
  
  2. Check bearing consistency:
     Bearing changes by more than 90° within 5s → likely spoofing
  
  3. Check GPS accuracy:
     Android/iOS expose an accuracy field (meters)
     Accuracy > 100m → reject the update
  
  4. Cross-check against cell tower location
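
func DetectGPSSpoofing(prevLocation, currLocation LocationUpdate) bool {
    // Check accuracy first: it does not depend on the previous point
    if currLocation.Accuracy > 100 { // GPS signal too weak to trust
        return true
    }

    timeDiff := currLocation.Timestamp.Sub(prevLocation.Timestamp).Seconds()
    if timeDiff <= 0 {
        return false // duplicate or out-of-order update: speed is undefined
    }

    // Check speed. haversineDistance returns km (as in FindBestMatch),
    // so convert the elapsed seconds to hours.
    distanceKm := haversineDistance(prevLocation.Lat, prevLocation.Lng,
                                    currLocation.Lat, currLocation.Lng)
    speed := distanceKm / (timeDiff / 3600) // km/h

    if speed > 150 { // impossible speed for a ground vehicle in the city
        return true
    }

    // Check bearing change, using the smallest angle so 350° → 10° counts as 20°
    bearingDiff := abs(currLocation.Bearing-prevLocation.Bearing) % 360
    if bearingDiff > 180 {
        bearingDiff = 360 - bearingDiff
    }
    if bearingDiff > 90 && timeDiff < 5 { // sudden reversal
        return true
    }

    return false
}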

7.2 Safety Features

1. SOS Button:
   - Sends an alert to the control center
   - Shares real-time location with the emergency contact
   - Auto-records audio (with user consent)

2. Trip Sharing:
   - The rider shares a trip link with friends or family
   - The recipient can see the real-time location and driver info

3. Route Deviation Alert:
   - Compare the actual route with the expected route (Google Directions API)
   - Alert the rider if the driver strays more than 500m from the route
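
The route deviation check boils down to the distance from the current position to the nearest segment of the expected route. The sketch below assumes the route is a polyline of lat/lng points and uses a local flat-Earth (equirectangular) projection, which is adequate at the scale of a few kilometers. LatLng, DistanceToRouteMeters, and IsOffRoute are illustrative names.

```go
package main

import "math"

// LatLng is a coordinate in decimal degrees.
type LatLng struct{ Lat, Lng float64 }

const metersPerDegree = 111320.0 // approximate meters per degree of latitude

// toXY projects p to planar meters relative to origin.
func toXY(p, origin LatLng) (x, y float64) {
	x = (p.Lng - origin.Lng) * metersPerDegree * math.Cos(origin.Lat*math.Pi/180)
	y = (p.Lat - origin.Lat) * metersPerDegree
	return x, y
}

// DistanceToRouteMeters returns the shortest distance from p to any segment
// of the route. It returns +Inf for a route with fewer than two points.
func DistanceToRouteMeters(p LatLng, route []LatLng) float64 {
	best := math.Inf(1)
	for i := 0; i+1 < len(route); i++ {
		// Use p as the origin, so the query point sits at (0, 0).
		ax, ay := toXY(route[i], p)
		bx, by := toXY(route[i+1], p)
		dx, dy := bx-ax, by-ay
		// Parameter of the closest point on segment a-b, clamped to [0, 1]
		t := 0.0
		if l2 := dx*dx + dy*dy; l2 > 0 {
			t = math.Max(0, math.Min(1, -(ax*dx+ay*dy)/l2))
		}
		best = math.Min(best, math.Hypot(ax+t*dx, ay+t*dy))
	}
	return best
}

// IsOffRoute reports whether p is farther than thresholdMeters from the route.
func IsOffRoute(p LatLng, route []LatLng, thresholdMeters float64) bool {
	return DistanceToRouteMeters(p, route) > thresholdMeters
}
```

In practice a single reading beyond 500m should probably not fire the alert on its own, since GPS noise and legitimate detours are common; requiring several consecutive off-route readings is one way to reduce false alarms.
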

8. Interview Questions

8.1 System Design

Q: Design Grab, with a focus on the matching algorithm

Components:

  1. Driver location service (Redis GEO + WebSocket)
  2. Matching service (scoring algorithm + dispatch)
  3. Ride orchestration (state machine + Saga)
  4. Pricing service (surge calculation)
  5. Notification service (push to driver app)

Constraints:

  • 100k drivers online, 500k riders
  • Match latency < 2s (city average)
  • Location update every 5s

Trade-offs:

  • Redis GEO vs PostgreSQL PostGIS: Redis faster, but eventual consistency
  • Scoring in-memory vs ML model API call: latency vs accuracy
  • Optimistic matching (assign before driver accept) vs pessimistic

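Q: How do you scale location updates with 1 million drivers?

Answer:

At one update every 5s, 1 million drivers produce about 200k writes per second, so the goal is to spread and reduce those writes.

  1. Sharding: shard Redis by city/region
    • Hanoi drivers → Redis cluster 1
    • HCMC drivers → Redis cluster 2
  2. Sampling: only send an update after moving > 50m (fewer writes)
  3. TTL: per-driver keys expire after 30s, which flags inactive drivers; GEO set members do not expire, so a sweeper has to remove them
  4. Batch: the driver app batches 3 location points into 1 request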
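
The sampling rule in point 2 interacts with the TTL in point 3: a parked driver who never moves 50m would silently expire. A minimal client-side sketch therefore combines the distance threshold with a heartbeat. The 20s heartbeat is an assumption chosen to stay under the 30s TTL, and Fix and ShouldSend are illustrative names.

```go
package main

import "math"

const (
	minMoveMeters    = 50.0 // send when the driver moved farther than this
	heartbeatSeconds = 20   // assumed: always send before the 30s TTL lapses
)

// Fix is one GPS reading with its Unix timestamp in seconds.
type Fix struct {
	Lat, Lng float64
	AtUnix   int64
}

// approxMeters is a flat-Earth distance, accurate enough at a 50 m scale.
func approxMeters(a, b Fix) float64 {
	const mPerDeg = 111320.0
	dy := (b.Lat - a.Lat) * mPerDeg
	dx := (b.Lng - a.Lng) * mPerDeg * math.Cos(a.Lat*math.Pi/180)
	return math.Hypot(dx, dy)
}

// ShouldSend decides whether current is worth uploading, given the last fix
// that was actually sent.
func ShouldSend(lastSent, current Fix) bool {
	if current.AtUnix-lastSent.AtUnix >= heartbeatSeconds {
		return true // keep the driver alive even when stationary
	}
	return approxMeters(lastSent, current) > minMoveMeters
}
```

A driver waiting at a pickup point then sends one update every 20s instead of every 5s, while a moving driver is unaffected.
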

8.2 Trade-off Questions

Q: Geohash vs S2 Geometry?

Answer:

  • Geohash: simple, but cells distort toward the poles, and queries near a cell boundary need neighbor lookups
  • S2: more uniform cell shape and size, but a more complex implementation

Platforms operating at global scale, with Grab and Uber often cited as examples, lean toward S2 because they need consistent behavior across regions. A small startup can start with Geohash.


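Q: Why not use WebSocket for driver location updates?

Answer:

  • A WebSocket keeps a persistent connection open → costs server resources
  • 100k drivers = 100k TCP connections
  • Mobile networks are unreliable → WebSocket reconnect overhead

Better: a plain HTTP POST every 5s, as in the flow in section 2.1. Each request is stateless, so it survives network switches and is easy to load-balance. HTTP/2 Server Push does not help here, because it pushes from server to client while location updates flow from client to server. Reserve WebSocket for the rider side (real-time tracking), where the server has to push.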


Summary

Ride-hailing is a real-time geospatial matching problem under these constraints:

  • Latency: users will not wait more than 10s to find a ride
  • Accuracy: GPS is imperfect, and spoofed locations must be filtered out
  • Fairness: matching weighs rating, direction, and acceptance rate, not just distance
  • Economics: surge pricing balances supply and demand

Key techniques:

  • Geohash / S2 Geometry for nearby search
  • Redis GEO + GEOSEARCH
  • WebSocket push for real-time tracking
  • State machine for the ride lifecycle
  • ML model for ETA prediction

References