Files
clutch/docs/DATABASE_LOGICAL_STRUCTURE.md

110 lines
3.6 KiB
Markdown
Raw Permalink Normal View History

# Database Logical Structure (ER Diagram)
This diagram illustrates the logical relationships and data flow between the storage layers (L1, L2, L3) in the optimized architecture.
```mermaid
erDiagram
%% ==========================================
%% L1 LAYER: RAW DATA (Data Lake)
%% ==========================================
L1A_raw_iframe_network {
string match_id PK
json content "Raw API Response"
timestamp processed_at
}
L1B_tick_snapshots_parquet {
string match_id FK
int tick
int round
json player_states "Positions, HP, Equip"
json bomb_state
string file_path "Parquet File Location"
}
%% ==========================================
%% L2 LAYER: DATA WAREHOUSE (Structured)
%% ==========================================
dim_players {
string steam_id_64 PK
string username
float rating
float avg_clutch_win_rate
}
dim_maps {
int map_id PK
string map_name "de_mirage"
string nav_mesh_path
}
fact_matches {
string match_id PK
int map_id FK
timestamp start_time
int winner_team
int final_score_ct
int final_score_t
}
fact_rounds {
string round_id PK
string match_id FK
int round_num
int winner_side
string win_reason "Elimination/Bomb/Time"
}
L2_Spatial_NavMesh {
string map_name PK
string zone_id
binary distance_matrix "Pre-calculated paths"
}
%% ==========================================
%% L3 LAYER: FEATURE STORE (AI Ready)
%% ==========================================
L3_Offline_Features {
string snapshot_id PK
float feature_tpi "Time Pressure Index"
float feature_crossfire "Tactical Score"
float feature_equipment_diff
int label_is_win "Target Variable"
}
%% ==========================================
%% RELATIONSHIPS
%% ==========================================
%% L1 -> L2 Flow
L1A_raw_iframe_network ||--|{ fact_matches : "Extracts to"
L1A_raw_iframe_network ||--|{ dim_players : "Extracts to"
L1B_tick_snapshots_parquet }|--|| fact_matches : "Belongs to"
L1B_tick_snapshots_parquet }|--|| fact_rounds : "Details"
%% L2 Relations
fact_matches }|--|| dim_maps : "Played on"
fact_rounds }|--|| fact_matches : "Part of"
%% L2 -> L3 Flow (Feature Engineering)
L3_Offline_Features }|--|| L1B_tick_snapshots_parquet : "Computed from"
L3_Offline_Features }|--|| L2_Spatial_NavMesh : "Uses Physics from"
L3_Offline_Features }|--|| dim_players : "Enriched with"
```
## 结构说明 (Structure Explanation)
1. **L1 源数据层**:
* **左上 (L1A)**: 传统的数据库表,存储比赛结果元数据。
* **左下 (L1B)**: **虚线框表示的文件系统**。虽然物理上是 Parquet 文件但在逻辑上它是一张巨大的“Tick 级快照表”,通过 `match_id` 与其他层关联。
2. **L2 数仓层**:
* **核心 (Dim/Fact)**: 标准的星型模型。`fact_matches` 是核心事实表,关联 `dim_players` (人) 和 `dim_maps` (地)。
* **空间 (Spatial)**: 独立的查找表逻辑,为每一张 `dim_maps` 提供物理距离计算支持。
3. **L3 特征层**:
* **右侧 (Features)**: 这是宽表Wide Table每一行直接对应模型的一个训练样本。它不存储原始数据而是存储**计算后的数值** (如 TPI 指数),直接由 L1B (位置) + L2 Spatial (距离) + Dim Players (能力) 融合计算而来。