L3 Feature Mart - Complete Architecture Plan
Version: 2.0 (Complete Redesign)
Date: 2026-01-28
Status: Planning Phase
Executive Summary
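Based on the complete L2 schema and the Profile requirements, this plan redesigns the L3 feature-layer architecture. Core principles:
- Remove redundancy: eliminate the duplicated metrics in Profile_summary.md
- Dig deeper: use the L2 rounds/events data for deeper feature engineering
- Modular computation: split processors by functional domain, with clear responsibility boundaries
- Service decoupling: web/services only queries, it does not compute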
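Part 1: Feature Dimension Restructuring Analysis
1.1 Diagnosis of the Current Profile
Duplicated metrics:
- basic_avg_rating appears in both Dashboard and Core Performance
- basic_avg_kd appears in both Dashboard and Core Performance
- basic_avg_adr appears in both Dashboard and Core Performance
- basic_avg_kast appears in both Dashboard and Core Performance
- FK/FD appears in both Opening Impact and SIDE Preference
- Clutch data appears in Multi-Frag, HPS, and SPECIAL
- Many rate-type metrics can be derived from raw counts and need not be stored
Missing dimensions:
✗ Map heat-map dimension (based on xyz coordinates)
✗ In-depth weapon preference analysis (beyond the top 5)
✗ Performance stratified by opponent strength (based on ELO difference)
✗ Time-series fluctuation analysis (beyond volatility)
✗ Teammate synergy (assist network)
✗ Economy efficiency by tier (performance across price brackets)
✗ Round contribution score (combined impact)
1.2 Restructured Feature Taxonomy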
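🎯 Tier 1: Core Foundation Layer (CORE)
Goal: the most commonly used aggregate statistics, computed directly from fact_match_players
| Feature group | Metric count | Typical metrics | L2 source tables |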
|---|---|---|---|
| Basic Stats | 15 | rating, kd, adr, kast, rws, hs% | fact_match_players |
| Match Stats | 8 | total_matches, win_rate, avg_duration | fact_matches + fact_match_players |
| Weapon Stats | 12 | awp_kills, knife_kills, zeus_kills, top_weapon | fact_match_players + fact_round_events |
| Objective Stats | 6 | plants, defuses, mvps, flash_assists | fact_match_players |
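Characteristics:
- Computable from a single table or a simple JOIN
- No complex logic, pure aggregate functions
- Used for fast Dashboard display
🔥 Tier 2: Tactical Ability Layer (TACTICAL)
Goal: in-depth metrics that reflect a player's tactical skill
| Feature group | Metric count | Typical metrics | Computation complexity |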
|---|---|---|---|
| Opening Impact | 8 | fk_rate, fd_rate, fk_success_rate, entry_trade_rate | Medium |
| Multi-Kill | 6 | 2k/3k/4k/5k rates, ace_count | Low |
| Clutch Performance | 10 | 1v1 to 1v5 win_rate, clutch_impact_score | Medium |
| Utility Mastery | 12 | nade_dmg_per_round, flash_efficiency, smoke_timing | High |
| Economy Efficiency | 8 | dmg_per_1k, eco_kd, force_buy_performance | Medium |
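Characteristics:
- Requires JOINs across multiple tables (players + events + economy)
- Involves conditional filtering and ratio calculations
- Reflects the quality of player decisions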
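The "conditional filtering and ratio calculation" pattern can be illustrated with the economy features. The sketch below is hypothetical: it assumes each round is available as an `(equipment_value, kills)` pair and uses the price brackets from the schema comments in Part 2 (under $2000, $2000-$4000, $4000+). Treating exactly $4000 as a full buy is an assumption, since the source brackets overlap at that value.

```python
def classify_buy(equipment_value: float) -> str:
    """Bucket a round by equipment value (boundary handling is an assumption)."""
    if equipment_value < 2000:
        return "eco"
    if equipment_value < 4000:
        return "force"
    return "full"


def kpr_by_buy_type(rounds):
    """Kills per round for each buy type.

    rounds: iterable of (equipment_value, kills) pairs for one player.
    """
    totals = {"eco": [0, 0], "force": [0, 0], "full": [0, 0]}  # [kills, rounds]
    for equipment_value, kills in rounds:
        bucket = totals[classify_buy(equipment_value)]
        bucket[0] += kills
        bucket[1] += 1
    # A bucket with no rounds yields 0.0 instead of a division by zero
    return {name: (k / n if n else 0.0) for name, (k, n) in totals.items()}
```

The three values would map onto `tac_eco_kpr_eco_rounds`, `tac_eco_kpr_force_rounds`, and `tac_eco_kpr_full_rounds`.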
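🧠 Tier 3: Advanced Intelligence Layer (INTELLIGENCE)
Goal: extract hidden patterns through complex computation
| Feature group | Metric count | Typical metrics | Data source |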
|---|---|---|---|
| High IQ Kills | 9 | wallbang_rate, smoke_kill_rate, blind_kill_rate, iq_score | fact_round_events (flags) |
| Timing Analysis | 12 | kill_time_distribution, death_timing_pattern, aggression_index | fact_round_events (event_time) |
| Pressure Performance | 10 | comeback_kd, losing_streak_kd, matchpoint_kpr | fact_rounds + fact_round_events |
| Position Mastery | 15 | position_heatmap, site_control_rate, rotation_efficiency | fact_round_events (xyz) |
| Trade Network | 8 | trade_kill_rate, trade_response_time, teamwork_score | fact_round_events (self-join) |
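Characteristics:
- Requires time-window calculations (5s/10s trade window)
- Involves spatial analysis (xyz clustering)
- Requires sequence analysis (losing-streak and comeback scenarios)
📊 Tier 4: Stability and Metadata Layer (META)
Goal: long-term performance patterns and meta-features
| Feature group | Metric count | Typical metrics | Computation method |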
|---|---|---|---|
| Stability | 8 | rating_volatility, map_stability, recent_form | Time-series STDDEV / sliding window |
| Side Preference | 14 | ct_rating, t_rating, side_kd_diff, side_win_diff | fact_match_players_ct/t |
| Opponent Adaptation | 12 | performance_vs_elo_tiers, rank_diff_impact | fact_match_teams (elo) |
| Map Specialization | 10 | map_rating_by_map, best_map, worst_map | GROUP BY map |
| Session Pattern | 8 | daily_performance, streak_analysis, fatigue_index | Grouping by timestamp |
Characteristics:
- Aggregates across matches
- Requires stratified/grouped analysis
- Involves time-series features
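SQLite has no built-in STDDEV aggregate, so the sliding-window stability metrics are likely easiest to compute in Python after fetching a player's ratings in chronological order. A minimal sketch, assuming population standard deviation (the plan does not say whether population or sample deviation is intended) and the 20-match and 10-match windows given in the schema comments:

```python
import statistics


def rating_volatility(ratings, window=20):
    """Standard deviation of the most recent `window` ratings.

    ratings: per-match ratings ordered oldest to newest.
    Uses the population standard deviation (an assumption).
    """
    recent = ratings[-window:]
    if len(recent) < 2:
        return 0.0  # not enough matches to measure spread
    return statistics.pstdev(recent)


def recent_form(ratings, window=10):
    """Average of the most recent `window` ratings."""
    recent = ratings[-window:]
    return sum(recent) / len(recent) if recent else 0.0
```

These would feed `meta_rating_volatility` and `meta_recent_form_rating`.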
🎨 Tier 5: Composite Score Layer (COMPOSITE)
Goal: weighted multi-dimensional composite scores, used for the radar chart
| Score dimension | Weight composition | Output range | Usage |
|---|---|---|---|
| AIM (gunplay) | 25% Rating + 20% KD + 15% ADR + 10% DuelWin + 10% HighEloKD + 20% MultiKill | 0-100 | Radar Axis |
| CLUTCH | 25% 1v3+ + 20% MatchPtWin + 20% ComebackKD + 15% PressureEntry + 20% Rating | 0-100 | Radar Axis |
| PISTOL | 30% PistolKills + 30% PistolWin + 20% PistolKD + 20% PistolHS% | 0-100 | Radar Axis |
| DEFENSE | 35% CT_Rating + 35% T_Rating + 15% CT_FK + 15% T_FK | 0-100 | Radar Axis |
| UTIL (utility) | 35% UsageRate + 25% NadeDmg + 20% FlashEff + 20% FlashEnemy | 0-100 | Radar Axis |
| STABILITY | 30% (100-Volatility) + 30% LossRating + 20% WinRating + 20% Consistency | 0-100 | Radar Axis |
| ECONOMY | 50% Dmg/$1k + 30% EcoKPR + 20% SaveRoundKD | 0-100 | Radar Axis |
| PACE | 40% EntryTiming + 30% TradeSpeed + 30% AggressionIndex | 0-100 | Radar Axis |
Characteristics:
- Depends on the base features from Tiers 1-4
- Standardization + weighting = a 0-100 score
- Computed last and stored as separate columns
Part 2: L3 Table Schema Design
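2.1 Main table: dm_player_features
Design principles:
- One row per player, with steam_id_64 as the primary key
- Contains all aggregated features (200+ columns)
- Columns are organized into groups by Tier
- Adds metadata columns (total_matches, last_updated, etc.)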
CREATE TABLE dm_player_features (
-- Primary key and metadata
steam_id_64 TEXT PRIMARY KEY,
total_matches INTEGER NOT NULL DEFAULT 0,
total_rounds INTEGER NOT NULL DEFAULT 0,
first_match_date INTEGER, -- Unix timestamp
last_match_date INTEGER,
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- ==========================================
-- Tier 1: CORE - Basic Stats (15 columns)
-- ==========================================
core_avg_rating REAL DEFAULT 0.0,
core_avg_rating2 REAL DEFAULT 0.0,
core_avg_kd REAL DEFAULT 0.0,
core_avg_adr REAL DEFAULT 0.0,
core_avg_kast REAL DEFAULT 0.0,
core_avg_rws REAL DEFAULT 0.0,
core_avg_hs_kills REAL DEFAULT 0.0,
core_hs_rate REAL DEFAULT 0.0, -- hs/total_kills
core_total_kills INTEGER DEFAULT 0,
core_total_deaths INTEGER DEFAULT 0,
core_total_assists INTEGER DEFAULT 0,
core_avg_assists REAL DEFAULT 0.0,
core_kpr REAL DEFAULT 0.0, -- kills per round
core_dpr REAL DEFAULT 0.0, -- deaths per round
core_survival_rate REAL DEFAULT 0.0, -- survived rounds / total rounds
-- Match Stats (8 columns)
core_win_rate REAL DEFAULT 0.0,
core_wins INTEGER DEFAULT 0,
core_losses INTEGER DEFAULT 0,
core_avg_match_duration INTEGER DEFAULT 0, -- seconds
core_avg_mvps REAL DEFAULT 0.0,
core_mvp_rate REAL DEFAULT 0.0, -- mvps per match
core_avg_elo_change REAL DEFAULT 0.0,
core_total_elo_gained REAL DEFAULT 0.0,
-- Weapon Stats (12 columns)
core_avg_awp_kills REAL DEFAULT 0.0,
core_awp_usage_rate REAL DEFAULT 0.0, -- rounds with AWP / total rounds
core_avg_knife_kills REAL DEFAULT 0.0,
core_avg_zeus_kills REAL DEFAULT 0.0,
core_zeus_buy_rate REAL DEFAULT 0.0,
core_top_weapon TEXT, -- Most used weapon name
core_top_weapon_kills INTEGER DEFAULT 0,
core_top_weapon_hs_rate REAL DEFAULT 0.0,
core_weapon_diversity REAL DEFAULT 0.0, -- Shannon entropy of weapon usage
core_rifle_hs_rate REAL DEFAULT 0.0,
core_pistol_hs_rate REAL DEFAULT 0.0,
core_smg_kills_total INTEGER DEFAULT 0,
-- Objective Stats (6 columns)
core_avg_plants REAL DEFAULT 0.0,
core_avg_defuses REAL DEFAULT 0.0,
core_avg_flash_assists REAL DEFAULT 0.0,
core_plant_success_rate REAL DEFAULT 0.0, -- plants / T rounds
core_defuse_success_rate REAL DEFAULT 0.0, -- defuses / (CT rounds with plant)
core_objective_impact REAL DEFAULT 0.0, -- Weighted score: 2*plant + 3*defuse + 0.5*flash_assist
-- ==========================================
-- Tier 2: TACTICAL - Opening Impact (8)
-- ==========================================
tac_avg_fk REAL DEFAULT 0.0, -- first kills per match
tac_avg_fd REAL DEFAULT 0.0, -- first deaths per match
tac_fk_rate REAL DEFAULT 0.0, -- FK / (FK + FD)
tac_fd_rate REAL DEFAULT 0.0, -- FD / (FK + FD)
tac_fk_success_rate REAL DEFAULT 0.0, -- team win rate when player gets FK
tac_entry_kill_rate REAL DEFAULT 0.0, -- entry_kills per T round
tac_entry_death_rate REAL DEFAULT 0.0,
tac_opening_duel_winrate REAL DEFAULT 0.0, -- entry_kills / (entry_kills + entry_deaths)
-- Multi-Kill (6)
tac_avg_2k REAL DEFAULT 0.0,
tac_avg_3k REAL DEFAULT 0.0,
tac_avg_4k REAL DEFAULT 0.0,
tac_avg_5k REAL DEFAULT 0.0,
tac_multikill_rate REAL DEFAULT 0.0, -- (2k+3k+4k+5k) / rounds
tac_ace_count INTEGER DEFAULT 0,
-- Clutch Performance (10)
tac_clutch_1v1_attempts INTEGER DEFAULT 0,
tac_clutch_1v1_wins INTEGER DEFAULT 0,
tac_clutch_1v1_rate REAL DEFAULT 0.0, -- wins / attempts
tac_clutch_1v2_attempts INTEGER DEFAULT 0,
tac_clutch_1v2_wins INTEGER DEFAULT 0,
tac_clutch_1v2_rate REAL DEFAULT 0.0,
tac_clutch_1v3_plus_attempts INTEGER DEFAULT 0, -- 1v3+1v4+1v5 combined
tac_clutch_1v3_plus_wins INTEGER DEFAULT 0,
tac_clutch_1v3_plus_rate REAL DEFAULT 0.0,
tac_clutch_impact_score REAL DEFAULT 0.0, -- Weighted: 1v1*1 + 1v2*3 + 1v3*7 + 1v4*15 + 1v5*30
-- Utility Mastery (12)
tac_util_flash_per_round REAL DEFAULT 0.0,
tac_util_smoke_per_round REAL DEFAULT 0.0,
tac_util_molotov_per_round REAL DEFAULT 0.0,
tac_util_he_per_round REAL DEFAULT 0.0,
tac_util_usage_rate REAL DEFAULT 0.0, -- Total nades / rounds
tac_util_nade_dmg_per_round REAL DEFAULT 0.0,
tac_util_nade_dmg_per_nade REAL DEFAULT 0.0,
tac_util_flash_time_per_round REAL DEFAULT 0.0,
tac_util_flash_enemies_per_round REAL DEFAULT 0.0,
tac_util_flash_efficiency REAL DEFAULT 0.0, -- flash_enemies / flash_usage
tac_util_smoke_timing_score REAL DEFAULT 0.0, -- Based on smoke usage in execute (40-60s)
tac_util_impact_score REAL DEFAULT 0.0, -- Composite utility impact
-- Economy Efficiency (8)
tac_eco_dmg_per_1k REAL DEFAULT 0.0, -- damage / (equipment_value / 1000)
tac_eco_kpr_eco_rounds REAL DEFAULT 0.0, -- KPR when equipment < $2000
tac_eco_kd_eco_rounds REAL DEFAULT 0.0,
tac_eco_kpr_force_rounds REAL DEFAULT 0.0, -- $2000-$4000
tac_eco_kpr_full_rounds REAL DEFAULT 0.0, -- $4000+
tac_eco_save_discipline REAL DEFAULT 0.0, -- % of eco rounds with proper save
tac_eco_force_success_rate REAL DEFAULT 0.0, -- Win rate in force buy rounds
tac_eco_efficiency_score REAL DEFAULT 0.0, -- Composite economic efficiency
-- ==========================================
-- Tier 3: INTELLIGENCE - High IQ Kills (9)
-- ==========================================
int_wallbang_kills INTEGER DEFAULT 0,
int_wallbang_rate REAL DEFAULT 0.0, -- wallbang / total_kills
int_smoke_kills INTEGER DEFAULT 0,
int_smoke_kill_rate REAL DEFAULT 0.0,
int_blind_kills INTEGER DEFAULT 0,
int_blind_kill_rate REAL DEFAULT 0.0,
int_noscope_kills INTEGER DEFAULT 0,
int_noscope_rate REAL DEFAULT 0.0, -- noscope / awp_kills
int_high_iq_score REAL DEFAULT 0.0, -- Weighted: wallbang*3 + smoke*2 + blind*1.5 + noscope*2
-- Timing Analysis (12)
int_timing_early_kills INTEGER DEFAULT 0, -- 0-30s
int_timing_mid_kills INTEGER DEFAULT 0, -- 30-60s
int_timing_late_kills INTEGER DEFAULT 0, -- 60s+
int_timing_early_kill_share REAL DEFAULT 0.0,
int_timing_mid_kill_share REAL DEFAULT 0.0,
int_timing_late_kill_share REAL DEFAULT 0.0,
int_timing_avg_kill_time REAL DEFAULT 0.0, -- Avg seconds from round start
int_timing_early_deaths INTEGER DEFAULT 0,
int_timing_early_death_rate REAL DEFAULT 0.0,
int_timing_aggression_index REAL DEFAULT 0.0, -- early_kills / early_deaths
int_timing_patience_score REAL DEFAULT 0.0, -- late_kills / total_kills
int_timing_first_contact_time REAL DEFAULT 0.0, -- Avg time to first engagement
-- Pressure Performance (10)
int_pressure_comeback_kd REAL DEFAULT 0.0, -- KD when down 4+ rounds
int_pressure_comeback_rating REAL DEFAULT 0.0,
int_pressure_losing_streak_kd REAL DEFAULT 0.0, -- KD during 3+ round loss streak
int_pressure_matchpoint_kpr REAL DEFAULT 0.0, -- KPR at match point (15-X or 12-X)
int_pressure_matchpoint_rating REAL DEFAULT 0.0,
int_pressure_clutch_composure REAL DEFAULT 0.0, -- Clutch rate in must-win situations
int_pressure_entry_in_loss REAL DEFAULT 0.0, -- FK rate in losing matches
int_pressure_performance_index REAL DEFAULT 0.0, -- Composite pressure metric
int_pressure_big_moment_score REAL DEFAULT 0.0, -- Weighted matchpoint + comeback performance
int_pressure_tilt_resistance REAL DEFAULT 0.0, -- rating_in_loss / rating_in_win
-- Position Mastery (15) - Based on xyz clustering
int_pos_site_a_control_rate REAL DEFAULT 0.0, -- % of rounds controlling A site
int_pos_site_b_control_rate REAL DEFAULT 0.0,
int_pos_mid_control_rate REAL DEFAULT 0.0,
int_pos_favorite_position TEXT, -- Most common position cluster
int_pos_position_diversity REAL DEFAULT 0.0, -- Entropy of position usage
int_pos_rotation_speed REAL DEFAULT 0.0, -- Avg distance traveled between kills
int_pos_map_coverage REAL DEFAULT 0.0, -- % of map areas visited
int_pos_defensive_positioning REAL DEFAULT 0.0, -- CT: avg distance from site
int_pos_aggressive_positioning REAL DEFAULT 0.0, -- T: avg distance pushed
int_pos_lurk_tendency REAL DEFAULT 0.0, -- % of rounds alone vs teammates
int_pos_site_anchor_score REAL DEFAULT 0.0, -- Consistency holding site
int_pos_entry_route_diversity REAL DEFAULT 0.0, -- Different entry paths used
int_pos_retake_positioning REAL DEFAULT 0.0, -- Performance in retake scenarios
int_pos_postplant_positioning REAL DEFAULT 0.0, -- Position quality after plant
int_pos_spatial_iq_score REAL DEFAULT 0.0, -- Composite positioning intelligence
-- Trade Network (8)
int_trade_kill_count INTEGER DEFAULT 0, -- Kills within 5s of teammate death
int_trade_kill_rate REAL DEFAULT 0.0, -- trade_kills / total_kills
int_trade_response_time REAL DEFAULT 0.0, -- Avg seconds to trade teammate
int_trade_given_count INTEGER DEFAULT 0, -- Deaths traded by teammate
int_trade_given_rate REAL DEFAULT 0.0, -- traded_deaths / total_deaths
int_trade_balance REAL DEFAULT 0.0, -- trades_given - trades_made
int_trade_efficiency REAL DEFAULT 0.0, -- (trade_kills + traded_deaths) / (total_kills + deaths)
int_teamwork_score REAL DEFAULT 0.0, -- Composite teamwork metric
-- ==========================================
-- Tier 4: META - Stability (8)
-- ==========================================
meta_rating_volatility REAL DEFAULT 0.0, -- STDDEV of last 20 matches
meta_recent_form_rating REAL DEFAULT 0.0, -- AVG of last 10 matches
meta_win_rating REAL DEFAULT 0.0, -- AVG rating in wins
meta_loss_rating REAL DEFAULT 0.0, -- AVG rating in losses
meta_rating_consistency REAL DEFAULT 0.0, -- 100 - volatility_normalized
meta_time_rating_correlation REAL DEFAULT 0.0, -- Correlation(match_time, rating)
meta_map_stability REAL DEFAULT 0.0, -- STDDEV of rating across maps
meta_elo_tier_stability REAL DEFAULT 0.0, -- STDDEV of rating across opponent ELO tiers
-- Side Preference (14)
meta_side_ct_rating REAL DEFAULT 0.0,
meta_side_t_rating REAL DEFAULT 0.0,
meta_side_ct_kd REAL DEFAULT 0.0,
meta_side_t_kd REAL DEFAULT 0.0,
meta_side_ct_win_rate REAL DEFAULT 0.0,
meta_side_t_win_rate REAL DEFAULT 0.0,
meta_side_ct_fk_rate REAL DEFAULT 0.0, -- FK per CT round
meta_side_t_fk_rate REAL DEFAULT 0.0,
meta_side_ct_kast REAL DEFAULT 0.0,
meta_side_t_kast REAL DEFAULT 0.0,
meta_side_rating_diff REAL DEFAULT 0.0, -- CT - T
meta_side_kd_diff REAL DEFAULT 0.0,
meta_side_preference TEXT, -- 'CT', 'T', or 'Balanced'
meta_side_balance_score REAL DEFAULT 0.0, -- 100 - ABS(CT_rating - T_rating)*50
-- Opponent Adaptation (12)
meta_opp_vs_lower_elo_rating REAL DEFAULT 0.0, -- vs opponents -200 ELO
meta_opp_vs_similar_elo_rating REAL DEFAULT 0.0, -- vs ±200 ELO
meta_opp_vs_higher_elo_rating REAL DEFAULT 0.0, -- vs +200 ELO
meta_opp_vs_lower_elo_kd REAL DEFAULT 0.0,
meta_opp_vs_similar_elo_kd REAL DEFAULT 0.0,
meta_opp_vs_higher_elo_kd REAL DEFAULT 0.0,
meta_opp_elo_adaptation REAL DEFAULT 0.0, -- higher_elo_rating / lower_elo_rating
meta_opp_stomping_score REAL DEFAULT 0.0, -- Performance vs weaker opponents
meta_opp_upset_score REAL DEFAULT 0.0, -- Performance vs stronger opponents
meta_opp_consistency_across_elos REAL DEFAULT 0.0, -- 100 - STDDEV(rating by elo tier)
meta_opp_rank_resistance REAL DEFAULT 0.0, -- Win rate vs higher ELO
meta_opp_smurf_detection REAL DEFAULT 0.0, -- Abnormally high performance vs lower ELO
-- Map Specialization (10)
meta_map_best_map TEXT,
meta_map_best_rating REAL DEFAULT 0.0,
meta_map_worst_map TEXT,
meta_map_worst_rating REAL DEFAULT 0.0,
meta_map_diversity REAL DEFAULT 0.0, -- Entropy of map ratings
meta_map_pool_size INTEGER DEFAULT 0, -- Number of maps with 5+ matches
meta_map_specialist_score REAL DEFAULT 0.0, -- (best - worst) rating
meta_map_versatility REAL DEFAULT 0.0, -- 100 - map_stability
meta_map_comfort_zone_rate REAL DEFAULT 0.0, -- % of matches on top 3 maps
meta_map_adaptation REAL DEFAULT 0.0, -- Avg rating on non-favorite maps
-- Session Pattern (8)
meta_session_avg_matches_per_day REAL DEFAULT 0.0,
meta_session_longest_streak INTEGER DEFAULT 0, -- Days played consecutively
meta_session_weekend_rating REAL DEFAULT 0.0,
meta_session_weekday_rating REAL DEFAULT 0.0,
meta_session_morning_rating REAL DEFAULT 0.0, -- 6-12h
meta_session_afternoon_rating REAL DEFAULT 0.0, -- 12-18h
meta_session_evening_rating REAL DEFAULT 0.0, -- 18-24h
meta_session_night_rating REAL DEFAULT 0.0, -- 0-6h
-- ==========================================
-- Tier 5: COMPOSITE - Radar Scores (8)
-- ==========================================
score_aim REAL DEFAULT 0.0, -- 0-100 normalized
score_clutch REAL DEFAULT 0.0,
score_pistol REAL DEFAULT 0.0,
score_defense REAL DEFAULT 0.0,
score_utility REAL DEFAULT 0.0,
score_stability REAL DEFAULT 0.0,
score_economy REAL DEFAULT 0.0,
score_pace REAL DEFAULT 0.0,
-- Overall composite
score_overall REAL DEFAULT 0.0, -- AVG of all 8 scores
-- Performance tier classification
tier_classification TEXT, -- 'Elite', 'Advanced', 'Intermediate', 'Beginner'
tier_percentile REAL DEFAULT 0.0, -- Overall percentile rank
-- Foreign key to the player dimension table
FOREIGN KEY (steam_id_64) REFERENCES dim_players(steam_id_64)
);
CREATE INDEX idx_dm_player_features_rating ON dm_player_features(core_avg_rating DESC);
CREATE INDEX idx_dm_player_features_matches ON dm_player_features(total_matches DESC);
CREATE INDEX idx_dm_player_features_tier ON dm_player_features(tier_classification);
Column counts:
- Tier 1 CORE: 41 columns
- Tier 2 TACTICAL: 44 columns
- Tier 3 INTELLIGENCE: 54 columns
- Tier 4 META: 52 columns
- Tier 5 COMPOSITE: 11 columns
- Meta + Keys: 6 columns
- Total: 208 columns
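Several columns in dm_player_features (`core_weapon_diversity`, `int_pos_position_diversity`, `meta_map_diversity`) are described as entropy values. A minimal sketch of Shannon entropy over usage counts follows. The choice of log base 2 (bits) is an assumption, as the schema comments do not fix a base.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts.

    counts: e.g. kills per weapon or matches per map. Zero counts are ignored.
    Returns 0.0 for an empty or all-zero input.
    """
    total = sum(counts)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

A player who uses two weapons equally scores 1 bit, while a player who only ever uses one weapon scores 0, so higher values indicate a more varied pool.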
2.2 Auxiliary table: dm_player_match_history
Purpose: supports time-series analysis and trend charts
CREATE TABLE dm_player_match_history (
steam_id_64 TEXT,
match_id TEXT,
match_date INTEGER, -- Unix timestamp
match_sequence INTEGER, -- Player's N-th match
-- Core performance
rating REAL,
kd_ratio REAL,
adr REAL,
kast REAL,
is_win BOOLEAN,
-- Match context
map_name TEXT,
opponent_avg_elo REAL,
teammate_avg_rating REAL,
-- Cumulative stats (for moving averages)
cumulative_rating REAL, -- AVG up to this match
rolling_10_rating REAL, -- Last 10 matches AVG
PRIMARY KEY (steam_id_64, match_id),
FOREIGN KEY (steam_id_64) REFERENCES dim_players(steam_id_64),
FOREIGN KEY (match_id) REFERENCES fact_matches(match_id)
);
CREATE INDEX idx_player_history_player_date ON dm_player_match_history(steam_id_64, match_date DESC);
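The `match_sequence`, `cumulative_rating`, and `rolling_10_rating` columns can be filled in a single pass over a player's matches. The sketch below assumes the ratings arrive ordered oldest to newest. The function name and the dict-per-row output are illustrative, not part of the plan.

```python
def add_rolling_columns(match_ratings, window=10):
    """Build history rows with cumulative and rolling averages.

    match_ratings: one player's ratings, ordered oldest to newest.
    Returns one dict per match, keyed by dm_player_match_history column names.
    """
    rows = []
    running_sum = 0.0
    for seq, rating in enumerate(match_ratings, start=1):
        running_sum += rating
        # Last `window` matches up to and including this one
        recent = match_ratings[max(0, seq - window):seq]
        rows.append({
            "match_sequence": seq,
            "rating": rating,
            "cumulative_rating": running_sum / seq,
            "rolling_10_rating": sum(recent) / len(recent),
        })
    return rows
```

For the first nine matches the rolling value averages over fewer than ten matches, and whether to store NULL instead is a design choice left open here.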
2.3 Auxiliary table: dm_player_map_stats
Purpose: per-map breakdown statistics
CREATE TABLE dm_player_map_stats (
steam_id_64 TEXT,
map_name TEXT,
matches INTEGER DEFAULT 0,
wins INTEGER DEFAULT 0,
win_rate REAL DEFAULT 0.0,
avg_rating REAL DEFAULT 0.0,
avg_kd REAL DEFAULT 0.0,
avg_adr REAL DEFAULT 0.0,
avg_kast REAL DEFAULT 0.0,
best_rating REAL DEFAULT 0.0,
worst_rating REAL DEFAULT 0.0,
PRIMARY KEY (steam_id_64, map_name),
FOREIGN KEY (steam_id_64) REFERENCES dim_players(steam_id_64)
);
2.4 Auxiliary table: dm_player_weapon_stats
Purpose: weapon usage statistics (Top 10)
CREATE TABLE dm_player_weapon_stats (
steam_id_64 TEXT,
weapon_name TEXT,
total_kills INTEGER DEFAULT 0,
total_headshots INTEGER DEFAULT 0,
hs_rate REAL DEFAULT 0.0,
usage_rounds INTEGER DEFAULT 0, -- Rounds used this weapon
usage_rate REAL DEFAULT 0.0, -- % of all rounds
avg_kills_per_round REAL DEFAULT 0.0, -- When used
effectiveness_score REAL DEFAULT 0.0, -- Composite weapon skill
PRIMARY KEY (steam_id_64, weapon_name),
FOREIGN KEY (steam_id_64) REFERENCES dim_players(steam_id_64)
);
Part 3: Processor Architecture
3.1 Processor Responsibilities
L3_Builder.py (main controller)
├── BasicProcessor (Tier 1: CORE)
│   ├── calculate_basic_stats()
│   ├── calculate_match_stats()
│   ├── calculate_weapon_stats()
│   └── calculate_objective_stats()
│
├── TacticalProcessor (Tier 2: TACTICAL)
│   ├── calculate_opening_impact()
│   ├── calculate_multikill()
│   ├── calculate_clutch()
│   ├── calculate_utility()
│   └── calculate_economy()
│
├── IntelligenceProcessor (Tier 3: INTELLIGENCE)
│   ├── calculate_high_iq_kills()
│   ├── calculate_timing_analysis()
│   ├── calculate_pressure_performance()
│   ├── calculate_position_mastery()   # Uses xyz
│   └── calculate_trade_network()
│
├── MetaProcessor (Tier 4: META)
│   ├── calculate_stability()
│   ├── calculate_side_preference()
│   ├── calculate_opponent_adaptation()
│   ├── calculate_map_specialization()
│   └── calculate_session_pattern()
│
└── CompositeProcessor (Tier 5: COMPOSITE)
    ├── normalize_and_standardize()    # Z-score normalization
    ├── calculate_radar_scores()       # 8 dimensions
    └── classify_tier()                # Elite/Advanced/Intermediate/Beginner
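3.2 Standard Processor Interface
Every processor implements a common interface:
import sqlite3


class BaseFeatureProcessor:
    @staticmethod
    def calculate(steam_id: str, conn_l2: sqlite3.Connection) -> dict:
        """
        Compute all features this processor is responsible for.

        Args:
            steam_id: the player's Steam ID
            conn_l2: connection to the L2 database

        Returns:
            dict: {column_name: value, ...}
        """
        raise NotImplementedError  # each concrete processor overrides this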
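To make the interface concrete, here is a minimal sketch of what one processor could look like. It is not the planned BasicProcessor: the class name is hypothetical, and the `rating`, `kills`, and `deaths` columns of `fact_match_players` are assumed, since the L2 schema is not reproduced in this plan. It also computes `core_avg_kd` as total kills over total deaths, which is one of two plausible readings of "average K/D".

```python
import sqlite3


class BasicProcessorSketch:
    """Illustrative Tier 1 processor following the calculate() interface."""

    @staticmethod
    def calculate(steam_id: str, conn_l2: sqlite3.Connection) -> dict:
        # One aggregate query; column names other than steam_id_64 are assumed
        row = conn_l2.execute(
            """
            SELECT COUNT(*), AVG(rating), SUM(kills), SUM(deaths)
            FROM fact_match_players
            WHERE steam_id_64 = ?
            """,
            (steam_id,),
        ).fetchone()
        matches, avg_rating, kills, deaths = row
        kills = kills or 0    # SUM() is NULL when the player has no rows
        deaths = deaths or 0
        return {
            "total_matches": matches,
            "core_avg_rating": avg_rating or 0.0,
            "core_total_kills": kills,
            "core_total_deaths": deaths,
            "core_avg_kd": kills / deaths if deaths else 0.0,
        }


def demo_connection() -> sqlite3.Connection:
    """In-memory stand-in for the L2 database, for experimentation only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_match_players ("
        "steam_id_64 TEXT, match_id TEXT, rating REAL, kills INTEGER, deaths INTEGER)"
    )
    conn.executemany(
        "INSERT INTO fact_match_players VALUES (?, ?, ?, ?, ?)",
        [("p1", "m1", 1.0, 20, 10), ("p1", "m2", 1.5, 30, 15)],
    )
    return conn
```

Because the returned keys are dm_player_features column names, the builder can merge the dicts from all processors without any per-processor mapping.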
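3.3 Dependencies
Tier 1 (CORE) → no dependencies, computed directly from L2
Tier 2 (TACTICAL) → may depend on Tier 1 base values such as total_rounds
Tier 3 (INTELLIGENCE) → computed independently from the L2 events tables
Tier 4 (META) → depends on Tier 1 base statistics such as rating
Tier 5 (COMPOSITE) → depends on all Tier 1-4 features, computed last
Execution order:
- BasicProcessor (CORE)
- TacticalProcessor + IntelligenceProcessor (in parallel; neither depends on the other)
- MetaProcessor (needs the CORE rating)
- CompositeProcessor (needs all preceding features)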
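The ordering above could be orchestrated roughly as follows. This is a sketch only: the five callables stand in for the real processors and their signatures are hypothetical. Note that a `sqlite3` connection can by default only be used from the thread that created it, so in a threaded version each parallel stage would need its own L2 connection.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(steam_id, core, tactical, intelligence, meta, composite):
    """Run the five tiers in dependency order and merge their feature dicts.

    core(steam_id), tactical(steam_id, features), intelligence(steam_id),
    meta(steam_id, features), composite(features): hypothetical callables
    that each return {column_name: value}.
    """
    features = {}
    features.update(core(steam_id))  # stage 1: no dependencies

    # Stage 2: TACTICAL and INTELLIGENCE do not depend on each other
    with ThreadPoolExecutor(max_workers=2) as pool:
        tac_future = pool.submit(tactical, steam_id, dict(features))
        int_future = pool.submit(intelligence, steam_id)
        tac_result = tac_future.result()
        int_result = int_future.result()
    features.update(tac_result)
    features.update(int_result)

    features.update(meta(steam_id, dict(features)))  # stage 3: needs CORE
    features.update(composite(dict(features)))       # stage 4: needs everything
    return features
```

Whether threads help at all depends on how much time is spent in SQLite versus Python. Running the two middle stages sequentially is equally valid.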
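Part 4: Web Services Architecture
4.1 Service Layer Refactoring
Principles:
- Services only query, they do not compute
- Complex aggregation logic lives in the L3 processors
- Services provide a convenient data-access interface
# web/services/player_service.py (new)
class PlayerService:
    """Query service for player features."""

    @staticmethod
    def get_player_info(steam_id: str) -> dict:
        """Fetch basic player info (from dim_players); used by the profile route."""
        pass

    @staticmethod
    def get_player_features(steam_id: str) -> dict:
        """Fetch the player's full feature set (one dm_player_features row)."""
        pass

    @staticmethod
    def get_player_radar_data(steam_id: str) -> dict:
        """Fetch the radar chart data (8 dimensions)."""
        pass

    @staticmethod
    def get_player_core_stats(steam_id: str) -> dict:
        """Fetch the core statistics (for the Dashboard)."""
        pass

    @staticmethod
    def get_player_history(steam_id: str, limit: int = 20) -> list:
        """Fetch the most recent N matches (for the trend chart)."""
        pass

    @staticmethod
    def get_player_map_stats(steam_id: str) -> list:
        """Fetch the per-map statistics."""
        pass

    @staticmethod
    def get_player_weapon_stats(steam_id: str, top_n: int = 10) -> list:
        """Fetch the Top N weapon statistics."""
        pass

    @staticmethod
    def get_players_ranking(
        order_by: str = 'core_avg_rating',
        limit: int = 100,
        offset: int = 0
    ) -> list:
        """Fetch the player leaderboard."""
        pass

    @staticmethod
    def compare_players(steam_ids: list) -> dict:
        """Compare the features of several players."""
        pass


# web/services/stats_service.py (refactored)
class StatsService:
    """Statistics service (keeps the existing L2 query methods)."""

    # Existing methods are kept for non-profile pages such as match detail
    @staticmethod
    def get_match_stats(match_id: str) -> dict:
        """Fetch match statistics (from L2 fact_matches)."""
        pass

    @staticmethod
    def get_round_events(match_id: str, round_num: int) -> list:
        """Fetch round events (from L2 fact_round_events)."""
        pass

    # New: global statistics query
    @staticmethod
    def get_global_stats() -> dict:
        """Global statistics: total matches, total players, average rating, etc."""
        pass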
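As an illustration of the "query only" principle, a service method can reduce to a single parameterized SELECT. The sketch below shows one possible shape for the radar query. Passing the L3 connection explicitly is an assumption made to keep the example self-contained, since the plan does not say how services obtain their connection.

```python
import sqlite3

# The eight radar columns from Tier 5 of dm_player_features
RADAR_COLUMNS = (
    "score_aim", "score_clutch", "score_pistol", "score_defense",
    "score_utility", "score_stability", "score_economy", "score_pace",
)


def get_player_radar_data(conn_l3: sqlite3.Connection, steam_id: str):
    """Return {"aim": ..., "clutch": ..., ...} or None if the player is unknown."""
    # Column names come from the constant above, never from user input
    sql = (
        f"SELECT {', '.join(RADAR_COLUMNS)} "
        "FROM dm_player_features WHERE steam_id_64 = ?"
    )
    row = conn_l3.execute(sql, (steam_id,)).fetchone()
    if row is None:
        return None
    # Strip the "score_" prefix so keys match the radar axes in the template
    return {col[len("score_"):]: value for col, value in zip(RADAR_COLUMNS, row)}
```

No arithmetic happens here: the scores were computed by CompositeProcessor and are only read back.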
4.2 Routes Layer Adaptation
# web/routes/players.py (refactored)
from flask import Blueprint, jsonify, render_template, request

from web.services.player_service import PlayerService

# Assumed here for completeness; reuse the existing blueprint if the module already defines one
bp = Blueprint('players', __name__)


@bp.route('/profile/<steam_id>')
def player_profile(steam_id):
    """Player Profile page."""
    # 1. Basic player info (dim_players)
    player_info = PlayerService.get_player_info(steam_id)
    # 2. Feature data (dm_player_features)
    features = PlayerService.get_player_features(steam_id)
    # 3. History trend (dm_player_match_history)
    history = PlayerService.get_player_history(steam_id, limit=20)
    # 4. Map statistics (dm_player_map_stats)
    map_stats = PlayerService.get_player_map_stats(steam_id)
    # 5. Weapon statistics (dm_player_weapon_stats)
    weapon_stats = PlayerService.get_player_weapon_stats(steam_id, top_n=10)
    return render_template('players/profile.html',
                           player=player_info,
                           features=features,
                           history=history,
                           map_stats=map_stats,
                           weapon_stats=weapon_stats)


@bp.route('/api/players/<steam_id>/features')
def api_player_features(steam_id):
    """API: player features (JSON)."""
    features = PlayerService.get_player_features(steam_id)
    return jsonify(features)


@bp.route('/api/players/ranking')
def api_ranking():
    """API: player leaderboard."""
    # order_by comes from the client and must be validated before it reaches SQL
    order_by = request.args.get('order_by', 'core_avg_rating')
    # type=int falls back to the default instead of raising on malformed input
    limit = request.args.get('limit', 100, type=int)
    offset = request.args.get('offset', 0, type=int)
    players = PlayerService.get_players_ranking(
        order_by=order_by,
        limit=limit,
        offset=offset
    )
    return jsonify(players)
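Because `order_by` arrives from the query string and a column name cannot be bound as a SQL parameter, the ranking query is a likely injection point. One hedged way to handle it is a whitelist check before the name is interpolated. The allowed set, the `max_limit` cap, and the function name below are all illustrative assumptions.

```python
# Columns the leaderboard may be sorted by (illustrative subset)
ALLOWED_ORDER_COLUMNS = frozenset({
    "core_avg_rating", "core_avg_kd", "core_avg_adr", "total_matches",
    "score_overall", "score_aim", "score_clutch",
})


def build_ranking_query(order_by="core_avg_rating", limit=100, offset=0, max_limit=500):
    """Return (sql, params) for the leaderboard, rejecting unknown sort columns."""
    if order_by not in ALLOWED_ORDER_COLUMNS:
        raise ValueError(f"unsupported order_by column: {order_by!r}")
    limit = max(1, min(int(limit), max_limit))  # clamp page size
    offset = max(0, int(offset))                # negative offsets make no sense
    sql = (
        f"SELECT steam_id_64, {order_by} FROM dm_player_features "
        f"ORDER BY {order_by} DESC LIMIT ? OFFSET ?"
    )
    return sql, (limit, offset)
```

`PlayerService.get_players_ranking` could call this and pass the result to `cursor.execute(sql, params)`, while the route turns a `ValueError` into a 400 response.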
4.3 Template Data Mapping
Structure of profile.html:
{# Dashboard Cards #}
<div class="dashboard">
<div class="card">Rating: {{ features.core_avg_rating }}</div>
<div class="card">K/D: {{ features.core_avg_kd }}</div>
<div class="card">ADR: {{ features.core_avg_adr }}</div>
<div class="card">KAST: {{ (features.core_avg_kast * 100) | round(1) }}%</div>
</div>
{# Radar Chart #}
<canvas id="radarChart" data-scores='{{
{
"aim": features.score_aim,
"clutch": features.score_clutch,
"pistol": features.score_pistol,
"defense": features.score_defense,
"utility": features.score_utility,
"stability": features.score_stability,
"economy": features.score_economy,
"pace": features.score_pace
} | tojson
}}'></canvas>
{# Trend Chart #}
<canvas id="trendChart" data-history='{{ history | tojson }}'></canvas>
{# Core Performance Section #}
<div class="stats-grid">
<div>Rating: {{ features.core_avg_rating | round(2) }}</div>
<div>K/D: {{ features.core_avg_kd | round(2) }}</div>
<div>KAST: {{ (features.core_avg_kast * 100) | round(1) }}%</div>
<div>RWS: {{ features.core_avg_rws | round(1) }}</div>
<div>ADR: {{ features.core_avg_adr | round(1) }}</div>
</div>
{# Gunfight Section #}
<div class="stats-grid">
<div>Avg HS: {{ features.core_avg_hs_kills | round(1) }}</div>
<div>HS Rate: {{ (features.core_hs_rate * 100) | round(1) }}%</div>
<div>Assists: {{ features.core_avg_assists | round(1) }}</div>
<div>AWP K: {{ features.core_avg_awp_kills | round(1) }}</div>
<div>Knife K: {{ features.core_avg_knife_kills | round(2) }}</div>
<div>Zeus K: {{ features.core_avg_zeus_kills | round(2) }}</div>
</div>
{# Opening Impact Section #}
<div class="stats-grid">
<div>FK: {{ features.tac_avg_fk | round(1) }}</div>
<div>FD: {{ features.tac_avg_fd | round(1) }}</div>
<div>FK Rate: {{ (features.tac_fk_rate * 100) | round(1) }}%</div>
<div>FD Rate: {{ (features.tac_fd_rate * 100) | round(1) }}%</div>
</div>
{# Clutch Section #}
<div class="stats-grid">
<div>1v1: {{ features.tac_clutch_1v1_wins }}/{{ features.tac_clutch_1v1_attempts }} ({{ (features.tac_clutch_1v1_rate * 100) | round(1) }}%)</div>
<div>1v2: {{ features.tac_clutch_1v2_wins }}/{{ features.tac_clutch_1v2_attempts }} ({{ (features.tac_clutch_1v2_rate * 100) | round(1) }}%)</div>
<div>1v3+: {{ features.tac_clutch_1v3_plus_wins }}/{{ features.tac_clutch_1v3_plus_attempts }} ({{ (features.tac_clutch_1v3_plus_rate * 100) | round(1) }}%)</div>
</div>
{# High IQ Kills Section #}
<div class="stats-grid">
<div>Wallbang: {{ features.int_wallbang_kills }} ({{ (features.int_wallbang_rate * 100) | round(2) }}%)</div>
<div>Smoke: {{ features.int_smoke_kills }} ({{ (features.int_smoke_kill_rate * 100) | round(2) }}%)</div>
<div>Blind: {{ features.int_blind_kills }} ({{ (features.int_blind_kill_rate * 100) | round(2) }}%)</div>
<div>NoScope: {{ features.int_noscope_kills }} ({{ (features.int_noscope_rate * 100) | round(2) }}%)</div>
<div>IQ Score: {{ features.int_high_iq_score | round(1) }}</div>
</div>
{# Map Stats Section #}
{% for map_stat in map_stats %}
<div class="map-row">
<span>{{ map_stat.map_name }}</span>
<span>{{ map_stat.matches }} matches</span>
<span>{{ (map_stat.win_rate * 100) | round(1) }}%</span>
<span>{{ map_stat.avg_rating | round(2) }}</span>
</div>
{% endfor %}
{# Weapon Stats Section #}
{% for weapon in weapon_stats %}
<div class="weapon-row">
<span>{{ weapon.weapon_name }}</span>
<span>{{ weapon.total_kills }} kills</span>
<span>{{ (weapon.hs_rate * 100) | round(1) }}% HS</span>
<span>{{ (weapon.usage_rate * 100) | round(1) }}% usage</span>
</div>
{% endfor %}
Part 5: Implementation Plan
Phase 1: Schema & Infrastructure (1-2 days)
- ✅ Create the L3 schema (dm_player_features + auxiliary tables)
- ✅ Initialize L3.db
- ✅ Create the processor base class
Phase 2: Core Processors (2-3 days)
- Implement BasicProcessor (Tier 1)
- Implement TacticalProcessor (Tier 2)
- Test the basic feature calculations
Phase 3: Advanced Processors (2-3 days)
- Implement IntelligenceProcessor (Tier 3)
- Implement MetaProcessor (Tier 4)
- Implement CompositeProcessor (Tier 5)
Phase 4: Services Refactoring (1-2 days)
- Create PlayerService
- Refactor StatsService
- Update the Routes layer
Phase 5: Testing & Validation (1 day)
- Run a full L3_Builder build
- Verify that the feature calculations are correct
- Performance testing
Phase 6: Frontend Integration (2 days)
- Update the profile.html template
- Adapt to the new feature fields
- Test the UI display
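Part 6: Key Technical Points
6.1 Standardization and Normalization
Z-score standardization (used for the Composite Scores):
def z_score_normalize(value, mean, std):
    """Map a value to a 0-100 score via its z-score."""
    if std == 0:
        return 50.0  # no spread in the population: return the midpoint
    z = (value - mean) / std
    # Map the z-score onto 0-100 with the mean at 50. At 15 points per σ,
    # ±3σ (about 99.7% of a normal distribution) lands in the range 5-95.
    normalized = 50 + (z * 15)
    return max(0.0, min(100.0, normalized))  # clamp outliers to [0, 100]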
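The mean and standard deviation passed to `z_score_normalize` have to come from the whole player population, which implies a two-pass build: compute Tier 1-4 features for every player first, then derive population statistics, then score. A minimal sketch of the middle step follows, producing the `<name>_mean` / `<name>_std` keys that section 6.2 reads from `all_players_stats`. The function name and the use of population standard deviation are assumptions.

```python
import statistics


def population_stats(all_features, columns):
    """Compute mean/std per component across all players.

    all_features: list of per-player feature dicts (Tier 1-4 already computed).
    columns: {short_name: feature_column}, e.g. {"rating": "core_avg_rating"}.
    Returns {"rating_mean": ..., "rating_std": ..., ...}.
    """
    stats = {}
    for short_name, column in columns.items():
        # Skip players for whom the feature is missing or NULL
        values = [f[column] for f in all_features if f.get(column) is not None]
        stats[f"{short_name}_mean"] = statistics.fmean(values) if values else 0.0
        stats[f"{short_name}_std"] = statistics.pstdev(values) if len(values) > 1 else 0.0
    return stats
```

Players below the minimum sample size should probably be excluded from `all_features`, otherwise their default 0.0 values would drag the means down.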
6.2 Weighted Score Calculation
Example: AIM Score
def calculate_aim_score(features, all_players_stats):
    """
    AIM Score = 25% Rating + 20% KD + 15% ADR + 10% DuelWin + 10% HighEloKD + 20% MultiKill
    """
    weights = {
        'rating': 0.25,
        'kd': 0.20,
        'adr': 0.15,
        'duel_win': 0.10,
        'high_elo_kd': 0.10,
        'multikill': 0.20
    }
    # Standardize each component separately
    rating_norm = z_score_normalize(features['core_avg_rating'],
                                    all_players_stats['rating_mean'],
                                    all_players_stats['rating_std'])
    kd_norm = z_score_normalize(features['core_avg_kd'],
                                all_players_stats['kd_mean'],
                                all_players_stats['kd_std'])
    # ... other components
    # Weighted sum
    aim_score = (
        rating_norm * weights['rating']
        + kd_norm * weights['kd']
        # + ... other components
    )
    return aim_score
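Since all eight radar scores share the same "standardize, then weight" shape, one generic helper driven by a component table may be simpler than eight hand-written functions. The sketch below is one possible generalization, not the planned implementation. In particular, the mapping of DuelWin, HighEloKD, and MultiKill to `tac_opening_duel_winrate`, `meta_opp_vs_higher_elo_kd`, and `tac_multikill_rate` is a guess based on column names.

```python
def to_score(value, mean, std):
    """Same mapping as z_score_normalize in 6.1: mean -> 50, 15 points per sigma."""
    if std == 0:
        return 50.0
    return max(0.0, min(100.0, 50 + ((value - mean) / std) * 15))


# short_name -> (feature column, weight); the last three columns are assumed mappings
AIM_COMPONENTS = {
    "rating": ("core_avg_rating", 0.25),
    "kd": ("core_avg_kd", 0.20),
    "adr": ("core_avg_adr", 0.15),
    "duel_win": ("tac_opening_duel_winrate", 0.10),
    "high_elo_kd": ("meta_opp_vs_higher_elo_kd", 0.10),
    "multikill": ("tac_multikill_rate", 0.20),
}


def weighted_score(features, population, components):
    """Weighted sum of standardized components, yielding a 0-100 score."""
    total_weight = sum(weight for _, weight in components.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1.0, got {total_weight}")
    score = 0.0
    for name, (column, weight) in components.items():
        score += weight * to_score(
            features[column],
            population[f"{name}_mean"],
            population[f"{name}_std"],
        )
    return score
```

Because every component is clamped to 0-100 and the weights sum to 1, the result stays within 0-100 without a final clamp. A player exactly at the population mean on every component scores 50.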
6.3 Time-Window Analysis
Trade kill detection (5-second window):
WITH death_events AS (
    SELECT
        match_id, round_num, event_time,
        victim_steam_id AS dead_player,
        attacker_steam_id AS killer
    FROM fact_round_events
    WHERE event_type = 'kill' AND victim_steam_id IN (
        SELECT steam_id FROM team_mates  -- teammates on the same side (placeholder relation)
    )
),
trade_kills AS (
    SELECT
        e1.attacker_steam_id,
        COUNT(*) AS trade_count
    FROM fact_round_events e1
    JOIN death_events d
        ON e1.match_id = d.match_id
        AND e1.round_num = d.round_num
        AND e1.victim_steam_id = d.killer  -- the enemy who killed the teammate
        AND e1.event_time BETWEEN d.event_time AND d.event_time + 5  -- within 5 seconds
    WHERE e1.event_type = 'kill'
    GROUP BY e1.attacker_steam_id
)
SELECT attacker_steam_id, trade_count FROM trade_kills;
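The same window logic can be expressed in plain Python, which makes the conditions easier to test than the SQL self-join. The sketch below works on the kill events of a single round and assumes a `teams` lookup from Steam ID to side. Both input shapes are hypothetical. A kill counts as a trade when its victim killed one of the attacker's teammates no more than `window` seconds earlier.

```python
def count_trade_kills(kills, teams, window=5.0):
    """Count trade kills per player within one round.

    kills: list of (event_time, attacker, victim), sorted by event_time.
    teams: {steam_id: team_label} for every player in the round.
    Returns {attacker: number_of_trade_kills}.
    """
    trade_counts = {}
    for i, (t_trade, trader, traded_victim) in enumerate(kills):
        # Look back at earlier deaths in the same round
        for t_death, killer, dead_player in kills[:i]:
            if (
                killer == traded_victim                    # victim had just killed...
                and dead_player != trader
                and teams[dead_player] == teams[trader]    # ...one of the trader's teammates
                and t_trade - t_death <= window            # within the trade window
            ):
                trade_counts[trader] = trade_counts.get(trader, 0) + 1
                break  # a single kill counts as at most one trade
    return trade_counts
```

Unlike the SQL version, the `break` prevents double counting when the traded enemy had killed two teammates inside the window.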
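6.4 Position Clustering Analysis
Position classification based on xyz:
from sklearn.cluster import DBSCAN
import numpy as np


def cluster_positions(xyz_data):
    """
    Identify frequently used positions with DBSCAN clustering.

    Args:
        xyz_data: [(x, y, z), ...]

    Returns:
        cluster_labels, position_names
    """
    coords = np.array(xyz_data)
    # DBSCAN parameters: eps = distance threshold, min_samples = minimum points per cluster
    clustering = DBSCAN(eps=500, min_samples=5).fit(coords)
    labels = clustering.labels_  # label -1 marks noise points outside any cluster
    # Assign a semantic name to each cluster (based on map areas);
    # map_cluster_to_semantic_name is a project helper not shown here
    position_names = map_cluster_to_semantic_name(coords, labels)
    return labels, position_names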
Part 7: Data Quality Assurance
7.1 Null-Handling Strategy
class SafeAggregator:
    @staticmethod
    def safe_divide(numerator, denominator, default=0.0):
        """Division that returns default for a missing operand or a zero denominator."""
        if numerator is None or not denominator:
            return default
        return numerator / denominator

    @staticmethod
    def safe_avg(values, default=0.0):
        """Average that returns default for an empty or missing sequence."""
        if not values:
            return default
        return sum(values) / len(values)
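7.2 Minimum Sample Size Requirements
MIN_MATCHES_FOR_FEATURES = {
    'core': 5,           # basic statistics: at least 5 matches
    'tactical': 10,      # tactical analysis: at least 10 matches
    'intelligence': 15,  # intelligence analysis: at least 15 matches
    'meta': 20,          # meta analysis: at least 20 matches
    'composite': 20,     # composite scores: at least 20 matches
}


def check_sample_size(steam_id, tier):
    """Check whether the player meets the minimum sample size for a tier."""
    match_count = get_player_match_count(steam_id)
    return match_count >= MIN_MATCHES_FOR_FEATURES[tier]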
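Part 8: Performance Optimization Strategy
8.1 Batch Computation
# L3_Builder.py main loop
def rebuild_all_features(conn_l2, conn_l3, batch_size=100):
    """Rebuild the features of all players in batches."""
    players = get_all_players()  # fetched from dim_players
    for processed, player in enumerate(players, start=1):
        steam_id = player['steam_id_64']
        # Compute all features, tier by tier
        features = {}
        features.update(BasicProcessor.calculate(steam_id, conn_l2))
        features.update(TacticalProcessor.calculate(steam_id, conn_l2))
        features.update(IntelligenceProcessor.calculate(steam_id, conn_l2))
        features.update(MetaProcessor.calculate(steam_id, conn_l2))
        # CompositeProcessor also receives the features computed so far
        features.update(CompositeProcessor.calculate(steam_id, conn_l2, features))
        # Write the row
        upsert_player_features(steam_id, features)
        # Commit once every batch_size players
        if processed % batch_size == 0:
            conn_l3.commit()
    conn_l3.commit()  # commit the final partial batch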
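`upsert_player_features` is not defined in the plan. One way to sketch it is SQLite's `INSERT ... ON CONFLICT DO UPDATE` (available from SQLite 3.24.0), building the column list from the feature dict. The explicit `conn_l3` parameter is an assumption made for self-containment, and the column names are interpolated into the SQL, so they must come from processor code and never from user input.

```python
import sqlite3


def upsert_player_features(conn_l3: sqlite3.Connection, steam_id: str, features: dict) -> None:
    """Insert or update one dm_player_features row; the caller commits."""
    # Keys must be trusted dm_player_features column names
    features = {k: v for k, v in features.items() if k != "steam_id_64"}
    if not features:
        return  # nothing to write
    columns = ["steam_id_64", *features.keys()]
    placeholders = ", ".join("?" for _ in columns)
    updates = ", ".join(f"{col} = excluded.{col}" for col in features)
    sql = (
        f"INSERT INTO dm_player_features ({', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT(steam_id_64) DO UPDATE SET {updates}"
    )
    conn_l3.execute(sql, [steam_id, *features.values()])
```

Leaving the commit to the caller is what allows the batch loop above to commit once per batch rather than once per player.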
8.2 Incremental Updates
def update_player_features_incremental(steam_id, new_match_id):
    """Incremental update: recompute only the features affected by the new match."""
    # 1. Fetch the existing features
    old_features = get_player_features(steam_id)
    # 2. Compute the statistics of the new match
    new_match_stats = get_match_player_stats(new_match_id, steam_id)
    # 3. Apply the incremental update (rolling averages, etc.)
    updated_features = incremental_update(old_features, new_match_stats)
    # 4. Write back to the database
    upsert_player_features(steam_id, updated_features)
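For plain averages and totals, `incremental_update` can use the running-mean identity new_mean = old_mean + (x - old_mean) / n, so no match history needs to be re-read. The sketch below covers only a few Tier 1 columns, and the `rating` and `kills` keys of `new_match_stats` are assumed. Windowed features (volatility, recent form) and the composite scores depend on more than the previous aggregate and would likely still need recomputation.

```python
def incremental_mean(old_mean, old_count, new_value):
    """Fold one new observation into a running mean; returns (mean, count)."""
    new_count = old_count + 1
    return old_mean + (new_value - old_mean) / new_count, new_count


def incremental_update(old_features, new_match_stats):
    """Return a copy of old_features updated with one new match.

    new_match_stats: per-match values; the 'rating' and 'kills' keys are assumed.
    """
    updated = dict(old_features)  # leave the caller's dict untouched
    matches = old_features.get("total_matches", 0)
    updated["core_avg_rating"], updated["total_matches"] = incremental_mean(
        old_features.get("core_avg_rating", 0.0), matches, new_match_stats["rating"]
    )
    updated["core_total_kills"] = (
        old_features.get("core_total_kills", 0) + new_match_stats["kills"]
    )
    return updated
```

A periodic full rebuild is still useful as a safeguard, since floating-point drift and late corrections to L2 data are not captured by incremental updates.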
8.3 Query Optimization
-- Create the necessary indexes
CREATE INDEX idx_match_players_steam ON fact_match_players(steam_id_64);
CREATE INDEX idx_round_events_attacker ON fact_round_events(attacker_steam_id);
CREATE INDEX idx_round_events_victim ON fact_round_events(victim_steam_id);
CREATE INDEX idx_round_events_time ON fact_round_events(match_id, round_num, event_time);
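Summary
This architecture plan delivers:
✅ Feature deduplication: eliminates all duplicated metrics in the Profile
✅ Deeper analysis: uses rounds/events/economy data for advanced feature engineering
✅ Modular design: five processor layers with a clear division of labor, easy to maintain and extend
✅ Service decoupling: web/services only queries, it does not compute
✅ Performance optimization: batch computation + incremental updates + query indexes
✅ Quality assurance: null handling + minimum sample sizes + a standardized normalization flow
Expected results:
- The L3 table contains 208 carefully designed columns
- Supports the complete Profile page
- Computation performance: roughly 10-15 minutes for 1000 players
- Query performance: single-player profile load < 100ms
Next step: begin implementation!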