docs: update the L1A incremental-update description and ETL documentation
12
ETL/L1A.py
@@ -1,4 +1,16 @@
+"""
+L1A Data Ingestion Script
+
+This script reads raw JSON files from the 'output_arena' directory and ingests them into the SQLite database.
+It supports incremental updates by default, skipping files that have already been processed.
+
+Usage:
+    python ETL/L1A.py          # Standard incremental run
+    python ETL/L1A.py --force  # Force re-process all files (overwrite existing data)
+"""
+
 import os
+
 import json
 import sqlite3
 import glob
@@ -1,7 +1,23 @@
-L1A output_arena/iframe_network.json -> L1A.sqlite(Primary Key: match_id)
+# ETL Pipeline Documentation
+
+## 1. L1A (Raw Data Ingestion)
+**Status**: ✅ Supports Incremental Update
+
+This script ingests raw JSON files from `output_arena/` into `database/L1A/L1A.sqlite`.
+
+### Usage
+```bash
+# Standard Run (Incremental)
+# Only processes new files that are not yet in the database.
+python ETL/L1A.py
+
+# Force Refresh
+# Reprocesses ALL files, overwriting existing records.
+python ETL/L1A.py --force
+```
 
 L1B demoparser2 -> L1B.sqlite
 
 L2 L1A.sqlite (+L1b.sqlite) -> L2.sqlite
 
-L3 Deep Dive.
+L3 Deep Dive
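The diff documents the incremental and `--force` modes but does not show the body of `ETL/L1A.py`. The following is a minimal sketch of how such a run could behave, not the script's actual implementation. The table name `raw_files`, its two columns, the helper names `ingest` and `parse_args`, and the use of the file stem as `match_id` are all assumptions made for illustration; only the `--force` flag and the `match_id` primary key come from the commit.

```python
import argparse
import glob
import json
import os
import sqlite3


def ingest(conn, input_dir, force=False):
    """Load *.json files from input_dir; return how many files were written."""
    # Hypothetical schema: one row per file, keyed by match_id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_files ("
        "match_id TEXT PRIMARY KEY, content TEXT)"
    )

    # In incremental mode, collect the keys that are already stored.
    seen = set()
    if not force:
        seen = {row[0] for row in conn.execute("SELECT match_id FROM raw_files")}

    written = 0
    for path in sorted(glob.glob(os.path.join(input_dir, "*.json"))):
        # Assumption: the file stem doubles as the match_id.
        match_id = os.path.splitext(os.path.basename(path))[0]
        if match_id in seen:
            continue  # already processed, skip in incremental mode
        with open(path, encoding="utf-8") as fh:
            payload = json.load(fh)
        # OR REPLACE overwrites an existing row with the same primary key,
        # which is what a forced run relies on.
        conn.execute(
            "INSERT OR REPLACE INTO raw_files (match_id, content) VALUES (?, ?)",
            (match_id, json.dumps(payload)),
        )
        written += 1
    conn.commit()
    return written


def parse_args(argv=None):
    """Parse the single documented flag, --force."""
    parser = argparse.ArgumentParser(description="L1A-style ingestion sketch")
    parser.add_argument(
        "--force",
        action="store_true",
        help="re-process all files, overwriting existing rows",
    )
    return parser.parse_args(argv)
```

Calling `ingest(conn, "output_arena", force=parse_args().force)` on an open `sqlite3` connection would then mirror the two documented invocations: a plain run writes only unseen files, while `--force` rewrites every file found.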