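Annotation Spec for Storefront and Handover Moments in Delivery Videos

Goal

Provide training labels for the ONNX frame-classification model: each video is annotated with the best storefront moment and the best handover moment (one timestamp in seconds for each).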

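Field Reference

Field	Type	Required	Description
`media_id`	int	yes	Matches `collection_media.id`
`video_path`	string	yes	Local path or FTP/HTTP URL
`storefront_time_sec`	float	yes	Timestamp of the best storefront frame (seconds)
`handover_time_sec`	float	yes	Timestamp of the best handover frame (seconds)
`store_type`	string	no	One of 便利店 (convenience store) / 超市 (supermarket) / 餐饮 (restaurant) / 其他 (other)
`has_voice_marker`	bool	no	Whether the audio contains an "arrived at store / handover" voice cue
`recorder_sn`	string	no	Recorder device serial number
`driver_date`	string	no	Driver + date; used as the grouping key for the train/val/test split
`split`	string	yes	`train` / `val` / `test`
`notes`	string	no	Bad-case remarks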

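Class Boundaries

storefront: the shop sign, door plate, or shop entrance is the main subject; people may be in frame, but goods are not the main subject
handover: goods or packaging are at the center of the frame, with a visible handing-over, placing, or signing-for action
other (negative samples): driving, warehouse, walking inside the store, empty shots, etc. (sampled automatically during training from outside the ±5 s windows around the labeled moments)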
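The ±5 s exclusion rule for negatives can be illustrated with a small sketch. This is not the training pipeline's actual sampler: the function name `sample_negative_times`, the `duration_sec` argument, and the uniform sampling strategy are all assumptions made for illustration.

```python
import random

def sample_negative_times(duration_sec, storefront_time_sec, handover_time_sec,
                          n=10, window_sec=5.0, seed=0):
    """Draw n timestamps that lie outside the +/- window_sec windows
    around both labeled moments (hypothetical helper, not the real sampler)."""
    rng = random.Random(seed)
    # Exclusion windows, clipped to the video's extent and sorted by start time.
    excluded = sorted(
        (max(0.0, t - window_sec), min(duration_sec, t + window_sec))
        for t in (storefront_time_sec, handover_time_sec)
    )
    # Subtract the exclusion windows from [0, duration_sec].
    allowed, cursor = [], 0.0
    for lo, hi in excluded:
        if lo > cursor:
            allowed.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < duration_sec:
        allowed.append((cursor, duration_sec))
    if not allowed:
        return []  # the whole video falls inside the exclusion windows
    # Pick an interval proportionally to its length, then a point inside it.
    weights = [hi - lo for lo, hi in allowed]
    times = []
    for _ in range(n):
        lo, hi = rng.choices(allowed, weights=weights, k=1)[0]
        times.append(rng.uniform(lo, hi))
    return sorted(times)
```

For the example record in this document (storefront at 742.5 s, handover at 1085.2 s), every returned timestamp is at least 5 s away from both moments.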

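Annotation Procedure

Play the whole MP4, pause on the sharpest, best-composed storefront frame, and record the current timestamp in seconds
Keep playing, and record the timestamp of the frame where the goods handover is clearest
Constraint: handover_time_sec > storefront_time_sec (the two usually differ by tens of seconds or more)
If a video has no handover scene (arrival at the store only), write 「无交付」 ("no handover") in notes; that video is excluded from training for now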
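The ordering constraint and the 「无交付」 exclusion can be checked mechanically before training. The following is a minimal sketch; the function name `training_eligibility` and the reason strings are hypothetical, and it assumes each record is a dict using the field names from this document.

```python
def training_eligibility(record):
    """Return (ok, reason) for one annotation record (hypothetical helper)."""
    notes = record.get("notes") or ""
    # Videos flagged as having no handover scene are skipped for now.
    if "无交付" in notes:
        return False, "no handover scene"
    storefront = record.get("storefront_time_sec")
    handover = record.get("handover_time_sec")
    if storefront is None or handover is None:
        return False, "missing timestamp"
    if storefront < 0 or handover < 0:
        return False, "negative timestamp"
    # The handover moment must come strictly after the storefront moment.
    if handover <= storefront:
        return False, "handover not after storefront"
    return True, "ok"
```

Returning a reason alongside the flag makes it easier to list rejected records for re-annotation rather than dropping them silently.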

Export Format (JSONL)

One JSON object per line:

{"media_id": 123, "video_path": "http://host/collection_media/20250609/123.mp4", "storefront_time_sec": 742.5, "handover_time_sec": 1085.2, "store_type": "便利店", "has_voice_marker": true, "driver_date": "driver001_20250609", "split": "train"}
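A reader for this format might look like the sketch below. It checks only the fields the field table marks as required; the names `parse_jsonl`, `REQUIRED_FIELDS`, and `VALID_SPLITS` are hypothetical. How a 「无交付」 record should fill `handover_time_sec` is not specified in this document, so the sketch does not special-case it.

```python
import json

# Required fields and accepted Python types, following the field table.
REQUIRED_FIELDS = {
    "media_id": (int,),
    "video_path": (str,),
    "storefront_time_sec": (int, float),
    "handover_time_sec": (int, float),
    "split": (str,),
}
VALID_SPLITS = {"train", "val", "test"}

def parse_jsonl(lines):
    """Parse JSONL lines; return (records, errors) where errors is a list
    of (line_number, message) tuples. Blank lines are skipped."""
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(rec, dict):
            errors.append((lineno, "not a JSON object"))
            continue
        problems = []
        for name, types in REQUIRED_FIELDS.items():
            value = rec.get(name)
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, types):
                problems.append(f"bad or missing {name}")
        if rec.get("split") not in VALID_SPLITS:
            problems.append("split must be train, val, or test")
        if problems:
            errors.append((lineno, "; ".join(problems)))
        else:
            records.append(rec)
    return records, errors
```

Typical use is `records, errors = parse_jsonl(open(path, encoding="utf-8"))`, where `path` points at the exported file.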

Data Split

train / val / test = 70% / 15% / 15%
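Split by group, using driver_date or recorder_sn + date as the group key, so that videos from the same driver on the same day do not leak into the test set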
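One way to do a grouped split is sketched below. The function name `assign_splits`, the fixed seed, and the greedy fill order are assumptions; with few groups, the resulting proportions only approximate 70/15/15 because whole groups are never divided.

```python
import random
from collections import defaultdict

def assign_splits(records, key=lambda r: r["driver_date"],
                  ratios=(0.70, 0.15, 0.15), seed=42):
    """Assign a split to every record so that all records sharing a
    group key land in the same split (hypothetical helper)."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)
    # Sort first so the shuffle is reproducible regardless of input order.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    total = len(records)
    train_cut = ratios[0] * total
    val_cut = (ratios[0] + ratios[1]) * total
    assigned, out = 0, []
    for k in keys:
        # Choose the split from how many videos were placed before this group.
        if assigned < train_cut:
            split = "train"
        elif assigned < val_cut:
            split = "val"
        else:
            split = "test"
        out.extend({**rec, "split": split} for rec in groups[k])
        assigned += len(groups[k])
    return out
```

To group by recorder and date instead, pass a different `key`, for example one that concatenates `recorder_sn` with a date string taken from wherever the date is stored.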

Recommended Scale

Stage	Number of videos
POC	80~100
Internal testing	300+
Production launch	1000+

Label Studio

See label_studio_config.xml. Import the CSV generated by tools/export_media_list.py, annotate the two time points, export the result as JSON, then convert it to JSONL with tools/convert_labelstudio.py.