The approach by Constantin Patsch, Marsil Zakour, Yuankai Wu, and Eckehard Steinbach achieved 2nd place at the Egocentric Vision Workshop (EgoVis) @ CVPR 2025 on the mistake detection benchmark of the HoloAssist dataset, which features challenging egocentric assembly scenarios.
The model addresses the task of online mistake detection, where video is analyzed in real time so that human operators can correct errors as they occur. Once an error is detected, a large language model (LLM) is additionally used to generate explanatory feedback.
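To make the detect-then-explain pipeline concrete, below is a minimal sketch of such an online loop. It is not the authors' implementation: `MistakeDetector`, `explain_with_llm`, and the `run_online` driver are hypothetical placeholders standing in for the video model and the LLM call described above.

```python
# Minimal sketch of an online mistake-detection loop with LLM feedback.
# All names here are illustrative placeholders, not the released API.

from dataclasses import dataclass
from typing import Iterable


@dataclass
class StepResult:
    timestamp: float   # seconds into the egocentric video stream
    action: str        # predicted current action
    is_mistake: bool   # whether this step is flagged as erroneous


class MistakeDetector:
    """Stand-in for the online detector; a real model would score the most
    recent frames at every step instead of returning a fixed result."""

    def __call__(self, timestamp: float, clip) -> StepResult:
        return StepResult(timestamp, action="attach side panel", is_mistake=False)


def explain_with_llm(result: StepResult) -> str:
    """Stand-in for the LLM feedback call: turn a flagged step into a short
    corrective explanation for the operator."""
    return f"The action '{result.action}' looks incorrect; undo it and retry."


def run_online(stream: Iterable, fps: float = 30.0) -> None:
    """Process the stream clip by clip and emit feedback as soon as a mistake
    is detected, so the operator can correct it immediately."""
    detector = MistakeDetector()
    for i, clip in enumerate(stream):
        result = detector(timestamp=i / fps, clip=clip)
        if result.is_mistake:
            print(f"[{result.timestamp:.1f}s] {explain_with_llm(result)}")
```

The key design point the sketch captures is that detection and feedback run per step on the incoming stream, rather than after the full video has been observed.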