
For the ramen scene, I found that only the chopsticks and egg instances were included in the average when the final IoU was computed. After inspecting the evaluation script, I traced this to a bug where a mask filename did not match the corresponding ground-truth (GT) filename. After the fix, the metrics dropped noticeably. The same issue also affects the figurines and teatime scenes.
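
A minimal sketch of a check that would surface this kind of mismatch before the IoU is averaged. It assumes each mask file and its GT file are meant to share the same base name. The function `pair_by_stem` and the file names other than chopsticks and egg are hypothetical and are not taken from the actual evaluation script.

```python
from pathlib import Path


def pair_by_stem(mask_files, gt_files):
    """Pair mask and GT files whose names match once the extension is dropped.

    Returns the matched pairs plus the names left unmatched on each side,
    so that silently skipped instances become visible.
    """
    masks = {Path(f).stem: f for f in mask_files}
    gts = {Path(f).stem: f for f in gt_files}
    pairs = {stem: (masks[stem], gts[stem]) for stem in gts if stem in masks}
    unmatched_gt = sorted(set(gts) - set(masks))
    unmatched_masks = sorted(set(masks) - set(gts))
    return pairs, unmatched_gt, unmatched_masks


# Hypothetical file names, for illustration only.
mask_files = ["chopsticks.png", "egg.png", "object_a.png"]
gt_files = ["chopsticks.png", "egg.png", "object-a.png"]

pairs, unmatched_gt, unmatched_masks = pair_by_stem(mask_files, gt_files)
print(sorted(pairs))      # ['chopsticks', 'egg']
print(unmatched_gt)       # ['object-a']
print(unmatched_masks)    # ['object_a']
```

If either unmatched list is non-empty, the mean IoU covers only a subset of the instances, which would explain an average taken over just two of them.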