Feature request
We have a couple of tensors whose construction is highly dynamic and can't be captured while tracing (cu_seqlens, vision(image/video)_cu_seqlens, ... , vision(image/video)_position_ids, ...). I'm opening this issue to track what we are going to do about them. There are two options: either build them in the processor, since that's where we have most of the dynamism / per-sample processing loops, or unify the collator API (processor -> collator -> inference) as part of the default path.
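To make the second option concrete, here is a minimal sketch of a collator step that builds cu_seqlens eagerly, outside the traced forward pass. Everything in it is an assumption for illustration: `build_cu_seqlens` and `collate` are hypothetical names, not existing library APIs, and plain Python lists stand in for tensors so the sketch has no dependencies.

```python
from itertools import accumulate
from typing import Dict, List


def build_cu_seqlens(seq_lens: List[int]) -> List[int]:
    """Return cumulative sequence boundaries: [0, l0, l0 + l1, ...]."""
    return [0, *accumulate(seq_lens)]


def collate(samples: List[Dict[str, List[int]]]) -> Dict[str, object]:
    """Hypothetical collator: pack samples and precompute dynamic metadata.

    The per-sample loop runs here, in ordinary Python, so the model's
    forward pass only receives ready-made inputs and has nothing
    data-dependent left to trace.
    """
    seq_lens = [len(sample["input_ids"]) for sample in samples]
    # Concatenate all samples into one packed sequence.
    packed_ids = [tok for sample in samples for tok in sample["input_ids"]]
    return {
        "input_ids": packed_ids,
        "cu_seqlens": build_cu_seqlens(seq_lens),
        "max_seqlen": max(seq_lens, default=0),
    }


batch = collate([{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5]}])
print(batch["cu_seqlens"])
```

The same pattern would presumably apply to the vision cu_seqlens and position ids: compute them wherever the per-sample loop already lives, then pass them to the model as plain inputs.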
Motivation
see #45396
Your contribution
I can implement it :)