
[feat] Native-Resolution Image Synthesis #113

Closed · pufanyi wants to merge 33 commits into main from pufanyi/nit

Conversation

@pufanyi (Collaborator) commented Dec 10, 2025

Currently supports c2i (class-to-image) generation.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review,
  • Mark a draft as ready, or
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +186 to +190
label = torch.tensor(batch_label)
hw_list = torch.tensor(hw_list, dtype=torch.int32)

# # Move tensors to model device for FSDP2 compatibility
# device = next(self.model.parameters()).device

P1: Move NiT training metadata to the model device

During NitTrainer.compute_loss, the label and hw_list tensors remain on CPU (plain torch.tensor(...) calls with no .to(device), since the move block below was commented out), while the latents/noise and the model live on CUDA. When FlowMatchingLoss calls the NiT model, the label embeddings and the rotary cache consume these CPU tensors, which will raise a device mismatch as soon as training runs on a GPU. Please move label and hw_list to the model device before building model_kwargs.
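A minimal sketch of the suggested fix, reusing the commented-out device lookup from the diff above. The helper name and the model_kwargs keys here are assumptions for illustration, not code from this PR:

```python
import torch


def build_nit_kwargs(model: torch.nn.Module, batch_label, hw_list):
    # Resolve the compute device from the model itself; per the commented-out
    # lines in the diff, this lookup is also what the FSDP2 path expects.
    device = next(model.parameters()).device

    # Build both tensors directly on that device instead of on CPU.
    label = torch.tensor(batch_label, device=device)
    hw = torch.tensor(hw_list, dtype=torch.int32, device=device)

    # Hypothetical kwarg names -- the real NitModel signature is not shown
    # in this excerpt.
    return {"y": label, "hw_list": hw}
```

Constructing the tensors on the target device up front also skips the intermediate CPU tensor that a later .to(device) call would copy from.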


@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Comment on lines +186 to +188
label = torch.tensor(batch_label)
hw_list = torch.tensor(hw_list, dtype=torch.int32)


P1: Move labels and shapes to model device before forward

The new NiT trainer builds label and hw_list tensors on CPU but then passes them directly into self.loss_fn/self.model while images and latents are on CUDA. NitModel embeds y and computes RoPE grids inside the forward pass, so CPU indices against CUDA buffers will raise a device mismatch as soon as training starts. The commented-out .to(device) calls just below suggest these tensors were intended to be moved. Please transfer label and hw_list to the model device before invoking the model to avoid runtime crashes when running on GPUs.
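For illustration, a tiny standalone repro of the failure mode described here. nn.Embedding is only a stand-in for NitModel's y embedder, whose real code is not part of this excerpt:

```python
import torch
import torch.nn as nn

# A GPU-resident embedding table indexed with labels left on the CPU fails
# exactly the way the review describes.
if torch.cuda.is_available():
    embed = nn.Embedding(1000, 64).cuda()   # stand-in for the label embedder
    label = torch.tensor([3, 17, 42])       # built on CPU, as in the diff

    try:
        embed(label)                        # CPU indices vs. CUDA weights
    except RuntimeError as err:
        print(err)  # e.g. "Expected all tensors to be on the same device ..."

    # The fix: move the indices to the weights' device before the forward pass.
    out = embed(label.to(embed.weight.device))
    print(out.shape)                        # torch.Size([3, 64])
```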


Collaborator

Wrong config edit?

Collaborator Author

😱😱😱

@kcz358 (Collaborator) commented Dec 17, 2025

Are all of these NVIDIA RADIO files actually necessary? It would be nice to remove the parts that the NiT modeling code isn't actually using.

Collaborator Author

OK, I will try it.

… calculate token dimensions and return class ID.
…tDataProcessor` to accept a dictionary row.
…ating type hints in NitDataset and NitProcessor.
…, and modify example configuration for dataset format and output directory.
…itTrainer for enhanced model configuration and loss computation
@Luodian closed this Feb 23, 2026