feat: unified Transformers4Rec with CLM/MLM/PLM/RTD training objective#707

Open
hieuddo wants to merge 3 commits into PreferredAI:master from hieuddo:seq

@hieuddo hieuddo commented Jul 5, 2026

Member

Description

Previously, we ported several sequential models from https://github.com/PreferredAI/CoVE. In our CoVE experiments, we found that one specific setting leads to better performance: the CLM training objective combined with sequence breakdown, where [a,b,c,d] is broken down into [a]->b, [a,b]->c, [a,b,c]->d, with the loss calculated only at the last position (e.g., for [a,b,c]->d, only on the prediction made from the input [a,b,c]).
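For illustration, the sequence breakdown described above can be sketched in a few lines of plain Python. The function name `break_down` is hypothetical and not part of this PR; it only shows how one sequence expands into (prefix, next item) examples.

```python
def break_down(sequence):
    """Expand one sequence into (prefix, next_item) training examples.

    Hypothetical helper for illustration; not the code in this PR.
    """
    # Prefix of length i predicts the item at index i.
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


examples = break_down(["a", "b", "c", "d"])
for prefix, target in examples:
    print(prefix, "->", target)
```

Each resulting example carries exactly one target, which is why the loss only needs to be computed at the last position of the prefix.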

For general use, I think we should also offer the other training objectives as options. All four (CLM, MLM, PLM, and RTD) are derived from https://github.com/NVIDIA-Merlin/Transformers4Rec.
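As a rough guide to how the four objectives differ, here is a simplified sketch of the training signal each one builds from a single item sequence. All names (`MASK`, `clm_pairs`, `mlm_mask`, `plm_order`, `rtd_corrupt`) are hypothetical, and the random replacement used for RTD is a simplification I am assuming for brevity (ELECTRA-style RTD normally draws replacements from a generator model). This is not the implementation in this PR or in Transformers4Rec.

```python
import random

MASK = "[MASK]"  # hypothetical mask token


def clm_pairs(seq):
    """CLM: at each position t, predict item t+1 from items up to t."""
    return seq[:-1], seq[1:]


def mlm_mask(seq, prob=0.2, rng=None):
    """MLM: hide some items; labels exist only at masked positions."""
    rng = random.Random(0) if rng is None else rng
    inputs, labels = [], []
    for item in seq:
        if rng.random() < prob:
            inputs.append(MASK)
            labels.append(item)
        else:
            inputs.append(item)
            labels.append(None)
    return inputs, labels


def plm_order(seq, rng=None):
    """PLM: sample a factorization order over positions.

    Position order[k] is predicted from the positions in order[:k].
    """
    rng = random.Random(0) if rng is None else rng
    order = list(range(len(seq)))
    rng.shuffle(order)
    return order


def rtd_corrupt(seq, catalog, prob=0.2, rng=None):
    """RTD: replace some items; binary labels flag the replaced positions.

    Assumes the catalog holds at least two distinct items.
    """
    rng = random.Random(0) if rng is None else rng
    corrupted, is_replaced = [], []
    for item in seq:
        if rng.random() < prob:
            # Simplification: sample uniformly from the other catalog items.
            corrupted.append(rng.choice([c for c in catalog if c != item]))
            is_replaced.append(1)
        else:
            corrupted.append(item)
            is_replaced.append(0)
    return corrupted, is_replaced
```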

So in this PR, we aggregate all Transformer-based models into one unified class that supports all four training objectives. We still keep the setting mentioned above that we believe works best (CLM + sequence breakdown + loss at the last position).
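To make the difference between "loss at the last position" and loss at every position concrete, here is a minimal sketch using only the standard library. The names `cross_entropy` and `clm_loss` and the `last_only` flag are hypothetical, chosen for illustration; the actual option names in this PR may differ.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target index (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def clm_loss(step_logits, targets, last_only):
    """CLM loss over one sequence.

    step_logits[t] holds the item scores predicted at position t,
    targets[t] is the index of the true next item at that position.
    """
    losses = [cross_entropy(l, t) for l, t in zip(step_logits, targets)]
    if last_only:
        # Only the final prediction contributes to the loss.
        return losses[-1]
    # Otherwise average over every position.
    return sum(losses) / len(losses)
```

With `last_only=True`, a sequence contributes a single prediction per training example, which is why it is paired with the sequence breakdown so that every prefix still gets its own example.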

Checklist:

  • I have added tests.
  • I have updated the documentation accordingly.
  • I have updated README.md (if you are adding a new model).
  • I have updated examples/README.md (if you are adding a new example).

hieuddo commented Jul 5, 2026

Member Author

Quick tuning on two datasets; rows are sorted by val_MRR in descending order.

Dataset: diginetica

| combo | val_AUC | val_MRR | val_NDCG@10 | val_Recall@10 | test_AUC | test_MRR | test_NDCG@10 | test_Recall@10 | train_s |
|---|---|---|---|---|---|---|---|---|---|
| gpt2-clm-all | 0.7892 | 0.3458 | 0.3775 | 0.4855 | 0.8362 | 0.3163 | 0.3502 | 0.4649 | 32.2 |
| bert-clm-last | 0.7605 | 0.3442 | 0.3719 | 0.463 | 0.7612 | 0.3235 | 0.3517 | 0.4482 | 113.9 |
| gpt2-clm-last | 0.8055 | 0.3341 | 0.3637 | 0.4662 | 0.8483 | 0.3324 | 0.3638 | 0.4749 | 105.3 |
| xlnet-plm | 0.8093 | 0.3157 | 0.3548 | 0.4855 | 0.8185 | 0.2713 | 0.3086 | 0.4415 | 84.0 |
| xlnet-mlm | 0.8064 | 0.3085 | 0.3431 | 0.463 | 0.8498 | 0.2601 | 0.3001 | 0.4482 | 51.0 |
| bert-mlm | 0.8158 | 0.232 | 0.2618 | 0.3762 | 0.8329 | 0.214 | 0.2453 | 0.3679 | 31.8 |
| electra-mlm | 0.8203 | 0.2129 | 0.2369 | 0.3408 | 0.8287 | 0.2092 | 0.2348 | 0.3411 | 34.9 |
| bert-rtd | 0.8279 | 0.2053 | 0.2233 | 0.3055 | 0.8283 | 0.1926 | 0.2234 | 0.3445 | 69.4 |
| electra-rtd | 0.8293 | 0.1907 | 0.2116 | 0.3087 | 0.8346 | 0.1815 | 0.2034 | 0.301 | 63.7 |
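
As a reading aid for the ranking columns, here is a minimal sketch of the per-sequence MRR, Recall@k, and NDCG@k values, assuming a single held-out target item per sequence (an assumption on my side; the evaluation code used for these numbers may differ in details such as tie handling). The function names are hypothetical. The reported values would then be averages over all evaluated sequences.

```python
import math


def rank_of_target(scores, target):
    """1-based rank of the target when items are sorted by descending score."""
    better = sum(1 for item, s in scores.items()
                 if item != target and s > scores[target])
    return 1 + better


def metrics_at_k(rank, k=10):
    """Per-sequence metrics for a single relevant item at the given rank."""
    hit = rank <= k
    return {
        "MRR": 1.0 / rank,
        f"Recall@{k}": 1.0 if hit else 0.0,
        # With one relevant item the ideal DCG is 1, so NDCG equals DCG.
        f"NDCG@{k}": 1.0 / math.log2(rank + 1) if hit else 0.0,
    }
```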

Dataset: ml100k

For the ml-100k dataset, we load the data in USIT format using UUIT data (the rating data doesn't have a session_id).
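
One way to read "UUIT" is that the user ID is reused as the session ID, so each user's full rating history forms one session. Under that assumption, the conversion from (user, item, rating, timestamp) rows could look like the sketch below. The function name `uirt_to_usit` and the sample rows are made up for illustration and are not taken from this PR.

```python
def uirt_to_usit(rows):
    """Turn (user, item, rating, timestamp) rows into
    (user, session, item, timestamp) rows with session = user.

    Assumption: the user ID doubles as the session ID.
    """
    usit = [(user, user, item, ts) for user, item, _rating, ts in rows]
    # Order each user's interactions chronologically.
    usit.sort(key=lambda row: (row[0], row[3]))
    return usit


# Made-up rows in (user, item, rating, timestamp) layout.
toy_rows = [
    ("u1", "i3", 4.0, 30),
    ("u2", "i1", 5.0, 10),
    ("u1", "i2", 3.0, 20),
]
print(uirt_to_usit(toy_rows))
```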

| combo | val_AUC | val_MRR | val_NDCG@10 | val_Recall@10 | test_AUC | test_MRR | test_NDCG@10 | test_Recall@10 | train_s |
|---|---|---|---|---|---|---|---|---|---|
| gpt2-clm-last | 0.882 | 0.0673 | 0.0737 | 0.1474 | 0.8687 | 0.0618 | 0.0668 | 0.1347 | 1471.0 |
| bert-clm-last | 0.8871 | 0.0597 | 0.063 | 0.1273 | 0.8698 | 0.0514 | 0.0516 | 0.1029 | 1674.0 |
| gpt2-clm-all | 0.8355 | 0.0472 | 0.0488 | 0.1007 | 0.8128 | 0.0357 | 0.0327 | 0.071 | 50.5 |
| xlnet-mlm | 0.8625 | 0.0471 | 0.0494 | 0.1029 | 0.8344 | 0.0351 | 0.0344 | 0.0795 | 47.7 |
| xlnet-plm | 0.8687 | 0.0444 | 0.0423 | 0.0838 | 0.8393 | 0.0335 | 0.0307 | 0.0679 | 75.2 |
| electra-rtd | 0.8547 | 0.0387 | 0.036 | 0.071 | 0.8249 | 0.0269 | 0.0222 | 0.0509 | 53.4 |
| bert-rtd | 0.8517 | 0.0361 | 0.0365 | 0.0774 | 0.8353 | 0.0299 | 0.0279 | 0.0626 | 53.8 |
| bert-mlm | 0.8504 | 0.0338 | 0.0321 | 0.0668 | 0.8308 | 0.0303 | 0.0268 | 0.0541 | 32.4 |
| electra-mlm | 0.8575 | 0.0331 | 0.0321 | 0.0721 | 0.8366 | 0.0285 | 0.0244 | 0.053 | 35.9 |

@hieuddo hieuddo requested review from lthoang and qtuantruong July 5, 2026 15:06
