diff --git a/README.md b/README.md index e271822b..8f39a17b 100644 --- a/README.md +++ b/README.md @@ -74,6 +74,7 @@ Click-through rate (CTR) prediction is a critical task for various industrial ap | 38 | SIGIR'23 | [EulerNet](./model_zoo/EulerNet) | [EulerNet: Adaptive Feature Interaction Learning via Euler's Formula for CTR Prediction](https://dl.acm.org/doi/10.1145/3539618.3591681) :triangular_flag_on_post:**Huawei** | [:arrow_upper_right:](https://github.com/Ethan-TZ/EulerNet/tree/main/%23Code4FuxiCTR%23) | `torch` | | 39 | CIKM'23 | [GDCN](./model_zoo/GDCN) | [Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction](https://dl.acm.org/doi/pdf/10.1145/3583780.3615089) :triangular_flag_on_post:**Microsoft** | | `torch` | | 40 | ICML'24 | [WuKong](./model_zoo/WuKong) | [Wukong: Towards a Scaling Law for Large-Scale Recommendation](https://arxiv.org/abs/2403.02545) :triangular_flag_on_post:**Meta** | [:arrow_upper_right:](https://github.com/reczoo/BARS/tree/main/ranking/ctr/WuKong) | `torch` | +| 41 | KDD'25 | [QNN-α](./model_zoo/QNN) | [Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction](https://arxiv.org/abs/2505.17999) :triangular_flag_on_post:**Huawei** | [:arrow_upper_right:](https://github.com/salmon1802/QNN/tree/main/checkpoints) | `torch` | |:open_file_folder: **Behavior Sequence Modeling**| | 42 | KDD'18 | [DIN](./model_zoo/DIN) | [Deep Interest Network for Click-Through Rate Prediction](https://www.kdd.org/kdd2018/accepted-papers/view/deep-interest-network-for-click-through-rate-prediction) :triangular_flag_on_post:**Alibaba** | [:arrow_upper_right:](https://github.com/reczoo/BARS/tree/main/ranking/ctr/DIN) | `torch` | | 43 | AAAI'19 | [DIEN](./model_zoo/DIEN) | [Deep Interest Evolution Network for Click-Through Rate Prediction](https://arxiv.org/abs/1809.03672) :triangular_flag_on_post:**Alibaba** | 
[:arrow_upper_right:](https://github.com/reczoo/BARS/tree/main/ranking/ctr/DIEN) | `torch` | diff --git a/model_zoo/QNN/README.md b/model_zoo/QNN/README.md new file mode 100644 index 00000000..e6d26690 --- /dev/null +++ b/model_zoo/QNN/README.md @@ -0,0 +1,81 @@ +# Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction + +## Model Overview + +Click-through rate (CTR) prediction aims to accurately estimate user click behavior by leveraging multiple features, including user profiles, item attributes, and contextual information. +Existing CTR prediction models typically employ the Hadamard product for feature interaction, but the underlying mechanism remains insufficiently explored. +From the perspective of quadratic neural networks (QNN), we conduct both theoretical and empirical analyses of the Hadamard product-based feature interaction mechanism, and provide a novel interpretive framework to systematically explain its effectiveness and limitations. +Furthermore, we introduce multi-head Khatri–Rao products as an efficient alternative to the Hadamard product and propose a Self-Ensemble Loss, which further improves model performance without increasing inference latency. + +
+QNN model +
+ + + +## Requirements + +We have tested QNN with the following requirements. + +```python +python: 3.8 +pytorch: 1.10 +fuxictr: 2.0.1 +``` + +## Configuration Guide + + +The `dataset_config.yaml` file contains all the dataset settings as follows. + +| Params | Type | Default | Description | +| ----------------------------- | ---- | ------- | --------------------------------------------------------------------------------------------------------------------------------------- | +| data_root | str | | the root directory to load and save data | +| data_format | str | | input data format, "h5", "csv", or "tfrecord" supported | +| train_data | str | None | training data path | +| valid_data | str | None | validation data path | +| test_data | str | None | test data path | +| min_categr_count | int | 1 | min count to filter category features | +| feature_cols | list | | a list of features with the following dict keys | +| feature_cols::name | str\|list | | feature column name in csv. A list is allowed in which the features have the same feature type and will be expanded accordingly. | +| feature_cols::active | bool | | whether to use the feature | +| feature_cols::dtype | str | | the input data dtype, "int"\|"str" | +| feature_cols::type | str | | feature type "numeric"\|"categorical"\|"sequence"\|"meta" | +| label_col | dict | | specify label column | +| label_col::name | str | | label column name in csv | +| label_col::dtype | str | | label data dtype | + + + +The `model_config.yaml` file contains all the model hyper-parameters as follows. 
+ +| Params | Type | Default | Description | +| ----------------------- | --------------- | ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| model | str | "QNN_alpha" | model name, which should be the same as the model class name | +| dataset_id | str | "TBD" | dataset_id to be determined | +| loss | str | "binary_crossentropy" | loss function | +| metrics | list | ['logloss', 'AUC'] | a list of metrics for evaluation | +| task | str | "binary_classification" | task type supported: ```"regression"```, ```"binary_classification"``` | +| optimizer | str | "adam" | optimizer used for training | +| learning_rate | float | 1.0e-3 | learning rate | +| embedding_regularizer | float\|str | 0 | regularization weight for embedding matrix: L2 regularization is applied by default. Other optional examples: ```"l2(1.e-3)"```, ```"l1(1.e-3)"```, ```"l1_l2(1.e-3, 1.e-3)"```. | +| net_regularizer | float\|str | 0 | regularization weight for network parameters: L2 regularization is applied by default. Other optional examples: ```"l2(1.e-3)"```, ```"l1(1.e-3)"```, ```"l1_l2(1.e-3, 1.e-3)"```. | +| batch_size | int | 10000 | batch size, usually a large number for CTR prediction task | +| embedding_dim | int | 32 | embedding dimension of features. Note that field-wise embedding_dim can be specified in ```feature_specs```. | +| num_layers | int | 3 | number of network layers | +| num_row | int | 3 | number of rows of the Khatri-Rao product, hyperparameter M in the paper | +| net_dropout | float | 0 | dropout rate in QNN | +| batch_norm | bool | False | whether to use batch normalization (BN) in QNN | +| num_heads | int | 1 | number of heads used for the Khatri-Rao product | +| epochs | int | 100 | the max number of epochs for training, which can early stop via monitor metrics. 
| +| shuffle | bool | True | whether to shuffle the data samples for each epoch of training | +| seed | int | 2021 | the random seed used for reproducibility | +| monitor | str\|dict | 'AUC' | the monitor metrics for early stopping. It supports a single metric, e.g., ```"AUC"```. It also supports multiple metrics using a dict, e.g., {"AUC": 2, "logloss": -1} means ```2*AUC - logloss```. | +| monitor_mode | str | 'max' | ```"max"``` means that the higher the better, while ```"min"``` denotes that the lower the better. | +| model_root | str | './checkpoints/' | the dir to save model checkpoints and running logs | +| early_stop_patience | int | 2 | training is stopped when the monitor metric fails to improve for ```early_stop_patience=2``` consecutive evaluation intervals. | +| save_best_only | bool | True | whether to save the best model checkpoint only | +| eval_steps | int\|None | None | evaluate the model on validation data every ```eval_steps```. By default, ```None``` means evaluation every epoch. 
| + + +#### For reproducing the results, please refer to https://github.com/salmon1802/QNN/tree/main/checkpoints diff --git a/model_zoo/QNN/config/dataset_config.yaml b/model_zoo/QNN/config/dataset_config.yaml new file mode 100644 index 00000000..a1dbabfe --- /dev/null +++ b/model_zoo/QNN/config/dataset_config.yaml @@ -0,0 +1,7 @@ +### Tiny data for tests only +tiny_npz: + data_root: ../../data/ + data_format: npz + train_data: ../../data/tiny_npz/train.npz + valid_data: ../../data/tiny_npz/valid.npz + test_data: ../../data/tiny_npz/test.npz \ No newline at end of file diff --git a/model_zoo/QNN/config/model_config.yaml b/model_zoo/QNN/config/model_config.yaml new file mode 100644 index 00000000..4cec732f --- /dev/null +++ b/model_zoo/QNN/config/model_config.yaml @@ -0,0 +1,36 @@ +Base: + model_root: './checkpoints/' + num_workers: 8 + verbose: 1 + early_stop_patience: 2 + pickle_feature_encoder: True + save_best_only: True + eval_steps: null + debug_mode: False + group_id: null + use_features: null + feature_specs: null + feature_config: null + +QNN_alpha_default: # This is a config template + model: QNN_alpha + dataset_id: TBD + loss: 'binary_crossentropy' + metrics: ['logloss', 'AUC'] + task: binary_classification + optimizer: adam + learning_rate: 1.e-3 + embedding_regularizer: 1.e-5 + net_regularizer: 0 + batch_size: 10000 + embedding_dim: 16 + num_layers: 3 + num_row: 3 + net_dropout: 0.1 + num_heads: 1 + batch_norm: True + epochs: 100 + shuffle: True + seed: 2025 + monitor: {'AUC': 1, 'logloss': 0} + monitor_mode: 'max' \ No newline at end of file diff --git a/model_zoo/QNN/fuxictr_version.py b/model_zoo/QNN/fuxictr_version.py new file mode 100644 index 00000000..2b5cb266 --- /dev/null +++ b/model_zoo/QNN/fuxictr_version.py @@ -0,0 +1,3 @@ +# pip install -U fuxictr +import fuxictr +assert fuxictr.__version__ == "2.0.1" diff --git a/model_zoo/QNN/run_expid.py b/model_zoo/QNN/run_expid.py new file mode 100644 index 00000000..669c2804 --- /dev/null +++ 
b/model_zoo/QNN/run_expid.py @@ -0,0 +1,96 @@ +# ========================================================================= +# Copyright (C) 2022. Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ========================================================================= + +import sys +sys.path.append("/home/lhh/code") +import os +os.chdir(os.path.dirname(os.path.realpath(__file__))) +import sys +import logging +from datetime import datetime +from fuxictr.utils import load_config, set_logger, print_to_json, print_to_list, delete_model_files +from fuxictr.features import FeatureMap +from fuxictr.pytorch.torch_utils import seed_everything +from fuxictr.pytorch.dataloaders import H5DataLoader +from fuxictr.preprocess import FeatureProcessor, build_dataset +from fuxictr.datasets.criteo import FeatureProcessor +import src as model_zoo +import gc +import argparse +import os +from pathlib import Path +import os, torch +os.environ['CUDA_LAUNCH_BLOCKING'] = "1" + +if __name__ == '__main__': + ''' Usage: python run_expid.py --config {config_dir} --expid {experiment_id} --gpu {gpu_device_id} + ''' + parser = argparse.ArgumentParser() + parser.add_argument('--config', type=str, default='./config/', help='The config directory.') + parser.add_argument('--expid', type=str, default='QNN_T26_Tenrec', help='The experiment id to run.') + parser.add_argument('--gpu', type=int, default=0, help='The gpu index, -1 for cpu') + args = 
vars(parser.parse_args()) + + experiment_id = args['expid'] + params = load_config(args['config'], experiment_id) + params['gpu'] = args['gpu'] + set_logger(params) + logging.info("Params: " + print_to_json(params)) + seed_everything(seed=params['seed']) + + data_dir = os.path.join(params['data_root'], params['dataset_id']) + feature_map_json = os.path.join(data_dir, "feature_map.json") + if params["data_format"] == "csv": + # Build feature_map and transform h5 data + feature_encoder = FeatureProcessor(**params) + params["train_data"], params["valid_data"], params["test_data"] = \ + build_dataset(feature_encoder, **params) + feature_map = FeatureMap(params['dataset_id'], data_dir) + feature_map.load(feature_map_json, params) + logging.info("Feature specs: " + print_to_json(feature_map.features)) + + model_class = getattr(model_zoo, params['model']) + model = model_class(feature_map, **params) + model.count_parameters() # print number of parameters used in model + + train_gen, valid_gen = H5DataLoader(feature_map, stage='train', **params).make_iterator() + model.fit(train_gen, validation_data=valid_gen, **params) + + logging.info('****** Validation evaluation ******') + valid_result = model.evaluate(valid_gen) + del train_gen, valid_gen + gc.collect() + + logging.info('******** Test evaluation ********') + test_gen = H5DataLoader(feature_map, stage='test', **params).make_iterator() + test_result = {} + model.testing = True + if test_gen: + test_result = model.evaluate(test_gen) + # model.device = torch.device("cpu") + # model.model_to_device() + # avg_inference_time_per_sample = model.latency(test_gen) + + result_filename = Path(args['config']).name.replace(".yaml", "") + '.csv' + with open(result_filename, 'a+') as fw: + fw.write(' {},[command] python {},[exp_id] {},[dataset_id] {},[train] {},[val] {},[test] {}\n' \ + .format(datetime.now().strftime('%Y%m%d-%H%M%S'), + ' '.join(sys.argv), experiment_id, params['dataset_id'], + "N.A.", print_to_list(valid_result), 
print_to_list(test_result))) + + model_dir = os.path.join(params["model_root"], feature_map.dataset_id) + delete_model_files(model_dir, params["model_id"]) + diff --git a/model_zoo/QNN/src/QNN.py b/model_zoo/QNN/src/QNN.py new file mode 100644 index 00000000..435544e2 --- /dev/null +++ b/model_zoo/QNN/src/QNN.py @@ -0,0 +1,251 @@ +# ========================================================================= +# Copyright (C) 2025. salmon1802@github. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ========================================================================= + +import torch +from torch import nn +from fuxictr.pytorch.models import BaseModel +from fuxictr.pytorch.layers import FeatureEmbedding +from fuxictr.pytorch.torch_utils import get_activation +import torch.nn.functional as F + + +class QNN(BaseModel): + def __init__(self, + feature_map, + model_id="QNN", + gpu=-1, + learning_rate=1e-3, + embedding_dim=16, + num_cross_layers=3, + net_dropout=0, + batch_norm=False, + hidden_activations='ReLU', + neuron_type='T1', + embedding_regularizer=None, + net_regularizer=None, + **kwargs): + super(QNN, self).__init__(feature_map, + model_id=model_id, + gpu=gpu, + embedding_regularizer=embedding_regularizer, + net_regularizer=net_regularizer, + **kwargs) + self.embedding_layer = FeatureEmbedding(feature_map, embedding_dim) + input_dim = feature_map.sum_emb_out_dim() + self.qnn = QuadraticNeuralNetworks(input_dim=input_dim, + num_cross_layers=num_cross_layers, + net_dropout=net_dropout, + hidden_activations=hidden_activations, + neuron_type=neuron_type, + batch_norm=batch_norm) + self.neuron_type = neuron_type + self.compile(kwargs["optimizer"], kwargs["loss"], learning_rate) + self.reset_parameters() + self.model_to_device() + + def forward(self, inputs): + X = self.get_inputs(inputs) + feature_emb = self.embedding_layer(X, dynamic_emb_dim=True) + y_pred = self.qnn(feature_emb) + y_pred = self.output_activation(y_pred) + return_dict = {"y_pred": y_pred} + return return_dict + +class QuadraticNeuralNetworks(nn.Module): + def __init__(self, + input_dim, + num_cross_layers=3, + net_dropout=0.1, + hidden_activations='relu', + neuron_type='T1', + batch_norm=False): + super(QuadraticNeuralNetworks, self).__init__() + self.num_cross_layers = num_cross_layers + self.dropout = nn.ModuleList() + self.norm = nn.ModuleList() + self.layer = nn.ModuleList() + self.activation = nn.ModuleList() + self.neuron_type = neuron_type + if neuron_type in ["T1", "T2", "T6"]: + 
self.compressed = nn.Linear(input_dim, 100, bias=False) + self.fc = nn.Linear(100, 1) + else: + self.fc = nn.Linear(input_dim, 1) + for i in range(num_cross_layers): + self.layer.append(QuadraticLayer(input_dim, neuron_type=neuron_type)) + if net_dropout > 0: + self.dropout.append(nn.Dropout(net_dropout)) + if batch_norm: + self.norm.append(nn.BatchNorm1d(input_dim)) + self.activation.append(get_activation(hidden_activations)) + + def forward(self, x): + if self.neuron_type in ["T1", "T2", "T6"]: + x = self.compressed(x) + for i in range(self.num_cross_layers): + x = self.layer[i](x) + if len(self.norm) > i: + x = self.norm[i](x) + if self.activation[i] is not None: + x = self.activation[i](x) + if len(self.dropout) > i: + x = self.dropout[i](x) + logit = self.fc(x) + return logit + +class QuadraticLayer(nn.Module): + def __init__(self, input_dim, neuron_type="T1"): + super(QuadraticLayer, self).__init__() + self.neuron_type = neuron_type + self.input_dim = input_dim + if neuron_type in ["T1", "T6"]: + self.bi_linear = nn.Bilinear(100, 100, 100) + self.linear = nn.Linear(100, 100) + elif neuron_type == "T2": + self.bi_linear = nn.Bilinear(100, 100, 100) + elif neuron_type in ["T3", "T4", "T9", "T11", "T14", "T19", "T20", "T21"]: + self.linear = nn.Linear(input_dim, input_dim) + elif neuron_type in ["T5", "T10", "T16", "T17", "T23", "T24"]: + self.linear = nn.Linear(input_dim, input_dim * 2) + elif neuron_type == "T7": + self.linear1 = nn.Linear(input_dim, input_dim * 2) + self.linear2 = nn.Linear(input_dim, input_dim) + elif neuron_type == "T8": + self.linear = nn.Linear(input_dim, input_dim * 3) + elif neuron_type == "T12": + self.linear1 = nn.Linear(input_dim, input_dim) + self.linear2 = nn.Linear(input_dim, input_dim // 2) + elif neuron_type == "T13": + self.linear = nn.Linear(input_dim, input_dim // 2 * 3) + elif neuron_type == "T15": + self.linear = nn.Linear(input_dim, input_dim // 2) + elif neuron_type == "T18": + self.linear = nn.Linear(input_dim, 
input_dim) + self.downsize = nn.Linear(input_dim * 2, input_dim) + elif neuron_type == "T22": + self.linear = nn.Linear(input_dim, input_dim) + self.alpha = nn.Parameter(torch.ones(input_dim)) # learnable weights + elif neuron_type == "T25": + self.linear = nn.Sequential(nn.Linear(input_dim, input_dim // 2), + nn.ReLU(), + nn.Linear(input_dim // 2, input_dim)) + elif neuron_type in ["T26", "T27", "T28", "T29", "T30"]: + self.linear = nn.Linear(input_dim, input_dim * 2) + else: + raise ValueError("there is no such neuron_type type!") + + def forward(self, x): + if self.neuron_type == "T1": + x = self.bi_linear(x, x) + self.linear(x) + elif self.neuron_type == "T2": + x = self.bi_linear(x, x) + elif self.neuron_type == "T3": + x = self.linear(x * x) + elif self.neuron_type == "T4": + x = self.linear(x) + x = x * x + elif self.neuron_type == "T5": + x = self.linear(x) + x1, x2 = torch.chunk(x, chunks=2, dim=-1) + x = x1 * x2 + elif self.neuron_type == "T6": + x = self.bi_linear(x, x) + self.linear(x * x) + elif self.neuron_type == "T7": + h = self.linear1(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = h1 * h2 + self.linear2(x * x) + elif self.neuron_type == "T8": + x = self.linear(x) + x1, x2, x3 = torch.chunk(x, chunks=3, dim=-1) + x = x1 * x2 + x3 + elif self.neuron_type == "T9": + x = x * self.linear(x) + x + elif self.neuron_type == "T10": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = x * h1 + h2 + elif self.neuron_type == "T11": + x = self.linear(x) + x1, x2 = torch.chunk(x, chunks=2, dim=-1) + x = torch.cat([x1, x2 * x2], dim=-1) + elif self.neuron_type == "T12": + h = self.linear1(x) + x = self.linear2(x * x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.cat([h1 * h2, x], dim=-1) + elif self.neuron_type == "T13": + x = self.linear(x) + x1, x2, x3 = torch.chunk(x, chunks=3, dim=-1) + x = torch.cat([x1 * x2, x3], dim=-1) + elif self.neuron_type == "T14": + x = self.linear(x) + x1, x2 = torch.chunk(x, chunks=2, dim=-1) + x = torch.cat([x1 * x2, 
x2], dim=-1) + elif self.neuron_type == "T15": + x = self.linear(x) + x = torch.cat([x * x, x], dim=-1) + # T9 variants + elif self.neuron_type == "T16": # bilinear + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = x * (h1 * h2) + x + elif self.neuron_type == "T17": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = x * h1 + h2 + x + elif self.neuron_type == "T18": + x = torch.cat([x * self.linear(x), x], dim=-1) + x = self.downsize(x) + elif self.neuron_type == "T19": + x = F.relu(self.linear(x)) * x + x + elif self.neuron_type == "T20": + h = self.linear(x) + x = x * h + h + x + elif self.neuron_type == "T21": + x = (x * self.linear(x)) ** 2 + x + elif self.neuron_type == "T22": + x = x * self.linear(x) + self.alpha * x + elif self.neuron_type == "T23": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = h1 * h2 + x + elif self.neuron_type == "T24": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = h1 * F.relu(h2) + x + elif self.neuron_type == "T25": + x = x * self.linear(x) + x + elif self.neuron_type == "T26": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.stack([F.relu(h1) * x, F.relu(h2) * x], dim=1).mean(dim=1) + x + elif self.neuron_type == "T27": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.stack([F.relu(h1) * (x**2), F.relu(h2) * (x**2)], dim=1).mean(dim=1) + x + elif self.neuron_type == "T28": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.stack([(F.relu(h1)**2) * x, (F.relu(h2)**2) * x], dim=1).mean(dim=1) + x + elif self.neuron_type == "T29": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.stack([(F.relu(h1)) * x, (F.relu(h2)**2) * x], dim=1).mean(dim=1) + x + elif self.neuron_type == "T30": + h = self.linear(x) + h1, h2 = torch.chunk(h, chunks=2, dim=-1) + x = torch.stack([F.relu(h1) * x, F.relu(h2) * (x**2)], dim=1).mean(dim=1) + x + return x diff --git 
a/model_zoo/QNN/src/QNN_alpha.py b/model_zoo/QNN/src/QNN_alpha.py new file mode 100644 index 00000000..44984d4a --- /dev/null +++ b/model_zoo/QNN/src/QNN_alpha.py @@ -0,0 +1,136 @@ +# ========================================================================= +# Copyright (C) 2025. salmon1802@github. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ========================================================================= + +import torch +from torch import nn +from fuxictr.pytorch.models import BaseModel +from fuxictr.pytorch.layers import FeatureEmbedding + +class QNN_alpha(BaseModel): + def __init__(self, + feature_map, + model_id="QNN_alpha", + gpu=-1, + learning_rate=1e-3, + embedding_dim=16, + num_layers=3, + num_heads=1, + num_row=2, + net_dropout=0, + batch_norm=False, + embedding_regularizer=None, + net_regularizer=None, + **kwargs): + super(QNN_alpha, self).__init__(feature_map, + model_id=model_id, + gpu=gpu, + embedding_regularizer=embedding_regularizer, + net_regularizer=net_regularizer, + **kwargs) + self.embedding_layer = FeatureEmbedding(feature_map, embedding_dim) + input_dim = feature_map.sum_emb_out_dim() + num_fields = feature_map.num_fields + self.qnn = QuadraticNeuralNetworks(input_dim=input_dim, + num_layers=num_layers, + net_dropout=net_dropout, + num_heads=num_heads, + num_row=num_row, + batch_norm=batch_norm, + num_fields=num_fields) + self.compile(kwargs["optimizer"], kwargs["loss"], learning_rate) + self.reset_parameters() + 
self.model_to_device() + + def forward(self, inputs): + X = self.get_inputs(inputs) + feature_emb = self.embedding_layer(X, dynamic_emb_dim=True) + if self.training: + y_pred1 = self.qnn(feature_emb) + y_pred2 = self.qnn(feature_emb) + return_dict = {"y_pred1": self.output_activation(y_pred1), + "y_pred2": self.output_activation(y_pred2)} + else: + y_pred = self.qnn(feature_emb) + y_pred = self.output_activation(y_pred) + return_dict = {"y_pred": y_pred} + return return_dict + + def add_loss(self, inputs): + return_dict = self.forward(inputs) + y_true = self.get_labels(inputs) + y_pred1 = return_dict["y_pred1"] + y_pred2 = return_dict["y_pred2"] + y_pred = (y_pred1 + y_pred2) * 0.5 + loss = self.loss_fn(y_pred, y_true, reduction='mean') + loss1 = self.loss_fn(y_pred1, y_pred.detach(), reduction='mean') + loss2 = self.loss_fn(y_pred2, y_pred.detach(), reduction='mean') + loss = loss + loss1 + loss2 + return loss + +class QuadraticNeuralNetworks(nn.Module): + def __init__(self, + input_dim, + num_fields, + num_layers=3, + net_dropout=0.1, + num_heads=2, + num_row=2, + batch_norm=False): + super(QuadraticNeuralNetworks, self).__init__() + self.num_layers = num_layers + self.dropout = nn.ModuleList() + self.layer = nn.ModuleList() + for i in range(num_layers): + self.layer.append(QuadraticLayer(input_dim, num_row=num_row, num_heads=num_heads, batch_norm=batch_norm, num_fields=num_fields, net_dropout=net_dropout)) + if net_dropout > 0: + self.dropout.append(nn.Dropout(net_dropout)) + self.fc = nn.Linear(input_dim, 1) + + def forward(self, x): + for i in range(self.num_layers): + x = self.layer[i](x) + if len(self.dropout) > i: + x = self.dropout[i](x) + logit = self.fc(x) + return logit + + +class QuadraticLayer(nn.Module): + def __init__(self, input_dim, num_fields, num_row=2, num_heads=2, batch_norm=False, net_dropout=0.1): + super(QuadraticLayer, self).__init__() + self.linear = nn.Sequential(nn.Linear(input_dim, input_dim * num_row), + nn.BatchNorm1d(input_dim * 
num_row) if batch_norm else nn.Identity(), + nn.ReLU()) + if net_dropout > 0: + self.dropout = nn.Dropout(net_dropout) + self.net_dropout = net_dropout + self.num_fields = num_fields + self.num_row = num_row + self.embedding_dim = input_dim // num_fields + self.num_heads = num_heads + self.input_dim = input_dim + self.head_dim = input_dim // num_heads + + def forward(self, x): # Khatri-Rao product + ego_x = x + x = x.view(-1, self.num_fields, self.embedding_dim) + multihead_x = torch.tensor_split(x, self.num_heads, dim=-1) # d = D/H + multihead_x = torch.stack(multihead_x, dim=1).view(-1, self.num_heads, self.head_dim) # B × H × d + h = self.linear(ego_x).view(-1, self.num_heads, self.num_row, self.head_dim) # B × H × R × d + if self.net_dropout > 0: + h = self.dropout(h) + x = torch.einsum("bhd,bhrd->bhrd", multihead_x, h).sum(dim=-2).view(-1, self.input_dim) + ego_x + return x diff --git a/model_zoo/QNN/src/__init__.py b/model_zoo/QNN/src/__init__.py new file mode 100644 index 00000000..4d6be9c9 --- /dev/null +++ b/model_zoo/QNN/src/__init__.py @@ -0,0 +1,4 @@ +from .QNN import * +from .QNN_alpha import * + +
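
Review note on the Self-Ensemble Loss in `QNN_alpha.add_loss`: during training the model runs `self.qnn` twice on the same embeddings, averages the two predictions, and sums one supervised term with two consistency terms against the detached average. The two passes presumably differ only through dropout, so with `net_dropout: 0` they would coincide. The arithmetic can be traced without PyTorch in the minimal sketch below. The helper names `bce` and `self_ensemble_loss` and the sample numbers are illustrative assumptions, not part of the patch, and the clamp constant `eps` is likewise an assumption rather than the value `loss_fn` uses.

```python
import math


def bce(pred, target):
    """Mean binary cross-entropy; targets may be soft values in [0, 1]."""
    eps = 1e-7  # assumed clamp to keep log() finite; not taken from the patch
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def self_ensemble_loss(y_pred1, y_pred2, y_true):
    """Mirror of the three-term sum in add_loss, on plain Python lists."""
    # Average of the two stochastic forward passes.
    y_mean = [(a + b) * 0.5 for a, b in zip(y_pred1, y_pred2)]
    # Supervised term: the averaged prediction against the true labels.
    loss = bce(y_mean, y_true)
    # Consistency terms: each pass against the average, which the patch
    # treats as a constant target via y_pred.detach().
    loss1 = bce(y_pred1, y_mean)
    loss2 = bce(y_pred2, y_mean)
    return loss + loss1 + loss2


# Hypothetical batch of three samples and two dropout-perturbed passes.
y_true = [1.0, 0.0, 1.0]
y_pred1 = [0.9, 0.2, 0.6]
y_pred2 = [0.8, 0.1, 0.7]

total = self_ensemble_loss(y_pred1, y_pred2, y_true)
print(f"self-ensemble loss: {total:.4f}")
```

Note that the two consistency terms do not vanish when both passes agree: the binary cross-entropy of a prediction against itself equals its binary entropy, so the total stays above the plain supervised loss unless the predictions saturate at 0 or 1.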