[SIGIR 2022] ORCAS-I: Queries Annotated with Intent using Weak Supervision

ProjectDossier/intents_labelling


Intents Labelling project

This package serves as the basis for the paper "ORCAS-I: Queries Annotated with Intent using Weak Supervision".

Link to the paper: arXiv

DOI of the paper: https://doi.org/10.1145/3477495.3531737

DOI of the dataset: DOI

Installation

Create a conda environment:

$ conda create --name intents_labelling python==3.8.12

Activate the environment:

$ conda activate intents_labelling

Use pip to install the requirements:

(intents_labelling) $ pip install -r requirements.txt

Install the intents_labelling package in development (editable) mode:

(intents_labelling) $ pip install -e .

Install the spaCy language model:

(intents_labelling) $ python -m spacy download en_core_web_lg

The list of movie titles can be found here.

Put all data files in the data/input/ directory.
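
Before running the scripts, it can help to confirm that the input files are in place. The following is a minimal sketch and not part of the package: the helper `missing_inputs` and the file names passed to it are hypothetical placeholders, since the exact file names the scripts expect are not listed in this README.

```python
from pathlib import Path
from typing import Iterable, List


def missing_inputs(data_dir: str, expected: Iterable[str]) -> List[str]:
    """Return the names from `expected` that are not files inside `data_dir`."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical file names; replace them with the files you downloaded.
    missing = missing_inputs("data/input", ["orcas.tsv", "movie_titles.csv"])
    print("Missing files:", missing or "none")
```

An empty result means every listed file was found in the directory.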

Usage

Create a training set by sampling the ORCAS dataset and filtering out test set examples:

(intents_labelling) $ python intents_labelling/create_train_file.py
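
The idea behind this step (sample queries, then exclude anything that also appears in the test set) can be sketched as follows. This is an illustration under stated assumptions, not the code in `create_train_file.py`: the function name `sample_training_queries`, its parameters, and the choice to deduplicate queries before sampling are all hypothetical.

```python
import random
from typing import Iterable, List, Set


def sample_training_queries(
    queries: Iterable[str],
    test_queries: Set[str],
    sample_size: int,
    seed: int = 0,
) -> List[str]:
    """Sample unique queries, excluding any that appear in the test set."""
    # Deduplicate while preserving order, and drop test set queries
    # so that no evaluation example leaks into the training data.
    candidates = [q for q in dict.fromkeys(queries) if q not in test_queries]
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    # Never request more items than are available.
    k = min(sample_size, len(candidates))
    return rng.sample(candidates, k)
```

Filtering before sampling, rather than after, keeps the sample size predictable: the result has exactly `sample_size` queries whenever enough candidates remain.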

Create Snorkel annotations:

(intents_labelling) $ python intents_labelling/main.py
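
For readers new to weak supervision, the sketch below shows the general pattern in plain Python: several small labelling functions each vote for a label or abstain, and the votes are combined into one label per query. It is not the code in `main.py` and does not use the Snorkel API. The label ids, the two labelling functions, and the majority-vote aggregation are hypothetical stand-ins; Snorkel itself offers a learned label model in place of a simple majority vote.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical label ids, chosen for illustration only.
ABSTAIN = -1
NAVIGATIONAL = 0
TRANSACTIONAL = 1


def lf_has_domain_suffix(query: str) -> int:
    """Vote NAVIGATIONAL if the query ends like a website name."""
    return NAVIGATIONAL if query.endswith((".com", ".org")) else ABSTAIN


def lf_has_download_keyword(query: str) -> int:
    """Vote TRANSACTIONAL if the query contains the word 'download'."""
    return TRANSACTIONAL if "download" in query.split() else ABSTAIN


def majority_vote(query: str, lfs: Sequence[Callable[[str], int]]) -> int:
    """Combine labelling-function votes; abstain on no votes or a tie."""
    votes = [v for v in (lf(query) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    # A tie between the two most common labels is treated as an abstention.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


LFS = [lf_has_domain_suffix, lf_has_download_keyword]
```

Calling `majority_vote("download vlc", LFS)` returns `TRANSACTIONAL`, while a query that no function recognises returns `ABSTAIN` and would be left unlabelled.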

Train the BERT model:

(intents_labelling) $ python intents_labelling/models/train_bert_classifier.py
