Are dev/test sets used for training?

A few datasets are used for training: NUCLE, Lang-8, FCE, WI and LOCNESS. Do you train only on the training splits of these datasets, or also on their development and test sets?

<img width="351" alt="Screenshot 2023-03-30 at 20 17 03" src="https://user-images.githubusercontent.com/30549145/228941221-96bf243b-c411-4148-b77e-49c8fad34f65.png">

Notably, you evaluate on the BEA-2019 dev set, which includes WI and LOCNESS, so I assume you train only on the training splits of the datasets above. Is that correct?

My confusion comes from the dataset sizes you report, which differ from those in the follow-up work: https://arxiv.org/pdf/2203.13064.pdf

It seems that you used the full FCE dataset for GECToR, but only the FCE training set for the ensembling paper.

<img width="290" alt="Screenshot 2023-03-30 at 20 24 34" src="https://user-images.githubusercontent.com/30549145/228942862-f69ee124-5cca-4064-93f9-2dc39dd56d72.png">
