First of all, thanks for your awesome work. The result is really impressive.
I'm confused about the training steps for the GLDv2-clean version. In your paper, it is described as:

> The global model is trained for 25 epochs (39.5M steps) for the training dataset, using a learning rate of 0.005625, and a batch size of 144.
And the GLDv2 clean set, as you wrote, has 1.58M images from 81k landmarks.
Then, if you train for 25 epochs, that means (1.58M / 144) × 25 ≈ 0.27M steps; conversely, 39.5M steps would correspond to ~3600 epochs. So which description is correct, or did I just miss something?
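For reference, here is a quick sanity check of the arithmetic above, using the numbers quoted from the paper (dataset size, batch size, and step count are taken from the text; nothing else is assumed):

```python
# Sanity-check the step/epoch arithmetic from the paper's numbers.
num_images = 1.58e6   # GLDv2-clean training images
batch_size = 144
epochs = 25

steps_per_epoch = num_images / batch_size          # ~10,972 steps/epoch
steps_for_25_epochs = steps_per_epoch * epochs     # ~274,000 steps (0.27M)
print(f"steps for 25 epochs: {steps_for_25_epochs:,.0f}")

reported_steps = 39.5e6
implied_epochs = reported_steps / steps_per_epoch  # ~3,600 epochs
print(f"epochs implied by 39.5M steps: {implied_epochs:,.0f}")
```

Either the epoch count or the step count in the paper would need to change by roughly two orders of magnitude for the two to agree.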