Deterministically get activation_index, fixed indentation, added support for python3 #4
kendricktan wants to merge 3 commits into jacobgil:master from
Conversation
jacobgil
left a comment
Hi, thanks a lot for this.
I will be able to test this over the weekend.
I'm not sure I understand it correctly; I added a comment about part of the new code. Can you please explain how it works?
for layer, (name, module) in enumerate(self.model.features._modules.items()):
    x = module(x)
    if isinstance(module, torch.nn.modules.conv.Conv2d):
        x.register_hook(self.compute_rank)
self.compute_rank is now a function that returns a function (hook). It looks like the PyTorch hook machinery will call compute_rank, which will return hook as a function object (but won't run it), so self.filter_ranks won't be computed anywhere.
self.compute_rank now returns a function (hook). So when self.compute_rank(activation_index) is called, hook (a partial function that has captured the local variable activation_index) is passed in as the callback for register_hook.
So when the gradients are updated, hook is called, but it doesn't need to calculate the activation_index, because that was already fixed when self.compute_rank(activation_index) was called.
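In code, the pattern looks roughly like this (a minimal, self-contained sketch; the toy graph and the ranking statistic are placeholders, not the PR's actual code):

```python
import torch

activations = []
filter_ranks = {}

def compute_rank(activation_index):
    # Returns a closure: activation_index is fixed at registration time,
    # so the hook never has to work out which activation its gradient
    # belongs to via shared mutable state.
    def hook(grad):
        activation = activations[activation_index]
        # Placeholder statistic; finetune.py's real ranking criterion differs.
        filter_ranks[activation_index] = (activation * grad).abs().sum().item()
    return hook

x = torch.randn(2, 3, requires_grad=True)
y = x * 2                          # stand-in for a conv layer's output
activations.append(y)
y.register_hook(compute_rank(0))   # note: call compute_rank, pass its result
y.sum().backward()
print(filter_ranks)                # {0: <some float>}
```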
Thanks.
But if so, then wasn't the intention to do:

x.register_hook(self.compute_rank(activation_index))
self.activations.append(x)

Otherwise x isn't appended to self.activations and can't be used from within hook, and PyTorch isn't registering the gradient callback to the partial function from self.compute_rank.
It's hard to explain, but here's a code snippet that explains what partial functions do.
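A stand-in along those lines (the names here are illustrative, not the original snippet):

```python
from functools import partial

def compute_rank(activation_index, grad):
    print("ranking activation", activation_index, "with gradient", grad)

# partial "bakes in" the first argument, leaving a one-argument callable,
# which is exactly the shape register_hook expects for its callback.
hook = partial(compute_rank, 5)
hook("grad_tensor")  # prints: ranking activation 5 with gradient grad_tensor

# A closure achieves the same binding without functools:
def make_hook(activation_index):
    def hook(grad):
        print("ranking activation", activation_index, "with gradient", grad)
    return hook

make_hook(7)("grad_tensor")  # prints: ranking activation 7 with gradient grad_tensor
```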
jacobgil
left a comment
Hey @kendricktan, just wanted to make sure you saw my latest comment.
Does it make sense?
Oh whoops, you are completely right: it should be registering the hook with the partial function and appending x to the activations, not the other way around. I should have slept before committing this. I'll change it when I have time, thanks.
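For reference, the corrected loop would look something like this (a sketch based on the quoted forward pass; the surrounding class and the activation_index counter are assumed):

```python
activation_index = 0
for layer, (name, module) in enumerate(self.model.features._modules.items()):
    x = module(x)
    if isinstance(module, torch.nn.modules.conv.Conv2d):
        # Register the hook *returned by* compute_rank (a closure with
        # activation_index baked in), then keep the activation around so
        # the hook can read it during backward.
        x.register_hook(self.compute_rank(activation_index))
        self.activations.append(x)
        activation_index += 1
```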
Fixed in 212f1b5.
Hello, thank you for the blog post and the code. I ran your code but hit a problem running "python finetune.py
Who knows how to track down the owner of a phone number? Please help me find this scoundrel: 89635264714!
Hi there, just wanted to say thank you for the blog post and the code example. I noticed that the function compute_rank in finetune.py is mutating global state, namely grad_index, to calculate activation_index. See:
pytorch-pruning/finetune.py
Line 73 in 7c3a5af
While it's fine for a single GPU, I noticed that it becomes non-deterministic when pruning/training on multiple GPUs.
This pull request solves that issue and also adds support for python3.
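Condensed, the change looks like this (simplified sketches of methods on the pruner class; the "before" body is a simplification of the pattern described above, and the ranking statistic is a placeholder):

```python
# Before: the hook derives its index from shared mutable state. On a single
# GPU the hooks fire in a fixed reverse order, but with multiple GPUs the
# replicas update grad_index concurrently, so gradients can be matched to
# the wrong activation_index.
def compute_rank(self, grad):
    activation_index = len(self.activations) - self.grad_index - 1
    self.filter_ranks[activation_index] += grad.abs().sum()  # placeholder
    self.grad_index += 1

# After: each hook captures its own activation_index at registration time,
# so the order in which gradients arrive no longer matters.
def compute_rank(self, activation_index):
    def hook(grad):
        self.filter_ranks[activation_index] += grad.abs().sum()  # placeholder
    return hook
```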