Add LLVM pass to optimize promote calls based on thread tracing state.#301
Conversation
| vector<llvm::Constant *> GlobalsAsConsts; | ||
| for (llvm::GlobalVariable *G : Globals) { | ||
| GlobalsAsConsts.push_back(cast<llvm::Constant>(G)); |
There was a problem hiding this comment.
I've had to make a change similar to this, but which never landed.
The way I did it was:
- not record TLS in the array
- keep a count of how many elements do make it into the array.
- adjust the size of the array declaration with the count (instead of
Globals.size()).
That should work, right? If the JIT runtime doesn't need the TLS variables. I assume this is the case, because there's little it could do with the nulls you are writing in currently.
There was a problem hiding this comment.
JIT runtime can't use TLS values because TLS addresses are thread-specific and cannot be stored in a static array. For example here we get them by offset.
So I think we do need the tls vars in the array to be aware of them, just cant use their values.
| @@ -140,12 +131,18 @@ class YkConditionalPromoteCalls : public ModulePass { | |||
|
|
|||
| IRBuilder<> Builder(CI); | |||
|
|
|||
| // Metadata to mark instructions that shouldn't be serialised into AOT IR. | |||
There was a problem hiding this comment.
Interesting, so you mark individual instructions that shouldn't be serialised.
For the basic block tracer stuff, I marked the purpose of blocks by annotating the first instruction in the block. The serialiser would then have custom logic for the contents of the blocks.
Your way may be better. I'm not sure. Did you try marking blocks first, then went to this approach, or?
There was a problem hiding this comment.
I think we should use similar approach for consistency.
Updated 👉 f8fc8a777493b7e4f1e143b78a3469be438ebe59
There was a problem hiding this comment.
Now that we've implemented both ways, which do we think is better?
There was a problem hiding this comment.
I think the block-level approach is better because it captures why we handle blocks specially, not just that we should skip them.
That said, it raises a separation of concerns question here - should the YkIRWriter need to know about pass-internal logic?
|
FWIW this LGTM now. |
For the basic block tracing stuff, it has to serialise stuff as though some instructions have been moved around between blocks. Sure, we could plumb in some kind of specification of how to move instructions and what blocks should and shouldn't be serialised, then the logic is inside the pass in question and the serialiser is then dumb, but that's a lot of work. I suggest we roll with the block-based method for now. |
yes, agree |
|
Please squash |
f8fc8a7 to
472df6b
Compare
Squashed 👉 472df6be2120bd2af7f0d6c4b3523b6ec734bdf0 |
|
Ci failed cause I forgot to update .ll test |
|
Please squash. |
This commit introduces the YkConditionalPromoteCalls pass, which modifies the IR to conditionally skip calls to __yk_promote_* and __yk_idempotent_promote_* functions when not tracing.
754ce56 to
c1625e7
Compare
|
Done c1625e7 |
This PR introduces the YkConditionalPromoteCalls pass, which modifies the IR to conditionally skip calls to _yk_promote* and _yk_idempotent_promote* functions based on the jit tracing state.
Related yk Pr 👉 ykjit/yk#2051