
Llama65B patch for int4 fp32 #1769

Draft

Abhishek-Varma wants to merge 1 commit into nod-ai:main from Abhishek-Varma:llama65b_v1

Conversation

@Abhishek-Varma
Contributor

Adding a WIP patch on top of @dan-garvey's patch here for int4/fp32.

Apart from the bug fixes, I have also changed the compilation order (Second Vicuna/Llama first, then First).

I have kicked off compilation; if the IR gets generated, we should be able to apply this patch's diff on top of @dan-garvey's patch.

Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
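For context, below is a minimal sketch of what int4 weight quantization with fp32 dequantization typically looks like. This is illustrative only, not the actual code in this patch or in SHARK; the per-group symmetric scheme and the `group_size` parameter are assumptions:

```python
import numpy as np

def quantize_int4(w, group_size=32):
    """Symmetric per-group quantization of fp32 weights to the int4 range [-8, 7].

    Assumes the flattened weight count is divisible by group_size.
    Returns the quantized values (stored in int8 here for simplicity)
    and one fp32 scale per group.
    """
    groups = w.reshape(-1, group_size)
    # One scale per group, chosen so the largest magnitude maps to +/-7.
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero for all-zero groups
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_int4(q, scale, shape):
    """Reconstruct fp32 weights from int4 values and per-group fp32 scales."""
    return (q.astype(np.float32) * scale).reshape(shape)

# Usage: round-trip a small weight tensor and check the error stays bounded.
w = np.linspace(-1.0, 1.0, 64, dtype=np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale, w.shape)
max_err = float(np.max(np.abs(w - w_hat)))
```

The round-trip error per element is bounded by half the group's scale, which is the usual trade-off that makes int4 weights viable when activations stay in fp32.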
@Abhishek-Varma
Contributor Author

We would need to remove the hardcoded IR dumps used after a successful compilation before landing this patch's diff.
CC: @dan-garvey
