
Llama65B patch for int4 fp32 #1769

Draft

Abhishek-Varma wants to merge 1 commit into nod-ai:main from Abhishek-Varma:llama65b_v1

Conversation

@Abhishek-Varma
Contributor

Adding a WIP patch on top of @dan-garvey's patch here for int4/fp32.

Apart from the bug fixes, I have also changed the compilation order (Second Vicuna/Llama first, then First).

I have kicked off compilation; if the IR gets generated, we should be able to apply this patch's diff on top of @dan-garvey's patch.

Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
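For context, below is a minimal sketch of what int4 weight quantization with fp32 dequantization typically looks like. This is illustrative only, not the actual code in this patch or in SHARK; the per-group symmetric scheme and the `group_size` parameter are assumptions:

```python
import numpy as np

def quantize_int4(w, group_size=32):
    """Symmetric per-group quantization of fp32 weights to the int4 range [-8, 7].

    Assumes the flattened weight count is divisible by group_size.
    Returns the quantized values (stored in int8 here for simplicity)
    and one fp32 scale per group.
    """
    groups = w.reshape(-1, group_size)
    # One scale per group, chosen so the largest magnitude maps to +/-7.
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero for all-zero groups
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_int4(q, scale, shape):
    """Reconstruct fp32 weights from int4 values and per-group fp32 scales."""
    return (q.astype(np.float32) * scale).reshape(shape)

# Usage: round-trip a small weight tensor and check the error stays bounded.
w = np.linspace(-1.0, 1.0, 64, dtype=np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale, w.shape)
max_err = float(np.max(np.abs(w - w_hat)))
```

The round-trip error per element is bounded by half the group's scale, which is the usual trade-off that makes int4 weights viable when activations stay in fp32.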
@Abhishek-Varma
Contributor Author

We would need to remove the hardcoded IR dumps used after a successful compilation before landing this patch's diff.
CC: @dan-garvey
