GSoC 2026: Project #6 Develop an OpenVINO-Domain Specialized Coder Model with SFT/GRPO/RAG #34261
Replies: 7 comments 4 replies
-
|
Hey @abhayuvi, this is a fantastic and well-thought-out breakdown for GSoC! Your focus on GRPO reward functions (compilation success/API validity) and quantization targets (NNCF/INT4) shows you've really looked into the technical constraints. Regarding the base model, Qwen2.5-Coder-7B has been showing great performance in recent benchmarks, so that could be a strong contender. For the RAG knowledge base, don't forget the Jupyter Notebooks in the OpenVINO training materials—they often have the most up-to-date API usage patterns. Good luck with the proposal! |
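To make the "compilation success / API validity" reward idea above concrete, here is a minimal sketch of what such a reward could look like. Everything in it is an assumption for illustration: the `KNOWN_API` allowlist is a hand-picked subset rather than the real OpenVINO API surface, the `ov` alias convention is assumed, the 0.5/0.5 weighting is arbitrary, and "compilation" here only means the Python snippet parses and byte-compiles. It is not the proposal's actual reward function.

```python
import ast

# Hypothetical allowlist: a tiny hand-picked subset for illustration only.
# A real one would be generated from the OpenVINO API reference.
KNOWN_API = {"Core", "Tensor", "compile_model", "convert_model", "save_model"}


def compiles(source: str) -> bool:
    """Return True if the snippet byte-compiles (a syntax check, nothing is executed)."""
    try:
        compile(source, "<completion>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def api_validity(source: str, module_alias: str = "ov") -> float:
    """Fraction of `ov.<name>` attribute accesses whose name is in KNOWN_API.

    Assumes the completion uses `import openvino as ov`; returns 0.0 when
    no `ov.<name>` access is found.
    """
    tree = ast.parse(source)
    used = [
        node.attr
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == module_alias
    ]
    if not used:
        return 0.0
    return sum(name in KNOWN_API for name in used) / len(used)


def reward(completion: str) -> float:
    """0.0 if the code does not compile, else 0.5 plus up to 0.5 for API validity."""
    if not compiles(completion):
        return 0.0
    return 0.5 + 0.5 * api_validity(completion)


def reward_batch(completions: list[str]) -> list[float]:
    """Score a group of sampled completions, one reward per completion."""
    return [reward(c) for c in completions]
```

Gating on compilation first keeps the API check from having to parse broken code, and a graded score gives the policy a signal even when only some of the calls are valid.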
-
|
Hi @abhayuvi,
Curated Data: We should prioritize the latest OpenVINO 2024.x/2025.x documentation. In addition, curating examples of "Legacy API -> New API" migrations, OpenVINO GenAI API usage, and NNCF quantization snippets will drastically improve RAG performance.
Looking forward to discussing this further with you and seeing how your implementation evolves! |
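As a rough illustration of the curated knowledge base suggested above, the sketch below tags snippets by category and retrieves them by plain token overlap. The `Snippet` type, the `CORPUS` entries, and the scoring are all assumptions: the two migration entries are illustrative pairs written in shorthand, the NNCF entry is a placeholder, and token overlap is a stand-in for whatever embedding-based retriever the project ends up using.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    category: str  # hypothetical tags: "migration", "genai", "nncf"
    title: str
    text: str


# Illustrative entries only; a real corpus would be curated from the docs and notebooks.
CORPUS = [
    Snippet(
        "migration",
        "IECore.read_network -> Core.read_model",
        "Legacy: ie.read_network(...). New: core.read_model(...).",
    ),
    Snippet(
        "migration",
        "IECore.load_network -> Core.compile_model",
        "Legacy: ie.load_network(...). New: core.compile_model(...).",
    ),
    Snippet(
        "nncf",
        "Post-training quantization",
        "Placeholder text for an NNCF quantization snippet.",
    ),
]


def tokens(text: str) -> set[str]:
    """Lowercase identifier-like tokens, keeping underscores so API names stay whole."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(query: str, corpus: list[Snippet], k: int = 2) -> list[Snippet]:
    """Return the k snippets sharing the most tokens with the query."""
    q = tokens(query)
    ranked = sorted(
        corpus,
        key=lambda s: len(q & tokens(s.title + " " + s.text)),
        reverse=True,
    )
    return ranked[:k]
```

For example, `retrieve("how do I migrate load_network to the new API", CORPUS, k=1)` returns the `load_network` migration entry, because `load_network` is the only token it shares with any snippet.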
-
|
Thank you for putting so much thought into this. This is an excellent analysis, and I completely agree with your assessment. Here are my thoughts.
Using GitHub to communicate is fine. Looking forward to your next move. |
-
|
Also @7taozhou7, @yinquan251: I have been looking through the repo for good first issues to contribute to before the proposal, but couldn't find any open ones currently. I would appreciate any pointers on where a meaningful contribution could be made, whether that's documentation, tests, or any area related to the project scope. |
-
|
Wow, this is sick. I am currently working on reinforcement learning and RAG alignment, and I thought of drafting a proposal for this project myself, but I think @abhayuvi is the right fit here. The architecture explains a lot without being too complex. Good luck, man. |
-
|
Hi @7taozhou7 and @yinquan251, wanted to share a quick update on the architecture and the proposal.
The updated architecture diagram and full proposal draft have also been shared over email from abhyuvi.raj@gmail.com for review. I would really appreciate any feedback or suggestions before the deadline, and I am happy to make changes if anything needs to be reconsidered. Also, thanks for your previous feedback, which helped correct and improve the architecture. |
-
|
Hey @abhayuvi, really nice work on the clean roadmap. I was exploring similar ideas yesterday, but it looks like you’ve already structured it much better. Given the timeline, I’d be happy to collaborate and contribute wherever it helps. I genuinely think working together on a clear direction like this can speed things up a lot. Feel free to ping me anytime if you need any help. Good luck! |
-
Hello @7taozhou7, @yinquan251, and everyone! Hope you're doing well.
I'm Yuvraj, graduating this summer with a B.Tech in CS with AI & ML. For the past 10 months I've been working as a full-time intern at a US-based MNC, where I've built and shipped RAG and SFT pipelines that are currently in production and used globally, so the core technologies in this project are ones I work with daily, not just academically.
I've been following the OpenVINO repo since my 2nd year, and this year Project-6 "Develop an OpenVINO-Domain Specialized Coder Model with SFT/GRPO/RAG" immediately caught my attention. I spent this weekend doing a deep dive into the project description and the referenced HuggingFace cookbook, and put together a draft architecture covering the full pipeline:
After working through the design, I have a few questions I'd love your input on:
I know the project is marked Hard and I've likely missed things in my initial design; that's exactly what I'm hoping to discuss. I'd love to collaborate and contribute meaningfully to this.
If anyone else in the community has worked on similar OpenVINO + LLM projects or has thoughts on the architecture, I'd love to hear your input too! Looking forward to the discussion.