AI tooling notice for contributions #22982
Conversation
As discussed at the October meeting. Meta Disclosure: I used an LLM to review this policy.
**benjyw** left a comment:
Looks good! But I wonder if we could mention certain LLM-isms that we don't want to see. For example, AI-generated code usually contains too many superfluous comments that simply recapitulate what the code line below the comment obviously does.
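For concreteness, here is a hypothetical Python sketch of that pattern (the function and its comments are invented for this example); every comment merely restates the line beneath it:

```python
def read_config(path: str) -> str:
    # Open the config file
    with open(path) as f:
        # Read the contents of the file
        contents = f.read()
    # Return the contents
    return contents
```

None of those comments would survive an attentive human read-through.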
Definitely a good jumping-off point, and I'll have more commentary later, but two notes: that Oxide post was excellent. Personally, I generally don't care much about what people do in the privacy of their own homes, so using LLMs to read/reason about/debug is kinda 🤷🏽. I'm more interested in work product disclosures (which is that human note in the first paragraph you wrote). But for LLM assistance disclosure: if someone asked ChatGPT to explain how Python and Rust interact, I don't care about that, nor would I expect them to disclose it. Though... I guess it does provide useful context for the PR reviewer to know what grasp the contributor has of the domain? I dunno; I'm on the fence about that.

EDIT (Jan 8): Continuing my thoughts from yesterday.

**Re: work product disclosure**

I should also note that I care more about work product disclosures because there are sometimes legal implications for my company or my clients (re: IP assignment, licensing, blatant theft, etc.). So that's a big deal for me, and that's my bias; it's less important for Pants specifically.

I'm not much of an LLM user, for a variety of reasons (ethical, legal, and practical). I've also had to review my fair share of LLM-generated code in PRs (largely in my day job, but also in Pants/OSS). There are broadly two types of PRs that include LLM (output) assistance which I encounter often:

**The deft touch**

LLM output is used for rote boilerplate (which is generally fine; I view this as similar to the code templates I make) or in smaller sections of code, similar to how others might copy/paste or cargo-cult code. In the former case, I might see a disclosure like:

in the PR comments for that file. When it's correct, that's great. But usually, when there is a situation where we can auto-generate boilerplate, I'll ask the author for a template that we can generate once and use deterministically from then on. In the latter case, I can home in on the parts of the PR where I need to focus, and then explain to the author whether it's correct or not, what to tweak, why, etc. These are usually disclosed on the PR lines/changes as:

and in the same way, I would previously have expected a disclosure on these sections of code as:

I'm cool with both of these.

**The sledgehammer**

I'll happily refer to this as AI slop. Whether it's correct or not, the author doesn't really know; they haven't learned the codebase, don't really understand how or why the code works, and just "want the thing". Good for them, but unfortunately this is a maintainability nightmare, and someone (usually other than the author) gets burdened with it. If I offer commentary, I'm often just literally re-prompting their machine, so why even bother? While these don't map 1:1 to drive-bys, they still feel like them: consistently adding entropy to the codebase and making it harder to maintain for future devs. This is unrelated to the size of the PR; sometimes it's a big PR bomb, other times just a hundred lines of code. FWIW, I consider this similar to broadly copying/pasting and then renaming variables in a non-trivial function/file, so this isn't a "new thing" in programming by any means, just an easier one to do. I absolutely detest these PRs.

**Disclosure**

My thoughts on the mechanism for disclosure are that it needs to be at the level of the change itself. "In the PR description" implies larger or over-arching assistance, while I personally prefer more localized disclosures (e.g. at the file level, or preferably at the function/line/change level), as mentioned in my deft-touch examples. An example of where I've already done this in the Pants repo is the Python 3.14 branch, where there was so much boilerplate cruft, with a few nuggets of "real" changes. So, I guess, LLM disclosure would be more in line with just... good PR hygiene...? I don't have any clean recommendations on how to modify the wording, as I don't want this to get so large that it turns into disclosure theatre.

**Semi-related, but scope creep**

Not AI/LLM related, so not for this PR (but I'll forget if I don't write them down literally right now): we should have a "policy" (like, a sentence or paragraph) in the contribution notes for drive-bys and for entirely new backends/plugins. My suggestion would be that new backends/plugins should start as in-repo plugins and then get pulled into main after some basic road testing (e.g. https://github.com/sureshjoshi/pants-plugins/tree/main/pants-plugins/experimental/ty). No need to discuss this section in this PR; I'll reference it in another discussion. But a lot of the maintainability concerns I have about LLM code apply here too.

**References**

Some references for various AI policies I've seen around (in case reviewers or bystanders want to see more in the wild):
> ## LLM Assistance Notice
> AI/LLM assisted *code contributions* are welcome. That includes using LLM assistance to write, introspect, read, reason about, debug, or test code contributions. However, the extent of the LLM assistance use must be disclosed in the Pull Request description. All of the normal expectations of a human apply: you need to understand it, have tested it, and be able to discuss the proposed change.
> the extent of the LLM assistance use must be disclosed in the Pull Request description
"and/or in the PR comments of the relevant changes"
Blah. I don't like how that sounds.
In my day job I've had some success with adding Cursor rules to our main repo, but of course that's just one tool, and anecdotally a lot of people at my work are switching from Cursor to Claude Code, so I'm not sure whether it respects rules in the same location. Hopefully the AI tools will standardize on a specific filepath for rule files.
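For what it's worth, a minimal sketch of what such a repo-level rule file could contain. This assumes Cursor's convention of Markdown rule files under `.cursor/rules/` (Claude Code reads a `CLAUDE.md` at the repo root instead); the frontmatter keys and the rules themselves are illustrative, not a proposal for this repo:

```markdown
---
description: House rules for AI-assisted changes
alwaysApply: true
---

- Do not add comments that merely restate the line of code below them.
- Do not reformat code you are not otherwise changing.
- Keep diffs small and focused, and flag any new dependency explicitly.
```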
> Gemini generated all of the test cases.
> Minor exceptions to this policy include basic auto-complete functionality or the various Machine Learning techniques that are no longer called "AI". Spellcheck is not an AI.
What is "basic auto-complete"? I don't think LSPs have ever been considered to be AI per se, but now there are LLMs that run via LSPs.
This feels a bit like pre-empting future contributor nit-picking, which I think implies those people are not acting in good faith in general.
I don't personally think we need this paragraph, though I have seen something like it in other projects 🤷🏽
The continue.dev plugin for PyCharm imitates autocomplete (possibly sometimes replacing it), displaying the AI suggestions as autocomplete suggestions. The native JetBrains AI seems to do better, but it still displays at least some suggested code as autocomplete.

So far, nearly all of the AI suggestions I get via the continue.dev plugin have been garbage. I try it (because AI is big at work), and end up disabling it again because it slows me down.

All that to say: distinguishing AI vs. autocomplete can be quite murky.
Xcode did this too, until I uninstalled the model. Replacing my insanely fast auto-complete with a sketchy, slow one is... a choice. But yeah, that's kinda why I didn't want to go down that rabbit hole at all.
> :::note AI/LLM generation is for *code* changes only. All community interactions, including comments, discussion, issues, PR titles, and descriptions *must be composed by a human*. Be respectful to one another and mindful of the person on the other end.
Goddamn Copilot offering to auto-summarize issues and auto-create PR summaries. It's a bane on OSS. One-line commits turn into emoji-ridden soliloquies.
This is an interesting one, because I'd argue that if someone left in a bunch of superfluous comments, it means they didn't actually follow the LLM disclosure policy in the first place 😆: if you use an LLM to generate a snippet of code that you'll then commit, you need to have read it, and would hopefully go in and strip out useless commentary or tweak it. This feels like it should be handled at the level of the contributor, rather than by a Pants-level rule that tries to blanket-explain generated code commentary.
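Continuing the hypothetical snippet from earlier in the thread, that read-through is exactly where the recapitulating comments would get stripped, leaving something like:

```python
def read_config(path: str) -> str:
    """Return the raw contents of the config file at `path`."""
    with open(path) as f:
        return f.read()
```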