Skip to content

More expensive and fun things to do #4

@vessenes

Description

@vessenes

How much does a win matter to an LLM? What's their "cash?"

I don't know. But some proposals out there include a genepool type concept -- more copies of the LLM -> success.

With this in mind, I wonder if winning LLMs could be told there would be a second copy of themselves in the next round if they win. And then you could tell LLMs that one of them is a copy, and either winning will get a copy made next time. Perhaps you say the model type perhaps not.

This opens up a bunch of metastrategy, avenues for lying, ("Yes, I am in fact Claude 3.5!" -- phi4) coordination that makes this a sort of doubly iterated game theoretic evaluation.

This could go with leaving notes for their future self -- you have 100 words to give yourself private advice.

Adding these in would allow us to evaluate how well the LLMs can self-program with external reward, and see if any can create sustained success, or how and when they can do that.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions