Add language translations to card templates #725

Merged
aplaice merged 17 commits into anki-geo:master from matssson:master
Jan 8, 2026

Conversation

Contributor

@matssson matssson commented Jan 1, 2026

Fix #724 and discussion #709 (reply in thread).

This commit adds language translations to card templates to match the language you're using.

It works by adding the translations to src/note_models/translations.csv and then generating the template files from the base template with src/note_models/generate_templates.py. Right now I only have Swedish and English, but we should probably add more translations before we merge this.
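
A minimal sketch of the generation step described above, not the actual contents of generate_templates.py. The function names, the `language_code` column, the token spelling and the sample data are all assumptions made for illustration; here the CSV column header doubles as the token that appears in the base template.

```python
import csv
import io


def parse_rows(csv_text):
    """Parse translations CSV text into a list of row dicts, one per language."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def render_template(base_text, row, key_column="language_code"):
    """Replace every translation token in the base template with the row's value."""
    rendered = base_text
    for column, value in row.items():
        if column != key_column:  # the language code itself is not a token
            rendered = rendered.replace(column, value)
    return rendered


def generate_all(base_text, rows, key_column="language_code"):
    """Return {language code: rendered template} for every row."""
    return {
        row[key_column]: render_template(base_text, row, key_column)
        for row in rows
    }


# Hypothetical data, only for this sketch.
CSV_TEXT = "language_code,_capital_translation\nen,Capital\nsv,Huvudstad\n"
BASE = "<div class='type'>_capital_translation</div>"

templates = generate_all(BASE, parse_rows(CSV_TEXT))
print(templates["sv"])
```

Keeping the rendering as a pure function of (base text, row) makes it easy to regenerate every language in one pass and to diff the English output against the existing templates.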

I tested the builds and everything works, although there aren't any visible changes for languages not in the CSV file. I also didn't touch the experimental builds, they are not affected.

Automatically generate the card templates for each language translation.

Uses a base template and a translation.csv file to generate the anki card templates.
Run the generator in a separate commit so as not to clutter the previous one.
Collaborator

aplaice commented Jan 1, 2026

Wow! Thanks very much for looking into this! This looks great! (The mass of new, generated files is messy, but inevitable.)


The built CrowdAnki files (in build/) are indeed unchanged (compared to master), other than for Swedish (as expected) and for the names of the note templates themselves (as expected).

Unfortunately, we also need the ids of the note templates to differ between languages (e.g. here for Czech, non-extended), i.e. we need (2 × languages) unique, stable ids. This is necessary because otherwise CrowdAnki will recognise the different language templates as the same, leading to issues for people using multiple language decks (the template for the last imported language would overwrite all previous ones).
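
One way to guard the "(2 × languages) unique ids" requirement is a small check over the translation rows. This is only a sketch: the `_deck_id` and `_deck_extended_id` column names are the ones mentioned later in this thread, while `language_code`, the function name and the placeholder id values are assumptions.

```python
def check_unique_ids(rows, id_columns=("_deck_id", "_deck_extended_id")):
    """Raise ValueError if any note model id is reused; return the id count."""
    seen = {}
    for row in rows:
        for column in id_columns:
            note_id = row[column]
            owner = (row["language_code"], column)
            if note_id in seen:
                raise ValueError(
                    f"id {note_id!r} is shared by {seen[note_id]} and {owner}"
                )
            seen[note_id] = owner
    return len(seen)


# Placeholder ids, only for this sketch; real rows would hold UUID strings.
rows = [
    {"language_code": "en", "_deck_id": "id-en", "_deck_extended_id": "id-en-ext"},
    {"language_code": "sv", "_deck_id": "id-sv", "_deck_extended_id": "id-sv-ext"},
]
print(check_unique_ids(rows))  # two languages, two variants each
```

Because the ids must also stay stable between releases, they would have to be stored (for example in the CSV) rather than regenerated on every build.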


On a cursory look, other than the id issue I don't see anything crucial (and given that the build/ files currently conform to expectations I don't expect anything critical). I'll have a closer look ASAP!

but we should probably add more translations before we merge this

Yes, makes sense!

I also didn't touch the experimental builds, they are not affected.

Keeping the experimental note models untranslated (at least for now?) is IMO fine.

Each deck template now gets its own uuid, CR from aplaice.
Contributor Author

matssson commented Jan 1, 2026

Thank you!

I generated a bunch of UUIDs for the different languages, and tested it on my Anki and it works.

The only thing to keep in mind is that if you're like me and already have a deck, you'll get a new notetype with some suffix since the last uuid was the English (and only) one. But you can rename it if you go to Tools > Manage Note Types.

Collaborator

aplaice commented Jan 4, 2026

I've opened a PR to your fork adding the remaining translations. Unfortunately, it was non-trivial, especially for "flag similar to", because our current flag similarity field is in the nominative case (for languages that have cases), without any prepositions, so often the most natural translation didn't work. I've worked around this for some of the translations, by slightly rephrasing and using a colon. I've also added a "full stop" field to handle the different full stop in Chinese.

It would be useful, though, if these were checked by our translators. TBH, though, I feel awkward pinging a dozen people for a single line each... Unfortunately, we don't have a good process for incrementally having our translations verified.


The only thing to keep in mind is that if you're like me and already have a deck, you'll get a new notetype with some suffix since the last uuid was the English (and only) one. But you can rename it if you go to Tools > Manage Note Types.

I think that this only occurred because you already had a notetype called Ultimate Geography [#$LANGCODE]( [Extended]) (presumably from testing the newly generated deck), with the "original" ("English") uuid, so when you imported the new version with a new uuid, but also called Ultimate Geography [#$LANGCODE], Anki had to generate a new name for the new version. Most people will only have the original "Ultimate Geography" note model, so importing the new, translated note model, with a different name and different uuid won't require an automated rename of the new note model. (Existing users of the translated decks will however get a scary-ish pop-up due to the migration of their notes from the old to the new note model!)


Nitpicks

english_base_case = [
('name: Ultimate Geography [EN]', 'name: Ultimate Geography'),
('name: Ultimate Geography [EN] [Extended]', 'name: Ultimate Geography [Extended]')
]

        rendered = apply_replacements(rendered, english_base_case)

Maybe we should rename the current/"base" Ultimate Geography note templates to include [EN]? It would be clearer for people who have both the English version and a translated one. I believe that given that the uuid for the note model will be unchanged CrowdAnki should handle this transparently (without any scary pop-ups) (though I need to test it). I think that new versions of Anki should also be able to handle the renaming for an APKG import of the standard deck.


The raw text replacement system feels brittle (it relies on the tokens not occurring anywhere else in the templates and not being subsets of each other). OTOH it's simple and works (given the full regeneration of the English decks). Probably, if we ever expand the templates, it might need replacing, but for now I think it's OK.
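
To make the brittleness concrete, here is a small sketch of what goes wrong when one token is contained in another. The `apply_replacements` below is an assumed re-implementation based on plain `str.replace`, not the PR's actual function, and the token names and translations are hypothetical.

```python
def apply_replacements(text, replacements):
    """Apply (token, value) pairs in order using plain str.replace."""
    for token, value in replacements:
        text = text.replace(token, value)
    return text


template = "_capital_location_translation / _location_translation"

# The shorter token is contained in the longer one, so replacing it first
# also rewrites the tail of the longer token and leaves "_capital" behind.
short_first = apply_replacements(template, [
    ("_location_translation", "Location"),
    ("_capital_location_translation", "Capital location"),
])

# Replacing the longer token first happens to work, but only by ordering luck.
long_first = apply_replacements(template, [
    ("_capital_location_translation", "Capital location"),
    ("_location_translation", "Location"),
])

print(short_first)
print(long_first)
```

The result depends on replacement order, which is why checking the token set up front is cheaper than debugging a corrupted template later.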


Should we split up the generated templates in src/note_models/templates/generated/ into sub-dirs? On the one hand, they're all generated anyway; on the other it's 135 files in a single dir...


Thanks again!

Contributor Author

matssson commented Jan 4, 2026

I've opened a PR to your fork adding the remaining translations. Unfortunately, it was non-trivial, especially for "flag similar to", because our current flag similarity field is in the nominative case (for languages that have cases), without any prepositions, so often the most natural translation didn't work. I've worked around this for some of the translations, by slightly rephrasing and using a colon. I've also added a "full stop" field to handle the different full stop in Chinese.

Awesome! I'll reply about the stuff relating to the changes in that PR thread, especially about the "flag similar to" case, I think there's a nice way to solve it.

It would be useful, though, if these were checked by our translators. TBH, though, I feel awkward pinging a dozen people for a single line each... Unfortunately, we don't have a good process for incrementally having our translations verified.

Do you guys have a discord or something?

Maybe we should rename the current/"base" Ultimate Geography note templates to include [EN]? It would be clearer for people who have both the English version and a translated one. I believe that given that the uuid for the note model will be unchanged CrowdAnki should handle this transparently (without any scary pop-ups) (though I need to test it). I think that new versions of Anki should also be able to handle the renaming for an APKG import of the standard deck.

I agree with this completely, but I didn't want to be the one to take a decision about that since I'm not a maintainer.

The raw text replacement system feels brittle (it relies on the tokens not occurring anywhere else in the templates and not being subsets of each other). OTOH it's simple and works (given the full regeneration of the English decks). Probably, if we ever expand the templates, it might need replacing, but for now I think it's OK.

One fast and easy improvement would be for the Python script to check the tokens and make sure none of them is a subset of another. If you think that sounds like a good idea, I can add it.

Should we split up the generated templates in src/note_models/templates/generated/ into sub-dirs? On the one hand, they're all generated anyway; on the other it's 135 files in a single dir...

You're right, I can fix that tomorrow!

Collaborator

aplaice commented Jan 5, 2026

Do you guys have a discord or something?

No, we handle everything here.


One fast and easy improvement would be for the Python script to check the tokens and make sure none of them is a subset of another. If you think that sounds like a good idea, I can add it.

Yes, good idea! It should be easy to implement and might save someone some time in the future if/when we expand the template and aren't careful...


One more thing that I forgot, is that I think it'd be cleaner for the script to be in utils/.

aplaice and others added 8 commits January 5, 2026 20:01
For ease of comparison this doesn't yet change the templates
themselves for better handling of colons.
This is primarily for the Chinese colon and full stop.
This causes {{Flag similarity}} to be repeated, but is clearer.

The generated templates are unchanged.
And move the script to utils
Contributor Author

matssson commented Jan 5, 2026

Yes, good idea! It should be easy to implement and might save someone some time in the future if/when we expand the template and aren't careful...

One more thing that I forgot, is that I think it'd be cleaner for the script to be in utils/.

Done! I merged all your changes and fixed the things in the thread - see what you think.

Collaborator

@aplaice aplaice left a comment

Thanks very much for the update! I also greatly appreciate the split into self-contained commits!

fieldnames, rows = load_translations(translations_path)
base_files = [path for path in base_dir.iterdir()]
output_dir.mkdir(parents=True, exist_ok=True)
english_base_case = [
Collaborator

Thanks for pro-actively changing this, but unfortunately we need to restore your fallback code for the English case!


Sorry, this is my fault for being unclear: my question about renaming the English note models had been intended as the start of a discussion rather than an immediate suggestion.

I've now tested import of the deck with a changed note model name (but same uuid) with:

  1. CrowdAnki,
  2. an APKG directly into Anki.

While 1, as expected, works seamlessly (i.e. without a migration pop-up), 2 unfortunately doesn't. You can get it to work (by choosing to merge the note models, as an option before running the import), but it's not quite straightforward.

Hence, given that renaming the note models is only a very mild benefit to people who have several language versions in parallel, and of little to no benefit for people who only have the English deck, but at the very least a minor annoyance for users of AUG on AnkiWeb, I think we should keep the old names for the English note models, for now. (We might end up with some "breaking change" at some point and we can do the rename then.)

(We could just rename the extended version, since it's not imported via APKG/AnkiWeb anyway, but IMO that'd be just confusing.)

Contributor Author

Sorry, I got ahead of myself there. We should probably open a separate issue about it.

@@ -2,18 +2,28 @@
from pathlib import Path


def validate_subtokens(tokens):
Collaborator

Thanks for adding the validation, but I don't think it works in all cases — it'll only check whether one is a prefix of another, but not whether one is a subset of the other. (e.g. it wouldn't catch _location_translation and _capital_location_translation.)

Maybe something like:

    length_sorted_tokens = sorted(tokens, key=len)

    for i, a in enumerate(length_sorted_tokens):
        for b in length_sorted_tokens[i + 1:]:
            if a in b:
                raise ValueError(f"Translation key '{a}' is a subset of key '{b}'!")

Contributor Author

You're right, I blame my tired brain

Collaborator

@aplaice aplaice left a comment

Thanks for the changes! I'll leave this open for a while in case anybody has any other thoughts.

Regarding checking the translations, we'll probably have them verified later in a batch. (In the worst case scenario users will complain :))

Collaborator

@axelboc axelboc left a comment

This looks really cool, thanks!! I've only had a quick glance at the discussions you've had until this point, so apologies if I suggest something that you've already discussed:

  1. I would strongly recommend adding the generated folder to .gitignore.
  2. Building the decks now requires running the new generate_templates script, so pipenv run build (and build_experimental) should be updated to run this new script first.
  3. The generate_templates script should also be mentioned somehow in the Getting started section of the CONTRIBUTING guide.
  4. It might be worth documenting this new templates translation set-up a little bit somewhere in the CONTRIBUTING guide as well.
  5. Very minor: in translation.csv, I would maybe put the _deck_id and _deck_extended_id columns in second and third position respectively (right after the language code), since they have fixed-length values. I would also rename language-tag to language_tag for consistency.
  6. Very minor also: for the tokens in the base templates, like _capital_translation, would it be possible to make them a bit more distinctive, like %%CAPITAL_TRANSLATION%%? The column headers in translation.csv can still be lower-case and without the special prefix/suffix (e.g. capital_translation) — the conversion from capital_translation to %%CAPITAL_TRANSLATION%% could be done dynamically by your script.
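
The dynamic conversion suggested in point 6 could look roughly like the sketch below. The function names and the `language_code` column are assumptions; the delimiters are parameters so the same helper covers both the `%%…%%` spelling proposed here and a double-underscore variant.

```python
def header_to_token(header, prefix="%%", suffix="%%"):
    """Turn a lower-case CSV column header into the token used in templates."""
    return f"{prefix}{header.upper()}{suffix}"


def row_to_replacements(row, skip=("language_code",), prefix="%%", suffix="%%"):
    """Build (token, value) pairs from one CSV row, skipping non-token columns."""
    return [
        (header_to_token(header, prefix, suffix), value)
        for header, value in row.items()
        if header not in skip
    ]


# Hypothetical row, only for this sketch.
row = {"language_code": "sv", "capital_translation": "Huvudstad"}

print(header_to_token("capital_translation"))
print(row_to_replacements(row, prefix="__", suffix="__"))
```

With this split, the CSV stays readable while the tokens in the base templates are distinctive enough that they are unlikely to collide with ordinary template text.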

Since pipenv doesn't have a good cross-platform way of
running multiple commands in sequence (such as simply &&),
we'll have generate_and_build.py be our main entrypoint.
And delete them from git.
Contributor Author

matssson commented Jan 6, 2026

Thanks for the feedback! The main thing I want to comment on is that I tried the naive way of just changing build to:

python utils/generate_templates.py && brain_brew run recipes/source_to_anki.yaml

And then I found out that pipenv is really bad at chaining commands in a cross-platform way. Even though generate_templates would return 0, brain_brew wouldn't run - and we still want to run it only when we successfully generate the templates. An option would be to wrap it in a call to bash, but that wouldn't work on Windows for example.

What I thought was a better idea was to rename the script to generate_and_build.py and have it generate the templates and make calls to brain_brew. It also has the benefit of being a lot more extendable for anything we want to add in the future.
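
A rough sketch of such a single entrypoint, assuming the generation step is an ordinary Python callable. The function name, its parameters and the injectable `run` argument are illustrative and not taken from the real generate_and_build.py; only the brain_brew command line comes from the comment above.

```python
import subprocess

# The build command quoted above, split into an argument list (no shell needed).
BUILD_COMMANDS = [["brain_brew", "run", "recipes/source_to_anki.yaml"]]


def generate_and_build(generate, build_commands, run=subprocess.run):
    """Generate the templates, then run each build command in order.

    If generate() raises, no build command is started. check=True makes
    subprocess.run raise CalledProcessError on a non-zero exit status, so a
    failing build step also stops the sequence.
    """
    generate()
    for command in build_commands:
        run(command, check=True)


# Dry run with a fake runner, so the sketch does not need brain_brew installed.
calls = []
generate_and_build(
    generate=lambda: calls.append("generate"),
    build_commands=BUILD_COMMANDS,
    run=lambda command, check: calls.append(command),
)
print(calls)
```

Passing argument lists instead of a shell string sidesteps the `&&` portability problem, since no shell is involved on any platform.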

I thought I'd wait with changing the documentation until you guys approved of this idea, since that would also have to be added to the docs.


Regarding the tokens, I hope you're happy with the __DUNDER_AND_CAPS__ style; I think it looks a lot better now. I also reorganized the columns but kept the rows in the order the languages were added in, since that's how they show up everywhere else in the codebase.

Collaborator

axelboc commented Jan 6, 2026

Awesome, thanks a lot for the updates! 😊

No worries about the build script, then. 👌 Calling the generate_templates script separately is perfectly fine too. It just means we have to remember to mention it in the Maintainer's guide section of CONTRIBUTING.md, and maybe run it in the CI workflow.

Happy with __DUNDER_AND_CAPS__ as well, thanks! 💯

Contributor Author

matssson commented Jan 6, 2026

No worries about the build script, then. 👌 Calling the generate_templates script separately is perfectly fine too. It just means we have to remember to mention it in the Maintainer's guide section of CONTRIBUTING.md, and maybe run it in the CI workflow.

Sorry for being unclear, I already changed the build script; it's in commit 0c1347c. If you don't like it I can undo it, but I think it works pretty well now.

Another thing I should mention is that if we don't have a build script that does both steps, then we need to track the generated files in git and couldn't .gitignore them - otherwise the GitHub Actions runner would fail the build.

Collaborator

axelboc commented Jan 6, 2026

Sorry for being unclear, I already changed the build script; it's in commit 0c1347c. If you don't like it I can undo it, but I think it works pretty well now.

Oh great, then!

axelboc added the translation (Translating the deck to a new language) and structure (Templates, tags, generated decks, etc.) labels Jan 6, 2026
axelboc added this to the v5.4 milestone Jan 6, 2026
Write them in caps and with __dunder__.

Also reorganize the columns.
And update the section about adding a new translation,
as well as how the builds are generated.
Contributor Author

matssson commented Jan 7, 2026

Oh great, then!

I've now done the last step, which was to add all the new info about generating templates, building, and adding a new language to CONTRIBUTING. When reading through it I realized the proper name for the language tags should be language codes, so I amended that commit.

Other than that I think that's all for this PR unless there's more feedback from the maintainers or I missed something.

Collaborator

@axelboc axelboc left a comment

Brilliant, love the documentation, thanks! It's definitely ready to go on my end. @aplaice want to have one last look and merge?

axelboc removed the translation (Translating the deck to a new language) and structure (Templates, tags, generated decks, etc.) labels Jan 7, 2026
axelboc removed this from the v5.4 milestone Jan 7, 2026
Collaborator

aplaice commented Jan 8, 2026

Thanks very much for the changes and for the updated docs! The dunder caps are indeed much clearer!

I don't have any other comments!

@aplaice aplaice merged commit 64ddd86 into anki-geo:master Jan 8, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Add translations to card templates

3 participants