Add i18n support for the NiceGUI website (focused) #5848

Open
evnchn wants to merge 1 commit into zauberzeug:main from evnchn:i18n-claude-v2
@evnchn
Collaborator

@evnchn evnchn commented Mar 4, 2026

Motivation

Adds internationalization (i18n) infrastructure to the NiceGUI website, starting with the main landing page and core UI — the highest-value surface area for non-English speakers.

Driven by community interest: #4025 (comment)

This is the focused follow-up to #5825, which served as a proof-of-concept and drew valuable early feedback. This PR is deliberately scoped down to address the key concern raised by @falkoschindler: the original was too large (+8 200 lines) to review in good conscience. This version is 15 files, ~670 insertions.

Implementation

Core i18n module (website/i18n.py):

  • t(english) function that returns the translated string for the current language, falling back to English
  • set_language(lang) / get_language() using contextvars.ContextVar for async-safe, per-request language state
  • Internal documentation links are rewritten to include the language prefix (e.g. /documentation/… → /zh/documentation/…)
  • Translations loaded from per-language JSON files in website/translations/
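
A minimal sketch of how a ContextVar-based t() along these lines could look. This is not the code from website/i18n.py: the in-memory TRANSLATIONS table and its contents are hypothetical stand-ins for the per-language JSON files, and link rewriting is left out.

```python
from contextvars import ContextVar

# Hypothetical in-memory table; the PR loads these from website/translations/*.json.
TRANSLATIONS: dict[str, dict[str, str]] = {
    'de': {'Features': 'Funktionen'},
}

# Each asyncio task (and therefore each request) sees its own value.
_language: ContextVar[str] = ContextVar('language', default='en')


def set_language(lang: str) -> None:
    _language.set(lang)


def get_language() -> str:
    return _language.get()


def t(english: str) -> str:
    """Return the translation for the current language, falling back to English."""
    return TRANSLATIONS.get(get_language(), {}).get(english, english)
```

Because the English string doubles as the lookup key, a missing entry or an unknown language simply returns the input unchanged.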

Scope: main_page.py, header.py, search.py, star.py, style.py, examples_page.py, and the documentation intro — the most user-visible surface area, deliberately excluding the full API reference docs

Language-prefixed routes (main.py):

  • Routes registered for each language (e.g. /de/, /zh/documentation/…)
  • Language selector dropdown added to the header
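
To illustrate the URL scheme only (the PR registers explicit routes per language; this is not that code), here is a small sketch that splits a request path into a language and the unprefixed path. The function name and the LANGUAGES tuple are assumptions for the example.

```python
# Assumed list of translated languages; English lives at the unprefixed path.
LANGUAGES = ('de', 'ja', 'ko', 'zh')


def split_language(path: str) -> tuple[str, str]:
    """Return (language, path without the language prefix)."""
    # '/de/documentation/x' -> ['', 'de', 'documentation/x']
    segments = path.split('/', 2)
    if len(segments) > 1 and segments[1] in LANGUAGES:
        rest = '/' + (segments[2] if len(segments) > 2 else '')
        return segments[1], rest
    return 'en', path
```

Matching on the whole first segment means a path such as /design is not mistaken for the German prefix.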

Translation files (website/translations/*.json): de, ja, ko, zh — 70 strings each, covering the complete translated surface

Validation tooling (website/check_translations.py + pre-commit hook):

  • Extracts all t() call keys via AST
  • Validates that every key has an entry in every translation file
  • Exits non-zero if any keys are missing
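
The three steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of website/check_translations.py: the helper names are hypothetical, and only calls of the form t('literal') are recognized.

```python
import ast


def extract_keys(source: str) -> set[str]:
    """Collect the string literal passed as first argument to every t(...) call."""
    keys: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == 't'
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            keys.add(node.args[0].value)
    return keys


def missing_keys(keys: set[str], translations: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    """Map each language to the keys it lacks; an empty dict means full coverage."""
    result: dict[str, set[str]] = {}
    for lang, table in translations.items():
        missing = keys - set(table)
        if missing:
            result[lang] = missing
    return result
```

A pre-commit wrapper would then call sys.exit(1) whenever missing_keys() returns a non-empty dict.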

What is intentionally excluded (vs #5825):

  • Full documentation API reference translation (docstrings, doc.text() prose)
  • Helper scripts (i18n_bootstrap.py, i18n_salvage.py, etc.)
  • The website/translate.csv single-file approach (replaced by per-language JSON)
  • <html lang> / <link rel="alternate" hreflang> SEO tags (can follow separately, see TODO in commit)

Progress

  • I chose a meaningful title that completes the sentence: "If applied, this PR will..."
  • The implementation is complete.
  • If this PR addresses a security issue, it has been coordinated via the security advisory process.
  • Pytests have been added (or are not necessary).
  • Documentation has been added (or is not necessary).

- Add website/i18n.py with ContextVar-based translation function t()
- Add translation files for de, ja, ko, zh with full coverage
- Register language-prefixed routes (e.g. /de/, /zh/documentation)
- Add language selector dropdown to header
- Wrap user-facing strings in t() across main_page, header, search,
  star, style, examples_page, and documentation intro
- Add pre-commit hook (check_translations.py) to validate that all
  t() keys have entries in every translation file
- Rewrite all internal links (not just /documentation) with language
  prefix in translated text

TODO: Add <html lang> attribute and <link rel="alternate" hreflang>
meta tags for SEO (needs coordination with SEO strategy).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@falkoschindler
Contributor

Thanks, @evnchn!
Two things before I'll look into it in more detail:

  • There's a conflict with main.py. Can you, please, look into it?

  • Most multi-line strings have been converted into multiple strings and line breaks have been added. I don't see why. The only diff should be one t( at the beginning and one ) at the end, don't you think?

    Screenshot 2026-03-05 at 09 56 11

@falkoschindler falkoschindler added this to the 3.10 milestone Mar 5, 2026
@falkoschindler falkoschindler added the documentation (Type/scope: Documentation, examples and website) and review (Status: PR is open and needs review) labels Mar 5, 2026
@falkoschindler
Contributor

Oh, if splitting the strings is to handle indentation, we should improve the t() function to auto-dedent the input string.

Comment on lines +68 to +69
if prefix and '](' in text:
    text = text.replace('](/', f']({prefix}/')

This feels like it could go wrong easily.

I believe it's to handle Markdown links ([Example](example)), but I can see it going wrong unpredictably.

At the same time, I don't know of a better way to implement it. Maybe some RegEx?

Collaborator Author

[Example](example) seems like a good idea actually. What's wrong with it? Let's rewrite all links to not use relative-to-/ and we should be good to go.

@MicaelJarniac MicaelJarniac Mar 6, 2026

A few scenarios where this str.replace("](/") approach could break:

1. Markdown images → broken assets
If a translated string contains an image reference:

![Screenshot](/screenshots/demo.png)

It becomes ![Screenshot](/de/screenshots/demo.png) → 404, since static assets don't have language-prefixed routes.

2. Links to non-translated paths (static, API, etc.)

[Download the PDF](/static/nicegui-cheatsheet.pdf)

It becomes [Download the PDF](/de/static/nicegui-cheatsheet.pdf) → 404.

3. Double-prefixing on repeated calls
If t() is ever called on already-translated text, links get double-prefixed: ](/de/docs) → ](/de/de/docs).

A regex that excludes known static prefixes (e.g. /static, /assets, /_) would be safer, or handling link rewriting at the route/template level instead of string-munging the translations.
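
A rough sketch of the regex variant suggested above, assuming the three failure cases listed are the ones to guard against. The function name, the LANGUAGES tuple, and the set of untranslated path segments are hypothetical; this is not code from the PR.

```python
import re

# Assumptions for illustration: which languages exist and which top-level
# path segments have no language-prefixed routes.
LANGUAGES = ('de', 'ja', 'ko', 'zh')
UNTRANSLATED_SEGMENTS = {'static', 'assets'}

# A Markdown link (not an image, hence the negative lookbehind for "!") whose
# target is a root-relative path (a single leading slash).
LINK = re.compile(r'(?<!!)\[([^\]]*)\]\((/(?!/)[^)\s]*)\)')


def prefix_links(text: str, lang: str) -> str:
    """Prefix root-relative Markdown links with /<lang>, skipping unsafe cases."""
    if lang == 'en':
        return text

    def rewrite(match: re.Match) -> str:
        label, path = match.group(1), match.group(2)
        first = path.lstrip('/').split('/', 1)[0]
        if first in LANGUAGES or first in UNTRANSLATED_SEGMENTS or first.startswith('_'):
            return match.group(0)  # already prefixed, or not a translated route
        return f'[{label}](/{lang}{path})'

    return LINK.sub(rewrite, text)
```

Skipping paths whose first segment is already a language code makes the function idempotent, which covers the double-prefixing case.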


This is what my AI agent had to say about this.
I won't say this is certainly going to break, but it feels fragile in my opinion, and would need special care to work correctly.

with ui.button(icon='language').props('flat color=white round').classes('max-[470px]:hidden'):
    with ui.menu().classes('bg-primary text-white'):
        for lang, name in SUPPORTED_LANGUAGES.items():
            ui.menu_item(name, on_click=lambda lang=lang: switch_language(lang))

Lambda inside a loop gives me flashbacks to bugs I introduced, where the value of the variable is always the last iteration of the loop, or something like that.

Not sure if this is the case here, but I'd suggest testing it thoroughly.


I think the bug I coded went something like this:

my_list = ["foo", "bar"]
my_dict = {}

for my_item in my_list:
    def print_item() -> None:
        print(my_item)
    my_dict[my_item] = print_item

my_dict["foo"]()  # bar
my_dict["bar"]()  # bar

So again, double-check and be careful.

Collaborator Author

Hence the lang=lang. Without it, the last lang is used.
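
For reference, a small self-contained comparison of the two binding behaviors discussed in this thread. The function name is made up for the example; the handlers just return the language instead of switching it.

```python
def make_handlers(languages: list[str]):
    """Build two lists of callbacks: late-bound and default-argument-bound."""
    late, early = [], []
    for lang in languages:
        # Looks up `lang` when called, i.e. after the loop has finished.
        late.append(lambda: lang)
        # The default argument is evaluated now, freezing the current value.
        early.append(lambda lang=lang: lang)
    return late, early


late, early = make_handlers(['en', 'de', 'zh'])
print([handler() for handler in late])   # ['zh', 'zh', 'zh']
print([handler() for handler in early])  # ['en', 'de', 'zh']
```

functools.partial(switch_language, lang) would be another way to bind the value at loop time.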

Comment on lines +13 to +19
SUPPORTED_LANGUAGES: dict[str, str] = {
    'en': 'English',
    'de': 'Deutsch',
    'ja': '日本語',
    'ko': '한국어',
    'zh': '中文',
}
@MicaelJarniac MicaelJarniac Mar 5, 2026

This could be automated by scanning the list of translations inside the website/translations directory.

That way, to add a new language, one just has to add a <lang>.json to that folder, without needing to touch this i18n.py file at all, making diffs simpler and avoiding conflicts.

That's more or less what I'm doing in my own web app.


For this, you could either use a lib (like Babel) to go from lang abbreviation (de) to lang name (Deutsch), or you could add a field to the JSON files themselves for the language name, something like "__LANG_NAME": "Deutsch".


The only language that would need to be "hard-coded" to the list would then be English itself, as it doesn't have a JSON file.
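
A possible sketch of the directory-scanning idea from this thread, using the "__LANG_NAME" field variant. The function name and the fallback to the file stem are assumptions; nothing here is taken from the PR.

```python
import json
from pathlib import Path


def discover_languages(directory: Path) -> dict[str, str]:
    """Build a {code: display name} mapping from the <lang>.json files in a directory."""
    # English has no JSON file, so it is the only hard-coded entry.
    languages = {'en': 'English'}
    for path in sorted(directory.glob('*.json')):
        data = json.loads(path.read_text(encoding='utf-8'))
        # Assumed convention: the display name lives under "__LANG_NAME";
        # fall back to the language code if the field is missing.
        languages[path.stem] = data.get('__LANG_NAME', path.stem)
    return languages
```

With this shape, a validation script would need to skip the "__LANG_NAME" entry when comparing translation keys against t() calls.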

@MicaelJarniac

> Oh, if splitting the strings is to handle indentation, we should improve the t() function to auto-dedent the input string.

You're both probably already aware, but Python comes with a textwrap.dedent.

@falkoschindler
Contributor

> You're both probably already aware, but Python comes with a textwrap.dedent.

Yes, but the behavior is slightly different. For textwrap.dedent to work you need to add \ behind the opening '''. NiceGUI elements like ui.markdown or ui.html, on the other hand, automatically ignore this empty line when determining the common indentation. This is the behavior I'd prefer for t().

@MicaelJarniac

> > You're both probably already aware, but Python comes with a textwrap.dedent.
>
> Yes, but the behavior is slightly different. For textwrap.dedent to work you need to add \ behind the opening '''. NiceGUI elements like ui.markdown or ui.html, on the other hand, automatically ignore this empty line when determining the common indentation. This is the behavior I'd prefer for t().

Now that you mentioned it, I finally noticed there's even a comment on the link I sent about this behavior.

I suppose it's fair to want to avoid the \ trick.
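
One way the preferred behavior could be sketched: drop the blank first line that follows the opening ''', then dedent. The function name is hypothetical, stripping trailing whitespace is an extra assumption, and this is not NiceGUI's own implementation.

```python
import textwrap


def dedent_key(text: str) -> str:
    """Normalize a triple-quoted string so it can be used as a translation key."""
    lines = text.split('\n')
    # Ignore the empty line directly after the opening quotes, so no
    # backslash is needed in the source.
    if lines and not lines[0].strip():
        lines = lines[1:]
    # Remove the common indentation; also drop trailing whitespace left by
    # the indented closing quotes (an assumption for this sketch).
    return textwrap.dedent('\n'.join(lines)).rstrip()
```

Applied inside t() before the lookup, this would let a call site wrap an existing indented multi-line string without reflowing it.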

@MicaelJarniac

Does Google (and other search engines) not use the language header to probe pages for multi language content?

As in, if I had a website with /foo, but depending on the user's accepted language header, it displayed translated content, will Google never pick it up?

Because, as far as I can tell, the main reason you're going with /foo and /de/foo is for SEO.

I personally find it nicer for a website to have the same route (always /foo), and provide internationalized content using other mechanisms (cookies, accepted language headers, etc), but I'm super new to this stuff and don't really know how SEO works.

@MicaelJarniac

To answer my own question — no, Google does not use the Accept-Language header to discover translated content. Googlebot crawls with fixed headers and won't vary them to probe for alternate languages.

Google's official guidance is explicit: use different URLs for different languages (subdirectories, subdomains, or separate domains), and link them together with <link rel="alternate" hreflang="..."> tags.

Without distinct URLs:

  • Google only indexes whichever language it gets on the single URL — the rest are invisible to search.
  • You can't use hreflang (it requires distinct URLs per language).
  • Shared links give unpredictable results depending on the recipient's browser.
  • CDN caching becomes fragmented (Vary: Accept-Language).

So the /de/foo approach in this PR is the right call for SEO. However — this PR is currently missing the <link rel="alternate" hreflang="..."> meta tags, which are essential for search engines to actually understand the relationship between the language variants. Without them, Google may not associate /foo and /de/foo as the same page in different languages. The PR's TODO mentions this, but it's worth flagging as a priority before merge.
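
As a sketch of what the missing tags could look like, the helper below generates one alternate link per language plus an x-default entry. The function name, the example.com base URL, and the choice to point x-default at the English page are assumptions, not something decided in this PR.

```python
from html import escape


def hreflang_links(base_url: str, path: str, languages: list[str], default: str = 'en') -> list[str]:
    """Build <link rel="alternate" hreflang="..."> tags for every language variant of a page."""
    links = []
    for lang in languages:
        # The default language is served without a prefix (English at /).
        prefix = '' if lang == default else f'/{lang}'
        href = escape(f'{base_url}{prefix}{path}')
        links.append(f'<link rel="alternate" hreflang="{lang}" href="{href}">')
    # x-default tells crawlers which variant to use when no language matches.
    links.append(f'<link rel="alternate" hreflang="x-default" href="{escape(base_url + path)}">')
    return links


for tag in hreflang_links('https://example.com', '/documentation', ['en', 'de', 'zh']):
    print(tag)
```

Every language variant of a page would need to emit the same full set of tags, including a reference to itself.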

@evnchn
Collaborator Author

evnchn commented Mar 7, 2026

If rewriting the structure is so difficult, can we do /XXXXXX?lang=zh instead?

@MicaelJarniac

Query params (?lang=zh) are technically easier to implement, but they largely defeat the SEO purpose that motivated separate URLs in the first place:

  • Google treats query params inconsistently — they can be ignored, treated as duplicates, or not indexed reliably
  • hreflang tags work best with clean, distinct URL paths
  • Google's own docs explicitly recommend subdirectories, subdomains, or separate domains — not query params
  • Users can't easily tell the language from the URL, and sharing links becomes less intuitive

If SEO is a goal here (and for the NiceGUI website it probably should be), /zh/documentation is the right approach.

@MicaelJarniac

By the way, I'm personally not opposed to /<lang>/<path>, it's just that I wish we didn't need to change the URL for SEO to work.

Also, I know it's likely obvious, but I'm alternating between me and my AI agent answering. If it gets annoying, just let me know.

@falkoschindler
Contributor

Thanks for the work on this, @evnchn — the architecture here (ContextVar-based t(), URL prefix routing, JSON translation files, AST validation) is solid and well thought out.

Since the website has been completely redesigned in the meantime (the monolithic main_page.py was split into 9 component files, header/footer/design system all changed), I think rebasing this would be harder than starting fresh. What do you think — would it make sense to close this and start a new PR that builds on the learnings from this discussion?

Here's roughly what I have in mind:

Scope: Static strings on landing page, header, footer, imprint, examples page, and documentation overview. Documentation content and code examples stay English for now.

Decisions:

  • URL prefix (/zh/...), English at / — as discussed, this is the right call for SEO
  • English fallback when no translation exists
  • JSON files, one per language (website/translations/zh.json)
  • Starting with Chinese (zh)

Incorporating feedback from this PR:

  1. Auto-dedent in t() — wrapping triple-quoted strings should just work without splitting them up; t() will strip the leading blank line and dedent like ui.markdown does
  2. No naive link rewriting — the text.replace('](/', ...) approach breaks images and static assets; we'll handle URL prefixing more carefully
  3. Auto-discover languages — scan translations/ directory instead of hardcoding SUPPORTED_LANGUAGES; display name can live inside the JSON file itself

Thank you to @MicaelJarniac for the thorough review comments that helped shape this direction. Of course, @evnchn, if you'd prefer to rework this PR yourself against the new website structure, that's also welcome — happy to discuss either way!

@evnchn
Collaborator Author

evnchn commented Mar 28, 2026

Whatever it is, it's not in 3.10. I have said:

> Right, with docs getting a huge update, I think for 3.10 we can shift all the SEO/Perf/Accessibility PRs to 3.11. Easier to bisect if something goes wrong, and lightens the review workload for 3.10 as well.

I'm kinda open to perf / accessibility (we already did the keyboard nav), but i18n and SEO is definitely not it. Also the GoogleBot would need its chill time to acclimatize to the redesigned NiceGUI homepage anyways.

@evnchn evnchn modified the milestones: 3.10, Next Mar 28, 2026