Break words per-character for CJK languages #1

Description

@msikma

Hi there. This is a really nice-looking library and it seems to work well. I have a suggestion that might be useful: the wordWrap() function could support CJK line-wrapping rules.

Languages written in CJK characters don't have word wrapping in the Latin sense: a line may break between any two characters, so a word that reaches the end of a line is simply split across lines. This is because they don't separate words with spaces the way Latin text does. (Hangul/Korean is slightly different because it does have spaces, but words are normally still broken per character mid-word.)

So if we take this sentence containing both Latin and Japanese text:

Lorem ipsum dolor sit amet 7月には七夕がある。お願いだから泣かないで, non maximus magna urna maximus massa.

It should be wrapped like this (note: the width is 22 columns; it doesn't quite look correct on GitHub):

----------------------
Lorem ipsum dolor sit
amet 7月には七夕があ
る。お願いだから泣かな
いで, non maximus
magna urna maximus
massa.
----------------------

At the moment, using this package, it's wrapped like this:

----------------------
Lorem ipsum dolor sit
amet
7月には七夕がある。お願いだから泣かないで,
non maximus magna urna
maximus massa.
----------------------

There might be other languages with similar rules; I'm not entirely sure. I do know that emoji get similar treatment in the browser: if you put a bunch of emoji next to one another, they are wrapped per character instead of per word. I don't know whether that applies to all characters of width=2.
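
To make the suggestion concrete, here is a rough sketch of the per-character rule. It's in Python purely for illustration and is not based on this library's code. The function names (`char_width`, `split_units`, `wrap_cjk`) are made up for the example. It assumes that "width=2" means a character whose Unicode East Asian Width is W or F, which is a simplification. It also ignores the rules about which characters may start or end a line, so a line can begin with punctuation such as 。, and a Latin word longer than the width is left on an overlong line of its own.

```python
import unicodedata


def char_width(ch):
    """Column width of one character: 2 for East Asian Wide/Fullwidth, else 1.

    Assumption: W/F is a good enough stand-in for "width=2" here.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def text_width(s):
    """Total column width of a string."""
    return sum(char_width(c) for c in s)


def split_units(word):
    """Split a space-delimited word into break units.

    Each wide character becomes its own unit (a break is allowed on either
    side of it); runs of narrow characters stay together as one unit.
    """
    units, run = [], ""
    for ch in word:
        if char_width(ch) == 2:
            if run:
                units.append(run)
                run = ""
            units.append(ch)
        else:
            run += ch
    if run:
        units.append(run)
    return units


def wrap_cjk(text, width):
    """Greedy wrap: break at spaces, and additionally around wide characters."""
    lines, line = [], ""
    for word in text.split():
        for i, unit in enumerate(split_units(word)):
            # Only the first unit of a word is preceded by a space.
            sep = " " if i == 0 and line else ""
            if line and text_width(line) + len(sep) + text_width(unit) > width:
                lines.append(line)
                line = unit
            else:
                line += sep + unit
    if line:
        lines.append(line)
    return lines


sample = ("Lorem ipsum dolor sit amet 7月には七夕がある。お願いだから泣かないで, "
          "non maximus magna urna maximus massa.")
for wrapped in wrap_cjk(sample, 22):
    print(wrapped)
```

The idea is just that a wide character is treated as a one-character "word" that needs no space before or after it. Everything else wraps at spaces as usual.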
