There is some good discussion in https://what.thedailywtf.com/t/so-about-strings-and-unicode/52250/
Also, Perl 6 seems to handle this in the ideal way, with separate types and treatments for bytes, code points, and graphemes:
https://6guts.wordpress.com/2015/04/12/this-week-unicode-normalization-many-rts/
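
To make the three levels concrete, here is a rough Python sketch (not Perl 6, and not how Perl 6 implements it). The helper `approx_graphemes` is a hypothetical name of mine, and it only glues combining marks onto the preceding character; real grapheme segmentation follows the much larger rule set in UAX #29, so treat this as an illustration only.

```python
import unicodedata


def approx_graphemes(text):
    """Split text into approximate grapheme clusters.

    Assumption: a cluster is a base character followed by any combining
    marks. This ignores most UAX #29 rules (emoji ZWJ sequences, Hangul
    jamo, regional indicators, ...), so it is only a rough approximation.
    """
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
s = "e\u0301"

byte_count = len(s.encode("utf-8"))            # bytes in the UTF-8 encoding
code_point_count = len(s)                      # Python str counts code points
grapheme_count = len(approx_graphemes(s))      # approximate user-perceived characters

# NFC normalization composes the pair into the single code point U+00E9.
nfc = unicodedata.normalize("NFC", s)

print(byte_count, code_point_count, grapheme_count, len(nfc))  # 3 2 1 1
```

The same string has three different "lengths" depending on the level you ask at, and normalization changes the code point count without changing what the user sees. That is the argument for giving each level its own type rather than one string type with a single length.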