
Actually ö, with two little round dots (and ä, ü, and their uppercase counterparts, for that matter), can be either an umlaut (über) or a diaeresis (argüer), depending on context.

The difference is important because some collations should distinguish between the two cases and put them in different places when sorting. Another difference is that in traditional fine typography, different sorts were sometimes used for the two cases: the umlaut should have its dots closer to the letter body than the diaeresis does. Yannis Haralambous writes about these topics in his O'Reilly book Fonts & Encodings, especially about the fact that Unicode can't really express these nuances and, if I remember correctly, that some software uses the order of combining diacritical marks to encode the difference.
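To illustrate the Unicode point, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `marks` and the two sample words are just for illustration. It decomposes each word and lists the combining marks it contains; both the umlaut and the diaeresis come out as the same character, so plain text alone carries no distinction for a collation to act on.

```python
import unicodedata

umlaut_word = "über"       # the dots are an umlaut
diaeresis_word = "argüer"  # the dots are a diaeresis

def marks(word):
    """Return the Unicode names of all combining marks in the NFD form of word."""
    decomposed = unicodedata.normalize("NFD", word)
    # combining() returns a nonzero canonical combining class for combining marks
    return [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]

umlaut_marks = marks(umlaut_word)
diaeresis_marks = marks(diaeresis_word)

# Both lists contain the same single mark name
print(umlaut_marks, diaeresis_marks)
```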



In addition, in some languages a diaeresis on U means the vowel is sounded rather than silent, as opposed to marking an independent syllable. In English, a diaeresis on O marks a separate vowel in a double-vowel spelling where the two vowels traditionally represent one sound (chicken coop vs. chicken farm coöp).


That's interesting! I never thought that the meaning of the two dots should affect the sort order, or that the distinction can't be represented in Unicode.

If there are previous examples of the two rendered differently, I think that makes a pretty good case for extending Unicode.



