CTL/CJK format character previews

As Lior Kaplan demonstrated at LibreOffice 2011 Paris, our format character preview really sucks for CTL and CJK users. If no CTL/CJK text is selected then no CTL sample text is shown, and the CJK sample text is just the fontname itself. Many font names are just Latin text, so they give no indication of what the font looks like in the actual script/language being written in.

e.g. the old dialog for CTL will only preview some Western text if no text is selected, with no attempt to show any sample CTL text, or even the CTL fontname. For CJK it will additionally show the fontname of the CJK font in the preview, which isn’t helpful if the CJK fontname contains no CJK glyphs.

Simply adding the CTL fontname wouldn’t help much, seeing as the fontname is David CLM. So, reusing the preview text from the font drop-down, my first stab at “doing the right thing” currently gives me…

Code for all this is mostly in svtools/source/misc/sampletext.cxx, where there is now some hugely over-engineered set of heuristics to guess the best script a font is tuned for. There are also various functions to generate suitable text depending on whether all we have is the font, the font plus the language, or just the language, and on whether we want a short identifier to classify what script a font might be good at rendering or a longer sequence of sample text for a font preview.
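To give a flavour of the kind of heuristic involved, here is a minimal sketch. It is not the code in sampletext.cxx: the names ScriptInfo, kScripts, guessScript and sampleTextFor, the probe characters and the sample strings are all made up for illustration, and the font is modelled as nothing more than the set of code points it covers. The idea is to prefer a non-Latin script when the font covers it, since most fonts carry Latin glyphs as well, and to fall back to Latin otherwise.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one script: a name, a few probe code points a
// font must cover to count as supporting it, and some preview text.
struct ScriptInfo {
    std::string name;
    std::u32string probes;
    std::u32string sample;
};

// Tiny illustrative table; a real one would need far more scripts.
static const std::vector<ScriptInfo> kScripts = {
    {"Latin",  U"Az",           U"Lorem ipsum"},
    {"Hebrew", U"\u05D0\u05EA", U"\u05D0\u05D1\u05D2\u05D3"}, // alef bet gimel dalet
    {"Han",    U"\u4E2D\u6587", U"\u4E2D\u6587"},
};

// True if the font's coverage contains every probe character of the script.
bool supports(const std::set<char32_t>& coverage, const ScriptInfo& script) {
    assert(!script.probes.empty()); // an empty probe list would match any font
    for (char32_t c : script.probes) {
        if (coverage.count(c) == 0)
            return false;
    }
    return true;
}

// Guess the script a font is meant for: the first supported non-Latin script
// wins, Latin is only a fallback, and nullptr means nothing matched.
const ScriptInfo* guessScript(const std::set<char32_t>& coverage) {
    const ScriptInfo* fallback = nullptr;
    for (const ScriptInfo& script : kScripts) {
        if (!supports(coverage, script))
            continue;
        if (script.name != "Latin")
            return &script;
        fallback = &script;
    }
    return fallback;
}

// Preview text for a font, or an empty string if no known script is covered.
std::u32string sampleTextFor(const std::set<char32_t>& coverage) {
    const ScriptInfo* script = guessScript(coverage);
    return script ? script->sample : std::u32string();
}
```

With this sketch, a font covering both the Latin and the Hebrew probes gets the Hebrew sample text, which is the behaviour wanted for something like David CLM, whose name alone says nothing about the script.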

It’s probably best to drop rendering the fontname in the Western case for the text preview and use some sample text there too, at least for the mixed Western+CTL+CJK case, as it’s confusing to have a font name rendered and some sample text in another font.

After the initial posting, there were some comments about the hideous rendering of the Hebrew text, which appears to be an artefact of using David CLM. Here’s what it looks like with David, i.e. it’s the rendering using that font that misplaces the Nikud, not me. Whether this is an interesting bug in our renderer, or maybe glyph fallback, or the font itself is probably worthy of investigation.

9 Responses to “CTL/CJK format character previews”

  1. Elad Alfassa says:

    The Nikud of ??? ??? ???? seems to be rendered incorrectly. The Kamatz (http://en.wikipedia.org/wiki/Kamatz) should be under the Alef, not under the right side of it.

  2. Elad Alfassa says:

    Err, seems that your WordPress installation changes every Hebrew character I write into question marks.

  3. It’s not directly related to the issue you discuss here, but while you’re at it, consider putting an end to the preposterous practice of dividing all the world’s languages into “Western”, “CJK” and “CTL”. First of all, most people don’t know what “CTL” and “CJK” are–even the people who write in the so-called “complex” languages.

    Much more importantly, these labels are over-generic, badly outdated and just plain wrong. Hebrew and Malayalam are both “complex” according to this division, but they are complex for entirely different reasons and they are usually written using different fonts. However, they may be used in one document, and even in one sentence. Believe it or not, right now i am reading–and making little edits to–such a document myself: it is a grammar book of the Malayalam language written for Hebrew-speaking students, and it uses “western fonts” for phonetic transcription of Malayalam words. There’s no easy way to specify styles for this document and it drives me and its author nuts. There are more documents of this kind than many “western” people might think.

    This dialog should not be oriented to outdated concepts like “CTL” and “Western”, but to languages and scripts. Looking at ISO 15924, ISO 639 and the IANA Language subtag registry is the right way to start with this.

    There is a reasonable way to implement this without giving the users a very long list of languages, but since this comment is getting too long and since this problem pestered me since about 1997, i am going to write a post in my own blog about it.

  4. [...] with Ophira Gamliel; and Lior Kaplan‘s and Caolán McNamara‘s questions about the font selection dialog in LibreOffice. Thank you, Santhosh, Ophira, Lior and Caolán for making me finally write this post, which i [...]

  7. Caolán says:

    Sure, the three categories are effectively arbitrary and bizarre, but they’re not my categories. The problem with changing them is that ODF has the three categories of LATIN, ASIAN, and COMPLEX built into it. OpenXML has three, or four?, equally crazed categories built into it as well, where the division between categories is fuzzy and ill-defined. In LibreOffice/ODF it’s also really annoying that the fontsize and italic/bold are different between script categories as well, not just the fontname, which makes interoperability somewhat painful.

    If I was starting from scratch I’d either not have different categories at all in the first place, or have different categories based on ISO-15924 and share fontsize/fontstyle settings between scripts.

  8. Milan Bouchet-Valat says:

    “some hugely over-engineered set of heuristics to guess the best script a font is tuned”
    Do you think this could be used to hide fonts that are clearly not relevant to the scripts the user is currently typing in? It would be nice to hide all these Chinese, Arabic, etc. fonts that clutter the fonts menu on every Linux distribution (in Fedora, we have these Lohit fonts, plus a few Chinese at the end of the list). I guess the current script(s) can be chosen from the language of the document.

  9. Caolán says:

    It would probably be nice to have another column in the format->character dialog which groups together fonts by the scripts they support, and/or some other natural shared features. There could be some sort of toggle in the drop-down font dialog to only show fonts which support the scripts currently used in the document, or something of that nature. Not working on that myself though.
