On Language Tags and Font Selection
in KOffice
Sivan Toledo
This document proposes adding language tags to text in KOffice applications and explains how KOffice should use the language information.

Many documents contain text in more than one language. This screenshot shows part of an article in an Israeli daily newspaper. As you can see, English names (in this case, names of computer games) are printed in Latin letters, not translated or transliterated into Hebrew. This situation is quite common.

In documents that contain text in two or more languages, it is helpful to tag text with language information, so that applications can determine the language of each piece of text. This is similar to the way applications tag text with a style (e.g., normal or heading).

Documents and document templates can have default languages associated with particular styles, so most of the time users do not need to set the language manually. For example, I should be able to create a template in which all the styles default to Hebrew. When I use this template to create a document, all the text is implicitly tagged as Hebrew, except for parts that I explicitly mark as another language (say, by selecting the text, right-clicking, and choosing a language from a small menu).

Language tagging is useful for several purposes:
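To make the default-language rule described above concrete, here is a minimal sketch in Python (not KOffice code). The names `Style`, `TextRun`, and `effective_language` are hypothetical, and the final fallback to a document-wide default is my assumption; the proposal itself only describes per-style defaults and explicit tags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Style:
    name: str
    # Default language for text in this style, e.g. "he"; None means no default.
    default_language: Optional[str] = None


@dataclass
class TextRun:
    text: str
    style: Style
    # Explicit tag set by the user (e.g. via a right-click menu), if any.
    language: Optional[str] = None


def effective_language(run: TextRun, document_default: str) -> str:
    """Explicit tag wins, then the style's default, then the document default."""
    if run.language is not None:
        return run.language
    if run.style.default_language is not None:
        return run.style.default_language
    # Assumption: a document-wide default exists as a last resort.
    return document_default


# A template whose "normal" style defaults to Hebrew.
normal = Style("normal", default_language="he")
runs = [
    TextRun("שלום", normal),                          # implicitly Hebrew
    TextRun("Some Game Title", normal, language="en"),  # explicitly English
]
languages = [effective_language(r, document_default="en") for r in runs]
```

With a template like this, only the English name needs an explicit tag; everything else inherits Hebrew from the style.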
Therefore, I think that KOffice 1.2 should add language tags to text (in the DTDs) and a mechanism for setting the language of text. I also think that the “view” menu should have an option that marks text according to its language using, say, the background color, to assist in language tagging and to make tagging errors easy to fix.

Although some of the features that language tagging enables may be quite far off for KOffice (e.g., glyph selection using OpenType features), I think it is important to build the tagging mechanism into KOffice as soon as possible. I would also like to note that GNOME/GTK 2.0 applications now use Pango to render text, and Pango uses language- or script-specific rendering engines, so these applications probably already tag text with language.

Finally, I want to show how one selects fonts in MS Word, to convince readers that language tagging is really necessary:
As you can see, for Latin text I chose Georgia, but since Georgia does not support Hebrew, I get to choose another font for non-Latin scripts. StarOffice has a similar font selection dialog for the non-Latin versions.

One may think that the solution is to use fonts with sufficient Unicode coverage for the document, but this is not really a desirable solution, since it restricts font selection too much. I often use Hadassah for Hebrew and Bookman or Raleigh for English, since these fonts look good and look good together, and I would not be able to do so if I had to choose one font.

I propose an interface that is both simpler and more general than the MS Word interface. I propose to leave the font selection menu pretty much as is, but to add a button labeled “add font for other languages”. Only when I click it do I get another tab in the dialog. The new tab is another font selection menu, but with a list in which I can mark languages. The first font is used by default, but for text in the languages that I marked on the second tab, the application uses the second font. This mechanism is better because most users in Latin-script countries will never need to click this button or deal with multiple languages for a style, and because it allows users to define more than two fonts (e.g., for English/Hebrew/Arabic documents, which are common in Israel on packaging and government forms); the MS Word solution does not permit this.

Another note on language tagging in MS Office documents: it is not explicit, and I am not sure exactly how they do it. It is clearly related to the keyboard setting. I think that this would be difficult to do using XKB (which is what people in Israel use for Hebrew/English entry), but it might be possible using KDE’s international keyboard utility. But I think that KOffice should also enable explicit tagging, not only keyboard-related implicit tagging.

Last updated on Sunday, May 05, 2002
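The font-selection proposal above (one default font, plus additional fonts bound to marked languages) can be sketched roughly as follows. This is an illustration in Python, not KOffice code: `StyleFonts`, `add_font_for_languages`, and `font_for` are hypothetical names, the language codes are assumed to be short tags such as "he" and "ar", and "ExampleArabicFont" is a placeholder rather than a real font.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StyleFonts:
    # The font chosen in the ordinary font selection menu.
    default_font: str
    # Language code -> font chosen on an "other languages" tab.
    overrides: Dict[str, str] = field(default_factory=dict)

    def add_font_for_languages(self, font: str, languages: List[str]) -> None:
        """Record one additional tab: a font plus the languages marked for it."""
        for lang in languages:
            self.overrides[lang] = font

    def font_for(self, language: str) -> str:
        """Use the font marked for this language, else the default font."""
        return self.overrides.get(language, self.default_font)


# An English/Hebrew/Arabic style: three fonts, which a two-font dialog cannot express.
fonts = StyleFonts(default_font="Bookman")
fonts.add_font_for_languages("Hadassah", ["he"])
fonts.add_font_for_languages("ExampleArabicFont", ["ar"])  # placeholder font name
```

Here `fonts.font_for("he")` yields Hadassah, while any language without a marked font, such as English, falls back to Bookman. A user who never clicks the button simply has an empty `overrides` table.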