UEFI News and Commentary

Monday, October 26, 2009

UEFI HII (Part 5): Strings API

Last time we learned that strings in a specific language are grouped together in packages and packages are grouped together in package lists. Strings not only have text and a language, but they also have an associated font, font size and font style.

To get to a specific string, you need three things: the package list handle (EFI_HII_HANDLE), the string identifier (EFI_STRING_ID) and the language. Since you are normally using the platform's current language, you generally don't have to worry about the language and can just use the default setting.

There are two sets of string-related protocols: (a) those which add, modify and remove strings from the HII Database and (b) those which create a bitmap using the string's text.

With EFI_HII_STRING_PROTOCOL, you can create new strings, retrieve the string's text, change the string's text and find out which languages are supported.
  • NewString() lets you create a new string within a specific package list and returns a new string identifier. You can optionally specify the language and the font information. If you don't specify the language, the platform's current language is used. If you don't specify a font, then the platform's standard font is used.
  • GetString() retrieves information about a specific string, including the string's text and the associated font. If you don't specify the language, the platform's current language is used.
  • SetString() changes information about a specific string, including the string's text and (optionally), the string's font information. If you don't specify the language, the platform's current language is used.
  • GetLanguages() reports the languages that are supported by a specific package list.
  • GetSecondaryLanguages() reports the regional versions of a specific language supported by a specific package list. For example, if you passed in "en", it might return "en-US", "en-GB" and "en-PH" if there were specific translations for those regional variations of English.
The EFI_HII_FONT_PROTOCOL deals with fonts and we'll return to that subject in a later article, in all its proportional font glory. Right now, we'll just focus on the two functions which deal with strings:
  • StringToImage() draws a string's text onto a bitmap or the screen at a specific location. If the font is not specified, then the platform's standard font is used instead. If the language is not specified, then the platform's current language is used.  This function also allows you to clip the text to a rectangle in a variety of ways, wrap the text, draw transparently and handle multi-line text. Even more useful to sophisticated users, it also reports the positioning of each character in the string in the bitmap so that user-interface code can use it for cursor movement and mouse selection.
  • StringIdToImage() does the same thing, except that you pass in a string identifier and a package list handle.
Next time we'll start looking into the wonderful world of bitmapped fonts.

Tim

Thursday, October 15, 2009

UEFI HII (Part 4): Strings

Up to this point, we've discussed the HII Database and how individual drivers can contribute resources (strings, fonts, images, forms, etc.) in the form of packages to that database. Groups of packages are called package lists. Later the form browser extracts these package lists from the database to use in constructing the user interface for platform configuration and other user-interface tasks.

One type of package that drivers can contribute to the HII Database is the strings package. A string package is a collection of strings associated with a specific language. Each string has a number (called the string identifier) that uniquely identifies it within the package list, and default font information, such as font name, size and style.

Languages
Each string package only contains text in one language. For example, one string package contains English (U.S.), another Korean, another Pilipino and another French. By separating the languages into separate string packages, it is easy to delete support for a particular language: just delete the associated package.

The UEFI firmware maintains the system user-interface language information in two special EFI variables:
  1. PlatformLangCodes. The list of languages that the platform supports.
  2. PlatformLang. The current platform language.
These EFI variables are also used by other UEFI protocols, such as the EFI Driver Configuration protocol, the EFI Driver Diagnostics protocol, the Component Name protocol and the Unicode Collation protocol.

Languages are encoded according to RFC 4646, which specifies two and three letter codes for each language, along with additional modifiers representing a specific geographic location. For example:
  • en-US (English, United States)
  • fr (French)
  • fr-FR (French, France)
  • zh-CN (Chinese, mainland China)
  • sr-CS (Serbian, Serbia/Montenegro)
The rules by which a form browser might substitute an alternate language (say, Portuguese from Portugal if there was no Portuguese from Brazil, or English if there was no French) are specific to the implementation.
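
Since the substitution rules are left to the implementation, here is a minimal sketch of one possible policy, not anything the specification requires. It tries the exact tag, then the bare language, then any regional variant of the same language, then a caller-supplied default. The function names are hypothetical, the semicolon-separated list format is my assumption about how a PlatformLangCodes-style value is laid out, and the comparison is case-sensitive for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the primary subtag: the part of a tag before the first '-'. */
static size_t primary_len(const char *tag, size_t len)
{
    size_t i = 0;
    while (i < len && tag[i] != '-')
        i++;
    return i;
}

/*
 * Scan a semicolon-separated language list for an entry equal to key.
 * With primary_only set, only each entry's primary subtag is compared.
 */
static int find_entry(const char *list, const char *key, size_t key_len,
                      int primary_only, const char **found, size_t *found_len)
{
    const char *p = list;
    while (*p != '\0') {
        size_t len = strcspn(p, ";");
        size_t cmp_len = primary_only ? primary_len(p, len) : len;
        if (len > 0 && cmp_len == key_len && strncmp(p, key, key_len) == 0) {
            *found = p;
            *found_len = len;
            return 1;
        }
        p += len;
        if (*p == ';')
            p++;
    }
    return 0;
}

/*
 * Hypothetical fallback policy. Copies the chosen tag into out and
 * returns 0, or returns -1 if nothing in the list is acceptable.
 */
int choose_language(const char *supported, const char *wanted,
                    const char *fallback, char *out, size_t out_size)
{
    const char *pick = NULL;
    size_t pick_len = 0;
    size_t full, prim;
    int ok;

    assert(supported != NULL && wanted != NULL && out != NULL);
    full = strlen(wanted);
    prim = primary_len(wanted, full);

    ok = find_entry(supported, wanted, full, 0, &pick, &pick_len)   /* pt-BR */
      || find_entry(supported, wanted, prim, 0, &pick, &pick_len)   /* pt    */
      || find_entry(supported, wanted, prim, 1, &pick, &pick_len)   /* pt-*  */
      || (fallback != NULL &&
          find_entry(supported, fallback, strlen(fallback), 0,
                     &pick, &pick_len));

    if (!ok || pick_len + 1 > out_size)
        return -1;
    memcpy(out, pick, pick_len);
    out[pick_len] = '\0';
    return 0;
}
```

With a supported list of "en-US;fr;pt-PT", a request for "pt-BR" would land on "pt-PT" under this policy, and a request for an unsupported language would land on whatever default the caller passes in.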

It is possible to find out which languages are supported by iterating through all of the package lists in the system (using ListPackageLists()) and then using GetLanguages() and GetSecondaryLanguages().

Fonts
Each string is associated with a specific font family, font size and font style. The form browser may choose to use this information, ignore it completely or substitute a similar but different font. So this font information might be considered more of a suggestion, rather than a command. In many cases, firmware implementations may use this information to cull the fonts that are included in the BIOS ROM (for space reasons) so that the font only contains the characters used.

The HII-related protocols (such as HII Font or HII String) will use the font information associated with the string to select the display font unless an alternate is provided by the caller.

There are three font attributes associated with each string.
  • Font Name. The font name is the family of the font (Arial, Helvetica, Times Roman, Courier), which identifies, in broad terms, the visual style of the font.
  • Font Size. This is the cell height, in pixels. To give some perspective, the "standard" UEFI font size is 19 pixels high.
  • Font Style. The font style indicates how the basic font should be modified. The following styles can be described: bold, italic, emboss, outline, shadow, underline and double-underline.
If the form browser does not have access to the exact font specified by a string, it might substitute a different font or it might synthesize one using an algorithm. An example of substitution would be using Helvetica instead of Arial. An example of synthesizing would be if a doubled 12-size font were used for a 24-size font or if the italic style were simulated by shifting the successive lines of a glyph over by one pixel so that it would slant.
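
To make the "doubled font" idea concrete, here is a minimal sketch of pixel doubling. It assumes a made-up glyph layout (one byte per row, one bit per pixel, at most 8 pixels wide), which is not the HII glyph format, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Repeat every bit of an 8-pixel row so that it becomes 16 pixels wide. */
static uint16_t double_bits(uint8_t row)
{
    uint16_t out = 0;
    for (int bit = 0; bit < 8; bit++) {
        if (row & (1u << bit))
            out = (uint16_t)(out | (3u << (2 * bit)));
    }
    return out;
}

/*
 * Synthesize a double-size glyph: each source row is widened and then
 * emitted twice. dst must have room for 2 * src_rows entries.
 */
void double_glyph(const uint8_t *src, size_t src_rows, uint16_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t r = 0; r < src_rows; r++) {
        uint16_t wide = double_bits(src[r]);
        dst[2 * r] = wide;
        dst[2 * r + 1] = wide;
    }
}
```

The result is blocky compared to a font actually drawn at the larger size, which is why a browser would normally prefer a real font when one is available.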

Identifiers
Strings are uniquely identified in the system by a string identifier (EFI_STRING_ID), a package list handle (EFI_HII_HANDLE) and a language. If the system's display language is English ('en-US') and you ask for string #1, you would get string #1 from the English string package. If the system's display language is French ('fr') and you ask for string #1, you would get string #1 from the French string package. Likewise for Japanese ('ja'). As a rule, UEFI drivers don't hand around pointers to null-terminated strings. Instead, they pass around string identifiers and package list handles.
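
Here is a toy model of that three-part key, just to show the idea. It is not the HII Database: the types are stand-ins for EFI_HII_HANDLE and EFI_STRING_ID, the table contents and the lookup_string() name are invented for illustration, and the text is plain char instead of the 16-bit characters real packages use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int            PACKAGE_LIST; /* stand-in for EFI_HII_HANDLE */
typedef unsigned short STRING_ID;    /* stand-in for EFI_STRING_ID  */

typedef struct {
    PACKAGE_LIST handle;
    const char  *language;
    STRING_ID    id;
    const char  *text;
} STRING_ENTRY;

/* Invented sample data: one package list, two languages, two strings. */
static const STRING_ENTRY string_table[] = {
    { 1, "en-US", 1, "Exit"        },
    { 1, "en-US", 2, "Save"        },
    { 1, "fr",    1, "Quitter"     },
    { 1, "fr",    2, "Enregistrer" },
};

/*
 * Look up a string by (package list, identifier, language). Passing
 * NULL for language selects the platform's current language.
 * Returns NULL if there is no such string.
 */
const char *lookup_string(PACKAGE_LIST handle, STRING_ID id,
                          const char *language, const char *platform_lang)
{
    size_t count = sizeof string_table / sizeof string_table[0];

    assert(platform_lang != NULL);
    if (language == NULL)
        language = platform_lang;
    for (size_t i = 0; i < count; i++) {
        const STRING_ENTRY *e = &string_table[i];
        if (e->handle == handle && e->id == id &&
            strcmp(e->language, language) == 0)
            return e->text;
    }
    return NULL;
}
```

The same identifier yields different text depending only on the language, which is why drivers can pass identifiers around without caring which language the user picked.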

Can you actually examine and modify the text? Of course. GetString() retrieves the actual text and SetString() lets you modify it. NewString() lets you create a brand new string, with a unique string identifier.

Encoding
Each string package begins with the standard header (EFI_HII_PACKAGE_HEADER) with the Type set to EFI_HII_PACKAGE_STRINGS. Following the standard header is the string-package-specific header:

typedef struct _EFI_HII_STRING_PACKAGE_HDR {
  EFI_HII_PACKAGE_HEADER Header;
  UINT32 HdrSize;
  UINT32 StringInfoOffset;
  CHAR16 LanguageWindow[16];
  EFI_STRING_ID LanguageName;
//CHAR8 Language[ … ];
} EFI_HII_STRING_PACKAGE_HDR;

The actual string data begins StringInfoOffset bytes from the start of this structure. The LanguageWindow array is used for setting up the default "windows" used for the compression algorithm. More on this later. Language is the null-terminated RFC 4646 language string which identifies which language this package is for. For example, "en-US" or "fr" or "ja". LanguageName is the string identifier of the string that gives the user-readable name of the language this package is for. For example, "English" or "French" or "Japanese". These can be used when presenting choices to a user.
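
As a rough illustration of how code might pick these fields out of a raw package, here is a sketch that reads them by byte offset. Several things are my assumptions and not from the text above: the fields are little-endian with no padding, Length occupies the low 24 bits of the first word, and EFI_HII_PACKAGE_STRINGS has the value 0x04. The view structure and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets implied by the structure above, assuming no padding. */
#define OFF_HEADER            0u  /* Length:24, Type:8          */
#define OFF_HDR_SIZE          4u  /* HdrSize                    */
#define OFF_STRING_INFO       8u  /* StringInfoOffset           */
#define OFF_LANGUAGE_NAME    44u  /* after LanguageWindow[16]   */
#define OFF_LANGUAGE         46u  /* Language[] starts here     */
#define ASSUMED_TYPE_STRINGS 0x04u /* assumed value of EFI_HII_PACKAGE_STRINGS */

typedef struct {
    uint32_t       length;        /* whole package, header included */
    uint32_t       hdr_size;
    uint16_t       language_name; /* string id of the readable name */
    const char    *language;      /* e.g. "en-US"                   */
    const uint8_t *records;       /* first string record            */
} STRING_PACKAGE_VIEW;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *view on success, -1 if the package looks malformed. */
int parse_string_package(const uint8_t *pkg, size_t size,
                         STRING_PACKAGE_VIEW *view)
{
    uint32_t header, length, info;

    assert(pkg != NULL && view != NULL);
    if (size <= OFF_LANGUAGE)
        return -1;
    header = read_le32(pkg + OFF_HEADER);
    length = header & 0x00FFFFFFu;
    info = read_le32(pkg + OFF_STRING_INFO);
    if ((header >> 24) != ASSUMED_TYPE_STRINGS || length > size ||
        info <= OFF_LANGUAGE || info > length)
        return -1;
    /* The language tag must be terminated before the records begin. */
    if (memchr(pkg + OFF_LANGUAGE, '\0', info - OFF_LANGUAGE) == NULL)
        return -1;

    view->length = length;
    view->hdr_size = read_le32(pkg + OFF_HDR_SIZE);
    view->language_name = read_le16(pkg + OFF_LANGUAGE_NAME);
    view->language = (const char *)(pkg + OFF_LANGUAGE);
    view->records = pkg + info;
    return 0;
}
```

Reading by offset sidesteps compiler-specific bit-field and padding rules, at the cost of hard-coding the layout.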

The string information consists of a series of records, which can be broken down into three categories:
  1. String Records. These records assign the current string identifier value to specific string text.
  2. Identifier Records. These records change the current string identifier value.
  3. Font Records. These records describe the fonts used by later strings.
String Records
String records vary along four dimensions:
  1. Use compressed text or uncompressed text. Text can be compressed using the Standard Compression Scheme for Unicode (SCSU), which is described in Unicode's Technical Report #6. Optimized for reducing the number of bytes required to describe Korean, Japanese and Chinese characters, this scheme uses the concept of "windows" of 128 characters that can be selected for a sub-string of characters. The default settings for these windows are specified in the report. But they can also be optimized for the exact strings in the package by altering the values in the LanguageWindow array in the header. Uncompressed text is simply listed as a null-terminated UCS-2 string.
  2. List single string or multiple strings. Strings which have string identifiers which are sequential can be listed in a single record. Or a single string can be listed.
  3. Use the default font or a specific font. Strings which use the default font require fewer bytes to encode because the font is implied.
  4. Use text provided or duplicate text. One of the record types simply implies that the text for the new string is a copy of the text for a string which was previously defined.
As mentioned before, strings are associated with a specific font. However, the fonts can be changed in the middle of a string using a series of special control characters. The character values used fall in the range that the Unicode specification sets aside for private (implementation-specific) use.

For example, characters 0xF7xx (where xx is the font identifier assigned by a font record, below) can be used to switch fonts. Characters 0xF8xx (where xx is the font size) can be used to change just the font's size. Characters 0xF620 and 0xF621 turn bold on and off. Characters 0xF622 and 0xF623 turn italic on and off. Characters 0xF624 and 0xF625 turn underline on and off. Characters 0xF626 and 0xF627 turn emboss on and off. Characters 0xF628 and 0xF629 turn outline on and off. Characters 0xF62A and 0xF62B turn double underline on and off.
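
A small sketch of how a renderer might recognize these control characters while scanning a string. The value ranges are taken straight from the paragraph above; the enum and function names are hypothetical, and anything outside those ranges is simply treated as ordinary text.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    CTRL_NONE,        /* ordinary character      */
    CTRL_SELECT_FONT, /* 0xF7xx, value = font id */
    CTRL_SET_SIZE,    /* 0xF8xx, value = size    */
    CTRL_STYLE_ON,    /* value = STYLE           */
    CTRL_STYLE_OFF    /* value = STYLE           */
} CONTROL_KIND;

/* Order matches the on/off pairs 0xF620/0xF621 through 0xF62A/0xF62B. */
typedef enum {
    STYLE_BOLD,
    STYLE_ITALIC,
    STYLE_UNDERLINE,
    STYLE_EMBOSS,
    STYLE_OUTLINE,
    STYLE_DOUBLE_UNDERLINE
} STYLE;

typedef struct {
    CONTROL_KIND kind;
    unsigned     value;
} CONTROL;

/* Classify one 16-bit character from a string. */
CONTROL classify_char(uint16_t ch)
{
    CONTROL c = { CTRL_NONE, 0 };

    if ((ch & 0xFF00u) == 0xF700u) {
        c.kind = CTRL_SELECT_FONT;
        c.value = ch & 0x00FFu;
    } else if ((ch & 0xFF00u) == 0xF800u) {
        c.kind = CTRL_SET_SIZE;
        c.value = ch & 0x00FFu;
    } else if (ch >= 0xF620u && ch <= 0xF62Bu) {
        unsigned index = ch - 0xF620u;
        /* Even values switch a style on, odd values switch it off. */
        c.kind = (index & 1u) ? CTRL_STYLE_OFF : CTRL_STYLE_ON;
        c.value = index / 2u;
        assert(c.value <= STYLE_DOUBLE_UNDERLINE);
    }
    return c;
}
```

A drawing loop would call this for each character and either update its current font state or draw the glyph.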

Identifier Records
Identifier records are used to adjust the current string identifier value without assigning any string text. This can be useful when there are gaps in the string identifiers. When processing the string records, the current string identifier starts at 1 and is incremented each time a string record is processed. So, normally, the first string is assigned identifier #1, the second #2, etc. But if the identifiers are not sequential (e.g. 1, 2, 16) you can use a skip record so that after the second string, you just skip the next 13 identifiers.
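
The counting rule is easy to get off by one, so here is a minimal sketch of it. The RECORD structure below is a simplified stand-in that I made up for illustration; it is not the actual binary record encoding used in a string package.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { RECORD_STRING, RECORD_SKIP } RECORD_KIND;

/* Simplified, hypothetical record: real packages use a binary encoding. */
typedef struct {
    RECORD_KIND kind;
    const char *text; /* used by RECORD_STRING                      */
    uint16_t    skip; /* used by RECORD_SKIP: identifiers to skip   */
} RECORD;

/*
 * Walk the records and store the identifier given to each string record.
 * Returns the number of identifiers written to ids (at most capacity).
 */
size_t assign_string_ids(const RECORD *records, size_t count,
                         uint16_t *ids, size_t capacity)
{
    uint16_t current = 1; /* the current string identifier starts at 1 */
    size_t assigned = 0;

    assert(records != NULL && ids != NULL);
    for (size_t i = 0; i < count; i++) {
        if (records[i].kind == RECORD_SKIP) {
            current = (uint16_t)(current + records[i].skip);
        } else {
            if (assigned < capacity)
                ids[assigned++] = current;
            current++;
        }
    }
    return assigned;
}
```

Feeding it two string records, a skip of 13 and a third string record would produce identifiers 1, 2 and 16, matching the example above.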

Font Records
The font records must appear before the first instance of a string that uses them. The exception to this is the "default" font, which is initialized to the system's default font. Each font is assigned a number that is only valid within the package, starting with 0 (for the default font) and going upwards. That means there is a theoretical maximum of 256 fonts used within a string package. In addition to the identifier, each font has the usual attributes (name, size, style).

Conclusion
Strings are an important part of any user-interface. The ability and the flexibility to display strings in multiple languages, using a variety of font styles, sizes and families is important in making a rich user interface.

Next time, we begin to delve into the wicked world of HII fonts.

Tuesday, October 13, 2009

UEFI @ Intel IDF.

I just saw that they posted publicly the information from Intel's IDF, including the UEFI track, here. You can see a number of interesting articles from the industry heavyweights, like Dell, IBM, Microsoft and Intel. Not to mention the co-authors of a book (Vincent Zimmer, Mike Rothman) about the UEFI Shell. Good reading.

Tuesday, October 06, 2009

UEFI HII (Part 3)


The UEFI Human Interface Infrastructure (or HII) provides a means for drivers provided by 3rd party hardware and software vendors to expose their configuration settings. Then, a browser or provisioning application can store, restore or change those configuration settings. The configuration settings are encoded as packages. There are several different types of packages defined in the UEFI specification: fonts, strings, images, animations and, most importantly, forms. Each package has the following header:

typedef struct {
  UINT32 Length:24;
  UINT32 Type:8;
//UINT8 Data[…];
} EFI_HII_PACKAGE_HEADER;

This structure contains both the package Type (font, form, image, etc.) and the package Length (including the header), in bytes. Each driver can group these packages (called a package list) and install them in the HII Database using NewPackageList(). Then later an application can find them and display them. The package list has the following header:

typedef struct {
  EFI_GUID PackageListGuid;
  UINT32 PackageLength;
} EFI_HII_PACKAGE_LIST_HEADER;
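
To show how the two headers fit together, here is a sketch that walks a raw package list and collects the type of each package it finds. The assumptions are mine: all fields are little-endian, Length sits in the low 24 bits of each package header, and PackageLength covers the whole list including its own 20-byte header. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_HEADER_SIZE    20u /* 16-byte GUID + UINT32 PackageLength */
#define PACKAGE_HEADER_SIZE  4u /* Length:24 + Type:8                  */

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Step through the packages in a package list, storing up to cap
 * package types. Returns the number of packages found, or -1 if the
 * lengths do not add up.
 */
int list_package_types(const uint8_t *list, size_t size,
                       uint8_t *types, size_t cap)
{
    uint32_t total, offset = LIST_HEADER_SIZE;
    int count = 0;

    assert(list != NULL && types != NULL);
    if (size < LIST_HEADER_SIZE)
        return -1;
    total = read_le32(list + 16);
    if (total < LIST_HEADER_SIZE || total > size)
        return -1;

    while (offset < total) {
        uint32_t header, length;

        if (total - offset < PACKAGE_HEADER_SIZE)
            return -1;
        header = read_le32(list + offset);
        length = header & 0x00FFFFFFu; /* includes the header itself */
        if (length < PACKAGE_HEADER_SIZE || length > total - offset)
            return -1;
        if ((size_t)count < cap)
            types[count] = (uint8_t)(header >> 24);
        count++;
        offset += length;
    }
    return count;
}
```

Because every package carries its own length, a tool can skip over package types it does not understand, which is what makes editing a package list after the build practical.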

So that leaves two big questions:
  1. How do you create the packages and package lists?
  2. How does the driver find them?
Well, the first question is a bit tricky and each BIOS vendor probably answers this question differently. The EDK (published by Intel at http://www.tianocore.org/) uses a script language called Visual Forms Representation (or VFR) that is compiled down into Internal Forms Representation (or IFR). It also specifies strings in special .UNI files, which associate a label with a string and language. At the end of the day, these different package types get built into a binary. And that binary can be packaged four ways:
  1. Built-Into The Driver. In this method, a tool takes the bytes that make up the package list and generates an assembly-language (.ASM) source file, which is then linked together with the rest of the driver. This method makes it easy to find the package list, since it has a normal label that resolves during the linking process.
  2. Separate File. With this method, the binary is packaged up as a normal firmware file with a special file type or, more commonly, a special file name. The driver then searches the firmware volume in which it resides for a file with the special name, loads it into memory and then gets a pointer to the first byte. This method takes a little more work, but it allows the package list to be generated separately. Since the package list can be easily located, it can then be edited at some point after the driver has been compiled but before the final flash image is created. This is very useful when you want to process the package lists without ever touching the source code. For example, if your driver ships supporting 30 languages, but you only have ROM space for 3, you could either recompile the driver or you could just edit the binary information. Or perhaps you want to allow a downstream VAR to substitute their logo for the generic logo. By making the binary form of the package list easy to locate, these changes can be made easily. This method also allows files to be on separate media, such as a disk.
  3. Same File, Separate Section. This method is similar to the method described above, in that the binary information is packaged separately from the EXE. However, here, the binary is included as a separate section in the same file. The Firmware File Specification (either the older Intel Tiano version or the newer UEFI PI version) allows certain file types to be broken up into sections. One section contains the driver itself and, in this case, another section contains the package list. This shares many of the advantages of the previous method, but creates a direct association between the driver and its related forms, fonts, strings and images.
  4. Same File, Resources. This method embeds the binary into the EXE portion of the file as a resource with the resource type "HII" (see LoadImage()). When LoadImage() runs, it looks for the resource and, if it is found, installs a protocol on the image's handle with the GUID EFI_HII_PACKAGE_LIST_PROTOCOL_GUID that contains a pointer to the package list. A driver uses HandleProtocol() on its own image handle to find the pointer. This has an advantage over the previous methods in that it does not require a UEFI PI image or the firmware file system, so it is suitable for pure UEFI drivers or drivers that are loaded from a plug-in card's option ROM.
After finding the resources, the driver simply passes a pointer to the package list into NewPackageList().

Next time we'll look in greater detail at the different types of packages, starting with the strings.

Saturday, October 03, 2009

Phoenix Demos 1 Second UEFI Boot

Several news sources have posted coverage of IDF 2009, where Phoenix (that's who I work for) demonstrated booting a UEFI-based system in 1+ seconds and Windows 7 in 7-10 seconds. You can read about it (for example) here.

The UEFI Shell: Moving The Platform Beyond DOS

Hey, just a plug for a book that I co-wrote with three other UEFI experts (Mike Rothman, Vincent Zimmer and Bob Hale) about the UEFI Shell. You can read an excerpt from the book on the Intel Press web site here.