UEFI News and Commentary

Wednesday, October 31, 2012

HOW TO: Disassembling the UEFI HII Database (Part 3)

Now we'll start to get into the real details. The next section of code disassembles the Forms package. The Forms package is encoded as a series of variable-length data structures called opcodes. This encoding is called Internal Forms Representation or IFR.


Each opcode has the following structure:

First, there is the Opcode, which is an enumerated value that specifies what type of object is being described. Then there is a single Scope bit, which describes whether there are any nested objects. Then there is the Length of the whole opcode (in bytes), including the opcode header and any Optional Opcode Data. Some opcodes do not have any optional data. Others always have it. In all cases, the next opcode is always Length bytes from the current opcode, and the last byte of the last opcode will always align with the end of the Forms package.
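As a rough illustration of the "just skip it" property, here is a minimal sketch in plain C. It assumes the header layout of EFI_IFR_OP_HEADER from the UEFI specification (one opcode byte, then a byte holding Length in the low seven bits and Scope in the top bit). The names IfrOpHeader, IfrReadHeader and IfrCountOpcodes are made up for this example and are not part of the disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of the two-byte IFR opcode header (hypothetical type). */
typedef struct {
  uint8_t OpCode;  /* what kind of object this opcode describes */
  uint8_t Length;  /* size of the whole opcode in bytes, header included */
  uint8_t Scope;   /* nonzero if nested opcodes follow */
} IfrOpHeader;

/* Decode the header at p. Byte 1 holds Length in bits 0-6, Scope in bit 7. */
static IfrOpHeader IfrReadHeader(const uint8_t *p) {
  IfrOpHeader h;
  h.OpCode = p[0];
  h.Length = (uint8_t)(p[1] & 0x7f);
  h.Scope = (uint8_t)(p[1] >> 7);
  return h;
}

/* Walk the opcode stream without interpreting any opcode.
   Returns the number of opcodes, or -1 if the stream is malformed. */
static int IfrCountOpcodes(const uint8_t *buf, size_t size) {
  size_t off = 0;
  int count = 0;

  assert(buf != NULL);
  while (off < size) {
    IfrOpHeader h;
    if (size - off < 2) {
      return -1;                       /* not enough room for a header */
    }
    h = IfrReadHeader(buf + off);
    if (h.Length < 2 || (size_t)h.Length > size - off) {
      return -1;                       /* opcode overruns the package */
    }
    off += h.Length;                   /* next opcode is Length bytes on */
    count++;
  }
  return count;                        /* last opcode ended exactly at size */
}
```

The walker never looks at OpCode at all, which is the point: Length alone is enough to get from one opcode to the next.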

Pretty simple. Even if you don't know what an opcode does or what it means, you can just skip it. But there are two wrinkles: GUIDed opcodes and scoped opcodes.

GUIDed Opcodes

Opcode 0x5f does not have any specified meaning, but it does have a specified structure.

This was designed as a get-out-of-jail-free card for forms browsers, so that they could implement extended functionality without risking compatibility problems. Each vendor that would like to provide extended functionality simply creates its own GUID and then modifies its browser to do something different. Other browsers will not recognize the GUID and will simply skip the opcode.

In fact, this capability is already used in the EDK2 implementation of UEFI found on tianocore.org. If you are interested, look at MdeModulePkg/Include/Guid/MdeModuleHii.h.
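To make the "recognize the GUID or skip" idea concrete, here is a small sketch. It assumes the layout the UEFI specification gives for the GUIDed opcode: the two-byte opcode header, followed by a 16-byte GUID, followed by any vendor data. The function name IfrMatchGuidOp and the constants are invented for this example, and the caller is assumed to have already checked that the opcode fits inside the package.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IFR_GUID_OP  0x5f  /* GUIDed opcode value */
#define IFR_HDR_SIZE 2     /* opcode byte + length/scope byte */
#define GUID_SIZE    16

/* If op is a GUIDed opcode carrying the given GUID, return a pointer to the
   vendor data that follows the GUID and store its size in *data_len.
   Otherwise return NULL, which tells the caller to skip the opcode. */
static const uint8_t *IfrMatchGuidOp(const uint8_t *op,
                                     const uint8_t guid[GUID_SIZE],
                                     size_t *data_len) {
  size_t length;

  assert(op != NULL && guid != NULL && data_len != NULL);
  length = (size_t)(op[1] & 0x7f);
  if (op[0] != IFR_GUID_OP || length < IFR_HDR_SIZE + GUID_SIZE) {
    return NULL;                       /* not a well-formed GUIDed opcode */
  }
  if (memcmp(op + IFR_HDR_SIZE, guid, GUID_SIZE) != 0) {
    return NULL;                       /* some other vendor's extension */
  }
  *data_len = length - IFR_HDR_SIZE - GUID_SIZE;
  return op + IFR_HDR_SIZE + GUID_SIZE;
}
```

A browser that does not know the GUID gets NULL back and moves on by Length bytes, exactly as it would for any other opcode it does not understand.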

Scoped Opcodes

Then there are scoped opcodes. The Scope bit in the opcode header says that all opcodes which follow are nested within--or, in the scope of--this opcode until a matching END opcode (0x29) is found. Theoretically, there is no limit to the number of levels of nesting, but practically it is limited to 10 or so.

When one opcode is nested in (or in the scope of) another opcode, it is called a child opcode and the other is called its parent opcode. Child opcodes somehow modify or augment the parent opcode. For example, question-style opcodes (numeric, string, etc.) are found in the scope of a form opcode.

When parsing, it is important to keep track of the different scopes, because some opcodes can be used in different contexts. For example, an image opcode (which provides a bitmap) can be found as a child of a form set opcode, a form opcode, various statement opcodes and the one-of-option opcode.
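Here is one way the scope bookkeeping could look, as a sketch only. It walks the opcode stream with a small fixed-size stack and reports which opcode is the parent of the opcode at a given offset. The function name IfrParentOpcode, the return codes and the depth cap of 16 are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IFR_END_OP    0x29
#define IFR_MAX_DEPTH 16    /* arbitrary cap chosen for this sketch */
#define IFR_NO_PARENT (-1)  /* the opcode is at the top level */
#define IFR_MALFORMED (-2)  /* bad stream, or target is not an opcode start */

/* Return the opcode value of the parent of the opcode at offset target. */
static int IfrParentOpcode(const uint8_t *buf, size_t size, size_t target) {
  uint8_t parents[IFR_MAX_DEPTH];      /* stack of open parent opcodes */
  int depth = 0;
  size_t off = 0;

  assert(buf != NULL);
  while (off < size) {
    uint8_t opcode, length, scope;
    if (size - off < 2) {
      return IFR_MALFORMED;
    }
    opcode = buf[off];
    length = (uint8_t)(buf[off + 1] & 0x7f);
    scope = (uint8_t)(buf[off + 1] >> 7);
    if (length < 2 || (size_t)length > size - off) {
      return IFR_MALFORMED;
    }
    if (off == target) {
      return depth > 0 ? parents[depth - 1] : IFR_NO_PARENT;
    }
    if (opcode == IFR_END_OP) {
      if (depth == 0) {
        return IFR_MALFORMED;          /* END without a scoped opcode */
      }
      depth--;                         /* leave the current scope */
    } else if (scope) {
      if (depth == IFR_MAX_DEPTH) {
        return IFR_MALFORMED;          /* nesting deeper than the cap */
      }
      parents[depth++] = opcode;       /* enter a new scope */
    }
    off += length;
  }
  return IFR_MALFORMED;                /* target was never reached */
}
```

With this, the same image opcode (0x04) can be told apart depending on whether its parent is a form set (0x0e) or a form (0x01).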


Form Packages are parsed by the function UefiHiiParseFormPkg(), which creates a Form Package object and then goes into a big loop that processes all of the opcode structures, one by one. When processing the opcodes, one of three things happens:
  1. A new container is created for the object that opcode represents
  2. An existing container is updated with new information or
  3. The Form Package parsing state is updated.
Here is the function with the less interesting bits hidden for clarity:
UefiHiiParseFormPkg (
  ...other parameters hidden...
  IN UINT32 Size,
  OUT SYS_OBJ **PkgData
  )
{
  UINT8 *OpData;
  ...other local variables hidden...

  ...do error checking on the input...
  OpData = (UINT8 *)Op;
  *PkgData = NULL;

  // Create the Form Package container.
  FormPkgP = SysNew (UefiHiiFormPackageP);
  if (FormPkgP == NULL) {
    ...handle the allocation failure...
  }
  *PkgData = &FormPkgP->Obj;
  FormPkgP->Pkg = PkgP;                 // track the parent package.
  ...initialize the Form Package parsing state to defaults...

  while (Size >= sizeof (EFI_IFR_OP_HEADER) && Size >= Op->Length) {
    S.OpOffset = (UINT32)((UINT8 *)Op - OpData);
    ...dump out debug information...

    // Call the opcode's start-of-parse function, if it has one.
    if (UefiFormPkgParse[Op->OpCode].OpCodeName != NULL) {
      if (UefiFormPkgParse[Op->OpCode].StartParse != NULL) {
        if (Op->Length < UefiFormPkgParse[Op->OpCode].OpCodeMinSize) {
          SYSINFO (
            "S2006 : 0x%08x : '%s' : Opcode size error. "
            "Expected at least %d bytes. Found %d bytes.\n",
            ...
            UefiHiiDumpFormOp (Op),
            ...
            );
        } else {
          UefiFormPkgParse[Op->OpCode].StartParse (Op, &S);
        }
      }
    } else {
      SysTrace ("Unknown IFR Opcode: 0x%02x\n", Op->OpCode);
    }

    // If the opcode has scope, then push the current parent opcode pointer on
    // the stack. If the opcode is an IFR END opcode, then process the end of
    // scope and pop the current parent opcode pointer from the stack.
    // There are some opcodes that *could* have scope, but did not this time.
    // We treat this as if we had found an END opcode, so that clean up is
    // consistent.
    if (Op->OpCode == EFI_IFR_END_OP) {
      if (Op->Scope) {
        SYSINFO ("END: Opcode cannot have Scope bit set. Ignored\n");
      }
      if (SysListIsEmpty (&S.Scopes)) {
        SYSINFO ("END: Unexpected without matching Scoped opcode.\n");
        goto exit;
      }

      TempOp = S.ParentOp;
      S.ParentOp = (EFI_IFR_OP_HEADER *)SysListRemoveTail (&S.Scopes);

      if (UefiFormPkgParse[TempOp->OpCode].OpCodeName != NULL &&
          UefiFormPkgParse[TempOp->OpCode].EndParse != NULL) {
        s = UefiFormPkgParse[TempOp->OpCode].EndParse (TempOp, &S);
        if (EFI_ERROR (s)) {
          goto exit;
        }
      }
    } else if (Op->Scope) {
      SysListAddTail (&S.Scopes, S.ParentOp);
      S.ParentOp = (EFI_IFR_OP_HEADER *)Op;
    } else {
      if (UefiFormPkgParse[Op->OpCode].OpCodeName != NULL &&
          UefiFormPkgParse[Op->OpCode].EndParse != NULL) {
        s = UefiFormPkgParse[Op->OpCode].EndParse (Op, &S);
        if (EFI_ERROR (s)) {
          goto exit;
        }
      }
    }

    // Move to the next opcode.
    Size -= Op->Length;
    Op = (EFI_IFR_OP_HEADER *)((UINT8 *)Op + Op->Length);
  }

  ...clean up parsing data structures...
  return EFI_SUCCESS;
}

There are three key data structures used in this function:
  1. S. S is a structure of type UEFI_HII_FORM_PKG_STATE, which contains the current parsing state. It keeps track of the current scope and important current objects, like the current question or the current form. It gets passed around to the parsing functions and updated as the opcodes are processed. We will talk about this a bit more, below.
  2. UefiFormPkgParse[]. This array contains one entry for every possible opcode value, from 0x00 to 0xff. Each entry contains a pointer to the name of the opcode, the minimum size of the opcode, a pointer to the function to call when an opcode is first processed and a pointer to the function to call when an opcode's scope is closed. For the purposes of this function, an opcode's scope is closed when (a) it has the Scope bit set and then a matching END opcode is found or (b) it does not have the Scope bit set. We do it this way because many opcodes can have child opcodes or not, but some processing has to wait until all child opcodes (if any) have been processed.
  3. FormPkgP. Pointer to the Form Package container object. This is the pointer that is placed into the PkgDataP member of the current Package container for Forms Packages.   
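For readers who have not built a table like UefiFormPkgParse[] before, here is a stripped-down sketch of the idea. The field names OpCodeName, OpCodeMinSize, StartParse and EndParse come from the function above. Everything else (the entry layout, the ParseState type, the handler signatures, the table contents and the DispatchStart helper) is an assumption made for this example and is not the real table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the parsing state; it only counts handler calls here. */
typedef struct {
  int Started;
  int Ended;
} ParseState;

typedef int (*OpHandler)(const uint8_t *op, ParseState *s);

typedef struct {
  const char *OpCodeName;   /* NULL marks an opcode value we do not know */
  uint8_t OpCodeMinSize;    /* smallest legal Length for this opcode */
  OpHandler StartParse;     /* called when the opcode is first seen */
  OpHandler EndParse;       /* called when the opcode's scope closes */
} OpParseEntry;

static int FormStart(const uint8_t *op, ParseState *s) {
  (void)op;
  s->Started++;
  return 0;
}

static int FormEnd(const uint8_t *op, ParseState *s) {
  (void)op;
  s->Ended++;
  return 0;
}

/* One slot per opcode value; slots not listed are zero, so their
   OpCodeName is NULL and they are treated as unknown. */
static const OpParseEntry OpParse[256] = {
  [0x01] = {"FORM", 6, FormStart, FormEnd},
  [0x29] = {"END", 2, NULL, NULL},
};

/* Returns 1 if a handler ran, 0 if the opcode was skipped (unknown or no
   handler), and -1 on a size error or a handler failure. */
static int DispatchStart(const uint8_t *op, ParseState *s) {
  const OpParseEntry *e;
  uint8_t length;

  assert(op != NULL && s != NULL);
  e = &OpParse[op[0]];
  length = (uint8_t)(op[1] & 0x7f);
  if (e->OpCodeName == NULL || e->StartParse == NULL) {
    return 0;
  }
  if (length < e->OpCodeMinSize) {
    return -1;
  }
  return e->StartParse(op, s) == 0 ? 1 : -1;
}
```

Because the table is indexed directly by the opcode byte, an unknown opcode costs nothing more than a NULL check, and supporting a new opcode means filling in one more slot.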


This structure holds the current parsing state:

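typedef struct _UEFI_HII_FORM_PKG_STATE {
  UINT32 OpOffset;                     // offset of the current opcode in the package

  EFI_IFR_OP_HEADER *ParentOp;         // parent opcode, or NULL at the top level

  SYS_LIST_O DisableIfP;               // active disable-if expressions
  SYS_LIST_O SuppressIfP;              // active suppress-if expressions
  SYS_LIST_O GrayOutIfP;               // active gray-out-if expressions

  UEFI_HII_FORM_SET_P *FormSetP;       // current form set container, or NULL
  UEFI_HII_FORM_P *FormP;              // current form container, or NULL
  UEFI_HII_STMT_P *StmtP;              // current statement container, or NULL
  UEFI_HII_OPTION_P *OptionP;          // current one-of option container, or NULL
  UEFI_HII_DEFAULT_P *DefaultP;        // current default container, or NULL

  SYS_LIST Scopes;                     // stack of parent opcode pointers
  SYS_LIST_O ExprP;                    // current expression stack
  SYS_LIST_O ValueP;                   // results of the most recent value opcode
} UEFI_HII_FORM_PKG_STATE;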

The OpOffset field records the offset of the current opcode from the beginning of the forms package. This is used to help display errors or debug information. ParentOp points to the parent opcode whose scope contains the current opcode, or is NULL if the current opcode is at the top level. FormPkgP points to the Form Package container associated with the forms package being parsed.

The next three object lists are used to hold expressions for disabling, suppressing or graying-out various other IFR objects. In IFR, these are actually parent objects and objects like questions and statements and one-of-options are inside their scope. But I prefer to think of these expressions as attributes of the question, so what I do is keep a running list of all of the active parent expressions. Then, when I find a question or a statement or some other IFR object, I make a copy of the active expressions in my container. This is not only my preference (to make these attributes), but it also simplifies some sorts of parent-child error checking.

The next five members point to the containers for important objects that are currently active. For example, once we process the form set opcode and create the Form Set container, we place the pointer in FormSetP. When we leave the form set opcode's scope, FormSetP is set to NULL. It so happens that forms must be in form sets, statements must be in forms and defaults/options must be in certain statements.

The Scopes list is a first-in-last-out stack that contains the pointers to parent IFR opcodes. Each time we enter a scope, the current parent opcode (ParentOp) is pushed onto the stack, and each time we exit a scope, the top element of the stack is popped into ParentOp.

The ExprP object list contains the current expression stack. Expressions in IFR operate on an expression stack and are encoded in postfix notation (operands are encountered before operators). I prefer to store these in a tree structure instead (that is, an operator with zero or more operands). So, when we are parsing expressions, we pop zero or more expression containers from the expression stack, attach them as operands to the newly created expression container, and then push that new container.
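The pop-then-push step is easier to see in code. The sketch below turns a flat list of expression opcodes into a tree using an explicit stack. All of the names here (ExprToken, ExprNode, ExprBuildTree, the EXPR_* values) are invented for the example, the opcode values are stand-ins rather than real IFR opcodes, and each token is assumed to already know how many operands it takes.

```c
#include <assert.h>
#include <stddef.h>

#define EXPR_MAX          32  /* most tokens this sketch will accept */
#define EXPR_MAX_OPERANDS 3

/* Stand-in opcode values, not real IFR opcode numbers. */
enum { EXPR_CONST = 1, EXPR_EQUAL = 2, EXPR_NOT = 3 };

/* One expression opcode as it appears in the flat encoding. */
typedef struct {
  int OpCode;
  size_t Operands;            /* how many operands this opcode consumes */
  int Value;                  /* only meaningful for EXPR_CONST */
} ExprToken;

/* One node of the resulting tree: an operator plus its operands. */
typedef struct ExprNode {
  int OpCode;
  int Value;
  size_t ChildCount;
  struct ExprNode *Child[EXPR_MAX_OPERANDS];
} ExprNode;

/* Build a tree from count tokens, using pool[0..count-1] as node storage.
   Returns the root, or NULL if the token list is not a single valid
   expression. */
static ExprNode *ExprBuildTree(const ExprToken *tok, size_t count,
                               ExprNode *pool) {
  ExprNode *stack[EXPR_MAX];
  size_t depth = 0;
  size_t i, k;

  assert(tok != NULL && pool != NULL);
  if (count > EXPR_MAX) {
    return NULL;
  }
  for (i = 0; i < count; i++) {
    ExprNode *n = &pool[i];
    if (tok[i].Operands > EXPR_MAX_OPERANDS || tok[i].Operands > depth) {
      return NULL;                     /* not enough operands on the stack */
    }
    n->OpCode = tok[i].OpCode;
    n->Value = tok[i].Value;
    n->ChildCount = tok[i].Operands;
    for (k = 0; k < EXPR_MAX_OPERANDS; k++) {
      n->Child[k] = NULL;
    }
    /* Pop the operands; the last one pushed becomes the last child. */
    for (k = n->ChildCount; k > 0; k--) {
      n->Child[k - 1] = stack[--depth];
    }
    stack[depth++] = n;                /* push the new container */
  }
  return depth == 1 ? stack[0] : NULL; /* exactly one result must remain */
}
```

Feeding it the tokens for "constant 1, constant 2, EQUAL, NOT" gives a NOT node whose only child is an EQUAL node with the two constants under it, which is the tree form described above.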

The ValueP object list contains the results of the most recent IFR value opcode, which will then be attached either to a one-of container or a statement container.


This structure holds information about the Form Package:

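typedef struct _UEFI_HII_FORM_PACKAGE_P {
   SYS_OBJ Obj;            // object header; its address is returned in *PkgData

   SYS_LIST_O FormSets;    // form set containers, in the order encountered
} UEFI_HII_FORM_PACKAGE_P;

typedef UEFI_HII_FORM_PACKAGE_P UefiHiiFormPackageP;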

The top-most encoded IFR objects in a form package are always form sets. So FormSets is an object list containing form set containers for all IFR form set objects, in the order in which they were encountered in the Form Package.


Now that we've made it to actually parsing opcodes, things are pretty straightforward. From here on, each thing encountered in the IFR either gets its own container object or modifies a previously existing container object. There are containers for form sets, forms, statements, one-of options, values, expressions, variable stores, default stores and defaults. In the next article, we will delve into form sets and forms.
