UEFI: Writing an IFR Assembler for UEFI

This is part 4 of Writing an IFR Assembler for UEFI.

To set up the IFR Assembler, follow the instructions in Part 1.
To learn about the higher-level parsing process of the IFR Assembler, view Part 2.
To learn about the token-parsing process of the IFR Assembler, view Part 3.

This post will walk you through the expression-parsing process of the IFR Assembler.

Understanding Expression Hierarchy

Each expression is a series of operators and operands. Operators are actions that can be performed on values. These actions include addition and subtraction, multiplication and division, and comparisons along with many other actions. Operands are all of the values that the operators act upon.

In any mathematical equation, there is an order of precedence. Order of precedence is the order in which the different operators in an equation are executed. This is why multiplication is done before addition in every equation. Expressions in C, like mathematic equations, have their own order of precedence to organize the order in which the operators are processed.

To see the complete order of precedence for the C programming language, click here.

The Expression-Parsing Process

Part 2 of this series discussed ParseOperand(), which calls a different parsing function depending on what type of operand is found.

If an expression is found, ParseExprOperand() is called, and ParseExprOperand() calls ParseExpr().

ParseExpr() is the beginning of a large recursive loop. Before anything is even parsed, ParseExpr()
calls ParseExpr2(), and ParseExpr2() calls ParseExpr3(), and this process goes on until ParseExpr12().

The functions ParseExpr() through ParseExpr12() determine the order of precedence for the expression parsing. ParseExpr12() hold the operands and operators with the highest precedence, and ParseExpr() holds those with the lowest. These functions tell the parser what to do with the each operator and operand and what type of token to expect next. See table 4.1 for a full list of operands and operators parsed by each function.

Table 4.1 - Operands and Operators Parsed by Each Function

When operators within an expression are parsed, they are turned into their own sub-expression. Depending on the number of operands that the operator requires, a different function is called. For example, if the operator has only one operand, like ++ (T_P__PLUS_PLUS), it is parsed with the function CreateExprUnary().

On the other hand, if the operator is like + (T_P__PLUS), which has two operands, the function CreateExprBinary() is called.

If the parser finds a question mark (T_P__QUESTION), that means that there is a conditional statement involved (EXPR_O_CONDITIONAL). Conditional statements are the only trinary expressions, which requires three operands, so if a question mark is found, CreateExprTrinary() is called. These functions create expressions for the operators and expect a specific number of operands.

If the operands in the expression are unsigned integers (T_P_UINT), CreateExprUint() is called. This function creates an expression from that integer.

If the operand is either true (T_P_TRUE) or false (T_P_FALSE), the function CreateExprBoolean() is called to make the expression with a Boolean response.

If the operand found turns out to be an IFR opcode, it is treated like a function and function parameters are added to the operand. These operands are found within the parentheses of the opcode and are not written as a separate expression. Instead, their information is recorded as operands of the opcode's expression. A different function is called depending on the number of operands the opcode can hold. If the opcode has a set number of operands, then CreateFn() is called.

If the number of operands within an opcode is variable, then CreateVarFn() is called. In each of these function, all the operands of the opcode are written into the expression with the opcode itself.

If found within an expression, a left parenthesis (T_P_LPAREN) represents the beginning of another expression nested within the first. For example, in the expression x * (y - z), the section y - z can stand alone as its own expression since it is inside parentheses. Because the expression within the parentheses is, in fact, a separate expression, ParseExpr() is called recursively to parse the nested expression until it finds the expression's end, which is marked by a right parenthesis (T_P_RPAREN). After being parsed, the entire nested expression is returned and is treated as a single operand. In our example, the (y - z) would be treated as a single operand, w, so that the equation would read x * w.

The Expression Tree

Once the parser begins going through the expression, it begins to organizes the operands and operators by order of precedence. The way they are organized can be described as a type of expression tree. The higher the precedence of a specific token, the closer it will be to the leaves of the expression tree. Inversely, the lower the precedence of a token, the closer the token will be to the roots of the tree. In the expression a * b + c + d, the expression-parser would begin parsing the expression. First, the parser would point to a. Since a is an operand and operands receive the highest precedence, a will be set as a leaf in the tree.

The parser would then point to the asterisk, which serves as the multiplication operator. Since the asterisk is an operator, it has a lower precedence than a, so the multiplication operator is moved up a level. The multiplication operator requires two operands, each one represented by a branch coming off of it in the tree. a is set to one of operands of the multiplication operator.

When b is reached, the parser sees that the multiplication operator still has another required operand, so b is set as that operand.

The parser then moves to the first plus sign, which is the addition operator. Since it is not an operand, the plus sign is moved up a level. Since the plus sign has less precedence than the asterisk, it is moved a level higher than the asterisk. The asterisk is set as one of the operands of the plus sign. The next token, c, is set as the other operand for the plus sign.

The parser moves on to the second plus sign. Since the plus sign has less precedence then an operand, the plus sign moves up a level from c. Though both plus signs are on the same level of precedence, precedence goes left to right for plus signs, so the first plus sign takes precedence. The second plus moves a level higher than the first plus sign, becoming the root of the expression tree. The parser then moves onto d, which becomes the second operand of the second plus sign.

Writing Expressions

All expression are put within an opcode, usually a conditional. When an expression is finished being parsed, it is returned to ParseForm(), which writes the opcode's information into the output buffer. In order to write the opcode's expression to the output buffer, the function WriteExpr() is called.

WriteExpr() calls WriteExprHelper(), which then takes the expression and creates an opcode out of the expression and its operands. After creating the opcode, the opcode is returned to WriteExpr(), which writes the opcode to the output buffer.

Normally, operands are written to the buffer before the operator is written, so if a + b is found, the opcode, a b + would be written to the output buffer. If an expression does has a specific order for its operators and operands to be written in, an expression-specific function is called. See Table 4.2 to view these expression-specific functions.

Table 4.2 - The table below shows expression-specific expression-writing functions called by WriteExprHelper().

This finishes our series on Writing the IFR Assembler for UEFI. The IFR Assembler used does not several situation. The IFR Assembler does not handle any packages other than form packages nor does it have an optimization process. Also, the IFR Assembler is not free from bugs. This Assembler was simply made to as an example of how to set up an IFR Assembler for UEFI.

Table 4.3 - Operands and Operators Parsed by ParseExpr12()

UEFI

Wednesday, August 07, 2013

Writing an IFR Assembler for UEFI - Part 4

Understanding Expression Hierarchy

The Expression-Parsing Process

The Expression Tree

Writing Expressions

No comments:

Harnessing The UEFI Shell

Beyond BIOS

Blog Archive

Contributors