UEFI News and Commentary

Monday, July 29, 2013

Writing an IFR Assembler for UEFI - Part 2

This is Part 2 of Writing an IFR Assembler for UEFI. It builds on the IFR Assembler that was set up in Part 1. If you have not set it up yet, follow the instructions in Part 1 first.

This blog post will walk you through the basic parsing process of the IFR Assembler from the parsing of the Command Line to the parsing of individual forms.

Understanding the Parsing System


Normally, the IFR Assembler is given a .pl file on the command line to parse. The .pl file describes the packages, form sets, forms, and everything else that the Assembler needs to parse. Each of these items can carry information in two places. Information that relates specifically to the item, such as the operands of a form, goes within parentheses right after the item. Anything nested inside the item, such as a form nested within a form set, goes inside the curly braces that follow.

When parsing a .pl file, the IFR Assembler starts with the first item, which is usually the Package List. The item itself is parsed first, then anything nested within it, which would be the Form Package. After the Form Package is parsed, everything nested within the Form Package is parsed, and this cycle continues until the Assembler has parsed everything. The Assembler then writes the parsed information to the output file. The following two sections give a basic walkthrough of how text on the command line is parsed and written to the output file.
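
The parse order described above (the item first, then its operands, then whatever is nested inside it) can be sketched as a small recursive function. This is only an illustration, not the Assembler's actual code: the CURSOR type, the ParseItem() helper, and the item names in the sample input are all hypothetical, and the operands are simply skipped rather than parsed.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cursor over the source text; not the Assembler's real type. */
typedef struct {
  const char *Pos;
} CURSOR;

static void SkipSpace(CURSOR *C) {
  while (isspace((unsigned char)*C->Pos)) {
    C->Pos++;
  }
}

/* Parse one item of the form: Name [ '(' operands ')' ] [ '{' items '}' ].
   Appends "depth:name " to Log for every item so the visiting order is
   visible. Returns 0 on success and -1 on malformed input. */
static int ParseItem(CURSOR *C, int Depth, char *Log, size_t LogSize) {
  char Name[32];
  size_t Len = 0;
  size_t Used;

  SkipSpace(C);
  while (isalnum((unsigned char)*C->Pos) && Len < sizeof(Name) - 1) {
    Name[Len++] = *C->Pos++;
  }
  Name[Len] = '\0';
  if (Len == 0) {
    return -1;                      /* an item must start with a name */
  }

  Used = strlen(Log);
  snprintf(Log + Used, LogSize - Used, "%d:%s ", Depth, Name);

  SkipSpace(C);
  if (*C->Pos == '(') {
    /* Item-specific information: skipped up to ')' in this sketch. */
    while (*C->Pos != '\0' && *C->Pos != ')') {
      C->Pos++;
    }
    if (*C->Pos != ')') {
      return -1;                    /* unterminated operand list */
    }
    C->Pos++;
    SkipSpace(C);
  }

  if (*C->Pos == '{') {
    /* Nested items: parse each one recursively, one level deeper. */
    C->Pos++;
    SkipSpace(C);
    while (*C->Pos != '}') {
      if (*C->Pos == '\0' || ParseItem(C, Depth + 1, Log, LogSize) != 0) {
        return -1;
      }
      SkipSpace(C);
    }
    C->Pos++;
  }
  return 0;
}
```

Calling ParseItem() on a made-up input such as "PackageList { FormPackage { FormSet(1) { Form(2) Form(3) } } }" visits the Package List first, then the Form Package, then the Form Set, and finally the two forms.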

Command Line Parsing

The IFR Assembler application is entered through the function main() in the source file IfrAsm.c. One of the important jobs of main() is to translate the information it receives from the command line so that the rest of the IFR Assembler can parse it.












To begin parsing the command line, InitCmdLine() is called to initialize command line global variables.







After the initialization is finished, main() calls ParseCmdLine(). ParseCmdLine() and ParseCmdLineOption() go through the command line and create an array of the actions and locations written into the command line.
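
As a rough illustration of turning a command line into an array of actions and locations, here is a minimal sketch. It does not reproduce the real InitCmdLine() or ParseCmdLine(): the type names, the functions with the Sketch suffix, and the "-o" option are all assumptions made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CMD_ENTRIES 16

/* Hypothetical actions; the real Assembler may define different ones. */
typedef enum { CmdActionInput, CmdActionOutput } CMD_ACTION;

/* One entry pairs an action with the location (file path) it applies to. */
typedef struct {
  CMD_ACTION Action;
  const char *Location;
} CMD_ENTRY;

typedef struct {
  CMD_ENTRY Entries[MAX_CMD_ENTRIES];
  size_t Count;
} CMD_LINE;

/* Reset the command-line state before parsing begins. */
static void InitCmdLineSketch(CMD_LINE *Cmd) {
  memset(Cmd, 0, sizeof(*Cmd));
}

/* Walk argv and record each action with its location.
   "-o <file>" (an assumed option) names an output file; any other
   argument is treated as an input file. Returns 0 on success, -1 on error. */
static int ParseCmdLineSketch(CMD_LINE *Cmd, int Argc, char **Argv) {
  int i;

  for (i = 1; i < Argc; i++) {
    CMD_ENTRY Entry;

    if (strcmp(Argv[i], "-o") == 0) {
      if (i + 1 >= Argc) {
        return -1;                  /* option is missing its location */
      }
      Entry.Action = CmdActionOutput;
      Entry.Location = Argv[++i];
    } else {
      Entry.Action = CmdActionInput;
      Entry.Location = Argv[i];
    }

    if (Cmd->Count >= MAX_CMD_ENTRIES) {
      return -1;                    /* too many entries for this sketch */
    }
    Cmd->Entries[Cmd->Count++] = Entry;
  }
  return 0;
}
```

With this layout, the caller can loop over Entries[] afterwards and open each location according to its action.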










main() uses this array to get information from the locations. This information is then made into a source file.







Parsing Within Packages



The source file is sent to the function ConvertSourceFileToPackages(), which is in ParsePackage.c. As the name suggests, this function converts the source file into packages. ConvertSourceFileToPackages() puts all the packages into a package list, which is passed down through ParsePackage.c to be parsed.











The package list is sent to ParsePackage(). This parses the contents of the packages and depending on the type of package found, a different parsing function is called. Currently, only form packages are recognized by the Assembler, and if the packages found are not form packages, an error is produced.
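
The dispatch on package type can be pictured with the following sketch. The enum values, the PACKAGE structure, and the function names with the Sketch suffix are hypothetical. The real ParsePackage() may be organized differently, but the idea of one parser per package type, with an error for everything that is not a form package, is the same.

```c
#include <stdio.h>

/* Hypothetical package types for illustration only. */
typedef enum { PackageTypeForm, PackageTypeString, PackageTypeFont } PACKAGE_TYPE;

typedef struct {
  PACKAGE_TYPE Type;
  const char *Body;     /* unparsed contents of the package */
} PACKAGE;

/* Stand-in for the form package parser; the real one walks the form sets. */
static int ParseFormPackageSketch(const PACKAGE *Pkg) {
  (void)Pkg;
  return 0;
}

/* Choose a parsing function based on the package type.
   Only form packages are handled; anything else is reported as an error. */
static int ParsePackageSketch(const PACKAGE *Pkg) {
  switch (Pkg->Type) {
  case PackageTypeForm:
    return ParseFormPackageSketch(Pkg);
  default:
    fprintf(stderr, "error: unsupported package type %d\n", (int)Pkg->Type);
    return -1;
  }
}
```

Supporting another package type in a design like this would mean adding one more case and one more parsing function.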






Form packages are sent to ParseFormPackages(). This function takes the form package and goes through its contents. Normally, a Form Package carries Form Sets. Each set is sent to ParseForms() to be parsed further.










ParseForms() goes through the contents of the Form Package that it receives and parses each form set within the package.









ParseForm() is given a single form from ParseForms(), and it reads the opcodes held within the form. When ParseForm() reaches an opcode, it checks whether the opcode can have any operands (this information is found in Opcodes[] in ParsePackage.c). If it can, ParseForm() checks for a left parenthesis. If the opcode requires an operand, the left parenthesis is required; otherwise, the left parenthesis is optional. If a left parenthesis is found, ParseForm() sends the operands within the parentheses to ParseOperand().
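
The parenthesis rule above can be sketched with a small lookup table. The table below is not the real Opcodes[] array: the structure layout, the opcode names, and especially the operand counts are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-opcode description; the real Opcodes[] entries differ. */
typedef struct {
  const char *Name;
  int MaxOperands;       /* 0 means the opcode cannot have operands */
  int RequiredOperands;  /* more than 0 means '(' is mandatory */
} OPCODE_INFO;

/* Operand counts here are made up for the example. */
static const OPCODE_INFO OpcodesSketch[] = {
  { "End",      0, 0 },
  { "Subtitle", 2, 0 },
  { "Form",     2, 2 },
};

static const OPCODE_INFO *FindOpcode(const char *Name) {
  size_t i;

  for (i = 0; i < sizeof(OpcodesSketch) / sizeof(OpcodesSketch[0]); i++) {
    if (strcmp(OpcodesSketch[i].Name, Name) == 0) {
      return &OpcodesSketch[i];
    }
  }
  return NULL;
}

/* Decide what to do after reading an opcode, given the next character.
   Returns  1 if an operand list follows and should be parsed,
            0 if there are no operands to parse,
           -1 if the opcode is unknown or a required '(' is missing. */
static int ExpectOperands(const char *Name, char Next) {
  const OPCODE_INFO *Info = FindOpcode(Name);

  if (Info == NULL) {
    return -1;
  }
  if (Info->MaxOperands == 0) {
    return 0;                        /* no operands, so '(' is not checked */
  }
  if (Next == '(') {
    return 1;
  }
  return Info->RequiredOperands > 0 ? -1 : 0;
}
```

In this sketch, an opcode that needs operands but is not followed by a left parenthesis is an error, while an opcode whose operands are all optional may be followed directly by its nested contents.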










ParseOperand() determines the type of the operand it was passed and checks whether that type matches the operand type expected. If it matches, ParseOperand() decides which function to send the operand to. There are specific functions for each type of operand, like ParseUint8() or ParseExprOperand(). Each operand-specific function performs its own set of tasks in order to parse the operand properly.
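
A minimal sketch of the check-then-dispatch step might look like the following. The signatures are assumptions and do not match the real ParseOperand() or ParseUint8(). Only one operand type is implemented, and the way the operand type is detected (a leading digit means a number) is a simplification chosen for the example.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical operand types; the real Assembler supports more. */
typedef enum { OperandUint8, OperandString } OPERAND_TYPE;

/* Guess the operand type from its text: a leading digit means a number. */
static OPERAND_TYPE ClassifyOperand(const char *Text) {
  return isdigit((unsigned char)Text[0]) ? OperandUint8 : OperandString;
}

/* Parse an 8-bit unsigned value (decimal, octal, or 0x-prefixed hex).
   Returns 0 on success, -1 if the text is not a number or exceeds 0xFF. */
static int ParseUint8Sketch(const char *Text, uint8_t *Out) {
  char *End;
  unsigned long Value = strtoul(Text, &End, 0);

  if (End == Text || *End != '\0' || Value > 0xFF) {
    return -1;
  }
  *Out = (uint8_t)Value;
  return 0;
}

/* Check the operand type against the expected type, then dispatch to the
   type-specific parser. Returns 0 on success, -1 on any error. */
static int ParseOperandSketch(const char *Text, OPERAND_TYPE Expected,
                              uint8_t *Out) {
  if (ClassifyOperand(Text) != Expected) {
    return -1;                       /* found type does not match expected */
  }
  switch (Expected) {
  case OperandUint8:
    return ParseUint8Sketch(Text, Out);
  default:
    return -1;                       /* other types omitted in this sketch */
  }
}
```

For example, passing "0x2A" with OperandUint8 stores 42 in the output, while "300" is rejected because it does not fit in eight bits.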








After all the operands are parsed, the form's information is returned to ParseForm() in a buffer to be written to the output file. When all the forms, lists, and packages have been parsed and written to the output file, the program returns to main(). main() then frees the resources it used and clears out any remaining information.









The table below gives the basic information about each parsing function discussed in this article, including its name, what item it parses, its location in the Assembler, and which section of the UEFI Specification it refers to.
 
 

 
When a test case like the one used in Part 1 is entered on the command line, this chain of parsing functions is what takes the commands on the command line, translates them, and writes the result to the output file. In the next article, you will learn about token parsing in the IFR Assembler.
