{ "version": "https://jsonfeed.org/version/1.1", "title": "Jim Calabro", "home_page_url": "https://calabro.io", "description": "My personal website including blog posts mostly on computing, but perhaps sometimes music, fitness, video games, etc.", "author": { "name": "Jim Calabro" }, "authors": [ { "name": "Jim Calabro" } ], "items": [ { "id": "", "url": "https://calabro.io/dwarf/die", "title": "How DWARF Works: Debug Information Entries", "content_html": "\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n \u003chead\u003e\n \u003cmeta charset=\"utf-8\"\u003e\n \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no\" /\u003e\n \u003cmeta http-equiv=\"Cache-control\" content=\"public\"\u003e\n \u003cmeta name=\"description\" content=\"Jim Calabro\"\u003e\n \u003clink rel=\"canonical\" rel=\"noreferrer\" href=\"https://www.calabro.io\" /\u003e\n \u003clink rel=\"icon\" type=\"image/svg\" href=\"/static/favicon.svg\"\u003e\n \u003ctitle\u003eHow DWARF Works: Debug Information Entries - Jim Calabro\u003c/title\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/lit.min.css\"\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/style.css\"\u003e\n \u003clink href='/static/Raleway.css' rel='stylesheet' type='text/css' async defer\u003e\n \u003clink rel=\"stylesheet\" href=\"/static/font-awesome.min.css\"\u003e\n \u003cscript async defer src=\"/static/font-awesome.min.js\" data-auto-replace-svg=\"nest\"\u003e\u003c/script\u003e\n \u003cscript type=\"text/javascript\"\u003e\n async function emailSignup() {\n const emailSubmit = document.getElementById('email-submit');\n emailSubmit.disabled = true;\n\n const email = document.getElementById('email-input').value;\n const resp = await fetch('/v1/email', {\n method: 'POST',\n body: JSON.stringify({ email: email }),\n });\n\n emailSubmit.disabled = false;\n\n const emailStatus = 
document.getElementById('email-status');\n if (resp.status === 200) {\n emailStatus.innerText = 'Success! Thank you for subscribing.'\n } else {\n emailStatus.innerText = 'Hmm, that didn\\'t work. Please ensure you entered a valid email address and try again later.';\n }\n }\n\n function emailSignupKeyEvent() {\n if (event.key === 'Enter') {\n emailSignup();\n }\n }\n\n const mobileWidth = 560;\n\n function toggleMenu(event) {\n if (window.innerWidth \u003e= mobileWidth) {\n return;\n }\n\n const links = document.querySelector('.links');\n const display = links.style['display'];\n if (!display || display === 'none') {\n links.style.display = 'block';\n } else {\n links.style.display = 'none';\n }\n }\n\n addEventListener('resize', (event) =\u003e {\n if (window.innerWidth \u003e= mobileWidth) {\n document.querySelector('.links').style.display = 'block';\n } else {\n document.querySelector('.links').style.display = 'none';\n }\n });\n \u003c/script\u003e\n \u003c/head\u003e\n \u003cbody\u003e\n \u003cdiv class=\"row\"\u003e\n \u003cdiv class=\"col 2\"\u003e\u003c/div\u003e\n \u003cdiv class=\"col 2\"\u003e\n \u003cdiv class=\"collapsed-links\"\u003e\n \u003ch3 class=\"name-header\"\u003e\u003ca href=\"/\" style=\"cursor: pointer\"\u003eJim Calabro\u003c/a\u003e\u003c/h3\u003e\n \u003cdiv onclick=\"toggleMenu(this)\"\u003e\u003ci class=\"hamburger-menu fa-solid fa-bars\"\u003e\u003c/i\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"links\"\u003e\n \u003cdiv\u003e\u003ca href=\"/about\"\u003eAbout\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\n \u003ca href=\"/rss\"\u003eRSS\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/atom\"\u003eAtom\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/json\"\u003eJSON\u003c/a\u003e\n \u003c/div\u003e\n \u003cdiv\u003e---\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://github.com/jcalabro\" target=\"_blank\"\u003eGitHub \u003ci 
class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.chess.com/member/jcalabro\" target=\"_blank\"\u003eChess \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://octodon.social/@jcalabro\" target=\"_blank\"\u003eMastodon \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.last.fm/user/thekingofping\" target=\"_blank\"\u003eLast.fm \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\n \n \u003cdiv class=\"post-title\"\u003e\n \u003ch4\u003eHow DWARF Works: Debug Information Entries\u003c/h4\u003e\n \u003ci\u003eSep 28, 2024\u003c/i\u003e\n \u003c/div\u003e\n \n \n\n\u003cdiv class=\"margin-bottom\"\u003e\n \u003ci\u003e\n This is part of the \u003ca href=\"/dwarf\"\u003eseries on DWARF\u003c/a\u003e.\n \u003c/i\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n \u003ca name=\"debug-information-entries\" href=\"#debug-information-entries\"\u003e\u003ch4\u003eDebug Information Entries\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n So far, we've parsed enough of the contents of an \u003ca href=\"/dwarf/elf\" target=\"_blank\"\u003eELF executable\u003c/a\u003e to get the various debug info sections as raw byte arrays. Now, we'll use all these sections together to make progress towards properly inspecting a running instance of the target program.\n \u003c/p\u003e\n \u003cp\u003e\n This post's goal is to construct the Debug Information Entry (DIE) tree. This is a long task with several side quests along the way.\n \u003c/p\u003e\n \u003cp\u003e\n A DIE is a piece of information about a particular node of a compiled program, and they are stored in a tree. 
When compiler authors write parsers/lexers, the first step in the pipeline is generally attempting to turn the source text of a program in to an Abstract Syntax Tree, and you can sort of think of the DIE tree as a similar thing for DWARF debuggers. All your compile units, functions, variables, struct definitions, etc. each get an entry in the DIE tree. A DIE tree is really more like a forest in that each compile unit in your executable gets its own tree.\n \u003c/p\u003e\n \u003cp\u003e\n We will use these DIEs later on to ascertain key facts about a running process such as \"what is the data type of the variable named \u003ccode\u003ex\u003c/code\u003e, and where in memory/registers are its bytes?\", which is critical for building a debugger.\n \u003c/p\u003e\n \u003cp\u003e\n Perhaps it's easiest to show what it looks like. Here's the relevant section of \u003ca href=\"/dwarf/elf#our-test-program\" target=\"_blank\"\u003ecloop\u003c/a\u003e's DIE tree using \u003ccode\u003edwarfdump cloop\u003c/code\u003e:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003e\u0026lt; 1\u0026gt;\u0026lt;0x000002dd\u0026gt; DW_TAG_subprogram\n DW_AT_external yes(1)\n DW_AT_name main\n DW_AT_decl_file 0x00000001 /home/jcalabro/local/data/cloop/main.c\n DW_AT_decl_line 0x00000004\n DW_AT_decl_column 0x00000005\n DW_AT_type \u0026lt;0x00000058\u0026gt;\n DW_AT_low_pc 0x00401156\n DW_AT_high_pc \u0026lt;offset-from-lowpc\u0026gt; 86 \u0026lt;highpc: 0x004011ac\u0026gt;\n DW_AT_frame_base len 0x0001: 0x9c:\n DW_OP_call_frame_cfa\n DW_AT_call_all_tail_calls yes(1)\n DW_AT_sibling \u0026lt;0x0000031c\u0026gt;\n\u0026lt; 2\u0026gt;\u0026lt;0x000002ff\u0026gt; DW_TAG_variable\n DW_AT_name pid\n DW_AT_decl_file 0x00000001\n DW_AT_decl_line 0x00000005\n DW_AT_decl_column 0x0000000b\n DW_AT_type \u0026lt;0x0000009d\u0026gt;\n DW_AT_location len 0x0002: 0x9164:\n DW_OP_fbreg -28\n\u0026lt; 2\u0026gt;\u0026lt;0x0000030d\u0026gt; DW_TAG_variable\n DW_AT_name ndx\n DW_AT_decl_file 0x00000001\n 
DW_AT_decl_line 0x00000006\n DW_AT_decl_column 0x00000018\n DW_AT_type \u0026lt;0x0000031c\u0026gt;\n DW_AT_location len 0x0002: 0x9168:\n DW_OP_fbreg -24\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n There's not a ton to see in our \u003ccode\u003ecloop\u003c/code\u003e program because it's so simple, but there is a function (aka \u003ccode\u003esubprogram\u003c/code\u003e) named \u003ccode\u003emain\u003c/code\u003e and two variables, \u003ccode\u003endx\u003c/code\u003e and \u003ccode\u003epid\u003c/code\u003e, just as we'd expect. The output tells us which lines they appear on and what their data types are, and \u003ccode\u003eDW_AT_location\u003c/code\u003e actually tells us how to find the value of a variable at runtime! There's a lot of good stuff in here.\n \u003c/p\u003e\n \u003cp\u003e\n You can see that all these entries have a single \u003ca href=\"https://github.com/ziglang/zig/blob/57b2b3df520cd8c243cb974636880c9f5ca3f117/lib/std/dwarf/TAG.zig\" target=\"_blank\"\u003etag\u003c/a\u003e that indicates the type of entry (each prefixed with \u003ccode\u003eDW_TAG_\u003c/code\u003e), then some number of \u003ca href=\"https://github.com/ziglang/zig/blob/085cc54aadb327b9910be2c72b31ea046e7e8f52/lib/std/dwarf/AT.zig\" target=\"_blank\"\u003eattributes\u003c/a\u003e, or metadata describing that particular entry (each prefixed with \u003ccode\u003eDW_AT_\u003c/code\u003e). If you look at the full output, you can see a lot of DIEs for primitive C types and the standard library.\n \u003c/p\u003e\n \u003cp\u003e\n It's a bit tough to tell because we're only one level deep, but this data is stored in a tree. You can see it in the indentation of the tags and attributes if you look carefully, or you can use the numbers on the far left as a reference for depth (\u003ccode\u003e\u0026lt; 1\u0026gt;\u003c/code\u003e and \u003ccode\u003e\u0026lt; 2\u0026gt;\u003c/code\u003e). 
This is just how \u003ccode\u003edwarfdump\u003c/code\u003e renders the DIE tree; the actual format on disk is not of this form.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"abbrev-tables\" href=\"#abbrev-tables\"\u003e\u003ch4\u003eAbbrev Tables\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n All that being said, the very first thing we have to parse actually is not the DIE tree. No, we will start by reading the full set of \u003ccode\u003eabbreviation tables\u003c/code\u003e. The abbrev tables give you a lot of information that's used in constructing the DIE tree. It tells you, for each potential entry in the tree, what is its tag (AST node type), and what attributes each node has. For each attribute, it tells you its data type via its form (is it a string, is it a constant integer value, and so on). The set of abbrev tables lives in the \u003ccode\u003e.debug_abbrev\u003c/code\u003e ELF section.\n \u003c/p\u003e\n \u003cp\u003e\n Parsing the abbrev table is pretty straightforward: it's an array of declarations (decls), each with a code, a tag, and some number of attributes. 
To see it in action, run \u003ccode\u003edwarfdump --print-abbrev cloop\u003c/code\u003e:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003e\u0026lt; 1\u003e\u0026lt;0x00000000\u003e\u0026lt;code: 1\u0026gt; DW_TAG_member DW_children_no\n \u0026lt;0x00000003\u003e DW_AT_name DW_FORM_strp\n \u0026lt;0x00000005\u003e DW_AT_decl_file DW_FORM_implicit_const \u0026lt;4 (0x4)\u003e\n \u0026lt;0x00000008\u003e DW_AT_decl_line DW_FORM_data1\n \u0026lt;0x0000000a\u003e DW_AT_decl_column DW_FORM_data1\n \u0026lt;0x0000000c\u003e DW_AT_type DW_FORM_ref4\n \u0026lt;0x0000000e\u003e DW_AT_data_member_location DW_FORM_data1\n\u0026lt; 2\u003e\u0026lt;0x00000012\u003e\u0026lt;code: 2\u0026gt; DW_TAG_base_type DW_children_no\n \u0026lt;0x00000015\u003e DW_AT_byte_size DW_FORM_data1\n \u0026lt;0x00000017\u003e DW_AT_encoding DW_FORM_data1\n \u0026lt;0x00000019\u003e DW_AT_name DW_FORM_strp\n\u0026lt; 3\u003e\u0026lt;0x0000001d\u003e\u0026lt;code: 3\u0026gt; DW_TAG_pointer_type DW_children_no\n \u0026lt;0x00000020\u003e DW_AT_byte_size DW_FORM_implicit_const \u0026lt;8 (0x8)\u003e\n \u0026lt;0x00000023\u003e DW_AT_type DW_FORM_ref4\n\u0026lt; 4\u003e\u0026lt;0x00000027\u003e\u0026lt;code: 4\u0026gt; DW_TAG_typedef DW_children_no\n \u0026lt;0x0000002a\u003e DW_AT_name DW_FORM_strp\n \u0026lt;0x0000002c\u003e DW_AT_decl_file DW_FORM_data1\n \u0026lt;0x0000002e\u003e DW_AT_decl_line DW_FORM_data1\n \u0026lt;0x00000030\u003e DW_AT_decl_column DW_FORM_data1\n \u0026lt;0x00000032\u003e DW_AT_type DW_FORM_ref4\n\n etc...\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n We can write a parser for this with the following methodology:\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003eCreate a new abbrev table, storing our current offset (the number of bytes we've read so far through the section)\u003c/li\u003e\n \u003cli\u003eIf we've read to the end of the section, we're done\u003c/li\u003e\n \u003cli\u003eIf we're not done, read N decls for this table via:\u003c/li\u003e\n \u003cul\u003e\n 
\u003cli\u003eCreate a new decl and read its code\u003c/li\u003e\n \u003cli\u003eIf the code is zero, we're done with this table (all tables end with a \"null entry\" whose code is zero)\u003c/li\u003e\n \u003cli\u003eRead the decl's tag enum value\u003c/li\u003e\n \u003cli\u003eRead a bool that indicates whether or not this node in the tree has children\u003c/li\u003e\n \u003cli\u003eRead all attributes for this decl\u003c/li\u003e\n \u003cul\u003e\n \u003cli\u003eRead the attribute name enum value\u003c/li\u003e\n \u003cli\u003eRead the attribute form enum value\u003c/li\u003e\n \u003cli\u003eIf the name and the form are both zero, we've read all of this decl's attributes\u003c/li\u003e\n \u003cli\u003eIf the form is DW_FORM_implicit_const, read and store the implicit const value (we'll see how this is used later in this post)\u003c/li\u003e\n \u003c/ul\u003e\n \u003c/ul\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n Check the DWARF documentation for what each of those enum values should be, and see the \u003ca href=\"/dwarf/go#leb128\" target=\"_blank\"\u003eappendix\u003c/a\u003e for more information on LEB128. 
Translated into code, that looks like:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003etype abbrevTable struct {\n offset int\n decls map[uint64]abbrevDecl\n}\n\ntype abbrevDecl struct {\n code uint64\n tag uint64\n hasChildren uint8\n attrs []abbrevAttribute\n}\n\ntype abbrevAttribute struct {\n name uint64\n form uint64\n implicitConstVal int64\n}\n\nfunc parseAbbrevTables(abbrevSection []byte) []abbrevTable {\n buf := bytes.NewBuffer(abbrevSection)\n tables := []abbrevTable{}\n reader := NewBinaryReader(buf, binary.NativeEndian)\n\n for {\n table := abbrevTable{\n offset: BufferOffset(len(abbrevSection), buf),\n decls: map[uint64]abbrevDecl{},\n }\n if table.offset \u003e= len(abbrevSection) {\n break // nothing left to read in the section\n }\n\n for {\n decl := abbrevDecl{}\n decl.code, _ = leb128.DecodeU64(reader)\n if decl.code == 0 {\n break // we've finished with this table\n }\n\n decl.tag, _ = leb128.DecodeU64(reader)\n decl.hasChildren, _ = Read[uint8](reader)\n\n for {\n attr := abbrevAttribute{}\n attr.name, _ = leb128.DecodeU64(reader)\n attr.form, _ = leb128.DecodeU64(reader)\n if attr.name == 0 \u0026\u0026 attr.form == 0 {\n break\n }\n\n if attr.form == 0x21 { // DW_FORM_implicit_const\n attr.implicitConstVal, _ = leb128.DecodeS64(reader)\n }\n\n decl.attrs = append(decl.attrs, attr)\n }\n\n table.decls[decl.code] = decl\n }\n\n tables = append(tables, table)\n }\n\n return tables\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n That's it for abbrev tables. We'll use these quite a lot as we parse the DIE tree. You can see that we already parsed something that resembles a tree, but we're not at the point where the data is meaningful yet. 
Note that \u003ccode\u003ecloop\u003c/code\u003e only has one table, but there will probably be many in real-world programs.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"compilation-unit-headers\" href=\"#compilation-unit-headers\"\u003e\u003ch4\u003eCompilation Unit Headers\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Now we're ready to start tackling the \u003ccode\u003e.debug_info\u003c/code\u003e section, which is the section that contains most of the DIE information. We're going to parse one or more compilation units and their DIEs in a loop, starting with the compilation unit header. The header isn't itself a DIE, but it does give us some critical information for dealing with the remainder of the tree. So our outer loop looks like:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003etype CU struct {\n header *CUHeader\n dies []DIE\n}\n\ntype CUHeader struct {\n length uintptr\n version uint16\n unitType uint8 // added in v5\n debugAbbrevOffset int\n addrSize uint8\n\n // not a real field in the DWARF standard,\n // but helpful for bookkeeping\n is32Bit bool\n}\n\ncus := []*CU{}\ninfoReader := NewBinaryReader(bytes.NewBuffer(sections.info))\nfor {\n if infoReader.Offset() \u003e= len(sections.info) {\n break\n }\n\n cuHeader := parseCUHeader(infoReader)\n\n // choose the correct abbrev table for this CU\n abbrev := abbrevTables[0]\n for _, a := range abbrevTables {\n if a.offset == cuHeader.debugAbbrevOffset {\n abbrev = a\n break\n }\n }\n\n cu := parseCU(infoReader, cuHeader, sections, abbrev)\n cus = append(cus, cu)\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Then, we can implement \u003ccode\u003eparseCUHeader\u003c/code\u003e. 
Each compilation unit header contains the following fields. They are listed here in the pre-v5 order; as of DWARF v5, \u003ccode\u003eaddr_size\u003c/code\u003e comes before \u003ccode\u003edebug_abbrev_offset\u003c/code\u003e.\n \u003c/p\u003e\n \u003cp\u003e\n Additionally, see the \u003ca href=\"/dwarf/go#initial-length\" target=\"_blank\"\u003eappendix\u003c/a\u003e for more information on fields of type \u003ccode\u003einitial length\u003c/code\u003e.\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ccode\u003elength, initial length\u003c/code\u003e: the number of bytes this compilation unit (the rest of its header plus all of its DIEs) takes up in the \u003ccode\u003e.debug_info\u003c/code\u003e section (this does not include the number of bytes it takes to store the length itself, which is either 4 or 12 bytes)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eversion, uint16\u003c/code\u003e: DWARF version number\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eunit_type, uint8\u003c/code\u003e: type of compilation unit (this field was added in DWARF v5 and does not exist in v4 or below; see DWARF v5, section 3.1 for more details. For our purposes we only care about \u003ccode\u003eDW_UT_compile\u003c/code\u003e)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003edebug_abbrev_offset, uint32 if is_32_bit, else uint64\u003c/code\u003e: how far to seek into the abbrev section in order to find this CU's abbrev table\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eaddr_size, uint8\u003c/code\u003e: how large an address is in this CU\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n So we can parse a CU header with something like:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003efunc parseCUHeader(reader *BinaryReader) *CUHeader {\n header := \u0026CUHeader{}\n\n start := reader.Offset()\n header.length = readInitialLength(reader)\n\n // this CU uses the 32-bit DWARF format if its initial length\n // took 4 bytes to encode, not 12 (measure from the start of\n // this header so the check also holds for CUs after the first)\n header.is32Bit = reader.Offset()-start == 4\n\n header.version, _ = Read[uint16](reader)\n\n if header.version \u003e= 5 {\n // this field was 
added in DWARF v5\n header.unitType, _ = Read[uint8](reader)\n\n // DWARF v5 changes the order of fields (switches abbrev_offset and addr_size)\n header.addrSize, _ = Read[uint8](reader)\n header.debugAbbrevOffset = parseAbbrevOffset(reader, header)\n } else {\n header.debugAbbrevOffset = parseAbbrevOffset(reader, header)\n header.addrSize, _ = Read[uint8](reader)\n }\n\n return header\n}\n\nfunc parseAbbrevOffset(reader *BinaryReader, header *CUHeader) int {\n if header.is32Bit {\n offset, _ := Read[uint32](reader)\n return int(offset)\n }\n\n offset, _ := Read[uint64](reader)\n return int(offset)\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n You can check your work against \u003ccode\u003ereadelf --debug-dump=info cloop\u003c/code\u003e:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003eContents of the .debug_info section:\n\n Compilation Unit @ offset 0:\n Length: 0x320 (32-bit)\n Version: 5\n Unit Type: DW_UT_compile (1)\n Abbrev Offset: 0\n Pointer Size: 8\n\netc...\n\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"compilation-units\" href=\"#compilation-units\"\u003e\u003ch4\u003eCompilation Units\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Now, we're ready to begin parsing the DIE tree in this compilation unit! What we really want at the end of this section is actually just an array of DIEs, but we do need to interpret them as a tree in order to parse the \u003ccode\u003e.debug_info\u003c/code\u003e section correctly.\n \u003c/p\u003e\n \u003cp\u003e\n First, we should pick an abbrev table by searching the list of abbrev table offsets for the one that matches with the \u003ccode\u003edebug_abbrev_offset\u003c/code\u003e field from the CU header. This is the table we'll use for the rest of parsing this CU. 
This code is already handled in the \"outer loop\" example above.\n \u003c/p\u003e\n \u003cp\u003e\n Next, we'll begin reading through the \u003ccode\u003e.debug_info\u003c/code\u003e section to determine the list of DIEs, their depth in the tree, and the \u003ccode\u003eform\u003c/code\u003e of each of their attributes. DWARF forms are basically just data types, so this is just saying that we want to figure out how to interpret the bytes of each attribute (should we read a \u003ccode\u003euint8\u003c/code\u003e, a \u003ccode\u003estring\u003c/code\u003e, etc.). Forms are an enum with the prefix \u003ccode\u003eDW_FORM_\u003c/code\u003e.\n \u003c/p\u003e\n \u003cp\u003e\n Additionally, based on which form we're looking at, we should also be sure to store its associated attribute class for more information on how to interpret this data. We won't be using this much since our code is just a demo, but real-world, robust parsers will use this a lot. Read the spec to learn more.\n \u003c/p\u003e\n \u003cp\u003e\n To complete this, we should:\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003eRead a ULEB128; that's the code of an item in the abbrev table that we will look up\u003c/li\u003e\n \u003cli\u003eIf the code is zero, we are either done reading DIEs for the entire compilation unit, or we're done reading child DIEs for a node in the tree\n \u003cul\u003e\n \u003cli\u003eIf you are at the root level of the DIE tree, you're done parsing this compile unit\u003c/li\u003e\n \u003cli\u003eElse, pop up one level in the tree and continue\u003c/li\u003e\n \u003c/ul\u003e\n \u003cli\u003eUse the code to look up the abbrev decl we care about in the abbrev table\u003c/li\u003e\n \u003cli\u003eFor each attribute in the decl, choose which form to use to interpret the bytes of this attribute, and read that many bytes from the \u003ccode\u003e.debug_info\u003c/code\u003e section\u003c/li\u003e\n \u003cul\u003e\n \u003cli\u003eNote that the list of form 
types is very long, so I'll only implement enough to parse \u003ccode\u003ecloop\u003c/code\u003e on my machine in the code sample below. Refer to the documentation for the version(s) of DWARF you care about to see the full list.\u003c/li\u003e\n \u003c/ul\u003e\n \u003cli\u003eIf \u003ccode\u003eDW_CHILDREN_yes\u003c/code\u003e was set on the decl, push an item onto the DIE tree\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n Easy to say, but somewhat tedious to actually do. First, let's write that algorithm:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003efunc parseCU(\n reader *BinaryReader,\n header *CUHeader,\n sections *DWARFSections,\n abbrev abbrevTable,\n) *CU {\n dies := []DIE{} // list of all DIEs\n dieTree := []uint64{} // stack of abbrev codes\n\n for {\n dieOffset := reader.Offset()\n\n abbrevCode, _ := leb128.DecodeU64(reader)\n if abbrevCode == 0 {\n // \u003c= 1 rather than == 1 so an empty stack can't\n // cause an out-of-range slice below\n if len(dieTree) \u003c= 1 {\n break // we're done\n }\n\n // this DIE has no more children, pop the stack\n dieTree = dieTree[:len(dieTree)-1]\n continue\n }\n\n abbrevDecl := abbrev.decls[abbrevCode]\n die := DIE{\n offset: dieOffset,\n depth: len(dieTree),\n tag: abbrevDecl.tag,\n }\n\n for _, attr := range abbrevDecl.attrs {\n form := chooseFormAndAdvanceBySize(reader, header, sections, attr)\n die.forms = append(die.forms, form)\n }\n\n dies = append(dies, die)\n if abbrevDecl.hasChildren == 1 {\n dieTree = append(dieTree, abbrevCode)\n }\n }\n\n return \u0026CU{header: header, dies: dies}\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Next, let's define a few types and \"enums\" (Go doesn't really have enums):\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003etype DIE struct {\n offset int\n depth int\n tag uint64\n forms []Form\n}\n\ntype Form struct {\n data any\n class Class\n}\n\ntype DWARFForm int\n\nconst (\n // we're not going to use all of these today, but a real parser should\n DW_FORM_addr DWARFForm = 0x01\n DW_FORM_block2 DWARFForm = 0x03\n DW_FORM_block4 DWARFForm = 0x04\n DW_FORM_data2 DWARFForm = 0x05\n DW_FORM_data4 
DWARFForm = 0x06\n DW_FORM_data8 DWARFForm = 0x07\n DW_FORM_string DWARFForm = 0x08\n DW_FORM_block DWARFForm = 0x09\n DW_FORM_block1 DWARFForm = 0x0a\n DW_FORM_data1 DWARFForm = 0x0b\n DW_FORM_flag DWARFForm = 0x0c\n DW_FORM_sdata DWARFForm = 0x0d\n DW_FORM_strp DWARFForm = 0x0e\n DW_FORM_udata DWARFForm = 0x0f\n DW_FORM_ref_addr DWARFForm = 0x10\n DW_FORM_ref1 DWARFForm = 0x11\n DW_FORM_ref2 DWARFForm = 0x12\n DW_FORM_ref4 DWARFForm = 0x13\n DW_FORM_ref8 DWARFForm = 0x14\n DW_FORM_ref_udata DWARFForm = 0x15\n DW_FORM_indirect DWARFForm = 0x16\n DW_FORM_sec_offset DWARFForm = 0x17\n DW_FORM_exprloc DWARFForm = 0x18\n DW_FORM_flag_present DWARFForm = 0x19\n DW_FORM_strx DWARFForm = 0x1a\n DW_FORM_addrx DWARFForm = 0x1b\n DW_FORM_ref_sup4 DWARFForm = 0x1c\n DW_FORM_strp_sup DWARFForm = 0x1d\n DW_FORM_data16 DWARFForm = 0x1e\n DW_FORM_line_strp DWARFForm = 0x1f\n DW_FORM_ref_sig8 DWARFForm = 0x20\n DW_FORM_implicit_const DWARFForm = 0x21\n DW_FORM_loclistx DWARFForm = 0x22\n DW_FORM_rnglistx DWARFForm = 0x23\n DW_FORM_ref_sup8 DWARFForm = 0x24\n DW_FORM_strx1 DWARFForm = 0x25\n DW_FORM_strx2 DWARFForm = 0x26\n DW_FORM_strx3 DWARFForm = 0x27\n DW_FORM_strx4 DWARFForm = 0x28\n DW_FORM_addrx1 DWARFForm = 0x29\n DW_FORM_addrx2 DWARFForm = 0x2a\n DW_FORM_addrx3 DWARFForm = 0x2b\n DW_FORM_addrx4 DWARFForm = 0x2c\n\n // extensions, etc...\n)\n\ntype Class int\n\nconst (\n address Class = iota\n addrptr\n block\n constant\n exprloc\n flag\n lineptr\n loclist\n loclistptr\n macptr\n rnglist\n rnglistptr\n reference\n str\n stroffsetsptr\n)\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Then, we can write our \u003ccode\u003echooseFormAndAdvanceBySize\u003c/code\u003e function and a couple of helpers:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003efunc chooseFormAndAdvanceBySize(\n reader *BinaryReader,\n header *CUHeader,\n sections *DWARFSections,\n attr abbrevAttribute,\n) Form {\n form := Form{}\n\n dwarfForm := DWARFForm(attr.form)\n switch dwarfForm {\n\n // read N 
bytes of data as a constant value\n case DW_FORM_data1:\n form.data, _ = Read[uint8](reader)\n form.class = constant\n case DW_FORM_data2:\n form.data, _ = Read[uint16](reader)\n form.class = constant\n case DW_FORM_data4:\n form.data, _ = Read[uint32](reader)\n form.class = constant\n case DW_FORM_data8:\n form.data, _ = Read[uint64](reader)\n form.class = constant\n case DW_FORM_sdata:\n form.data, _ = leb128.DecodeS64(reader)\n form.class = constant\n case DW_FORM_udata:\n form.data, _ = leb128.DecodeU64(reader)\n form.class = constant\n\n // read an address\n case DW_FORM_addr:\n form.data, _ = Read[uintptr](reader)\n form.class = address\n\n // read a reference of N bytes\n case DW_FORM_ref_addr:\n form.data = readOffset(header, reader)\n form.class = reference\n case DW_FORM_ref1:\n form.data, _ = Read[uint8](reader)\n form.class = reference\n case DW_FORM_ref2:\n form.data, _ = Read[uint16](reader)\n form.class = reference\n case DW_FORM_ref4:\n form.data, _ = Read[uint32](reader)\n form.class = reference\n case DW_FORM_ref8:\n form.data, _ = Read[uint64](reader)\n form.class = reference\n\n // flags\n case DW_FORM_flag:\n form.data, _ = Read[uint8](reader)\n form.class = flag\n case DW_FORM_flag_present:\n form.data = []byte{1} // just indicates true\n form.class = flag\n\n // strings\n case DW_FORM_string:\n // read a string from the .debug_info section\n form.data = readNullTerminatedString(reader)\n form.class = str\n case DW_FORM_strp:\n // read a string from the .debug_str section\n offset := readOffset(header, reader)\n strSection := sections.str[offset:]\n strReader := NewBinaryReader(bytes.NewBuffer(strSection))\n form.data = readNullTerminatedString(strReader)\n form.class = str\n case DW_FORM_line_strp: // first introduced in DWARF v5\n // read a string from the .debug_line_str section\n offset := readOffset(header, reader)\n strSection := sections.line_str[offset:]\n strReader := NewBinaryReader(bytes.NewBuffer(strSection))\n form.data = 
readNullTerminatedString(strReader)\n form.class = str\n\n // read a DWARF expression as an N byte buffer\n // (much more on these in a later post!)\n case DW_FORM_exprloc:\n length, _ := leb128.DecodeU64(reader)\n buf := make([]byte, length)\n reader.Read(buf)\n form.data = buf\n form.class = exprloc\n\n // offset into one of many sections based on the attribute\n case DW_FORM_sec_offset:\n form.data = readOffset(header, reader)\n\n // the class depends on the attribute's name, not its form;\n // there are many more of these, and this is a real pain\n // when writing a real parser, so be sure to RTFM\n switch attr.name {\n case 0x10: // DW_AT_stmt_list\n form.class = lineptr\n }\n\n // this is a hard-coded constant value that comes from\n // the attribute itself in the .debug_abbrev section\n case DW_FORM_implicit_const:\n form.data = attr.implicitConstVal\n form.class = constant\n }\n\n return form\n}\n\nfunc readOffset(header *CUHeader, reader *BinaryReader) uint64 {\n if header.is32Bit {\n val, _ := Read[uint32](reader)\n return uint64(val)\n }\n\n val, _ := Read[uint64](reader)\n return val\n}\n\nfunc readNullTerminatedString(reader *BinaryReader) string {\n buf := []byte{}\n ch, _ := Read[uint8](reader)\n for ch != 0 {\n buf = append(buf, ch)\n ch, _ = Read[uint8](reader)\n }\n return string(buf)\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Phew! That was a lot. However, now we have a full DIE tree! Be sure to check your work against \u003ccode\u003edwarfdump cloop\u003c/code\u003e because a lot of these are subtle and it's easy to make mistakes, plus a single mistake tends to lead to cascading failures (i.e. 
if you read one byte instead of two, all subsequent reads are now off by one).\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"summary\" href=\"#summary\"\u003e\u003ch4\u003eSummary\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Today, we parsed the \u003ccode\u003e.debug_abbrev\u003c/code\u003e and \u003ccode\u003e.debug_info\u003c/code\u003e sections to ultimately construct a tree of debug information entries.\n \u003c/p\u003e\n \u003cp\u003e\n As was mentioned throughout the article, there's a lot more nuance to writing a \u003ccode\u003e.debug_info\u003c/code\u003e parser that's able to handle any binary you'd encounter in the wild. Refer to the DWARF documentation for more info, but hopefully this was a helpful jumping-off point for understanding the structure of the section.\n \u003c/p\u003e\n \u003cp\u003e\n Stay tuned for the next part where we'll learn how to read line number information so we can map addresses in the program text back to their source location!\n \u003c/p\u003e\n\u003c/div\u003e\n\n\u003cdiv class=\"margin-top\"\u003e\n \u003ci\u003e\n Thank you for reading the \u003ca href=\"/dwarf\"\u003eseries on DWARF\u003c/a\u003e. Please don't hesitate to reach out with comments, questions, or errata to jim at this domain dot com.\n \u003c/i\u003e\n\u003c/div\u003e\n\n\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"row footer\"\u003e\n \u003cdiv\u003e\n Subscribe for updates via email (your data will never be shared, ever)\n \u003cdiv class=\"margin-top-small\"\u003e\n \u003cinput id=\"email-input\" type=\"email\" onkeydown=\"emailSignupKeyEvent(this)\" /\u003e\n \u003cbutton id=\"email-submit\" onclick=\"emailSignup()\"\u003eSign Up\u003c/button\u003e\n \u003cp id=\"email-status\"\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"margin-top\"\u003e\n Copyright © 2024 Jim Calabro. 
All rights reserved.\n \u003c/div\u003e\n \u003c/div\u003e\n \u003c/body\u003e\n\u003c/html\u003e\n", "summary": "What are DIEs? How do we write code to load them from a binary?", "date_published": "2024-09-28T00:00:00Z" }, { "id": "", "url": "https://calabro.io/dwarf", "title": "How DWARF Works: Table of Contents and Introduction", "content_html": "\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n \u003chead\u003e\n \u003cmeta charset=\"utf-8\"\u003e\n \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no\" /\u003e\n \u003cmeta http-equiv=\"Cache-control\" content=\"public\"\u003e\n \u003cmeta name=\"description\" content=\"Jim Calabro\"\u003e\n \u003clink rel=\"canonical\" rel=\"noreferrer\" href=\"https://www.calabro.io\" /\u003e\n \u003clink rel=\"icon\" type=\"image/svg\" href=\"/static/favicon.svg\"\u003e\n \u003ctitle\u003eHow DWARF Works: Table of Contents and Introduction - Jim Calabro\u003c/title\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/lit.min.css\"\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/style.css\"\u003e\n \u003clink href='/static/Raleway.css' rel='stylesheet' type='text/css' async defer\u003e\n \u003clink rel=\"stylesheet\" href=\"/static/font-awesome.min.css\"\u003e\n \u003cscript async defer src=\"/static/font-awesome.min.js\" data-auto-replace-svg=\"nest\"\u003e\u003c/script\u003e\n \u003cscript type=\"text/javascript\"\u003e\n async function emailSignup() {\n const emailSubmit = document.getElementById('email-submit');\n emailSubmit.disabled = true;\n\n const email = document.getElementById('email-input').value;\n const resp = await fetch('/v1/email', {\n method: 'POST',\n body: JSON.stringify({ email: email }),\n });\n\n emailSubmit.disabled = false;\n\n const emailStatus = document.getElementById('email-status');\n if (resp.status === 200) {\n emailStatus.innerText = 'Success! 
Thank you for subscribing.'\n } else {\n emailStatus.innerText = 'Hmm, that didn\\'t work. Please ensure you entered a valid email address and try again later.';\n }\n }\n\n function emailSignupKeyEvent() {\n if (event.key === 'Enter') {\n emailSignup();\n }\n }\n\n const mobileWidth = 560;\n\n function toggleMenu(event) {\n if (window.innerWidth \u003e= mobileWidth) {\n return;\n }\n\n const links = document.querySelector('.links');\n const display = links.style['display'];\n if (!display || display === 'none') {\n links.style.display = 'block';\n } else {\n links.style.display = 'none';\n }\n }\n\n addEventListener('resize', (event) =\u003e {\n if (window.innerWidth \u003e= mobileWidth) {\n document.querySelector('.links').style.display = 'block';\n } else {\n document.querySelector('.links').style.display = 'none';\n }\n });\n \u003c/script\u003e\n \u003c/head\u003e\n \u003cbody\u003e\n \u003cdiv class=\"row\"\u003e\n \u003cdiv class=\"col 2\"\u003e\u003c/div\u003e\n \u003cdiv class=\"col 2\"\u003e\n \u003cdiv class=\"collapsed-links\"\u003e\n \u003ch3 class=\"name-header\"\u003e\u003ca href=\"/\" style=\"cursor: pointer\"\u003eJim Calabro\u003c/a\u003e\u003c/h3\u003e\n \u003cdiv onclick=\"toggleMenu(this)\"\u003e\u003ci class=\"hamburger-menu fa-solid fa-bars\"\u003e\u003c/i\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"links\"\u003e\n \u003cdiv\u003e\u003ca href=\"/about\"\u003eAbout\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\n \u003ca href=\"/rss\"\u003eRSS\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/atom\"\u003eAtom\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/json\"\u003eJSON\u003c/a\u003e\n \u003c/div\u003e\n \u003cdiv\u003e---\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://github.com/jcalabro\" target=\"_blank\"\u003eGitHub \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n 
\u003cdiv\u003e\u003ca href=\"https://www.chess.com/member/jcalabro\" target=\"_blank\"\u003eChess \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://octodon.social/@jcalabro\" target=\"_blank\"\u003eMastodon \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.last.fm/user/thekingofping\" target=\"_blank\"\u003eLast.fm \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\n \n \u003cdiv class=\"post-title\"\u003e\n \u003ch4\u003eHow DWARF Works: Table of Contents and Introduction\u003c/h4\u003e\n \u003ci\u003eSep 25, 2024\u003c/i\u003e\n \u003c/div\u003e\n \n \n\u003cdiv\u003e\n \u003ca name=\"table-of-contents\" href=\"#table-of-contents\"\u003e\u003ch4\u003eTable of Contents\u003c/h4\u003e\u003c/a\u003e\n \u003ch5\u003e\u003ci\u003ePart One: Parsing Debug Info\u003c/i\u003e\u003c/h5\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ca href=\"/dwarf/elf\"\u003eParsing ELF Files\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"/dwarf/die\"\u003eDebug Information Entries\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003eLine Number Information\u003c/li\u003e\n \u003cli\u003eAddress Ranges\u003c/li\u003e\n \u003cli\u003eFrame Tables\u003c/li\u003e\n \u003c/ul\u003e\n \u003ch5 class=\"margin-top\"\u003e\u003ci\u003ePart Two: Using Debug Info at Runtime\u003c/i\u003e\u003c/h5\u003e\n \u003cul\u003e\n \u003cli\u003eInspecting a Child Process\u003c/li\u003e\n \u003cli\u003eStack Unwinding\u003c/li\u003e\n \u003cli\u003eFinding Variable Values\u003c/li\u003e\n \u003c/ul\u003e\n \u003ch5 class=\"margin-top\"\u003e\u003ci\u003eAppendix\u003c/i\u003e\u003c/h5\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ca href=\"/dwarf/go\"\u003eGo Code 
Reference\u003c/a\u003e\u003c/li\u003e\n \u003c/ul\u003e\n\u003c/div\u003e\n\u003cdiv class=\"margin-top\"\u003e\n \u003ca name=\"introduction\" href=\"#introduction\"\u003e\u003ch4\u003eIntroduction\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Welcome to the series on parsing and using DWARF debug info!\n \u003c/p\u003e\n \u003cp\u003e\n Its purpose is to provide a user-friendly starting point for learning about how debug information and debuggers work on Linux. It's written from the perspective of a debugger author, not a compiler author.\n \u003c/p\u003e\n \u003cp\u003e\n Non-goals include being a 100% comprehensive guide, providing details on specific languages or compilers, or giving details on other platforms.\n \u003c/p\u003e\n \u003cp\u003e\n If you want to learn more beyond what's contained in these posts, I highly recommend reading all relevant versions of the source documentation as well as other reference implementations that are pretty decent (links below).\n \u003c/p\u003e\n \u003cp\u003e\n If you have questions, comments, or corrections, please don't hesitate to reach out via email to jim at this domain dot com. I love hearing from you.\n \u003c/p\u003e\n \u003cp\u003e\n Thanks for reading!\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv class=\"margin-top\"\u003e\n \u003ca name=\"why-are-you-writing-this\" href=\"#why-are-you-writing-this\"\u003e\u003ch4\u003eWhy Are You Writing This?\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n I feel strongly that the state of the art in debuggers needs to be improved on Linux. We have \u003ca href=\"https://www.gnu.org/software/gdb/gdb.html\" target=\"_blank\"\u003egdb\u003c/a\u003e and \u003ca href=\"https://lldb.llvm.org/\" target=\"_blank\"\u003elldb\u003c/a\u003e (plus about a million graphical front ends, none of which are good). 
All of these tools take a long time to learn, are slow to use, and they don't make the data you need readily apparent.\n \u003c/p\u003e\n \u003cp\u003e\n There's also \u003ca href=\"https://rr-project.org/\" target=\"_blank\"\u003err\u003c/a\u003e and \u003ca href=\"https://pernos.co/\" target=\"_blank\"\u003ePernosco\u003c/a\u003e, which are astounding technical achievements, but suffer the same issues of taking a long time to get from \"bug\" to \"not bug\" (especially for simple issues). They do a great job of helping solve some of the hardest bugs much faster, but they're not tools I use every day.\n \u003c/p\u003e\n \u003cp\u003e\n Comparatively, Windows has \u003ca href=\"https://visualstudio.microsoft.com/\" target=\"_blank\"\u003eVisual Studio\u003c/a\u003e, \u003ca href=\"https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/\" target=\"_blank\"\u003eWinDbg\u003c/a\u003e, \u003ca href=\"https://remedybg.itch.io/remedybg\" target=\"_blank\"\u003eRemedyBG\u003c/a\u003e, and \u003ca href=\"https://github.com/EpicGamesExt/raddebugger\" target=\"_blank\"\u003ethe RAD Debugger\u003c/a\u003e. Each of these tools does the job of making the information I need about my program available at my fingertips as fast as possible, but unfortunately, they all are Windows-only. Despite its incredible bloat, it's awesome to be able to just press F5 in Visual Studio and have a ton of helpful information at your disposal.\n \u003c/p\u003e\n \u003cp\u003e\n Graphical debuggers are hands-down better than terminal-based ones because debugging is fundamentally a problem of visualization. I want to see lots of relevant information all at once every time I step, and the terminal doesn't facilitate that nearly as well as a purpose-built GUI.\n \u003c/p\u003e\n \u003cp\u003e\n Additionally, terminal-based debuggers are inherently worse than graphical ones because they don't allow me to get at the information I need quickly. 
Starting a debugging session by writing a \u003ci\u003e.gdbinit\u003c/i\u003e file and using the REPL is much slower than using Visual Studio, where you just press F5.\n \u003c/p\u003e\n \u003cp\u003e\n I find that most people simply don't use gdb and instead reach for printf-debugging since it's quicker and easier for the vast majority of use-cases. As a practitioner, that's the correct choice, because what is the point of a debugger that doesn't save you time? However, it leaves a ton of power on the table and is a local maximum that we must push through.\n \u003c/p\u003e\n \u003cp\u003e\n I am certain this situation can be improved. The tools to write a robust graphical debugger for Linux already exist; it's just a long, hard road to actually get it done.\n \u003c/p\u003e\n \u003cp\u003e\n Given that Linux is the most widely deployed operating system on the planet\u003csup\u003e\u003ca href=\"https://gs.statcounter.com/os-market-share\" target=\"_blank\"\u003e1\u003c/a\u003e, \u003ca href=\"https://www.fortunebusinessinsights.com/server-operating-system-market-106601\" target=\"_blank\"\u003e2\u003c/a\u003e\u003c/sup\u003e, further investment in tooling in this area is a no-brainer. 
Hopefully this series can contribute towards a bright future of Linux development tools in some small way.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv class=\"margin-top\"\u003e\n \u003ca name=\"further-reading\" href=\"#further-reading\"\u003e\u003ch4\u003eFurther Reading\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003eAll of the code I link to is non-GPL.\u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ca href=\"https://dwarfstd.org/\" target=\"_blank\"\u003eDWARF Documentation\u003c/a\u003e (be sure to download all the versions that are relevant to your needs, I typically use versions 3, 4, and 5)\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://refspecs.linuxfoundation.org/elf/gabi4+/contents.html\" target=\"_blank\"\u003eELF Documentation\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://pkg.go.dev/debug/elf#NewFile\" target=\"_blank\"\u003eGo's ELF Parser\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://github.com/ziglang/zig/blob/master/src/link/Elf.zig\" target=\"_blank\"\u003eElf.zig\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://github.com/gimli-rs/gimli\" target=\"_blank\"\u003egimli-rs\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003eIan Lance Taylor's \u003ca href=\"https://www.airs.com/blog/\" target=\"_blank\"\u003eBlog\u003c/a\u003e, in particular these posts on stack unwinding: \u003ca href=\"https://www.airs.com/blog/archives/460\" target=\"_blank\"\u003e1\u003c/a\u003e, \u003ca href=\"https://www.airs.com/blog/archives/462\" target=\"_blank\"\u003e2\u003c/a\u003e, \u003ca href=\"https://www.airs.com/blog/archives/464\" target=\"_blank\"\u003e3\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003eEli Bendersky's \u003ca href=\"https://eli.thegreenplace.net/tag/debuggers\" target=\"_blank\"\u003ewriting on debuggers\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003eSy Brand's \u003ca href=\"https://blog.tartanllama.xyz/writing-a-linux-debugger-setup/\" target=\"_blank\"\u003eBuilding a Debugger 
series\u003c/a\u003e\u003c/li\u003e\n \u003c/ul\u003e\n\u003c/div\u003e\n\u003cdiv class=\"margin-top\"\u003e\n \u003ch4\u003eUseful Tools\u003c/h4\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ca href=\"https://man7.org/linux/man-pages/man1/readelf.1.html\" target=\"_blank\"\u003ereadelf\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://man7.org/linux/man-pages/man1/objdump.1.html\" target=\"_blank\"\u003eobjdump\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ca href=\"https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html\" target=\"_blank\"\u003edwarfdump\u003c/a\u003e\u003c/li\u003e\n \u003c/ul\u003e\n\u003c/div\u003e\n\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"row footer\"\u003e\n \u003cdiv\u003e\n Subscribe for updates via email (your data will never be shared, ever)\n \u003cdiv class=\"margin-top-small\"\u003e\n \u003cinput id=\"email-input\" type=\"email\" onkeydown=\"emailSignupKeyEvent(this)\" /\u003e\n \u003cbutton id=\"email-submit\" onclick=\"emailSignup()\"\u003eSign Up\u003c/button\u003e\n \u003cp id=\"email-status\"\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"margin-top\"\u003e\n Copyright © 2024 Jim Calabro. 
All rights reserved.\n \u003c/div\u003e\n \u003c/div\u003e\n \u003c/body\u003e\n\u003c/html\u003e\n", "summary": "This series answers questions such as \"what is DWARF/ELF?\" and \"How do debuggers work?\"", "date_published": "2024-09-25T00:00:00Z" }, { "id": "", "url": "https://calabro.io/dwarf/elf", "title": "How DWARF Works: Parsing Just Enough ELF", "content_html": "\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n \u003chead\u003e\n \u003cmeta charset=\"utf-8\"\u003e\n \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no\" /\u003e\n \u003cmeta http-equiv=\"Cache-control\" content=\"public\"\u003e\n \u003cmeta name=\"description\" content=\"Jim Calabro\"\u003e\n \u003clink rel=\"canonical\" rel=\"noreferrer\" href=\"https://www.calabro.io\" /\u003e\n \u003clink rel=\"icon\" type=\"image/svg\" href=\"/static/favicon.svg\"\u003e\n \u003ctitle\u003eHow DWARF Works: Parsing Just Enough ELF - Jim Calabro\u003c/title\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/lit.min.css\"\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/style.css\"\u003e\n \u003clink href='/static/Raleway.css' rel='stylesheet' type='text/css' async defer\u003e\n \u003clink rel=\"stylesheet\" href=\"/static/font-awesome.min.css\"\u003e\n \u003cscript async defer src=\"/static/font-awesome.min.js\" data-auto-replace-svg=\"nest\"\u003e\u003c/script\u003e\n \u003cscript type=\"text/javascript\"\u003e\n async function emailSignup() {\n const emailSubmit = document.getElementById('email-submit');\n emailSubmit.disabled = true;\n\n const email = document.getElementById('email-input').value;\n const resp = await fetch('/v1/email', {\n method: 'POST',\n body: JSON.stringify({ email: email }),\n });\n\n emailSubmit.disabled = false;\n\n const emailStatus = document.getElementById('email-status');\n if (resp.status === 200) {\n emailStatus.innerText = 
'Success! Thank you for subscribing.'\n } else {\n emailStatus.innerText = 'Hmm, that didn\\'t work. Please ensure you entered a valid email address and try again later.';\n }\n }\n\n function emailSignupKeyEvent() {\n if (event.key === 'Enter') {\n emailSignup();\n }\n }\n\n const mobileWidth = 560;\n\n function toggleMenu(event) {\n if (window.innerWidth \u003e= mobileWidth) {\n return;\n }\n\n const links = document.querySelector('.links');\n const display = links.style['display'];\n if (!display || display === 'none') {\n links.style.display = 'block';\n } else {\n links.style.display = 'none';\n }\n }\n\n addEventListener('resize', (event) =\u003e {\n if (window.innerWidth \u003e= mobileWidth) {\n document.querySelector('.links').style.display = 'block';\n } else {\n document.querySelector('.links').style.display = 'none';\n }\n });\n \u003c/script\u003e\n \u003c/head\u003e\n \u003cbody\u003e\n \u003cdiv class=\"row\"\u003e\n \u003cdiv class=\"col 2\"\u003e\u003c/div\u003e\n \u003cdiv class=\"col 2\"\u003e\n \u003cdiv class=\"collapsed-links\"\u003e\n \u003ch3 class=\"name-header\"\u003e\u003ca href=\"/\" style=\"cursor: pointer\"\u003eJim Calabro\u003c/a\u003e\u003c/h3\u003e\n \u003cdiv onclick=\"toggleMenu(this)\"\u003e\u003ci class=\"hamburger-menu fa-solid fa-bars\"\u003e\u003c/i\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"links\"\u003e\n \u003cdiv\u003e\u003ca href=\"/about\"\u003eAbout\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\n \u003ca href=\"/rss\"\u003eRSS\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/atom\"\u003eAtom\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/json\"\u003eJSON\u003c/a\u003e\n \u003c/div\u003e\n \u003cdiv\u003e---\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://github.com/jcalabro\" target=\"_blank\"\u003eGitHub \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n 
\u003cdiv\u003e\u003ca href=\"https://www.chess.com/member/jcalabro\" target=\"_blank\"\u003eChess \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://octodon.social/@jcalabro\" target=\"_blank\"\u003eMastodon \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.last.fm/user/thekingofping\" target=\"_blank\"\u003eLast.fm \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\n \n \u003cdiv class=\"post-title\"\u003e\n \u003ch4\u003eHow DWARF Works: Parsing Just Enough ELF\u003c/h4\u003e\n \u003ci\u003eSep 25, 2024\u003c/i\u003e\n \u003c/div\u003e\n \n \n\n\u003cdiv class=\"margin-bottom\"\u003e\n \u003ci\u003e\n This is part of the \u003ca href=\"/dwarf\"\u003eseries on DWARF\u003c/a\u003e.\n \u003c/i\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n \u003ca name=\"what-are-elf-and-dwarf\" href=\"#what-are-elf-and-dwarf\"\u003e\u003ch4\u003eWhat are ELF and DWARF?\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n \u003ca href=\"https://en.wikipedia.org/wiki/Executable_and_Linkable_Format\" target=\"_blank\"\u003eExecutable and Linkable Format\u003c/a\u003e (ELF) is a file format for executables, object files, shared libraries, and more that's used on various Unix-like systems. If you've ever downloaded and run a program on Linux, you're using an ELF executable. It's akin to an \u003ccode\u003e.exe\u003c/code\u003e file on Windows.\n \u003c/p\u003e\n \u003cp\u003e\n \u003ca href=\"https://dwarfstd.org/\" target=\"_blank\"\u003eDWARF\u003c/a\u003e is a debugging information format that is used with ELF files. 
Debug information allows you to do neat things with a running program such as:\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003eMap the compiled machine code stored inside back to the original source code\u003c/li\u003e\n \u003cli\u003eFigure out where variables are stored throughout the lifetime of a program as it executes\u003c/li\u003e\n \u003cli\u003eUnwind the callstack to generate a backtrace to find out where your program is stopped and where it came from\u003c/li\u003e\n \u003cli\u003eMuch more!\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n In this series, we'll go in to a lot of detail on topics such as these. Let's take it from the top: parsing ELF files, which contain DWARF debug information.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"our-test-program\" href=\"#our-test-program\"\u003e\u003ch4\u003eOur Test Program\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Let's parse some files! In order to do so, we'll need a program to play around with. Throughout the rest of this series, I'm going to use this dead-simple C program called \u003ccode\u003ecloop\u003c/code\u003e that gets its own process ID, then loops forever and prints it once per second. C is a good choice because it is simple, has no runtime, and it's well-supported by all the tools we'll be using. 
Here's the full program:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003e#include \u0026lt;unistd.h\u0026gt;\n#include \u0026lt;stdio.h\u0026gt;\n\nint main() {\n pid_t pid = getpid();\n unsigned long long ndx = 0;\n while (1) {\n printf(\"c looping (pid %d): %llu\\n\", pid, ndx);\n fflush(stdout);\n ndx++;\n sleep(1);\n }\n\n return 0;\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n I'll be compiling this with gcc 14.2.1 on Manjaro Linux with kernel version 6.9.12 using this \u003ccode\u003ebuild.sh\u003c/code\u003e script, but feel free to play around with \u003ccode\u003eCC\u003c/code\u003e and \u003ccode\u003eDWARF\u003c/code\u003e as we go (it defaults to DWARF version 5):\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003e#!/usr/bin/env bash\n\n${CC:-gcc} -Wall -Wextra -Werror -no-pie -O0 -g -gdwarf-${DWARF:-5} -o cloop main.c\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Additionally, for this series, I'll give some short code examples in \u003ca href=\"https://go.dev/\" target=\"_blank\"\u003eGo\u003c/a\u003e to help illustrate various concepts. I chose Go because it's popular, terse, simple to read, and has a large standard library to help us out. I'll intentionally omit error handling and not worry about writing efficient code to keep the examples short. I'm using Go version 1.22.7.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"parsing-the-elf-file-header\" href=\"#parsing-the-elf-file-header\"\u003e\u003ch4\u003eParsing The ELF File Header\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n There's a lot of data contained within ELF files, but for our needs it's pretty straightforward, and we can ignore most of it. 
We just want to grab the raw binary of each debug info section as well as a couple facts about the executable.\n \u003c/p\u003e\n \u003cp\u003e\n Each binary file \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.eheader.html\" target=\"_blank\"\u003estarts with the ELF header\u003c/a\u003e, followed by various \"sections\", each of which is just a region of the file that has a distinct job. For instance, the program text (machine code) of your executable or object is in the \u003ccode\u003e.text\u003c/code\u003e section.\n \u003c/p\u003e\n \u003cp\u003e\n We first want to open and read the contents of the binary file. It starts with 16 bytes of \"ELF Identifier\" header data. The first four of those bytes are the \u003ca href=\"https://en.wikipedia.org/wiki/List_of_file_signatures\" target=\"_blank\"\u003emagic number\u003c/a\u003e 0x7f followed by 0x45, 0x4c, 0x46, or ELF in ASCII. So using our \u003ca href=\"/dwarf/go#binary-reader\" target=\"_blank\"\u003eBinaryReader\u003c/a\u003e, we'd do:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003efileBuf, _ := os.ReadFile(filePath)\nreader := NewBinaryReader(bytes.NewBuffer(fileBuf), binary.NativeEndian)\n\nmagic := []byte{0x7f, 'E', 'L', 'F'}\n\nmagicBuf := make([]byte, len(magic))\nreader.Read(magicBuf)\n\nif !slices.Equal(magic, magicBuf) {\n panic(\"incorrect ELF magic number\")\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Next comes the remainder of the \u003ccode\u003ee_ident\u003c/code\u003e header section, which contains several one-byte flags, each prefixed with \u003ccode\u003eEI_\u003c/code\u003e, then some padding, which you should skip over. 
They are, in order:\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ccode\u003eEI_CLASS\u003c/code\u003e: address size (1 for a 32-bit binary, 2 for a 64-bit binary)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eEI_DATA\u003c/code\u003e: byte order (1 for 2's complement little-endian, 2 for 2's complement big-endian)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eEI_VERSION\u003c/code\u003e: file format version (should always be 1 as of time of writing)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eEI_OSABI\u003c/code\u003e: operating system and ABI (see Go's implementation for a \u003ca href=\"https://cs.opensource.google/go/go/+/master:src/debug/elf/elf.go;drc=315b6ae682a2a4e7718924a45b8b311a0fe10043;l=122\" target=\"_blank\"\u003elist of values\u003c/a\u003e)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003eEI_ABIVERSION\u003c/code\u003e: often ignored on Linux\u003c/li\u003e\n \u003cli\u003epadding: 7 bytes (gives us 16 total bytes in the e_ident section)\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n Next up are the remaining ELF file header fields, again in order. 
Refer to the documentation or a robust implementation such as \u003ca href=\"https://cs.opensource.google/go/go/+/refs/tags/go1.23.1:src/debug/elf/file.go;drc=08e73e61521d7b83198407211aa232ed4f572f18;l=278\" target=\"_blank\"\u003eGo\u003c/a\u003e or \u003ca href=\"https://github.com/ziglang/zig/blob/085cc54aadb327b9910be2c72b31ea046e7e8f52/lib/std/elf.zig\" target=\"_blank\"\u003eZig\u003c/a\u003e for more information on each field and their values.\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ccode\u003ee_type, uint16\u003c/code\u003e: file type\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_machine, uint16\u003c/code\u003e: machine type\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_version, uint32\u003c/code\u003e: file format version\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_entry, uintptr\u003c/code\u003e: virtual address at which the start of the program resides\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_phoff, uintptr\u003c/code\u003e: byte offset from the start of the file at which the program header table is located\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_shoff, uintptr\u003c/code\u003e: byte offset from the start of the file at which the section header table is located\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_flags, uint32\u003c/code\u003e: processor-specific flags\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_ehsize, uint16\u003c/code\u003e: the number of bytes in this ELF header\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_phentsize, uint16\u003c/code\u003e: the number of bytes in one entry in the program header table (all entries are the same size)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_phnum, uint16\u003c/code\u003e: the number of entries in the program header table\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_shentsize, uint16\u003c/code\u003e: the number of bytes in one entry in the section header table (all entries are the same size)\u003c/li\u003e\n 
\u003cli\u003e\u003ccode\u003ee_shnum, uint16\u003c/code\u003e: the number of entries in the section header table\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003ee_shstrndx, uint16\u003c/code\u003e: the section header table index of the entry associated with the section name string table\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n It's giving us a few facts about the binary, then a list of offsets from the start of the file that indicate where each section is located (everything that starts with \u003ccode\u003esh\u003c/code\u003e). We'll use these fields to look up the section header table, read each entry in the table, and use those entries to find the debug sections we care about.\n \u003c/p\u003e\n \u003cp\u003e\n Note that in Go, \u003ccode\u003euintptr\u003c/code\u003e is the built-in data type for an int of your machine's address size, meaning 4 bytes on 32-bit systems, and 8 bytes on 64-bit systems.\n \u003c/p\u003e\n \u003cp\u003e\n Also, in digging through the docs, you may have noticed some values such as \u003ccode\u003eLOPROC = 0xff00; HIPROC = 0xffff;\u003c/code\u003e. Both ELF and DWARF commonly reserve large ranges of high values for each processor, programming language, OS, etc. to define their own custom values for various enums. We won't be using these too much, but be aware that GNU, Go, Zig, and others commonly make use of these. You'll be able to get more information on each by reading through various compilers.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"parsing-the-section-header-table\" href=\"#parsing-the-section-header-table\"\u003e\u003ch4\u003eParsing The Section Header Table\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Next up, we need to parse each \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html\" target=\"_blank\"\u003esection header\u003c/a\u003e contained within the file. The \"table\" is just a fancy word for \"an array of section header entries\". 
So once we're done, we'll have a list of where all sections start and end within the binary, the name of each section, and some other data.\n \u003c/p\u003e\n \u003cp\u003e\n The section header table starts at the \u003ccode\u003ee_shoff\u003c/code\u003e'th byte in the file, and is \u003ccode\u003ee_shentsize * e_shnum\u003c/code\u003e bytes long.\n \u003c/p\u003e\n \u003cp\u003e\n The fields of each section header are:\n \u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\u003ccode\u003esh_name, uint32\u003c/code\u003e: name of the section as an index in to the string table\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_type, uint32\u003c/code\u003e: section type \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html#sh_type\" target=\"_blank\"\u003eenum\u003c/a\u003e\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_flags, uintptr\u003c/code\u003e: \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html#sh_flags\" target=\"_blank\"\u003eflags\u003c/a\u003e for this section\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_addr, uintptr\u003c/code\u003e: the address at which this section should reside within the address space of the process, if relevant\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_offset, uintptr\u003c/code\u003e: offset from the first byte of the ELF file to where the start of this section resides\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_size, uintptr\u003c/code\u003e: the number of bytes in the section\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_link, uint32\u003c/code\u003e: \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html#sh_link\" target=\"_blank\"\u003eenum\u003c/a\u003e indicating the linkage of this section\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_info, uint32\u003c/code\u003e: \u003ca href=\"https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html#sh_link\" target=\"_blank\"\u003eenum\u003c/a\u003e indicating extra information about this 
section\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_addralign, uintptr\u003c/code\u003e: constraints on the alignment of addresses on the target platform (0 and 1 mean no constraints)\u003c/li\u003e\n \u003cli\u003e\u003ccode\u003esh_entsize, uintptr\u003c/code\u003e: if the section contains a table of fixed-size elements (i.e. a symbol table), this is the size of each element\u003c/li\u003e\n \u003c/ul\u003e\n \u003cp\u003e\n Read \u003ccode\u003ee_shnum\u003c/code\u003e entries, which should be exactly enough bytes. To give an example of how this might look in code, consider:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003etype ELFSectionHeader struct {\n sh_name uint32\n sh_type uint32\n sh_flags uintptr\n sh_addr uintptr\n sh_offset uintptr\n sh_size uintptr\n sh_link uint32\n sh_info uint32\n sh_addralign uintptr\n sh_entsize uintptr\n\n // this is not part of the standard, but we'll\n // look up and store the name on this struct later\n name string\n}\n\n// convert before multiplying so the product can't overflow a uint16\nsectionHeaderTable := fileBuf[shOff : shOff+uintptr(shentSize)*uintptr(shNum)]\nsectionHeaderTableReader := NewBinaryReader(\n bytes.NewBuffer(sectionHeaderTable),\n binary.NativeEndian,\n)\n\nsectionHeaders := []*ELFSectionHeader{}\nfor ndx := 0; ndx \u0026lt; int(shNum); ndx++ {\n header := \u0026ELFSectionHeader{}\n header.sh_name, _ = Read[uint32](sectionHeaderTableReader)\n header.sh_type, _ = Read[uint32](sectionHeaderTableReader)\n header.sh_flags, _ = Read[uintptr](sectionHeaderTableReader)\n header.sh_addr, _ = Read[uintptr](sectionHeaderTableReader)\n header.sh_offset, _ = Read[uintptr](sectionHeaderTableReader)\n header.sh_size, _ = Read[uintptr](sectionHeaderTableReader)\n header.sh_link, _ = Read[uint32](sectionHeaderTableReader)\n header.sh_info, _ = Read[uint32](sectionHeaderTableReader)\n header.sh_addralign, _ = Read[uintptr](sectionHeaderTableReader)\n header.sh_entsize, _ = Read[uintptr](sectionHeaderTableReader)\n\n sectionHeaders = append(sectionHeaders, 
header)\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Once we have all this information, we're going to want to use the \u003ccode\u003esh_name\u003c/code\u003e field to look up our section name in the section header string table. This is the ELF section with index \u003ccode\u003ee_shstrndx\u003c/code\u003e, named \u003ccode\u003e.shstrtab\u003c/code\u003e. In my case with the test C program, it's the 35th section, though yours may be different.\n \u003c/p\u003e\n \u003cp\u003e\n This table is a series of null-terminated strings all next to each other in one long array. You can read the entire table in to an array, then use the \u003ccode\u003esh_name\u003c/code\u003e field to find the entry at that index.\n \u003c/p\u003e\n \u003cp\u003e\n I'll use the \u003ccode\u003esh_size\u003c/code\u003e and \u003ccode\u003esh_offset\u003c/code\u003e fields of the \u003ccode\u003ee_shstrndx\u003c/code\u003e'th entry to find our location within the binary:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003esectionNames := sectionHeaders[shStrTabNdx]\n\nstart := sectionNames.sh_offset\nend := start + sectionNames.sh_size\nsectionNamesBuf := fileBuf[start:end]\n\nfor _, header := range sectionHeaders {\n for ndx := header.sh_name; ; ndx++ {\n ch := sectionNamesBuf[ndx]\n if ch == 0 {\n break\n }\n header.name += string(ch)\n }\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Now we're able to look up each debug information section by name!\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"debug-info-sections\" href=\"#debug-info-sections\"\u003e\u003ch4\u003eDebug Info Sections\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n There's a fair number of sections in there! You may recognize some of them, but for the most part, we care about the ones that start with \u003ccode\u003e.debug_\u003c/code\u003e, though we also care about \u003ccode\u003e.eh_frame\u003c/code\u003e. 
If you want to check your work, you can with \u003ccode\u003ereadelf --headers cloop\u003c/code\u003e. We'll get in to what each of these sections mean over time.\n \u003c/p\u003e\n \u003cp\u003e\n There are actually a few sections that are missing from the binary on my machine that we also would want to save if they were present (there are some sections that were present in older versions of DWARF for instance, but were dropped when v5 was released). We'll want to take the content of each one of those sections and save them for parsing later:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003e type DWARFSections struct {\n abbrev []byte\n line []byte\n info []byte\n addr []byte\n aranges []byte\n frame []byte\n eh_frame []byte\n line_str []byte\n loc []byte\n loclists []byte\n names []byte\n macinfo []byte\n macro []byte\n pubnames []byte\n pubtypes []byte\n ranges []byte\n rnglists []byte\n str []byte\n str_offsets []byte\n types []byte\n }\n\n getSection := func(header *ELFSectionHeader) []byte {\n start := header.sh_offset\n end := header.sh_offset + header.sh_size\n return fileBuf[start:end]\n }\n\n sections := \u0026DWARFSections{}\n for _, header := range sectionHeaders {\n switch header.name {\n case \".debug_abbrev\":\n sections.abbrev = getSection(header)\n case \".debug_line\":\n sections.line = getSection(header)\n case \".debug_info\":\n sections.info = getSection(header)\n case \".debug_addr\":\n sections.addr = getSection(header)\n case \".debug_aranges\":\n sections.aranges = getSection(header)\n case \".debug_frame\":\n sections.frame = getSection(header)\n case \".eh_frame\":\n sections.eh_frame = getSection(header)\n case \".debug_line_str\":\n sections.line_str = getSection(header)\n case \".debug_loc\":\n sections.loc = getSection(header)\n case \".debug_loclists\":\n sections.loclists = getSection(header)\n case \".debug_names\":\n sections.names = getSection(header)\n case \".debug_macinfo\":\n sections.macinfo = getSection(header)\n case 
\".debug_macro\":\n sections.macro = getSection(header)\n case \".debug_pubnames\":\n sections.pubnames = getSection(header)\n case \".debug_pubtypes\":\n sections.pubtypes = getSection(header)\n case \".debug_ranges\":\n sections.ranges = getSection(header)\n case \".debug_rnglists\":\n sections.rnglists = getSection(header)\n case \".debug_str\":\n sections.str = getSection(header)\n case \".debug_str_offsets\":\n sections.str_offsets = getSection(header)\n case \".debug_types\":\n sections.types = getSection(header)\n }\n }\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n Now we're \u003ci\u003ealmost\u003c/i\u003e ready to start parsing those debug info sections in to something that allows us to inspect a running program!\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"pie\" href=\"#pie\"\u003e\u003ch4\u003ePIE 🥧\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n The last thing we'll want to do with our ELF file for now is examine it to determine if it is a \u003ca href=\"https://en.wikipedia.org/wiki/Position-independent_code\" target=\"_blank\"\u003eposition independent executable\u003c/a\u003e (PIE, also known as position independent code or PIC). PIE means that the code can be loaded and executed at any address in the process' memory space, and is the opposite of absolute code, which must be loaded at a fixed address in memory. You can enable PIE with the \u003ccode\u003e-fPIC\u003c/code\u003e \u003ca href=\"https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fpic\" target=\"_blank\"\u003ecompiler flag\u003c/a\u003e in gcc and clang. 
It ultimately doesn't restrict our capabilities as a debugger at all; it just means that we need to look up where in the process' address space our code is loaded when we start the program (we'll do that much later).\n \u003c/p\u003e\n \u003cp\u003e\n For now, we can determine if we're PIE based on the value of the \u003ccode\u003eFLAGS_1\u003c/code\u003e field in the \u003ccode\u003e.dynamic\u003c/code\u003e section like so:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003evar dynamicHeader *ELFSectionHeader\nfor _, header := range sectionHeaders {\n if header.name == \".dynamic\" {\n dynamicHeader = header\n break\n }\n}\n\npie := false\ndynamicBuf := getSection(dynamicHeader)\ndynamicReader := NewBinaryReader(bytes.NewBuffer(dynamicBuf))\nfor {\n tag, _ := Read[uintptr](dynamicReader)\n val, err := Read[uintptr](dynamicReader)\n\n if tag == 0x6fff_fffb { // DT_FLAGS_1\n if (val \u0026 0x0800_0000) \u003e 0 { // DF_1_PIE\n pie = true\n break\n }\n }\n\n if err == io.EOF {\n break\n }\n}\n\u003c/code\u003e\u003c/pre\u003e\n \u003cp\u003e\n You can check your work on this using \u003ccode\u003ereadelf --dynamic cloop\u003c/code\u003e. You may want to try compiling cloop with \u003ccode\u003e-fPIE\u003c/code\u003e and without \u003ccode\u003e-no-pie\u003c/code\u003e and re-running your parser to make sure things are looking good.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"summary\" href=\"#summary\"\u003e\u003ch4\u003eSummary\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n That's it for today! We learned what ELF and DWARF are as well as how to parse just enough ELF to get the debug information sections we care about. ELF is probably the easiest section in this series, so strap in.\n \u003c/p\u003e\n\u003c/div\u003e\n\n\u003cdiv class=\"margin-top\"\u003e\n \u003ci\u003e\n Thank you for reading the \u003ca href=\"/dwarf\"\u003eseries on DWARF\u003c/a\u003e. 
Please don't hesitate to reach out with comments, questions, or errata to jim at this domain dot com.\n \u003c/i\u003e\n\u003c/div\u003e\n\n\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"row footer\"\u003e\n \u003cdiv\u003e\n Subscribe for updates via email (your data will never be shared, ever)\n \u003cdiv class=\"margin-top-small\"\u003e\n \u003cinput id=\"email-input\" type=\"email\" onkeydown=\"emailSignupKeyEvent(this)\" /\u003e\n \u003cbutton id=\"email-submit\" onclick=\"emailSignup()\"\u003eSign Up\u003c/button\u003e\n \u003cp id=\"email-status\"\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"margin-top\"\u003e\n Copyright © 2024 Jim Calabro. All rights reserved.\n \u003c/div\u003e\n \u003c/div\u003e\n \u003c/body\u003e\n\u003c/html\u003e\n", "summary": "What is an ELF file? Why do we need it? How do we parse just enough of one to get the information we care about?", "date_published": "2024-09-25T00:00:00Z" }, { "id": "", "url": "https://calabro.io/dwarf/go", "title": "How DWARF Works: Go Code Reference", "content_html": "\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n \u003chead\u003e\n \u003cmeta charset=\"utf-8\"\u003e\n \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no\" /\u003e\n \u003cmeta http-equiv=\"Cache-control\" content=\"public\"\u003e\n \u003cmeta name=\"description\" content=\"Jim Calabro\"\u003e\n \u003clink rel=\"canonical\" rel=\"noreferrer\" href=\"https://www.calabro.io\" /\u003e\n \u003clink rel=\"icon\" type=\"image/svg\" href=\"/static/favicon.svg\"\u003e\n \u003ctitle\u003eHow DWARF Works: Go Code Reference - Jim Calabro\u003c/title\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/lit.min.css\"\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/style.css\"\u003e\n \u003clink 
href='/static/Raleway.css' rel='stylesheet' type='text/css' async defer\u003e\n \u003clink rel=\"stylesheet\" href=\"/static/font-awesome.min.css\"\u003e\n \u003cscript async defer src=\"/static/font-awesome.min.js\" data-auto-replace-svg=\"nest\"\u003e\u003c/script\u003e\n \u003cscript type=\"text/javascript\"\u003e\n async function emailSignup() {\n const emailSubmit = document.getElementById('email-submit');\n emailSubmit.disabled = true;\n\n const email = document.getElementById('email-input').value;\n const resp = await fetch('/v1/email', {\n method: 'POST',\n body: JSON.stringify({ email: email }),\n });\n\n emailSubmit.disabled = false;\n\n const emailStatus = document.getElementById('email-status');\n if (resp.status === 200) {\n emailStatus.innerText = 'Success! Thank you for subscribing.'\n } else {\n emailStatus.innerText = 'Hmm, that didn\\'t work. Please ensure you entered a valid email address and try again later.';\n }\n }\n\n function emailSignupKeyEvent() {\n if (event.key === 'Enter') {\n emailSignup();\n }\n }\n\n const mobileWidth = 560;\n\n function toggleMenu(event) {\n if (window.innerWidth \u003e= mobileWidth) {\n return;\n }\n\n const links = document.querySelector('.links');\n const display = links.style['display'];\n if (!display || display === 'none') {\n links.style.display = 'block';\n } else {\n links.style.display = 'none';\n }\n }\n\n addEventListener('resize', (event) =\u003e {\n if (window.innerWidth \u003e= mobileWidth) {\n document.querySelector('.links').style.display = 'block';\n } else {\n document.querySelector('.links').style.display = 'none';\n }\n });\n \u003c/script\u003e\n \u003c/head\u003e\n \u003cbody\u003e\n \u003cdiv class=\"row\"\u003e\n \u003cdiv class=\"col 2\"\u003e\u003c/div\u003e\n \u003cdiv class=\"col 2\"\u003e\n \u003cdiv class=\"collapsed-links\"\u003e\n \u003ch3 class=\"name-header\"\u003e\u003ca href=\"/\" style=\"cursor: pointer\"\u003eJim Calabro\u003c/a\u003e\u003c/h3\u003e\n \u003cdiv 
onclick=\"toggleMenu(this)\"\u003e\u003ci class=\"hamburger-menu fa-solid fa-bars\"\u003e\u003c/i\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"links\"\u003e\n \u003cdiv\u003e\u003ca href=\"/about\"\u003eAbout\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\n \u003ca href=\"/rss\"\u003eRSS\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/atom\"\u003eAtom\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/json\"\u003eJSON\u003c/a\u003e\n \u003c/div\u003e\n \u003cdiv\u003e---\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://github.com/jcalabro\" target=\"_blank\"\u003eGitHub \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.chess.com/member/jcalabro\" target=\"_blank\"\u003eChess \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://octodon.social/@jcalabro\" target=\"_blank\"\u003eMastodon \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.last.fm/user/thekingofping\" target=\"_blank\"\u003eLast.fm \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\n \n \u003cdiv class=\"post-title\"\u003e\n \u003ch4\u003eHow DWARF Works: Go Code Reference\u003c/h4\u003e\n \u003ci\u003eSep 25, 2024\u003c/i\u003e\n \u003c/div\u003e\n \n \n\n\u003cdiv class=\"margin-bottom\"\u003e\n \u003ci\u003e\n This is part of the \u003ca href=\"/dwarf\"\u003eseries on DWARF\u003c/a\u003e.\n \u003c/i\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n \u003cp\u003e\n This page contains Go helper snippets that illustrate how parsing works, but are used across many sections. 
These snippets often intentionally ignore error handling in favor of brevity.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"leb128\" href=\"#leb128\"\u003e\u003ch4\u003eLEB128\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n\t\tA key integer type that is used all over the place in DWARF is signed/unsigned \u003ca href=\"https://en.wikipedia.org/wiki/LEB128\" target=\"_blank\"\u003eLEB128\u003c/a\u003e. It's a variable-width signed/unsigned integer type, though in practice, I've never seen anything use more than 64 bits, but it's perfectly reasonable to assume that's not always the case.\n \u003c/p\u003e\n \u003cp\u003e\n\t\tI wrote \u003ca href=\"https://github.com/jcalabro/leb128\" target=\"_blank\"\u003ea simple library\u003c/a\u003e to parse these a couple years ago, and I'm going to be using it throughout the series. You can import it just like you would any other Go library, or read its implementation and write your own. Zig also has a \u003ca href=\"https://github.com/ziglang/zig/blob/085cc54aadb327b9910be2c72b31ea046e7e8f52/lib/std/leb128.zig\" target=\"_blank\"\u003egood implementation\u003c/a\u003e for reference.\n \u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"initial-length\" href=\"#initial-length\"\u003e\u003ch4\u003eInitial Length\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n An \u003ccode\u003einitial length\u003c/code\u003e field in DWARF is a special integer encoding that takes either 4 or 12 bytes and allows you to encode either a 4 or 8 byte integer. To parse one, you read a 4-byte integer, and if it's any value other than \u003ccode\u003e0xffffffff\u003c/code\u003e, you use it as-is. 
If it's that special value, you read another eight bytes, discard the initial four bytes you read, and just use the final eight bytes.\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003efunc readInitialLength(reader *BinaryReader) uintptr {\n val32, _ := Read[uint32](reader)\n if val32 != 0xffffffff {\n return uintptr(val32)\n }\n\n val64, _ := Read[uint64](reader)\n return uintptr(val64)\n}\n\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n \u003ca name=\"binary-reader\" href=\"#binary-reader\"\u003e\u003ch4\u003eBinary Reader\u003c/h4\u003e\u003c/a\u003e\n \u003cp\u003e\n Go has an \u003ca href=\"https://pkg.go.dev/encoding/binary\" target=\"_blank\"\u003eencoding/binary\u003c/a\u003e package, but it doesn't allow for reading arbitrary integers, nor does it allow us to know how far along the buffer we've read (our offset). We can wrap it a bit more nicely and ensure we're always reading in the correct byte order using our own \u003ccode\u003eBinaryReader\u003c/code\u003e:\n \u003c/p\u003e\n \u003cpre\u003e\u003ccode\u003etype Signed interface {\n\t~int | ~int8 | ~int16 | ~int32 | ~int64\n}\n\ntype Unsigned interface {\n\t~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr\n}\n\ntype Integer interface {\n\tSigned | Unsigned\n}\n\ntype BinaryReader struct {\n *bytes.Buffer\n\n startLen int\n}\n\n// Returns how many bytes we've read since the start of the buffer\nfunc (br *BinaryReader) Offset() int {\n return br.startLen - br.Len()\n}\n\nfunc NewBinaryReader(r *bytes.Buffer) *BinaryReader {\n return \u0026BinaryReader{Buffer: r, startLen: r.Len()}\n}\n\nfunc Read[T Integer](br *BinaryReader) (T, error) {\n empty := *new(T)\n size := unsafe.Sizeof(empty)\n enc := binary.NativeEndian\n\n // read N bytes from the reader\n buf := make([]byte, size)\n err := binary.Read(br, enc, buf)\n if err != nil {\n return empty, err\n }\n\n // convert to the appropriate type\n val := *new(T)\n switch any(val).(type) {\n case int8:\n val = T(int8(buf[0]))\n 
case uint8:\n val = T(uint8(buf[0]))\n\n case int16:\n val = T(int16(enc.Uint16(buf)))\n case uint16:\n val = T(enc.Uint16(buf))\n\n case int32:\n val = T(int32(enc.Uint32(buf)))\n case uint32:\n val = T(enc.Uint32(buf))\n\n case int, int64:\n val = T(int64(enc.Uint64(buf)))\n case uint, uint64:\n val = T(enc.Uint64(buf))\n case uintptr:\n if size == 4 {\n val = T(uintptr(enc.Uint32(buf)))\n } else if size == 8 {\n val = T(uintptr(enc.Uint64(buf)))\n } else {\n return empty, fmt.Errorf(\"word size %d not supported\", size)\n }\n\n default:\n return empty, fmt.Errorf(\"unknown data type during load: %T\", val)\n }\n\n return val, nil\n}\n\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"row footer\"\u003e\n \u003cdiv\u003e\n Subscribe for updates via email (your data will never be shared, ever)\n \u003cdiv class=\"margin-top-small\"\u003e\n \u003cinput id=\"email-input\" type=\"email\" onkeydown=\"emailSignupKeyEvent(this)\" /\u003e\n \u003cbutton id=\"email-submit\" onclick=\"emailSignup()\"\u003eSign Up\u003c/button\u003e\n \u003cp id=\"email-status\"\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"margin-top\"\u003e\n Copyright © 2024 Jim Calabro. 
All rights reserved.\n \u003c/div\u003e\n \u003c/div\u003e\n \u003c/body\u003e\n\u003c/html\u003e\n", "summary": "Reference Go helper code used throughout the How DWARF Works series", "date_published": "2024-09-25T00:00:00Z" }, { "id": "", "url": "https://calabro.io/hello", "title": "Hello!", "content_html": "\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n \u003chead\u003e\n \u003cmeta charset=\"utf-8\"\u003e\n \u003cmeta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no\" /\u003e\n \u003cmeta http-equiv=\"Cache-control\" content=\"public\"\u003e\n \u003cmeta name=\"description\" content=\"Jim Calabro\"\u003e\n \u003clink rel=\"canonical\" rel=\"noreferrer\" href=\"https://www.calabro.io\" /\u003e\n \u003clink rel=\"icon\" type=\"image/svg\" href=\"/static/favicon.svg\"\u003e\n \u003ctitle\u003eHello! - Jim Calabro\u003c/title\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/lit.min.css\"\u003e\n \u003clink rel=\"stylesheet\" type=\"text/css\" charset=\"utf-8\" href=\"/static/style.css\"\u003e\n \u003clink href='/static/Raleway.css' rel='stylesheet' type='text/css' async defer\u003e\n \u003clink rel=\"stylesheet\" href=\"/static/font-awesome.min.css\"\u003e\n \u003cscript async defer src=\"/static/font-awesome.min.js\" data-auto-replace-svg=\"nest\"\u003e\u003c/script\u003e\n \u003cscript type=\"text/javascript\"\u003e\n async function emailSignup() {\n const emailSubmit = document.getElementById('email-submit');\n emailSubmit.disabled = true;\n\n const email = document.getElementById('email-input').value;\n const resp = await fetch('/v1/email', {\n method: 'POST',\n body: JSON.stringify({ email: email }),\n });\n\n emailSubmit.disabled = false;\n\n const emailStatus = document.getElementById('email-status');\n if (resp.status === 200) {\n emailStatus.innerText = 'Success! Thank you for subscribing.'\n } else {\n emailStatus.innerText = 'Hmm, that didn\\'t work. 
Please ensure you entered a valid email address and try again later.';\n }\n }\n\n function emailSignupKeyEvent() {\n if (event.key === 'Enter') {\n emailSignup();\n }\n }\n\n const mobileWidth = 560;\n\n function toggleMenu(event) {\n if (window.innerWidth \u003e= mobileWidth) {\n return;\n }\n\n const links = document.querySelector('.links');\n const display = links.style['display'];\n if (!display || display === 'none') {\n links.style.display = 'block';\n } else {\n links.style.display = 'none';\n }\n }\n\n addEventListener('resize', (event) =\u003e {\n if (window.innerWidth \u003e= mobileWidth) {\n document.querySelector('.links').style.display = 'block';\n } else {\n document.querySelector('.links').style.display = 'none';\n }\n });\n \u003c/script\u003e\n \u003c/head\u003e\n \u003cbody\u003e\n \u003cdiv class=\"row\"\u003e\n \u003cdiv class=\"col 2\"\u003e\u003c/div\u003e\n \u003cdiv class=\"col 2\"\u003e\n \u003cdiv class=\"collapsed-links\"\u003e\n \u003ch3 class=\"name-header\"\u003e\u003ca href=\"/\" style=\"cursor: pointer\"\u003eJim Calabro\u003c/a\u003e\u003c/h3\u003e\n \u003cdiv onclick=\"toggleMenu(this)\"\u003e\u003ci class=\"hamburger-menu fa-solid fa-bars\"\u003e\u003c/i\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"links\"\u003e\n \u003cdiv\u003e\u003ca href=\"/about\"\u003eAbout\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\n \u003ca href=\"/rss\"\u003eRSS\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/atom\"\u003eAtom\u003c/a\u003e\n \u003cspan class=\"divider\"\u003e | \u003c/span\u003e\n \u003ca href=\"/json\"\u003eJSON\u003c/a\u003e\n \u003c/div\u003e\n \u003cdiv\u003e---\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://github.com/jcalabro\" target=\"_blank\"\u003eGitHub \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.chess.com/member/jcalabro\" target=\"_blank\"\u003eChess 
\u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://octodon.social/@jcalabro\" target=\"_blank\"\u003eMastodon \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003cdiv\u003e\u003ca href=\"https://www.last.fm/user/thekingofping\" target=\"_blank\"\u003eLast.fm \u003ci class=\"icon fa fa-arrow-up-right-from-square\"\u003e\u003c/i\u003e\u003c/a\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\n \n \n\u003cp\u003eHey! I'm Jim, and this is my site.\u003c/p\u003e\n\u003cp\u003eMost of the content here will be about computing, technology, and business. There may also be some occasional stuff on cycling, games, music, or something else.\u003c/p\u003e\n\u003cp\u003eThe cadence of writing will be relatively lax; I wouldn't expect more than a post every few months. I'll try my best to keep you apprised of the stuff I'm working on.\u003c/p\u003e\n\u003cp\u003eThanks for reading!\u003c/p\u003e\n\n \u003c/div\u003e\n \u003cdiv class=\"col 4\"\u003e\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"row footer\"\u003e\n \u003cdiv\u003e\n Subscribe for updates via email (your data will never be shared, ever)\n \u003cdiv class=\"margin-top-small\"\u003e\n \u003cinput id=\"email-input\" type=\"email\" onkeydown=\"emailSignupKeyEvent(this)\" /\u003e\n \u003cbutton id=\"email-submit\" onclick=\"emailSignup()\"\u003eSign Up\u003c/button\u003e\n \u003cp id=\"email-status\"\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv class=\"margin-top\"\u003e\n Copyright © 2024 Jim Calabro. All rights reserved.\n \u003c/div\u003e\n \u003c/div\u003e\n \u003c/body\u003e\n\u003c/html\u003e\n", "summary": "What is this site?", "date_published": "2024-07-11T00:00:00Z" } ] }