01 - Parse and manipulate formats

The objective of this tutorial is to give an overview of the LIEF API for parsing and manipulating executable formats.

By Romain Thomas - @rh0main


ELF

We start with the ELF format. To create an ELF.Binary from a file, we just have to give its path to the lief.parse() or lief.ELF.parse() function.

Note

With the Python API, these functions have the same behaviour. In C++, however, LIEF::Parser::parse() returns a pointer to a generic LIEF::Binary object, whereas LIEF::ELF::Parser::parse() returns a pointer to a LIEF::ELF::Binary object.

import lief
binary = lief.parse("/bin/ls")
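lief.parse() picks the right parser from the file's contents, which is why the same call is used for both ELF and PE inputs in this tutorial. As a rough illustration of the idea only (this is not LIEF's actual implementation, and the helper name guess_format is hypothetical), a format can be guessed from the first bytes of a file:

```python
def guess_format(data: bytes) -> str:
    """Guess an executable format from its leading magic bytes (illustrative only)."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ":
        # "MZ" opens the DOS header found at the start of PE files; a stricter
        # check would also follow the DOS header to the PE signature.
        return "PE"
    return "unknown"


# In-memory buffers stand in for the first bytes read from a real file.
print(guess_format(b"\x7fELF\x02\x01\x01"))
print(guess_format(b"MZ\x90\x00"))
```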

Once the ELF file has been parsed, we can access its Header:

header = binary.header

Change the entry point and the target architecture (ARCH):

header.entrypoint = 0x123
header.machine_type = lief.ELF.ARCH.AARCH64

and then commit these changes into a new ELF binary:

binary.write("ls.modified")
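To see which on-disk fields these two attributes correspond to, here is a minimal sketch that decodes the start of a 64-bit little-endian ELF header with the standard library only. It is independent of LIEF: the function name read_elf64_header and the in-memory sample are made up for illustration, and the values 0x123 and 183 (EM_AARCH64 in the ELF specification) mirror the ones set above.

```python
import struct

# e_ident[16], e_type, e_machine, e_version, e_entry (ELF64, little-endian)
ELF64_PREFIX = struct.Struct("<16sHHIQ")
EM_AARCH64 = 183  # machine ID for AArch64 in the ELF specification


def read_elf64_header(data: bytes) -> dict:
    """Decode e_machine and e_entry from the first 32 bytes of an ELF64 LE file."""
    ident, _e_type, machine, _version, entry = ELF64_PREFIX.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    return {"machine": machine, "entrypoint": entry}


# Hypothetical in-memory header standing in for the first bytes of "ls.modified".
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
sample = ELF64_PREFIX.pack(ident, 3, EM_AARCH64, 1, 0x123)
info = read_elf64_header(sample)
print(hex(info["entrypoint"]), info["machine"] == EM_AARCH64)
```

With a real file, the same function could be given the bytes returned by open("ls.modified", "rb").read(32).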

We can also iterate over the Sections as follows:

for section in binary.sections:
    print(section.name)  # section's name
    print(section.size)  # section's size
    # Typically matches the size printed above, except for sections
    # that hold no data in the file (e.g. .bss)
    print(len(section.content))
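The loop above can be wrapped into a small helper, for example to find the largest sections of a binary. This is only a sketch that relies on the name and size attributes already used above; largest_sections is a hypothetical helper, not part of LIEF.

```python
def largest_sections(binary, count=3):
    """Return (name, size) pairs for the `count` largest sections of a parsed binary."""
    pairs = [(section.name, section.size) for section in binary.sections]
    # Sort by size, largest first; sections of equal size keep their original order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:count]
```

It would be called on a parsed object, e.g. largest_sections(lief.parse("/bin/ls")).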

To modify the content of the .text section:

text = binary.get_section(".text")
text.content = bytes([0x33] * text.size)
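When the section name comes from user input, it may be worth guarding against a missing section before overwriting anything. The sketch below wraps the two lines above in a hypothetical helper, fill_section. It assumes that get_section() reports a missing section by returning None; depending on the LIEF version, an exception may be raised instead.

```python
def fill_section(binary, name, value=0x33):
    """Overwrite every byte of section `name` with `value`; return the byte count."""
    section = binary.get_section(name)
    if section is None:
        # Assumption: a missing section is reported as None rather than an exception.
        raise KeyError(f"section {name!r} not found")
    # Keep the section size unchanged: write exactly `section.size` bytes.
    section.content = bytes([value]) * section.size
    return section.size
```

The change only exists in memory until the binary is written back with binary.write().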

PE

As with ELF, we can use the lief.parse() or lief.PE.parse() function to create a PE.Binary.

import lief
binary = lief.parse("C:\\Windows\\explorer.exe")

To access the different PE headers (DosHeader, Header and OptionalHeader):

print(binary.dos_header)
print(binary.header)
print(binary.optional_header)
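For orientation, these three headers sit one after another in the file: the DOS header starts at offset 0, its e_lfanew field (at offset 0x3C) points to the "PE\0\0" signature, the 20-byte COFF file header follows the signature, and the optional header comes right after it. The following standard-library sketch locates them in a raw buffer. It does not use LIEF; locate_pe_headers and the sample buffer are hypothetical, and the mapping of dictionary keys to LIEF's attributes is only approximate.

```python
import struct


def locate_pe_headers(data: bytes) -> dict:
    """Return the file offsets of the DOS, COFF and optional headers in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS header magic")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff_offset = pe_offset + 4         # COFF file header: 20 bytes
    optional_offset = coff_offset + 20  # optional header follows immediately
    return {"dos_header": 0, "header": coff_offset, "optional_header": optional_offset}


# Hypothetical minimal buffer: DOS header padded to 0x40 bytes, then the signature.
sample = bytearray(0x40 + 4 + 20)
sample[:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x40)
sample[0x40:0x44] = b"PE\x00\x00"
print(locate_pe_headers(bytes(sample)))
```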

One can also access the imported functions in two ways:

  1. Using the abstract layer

  2. Using the PE definition

# Using the abstract layer: one item per imported function
for func in binary.imported_functions:
    print(func)

# Using the PE definition: one item per imported library
for imported_library in binary.imports:
    print(imported_library)

For finer granularity, such as knowing which library each imported function comes from, or to access other fields of the PE imports, we can process the imports as follows:

for imported_library in binary.imports:
    print("Library name: " + imported_library.name)
    for func in imported_library.entries:
        # Entries imported by ordinal have no name to print
        if not func.is_ordinal:
            print(func.name)
        print(func.iat_address)
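The same traversal can also build a lookup table instead of printing. The sketch below uses only the attributes shown above (name, entries, is_ordinal); imports_by_library is a hypothetical helper, not a LIEF function, and entries imported by ordinal are simply skipped.

```python
def imports_by_library(binary):
    """Map each imported library name to the names of its named (non-ordinal) imports."""
    table = {}
    for imported_library in binary.imports:
        names = [entry.name for entry in imported_library.entries if not entry.is_ordinal]
        table[imported_library.name] = names
    return table
```

Called on the PE.Binary parsed earlier, imports_by_library(binary) would return a dictionary keyed by library name, which makes it easy to check whether a given function is imported and from where.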