LINUX GAZETTE
...making Linux just a little more fun!
Process Tracing Using Ptrace, part 2
By Sandeep S


The basic features of ptrace were explained in Part I, along with a small example. As I said earlier, the main applications of ptrace are accessing the memory or registers of a running process (either for debugging or for some evil purpose). To do that we first need a basic idea of the binary format of executables; only then will we know how and where to access them. So I shall give you a small tutorial on ELF, the binary format used in Linux. The last section of this article presents a small program that accesses the registers and memory of another process and modifies them, injecting some extra code so as to change the output of that process.

Note: Please don't get confused. This is definitely an article about ptrace, not about ELF. But a basic knowledge of ELF is required for accessing the core images of processes, so it has to be explained first.

1. What is ELF?

ELF stands for Executable and Linking Format. It defines the format of executable binaries used on Linux, and also of relocatable, shared object and core dump files. ELF is used by both linkers and loaders. They view an ELF file from two different sides, so the format has to give both a common interface.

An ELF file is organized into sections and segments. Relocatable files need a section header table, executable files need a program header table, and shared object files have both. In the coming sections I shall explain what these headers are.

2. ELF Headers

Every ELF file has an ELF header. It always starts at offset 0 in the file. It contains the details of the binary file: how it should be interpreted, which data structures are related to the file, etc.

The format of the header is given below (taken from /usr/src/include/linux/elf.h)


#define EI_NIDENT       16

typedef struct elf32_hdr{
  unsigned char e_ident[EI_NIDENT];
  Elf32_Half    e_type;
  Elf32_Half    e_machine;
  Elf32_Word    e_version;
  Elf32_Addr    e_entry;  /* Entry point */
  Elf32_Off     e_phoff;
  Elf32_Off     e_shoff;
  Elf32_Word    e_flags;
  Elf32_Half    e_ehsize;
  Elf32_Half    e_phentsize;
  Elf32_Half    e_phnum;
  Elf32_Half    e_shentsize;
  Elf32_Half    e_shnum;
  Elf32_Half    e_shstrndx;
} Elf32_Ehdr;

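A short description of the fields follows.

  1. e_ident : Identification bytes that say how to treat the binary: the magic number (0x7f 'E' 'L' 'F'), the file class (32 or 64 bit), the byte order and the ELF version. Platform dependent.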

  2. e_type : Contains information on the type and how to use the binary. Types are relocatable, executable, shared object and core file.

  3. e_machine : As you have guessed, this field specifies the architecture - Intel 386, Alpha, Sparc etc.

  4. e_version : Gives the version of the object file.

  5. e_phoff : Offset from start to the first program header.

  6. e_shoff : Offset from start to the first section header.

  7. e_flags : Processor specific flags. Not used on i386.

  8. e_ehsize : Size of the ELF header.

  9. e_phentsize & e_shentsize : Size of one program header and of one section header respectively.

  10. e_phnum & e_shnum : Number of program headers and section headers. The program header table is an array of e_phnum program headers; likewise the section header table is an array of e_shnum section headers.

  11. e_shstrndx : One of the sections holds the names of all the sections. This field is the index of that section in the section header table. (see below)

3. Sections And Segments

As said above, linkers treat the file as a set of logical sections described by a section header table, and loaders treat the file as a set of segments described by a program header table. The following section gives details on sections and segments/program headers.

3.1 ELF Sections and Section Headers

The binary file is viewed as a collection of sections. Each section is an array of bytes, and no byte of the file belongs to more than one section. Even though there is helper information for interpreting the contents of a section correctly, an application may interpret them in its own way.

There is a section header table, which is an array of section headers. The zeroth entry of the table is always NULL and describes no part of the binary. Each section header has the following format: (taken from /usr/src/include/linux/elf.h)


typedef struct elf32_shdr {
  Elf32_Word sh_name;           /* Section name, index in string tbl (yes Elf32) */
  Elf32_Word sh_type;           /* Type of section (yes Elf32) */
  Elf32_Word sh_flags;          /* Miscellaneous section attributes */
  Elf32_Addr sh_addr;           /* Section virtual addr at execution */
  Elf32_Off sh_offset;          /* Section file offset */
  Elf32_Word sh_size;           /* Size of section in bytes */
  Elf32_Word sh_link;           /* Index of another section (yes Elf32) */
  Elf32_Word sh_info;           /* Additional section information (yes Elf32) */
  Elf32_Word sh_addralign;      /* Section alignment */
  Elf32_Word sh_entsize;        /* Entry size if section holds table */
} Elf32_Shdr;

Now the fields in detail.

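  1. sh_name : This contains an index into the contents of the section name string table (the section selected by e_shstrndx). The index is the start of a null terminated string to be used as the name of the section. There are many section names; a few common ones are .text (code), .data (initialized data) and .bss (uninitialized data).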

  2. sh_type : Section type such as program data, symbol table, string table etc..

  3. sh_flags : Contains information such as how to treat the contents of the section.

  4. sh_addralign : Contains the alignment requirement of section contents, typically 0/1 (both meaning no alignment) or 4.

The remaining fields are self-explanatory.

3.2 ELF Segments And Program Headers

The ELF segments are used during loading, i.e., when the image of the process is built in memory. Each segment is described by a program header. There is a program header table in the file (usually near the ELF header). The table is an array of program headers. The format of the program header is as follows.


typedef struct
{
  Elf32_Word    p_type;                 /* Segment type */
  Elf32_Off     p_offset;               /* Segment file offset */
  Elf32_Addr    p_vaddr;                /* Segment virtual address */
  Elf32_Addr    p_paddr;                /* Segment physical address */
  Elf32_Word    p_filesz;               /* Segment size in file */
  Elf32_Word    p_memsz;                /* Segment size in memory */
  Elf32_Word    p_flags;                /* Segment flags */
  Elf32_Word    p_align;                /* Segment alignment */
} Elf32_Phdr;

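  1. p_type : Gives information on how to treat the contents. It gives the type of the program header, such as PT_LOAD (loadable segment), PT_DYNAMIC (dynamic linking information), PT_INTERP (path of the program interpreter), PT_NOTE (auxiliary information) etc.

  2. p_vaddr : Virtual address at which the segment expects to be loaded.

  3. p_paddr : Physical address at which the segment expects to be loaded into memory.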

  4. p_flags : Contains protection flags - read/write/execute permissions

  5. p_align : Contains the alignment for the segment in memory. If the segment is of type loadable, then the alignment will be the expected page size.

The remaining fields are self-explanatory.

4. Loading The ELF File

We now have some idea about the structure of ELF object files. Next we have to know how and where these files are loaded for execution. Usually we just type the program name at the shell prompt. In fact a lot of interesting things happen after the return key is hit.

First the shell calls a standard libc function (one of the exec family), which in turn calls the kernel routine. Now the ball is in the kernel's court. The kernel opens the file and finds out the type/format of the executable. It then loads the ELF file and the needed libraries, initializes the program's stack, and finally passes control to the program code.

On i386 the program typically gets loaded at 0x08048000 (you can see this in /proc/pid/maps) and the stack starts from 0xBFFFFFFF (the stack grows towards numerically smaller addresses).

5. Code Injection

We have seen how programs are loaded into memory. So, given a process whose memory layout is known, we can trace it (if we have permission) and access its private data structures. It is very easy to say this, but not that easy to do it. Why not give it a try?

First of all, let's write a program that accesses the registers of another program and modifies them. The ptrace requests we need are the ones for attaching to and detaching from a process, for reading and writing its registers, and for writing into its memory.

Now we are going to inject a small piece of our own code into the image of the process being traced and force the process to execute it by changing its instruction pointer.

What we do is very simple. First we attach to the process and read its register contents. Then we insert the code we want executed at some location on the stack and change the instruction pointer of the process to that location. Finally we detach from the process. The process now resumes and executes the injected code.

We have two source files: one is the assembly code to be injected and the other is the program that traces the process. I shall also provide a small program which we may trace.

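The source files are Sample.c (the program to be traced), Tracer.c (the tracer) and Code.S (the assembly code to be injected).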

Now compile the files.


#cc Sample.c -o loop
#cc Tracer.c Code.S -o catch

Go to another console and run the sample program by typing


#./loop

Come back and execute the tracer to catch the looping process and change its output. Type


#./catch `pidof loop`

Now go to where the sample program 'loop' is running and watch what happens. Your play with ptrace has definitely begun.

6. Looking Forward

In the first part we traced a process and counted the number of instructions it executed. In this part we studied the ELF file structure and injected a small piece of code into a process. In the next part I expect to access the memory space of a process. Till then, bye from Sandeep S.


Copyright © 2002, Sandeep S. Copying license http://www.linuxgazette.net/copying.html
Published in Issue 83 of Linux Gazette, October 2002
