in the name of zero

January 31, 2006

dynamic memory allocation in linux assembly

Filed under: hermetic studies

i feel so at peace with myself recently! i can take any torture and my happiness still won’t fade away. today is january 31, and i think i already know what to say to steph fourteen days from tomorrow.

our group of three spent the better part of the day constructing our logic circuit for tomorrow’s digital designs class. wiring integrated circuits is actually quite fun! we seldom found ourselves talking about things that are totally unrelated to our homework. i, for instance, talked mostly about steph. i didn’t touch a keyboard until about 4 o’clock pm.

anyway, on to the subject matter at hand: dynamic memory allocation in linux assembly. it was just another one of my dumb ideas to write my own malloc() function. nothing really fancy in that, just a simple bare-bones allocator library, error- and bug-prone at that. lots of reading ahead!!! (i need to get a life outside my room, methinks)

i decided to approach that goal by first understanding the sys_brk system call, which is the kernel interface for adjusting the data segment boundary. i’m guessing i’ll manage to make lots and lots of memory leaks ahead! so i’ll use valgrind (don’t know if this will work) and strace too in this activity. hopefully, i can make my own simple malloc() in c after i finish this activity. wish me luck!

[ what i understand so far ]
every time a process loads, it gets an initial allocation of memory up to a certain address, and that address (the end of the data segment) is called the system break, or program break. if we wanna use more memory, we need to “map” more memory into our data space. therefore, when we say “allocate” memory, we are actually just telling the operating system to move the process’ system break forward so that additional memory can be mapped. in a simple world, deallocation means that we just move the system break back to where it was, or all the way back to where our data segment starts. but what if i wanna deallocate something specific? say, 2 bytes somewhere in the middle of the data segment? i gotta find the answer to this soon.
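to make the "move the break forward, move it back" idea concrete, here is a minimal sketch in c (not in assembly, since c is where i want to end up anyway). it assumes linux with glibc, where sbrk() and brk() are declared in unistd.h. the names bump_alloc and bump_release_to are made up for this illustration, they are not part of any library.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* ask glibc's <unistd.h> to declare brk()/sbrk() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* hypothetical helper: "allocate" n bytes by moving the break n bytes forward.
 * returns the old break (the start of the new memory), or NULL on failure. */
static void *bump_alloc(intptr_t n)
{
    void *old = sbrk(n);
    return old == (void *)-1 ? NULL : old;
}

/* hypothetical helper: "deallocate" everything above mark by moving the
 * break back to mark. returns 0 on success, -1 on failure (same as brk()). */
static int bump_release_to(void *mark)
{
    return brk(mark);
}
```

calling bump_alloc(8) and later bump_release_to() with the pointer it returned puts the break back where it started. note that this can only give back the tail end of the data segment. releasing something in the middle needs extra bookkeeping on top of the break.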

[ the code ]

the man page describes the c interface to sys_brk as int brk(void *end_data_segment). hmm, looks like we just call it in a straightforward manner. that is, just specify a new boundary. if we want to allocate, we move the system break forward. if we wanna deallocate, we move it backwards. in order to do this, first we must know our bearings. what is the current system break anyway? well, according to the manual, the raw system call (unlike the c wrapper, which returns 0 or -1) returns the new program break on success and the current one on failure. passing 0 is a request the kernel can’t honor, so it just returns the current location of the program break.
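the same query can be tried from c before touching assembly. this is only a sketch, assuming linux with glibc: syscall() and SYS_brk are the real generic syscall wrapper and syscall number, while raw_brk is a name i made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* ask glibc's <unistd.h> to declare syscall() */
#endif
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* hypothetical helper: call the raw brk system call, bypassing glibc's brk().
 * the kernel returns the break as it stands after the request:
 * the new break on success, the unchanged break on failure. */
static uintptr_t raw_brk(uintptr_t addr)
{
    return (uintptr_t)syscall(SYS_brk, (long)addr);
}
```

raw_brk(0) reports the current break, raw_brk(current + 8) should come back as current + 8, and a return value different from what was asked for means the request was refused.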

; /usr/include/asm/unistd.h
%define sys_brk 0x2d

        global _start
section .text
_start:
        mov ebx, 0
        mov eax, sys_brk
        int 0x80                ; brk(0) can't be honored, so eax = current program break

        lea ebx, [eax+0x08]
        mov eax, sys_brk
        int 0x80                ; move break 8 bytes forward
        cmp eax, ebx            ; on failure the kernel returns the old break instead
        jne error

.dealloc:
        mov ebx, origin         ; address of origin = start of .bss
        mov eax, sys_brk
        int 0x80                ; move break back to origin

        xor eax, eax
        inc eax
        xor ebx, ebx
        int 0x80                ; exit(0)

error:
        mov ebx, 1
        xor eax, eax
        inc eax
        int 0x80                ; exit(1)

section .bss
        origin  resb    1
let’s analyze the program using strace. we’ll also show line numbers so we can find lines easily.

amerei@heaven ~/workdir $ strace ./dynamic_allocation 2>&1 | cat -n
     1  execve("./dynamic_allocation", ["./dynamic_allocation"], [/* 54 vars */]) = 0
     2  brk(0)                                  = 0x804a000
     3  brk(0x804a008)                          = 0x804a008
     4  brk(0x80490d8)                          = 0x80490d8
     5  _exit(0)                                = ?
as usual, line 1 shows strace running our program with no explicit arguments. line 2 tells us that our program called sys_brk with zero as the argument, and it returned 0x804a000. we now know that this was the system break of our program when it was loaded. line 3 shows that the system break moved 8 bytes forward. this snippet is what did it:
        lea ebx, [eax+0x08]
        mov eax, sys_brk
        int 0x80
line 4 shows the program calling sys_brk with a lesser value! but what do we make of 0x80490d8? it is the address of origin, which is where the .bss section of our data segment starts. to verify it, i used objdump.
amerei@heaven ~/workdir $ objdump --disassemble-all dynamic_allocation \
> | grep -i bss
Disassembly of section .bss:
080490d8 <.bss>:
we have now reached the deallocation part of the program, coz we set the program break back to the very address where our .bss data starts.

[ notes ]

1) i tried running the assembly (without any deallocation) under valgrind, but valgrind doesn’t catch anything. this _could_ mean that valgrind’s checks are implemented at the glibc level, where malloc() and free() are already defined. i’ll do more research on this one. valgrind has no manual page.

2) i also tried calling sys_brk with the starting address of the .text segment just to see what happens. the system break didn’t move backward, nor did it change value at all.
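note 2 can be reproduced from c as well. this is a rough sketch under the same assumptions as before (linux, glibc, the real syscall() and SYS_brk). request_break and request_is_ignored are hypothetical names for this example only.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* ask glibc's <unistd.h> to declare syscall() */
#endif
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* hypothetical helper: ask the kernel for a new break and return
 * whatever the break is after the request. */
static uintptr_t request_break(uintptr_t addr)
{
    return (uintptr_t)syscall(SYS_brk, (long)addr);
}

/* hypothetical helper: returns 1 if asking for a break at addr
 * left the break exactly where it was, 0 if the break moved. */
static int request_is_ignored(uintptr_t addr)
{
    uintptr_t before = request_break(0);    /* current break */
    uintptr_t after = request_break(addr);  /* break after the request */
    return after == before;
}
```

passing the address of a function (something that lives in .text) to request_is_ignored() should give 1, which matches what i saw: the kernel refuses the request and the break stays put.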

todo: set .bss segment to readonly and try allocation/deallocation on it.

steph is cute! steph is cute! steph is cute!

4 Comments »


  1. I see that this is a pretty old article but thanks a lot i’ve been looking all over the web for this exact example and on top of that the detail you gave with strace and objdump is just perfect :)

    I’m just curious though, is the origin always equal to 1 ? Mind explaining why ?

    Thanks again, you’ve saved my day :)

    Comment by SMassey — March 2, 2009 @ 8:31 am

  2. yo! sorry for the extremely late reply! nope. origin depends entirely on how the kernel arranges the stack (size). the 1 just means a 1 byte declaration. =)

    Comment by sleepy jenkins — March 29, 2009 @ 1:45 pm

  3. You wondered what to do if you wanted to deallocate something in the middle of the data segment, since you can’t just brk back past it. I read somewhere a long time ago, and have no references to back it up, that the C library never actually shrinks the data segment. If you free() something, that memory is just marked as free for future allocations. The rationale was that if that program used x amount of memory at one point, it would probably use it again, so why bother brking back only to have to brk forward again later? Also, that means fewer calls to the kernel, so a faster program.

    Comment by JFay — April 18, 2009 @ 5:00 am

  4. yep. you’re right. =) but then im just showing behavioral concepts, not implementations above the raw function.

    Comment by sleepy jenkins — April 18, 2009 @ 4:12 pm
