in the name of zero

May 26, 2006

on the gdt

Filed under: hermetic studies

[ out of topic ]
tonight i sing the happiest lines! life is slowly returning to normal. free days ahead. the blackouts seemed to have stopped for good. i’m able to watch gokusen now. but i’m more concerned about the condor hero episode. yang and xiao found each other again and things are going smoothly for them apparently.

[ back to topic ]
memory segmentation in protected mode is defined by a set of descriptor tables, and each segment register holds a selector that indexes into one of these tables. the first table is called the gdt (global descriptor table) and the other, the ldt (local descriptor table). i’ll focus on the gdt today since this entry is about it after all.

[ short intro ]
the gdt contains a set of information about segments that is global. that is, it is shared by all applications instead of belonging to just one of them. it defines the base address of each segment, its access privileges, its type and how that particular segment is supposed to be used. put another way, the gdt is a list of segment descriptors, each of which provides the processor with the details of one specific segment.

[ segment descriptor ]
a gdt is composed of segment descriptors that are 64 bits (quad word) long each. a segment descriptor tells the processor the size and location of a segment, plus the access and usage information i mentioned a while ago. (this is getting pretty fuzzy and redundant, right?)

see: intel developer’s manual volume 3a - 3.4.5 segment descriptors

from figure 3-8 of intel dev’s manual we construct a segment descriptor template now.

segment descriptor format and bit fields
31							          0
[ BASE | G | D/B | L | AVL | SEGLIMIT | P | DPL | S | TYPE | BASE ]
[               BASE                  |        SEGLIMIT           ]
	
or an equivalent c structure:
	
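/* packed: no padding between members (one byte alignment) */
#include <stdint.h>

#define _PACK	__attribute__ ((packed))

struct gdt_entry
{
	uint32_t dword1;	/* low dword:  BASE 15:00 | SEGLIMIT 15:00 */
	uint32_t dword2;	/* high dword: BASE, flags, SEGLIMIT 19:16, access */
} _PACK;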

unfortunately, in c, the smallest addressable type is char (8 bits), and the layout of bit-fields is left to the compiler. so we use what we are more comfortable with and, where necessary, use shifts and other bitwise operations to set the individual bit fields inside, like the G (granularity), D/B (default operation size), L, AVL, P and so on.
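
to make the shifting concrete, here is a minimal sketch of packing a descriptor by hand. gdt_make_entry and its parameters are hypothetical names of mine (they are not from linux or the stephy kernel), and i’m assuming dword1 is the low dword, the one that sits first in memory on x86.

```c
#include <stdint.h>

/* same two-dword layout as above; dword1 is assumed to be the low dword */
struct gdt_entry {
	uint32_t dword1;	/* BASE 15:00 | SEGLIMIT 15:00 */
	uint32_t dword2;	/* BASE 31:24 | G D/B L AVL | SEGLIMIT 19:16 | P DPL S TYPE | BASE 23:16 */
} __attribute__ ((packed));

/*
 * hypothetical helper.
 * access = P, DPL, S and TYPE packed in one byte (bits 15:8 of dword2)
 * flags  = G, D/B, L and AVL packed in one nibble (bits 23:20 of dword2)
 */
static struct gdt_entry gdt_make_entry(uint32_t base, uint32_t limit,
				       uint8_t access, uint8_t flags)
{
	struct gdt_entry e;

	e.dword1 = ((base & 0xffffu) << 16) | (limit & 0xffffu);
	e.dword2 = (base & 0xff000000u)			/* BASE 31:24 */
		 | ((uint32_t)(flags & 0x0fu) << 20)	/* G, D/B, L, AVL */
		 | (limit & 0x000f0000u)		/* SEGLIMIT 19:16 */
		 | ((uint32_t)access << 8)		/* P, DPL, S, TYPE */
		 | ((base >> 16) & 0xffu);		/* BASE 23:16 */
	return e;
}
```

calling gdt_make_entry(0, 0xfffff, 0x9a, 0xc) should give dword2 = 0x00cf9a00 and dword1 = 0x0000ffff, which is a flat ring0 code segment.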

as an additional supplement, let’s also take a look at how linux arranged its global descriptor table entries. in particular, its kernel code and data segments and its user space code and data segments.

from /usr/src/linux/arch/i386/kernel/head.S (i’m using gentoo sources linux-2.6.12-gentoo-r6 btw)
i scrolled down and looked for telltale signs of gdt entries and i found them being initialized at the ENTRY(cpu_gdt_table) block. i’m showing them here for convenience.

        .quad 0x00cf9a000000ffff        /* 0x60 kernel 4GB code at 0x00000000 */
        .quad 0x00cf92000000ffff        /* 0x68 kernel 4GB data at 0x00000000 */
        .quad 0x00cffa000000ffff        /* 0x73 user 4GB code at 0x00000000 */
        .quad 0x00cff2000000ffff        /* 0x7b user 4GB data at 0x00000000 */

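notice the values 0x60, 0x68, 0x73 … they are segment selectors. bits 15:3 of a selector are the index into the table, and since each segment descriptor is quadword (4*2 bytes = 8 bytes) sized, masking off the low three bits gives the byte offset of the descriptor, which is always a multiple of eight: 0x60, 0x68, 0x70 and 0x78. the low three bits hold the table indicator (TI, bit 2) and the requested privilege level (RPL, bits 1:0), which is why the user selectors end in 3 (0x70 | 3 = 0x73 and 0x78 | 3 = 0x7b). i’ll quote robert collins of x86 dot org fame here. in his article protected mode basics he said that “any program that loads a segment register with a value that isn’t a multiple of 8 will generate a protection error”. going by the user selectors above, i read that as a rule of thumb for selectors where TI and RPL are both zero.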
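
a small sketch of pulling a selector apart, assuming the standard x86 selector layout (index in bits 15:3, TI in bit 2, RPL in bits 1:0). the struct and function names are made up for this example.

```c
#include <stdint.h>

/* hypothetical names, for illustration only */
struct selector_fields {
	unsigned index;	/* bits 15:3, which descriptor in the table */
	unsigned ti;	/* bit 2, 0 = gdt, 1 = ldt */
	unsigned rpl;	/* bits 1:0, requested privilege level */
};

static struct selector_fields selector_decode(uint16_t sel)
{
	struct selector_fields f;

	f.index = sel >> 3;
	f.ti    = (sel >> 2) & 1u;
	f.rpl   = sel & 3u;
	return f;
}

/* byte offset of the descriptor inside the table: index * 8 */
static unsigned selector_byte_offset(uint16_t sel)
{
	return sel & ~7u;
}
```

for 0x73 this gives index 14, TI 0, RPL 3 and a byte offset of 0x70; for 0x60 it gives index 12, TI 0, RPL 0 and a byte offset of 0x60.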

we continue now by analyzing the kernel 4gb code segment descriptor.

.quad 0x00cf9a000000ffff        /* 0x60 kernel 4GB code at 0x00000000 */

segregate that into its respective words and finally, segregate the words into their respective bits.

0x 00cf 9a00 0000 ffff
please look at the figure 3-8 (segment descriptor) in the intel software dev's manual.
1) 0x00cf -> 0000 0000 1100 1111
	0000 0000
		. high byte is zero filled, and this corresponds to bits 31:24 of the
		base addr.
	1100 1111
		. the granularity bit is set, so we know that we are doing 4 kbyte units here.
		. default operation size bit is set, doing 32 bit segments instead of 16.
		. L and AVL bit fields are not set.
		. segment limit is set to all 1, and this corresponds to bits 19:16 of
		segment limit.
	
2) 0x9a00 -> 1001 1010 0000 0000
	1001 1010
		. P bit is set, therefore, _this_ segment is present.
		. DPL (2bits) field is set to zero which means this segment is doing RING0
		(privileged!)
		. S field is set, therefore, this is a code or data segment.
		. type nibble is set to 1010 (10 decimal) and now we refer to Code and data
		segment types table
			see: 3.4.5.1 code- and data-segment descriptor types table 3-1.
			the nibble arrangement corresponds to 10, which means that this
			segment is a code segment with execute and read access.
	
	0000 0000
		. low byte is also zero filled, and this corresponds to bits 23:16 of
		base addr.
	
3) 0x0000 -> 0000 0000 0000 0000
		. nothing awesome here, this corresponds to bits 15:00 of base addr.
	
4) 0xffff -> 1111 1111 1111 1111
		. nothing awesome here either, bits 15:00 of segment limit is all set
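
the same walk through the bits can be sketched in c. this is only my reading of figure 3-8 turned into shifts and masks; descriptor_decode and the struct are hypothetical names, not anything from linux.

```c
#include <stdint.h>

/* hypothetical names, for illustration only */
struct descriptor_fields {
	uint32_t base;	/* BASE 31:00, reassembled from its three pieces */
	uint32_t limit;	/* SEGLIMIT 19:00, raw and not yet scaled by G */
	unsigned type, s, dpl, p, avl, l, db, g;
};

static struct descriptor_fields descriptor_decode(uint64_t d)
{
	uint32_t lo = (uint32_t)d;		/* BASE 15:00 | SEGLIMIT 15:00 */
	uint32_t hi = (uint32_t)(d >> 32);	/* everything else */
	struct descriptor_fields f;

	f.base  = (hi & 0xff000000u) | ((hi & 0xffu) << 16) | (lo >> 16);
	f.limit = (hi & 0x000f0000u) | (lo & 0xffffu);
	f.type  = (hi >> 8) & 0xfu;	/* bits 11:8 */
	f.s     = (hi >> 12) & 1u;	/* 1 = code or data segment */
	f.dpl   = (hi >> 13) & 3u;	/* ring number */
	f.p     = (hi >> 15) & 1u;	/* present */
	f.avl   = (hi >> 20) & 1u;
	f.l     = (hi >> 21) & 1u;
	f.db    = (hi >> 22) & 1u;	/* 1 = 32 bit segment */
	f.g     = (hi >> 23) & 1u;	/* 1 = limit counted in 4 kbyte units */
	return f;
}
```

feeding it 0x00cf9a000000ffff gives base 0, limit 0xfffff, type 0xa, S 1, DPL 0, P 1, AVL 0, L 0, D/B 1 and G 1, matching the walk above.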

ok, so in summary, what does that quadword tell us? it says that this particular segment :

	a) is a 32 bit segment
	b) starts at base address zero (notice that base addr bits are all zero)
	c) spans 4 gigabytes. (granularity is 4kilobytes and all bits set in segment limit)
	d) is a privileged segment running at ring0. (kernel code is privileged)
	e) we can both read and execute this segment, thus, it's a code segment.
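
the 4 gigabytes in c) is just arithmetic: the 20 bit limit counts units starting from zero, so there are limit + 1 units, and with G set each unit is 4 kbytes. a tiny sketch (segment_size_bytes is a hypothetical name of mine):

```c
#include <stdint.h>

/* size in bytes of a segment, given the raw 20 bit limit and the G bit */
static uint64_t segment_size_bytes(uint32_t limit, unsigned g)
{
	uint64_t units = (uint64_t)(limit & 0xfffffu) + 1;

	return g ? units * 4096u : units;	/* 4 kbyte units or byte units */
}
```

with limit 0xfffff and G set that is 0x100000 * 4096 = 0x100000000 bytes, or 4 gigabytes. with G clear the same limit would only cover 1 megabyte.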

i leave the other gdt quadwords as an exercise. i don’t even know if i got this right!

with the information above, i made a simple gdt for the stephy kernel. perhaps more segment descriptors will be added in the future, but for now, i’m contented. stephy kernel seems happy about it. she doesn’t triple fault and cry. cry() by the way will eventually be the stephy counterpart of the linux panic(). as of now though, it’s just a simple printf(const char *fmt, …) function. not much, but it’s the start.
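
for the curious, a minimal flat table could look something like the sketch below. this is NOT the actual stephy table (i haven’t shown that here); the layout of null, code and data entries is an assumption for illustration, and the names are hypothetical. the code and data values are the same quadwords linux uses above.

```c
#include <stdint.h>

/* hypothetical three entry table, not the real stephy kernel gdt */
static const uint64_t flat_gdt[] = {
	0x0000000000000000ull,	/* 0x00 null descriptor, the first entry is not used by the processor */
	0x00cf9a000000ffffull,	/* 0x08 ring0 4GB code at 0x00000000 */
	0x00cf92000000ffffull,	/* 0x10 ring0 4GB data at 0x00000000 */
};

/* value for the 16 bit limit field of the gdtr: table size in bytes minus one */
static uint16_t flat_gdt_limit(void)
{
	return (uint16_t)(sizeof(flat_gdt) - 1);
}
```

three 8 byte entries make 24 bytes, so the limit comes out as 23. actually loading the table with lgdt is left out, since that part only makes sense inside a kernel.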

crystal liu fanatic,
- me

2 Comments »


  1. hahahah cry() for panic() haha cute niel… kabar tiene image ta sale si steph ya yura wakakakakaka

    Comment by lynlyn — May 26, 2006 @ 10:55 am

  2. hahaha! bleh! :p keber chene pa gad picture di steph. hihihi.. though, very cute kel idea ha! hihihihihi. nukere ma iyo mira ta yura si steph. :D ya pone ya lang iyo cry() kasi it sounded like the most descriptive name para na un panic() function. considering the kernel is named after a girl :D and di ba girls love crying? hihihihihi

    psst.. lyntut. text me often!

    Comment by sleepy jenkins — May 27, 2006 @ 9:36 am
