Paging

On the Intel i386 and its successors, a virtual (linear) address is
translated to a physical address by the processor's paging unit.

A virtual address is 32 bits wide and is divided as follows.

   31                 22 21                12 11                      0
   +--------------------+--------------------+------------------------+
   |  Index for         |  Index for Page    |      Offset from       |
   |    Page Directory  |    Table Entry     |   Beginning  of Page   |
   +--------------------+--------------------+------------------------+

Each entry of the "Page Directory Table" (PDT) points to a page table,
and each "Page Table Entry" (PTE) in that page table points to a page.
Each entry of a PDT or of a page table is 4 bytes.
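
As a rough illustration of the split shown in the diagram, here is a
minimal C sketch. The struct and function names are hypothetical and
do not come from the kernel; only the bit positions are taken from
the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field widths come from the diagram above. */
struct vaddr_fields {
    uint32_t pd_index;   /* bits 31..22: index into the page directory */
    uint32_t pt_index;   /* bits 21..12: index into the page table */
    uint32_t offset;     /* bits 11..0 : offset inside the 4K page */
};

static struct vaddr_fields split_vaddr(uint32_t vaddr)
{
    struct vaddr_fields f;

    f.pd_index = vaddr >> 22;
    f.pt_index = (vaddr >> 12) & 0x3FF;
    f.offset   = vaddr & 0xFFF;
    /* Recombining the three fields must give back the address. */
    assert(((f.pd_index << 22) | (f.pt_index << 12) | f.offset) == vaddr);
    return f;
}
```

For example, split_vaddr(0xC0100123) gives directory index 768,
table index 256 and offset 0x123.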

Here is the layout of an entry.

   31                 22 21                12 11               3 2    0
   +--------------------+--------------------+------------------+-----+
   |        These 20 bits are an index       |  Attribute bits  |UR/WP|
   +--------------------+--------------------+------------------+-----+
                                                                   ^
                                                                   |
                                     User, R/W and Present (bits 2, 1, 0)
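
The entry layout can be sketched in C as follows. The macro and
function names are hypothetical; the assumption is only that the
upper 20 bits hold the page frame and that bits 0, 1 and 2 are
Present, R/W and User.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_PRESENT 0x001u  /* bit 0 */
#define ENTRY_RW      0x002u  /* bit 1 */
#define ENTRY_USER    0x004u  /* bit 2 */

/* Build an entry from a 4K-aligned physical address and attributes. */
static uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    assert((phys & 0xFFFu) == 0);    /* target must be 4K aligned */
    assert((flags & ~0xFFFu) == 0);  /* attributes fit in 12 bits */
    return phys | flags;
}

/* Physical address of the page (or page table) the entry points to. */
static uint32_t entry_frame(uint32_t entry)
{
    return entry & 0xFFFFF000u;
}

/* The twelve attribute bits of the entry. */
static uint32_t entry_flags(uint32_t entry)
{
    return entry & 0xFFFu;
}
```

With these helpers, Present + R/W + User is 0x001 | 0x002 | 0x004,
which is the value 0x007 used by the boot code.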

The PDT index is 10 bits, so a PDT has 1K entries.
The PTE index is 10 bits, so a page table has 1K entries.
The offset part of the virtual address is 12 bits, so the page size is 4K.

The total size of all pages is 4K * 1K * 1K = 4 Gbytes.

In the descriptor table the granularity bit is set, so segment limits
are counted in 4K units, the same as the page size.

Each entry of the PDT is 4 bytes and the PDT has 1K entries
(10-bit index), so the PDT size is 4K bytes.

Likewise, each PTE is 4 bytes and a page table has 1K entries,
so the size of a page table is 4K bytes.
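
The size arithmetic above can be checked with a small C sketch. The
constant and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    ENTRY_BYTES       = 4,        /* one PDT or page table entry */
    ENTRIES_PER_TABLE = 1 << 10,  /* 10-bit index */
    PAGE_BYTES        = 1 << 12   /* 12-bit offset */
};

/* Size in bytes of one PDT or one page table. */
static uint32_t table_bytes(void)
{
    uint32_t bytes = ENTRY_BYTES * ENTRIES_PER_TABLE;

    assert(bytes == PAGE_BYTES);  /* a table fills exactly one page */
    return bytes;
}

/* Memory covered by one PDT entry, i.e. by one full page table. */
static uint64_t bytes_per_pdt_entry(void)
{
    return (uint64_t)PAGE_BYTES * ENTRIES_PER_TABLE;
}

/* Memory covered by a full PDT; 4G does not fit in 32 bits. */
static uint64_t total_bytes(void)
{
    return bytes_per_pdt_entry() * ENTRIES_PER_TABLE;
}
```

One PDT entry therefore covers 4 Mbytes, and a full PDT covers
4 Gbytes.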

Here is the code that sets up the PDT and the page tables at boot.
This code is in the ${linux src}/arch/i386/kernel/head.S file.

This code is executed after the "Global Descriptor Table" has been
set up in setup.S to cover the whole 4G-byte space.

page_pde_offset = (__PAGE_OFFSET >> 20);

        movl $(pg0 - __PAGE_OFFSET), %edi
        movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
        movl $0x007, %eax               /* 0x007 = PRESENT+RW+USER */
10:
        leal 0x007(%edi),%ecx           /* Create PDE entry */
        movl %ecx,(%edx)                /* Store identity PDE entry */
        movl %ecx,page_pde_offset(%edx) /* Store kernel PDE entry */
        addl $4,%edx
        movl $1024, %ecx
11:
        stosl
        addl $0x1000,%eax
        loop 11b

pg0 is the beginning of the page tables (the first PTE).
swapper_pg_dir is the "Page Directory Table" whose entries point to
the page tables.

pg0 and swapper_pg_dir are both kernel virtual addresses, so
0xC0000000 (__PAGE_OFFSET) must be subtracted from them to
get the physical addresses.
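
The subtraction can be sketched in C as below. The function names
are chosen for this example, and the 1 Gbyte bound in the second
function is an assumption that follows from a 0xC0000000 offset in
a 32-bit address space.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u   /* value of __PAGE_OFFSET in this note */

/* Kernel virtual address -> physical address. */
static uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET);      /* must be a kernel address */
    return vaddr - PAGE_OFFSET;
}

/* Physical address -> kernel virtual address. */
static uint32_t phys_to_virt_sketch(uint32_t paddr)
{
    assert(paddr < 0x40000000u);       /* only 1G fits above PAGE_OFFSET */
    return paddr + PAGE_OFFSET;
}
```

For instance, a symbol linked at 0xC0100000 lives at physical
address 0x00100000.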

At first, the address of pg0 is stored into %edi (which is incremented
by the stosl instruction) and the address of swapper_pg_dir into %edx.

The lowest three bits of an entry are the attributes depicted above,
and 0x007 is Present + R/W + User.
The index part (the upper 20 bits of the entry) is zeroed, which means
the index is 0 and the first PTE maps physical address 0x00000000,
because the address is calculated as 0x00000 (index part of PTE) << 12 (4K).

Setting 0x007 into %eax therefore sets the index and attributes of the
first page table entry. %eax holds the value to be stored as the
next PTE.

Label 10:

The "leal" instruction is "Load Effective Address Long" (long means 32-bit).

leal m, register
loads the address of the memory operand "m" (not its contents) into
the register.

        leal 0x007(%edi), %ecx

%edi holds the address of pg0 at the beginning of the loop.
This "leal" puts the address %edi + 0x007 into register %ecx;
no memory is read.

The lowest three bits of an entry are attributes, and pg0 is page
aligned, so adding 0x007 to %edi sets the attributes of the entry
to "Present + R/W + User".

        movl %ecx,(%edx)
        movl %ecx,page_pde_offset(%edx)
        addl $4,%edx

Then this value (%ecx) is stored into the memory pointed to by %edx.
At first %edx points to swapper_pg_dir, the first entry of the
"Page Directory Table".

The first execution of this code makes the first PDT entry point to
the first page table (pg0), whose first entry maps the first 4K page
(0x00000000).

But paging is about to be switched on while this very code is running.
Before and after the switch, instruction fetch must stay consistent.
So a physical address and the same virtual address must point to the
same place in physical memory (an identity mapping).

%edx holds a physical address, and the kernel's virtual address is
"physical address" + __PAGE_OFFSET (0xC0000000).

So we tweak the page directory table: the same entry is also stored
for the kernel's virtual address. One PDT entry covers 4 Mbytes of
space (4K of page space x 1K entries of PTE).
This space is somewhere in physical memory.

Now page_pde_offset is 0xC00.

page_pde_offset(%edx) means the address is

%edx + 0xC00 = %edx + (0xC00 / 4) * 4 (4 = size of one entry)

0xC00 / 4 = 768. This is the PDT index for 0xC0000000,
which is the start address of kernel code in virtual memory.
(0xC0000000 = 768 << 22, because one PDT entry covers 4 Mbytes)
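
The offset arithmetic can be double-checked with a short C sketch.
The function names are hypothetical. The shift by 20 is the one used
for page_pde_offset, and the alignment assert states the assumption
under which it equals "index * 4".

```c
#include <assert.h>
#include <stdint.h>

/* Index of the PDT entry that covers a virtual address. */
static uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Byte offset of that entry from the start of the PDT. */
static uint32_t pde_byte_offset(uint32_t vaddr)
{
    /* For a 4M-aligned address, (vaddr >> 20) is the same as
     * (vaddr >> 22) * 4, which is why the shift by 20 works. */
    assert((vaddr & 0x3FFFFFu) == 0);
    return vaddr >> 20;
}
```

For 0xC0000000 this gives index 768 and byte offset 0xC00.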

Adding 4 to %edx advances the PDE address so that the next page
directory entry is stored in the next slot.

Label 11:

The "stosl" instruction stores %eax into the memory pointed to by
%es:%edi and increments %edi by 4.
%es is a segment register holding the selector that determines which
segment is used.

Next, 0x1000 is added to %eax, which means the attributes are left
untouched and the index part of the PTE is incremented by 1.
Then "loop" goes back to label 11.

This loop runs with %ecx initialized to 1024 (1K).
So each pass fills the 1K entries of the page table pointed to by
the PDT entry that was just stored through %edx.

        leal (INIT_MAP_BEYOND_END+0x007)(%edi),%ebp
        cmpl %ebp,%eax
        jb 10b
        movl %edi,(init_pg_tables_end - __PAGE_OFFSET)

For memory setup at boot, it is necessary to have page tables that
map memory up to the end of the page tables themselves plus the
bitmap space for init_bootmem().

The bitmap size is determined so as to cover the whole 4 Gbytes space.
The page size is 4K (4096) and one bit is required per page, so
the size is 2^32/4096/8 = 128K.
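
The bitmap size calculation can be written out in C as a small
sketch. The function name is made up for this example and is not a
kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a bitmap with one bit per page. */
static uint32_t bitmap_bytes(uint64_t address_space, uint32_t page_size)
{
    uint64_t pages;

    assert(page_size != 0 && address_space % page_size == 0);
    pages = address_space / page_size;   /* 2^32 / 4096 = 2^20 pages */
    return (uint32_t)((pages + 7) / 8);  /* 8 pages per byte */
}
```

bitmap_bytes(1ULL << 32, 4096) is 131072, which is 128K.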

INIT_MAP_BEYOND_END is defined in head.S and its value is 128K.
The leal loads into %ebp the address of the PTE slot currently being
handled (%edi) plus INIT_MAP_BEYOND_END plus the attributes (0x007).

This is compared with %eax, which holds the value (index and
attributes) of the next PTE to be written.

If %eax is below %edi + INIT_MAP_BEYOND_END (+ 0x007: attributes),
go back to label 10, set the PDT entry for the next page table and
loop again to fill the entries of that page table.

When the loop finishes, the "temporary" initialization of address
translation is done.
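
To see the whole loop at once, here is a C sketch that replays it on
a simulated physical memory. It assumes the 128K value of
INIT_MAP_BEYOND_END given above. The function names and the word
array standing in for physical memory are inventions for this
example, and the variables are named after the registers they
imitate.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET          0xC0000000u
#define INIT_MAP_BEYOND_END  (128u * 1024u)
#define PDE_OFFSET           (PAGE_OFFSET >> 20)  /* 0xC00 = entry 768 */

/* Store a 32-bit value at a simulated physical address. */
static void store32(uint32_t *mem, uint32_t phys, uint32_t value)
{
    mem[phys / 4] = value;
}

/* Returns the final %edi, the physical end of the page tables. */
static uint32_t build_boot_tables(uint32_t *mem, uint32_t pgdir_phys,
                                  uint32_t pg0_phys)
{
    uint32_t edi = pg0_phys;    /* next PTE slot */
    uint32_t edx = pgdir_phys;  /* next identity PDE slot */
    uint32_t eax = 0x007;       /* next PTE value: page 0 + attributes */
    int i;

    assert((pgdir_phys & 0xFFFu) == 0 && (pg0_phys & 0xFFFu) == 0);
    do {                                      /* label 10 */
        uint32_t ecx = edi + 0x007;           /* leal 0x007(%edi),%ecx */
        store32(mem, edx, ecx);               /* identity PDE */
        store32(mem, edx + PDE_OFFSET, ecx);  /* kernel PDE */
        edx += 4;
        for (i = 0; i < 1024; i++) {          /* label 11 */
            store32(mem, edi, eax);           /* stosl */
            edi += 4;
            eax += 0x1000;
        }
    } while (eax < edi + INIT_MAP_BEYOND_END + 0x007);  /* jb 10b */
    return edi;
}
```

With a page directory at the made-up physical address 0x1000 and
pg0 at 0x2000, a single pass is enough: one page table maps 4 Mbytes,
which already reaches beyond the end of the page tables plus 128K.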

Paging Unit


Now that the Page Directory Table and the page tables are prepared,
enable the Paging Unit by executing the following code.

        movl $swapper_pg_dir-__PAGE_OFFSET,%eax
        movl %eax,%cr3          /* set the page table pointer.. */
        movl %cr0,%eax
        orl $0x80000000,%eax
        movl %eax,%cr0          /* ..and set paging (PG) bit */
        ljmp $__BOOT_CS,$1f     /* Clear prefetch and normalize %eip */
1:

On Intel processors, cr3 (Control Register 3) holds the physical
address of the Page Directory Table.

The physical address of swapper_pg_dir (the PDT) is stored into %eax
and then loaded into %cr3.

Once the uppermost bit of cr0 (Control Register 0) is set,
the processor uses the Paging Unit to translate linear addresses
to physical addresses.
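
The bit in question is bit 31 of cr0 (PG). As a small C sketch, with
hypothetical macro and function names, the new cr0 value is computed
like this. The assert reflects that paging can only be enabled while
protected mode (PE, bit 0) is on.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE 0x00000001u  /* bit 0: protected mode enable */
#define CR0_PG 0x80000000u  /* bit 31: paging enable */

/* Value to write back to cr0 in order to turn paging on. */
static uint32_t cr0_with_paging(uint32_t cr0)
{
    assert(cr0 & CR0_PE);   /* paging requires protected mode */
    return cr0 | CR0_PG;    /* all other bits stay as they were */
}
```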

So the content of %cr0 is temporarily stored into %eax,
or-ed with 0x80000000, and then put back into %cr0.

The "ljmp" instruction flushes the prefetch queue and makes the
processor fetch the following instructions through the code segment
and the page tables prepared so far.
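
Once paging is on, every linear address goes through the two-level
lookup described in this note. The following C sketch imitates that
lookup in software on a simulated physical memory. The function name
and the word array are assumptions for illustration. The real lookup
is done by the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Translate a linear address using the tables rooted at cr3.
 * mem is simulated physical memory viewed as 32-bit words.
 * Returns 0 and fills *phys on success, -1 if an entry is not present. */
static int translate(const uint32_t *mem, uint32_t cr3,
                     uint32_t linear, uint32_t *phys)
{
    uint32_t pde, pte;

    assert((cr3 & 0xFFFu) == 0);          /* PDT is page aligned */
    pde = mem[(cr3 + (linear >> 22) * 4) / 4];
    if (!(pde & 0x001u))                  /* Present bit clear */
        return -1;
    pte = mem[((pde & 0xFFFFF000u) + ((linear >> 12) & 0x3FFu) * 4) / 4];
    if (!(pte & 0x001u))
        return -1;
    *phys = (pte & 0xFFFFF000u) | (linear & 0xFFFu);
    return 0;
}
```

With the boot tables described above, both 0x00100123 and 0xC0100123
would translate to the physical address 0x00100123, because PDT
entries 0 and 768 point to the same page table.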