Memory Detection Again

After temporary memory setup is done,
the machine specific memory setup should be done.

At the bottom of the ${linux src}/arch/i386/kernel/setup.c,
There is a statement as shown

#include "setup_arch_post.h"

This header file is in ${linux src}/include/asm-i386/mach-default
directory and has one function named machine_specific_memory_setup().

This function at fist calls sanitize_e820_map() function.

Some BIOS e820h interrupt returns overlapped memory mapping for
memory space, and it may not be sorted by address.

So, sanitize_e820_map() sorts the map by address from low address
to high address and if overlapped, it selects memory map that
has the highest type of memory.

Memory type is defined in ${linux src}/include/asm-i386/e820.h.

#define E820_RAM        1
#define E820_RESERVED   2
#define E820_ACPI       3 /* usable as RAM once ACPI tables have been read */
#define E820_NVS        4

sanitize_e820_map() does the job that is not so complex but the code
to be executed is confusing at a glance.

sanitize_e820_map()

sanitize_e820_map() is defined in ${linux src}/arch/i386/kernel/setup.c
and uses the structure named change_member.
This structure is declared in some lines before in this file.

struct change_member {
        struct e820entry *pbios; /* pointer to original bios entry */
        unsigned long long addr; /* address for this change point */
};

And some arrays of change_member to sort and tweak overlaping maps.

static struct change_member change_point_list[2*E820MAX] __initdata;
static struct change_member *change_point[2*E820MAX] __initdata;
static struct e820entry *overlap_list[E820MAX] __initdata;
static struct e820entry new_bios[E820MAX] __initdata;

Memory map is prepared by e820h interruption at boot up stage
and these data are stored at E820MAP(0x2d0).

Two change_member structures are used for each memory map.
pbios member for both structure has the same pointer to the map
that is represented by e820entry.

But "long long addr" member has start of memory region for the
first one and the end of memory region for the second one.

The first "for" loop in this function does this assignemnt.
Here is a code

        for (i=0; i < old_nr; i++)      {
                if (biosmap[i].size != 0) {
                        change_point[chgidx]->addr = biosmap[i].addr;
                        change_point[chgidx++]->pbios = &biosmap[i];
                        change_point[chgidx]->addr = biosmap[i].addr + biosmap[i].size;
                        change_point[chgidx++]->pbios = &biosmap[i];
                }
        }
        chg_nr = chgidx;        /* true number of change-points */

change_point is an array of pointers for change_member.
After this "for" loop, change_point[] array is like following.

                    +---+ -> addr is start
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is end
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is start
                    | B |
                    +---+ -> pbios for B
                    +---+ -> addr is end
                    | B |
                    +---+ -> pbios for B
                      .
                      .
                  and so on
                      .
                      .

Next "while" loop is for sorting maps by address.

        still_changing = 1;
        while (still_changing)  {
                still_changing = 0;
                for (i=1; i < chg_nr; i++)  {
                        if ((change_point[i]->addr < change_point[i-1]->addr) ||
                                ((change_point[i]->addr == change_point[i-1]->addr) &&
                                 (change_point[i]->addr == change_point[i]->pbis->addr) &&
                                 (change_point[i-1]->addr != change_point[i-1]->pbios->addr))
                           )
                        {
                                change_tmp = change_point[i];
                                change_point[i] = change_point[i-1];
                                change_point[i-1] = change_tmp;
                                still_changing=1;
                        }
                }
        }

"for" loop inside of "while" loop is simple.
When "if" conditon returns true, swap the current change_point
with the previous change_point.

"if" condition is or-ed with "||".
The first condition is obvious.

change_point[i]->addr < change_point[i-1]->addr

When the current address (start or end address of memory region) is below of
the previous address, this condition returns true.
Two chnage_point is swapped.

This swapping is for sorting maps by address.
Non-overlapped maps are of cause sorted.

And maps overlapped each other as shown below
(in depict, no other map is between them.)

                    +---+ -> addr is start
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is end
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is start
                    | a |
                    +---+ -> pbios for a
                    +---+ -> addr is end
                    | a |
                    +---+ -> pbios for a

are sorted as following.

                    +---+ -> addr is start
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is start
                    | a |
                    +---+ -> pbios for a
                    +---+ -> addr is end
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is end
                    | a |
                    +---+ -> pbios for a

"A" and "a" are the map for the same memory region.
So, both has the same start address and end address.
(But pbios is different because it points its own memory map entry.)

If the first condition of "if" statement was

change_point[i]->addr <= change_point[i-1]->addr

As two overlapping maps that has the same adderss continues to
return true for the condition,
swap operation should be done eternally(endlessly?).

But the situation

change_point[i]->addr == change_point[i-1]->addr

could be occured when maps are continuous but not in order of address.
memory looks like

X Y Z +--------------------------+------------------+ | region A | region B | +--------------------------+------------------+

but maps for these regions are not sorted by address and when sorted (with "<")

                    +---+ -> addr is X
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is Y (end   Y != pbios->addr)
                    | B |
                    +---+ -> pbios for a
                    +---+ -> addr is Y (start Y == pbios->addr)
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is Z
                    | B |
                    +---+ -> pbios for B 

Middle two change_point should be swapped for after operation.
So, it has to be taken special care for continuous maps.

(change_point[i]->addr == change_point[i-1]->addr) &&
(change_point[i]->addr == change_point[i]->pbios->addr) &&
(change_point[i-1]->addr != change_point[i-1]->pbios->addr)

The first condition means current and previous change_point has the same address.

And second condition means that the current address
is start address of current memory region.

The third means the previous address is end address of
previous memory region.

Totally, these conditions means that current and previous map are
continuous and current should be below of previous.
(Middle two maps in depict above.)

In this case, two maps must be swapped.
With these two conditions for "if" statement, sorting will be done correctlly.

If at least one swap is occured, variable "still_changing" keeps having 1,
"while" loop is continued to be executed.

When sort is complete, still_changing remains zero,
"while" loop is end.

At this point, maps for memory regions are sorted by address
from low address to high address.

And if two or more overlaped map are there, change_point[]s
that have the same start address is gathered togather
followed by change_point[]s that has the same end address of the same
memory region, as shown in depict below
(in depict, one overlap is there.)

                    +---+ -> addr is start
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is start
                    | a |
                    +---+ -> pbios for a
                    +---+ -> addr is end
                    | A |
                    +---+ -> pbios for A
                    +---+ -> addr is end
                    | a |
                    +---+ -> pbios for a

The next code is most confusing.

Upper part of the code is shown.

       overlap_entries=0;
        new_bios_entry=0;
        last_type = 0;
        last_addr = 0;

for (chgidx=0; chgidx < chg_nr; chgidx++) { if (change_point[chgidx]->addr == change_point[chgidx]->pbios->addr) { overlap_list[overlap_entries++]=change_point[chgidx]->pbios; } else { for (i=0; i<overlap_entries; i++) { if (overlap_list[i] == change_point[chgidx]->pbios) overlap_list[i] = overlap_list[overlap_entries-1]; } overlap_entries--; }

"for" loop is executed through sorted change_point[] array.

If change_point is for start address, it is set to overlap_list[] array.
If change_point is for end address, it scans overlap_list and finds the
entry that has the same pbios entry (for start address) and replaces it
with the last entry of overlap_list (,which means this entry is removed)
and index for overlap_list (overlap_entries) is decrimented.

The number of change_point for end address equals to the number
of change_point for start address.
So, overlap_entries will be zero no matter how many overlaps is occured,
when the check for the maps for the same region is over.

overlap_list[] has the number of entries that equals to the number
of overlap for the same memory region.
And each time change_point that has an end address of the same memory region,
its entry is removed.

Next "for" loop determains the type of this memory region.

                current_type = 0;
                for (i=0; i<overlap_entries; i++)
                        if (overlap_list[i]->type > current_type)
                                current_type = overlap_list[i]->type;

The higher value takes precedence.
In this loop, the type of this memory region is determained.

current_type is initialized by 0.

If code checks this memory region first time,
current_type is set to the type specified by bios.
(If there are overlap, the highest value is set.)

No matter how many overlap there are, when check of this memory region
is complete, overlap_entries is 0.
In this case "for" loop is not executed, so current_type remains 0.
current_type is used the last part of the code.
Here is the last part of this loop.

               if (current_type != last_type)  {
                        if (last_type != 0)      {
                                new_bios[new_bios_entry].size =
                                        change_point[chgidx]->addr - last_addr;

if (new_bios[new_bios_entry].size != 0) if (++new_bios_entry >= E820MAX) break; } if (current_type != 0) { new_bios[new_bios_entry].addr = change_point[chgidx]->addr; new_bios[new_bios_entry].type = current_type; last_addr=change_point[chgidx]->addr; } last_type = current_type; }

"if" statement matches when current_type does not equal to last_type.

i) When there is no overlap

If and only if the check of the memory region is done, current_type is
set to zero in "for" loop for determaining the memory type.

So, when the check for a memory region begins, last_type was set to zero
at the end of the provious check or initialization of loop.

At first, change_point that has a start address is treated.
At this time, the status of loop is

current_type != 0 and last_type == 0

Because some kind of type of memeory the region has.

So, the second "if" statement returns true.
Set new_bios[].type to current_type and new_bios[].addr to
the start address of this memory region.
Then, last address is set to the start address of this memory region.

And at last, last_type is set to current_type (!= 0).

Next, change_point that has an end address of memory region is treated.
This time, overlap_entries is down to zero so that current_type is zero.
The status is

current_type == 0 and last_type != 0

The first "if" statement returns true.
new_bios[].size is set to the size of this memory region
because change_point[]->addr is the end address of this memory region and
last_addr is set to the start address of this memory region.

As new_bios[].size != 0, the index for new_bios[] is incremented
(++new_bios_entry).

And last_type is set to current_type (= 0).
new_bios[] is setup correctly.

The check for next memory region begins in this loop.

ii) When there are several overlap.

In this case too, at the beginning last_type is zero.
And current_type is set to the type of the region
that is first entry in overlap_list[]
(in the loop for determaining the type of region).

While loop treats with change_point for the start address of this region,
pointers for each e820entry (pbios) are stored in overlap_list[] array.
(Series of entries for start address come first, sorted to be so)

In this cycle, the type of region is set to the largest type (in the
loop for determaining the type).

When the first change_point for this region,

current_type != 0 and last_type == 0

Only the second "if" statement returns true.
Set the new_bios[].type to current_type and new_bios[].addr to
the start address of this memory region.

And last_addr is also set to the start address of this memory region.

From now on, while change_point for start addresses are treated,

current_type != 0 and last_type != 0

But when first "if" statement returns true,
last_addr is the start of this memory region and change_point[].addr
is also the start of this memory region.
new_bios[].size is zero.
So, the index for new_bios[] is not tweaked.

And when the second "if" statement returns true,
current_type is kept having the largest value of type,
no chnage should be happend.

Once change_point[] enters to entry that has the end of this memory region
and the type this entry has is defferent from the largest type of this region,
status is changed.

(current_type != last_type) and (current_type != 0) and (last_type != 0)

(While current_type == last_type, nothing is happend).

At this point, new_bios[].size is calcurated by
change_point[]->address (end of region) - last_addr (start of region),
which is the size of this memory region.

And as size != 0, the index for new_bios[] is incremented for next entry.
Although the values for next new_bios[] array is set to this region's values,
last_addr is set to the end address of this memory region
and every change_point[].addr for this region is also the end address,
while this region is treated, new_biosp[].size is zero (end - end).

So, the index for new_bios[] array is not tweaked,
which means when the check for this region is done and the check for
the next memory region begin, next entry of new_bios[] array is over-written
by next region's values.

At the time when the check of this region is done,
last_type is set to zero and the index for new_bios[] was incremented
in order to point the next region (now has the values for previous region),
and current new_bios[] is setup correctlly.

This loop will be done for all regions of memory setup by
bios e820 interruption.

When loop exits, new_bios[] array has new and not overlaped memory region.
Using this array, this function creates a new array of struct e820entry.

top
inserted by FC2 system