Attn: Alex Nichol - VM cont.

Guest

Hi Alex. You sent me off to Intel's site concerning memory management and
your recommendation that a 4k cluster size is optimal for the pagefile.

It took me a couple of days on Intel's site to find what I was looking for.

Intel architecture involves three memory management models: the FLAT
MODEL, the SEGMENTED MODEL and the REAL ADDRESS MODE MODEL.
See ftp://download.intel.com/design/Pentium4/manuals/25366514.pdf
Section 3.3
Look at chapter three, page 55 where it discusses segmented memory.
"The segment selector identifies the segment to be accessed and the offset
identifies a byte in the address space of the segment. The programs running
on an IA-32 processor can address up to 16,383 segments of different sizes and
types, and each segment can be as large as 2 to the 32nd power"

See here: http://www.intel.com/design/pentiumii/manuals/24319202.pdf

Section 3.6 Paging (Virtual Memory)
Second paragraph.
"When paging is used, the processor divides the linear address space into
fixed-size pages (generally 4 Kbytes in length) that are mapped into physical
memory and/or disk storage."

Section 3.6.1 Paging Options
Paging is controlled by three flags in the processor's control registers:
PG (paging flag)
PSE (page size extensions) flag
PAE (physical address extension)
Skipping down to the PSE description:
"The PSE flag enables large page sizes: 4 Mbyte [that's MEGAbytes] pages or
2 Mbyte pages (when the PAE flag is set)."

If you look at table 3-3 you will note 4KB, 2MB and 4MB paging sizes.

This refutes the idea that paging is ONLY done in 4K increments. It would
appear to me that the SEGMENTED memory model is used more often in XP memory
operations, not the LINEAR MODEL.

The point I am making is that when we discuss memory operations we talk about
4kb pages [4096 bytes]. This is in its simplest form for discussion. This does
NOT relate to how data is written to pagefile.sys. It appears to me that an
assumption was made that since paging in RAM is done in 4k pages, any page
swapped out to disk is done in 4k increments. I believe the above links
show that paging is NOT done exclusively in 4kb pages but also includes 2MB
and 4MB pages.

Now let's look at this from a disk write perspective.

You say 4kb from disk to RAM is optimal "so that transfers may be
made direct from the file to RAM without any need for intermediate
buffering". I can agree with that reasoning, but reading from disk is not
done 4k at a time, so you would achieve the same results if the cluster size
was 32k. What you neglect is the operation going from RAM to disk.

We know that memory operations in RAM are done in nanoseconds [billionths]
and drives are measured in milliseconds [thousandths]. From this we know
that when a write from memory to disk is going to take place, it is not
one 4kb page that is going to be written. There is going to be a STACK of
them written to the pagefile, referred to as a multiwrite request. We now
know that the "pages" can be 4kb, 2MB and 4MB. A 2MB page would occupy 512
4k clusters. That is 512 WRITES with 4k clusters. With 64K clusters that
would be done in 32 writes. It is going to take longer to write 512 times
than 32 times with the same read/write heads. 4kb clusters don't add up as
optimal for pagefile operations. 4kb clusters DO NOT waste disk space BUT
you sacrifice SPEED. Speed is what you want to optimize for paging to disk.
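
To put rough numbers on that arithmetic, here is an illustrative sketch
(Python), assuming purely for the sake of argument that each cluster needs
its own write; real controllers and the OS will coalesce adjacent writes,
so treat this as an upper bound:

# Illustrative arithmetic only; assumes one cluster equals one write.
def writes_needed(page_bytes, cluster_bytes):
    """Cluster-sized writes needed to put one page on disk."""
    return -(-page_bytes // cluster_bytes)   # ceiling division

KB = 1024
MB = 1024 * KB

for cluster in (4 * KB, 64 * KB):
    print("2MB page, %dK clusters: %d writes"
          % (cluster // KB, writes_needed(2 * MB, cluster)))
# Prints: 2MB page, 4K clusters: 512 writes
#         2MB page, 64K clusters: 32 writes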

Going from physical RAM to disk EVERY WRITE has to be cached!!! The write
has to wait for the disk. It's that billionth to a thousandth time lag.
This would make pagefile write operations the slowest part of the entire
memory operation. You do not increase write speed with smaller cluster sizes
but with LARGER ones. Ideally if you knew the AVERAGE write size you would
adjust your cluster size accordingly.

The information I could not find on Intel's or Microsoft's web sites is how
information is written to pagefile.sys: what incremental sizes are used to
write to the pagefile? In other words, is a stack of 4k memory pages [using
your model] written as one contiguous write? When pages are retrieved and deleted
4k at a time, how is this space recovered to provide the largest contiguous block
available to be written? After all, if 128kb of 4kb pages is written to
pagefile.sys and only 32kb is read back and deleted, holes in that 128kb
contiguous write block will appear.

By chance do you have any links that may answer these questions?
Another question would be are you going to continue to recommend 4kb
allocation units as optimal for paging on your web site?
 
Alex Nichol

Joshua said:
Section 3.6 Paging (Virtual Memory)
Second paragraph.
"When paging is used, the processor divides the linear address space into
fixed-size pages (generally 4 Kbytes in length) that are mapped into physical
memory and/or disk storage."

Which is relevant to Windows
Section 3.6.1 Paging Options
Paging is controlled by three flags in the processor's control registers:
PG (paging flag)
PSE (page size extensions) flag
PAE (physical address extension)
Skipping down to the PSE description:
"The PSE flag enables large page sizes: 4 Mbyte [that's MEGAbytes] pages or
2 Mbyte pages (when the PAE flag is set)."

Which is not. The provision is for specialist systems working in a
different way, not by 'demand paging'.
 
Guest

Sorry Alex but this is not what Microsoft is saying about XP.
See here
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwxp/html/idealdevplat.asp
Please review the chapter "Windows XP: Kernel improvements".

Note the "larger minimum memory size for large pages" chapter.
"By mapping this core operating system code and data with 4MB pages, the
first reference to any byte within each 4MB region results in the x86 memory
management unit caching the address translation information to find any other
byte within the 4MB region without having to resort to looking in page table
data structures. This speeds address translation.
The Windows XP change was needed because of the continual increase in
typical memory configurations of PC systems."
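
To illustrate what that buys (rough arithmetic only, not anything taken from
the article): one cached translation per 4MB large page instead of one per
4KB page. The region size below is hypothetical.

# Illustrative arithmetic: translation entries needed to cover a region
# of core OS code/data with 4KB pages versus 4MB large pages.
KB = 1024
MB = 1024 * KB

def entries_to_cover(region_bytes, page_bytes):
    return -(-region_bytes // page_bytes)    # ceiling division

region = 16 * MB                             # hypothetical core-OS region
print(entries_to_cover(region, 4 * KB))      # 4096 entries with 4KB pages
print(entries_to_cover(region, 4 * MB))      # 4 entries with 4MB large pages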

XP has to support the programming written for it. If MS talks about 4meg
paging, why do you think XP doesn't support it and will only use the
flat/linear memory addressing model?

You did not address the writing-to-pagefile issue I raised. How do you
justify the 4kb cluster size as being optimal when considering that paging
to disk doesn't happen only 4k at a time?


Alex Nichol said:
Section 3.6.1 Paging Options
Paging is controlled by three flags in the processor's control registers:
PG (paging flag)
PSE (page size extensions) flag
PAE (physical address extension)
Skipping down to the PSE description:
"The PSE flag enables large page sizes: 4 Mbyte [that's MEGAbytes] pages or
2 Mbyte pages (when the PAE flag is set)."

Which is not. The provision is for specialist systems working in a
different way, not by 'demand paging'.
 
Alex Nichol

Joshua said:
Note the "larger minimum memory size for large pages" chapter.
"By mapping this core operating system code and data with 4MB pages, the
first reference to any byte within each 4MB region results in the x86 memory
management unit caching the address translation information to find any other
byte within the 4MB region without having to resort to looking in page table
data structures. This speeds address translation.
The Windows XP change was needed because of the continual increase in
typical memory configurations of PC systems."

XP has to support the programming written for it. If MS talks about 4meg
paging why do you think XP doesn't support it and will only use the
flat/linear memory addressing model?

They are talking about central kernel code in the non-paged area.

You seem determined that the paging system should not work in the way it
does. I have had enough.
 
Guest

Didn't mean to put you off, Alex.

It isn't how paging occurs that is in question. It is your conclusion,
which is now a recommendation on your web site, that is questioned. You went
from 4k paging in memory to the apparently unsupported conclusion that 4k
disk cluster sizes are optimal for the pagefile.

I asked for the documentation/reasoning behind this recommendation. What I
got was "go review Intel's web site". I did so and found that Intel
programming is not based on a single memory model [4k paging - flat or linear
memory model]. I even found a Microsoft article saying MS code is written to
support large memory pages of 4meg.

These articles fly in the face of "XP only does 4k paging" statements.

The second part of the issue, which you never have addressed, is how the
pagefile is written to. We know for a fact that a 4kb memory page is not written
to the hard drive directly. We know this for two reasons: 1. going from
billionths of a sec. RAM to thousandths of a sec. hard disk, the memory-to-disk
writes have to be cached, and 2. the disk's controller caches writes to
optimize head positioning performance.

I see two holes in the "4K clusters are optimal for the pagefile"
recommendation. One is the 4kb, 2meg and 4meg memory pages; the other is disk writes.

I have searched Intel's and Microsoft's web sites. I have visited two
bookstores, in addition to the books I have, for the answers to some simple
questions.
How is data/code written to the pagefile??? What is the mechanism for
maintaining the internal pagefile structure? This appears to be an area not
addressed.

Without knowing this information, your 4kb-page-to-4kb-cluster recommendation
is based on a false conclusion. I asked you what documentation you based
this recommendation on. You said go learn about paging. I have since added
to my knowledge of paging [I didn't know about the 2/4meg pages] but I still
don't have a single shred of evidence supporting your web page's
recommendation for pagefile cluster size.

Hey if you don't want to discuss paging fine. How about this one issue?
What do you base your recommendation on?
 
cquirke (MVP Win9x)

On Thu, 23 Dec 2004 14:01:03 -0800, "Joshua Bolton"
...4k paging in memory to the apparently unsupported conclusion that 4k
disk cluster sizes are optimal for the pagefile.
I asked for the documentation/reasoning behind this recommendation. What I
got was go review Intel's web site. I did so and found that Intel
programming is not based on a single memory model [4k paging -flat or linear
memory model]. I even found a Microsoft article saying MS is written to
support large memory pages of 4meg.

Have you got a URL for that?

AFAIK the original 386 design hard-coded virtual memory paging in 4k
blocks, which fits 4k clusters quite nicely. Whether this is still
hard-coded today, depends on both Intel and MS. This virtual paging
isn't related to segment size, BTW.
The second part of the issue, which you never have addressed, is how the
pagefile is written to. We know for a fact that a 4kb memory page is not written
to the hard drive directly. We know this for two reasons: 1. going from
billionths of a sec. RAM to thousandths of a sec. hard disk, the memory-to-disk
writes have to be cached, and 2. the disk's controller caches writes to
optimize head positioning performance.

The disk's controller, in the case of xIDE, is on the HD and thus
transparent to what we are dealing with (transferring from RAM to HD).
How is data/code written to the pagefile??? What is the mechanism for
maintaining the internal pagefile structure? This appears to be an area not
addressed.

Not sure if it's documented by MS. I do know that the pagefile isn't
handled the same way as other files; it's optimized for speed.


---------- ----- ---- --- -- - - - -
Proverbs Unscrolled #37
"Build it and they will come and break it"
 
Guest

Hi Cquirke:
Have you got a URL for that?
Yes. There are two of them and they are in the first post of this thread.
"AFAIK the original 386 design hard-coded virtual memory paging in 4k
blocks, which fits 4k clusters quite nicely." and "This virtual paging
isn't related to segment size, BTW".
Do you have a URL for this? Anything from MS saying 4kb clusters are
optimal for pagefile.sys? A source for the reason behind this?

My first point is that from the Intel and Microsoft URLs I find both
manufacturers' support for large memory pages [2 or 4meg pages]. Nowhere can
I find documentation that says a 4kb page in memory is written to the hard
drive, i.e. pagefile.sys, in 4kb increments. A simple documented statement from
Intel or MS would say something like "pagefile.sys is written in such a
manner that a 4kb memory page is equal to a 4kb cluster written on the disk".
In all of your study have you ever read a statement like that? I haven't.
So where is the basis for saying that since paging in memory is done in 4kb
pages, 4kb clusters are optimal for pagefile.sys?

My arguments to the contrary start with how, going from billionths of a
second in memory to thousandths of a second access time on a disk, the pagefile
writes [memory to disk] are going to stack up while waiting for the disk to
become available to be written to.

"thus transparent to what we are dealing with (transferring from RAM to HD)."

The time spent transferring data/code from memory to disk IS relevant when
you consider that cluster size can make a huge difference to any file.
Pagefile.sys is a file.

For example, if you do video editing you do NOT want a 4kb cluster size.
This is due to the fact you are working with relatively large files, 1-2 gigs
and larger. Math shows you the reason. A 2gig file on 4kb clusters takes up
500 clusters. A 2gig file on 512kb clusters takes up 4 clusters. If
contiguous, and the drives being equal, the pickup rate should be equal. But
rarely is a file contiguous, and this is where the true "tire meets the road"
consequences of fragmentation come in. 4 reads vs 500 reads makes the larger
cluster size a faster retrieval option for large files.

If you have a gig pagefile, would it be optimal to have larger cluster sizes?

I am left with the question of how pagefile.sys is written to. How is it
internally organized?

Let's say the OS wants to "page out" to disk 128kb of code and data from
memory [this can be a cumulative of 32 4kb pages]. It sends it somewhere [a
buffer] to await the disk write, or marks it as available to be paged out. In
the meantime of waiting for the disk to become available, another 256kb of
working set memory needs to be paged to disk due to a different program/OS
request. These two disk writes are sent to the disk controller. The disk
controller computes and sets their order in the disk cache to optimize the
disk head movement and its subsequent writes. 384kb of data/code is written
to the pagefile.

Let's assume [since we can't seem to find any documentation to support this
aspect] that we are dealing with only 4kb paging from memory to 4kb clusters.
You physically can't go from a 4kb memory page to a 4kb cluster any faster
than the SLOWEST link, which in this case is the hard drive. It would slow
head movement if the disk controller waited for each 4kb page to do a disk
write, so it's going to stack them up, just like a capacitor does before it
releases its energy. Data is written as a collection of 4kb pages [assuming
paging to disk is ONLY done in 4kb segments].

So how is this data/code organized in the pagefile?

Let's look at the reverse of this process of writing from memory to disk.
We wrote 384kb of data to pagefile.sys. The memory manager determines it
wants 64kb of that paged-out memory back in its working set. The MM
makes the disk request and waits for the data to be returned. After all, the
MM is in the billionths time slice and the drive controller is only living
in the thousandths time slice world. The data is read back into memory.

Where does the cluster size of 4kb help in the disk read operation? The disk
head could have read two 32kb blocks of data just as fast as sixteen 4kb
blocks if they are contiguous. Now there is a 64kb hole in the 384kb block
of data. How is this space optimized for the next block of data to be
written?

So where is the reasoning that since paging in RAM is done in 4kb pages,
4kb cluster sizes are optimal for paging to disk?

Everything I can find concerning paging and disk operations points to larger
cluster sizes optimizing paging, not smaller ones. Especially if you were
using larger pagefiles.

The last point is that both Intel and MS documentation talk about large
memory pages. The Intel URLs talk about how programmers can use switches in
their programming to utilize different memory models. In the Microsoft URL
concerning XP, in the section titled "Windows XP: Kernel Improvements", MS
talks about XP's use of "large memory pages of 4meg". According to this
article, if you have more than 255megs of RAM you are automatically going to
use large memory pages.

quote from article
"on such systems, these files (and other core operating system data such as
initial nonpaged pool and the data structures that describe the state of each
physical memory page) are mapped with 4MB "large pages" (as opposed to the
normal 4KB page size on the x86 processor set). By mapping this core
operating system code and data with 4MB pages"

and
"The Windows XP change was needed because of the continual increase in
typical memory configurations of PC systems. The minimum memory to use large
pages is now more than 255MB, instead of more than 127MB"

The article does not talk about paging to disk at all.

If the Intel processor supports the three memory management models, and MS
programming supports large memory pages, then how can it be assumed ALL
memory manager operations are done in 4kb pages? Wouldn't the OS's memory
manager strategy change depending on the application being run? How do we go
from 4kb pages in RAM being equal to 4kb clusters on disk? It's like saying
4lbs of apples are equal to 4lbs of oranges. They are equal in weight but not
in content.

If you come across a link addressing these questions it sure would be great
to have. Thanks

Thank you everyone for participating in this discussion. Happy New Year to
all!
 
cquirke (MVP Win9x)

On Mon, 27 Dec 2004 11:29:04 -0800, "Joshua Bolton"
Hi Cquirke:
Hi!

Yes. There are two of them and they are in the first post of this thread

Sorry, I only have the last 2 posts. XP General cracks up to 1000 a
day, and I don't even parse all the headers, so the original posts in
this thread are beyond my newsreader's recall.
Do you have a URL for this? Anything from MS saying 4kb clusters are
optimal for pagefile.sys? A source for the reason behind this?

Only old paper, I'm afraid: "Programming the Intel 80386" by Bud E
Smith and Mark T Johnson from 1987, page 285:

"An 80386 page is a 4 Kb-sized piece of memory (some other computers
use other page sizes). A page may start at any point in memory, but
for convenience's sake pages are usually placed at addresses 4k apart
(page frames). The addresses of page frames are 0, 4 Kb, 8 Kb, 16 Kb
etc. Any data item that starts at one of these addresses is said to be
"aligned on a page boundary". Addressing a page frame in the 4 Gb
linear address space is made easier because only a 20-bit address is
needed; the last 12 of the total 32 bits in the address are all zeros."

When they speak of "other computers", they are not talking 486 etc. as
at this time, the 386 was still dripping amniotic fluid. :)
My first point is that from the Intel and Microsoft urls I find both
manufacturers support for large memory pages [2 or 4meg pages].

If you mean 4k, 2M or 4M pages, then I doubt whether anything is going
to use the large page sizes for paging as we know it. If you mean
"pick a number between 4k and 2M" then I'd want to know at what place
in the 386+ series this came in; Intel may have added something that
MS hasn't used. After all, for us consumers, it took MS until 1995 to
harness the full benefits of the 386 design :)
Nowhere can I find documentation that says a 4kb page in memory
is written to the hard drive, i.e. pagefile.sys, in 4kb increments.

Alas, this book gives a surfeit of such detail!

"A linear address has three parts: a 10-bit DIR entry, a 10-bit PAGE
entry, and a 12-bit OFFSET (see Figure 5-3). These three fields are
used to generate a physical address in the following way:

1. The 32-bit CR3 register holds the address of the current Page
Directory. The low 12 bits of CR3 are always 0 because the Page
Directory always starts on a page boundary. The 10-bit DIR value from
the Linear Address points to the needed Page Directory Entry.

2. The high 20 bits of the Page Directory Entry point to a Page
Table; 12 low-order zeros are tacked on to make a full physical
address for the page-aligned table. The 10-bit PAGE value from the
middle of the Linear Address points to the needed Page Table Entry.

3. The high 20 bits of the Page Table Entry point to a page frame; 12
low-order zeroes are tacked on to make a full linear address for the
page frame. The 12-bit OFFSET value from the low-order bits of the
Linear Address points to the needed location in memory.

Each page directory has up to 1024 entries, allowing that many page
tables. Each page table can point to as many as 1024 pages, each 4 Kb
long. This means that a single directory can address 1024 * 1024 *
4096 bytes, or 4 Gb (the entire physical address space of the 80386).

The Page Directory Entries (PDEs) and Page Table Entries (PTEs) are
almost identical. The page table can be thought of as a "page
descriptor table" and the Page Table Entry as a "page descriptor", if
this helps keep them straight. Figure 5-3 shows an entry like that
found in either table. A bit-by-bit breakdown is listed below.

1. PAGING ADDRESS (Bits 31...12). For a PDE this address points to a
Page Table; for a PTE it points to a page. The lower 12 bits of the
address are always 0's, since both Page Tables and pages are aligned
on 4096-byte boundaries.

Bit:    31 .......... 12  11...9  8  7  6  5  4  3  2    1    0
Field:  Paging Address    OS Res  0  0  D  A  0  0  U/S  R/W  P

Figure 5-3. Page Directory/Page Table Entry

2. OS RESERVED (Bits 11...9). These bits are available for operating
system use. A typical function would be keeping statistics for
virtual memory and page swapping, like the number of times a page has
been accessed in a given period.

3. D (Bit 6). This is the Dirty bit for page table entries, which is
set automatically when the page is written to.

4. A (Bit 5). This is the Accessed bit for page table and page
directory entries, which is set automatically when the page is read or
written to.

5. U/S (Bit 2). This is the User/Supervisor bit. If it's set then
programs with protection level 3 (lowest level) are allowed access.

6. R/W (Bit 1). If U/S is 1 (user access is allowed) then a zero in
this bit means only read accesses are allowed; a 1 means write
accesses are also allowed. {some detail snipped}

7. P (Bit 0). This is the Present bit that indicates whether the PDE
or PTE points to a page that is presently in memory. If this bit is 1
the bit fields are defined as above. If it's 0 the needed page is out
on disk and the remaining 31 bits can be the location of the page slot
out on disk that has the needed page."

The term "page slot" is defined earlier as: "The 4 Kb sections on disk
that hold pages are called page slots".
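
To make the book's description concrete, here is a small sketch (Python;
plain 4KB paging assumed, no PSE/PAE, and the example address is made up)
of how a 32-bit linear address splits into the DIR, PAGE and OFFSET fields
described above:

# Sketch of the 386 linear-address split quoted above.
def split_linear_address(linear):
    """Return (dir_index, page_index, offset) for a 32-bit linear address."""
    offset   = linear & 0xFFF           # low 12 bits: byte within the 4KB page
    page_idx = (linear >> 12) & 0x3FF   # next 10 bits: Page Table Entry index
    dir_idx  = (linear >> 22) & 0x3FF   # top 10 bits: Page Directory Entry index
    return dir_idx, page_idx, offset

d, p, off = split_linear_address(0x12345678)
print(hex(d), hex(p), hex(off))         # 0x48 0x345 0x678
# 1024 directory entries x 1024 table entries x 4096 bytes = 4Gb, as the book says.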

What I take home, after the above eye-glazing detail, is that 4k page
size is something fairly deeply built into the processor's hardware,
at least in the 386 generation, and not something whimsical that's
easily changed. I stopped following this level of processor detail
after the 486; as I recall, there wasn't anything in the 486 that
changed this, though there were a couple of extra general-purpose
registers and a sprinkling of new instructions.

Win3.0 was written for 8088/8086 (Real Mode), 286 (Standard Mode) and
386 (386 Enhanced Mode). Real Mode was dropped in Win3.1, and
Standard Mode was dropped in WfW3.11 (it lives on as the
"mini-Windows" that runs as a Win95 installation GUI). NT peeled off
before Win95, and Win95 was written for 386.

I doubt whether any of the Win9x series (up to and including WinME)
changed the original swap file design at the level of page size,
especially as FAT32 typically uses 4k clusters.

It's possible the NT series may have changed this, somewhere on the
long journey from pre-Win95 NT to XP; if so, I'd guess this would have
been done for Win2000. But more likely it's the same, given that NTFS
has a hard preference for 4k clusters too.
My arguments to the contrary start with how, going from billionths of a
second in memory to thousandths of a second access time on a disk, the pagefile
writes [memory to disk] are going to stack up while waiting for the disk to
become available to be written to.

Yes, this is where the "wait" in swapping comes in. And yes; several
successive Moore's Law years should mean transistor space is cheap
enough to redo those 1024 x 1024 x 4096 figures to something grander,
but an "edge" like that can cause quite a ruckus at the system
software level - a lot of stuff would have to be re-written.
"thus transparent to what we are dealing with (transferring from RAM to HD)."
The time spent transferring data/code from memory to disk IS relevant when
you consider that cluster size can make a huge difference to any file.
Pagefile.sys is a file.

One avoids that by treating the swap file as a "special case", where
contiguity is assumed, thus avoiding trips to the FAT to see where
each next cluster is in the chain. Win3.1 introduced the "permanent"
swap file that worked in exactly this way, and one of the
optimizations of the day was to choose this for speed.

In Win95, this contiguous requirement seems to have fallen away, given
the OS can dynamically resize the swap file. I suspect what's
happened is that cluster run chains are managed internally somehow,
much as is the norm within NTFS. But that's a guess.

However, I don't think everything stops when a page has to be pulled
from disk to RAM. More likely the OS scheduler passes control to the
next thread while the stalled one awaits its disk food.
For example if you do video editing you do NOT want a 4kb cluster size.
This is due to the fact you are working with relatively large files 1-2gigs
and larger. Math shows you the reason. A 2gig file on 4kb clusters takes up
500 clusters. A 2gig file on 512kb clusters takes up 4 clusters.

Er... did you mean 2M, there?

I think you will find the video editing app asserts itself to some
extent, in various ways, to work around file system issues. It's
likely to block data writes a number of clusters at a time, and (on
FAT32) break across files when the maximum size limit is reached.

But yes; while you don't want 512k clusters, you may indeed want
bigger clusters in such cases. The way you do that is through
partitioning, so that different things can benefit from different file
systems and cluster sizes. In this case, it's more likely the video
data will be on a different physical drive, to unlink the patterns of
head travel for system and data access.
rarely is a file contiguous and this is where the true "tire meets the road"
consequences of fragmentation come in. 4 reads vs 500 reads makes the larger
cluster size a faster retrieval option for large files.

True. Paging size is just one factor in performance, which is in turn
one factor in which file system and cluster size you choose. Because
priorities differ, I like 4k for C: but may prefer bigger elsewhere.
So how is this data/code organized in the pagefile?

I have a feeling it may mirror the glimpse of what we saw about how
the processor itself managed page tables etc.
Where does the cluster size of 4kb help in the disk read operation? The disk
head could have read two 32kb blocks of data just as fast as sixteen 4kb
blocks if they are contiguous. Now there is a 64kb hole in the 384kb block
of data. How is this space optimized for the next block of data to be
written?

If you take the 4k size as imposed by the CPU - and speculation aside,
I see no reason to believe this has changed - then a 16k cluster size
means 4 x as many bytes have to trundle through the bus and into RAM
via DMA, with the excess thrown out if it's not needed (or extra
inventory tracking if it is to be used as "cache").

While DMA frees the CPU of gazing at each byte as it makes the trip,
it's still not quite a free lunch.
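
A rough illustration of that overhead, assuming the whole cluster has to be
transferred to satisfy a single 4KB page-in:

# Assumes the whole cluster must be transferred to satisfy one 4KB page-in.
PAGE = 4 * 1024

for cluster_kb in (4, 16, 64):
    cluster = cluster_kb * 1024
    factor = cluster // PAGE
    print("%2dK cluster: %dx the bytes of one page cross the bus"
          % (cluster_kb, factor))
# 4K: 1x, 16K: 4x, 64K: 16x
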
The last point is that both Intel and MS documentation talk about large
memory pages. The intel URLs talk about how programmers can use switches in
their programming to utilize different memory models.

I'd worry about confusing memory segmentation, software's internal
overlay paging schemes, and the processor's virtual memory paging,
when considering what a "page" means.
concerning XP in the section titled "Windows XP: Kernel Improvements", MS
talks about XP's use of "large memory pages of 4meg". According to this
article if you have more than 255megs of ram you are automatically going to
use large memory pages.

Well, that's a URL I'd like to see. Maybe I can biopsy and Google it?
quote from article
"on such systems, these files (and other core operating system data such as
initial nonpaged pool and the data structures that describe the state of each
physical memory page) are mapped with 4MB "large pages" (as opposed to the
normal 4KB page size on the x86 processor set). By mapping this core
operating system code and data with 4MB pages"

Bingo!

http://msdn.microsoft.com/msdnmag/issues/01/12/xpkernel/default.aspx

"... these files (and other core operating system data such as initial
nonpaged pool and the data structures that describe the state of each
physical memory page) are mapped with 4MB "large pages" (as opposed to
the normal 4KB page size on the x86 processor set). By mapping this
core operating system code and data with 4MB pages, the first
reference to any byte within each 4MB region results in the x86 memory
management unit caching the address translation information to find
any other byte within the 4MB region without having to resort to
looking in page table data structures. This speeds address
translation."

Note that this large page size applies to some files only, not
everything, and that the article doesn't imply the processor's page
size has changed. It looks more like an added layer of abstraction
over an unchanged processor reality.
The article does not talk about paging to disk at all.

Quite; it goes about locating offsets into large files without having
to check whether it's still in RAM. You'd prolly find the files
involved are locked into RAM anyway, i.e. not to be paged out.
If the Intel processor supports the three memory management models, and MS
programming supports large memory pages, then how can it be assumed ALL
memory manager operations are done in 4kb pages? Wouldn't the OS's memory
manager strategy change depending on the application being run? How do we go
from 4kb pages in RAM being equal to 4kb clusters on disk?

We're guessing at detail here. We can either take the recommendation
of the system guys at face value - and they do seem to show a
preference for 4k clusters - or we have to swot up to the point we can
meaningfully assess this stuff first-hand.

To be honest, it's more work than I can take on, right now ;-)


---------- ----- ---- --- -- - - - -
"He's such a character!"
' Yeah - CHAR(0) '
 
Guest

Man I hate it when I can't do math! Yes 2meg not 2gig.
Yes this is a lot of work but then I have been on the pagefile optimization
issue for almost 5 years now. The basis for this discussion is that Alex
Nichol has a web site that is being recommended by others for pagefile
optimization. I believe a number of the recommendations are contrary to
optimization. Key in this discussion is how pages in RAM are written to the
drive.

I object to web sites that make statements/recommendations not based on
evidence or documentation. This site's recommendation - that since 4k pages
are used in memory, 4k clusters are optimal - continues to be unsupported.
So began my digging into actual pagefile.sys operations.

I use IE to read this forum and can scroll to the top. The layout here sucks
compared to almost any other site.

[Here is the section and urls from the top post.]
Intel architecture involves three memory management models: the FLAT
MODEL, the SEGMENTED MODEL and the REAL ADDRESS MODE MODEL.
See ftp://download.intel.com/design/Pentium4/manuals/25366514.pdf
Section 3.3
Look at chapter three, page 55 where it discusses segmented memory.
"The segment selector identifies the segment to be accessed and the offset
identifies a byte in the address space of the segment. The programs running
on an IA-32 processor can address up to 16,383 segments of different sizes and
types, and each segment can be as large as 2 to the 32nd power"

See here: http://www.intel.com/design/pentiumii/manuals/24319202.pdf

Section 3.6 Paging (Virtual Memory)
Second paragraph.
"When paging is used, the processor divides the linear address space into
fixed-size pages (generally 4 Kbytes in length) that are mapped into physical
memory and/or disk storage."

Section 3.6.1 Paging Options
Paging is controlled by three flags in the processor's control registers:
PG (paging flag)
PSE (page size extensions) flag
PAE (physical address extension)
Skipping down to the PSE description:
"The PSE flag enables large page sizes: 4 Mbyte [that's MEGAbytes] pages or
2 Mbyte pages (when the PAE flag is set)."

If you look at table 3-3 you will note 4KB, 2MB and 4MB paging sizes.
[end of paste]

Information very similar to what you posted concerning in-RAM memory
management is contained in those links at Intel's site.

What happens in RAM is pretty well documented, except for the pageout
operations. Nowhere do I find the mechanism used to transfer pages in
memory to the hard disk.
There is lots of discussion of in-RAM memory operations and of the 4kb, 2m and
4m page sizes being used. XP is using some of this in its kernel
improvements with 4m pages.

But you are right. I am left guessing concerning this aspect of OS
operations. Sure is strange there is so much info on operations in RAM but
not concerning the internal workings of pagefile.sys.

I will keep digging. Thanks for the help.
 
cquirke (MVP Win9x)

On Tue, 28 Dec 2004 10:11:01 -0800, "Joshua Bolton"
Yes this is a lot of work but then I have been on the pagefile optimization
issue for almost 5 years now. The basis for this discussion is that Alex
Nichol has a web site that is being recommended by others for pagefile
optimization. I believe a number of the recommendations are contrary to
optimization. Key in this discussion is how pages in RAM are written to the
drive.
I object to web sites that make statements/recommendations not based on
evidence or documentation. This site's recommendation is that since 4k pages
are used in memory that 4k clusters are optimal which continues to be
unsupported. So began my digging into actual pagefile.sys operations.

So far, I'd go with the site's recommendations, as I've not seen
anything to suggest (say) that XP now pages in 8k or larger blocks.
[Here is the section and urls from the top post.]

Ah, thanks!
Intel architecture involves three memory management models and they are FLAT
MODEL, SEGMENTED MODEL and REAL ADDRESS MODE MODEL
See ftp://download.intel.com/design/Pentium4/manuals/25366514.pdf
Section 3.3
Look at chapter three, page 55 where it discusses segmented memory.
"The segment selector identifies the segment to be accessed and the offset
identifies a byte in the address space of the segment. The programs running
on an IA-32 processor can address up to 16,383 segments of different sizes and
types, and each segment can be as large as 2 to the 32nd power"

Segments are not pages; that's a crucial thing.

That I've downloaded, thanks...
Section 3.6 Paging (Virtual Memory)
Second paragraph.
"When paging is used, the processor divides the linear address space into
fixed-size pages (generally 4 Kbytes in length) that are mapped into physical
memory and/or disk storage."

Yep. It goes on to mention the Look-aside Buffers, which was some of
the detail I snipped in my last post.
Section 3.6.1 Paging Options
Paging is controlled by three flags in the processor's control registers:
PG (paging flag)
PSE (page size extensions) flag
PAE (physical address extension)
Skipping down to the PSE description:
"The PSE flag enables large page sizes: 4 Mbyte [that's MEGAbytes] pages or
2 Mbyte pages (when the PAE flag is set)."

At this point, I think we can close the book ;-)

It's clear that page size is hard-coded to one of only 3 options, as
set by the PSE and PAE flags. Either 4k, 2M or 4M.

Both 2M and 4M are way too granular for routine use, and in any case,
would contain a number of clusters (thus possible fragmentation and extra head
travel within one page) no matter how big the cluster size was.

Generally, the largest cluster size is going to be around 64k - still
way smaller than the 2M page size.

Even if MS were to make some use of 2M and 4M pages, they'd be faced
with the multiple-clusters-per-page issue. I have a hunch they'd
solve that issue by making special arrangements for large-paged
material to be kept contiguous, irrespective of cluster size.
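
A quick way to see the granularity mismatch (illustrative arithmetic only):

# Clusters spanned by one page, for the three hardware page sizes and two
# cluster sizes. Purely illustrative.
KB = 1024
MB = 1024 * KB

def clusters_per_page(page_bytes, cluster_bytes):
    return -(-page_bytes // cluster_bytes)   # ceiling division

for page in (4 * KB, 2 * MB, 4 * MB):
    for cluster in (4 * KB, 64 * KB):
        print("%4dK page on %2dK clusters: %d cluster(s)"
              % (page // KB, cluster // KB, clusters_per_page(page, cluster)))
# A 2M page spans 512 x 4K clusters (or 32 x 64K); a 4K page always fits in one.
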
If you look at table 3-3 you will note 4KB, 2MB and 4MB paging sizes.
[end of paste]

But this isn't a range; it's either one of those three sizes. It's
like if I offer you CDRs in lots of 5, 5 million or 10 million; maybe
you'd rather buy them 20 or 50 at a time, but even so you'd have to
buy them in 5s because 5 million is way too many.
What happens in RAM is pretty well documented, except for the pageout
operations. Nowhere do I find the mechanism used to transfer pages in
memory to the hard disk.

The system sprawls on both sides of the hardware divide.

Some stuff is hardware-initiated and managed; the generation and
routing of page fault exceptions, the tables that track which page is
in RAM etc., and limits such as the page size.

The OS has to interact with these realities, and while it can
virtualise them to some extent, it may not pay to do so. It's the OS
that acts on a page fault exception, and does the actual pulling of
pages into RAM or (when dirty) writing them back to disk first.

Generally, writes hurt more than reads. A page that never changes can
be dropped out of RAM, and it's "cheap" to do so because if you need
it again, you just read it off disk again. But a page that is "dirty"
has to be written to disk if it's paged out of RAM, so there's double
the impact; the write to free up the RAM, and the read to get it back.

So the tendency is to delay writes, and that increases the critical
window risk that applies during a bad exit.
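
A minimal sketch of that clean-versus-dirty logic (hypothetical names; not
how the Windows memory manager is actually written):

# Minimal sketch of the eviction logic described above. All names are
# hypothetical; this is not the Windows memory manager.
class Pagefile:
    def __init__(self):
        self.slots = {}
    def write(self, page_id, data):
        self.slots[page_id] = bytes(data)    # the "slow" disk write

class Page:
    def __init__(self, page_id, data):
        self.page_id = page_id
        self.data = data
        self.dirty = False                   # set when the page is written to

def evict(page, pagefile):
    """Free the RAM behind a page; only dirty pages cost a disk write."""
    if page.dirty:
        pagefile.write(page.page_id, page.data)   # write back before discarding
        page.dirty = False
    page.data = None    # clean pages are simply dropped; re-read them on demand

pf = Pagefile()
p = Page(7, bytearray(4096))                 # one 4KB page
p.dirty = True                               # pretend it was modified
evict(p, pf)                                 # incurs the write-back
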
There is lots of discussion on in RAM memory operations and that 4kb, 2m and
4m are page sizes being used. XP is using some of this in its kernel
improvements with 4m pages.

Yes, but I don't think they are routinely paging around in 2M or 4M
chunks. I suspect certain core files are paged large, to make it
"cheaper" to look up offsets in the page, as well as reduce the
clutter of too much page housekeeping.

When it gets to the section on 36-bit addressing, I get a sense of
deja vu; it looks like the 32-bit address limit is today's 1M barrier
(which only became a 640k barrier because IBM decided to stick their
ROMs at the then-top of the memory map).

To squash that addressability issue flat, roll on 64-bit computing
:)
But you are right. I am left with guessing concerning this aspect of OS
operations. Sure is strange there is so much info on operation in RAM but
not concerning the internal workings of pagefile.sys.

Well, a lot of the RAM stuff is up to the processor, and has to be
documented to attract software devs to support that processor.

OTOH, details of how the page file is managed are up to the OS, and as
application devs "don't need" that info, whereas competing OSs may
find it useful, you are less likely to find it documented!
I will keep digging. Thanks for the help.

Keep us posted :)


---------- ----- ---- --- -- - - - -
"He's such a character!"
' Yeah - CHAR(0) '
 
