Virtual Memory II: the return of objrmap
Andrea Arcangeli not only wants to make the Linux kernel scale to and beyond 32GB of memory on 32-bit processors; he seems to be in a real hurry. There are, it would seem, customers waiting for a 2.6-based distribution which can run in such environments.
For Andrea, the real culprit in the exhaustion of low memory is clear: it's the reverse-mapping virtual memory ("rmap") code. The rmap code was first described on this page in January, 2002; its purpose is to make it easier for the kernel to free memory when swapping is required. To that end, rmap maintains, for each physical page in the system, a chain of reverse pointers; each pointer indicates a page table which has a reference for that page. By following the rmap chains, the kernel can quickly find all mappings for a given page, unmap them, and swap the page out.
The rmap code solved some real performance problems in the kernel's virtual memory subsystem, but it, too, has a cost. Every one of those reverse mapping entries consumes memory - low memory in particular. Much effort has gone into reducing the memory cost of the rmap chains, but the simple fact remains: as the amount of memory (and the number of processes using that memory) goes up, the rmap chains will consume larger amounts of low memory. Eliminating the rmap overhead would go a long way toward allowing the kernel to scale to larger systems. Of course, one wants to eliminate this overhead while not losing the benefits that rmap brings.
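As a rough illustration of why the chains grow with both memory size and process count, here is a toy Python model of per-page reverse-mapping chains. It is a sketch, not kernel code: the class `RmapChains`, its methods, and the use of plain dictionaries as page tables are all invented for this example.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed hardware page size


class RmapChains:
    """Toy model: one chain of reverse pointers per physical page."""

    def __init__(self):
        # page frame number -> list of (page_table, virtual_address) entries
        self.chains = defaultdict(list)

    def map_page(self, pfn, page_table, vaddr):
        """Install a mapping and record a reverse pointer for it."""
        page_table[vaddr] = pfn
        self.chains[pfn].append((page_table, vaddr))

    def unmap_all(self, pfn):
        """Follow the chain, drop every mapping of pfn, return the count."""
        entries = self.chains.pop(pfn, [])
        for page_table, vaddr in entries:
            del page_table[vaddr]
        return len(entries)

    def entry_count(self):
        """Total reverse-mapping entries held; this is the overhead."""
        return sum(len(chain) for chain in self.chains.values())


# Two "processes" (page tables) sharing the same 1000 pages.
pt_a, pt_b = {}, {}
rmap = RmapChains()
for pfn in range(1000):
    rmap.map_page(pfn, pt_a, PAGE_SIZE * pfn)
    rmap.map_page(pfn, pt_b, PAGE_SIZE * pfn)

print(rmap.entry_count())   # 2000: one entry per page per mapper
print(rmap.unmap_all(7))    # 2: both mappings found without any searching
```

Unmapping is a direct walk of one short chain, which is the benefit; the price is one entry for every (page, mapper) pair.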
Andrea's approach is to bring back and extend the object-based reverse mapping patches. The initial object-based patch was created by Dave McCracken; LWN covered this patch a year ago. Essentially, this patch eliminates the rmap chains for memory which maps a file by following pointers "the long way around" and searching candidate virtual memory areas (VMAs). Andrea has updated this patch and fixed some bugs, but the core of the patch remains the same; see last year's description for the details.
Last week, we raised the possibility that the virtual memory subsystem could see fundamental changes in the course of the 2.6 "stable" series. This week, Linus confirmed that possibility in response to Andrea's object-based reverse mapping patch:
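The "long way around" can be sketched with a small Python model. Everything here is illustrative: the names `FileObject`, `VMA`, `Page`, and `find_mappings` are invented, and the assumption that a page records its offset within the file (`index`) while each VMA records the file offset at which it starts (`pgoff`) is a modelling choice for this sketch rather than something the article spells out.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed hardware page size


@dataclass
class VMA:
    start: int                  # first virtual address covered
    npages: int                 # length of the area, in pages
    pgoff: int                  # file offset (in pages) mapped at `start`
    page_table: dict = field(default_factory=dict)  # vaddr -> pfn


@dataclass
class FileObject:
    vmas: list = field(default_factory=list)  # every VMA mapping this file


@dataclass
class Page:
    pfn: int
    mapping: FileObject         # the object behind the page
    index: int                  # offset of the page within the file, in pages


def find_mappings(page):
    """Search candidate VMAs instead of following a per-page chain."""
    hits = []
    for vma in page.mapping.vmas:
        # Skip VMAs whose window onto the file does not cover this page.
        if not (vma.pgoff <= page.index < vma.pgoff + vma.npages):
            continue
        vaddr = vma.start + (page.index - vma.pgoff) * PAGE_SIZE
        # A covering VMA is only a candidate; the page may not be faulted in.
        if vma.page_table.get(vaddr) == page.pfn:
            hits.append((vma, vaddr))
    return hits
```

No per-page storage is needed beyond the `mapping` pointer the page already carries; the trade is that every lookup has to examine each VMA attached to the file.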
I certainly prefer this to the 4:4 horrors. So it sounds worth it to put it into -mm if everybody else is ok with it.
Assuming this work goes forward, it has the usual implications for the stable kernel. Even assuming that it stays in the -mm tree for some time, its inclusion into 2.6 is likely to destabilize things for a few releases until all of the obscure bugs are shaken out.
Dave McCracken's original patch, in any case, only solves part of the problem. It gets rid of the rmap chains for file-backed memory, but it does nothing for anonymous memory (basic process data - stacks, memory obtained with malloc(), etc.), which has no "object" behind it. File-backed memory is a large portion of the total, especially on systems which are running large Oracle servers and use big, shared file mappings. But anonymous memory is also a large part of the mix; it would be nice to take care of the rmap overhead for that as well.
To that end, Andrea has posted another patch (in preliminary form) which provides object-based reverse mapping for anonymous memory as well. It works, essentially, by replacing the rmap chain with a pointer to a chain of virtual memory area (VMA) structures.
Anonymous pages are always created in response to a request for memory from a single process; as a result, they are never shared at creation time. Given that, there is no need for a new anonymous page to have a chain of reverse mappings; we know that there can be only a single mapping. Andrea's patch adds a union to struct page which includes the existing mapping pointer (for non-anonymous memory) and adds a couple of new ones. One of those is simply called vma, and it points to the (single) VMA structure pointing to the page. So if a process has several non-shared, anonymous pages in the same virtual memory area, every one of those pages points to the same VMA structure, which leads in turn to the process's page tables.
With this structure, the kernel can find the page table which maps a given page by following the pointers through the VMA structure.
Life gets a bit more complicated when the process forks, however. Once that happens, there will be multiple page tables pointing to the same anonymous pages and a single VMA pointer will no longer be adequate. To deal with this case, Andrea has created a new "anon_vma" structure which implements a linked list of VMAs. The third member of the new struct page union is a pointer to this structure which, in turn, points to all VMAs which might contain the page. In the resulting structure, each shared page points to a single anon_vma, and the anon_vma links together the VMAs of the parent and child processes.
If the kernel needs to unmap a page in this scenario, it must follow the linked list and examine every VMA it finds. Once the page is unmapped from every page table found, it can be freed.
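The two cases described above (a direct VMA pointer before a fork, an anon_vma list afterward) can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Andrea's patch: the classes, the `fork` and `unmap` helpers, and the idea that a page remembers its own virtual address (`vaddr`) are all invented for illustration.

```python
class VMA:
    def __init__(self, name):
        self.name = name
        self.page_table = {}    # virtual address -> page object
        self.anon_vma = None    # set once the VMA joins a shared list


class AnonVMA:
    """A list of VMAs that might map a given anonymous page."""

    def __init__(self):
        self.vmas = []


class AnonPage:
    def __init__(self, vma, vaddr):
        self.vaddr = vaddr      # assumption: the page records its address
        # The "union": either a direct VMA pointer or an AnonVMA pointer.
        self.ref = vma.anon_vma if vma.anon_vma else vma
        vma.page_table[vaddr] = self


def fork(parent_vma, child_name):
    """Share the parent's pages with a new child VMA."""
    if parent_vma.anon_vma is None:
        parent_vma.anon_vma = AnonVMA()
        parent_vma.anon_vma.vmas.append(parent_vma)
    child_vma = VMA(child_name)
    child_vma.anon_vma = parent_vma.anon_vma
    child_vma.anon_vma.vmas.append(child_vma)
    for vaddr, page in parent_vma.page_table.items():
        child_vma.page_table[vaddr] = page
        page.ref = parent_vma.anon_vma   # direct pointer no longer suffices
    return child_vma


def unmap(page):
    """Return (mappings removed, VMAs examined)."""
    shared = isinstance(page.ref, AnonVMA)
    candidates = page.ref.vmas if shared else [page.ref]
    removed = 0
    for vma in candidates:
        if vma.page_table.get(page.vaddr) is page:
            del vma.page_table[page.vaddr]
            removed += 1
    return removed, len(candidates)


parent = VMA("parent")
pages = [AnonPage(parent, 0x1000 * i) for i in range(3)]
print(unmap(pages[0]))          # (1, 1): the single VMA is known directly
child = fork(parent, "child")
print(unmap(pages[1]))          # (2, 2): both VMAs on the list map the page
AnonPage(child, 0x2000)         # child replaces its copy (copy-on-write)
print(unmap(pages[2]))          # (1, 2): child was examined but held no mapping
```

The last call shows why the list holds VMAs that "might" contain the page: after the child gets its own copy, its VMA is still scanned even though it no longer maps the original.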
There are some memory costs to this scheme: the VMA structure requires a new list_head structure, and the anon_vma structure must be allocated whenever a chain must be formed. One VMA can refer to thousands of pages, however, so a per-VMA cost will be far less than the per-page costs incurred by the existing rmap code.
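A back-of-the-envelope comparison makes the difference in scale concrete. The numbers below are assumptions chosen for illustration, not measurements: 4KB pages, 8 bytes per rmap chain entry, an 8-byte list_head (two 32-bit pointers), and a hypothetical 16-byte anon_vma.

```python
PAGE_SIZE = 4096  # assumed hardware page size


def rmap_chain_overhead(region_bytes, sharers, entry_bytes=8):
    """Per-page cost: one chain entry per page per process mapping it."""
    pages = region_bytes // PAGE_SIZE
    return pages * sharers * entry_bytes


def anon_vma_overhead(sharers, list_head_bytes=8, anon_vma_size=16):
    """Per-VMA cost: one list_head per sharing VMA plus one anon_vma."""
    return sharers * list_head_bytes + anon_vma_size


GIB = 1 << 30
# A 1GB anonymous region shared by 100 processes after forking.
print(rmap_chain_overhead(GIB, 100))   # 209715200 bytes (200MB)
print(anon_vma_overhead(100))          # 816 bytes
```

Whatever the real structure sizes are, the first figure scales with the number of pages and the second does not.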
This approach does incur a greater computational cost. Freeing a page requires scanning multiple VMAs which may or may not contain references to the page under consideration. This cost will increase with the number of processes sharing a memory region. Ingo Molnar, who is fond of O(1) solutions, is nervous about object-based schemes for this reason. According to Ingo, losing the possibility of creating an O(1) page unmapping scheme is a heavy cost to pay for the prize of making large amounts of memory work on obsolete hardware.
The solution that Ingo would like to see, instead, is to reduce the per-page memory overhead by reducing the number of pages. The means to that end is page clustering - grouping adjacent hardware pages into larger virtual pages. Page clustering would reduce rmap overhead, and reduce the size of the main kernel memory map as well. The available page clustering patch is even more intrusive than object-based reverse mapping, however; it seems seriously unlikely to be considered for 2.6.
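The effect of clustering on the kernel memory map is simple arithmetic, sketched below. The 32-byte size for a struct page and the cluster factor of 8 are assumptions picked to keep the numbers round; the function name is invented for this example.

```python
HW_PAGE_SIZE = 4096  # assumed hardware page size


def mem_map_bytes(ram_bytes, cluster=1, struct_page_bytes=32):
    """Size of the per-page array when `cluster` hardware pages form one page."""
    virtual_page = HW_PAGE_SIZE * cluster
    return (ram_bytes // virtual_page) * struct_page_bytes


GIB = 1 << 30
MIB = 1 << 20
print(mem_map_bytes(32 * GIB) // MIB)             # 256 (MB) with 4KB pages
print(mem_map_bytes(32 * GIB, cluster=8) // MIB)  # 32 (MB) with 32KB pages
```

Any per-page structure, rmap chains included, shrinks by the same factor, since there are simply fewer pages to track.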