-
-PV HASHING Changes - JK 1/2007
-
-Pve's establish physical to virtual mappings. These are used for aliasing of a
-physical page to (potentially many) virtual addresses within pmaps. In the
-previous implementation the structure of the pv_entries (each 16 bytes in size) was
-
-typedef struct pv_entry {
- struct pv_entry_t next;
- pmap_t pmap;
- vm_map_offset_t va;
-} *pv_entry_t;
-
-An initial array of these is created at boot time, one per physical page of
-memory, indexed by the physical page number. Additionally, a pool of entries
-is created from a pv_zone to be used as needed by pmap_enter() when it is
-creating new mappings. Originally, we kept this pool around because the code
-in pmap_enter() was unable to block if it needed an entry and none were
-available - we'd panic. Some time ago I restructured the pmap_enter() code
-so that for user pmaps it can block while zalloc'ing a pv structure and restart,
-removing a panic from the code (in the case of the kernel pmap we cannot block
-and still panic, so, we keep a separate hot pool for use only on kernel pmaps).
-The pool has not been removed since there is a large performance gain keeping
-freed pv's around for reuse and not suffering the overhead of zalloc for every
-new pv we need.
-
-As pmap_enter() created new mappings it linked the new pve's for them off the
-fixed pv array for that ppn (off the next pointer). These pve's are accessed
-for several operations, one of them being address space teardown. In that case,
-we basically do this
-
- for (every page/pte in the space) {
- calc pve_ptr from the ppn in the pte
- for (every pv in the list for the ppn) {
- if (this pv is for this pmap/vaddr) {
- do housekeeping
- unlink/free the pv
- }
- }
- }
-
-The problem arose when we were running, say 8000 (or even 2000) apache or
-other processes and one or all terminate. The list hanging off each pv array
-entry could have thousands of entries. We were continuously linearly searching
-each of these lists as we stepped through the address space we were tearing
-down. Because of the locks we hold, likely taking a cache miss for each node,
-and interrupt disabling for MP issues the system became completely unresponsive
-for many seconds while we did this.
-
-Realizing that pve's are accessed in two distinct ways (linearly running the
-list by ppn for operations like pmap_page_protect and finding and
-modifying/removing a single pve as part of pmap_enter processing) has led to
-modifying the pve structures and databases.
-
-There are now two types of pve structures. A "rooted" structure which is
-basically the original structure accessed in an array by ppn, and a ''hashed''
-structure accessed on a hash list via a hash of [pmap, vaddr]. These have been
-designed with the two goals of minimizing wired memory and making the lookup of
-a ppn faster. Since a vast majority of pages in the system are not aliased
-and hence represented by a single pv entry I've kept the rooted entry size as
-small as possible because there is one of these dedicated for every physical
-page of memory. The hashed pve's are larger due to the addition of the hash
-link and the ppn entry needed for matching while running the hash list to find
-the entry we are looking for. This way, only systems that have lots of
-aliasing (like 2000+ httpd procs) will pay the extra memory price. Both
-structures have the same first three fields allowing some simplification in
-the code.
-
-They have these shapes
-
-typedef struct pv_rooted_entry {
- queue_head_t qlink;
- vm_map_offset_t va;
- pmap_t pmap;
-} *pv_rooted_entry_t;
-
-
-typedef struct pv_hashed_entry {
- queue_head_t qlink;
- vm_map_offset_t va;
- pmap_t pmap;
- ppnum_t ppn;
- struct pv_hashed_entry *nexth;
-} *pv_hashed_entry_t;
-
-The main flow difference is that the code is now aware of the rooted entry and
-the hashed entries. Code that runs the pv list still starts with the rooted
-entry and then continues down the qlink onto the hashed entries. Code that is
-looking up a specific pv entry first checks the rooted entry and then hashes
-and runs the hash list for the match. The hash list lengths are much smaller
-than the original pv lists that contained all aliases for the specific ppn.
-
-*/
+ *
+ * PV HASHING Changes - JK 1/2007
+ *
+ * PVEs ("pv entries") establish physical-to-virtual mappings. They are used
+ * to alias a physical page to (potentially many) virtual addresses within
+ * pmaps. In the previous implementation, each pv_entry (16 bytes in size) had
+ * this structure:
+ *
+ * typedef struct pv_entry {
+ *     struct pv_entry *next;
+ *     pmap_t          pmap;
+ *     vm_map_offset_t va;
+ * } *pv_entry_t;
+ *
+ * An initial array of these is created at boot time, one per physical page of
+ * memory, indexed by the physical page number. Additionally, a pool of entries
+ * is created from a pv_zone to be used as needed by pmap_enter() when it is
+ * creating new mappings. Originally, we kept this pool around because the code
+ * in pmap_enter() was unable to block if it needed an entry and none were
+ * available; we'd panic. Some time ago I restructured the pmap_enter() code
+ * so that for user pmaps it can block while zalloc'ing a pv structure, then
+ * restart, removing a panic from the code (in the case of the kernel pmap we
+ * cannot block and still panic, so we keep a separate hot pool for use only
+ * on kernel pmaps). The pool has not been removed because keeping freed PVEs
+ * around for reuse yields a large performance gain over paying the overhead
+ * of zalloc for every new PVE we need.
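+ *
+ * As a rough sketch of that allocation policy (illustrative only, not the
+ * actual pmap_enter() code; pv_pool_pop(), pv_kern_pool_pop() and the Retry
+ * label are hypothetical names):
+ *
+ *	pv_e = pv_pool_pop();			// try the reuse pool first
+ *	if (pv_e == PV_ENTRY_NULL) {
+ *		if (pmap == kernel_pmap) {
+ *			pv_e = pv_kern_pool_pop();	// hot pool; never block
+ *			if (pv_e == PV_ENTRY_NULL)
+ *				panic("pmap_enter: no kernel pv entries");
+ *		} else {
+ *			PMAP_UNLOCK(pmap);
+ *			pv_e = zalloc(pv_zone);	// user pmap: OK to block here
+ *			PMAP_LOCK(pmap);
+ *			goto Retry;	// mapping may have changed; restart
+ *		}
+ *	}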
+ *
+ * As pmap_enter() created new mappings, it linked the new PVEs for them off
+ * the fixed pv array entry for that ppn (via the next pointer). These PVEs
+ * are accessed for several operations, one of them being address space
+ * teardown. In that case, we basically do this:
+ *
+ *	for (every page/pte in the space) {
+ *		calc pve_ptr from the ppn in the pte
+ *		for (every pv in the list for the ppn) {
+ *			if (this pv is for this pmap/vaddr) {
+ *				do housekeeping
+ *				unlink/free the pv
+ *			}
+ *		}
+ *	}
+ *
+ * The problem arose when we were running, say, 8000 (or even 2000) apache or
+ * other processes and one or all of them terminated. The list hanging off
+ * each pv array entry could have thousands of entries, and we linearly
+ * searched one of these lists for every pte in the address space being torn
+ * down. Because of the locks we held, the likely cache miss on each node, and
+ * the interrupt disabling needed for MP safety, the system became completely
+ * unresponsive for many seconds while we did this.
+ *
+ * Realizing that PVEs are accessed in two distinct ways (linearly running the
+ * list by ppn for operations like pmap_page_protect(), and finding and
+ * modifying/removing a single PVE as part of pmap_enter() processing) led to
+ * a redesign of the PVE structures and databases.
+ *
+ * There are now two types of PVE structures: a "rooted" structure, which is
+ * basically the original structure, accessed in an array by ppn; and a
+ * "hashed" structure, accessed on a hash list via a hash of [pmap, vaddr].
+ * These have been designed with the two goals of minimizing wired memory and
+ * making the lookup of a ppn faster. Since the vast majority of pages in the
+ * system are not aliased, and hence are represented by a single pv entry,
+ * I've kept the rooted entry size as small as possible because one of these
+ * is dedicated to every physical page of memory. The hashed PVEs are larger
+ * due to the addition of the hash link and the ppn field needed for matching
+ * while running the hash list to find the entry we are looking for. This way,
+ * only systems that have lots of aliasing (like 2000+ httpd procs) will pay
+ * the extra memory price. Both structures have the same first three fields,
+ * allowing some simplification in the code.
+ *
+ * They have these shapes:
+ *
+ * typedef struct pv_rooted_entry {
+ *     queue_head_t    qlink;
+ *     vm_map_offset_t va;
+ *     pmap_t          pmap;
+ * } *pv_rooted_entry_t;
+ *
+ * typedef struct pv_hashed_entry {
+ *     queue_head_t    qlink;
+ *     vm_map_offset_t va;
+ *     pmap_t          pmap;
+ *     ppnum_t         ppn;
+ *     struct pv_hashed_entry *nexth;
+ * } *pv_hashed_entry_t;
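+ *
+ * The hash only needs to mix pmap and vaddr and mask down to the table size.
+ * A minimal sketch of the kind of function intended (illustrative only; the
+ * name pv_hash_index and the assumption that npvhash is a power-of-two table
+ * size minus one are mine, not necessarily what the code uses):
+ *
+ *	static inline uint32_t
+ *	pv_hash_index(pmap_t pmap, vm_map_offset_t va)
+ *	{
+ *		// XOR the pmap pointer with the virtual page number
+ *		return ((uint32_t)(uintptr_t)pmap ^
+ *		        (uint32_t)(va >> PAGE_SHIFT)) & npvhash;
+ *	}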
+ *
+ * The main flow difference is that the code is now aware of the rooted entry
+ * and the hashed entries. Code that runs the pv list still starts with the
+ * rooted entry and then continues down the qlink onto the hashed entries.
+ * Code that looks up a specific pv entry first checks the rooted entry, then
+ * hashes [pmap, vaddr] and runs the hash list for the match. The hash lists
+ * are much shorter than the original pv lists, which contained all aliases
+ * for the specific ppn.
+ *
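+ * A minimal sketch of that lookup (illustrative only; pai_to_pvh() and
+ * pv_hash_table are assumed names for the rooted-array and hash-table
+ * accessors):
+ *
+ *	pv_h = pai_to_pvh(ppn);		// rooted entry for this physical page
+ *	if (pv_h->pmap == pmap && pv_h->va == va)
+ *		return pv_h;		// the common, unaliased case
+ *	pvh_e = pv_hash_table[pv_hash_index(pmap, va)];
+ *	while (pvh_e != PV_HASHED_ENTRY_NULL) {
+ *		if (pvh_e->pmap == pmap && pvh_e->va == va &&
+ *		    pvh_e->ppn == ppn)
+ *			return pvh_e;	// found the alias on its short chain
+ *		pvh_e = pvh_e->nexth;	// else keep walking the hash chain
+ *	}
+ *	return PV_HASHED_ENTRY_NULL;	// no such mapping
+ *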
+ */