X-Git-Url: https://git.saurik.com/redis.git/blobdiff_plain/1d03c1c98a45ec569e017e4c0b8957c4ce00850b..HEAD:/deps/jemalloc/doc/jemalloc.xml.in diff --git a/deps/jemalloc/doc/jemalloc.xml.in b/deps/jemalloc/doc/jemalloc.xml.in index 7a32879a..54b87474 100644 --- a/deps/jemalloc/doc/jemalloc.xml.in +++ b/deps/jemalloc/doc/jemalloc.xml.in @@ -30,6 +30,7 @@ malloc calloc posix_memalign + aligned_alloc realloc free malloc_usable_size @@ -41,6 +42,7 @@ rallocm sallocm dallocm + nallocm --> general purpose memory allocation functions @@ -72,6 +74,11 @@ size_t alignment size_t size + + void *aligned_alloc + size_t alignment + size_t size + void *realloc void *ptr @@ -154,6 +161,12 @@ void *ptr int flags + + int nallocm + size_t *rsize + size_t size + int flags + @@ -183,6 +196,14 @@ alignment must be a power of 2 at least as large as sizeof(void *). + The aligned_alloc function + allocates size bytes of memory such that the + allocation's base address is an even multiple of + alignment. The requested + alignment must be a power of 2. Behavior is + undefined if size is not an integral multiple of + alignment. + The realloc function changes the size of the previously allocated memory referenced by ptr to size bytes. The @@ -297,12 +318,15 @@ for (i = 0; i < nbins; i++) { Experimental API The experimental API is subject to change or removal without regard - for backward compatibility. + for backward compatibility. If + is specified during configuration, the experimental API is + omitted. The allocm, rallocm, - sallocm, and - dallocm functions all have a + sallocm, + dallocm, and + nallocm functions all have a flags argument that can be used to specify options. The functions only check the options that are contextually relevant. Use bitwise or (|) operations to @@ -344,6 +368,15 @@ for (i = 0; i < nbins; i++) { object. This constraint can apply to both growth and shrinkage. + + ALLOCM_ARENA(a) + + + Use the arena specified by the index + a. 
This macro does not validate that + a specifies an arena in the valid + range. + @@ -351,7 +384,9 @@ for (i = 0; i < nbins; i++) { least size bytes of memory, sets *ptr to the base address of the allocation, and sets *rsize to the real size of the allocation if - rsize is not NULL. + rsize is not NULL. Behavior + is undefined if size is + 0. The rallocm function resizes the allocation at *ptr to be at least @@ -364,7 +399,8 @@ for (i = 0; i < nbins; i++) { language="C">size + extra) bytes, though inability to allocate the extra byte(s) will not by itself result in failure. Behavior is - undefined if (size + + undefined if size is 0, or if + (size + extra > SIZE_T_MAX). @@ -374,6 +410,15 @@ for (i = 0; i < nbins; i++) { The dallocm function causes the memory referenced by ptr to be made available for future allocations. + + The nallocm function allocates no + memory, but it performs the same size computation as the + allocm function, and if + rsize is not NULL it sets + *rsize to the real size of the allocation that + would result from the equivalent allocm + function call. Behavior is undefined if + size is 0. @@ -408,9 +453,9 @@ for (i = 0; i < nbins; i++) { suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If is specified during configuration, this - allocator uses both sbrk + allocator uses both mmap 2 and - mmap + sbrk 2, in that order of preference; otherwise only mmap 2 is used. @@ -455,24 +500,14 @@ for (i = 0; i < nbins; i++) { allocations in constant time. Small objects are managed in groups by page runs. Each run maintains - a frontier and free list to track which regions are in use. Unless - is specified during configuration, - allocation requests that are no more than half the quantum (8 or 16, - depending on architecture) are rounded up to the nearest power of two that - is at least sizeof(void *). 
- Allocation requests that are more than half the quantum, but no more than - the minimum cacheline-multiple size class (see the opt.lg_qspace_max - option) are rounded up to the nearest multiple of the quantum. Allocation - requests that are more than the minimum cacheline-multiple size class, but - no more than the minimum subpage-multiple size class (see the opt.lg_cspace_max - option) are rounded up to the nearest multiple of the cacheline size (64). - Allocation requests that are more than the minimum subpage-multiple size - class, but no more than the maximum subpage-multiple size class are rounded - up to the nearest multiple of the subpage size (256). Allocation requests - that are more than the maximum subpage-multiple size class, but small - enough to fit in an arena-managed chunk (see the sizeof(double). All other small + object size classes are multiples of the quantum, spaced such that internal + fragmentation is limited to approximately 25% for all but the smallest size + classes. Allocation requests that are larger than the maximum small size + class, but small enough to fit in an arena-managed chunk (see the opt.lg_chunk option), are rounded up to the nearest run size. Allocation requests that are too large to fit in an arena-managed chunk are rounded up to the nearest multiple of @@ -490,41 +525,55 @@ for (i = 0; i < nbins; i++) { Size classes - - - - + + + + Category - Subcategory + Spacing Size - Small - Tiny + Small + lg [8] - Quantum-spaced + 16 [16, 32, 48, ..., 128] - Cacheline-spaced - [192, 256, 320, ..., 512] + 32 + [160, 192, 224, 256] + + + 64 + [320, 384, 448, 512] - Subpage-spaced - [768, 1024, 1280, ..., 3840] + 128 + [640, 768, 896, 1024] - Large + 256 + [1280, 1536, 1792, 2048] + + + 512 + [2560, 3072, 3584] + + + Large + 4 KiB [4 KiB, 8 KiB, 12 KiB, ..., 4072 KiB] - Huge + Huge + 4 MiB [4 MiB, 8 MiB, 12 MiB, ...] 
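The small size classes in the table above follow a regular pattern: a single 8-byte class, then multiples of 16 up to 128, then groups whose spacing doubles each time the size doubles. The sketch below is one way to express that rounding in C. It is an illustration only, not jemalloc's internal code: the function name `small_size_class` is hypothetical, and the constants assume the 16-byte quantum and 4 KiB page size that the table shows, which may differ on other architectures or configurations.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of jemalloc): round a request up to the
 * small size class listed in the table. Assumes a 16-byte quantum and
 * 4 KiB pages, matching the table. Returns 0 when the request is 0 or
 * larger than the largest small class (3584 bytes).
 */
size_t
small_size_class(size_t size)
{
	size_t spacing = 32;	/* spacing of the group that ends at limit */
	size_t limit = 256;	/* largest class in the current group */

	if (size == 0 || size > 3584)
		return 0;
	if (size <= 8)
		return 8;				/* the lone 8-byte class */
	if (size <= 128)
		return (size + 15) & ~(size_t)15;	/* 16-byte spacing */
	/* Above 128, the spacing doubles each time the group limit doubles. */
	while (size > limit) {
		spacing <<= 1;
		limit <<= 1;
	}
	return (size + spacing - 1) & ~(spacing - 1);
}
```

Under these assumptions, `small_size_class(129)` yields 160 and `small_size_class(3585)` yields 0, which marks a request that falls into the Large category instead.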
@@ -592,32 +641,42 @@ for (i = 0; i < nbins; i++) { - config.dynamic_page_shift + config.fill (bool) r- - was - specified during build configuration. + was specified during + build configuration. - config.fill + config.lazy_lock (bool) r- - was specified during + was specified + during build configuration. + + + + + config.mremap + (bool) + r- + + was specified during build configuration. - config.lazy_lock + config.munmap (bool) r- - was specified - during build configuration. + was specified during + build configuration. @@ -662,51 +721,41 @@ for (i = 0; i < nbins; i++) { - config.swap + config.tcache (bool) r- - was specified during - build configuration. + was not specified + during build configuration. - config.sysv + config.tls (bool) r- - was specified during + was not specified during build configuration. - config.tcache - (bool) - r- - - was not specified - during build configuration. - - - - - config.tiny + config.utrace (bool) r- - was not specified - during build configuration. + was specified during + build configuration. - config.tls + config.valgrind (bool) r- - was not specified during + was specified during build configuration. @@ -735,38 +784,28 @@ for (i = 0; i < nbins; i++) { - - - opt.lg_qspace_max - (size_t) - r- - - Size (log base 2) of the maximum size class that is a - multiple of the quantum (8 or 16 bytes, depending on architecture). - Above this size, cacheline spacing is used for size classes. The - default value is 128 bytes (2^7). - - - + - opt.lg_cspace_max + opt.lg_chunk (size_t) r- - Size (log base 2) of the maximum size class that is a - multiple of the cacheline size (64). Above this size, subpage spacing - (256 bytes) is used for size classes. The default value is 512 bytes - (2^9). + Virtual memory chunk size (log base 2). The default + chunk size is 4 MiB (2^22). - + - opt.lg_chunk - (size_t) + opt.dss + (const char *) r- - Virtual memory chunk size (log base 2). The default - chunk size is 4 MiB (2^22). 
+ dss (sbrk + 2) allocation precedence as + related to mmap + 2 allocation. The following + settings are supported: “disabled”, “primary”, + and “secondary” (default). @@ -775,9 +814,9 @@ for (i = 0; i < nbins; i++) { (size_t) r- - Maximum number of arenas to use. The default maximum - number of arenas is four times the number of CPUs, or one if there is a - single CPU. + Maximum number of arenas to use for automatic + multiplexing of threads and arenas. The default is four times the + number of CPUs, or one if there is a single CPU. @@ -794,7 +833,7 @@ for (i = 0; i < nbins; i++) { 2 or a similar system call. This provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused. The - default minimum ratio is 32:1 (2^5:1); an option value of -1 will + default minimum ratio is 8:1 (2^3:1); an option value of -1 will disable dirty page purging. @@ -830,7 +869,49 @@ for (i = 0; i < nbins; i++) { 0x5a. This is intended for debugging and will impact performance negatively. This option is disabled by default unless is specified during - configuration, in which case it is enabled by default. + configuration, in which case it is enabled by default unless running + inside Valgrind. + + + + + opt.quarantine + (size_t) + r- + [] + + Per thread quarantine size in bytes. If non-zero, each + thread maintains a FIFO object quarantine that stores up to the + specified number of bytes of memory. The quarantined memory is not + freed until it is released from quarantine, though it is immediately + junk-filled if the opt.junk option is + enabled. This feature is of particular use in combination with Valgrind, which can detect attempts + to access quarantined objects. This is intended for debugging and will + impact performance negatively. The default quarantine size is 0 unless + running inside Valgrind, in which case the default is 16 + MiB. + + + + + opt.redzone + (bool) + r- + [] + + Redzones enabled/disabled. 
If enabled, small + allocations have redzones before and after them. Furthermore, if the + opt.junk option is + enabled, the redzones are checked for corruption during deallocation. + However, the primary intended purpose of this feature is to be used in + combination with Valgrind, + which needs redzones in order to do effective buffer overflow/underflow + detection. This option is intended for debugging and will impact + performance negatively. This option is disabled by + default unless running inside Valgrind. @@ -850,20 +931,30 @@ for (i = 0; i < nbins; i++) { - + - opt.sysv + opt.utrace (bool) r- - [] + [] - If enabled, attempting to allocate zero bytes will - return a NULL pointer instead of a valid pointer. - (The default behavior is to make a minimal allocation and return a - pointer to it.) This option is provided for System V compatibility. - This option is incompatible with the opt.xmalloc option. - This option is disabled by default. + Allocation tracing based on + utrace + 2 enabled/disabled. This option + is disabled by default. + + + + + opt.valgrind + (bool) + r- + [] + + Valgrind + support enabled/disabled. This option is vestigal because jemalloc + auto-detects whether it is running inside Valgrind. This option is + disabled by default, unless running inside Valgrind. @@ -899,27 +990,10 @@ malloc_conf = "xmalloc:true";]]> allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use. See the opt.lg_tcache_gc_sweep - and opt.lg_tcache_max - options for related tuning information. This option is enabled by - default. - - - - - opt.lg_tcache_gc_sweep - (ssize_t) - r- - [] - - Approximate interval (log base 2) between full - thread-specific cache garbage collection sweeps, counted in terms of - thread-specific cache allocation/deallocation events. Garbage - collection is actually performed incrementally, one size class at a - time, in order to avoid large collection pauses. 
The default sweep - interval is 8192 (2^13); setting this option to -1 will disable garbage - collection. + option for related tuning information. This option is enabled by + default unless running inside Valgrind. @@ -943,31 +1017,21 @@ malloc_conf = "xmalloc:true";]]> [] Memory profiling enabled/disabled. If enabled, profile - memory allocation activity, and use an - atexit - 3 function to dump final memory - usage to a file named according to the pattern - <prefix>.<pid>.<seq>.f.heap, - where <prefix> is controlled by the opt.prof_prefix - option. See the opt.lg_prof_bt_max - option for backtrace depth control. See the opt.prof_active option for on-the-fly activation/deactivation. See the opt.lg_prof_sample option for probabilistic sampling control. See the opt.prof_accum option for control of cumulative sample reporting. See the opt.lg_prof_tcmax - option for control of per thread backtrace caching. See the opt.lg_prof_interval - option for information on interval-triggered profile dumping, and the - opt.prof_gdump - option for information on high-water-triggered profile dumping. - Profile output is compatible with the included pprof - Perl script, which originates from the google-perftools + option for information on interval-triggered profile dumping, the opt.prof_gdump + option for information on high-water-triggered profile dumping, and the + opt.prof_final + option for final profile dumping. Profile output is compatible with + the included pprof Perl script, which originates + from the gperftools package. @@ -985,17 +1049,6 @@ malloc_conf = "xmalloc:true";]]> jeprof. - - - opt.lg_prof_bt_max - (size_t) - r- - [] - - Maximum backtrace depth (log base 2) when profiling - memory allocation activity. The default is 128 (2^7). - - opt.prof_active @@ -1023,8 +1076,8 @@ malloc_conf = "xmalloc:true";]]> Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity. 
Increasing the sampling interval decreases profile fidelity, but also decreases the - computational overhead. The default sample interval is 1 (2^0) (i.e. - all allocations are sampled). + computational overhead. The default sample interval is 512 KiB (2^19 + B). @@ -1038,28 +1091,8 @@ malloc_conf = "xmalloc:true";]]> dumps enabled/disabled. If this option is enabled, every unique backtrace must be stored for the duration of execution. Depending on the application, this can impose a large memory overhead, and the - cumulative counts are not always of interest. See the - opt.lg_prof_tcmax - option for control of per thread backtrace caching, which has important - interactions. This option is enabled by default. - - - - - opt.lg_prof_tcmax - (ssize_t) - r- - [] - - Maximum per thread backtrace cache (log base 2) used - for heap profiling. A backtrace can only be discarded if the - opt.prof_accum - option is disabled, and no thread caches currently refer to the - backtrace. Therefore, a backtrace cache limit should be imposed if the - intention is to limit how much memory is used by backtraces. By - default, no limit is imposed (encoded as -1). - + cumulative counts are not always of interest. This option is disabled + by default. @@ -1099,6 +1132,23 @@ malloc_conf = "xmalloc:true";]]> option. This option is disabled by default. + + + opt.prof_final + (bool) + r- + [] + + Use an + atexit + 3 function to dump final memory + usage to a file named according to the pattern + <prefix>.<pid>.<seq>.f.heap, + where <prefix> is controlled by the opt.prof_prefix + option. This option is enabled by default. + + opt.prof_leak @@ -1110,51 +1160,11 @@ malloc_conf = "xmalloc:true";]]> atexit 3 function to report memory leaks detected by allocation sampling. See the - opt.lg_prof_bt_max - option for backtrace depth control. See the opt.prof option for information on analyzing heap profile output. This option is disabled by default. 
- - - opt.overcommit - (bool) - r- - [] - - Over-commit enabled/disabled. If enabled, over-commit - memory as a side effect of using anonymous - mmap - 2 or - sbrk - 2 for virtual memory allocation. - In order for overcommit to be disabled, the swap.fds mallctl must have - been successfully written to. This option is enabled by - default. - - - - - tcache.flush - (void) - -- - [] - - Flush calling thread's tcache. This interface releases - all cached objects and internal data structures associated with the - calling thread's thread-specific cache. Ordinarily, this interface - need not be called, since automatic periodic incremental garbage - collection occurs, and the thread cache is automatically discarded when - a thread exits. However, garbage collection is triggered by allocation - activity, so it is possible for a thread that stops - allocating/deallocating to retain its cache indefinitely, in which case - the developer may find manual flushing useful. - - thread.arena @@ -1162,11 +1172,8 @@ malloc_conf = "xmalloc:true";]]> rw Get or set the arena associated with the calling - thread. The arena index must be less than the maximum number of arenas - (see the arenas.narenas - mallctl). If the specified arena was not initialized beforehand (see - the arenas.initialized mallctl), it will be automatically initialized as a side effect of calling this interface. @@ -1226,144 +1233,102 @@ malloc_conf = "xmalloc:true";]]> mallctl* calls. - - - arenas.narenas - (unsigned) - r- - - Maximum number of arenas. - - - - - arenas.initialized - (bool *) - r- - - An array of arenas.narenas - booleans. Each boolean indicates whether the corresponding arena is - initialized. - - - - - arenas.quantum - (size_t) - r- - - Quantum size. - - - arenas.cacheline - (size_t) - r- - - Assumed cacheline size. - - - - - arenas.subpage - (size_t) - r- - - Subpage size class interval. - - - - - arenas.pagesize - (size_t) - r- - - Page size. 
- - - - - arenas.chunksize - (size_t) - r- - - Chunk size. - - - - - arenas.tspace_min - (size_t) - r- + thread.tcache.enabled + (bool) + rw + [] - Minimum tiny size class. Tiny size classes are powers - of two. + Enable/disable calling thread's tcache. The tcache is + implicitly flushed as a side effect of becoming + disabled (see thread.tcache.flush). + - arenas.tspace_max - (size_t) - r- + thread.tcache.flush + (void) + -- + [] - Maximum tiny size class. Tiny size classes are powers - of two. + Flush calling thread's tcache. This interface releases + all cached objects and internal data structures associated with the + calling thread's thread-specific cache. Ordinarily, this interface + need not be called, since automatic periodic incremental garbage + collection occurs, and the thread cache is automatically discarded when + a thread exits. However, garbage collection is triggered by allocation + activity, so it is possible for a thread that stops + allocating/deallocating to retain its cache indefinitely, in which case + the developer may find manual flushing useful. - + - arenas.qspace_min - (size_t) - r- + arena.<i>.purge + (unsigned) + -- - Minimum quantum-spaced size class. + Purge unused dirty pages for arena <i>, or for + all arenas if <i> equals arenas.narenas. + - + - arenas.qspace_max - (size_t) - r- + arena.<i>.dss + (const char *) + rw - Maximum quantum-spaced size class. + Set the precedence of dss allocation as related to mmap + allocation for arena <i>, or for all arenas if <i> equals + arenas.narenas. See + opt.dss for supported + settings. + - + - arenas.cspace_min - (size_t) + arenas.narenas + (unsigned) r- - Minimum cacheline-spaced size class. + Current limit on number of arenas. - + - arenas.cspace_max - (size_t) + arenas.initialized + (bool *) r- - Maximum cacheline-spaced size class. + An array of arenas.narenas + booleans. Each boolean indicates whether the corresponding arena is + initialized. 
- arenas.sspace_min + arenas.quantum (size_t) r- - Minimum subpage-spaced size class. + Quantum size. - arenas.sspace_max + arenas.page (size_t) r- - Maximum subpage-spaced size class. + Page size. @@ -1376,52 +1341,13 @@ malloc_conf = "xmalloc:true";]]> Maximum thread-cached size class. - - - arenas.ntbins - (unsigned) - r- - - Number of tiny bin size classes. - - - - - arenas.nqbins - (unsigned) - r- - - Number of quantum-spaced bin size - classes. - - - - - arenas.ncbins - (unsigned) - r- - - Number of cacheline-spaced bin size - classes. - - - - - arenas.nsbins - (unsigned) - r- - - Number of subpage-spaced bin size - classes. - - arenas.nbins (unsigned) r- - Total number of bin size classes. + Number of bin size classes. @@ -1491,6 +1417,16 @@ malloc_conf = "xmalloc:true";]]> for all arenas if none is specified. + + + arenas.extend + (unsigned) + r- + + Extend the array of arenas by appending a new arena, + and returning the new arena index. + + prof.active @@ -1576,7 +1512,9 @@ malloc_conf = "xmalloc:true";]]> application. This is a multiple of the page size, and greater than or equal to stats.allocated. - + This does not include + stats.arenas.<i>.pdirty and pages + entirely devoted to allocator metadata. @@ -1590,8 +1528,7 @@ malloc_conf = "xmalloc:true";]]> application. This is a multiple of the chunk size, and is at least as large as stats.active. This - does not include inactive chunks backed by swap files. his does not - include inactive chunks embedded in the DSS. + does not include inactive chunks. @@ -1602,8 +1539,7 @@ malloc_conf = "xmalloc:true";]]> [] Total number of chunks actively mapped on behalf of the - application. This does not include inactive chunks backed by swap - files. This does not include inactive chunks embedded in the DSS. + application. This does not include inactive chunks. 
@@ -1661,6 +1597,20 @@ malloc_conf = "xmalloc:true";]]> + + + stats.arenas.<i>.dss + (const char *) + r- + + dss (sbrk + 2) allocation precedence as + related to mmap + 2 allocation. See opt.dss for details. + + + stats.arenas.<i>.nthreads @@ -1680,7 +1630,7 @@ malloc_conf = "xmalloc:true";]]> Number of pages in active runs. - + stats.arenas.<i>.pdirty (size_t) @@ -1908,17 +1858,6 @@ malloc_conf = "xmalloc:true";]]> to allocate changed. - - - stats.arenas.<i>.bins.<j>.highruns - (size_t) - r- - [] - - Maximum number of runs at any time thus far. - - - stats.arenas.<i>.bins.<j>.curruns @@ -1962,17 +1901,6 @@ malloc_conf = "xmalloc:true";]]> class. - - - stats.arenas.<i>.lruns.<j>.highruns - (size_t) - r- - [] - - Maximum number of runs at any time thus far for this - size class. - - stats.arenas.<i>.lruns.<j>.curruns @@ -1983,65 +1911,6 @@ malloc_conf = "xmalloc:true";]]> Current number of runs for this size class. - - - - swap.avail - (size_t) - r- - [] - - Number of swap file bytes that are currently not - associated with any chunk (i.e. mapped, but otherwise completely - unmanaged). - - - - - swap.prezeroed - (bool) - rw - [] - - If true, the allocator assumes that the swap file(s) - contain nothing but nil bytes. If this assumption is violated, - allocator behavior is undefined. This value becomes read-only after - swap.fds is - successfully written to. - - - - - swap.nfds - (size_t) - r- - [] - - Number of file descriptors in use for swap. - - - - - - swap.fds - (int *) - rw - [] - - When written to, the files associated with the - specified file descriptors are contiguously mapped via - mmap - 2. The resulting virtual memory - region is preferred over anonymous - mmap - 2 and - sbrk - 2 memory. Note that if a file's - size is not a multiple of the page size, it is automatically truncated - to the nearest page size multiple. See the - swap.prezeroed - mallctl for specifying that the files are pre-zeroed. 
- @@ -2065,10 +1934,9 @@ malloc_conf = "xmalloc:true";]]> This implementation does not provide much detail about the problems it detects, because the performance impact for storing such information - would be prohibitive. There are a number of allocator implementations - available on the Internet which focus on detecting and pinpointing problems - by trading performance for extra sanity checks and detailed - diagnostics. + would be prohibitive. However, jemalloc does integrate with the most + excellent Valgrind tool if the + configuration option is enabled. DIAGNOSTIC MESSAGES @@ -2124,6 +1992,27 @@ malloc_conf = "xmalloc:true";]]> + The aligned_alloc function returns + a pointer to the allocated memory if successful; otherwise a + NULL pointer is returned and + errno is set. The + aligned_alloc function will fail if: + + + EINVAL + + The alignment parameter is + not a power of 2. + + + + ENOMEM + + Memory allocation error. + + + + The realloc function returns a pointer, possibly identical to ptr, to the allocated memory if successful; otherwise a NULL @@ -2196,11 +2085,13 @@ malloc_conf = "xmalloc:true";]]> Experimental API The allocm, rallocm, - sallocm, and - dallocm functions return + sallocm, + dallocm, and + nallocm functions return ALLOCM_SUCCESS on success; otherwise they return an - error value. The allocm and - rallocm functions will fail if: + error value. The allocm, + rallocm, and + nallocm functions will fail if: ALLOCM_ERR_OOM @@ -2259,6 +2150,8 @@ malloc_conf = "lg_chunk:24";]]> 2, sbrk 2, + utrace + 2, alloca 3, atexit