echothrust/howtos

A list of OpenBSD (mostly) material

[WIP] OpenBSD Virtual Memory Statistics with vmstat

Introduction

As a subscriber to the openbsd-misc mailing list, I often come across a mail (usually from someone far more knowledgeable) that ends up creating more questions than answers in my head. This case was no different: a mail from Claudio Jeker on the thread with the subject "OpenBSD SMP - BGPd - send_rtmsg: action 1, prefix A.B.C.D/24: No buffer space available - panic: malloc: out of space in kmem_map".

The mail that got me into all this is the following, written in response to some attachments of vmstat output under different conditions [1]:

So the problem is that the malloc space is filled by
a) 26540K of devbuf -- because of the multiqueue support in ixl
b) 63493K of ACPI -- what the heck ACPI?!?
and then there is not enough space for rtable. A full table requires
in your example 50816K of rtable malloc space.

Now on amd64 all of this needs to fit into 128MB which is impossible.

You can use config(8) and bsd.re-config(5) to adjust the nkmempg variable
to something like 131072 (which is 4 times the default size).
This can be verified with `sysctl vm.nkmempages`

Now ixl(4) and ACPI should not be such pigs but in the end 128MB of kernel
malloc space is just stupidly small on a system with 128GB of memory.

The alternative is to set "option NKMEMPAGES=131072" in your GENERIC
config file (or option NKMEMPAGES_MAX=131072). See also options(4).

Long term is the fix this proper. All of this was built when computers had
100MB of memory not 100GB.

So I went on a quest to find out the following:

nkmempg: kernel malloc area in PAGE_SIZE-sized logical pages

To display the current number of PAGE_SIZE-sized logical pages (nkmempg) on your system, run sysctl vm.nkmempages (see also options(4)).

The default PAGE_SIZE for amd64 is 4096 bytes. It is defined in amd64/param.h as follows:

#define PAGE_SHIFT 12
#define PAGE_SIZE	 (1 << PAGE_SHIFT)

To estimate how much memory is allocated for the kernel malloc area, we can do the following:

PAGE_SIZE=$((1 << 12))                            # page size derived from PAGE_SHIFT (4096 on amd64)
PAGE_SIZE=$(sysctl -n hw.pagesize)                # alternatively, ask the running system for its page size
NKMEMPAGES=$(sysctl -n vm.nkmempages)             # current number of kernel malloc pages
NKMEMPAGES_BYTES=$((NKMEMPAGES * PAGE_SIZE))      # multiply by PAGE_SIZE to get total bytes
NKMEMPAGES_MB=$((NKMEMPAGES_BYTES / 1024 / 1024)) # divide by 1024 twice to get MB
echo "${NKMEMPAGES_MB}MB"                         # print the result

On amd64 with the default settings (32768 pages of 4096 bytes each), this gives us 128MB, the same limit mentioned in the mail.
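To see why the mail calls this "impossible", it helps to put the numbers side by side. The following is only a rough sketch: the helper name pages_to_kb is made up for illustration, the 32768-page default is inferred from the mail's remark that 131072 is "4 times the default size", and the three sizes are the ones quoted in the mail (other malloc types are ignored).

```shell
#!/bin/sh
# Hypothetical helper: convert a page count to kilobytes.
pages_to_kb() {
    # $1 = number of pages, $2 = page size in bytes
    echo $(( $1 * $2 / 1024 ))
}

# Assumption: 32768 pages is the default implied by the mail.
KMEM_KB=$(pages_to_kb 32768 4096)

# Sizes quoted in the mail, in KB: devbuf + ACPI + rtable.
NEEDED_KB=$(( 26540 + 63493 + 50816 ))

echo "kmem_map size: ${KMEM_KB}K"
echo "needed:        ${NEEDED_KB}K"
if [ "$NEEDED_KB" -gt "$KMEM_KB" ]; then
    echo "does not fit: short by $(( NEEDED_KB - KMEM_KB ))K"
fi
```

Even counting only these three consumers, the total is already larger than the whole malloc area.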

Checking virtual memory statistics with vmstat

The manual page for vmstat(8) states the following

vmstat reports certain kernel statistics kept about process, virtual memory, disk, trap, and CPU activity. The default behavior is to print a one-line summary of these statistics. The -c and -w flags may be used to continually report summaries

Among its options, the one we will be focusing on is -m, which the manual describes as follows:

-m Report on the usage of kernel dynamic memory listed first by size of allocation and then by type of usage.

When running vmstat -m we get presented with 4 different views

The rtable size quoted in the original message is taken from the vmstat output [2] of a kernel with single-processor support, under the section Memory statistics by type. The sizes for ACPI and devbuf come from the vmstat output [3] of a kernel with symmetric multiprocessor (SMP) support, under the same section.
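Picking a single type out of that section can be scripted. The sketch below is an assumption-heavy illustration: the function name malloc_type_usage is made up, it assumes the type name contains no spaces and that the first four columns of each row are the type, its in-use count, its memory use and its high-water mark, and the sample text is invented for the example (only the two memory-use figures come from the mail).

```shell
#!/bin/sh
# Print the first four columns for one malloc type from "vmstat -m"-style
# text on stdin. Only rows inside the "Memory statistics by type" section
# are considered; the section is assumed to end at the first blank line.
malloc_type_usage() {
    awk -v want="$1" '
        /^Memory statistics by type/ { in_section = 1; next }
        in_section && NF == 0        { in_section = 0 }
        in_section && $1 == want     { print $1, $2, $3, $4 }
    '
}

# Illustrative sample only, not real output.
sample='Memory statistics by type                           Type  Kern
          Type InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
        devbuf  9571 26540K  26541K 78643K    10643     0     0 16,32,64
        rtable   210 50816K  50816K 78643K      512     0     0 16,32,64

Memory Totals:  In Use    Free    Requests
'
printf '%s\n' "$sample" | malloc_type_usage devbuf
```

On a live system the same function would be fed from the real command, for example vmstat -m | malloc_type_usage devbuf.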

The columns of interest for deducing these numbers are Type, InUse, MemUse and HighUse which correspond to the following

NKMEMPAGES and NKMEMPAGES_MAX

Size of kernel malloc area in PAGE_SIZE-sized logical pages. This area is covered by the kernel submap kmem_map. The kernel attempts to auto-size this map based on the amount of physical memory in the system. Platform-specific code may place bounds on this computed size, which may be viewed with the sysctl(8) variable vm.nkmempages.

See /usr/include/machine/param.h for the default upper bound. The related option NKMEMPAGES_MAX allows the bounds to be overridden in the kernel configuration file in the event the computed value is insufficient resulting in an out of space in kmem_map panic.
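When choosing a value for NKMEMPAGES it can be handy to go in the other direction, from a desired size in MB to a page count. This is a minimal sketch, assuming a 4096-byte page size as on amd64; the helper name mb_to_pages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pages needed for a malloc area of a given size in MB.
mb_to_pages() {
    # $1 = desired size in MB, $2 = page size in bytes
    echo $(( $1 * 1024 * 1024 / $2 ))
}

mb_to_pages 128 4096   # the 128MB area discussed earlier
mb_to_pages 512 4096   # four times that size
```

The second call yields 131072, the value suggested in the mail.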

NMBCLUSTERS

The following is taken from a comment by Henning Brauer 🙇 on undeadly.org [5]:

Network data is traversing through the system in so-called mbufs, memory buffers of a fixed size - 256 bytes if memory serves me right. Lots of them linked together (linked list) represent a packet then.

As most stuff the kernel needs to do with network data only needs access to the header data and this is completely withing the first mbuf usually, that is very efficient.

HOWEVER, the payload may be bigger, and it would be very inefficient to split it into 265 bytes chunks. That’s why mbuf clusters exist, which are bigger - 2048 bytes if memory serves me right, and the mbufs only contain a reference to the mbuf cluster then.

This memory space is only needed as long as the network data is traveling through the system. Once it is send out on some NIC or written to some socket where some deamon is listening, or dropped for whatever reason, the mbufs and mbuf clusters occupied by the packet are freed.

If they are not freed then we have a mbuf leak, thus, the amount of NMBCLUSTERS you need is usually very small. I have exactly 3 machines out of 60 or so that need increased NMBCLUSTERS

oh, and the mbuf(9) manpage is a very good read on that.
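To get a feel for why clusters exist, here is a small back-of-the-envelope sketch. It takes the sizes from the comment at face value (256 bytes per mbuf, 2048 bytes per cluster), ignores the header overhead inside each mbuf, and uses a 1500-byte payload purely as an example; the helper name buffers_needed is made up.

```shell
#!/bin/sh
# Hypothetical helper: how many buffers of a given size a payload needs
# (ceiling division). Per-mbuf header overhead is ignored.
buffers_needed() {
    # $1 = payload bytes, $2 = buffer size in bytes
    echo $(( ($1 + $2 - 1) / $2 ))
}

buffers_needed 1500 256    # chained plain mbufs
buffers_needed 1500 2048   # a single mbuf cluster
```

Under these assumptions the payload needs six chained mbufs but fits in one cluster.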

References