What are huge pages anyway?

To answer this question we can take the dictionary definition (well, Wikipedia!) for a memory page:

“A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory, described by a single entry in the page table. It is the smallest unit of data for memory allocation performed by the operating system on behalf of a program, and for transfers between the main memory and any other auxiliary store, such as a hard disk drive.”

So what is a “huge page”? Well, it’s a page… that is huge.

By default, when a database is allocated memory in which to store its SGA (System Global Area), Oracle Linux will chop that memory (or RAM) up into a bunch of 4KB pages.  So if we think of a pretty ordinary database with a 4GB SGA, we’re talking about 1,048,576 individual pages. The problem is that the Linux kernel has to do some legwork in managing and maintaining each page as memory is written and read.

In addition to this, Linux maintains a “page table”, which is used much like a database index to rapidly locate the required memory pages.  The more pages we have, the more memory this page table needs in order to hold an entry for each page, and the longer it takes to sift through the entries to find the one we want.

When huge pages are configured, the page size jumps from 4KB up to 2MB (it can be configured up to 1GB if the hardware is capable, but let’s take 2MB here as an example).  This means that our 4GB SGA is now made up of only 2,048 individual pages: clearly less work for the Linux kernel to do!
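The arithmetic is easy to check in the shell. This is only a sketch: page_count is a hypothetical helper name (not part of any Oracle or Linux tooling) that divides a memory size by a page size, both given in KB, and rounds up.

```shell
#!/bin/bash
# page_count: hypothetical helper, for illustration only.
# Usage: page_count <memory_kb> <page_size_kb>
# Prints how many pages of the given size are needed, rounding up.
page_count() {
    local mem_kb=$1 page_kb=$2
    echo $(( (mem_kb + page_kb - 1) / page_kb ))
}

sga_kb=$(( 4 * 1024 * 1024 ))   # a 4GB SGA expressed in KB

echo "4KB pages: $(page_count "$sga_kb" 4)"
echo "2MB pages: $(page_count "$sga_kb" 2048)"
```

For the 4GB SGA this prints 1048576 for 4KB pages and 2048 for 2MB pages.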

In addition to this, the way Linux handles huge pages means that they are “pinned” in server memory and cannot be swapped out. Under load the database SGA will never be “aged out”, so any swapping activity on the server should have a minimal impact on the database itself.

Great! So how do I set it up?

Well, fortunately Oracle provide a handy script (search for hugepages_settings.sh) that can be used to work out what huge pages settings you need, so let’s start here:

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{ print $2 }'`

# Start from 1 pages to be on the safe side and guarantee 1 free HugePage
NUM_PG=1

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk '{ print $5 }' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

# Finish with results
case $KERN in
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6' | '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End
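To see what the loop in the script is doing, here is a small restatement of its arithmetic that takes the segment sizes as arguments instead of reading them from ipcs. It is a sketch for illustration only: recommend_pages is a hypothetical name, and shell integer arithmetic stands in for bc.

```shell
#!/bin/bash
# recommend_pages: hypothetical restatement of the script's loop.
# Usage: recommend_pages <hugepage_size_kb> <segment_bytes>...
recommend_pages() {
    local hpg_kb=$1 num_pg=1 seg min_pg
    shift
    for seg in "$@"; do
        # Whole huge pages that fit in this segment (integer division)
        min_pg=$(( seg / (hpg_kb * 1024) ))
        # Segments smaller than one huge page are skipped, as in the script
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# One 4GB segment plus one 1MB segment, with 2MB huge pages
recommend_pages 2048 4294967296 1048576
```

This prints 2050: the starting page, 2048 pages for the 4GB segment and one extra page for rounding. The 1MB segment is smaller than a single huge page, so it adds nothing.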

So, we take this script and run it against our system with all the Oracle services running:

[root@claremont ~]# ./hugepages_settings.sh
Recommended setting: vm.nr_hugepages = 64006

So we should do as it says and configure 64006 huge pages in order to hold the current memory requirements of the system in huge pages.  This is done by adding the line “vm.nr_hugepages=64006” to the /etc/sysctl.conf file and loading the new configuration by running “sysctl -p”.
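That step can be sketched as below, assuming root access and GNU sed. set_nr_hugepages is a hypothetical helper name; it updates an existing vm.nr_hugepages line rather than appending a duplicate, and takes the file path as an optional second argument so it can be tried against a scratch copy first.

```shell
#!/bin/bash
# set_nr_hugepages: hypothetical helper, for illustration only.
# Usage: set_nr_hugepages <pages> [config_file]
set_nr_hugepages() {
    local pages=$1 conf=${2:-/etc/sysctl.conf}
    if grep -q '^[[:space:]]*vm\.nr_hugepages[[:space:]]*=' "$conf" 2>/dev/null; then
        # An entry already exists: update it in place
        sed -i "s/^[[:space:]]*vm\.nr_hugepages[[:space:]]*=.*/vm.nr_hugepages=$pages/" "$conf"
    else
        # No entry yet: append one
        echo "vm.nr_hugepages=$pages" >> "$conf"
    fi
}

# As root on the real system:
#   set_nr_hugepages 64006
#   sysctl -p
```

On a server that has been up for a while the kernel may not find enough contiguous free memory to allocate every page, so it is worth checking HugePages_Total afterwards; a reboot is sometimes needed.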

At this point we will see that the huge pages have been configured ready for use but are not actually being used.  This is because we need to configure the memory limits on the server to allow for their use and also bounce the database so that it can detect that huge pages are there to be used.  So first we update /etc/security/limits.conf to add the following:

* soft memlock 131084288
* hard memlock 131084288

The value here is in KB. It can be set to any amount up to the maximum available memory on the server, but as a minimum it must be set to the amount of memory that your huge pages configuration will use, i.e. the number of huge pages multiplied by the huge page size: in this case 64006 × 2048KB = 131084288KB.  Next we bounce the database so that it can make use of the huge pages, and now we can verify that huge pages are actually in use (i.e. the number of free pages is lower than the total number of pages!) with:
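The minimum memlock figure is a single multiplication, sketched here with a hypothetical helper called min_memlock_kb:

```shell
#!/bin/bash
# min_memlock_kb: hypothetical helper, for illustration only.
# Usage: min_memlock_kb <nr_hugepages> <hugepage_size_kb>
# Prints the smallest memlock value (in KB) that covers the huge page pool.
min_memlock_kb() {
    echo $(( $1 * $2 ))
}

# On a live system the huge page size in KB can be read with:
#   awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo

min_memlock_kb 64006 2048
```

This prints 131084288, the value used in limits.conf above.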

[root@claremont ~]# cat /proc/meminfo | grep Huge
HugePages_Total: 64006
HugePages_Free: 1548
HugePages_Rsvd: 1538
HugePages_Surp: 0
Hugepagesize: 2048 kB
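As a rough sketch, the “in use” figure can be pulled out of that output with awk. hugepages_in_use is a hypothetical helper name; it reads /proc/meminfo-style text on standard input and prints total minus free.

```shell
#!/bin/bash
# hugepages_in_use: hypothetical helper, for illustration only.
# Reads /proc/meminfo-style text on stdin and prints Total minus Free.
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total_pg = $2 }
         /^HugePages_Free:/  { free_pg = $2 }
         END { print total_pg - free_pg }'
}

# Using the figures shown above
hugepages_in_use <<'EOF'
HugePages_Total: 64006
HugePages_Free: 1548
HugePages_Rsvd: 1538
HugePages_Surp: 0
Hugepagesize: 2048 kB
EOF
```

With the figures above this prints 62458. On a live system, run it as “hugepages_in_use < /proc/meminfo”. Note that HugePages_Rsvd counts pages that are promised to a process but not yet touched, and these are still included in HugePages_Free, so here only 10 of the 1548 free pages are genuinely spare.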

Is that it?

Well, not entirely… there are some other things to consider:

  • If there are insufficient huge pages to house the entire Oracle SGA, Oracle will not use ANY huge pages at all.  This can be guarded against by setting the database parameter “use_large_pages=only”: with this parameter set, the database will fail to start up if it is unable to allocate huge pages.
  • You cannot use AMM (i.e. MEMORY_TARGET) with huge pages; you are limited to the use of SGA_TARGET.
  • Linux 6 variants use “Transparent Huge Pages”, a mechanism that allocates huge pages at runtime rather than pre-allocating (and hence reserving) them at system boot. Oracle is unable to make good use of this and suffers performance issues, so transparent huge pages should be disabled in a Linux 6 environment (see MOS note 1557478.1 for more details).
  • If you modify the “memory landscape” on the server, meaning the SGA size or the total amount of memory available to the server (quite common in a virtual environment), you will need to recalculate and reconfigure huge pages. In a recent scenario on a customer system, we reduced the memory allocation to a virtual machine but left huge pages at their existing setting. The server therefore reserved a large portion of its memory for huge pages, leaving very little for processes that do not use huge pages (e.g. application tiers), and this caused large amounts of swapping activity and severe performance issues.
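For the transparent huge pages point, the current mode can be read from sysfs. This is a sketch under assumptions: thp_mode is a hypothetical helper name, and the two paths are the usual locations (Red Hat-derived 6.x kernels use the second one), so check which exists on your system and follow the MOS note for the supported way to disable it.

```shell
#!/bin/bash
# thp_mode: hypothetical helper, for illustration only.
# Usage: thp_mode "<contents of the transparent huge pages 'enabled' file>"
# The active mode is the word in square brackets, e.g. "[always] madvise never".
thp_mode() {
    echo "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Assumed locations; only files that exist and are readable are reported
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/redhat_transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
        echo "$f: $(thp_mode "$(cat "$f")")"
    fi
done
```

A result of “never” means transparent huge pages are disabled; “always” or “madvise” means they are still active.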

Mike Sowerbutts

Managing Consultant

Mike is responsible for Claremont’s DBA delivery function across both Consultancy and Managed Services practices. In addition, he is responsible for the maintenance and enhancement of Claremont’s internal IT infrastructure.
