12 Oct 2022

How to improve the performance of Oracle E-Business Suite release 12.2 on a shared file system when using NFS mount points

A customer who had recently upgraded from Oracle E-Business Suite release 12.1 to 12.2 suddenly began to experience significant performance degradation during regular maintenance routines, such as stopping and starting the application, cloning, and patching.

The Claremont team has upgraded and supported a significant number of Oracle EBS 12.2 systems across multiple platforms and servers, so we knew from experience what the runtime of these jobs should be and realised that something was not quite right.

 

The customer's E-Business Suite 12.2 architecture

To set the scene, the customer’s architecture was such that each Oracle E-Business Suite environment had its own dedicated slice of storage presented to a database server. This storage was then shared over a Network File System (NFS) to a pair of load-balanced application servers.
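As an illustration only (the paths and host names here are placeholders rather than the customer's actual configuration), such a setup typically amounts to the database server exporting the application tier file system and each application server mounting it over NFS:

/u01/EBSapps  appsnode1(rw,sync,no_root_squash)  appsnode2(rw,sync,no_root_squash) (an /etc/exports entry on the database server)

dbserver:/u01/EBSapps  /u01/EBSapps  nfs  &lt;mount options&gt;  0 0 (the corresponding /etc/fstab entry on each application server – the mount options turn out to matter a great deal, as we will see)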

Analysing the performance issues

While routines such as pre-clone, rapid clone, or application bounces were running, we examined the health of the servers themselves. We detected no obvious capacity problems: CPU and memory were always well within tolerance, and the Linux machines were never in a stressed state.

We also found that the performance of the online application was never impacted when these jobs were running, and users reported no degradation of experience: Forms and OAF screens responded well, and concurrent job runtimes were as expected.

The Oracle 19c database performed well, and the various diagnostic tools and reports showed no red flags.

The problem appeared to be limited to the application servers, and in particular to any routine initiated by the WebLogic Admin Server, which seemed to be struggling with I/O-related tasks.

Digging deeper into performance degradation

The next step was to run some comparative I/O performance benchmarks between:

  • The customer’s application server with NFS attached storage vs the customer’s database server with direct attached storage.

  • The customer’s database server with direct attached storage vs one of our own servers in the Claremont Cloud with direct attached storage.

  • The customer’s application server with NFS attached storage vs one of our own servers in the Claremont Cloud with direct attached storage.

  • The customer’s application server with NFS attached storage vs one of our own servers in the Claremont Cloud with NFS attached storage.

What we were trying to isolate here was where the performance difference came from: was the customer's underlying storage subsystem inherently slower than expected, or was presenting that storage over NFS the issue? And if the NFS storage was the problem, how did it compare with another site with the same setup?

Several tools were used in this phase of the analysis – dd, ioping, iostat and iozone – for both read and write performance tests.

For example:

time sh -c "dd if=/dev/zero of=ddfile bs=8k count=2000000 && sync" (to create a file of roughly 16 GB, sync it to disk, and report how long the write takes)

time dd if=ddfile of=/dev/null bs=8k (to report how long it takes to read the file)

/opt/iozone/bin/iozone -I -l 1 -u 1 -r 8k -s 10M -F <NFS_mount>/iozonetmpfile (to get an iozone analysis of different aspects of the performance of the NFS mount)
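ioping and iostat were used for similar spot checks; illustrative invocations (rather than the exact commands run on the day) would be:

ioping -c 10 <NFS_mount> (to measure the latency of ten individual I/O requests against the mount)

iostat -x 5 3 (to report extended device utilisation statistics, three samples at five-second intervals)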

The same set of tests was run on each server in the comparison set, and the results were then recorded, collated and analysed.

Identifying the performance bottleneck in Oracle E-Business Suite release 12.2

The I/O performance of the customer's database server compared pretty well with the direct-attached storage machine in the Claremont Cloud:

  • Write throughput: 657 MB/s (customer) vs 634 MB/s (Claremont)

  • Read throughput: 2.4 GB/s (customer) vs 2.6 GB/s (Claremont)

 This suggested that the problem was not with the customer’s storage subsystem itself.

Performance was seen to drop off when the tests were run on the servers with NFS attached storage, but the drop-off was much more significant for the customer's server than for the Claremont one, most markedly for read performance:

  • Write throughput: 106 MB/s (customer app) vs
      • 657 MB/s (customer database) – around 6.2x slower
      • 433 MB/s (Claremont NFS) – around 4.1x slower

  • Read throughput: 34 MB/s (customer app) vs
      • 2662 MB/s (customer database) – around 78x slower
      • 434 MB/s (Claremont NFS) – around 12.8x slower

These headline figures were backed up by a drill-down analysis of the individual tests, all of which showed a huge drop in write and (especially) read performance for the customer's application servers with NFS attached storage that we just didn't see with our servers in the Claremont Cloud.

We had now been able to isolate the problem and quantify the impact with actual vs expected performance.

NFS Mount Point options

The next step of the investigation was to determine why the customer’s system experienced such a significant drop-off in performance on NFS attached storage, compared to the equivalent system in the Claremont Cloud which did not.

One obvious difference straight away was the mount point options used on the application servers.

On the Claremont server, we were using NFS mount options:

rw,nointr,bg,hard,timeo=600,wsize=65536,rsize=65536 0 0

But the customer was using:

rw,hard,intr,bg,timeo=600,rsize=32768,wsize=32768,nfsvers=3,tcp,nolock,acregmin=0,acregmax=0

What do all these mean?

In brief: rsize and wsize set the maximum number of bytes transferred in each NFS read and write request, so larger values mean fewer round trips for bulk I/O; acregmin and acregmax control how long the client caches file attributes before revalidating them with the server, and setting both to 0 effectively disables attribute caching, forcing a metadata round trip on every access; hard, bg and timeo govern mount and retry behaviour; and nolock disables NFS file locking. The different values for rsize/wsize therefore looked significant, as did the settings for acregmin and acregmax.
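It is also worth noting that the options actually in effect on a running client can differ from what is written in /etc/fstab, because the kernel fills in defaults for anything not specified. To see the live values on an application server, standard Linux tooling is enough:

nfsstat -m (to list each NFS mount together with the full set of options currently in force)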

Oracle recommendations for sharing the application tier file system in Oracle E-Business Suite release 12.2

My Oracle Support Note Sharing The Application Tier File System in Oracle E-Business Suite Release 12.2 (Doc ID 1375769.1) has a section right at the end about recommended mount options for NFS for Oracle E-Business Suite 12.2.

Our customer was using NFSv3 on Linux, so the recommended mount options for that combination were the ones of interest to us.

These matched the setup we had in the Claremont Cloud, but not the customer's system.

Furthermore, the comment history of this note threw up some red flags for the settings in use on the customer’s servers.

It looked very much as though the NFS settings had followed Oracle's recommended best practice when the customer's system was on E-Business Suite 12.1, but that Oracle had subsequently revised its guidance for release 12.2.

Making the Changes

As the customer's architecture had a separate LUN per environment, it was easy to pick a development environment, shut down the application, change the NFS mount point options to align with Oracle's recommendation and Claremont's real-world example, and then see whether it made any difference.
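As a rough sketch of the change itself (paths and host names are again placeholders), on each application server this amounted to something like:

umount /u01/EBSapps (once the application and any other processes using the mount have been stopped)

dbserver:/u01/EBSapps  /u01/EBSapps  nfs  rw,nointr,bg,hard,timeo=600,rsize=65536,wsize=65536  0 0 (the revised /etc/fstab entry, using the recommended options)

mount /u01/EBSapps

nfsstat -m (to confirm the new options are in force before restarting the application)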

The Results

Following the NFS mount point change, overall read performance - which was where the most serious problems were before - improved considerably:

  • 35 MB/s before (482 seconds elapsed time), 138 MB/s after (118 seconds elapsed time)

More consequentially, all of the routines and tasks that had previously been taking too long saw significant improvements in runtime:

  • Apps preclone took 34 minutes to complete before the change, but 10 minutes afterwards.
  • Stopping the application took 15 minutes before the change, but 3 minutes afterwards.
  • The main stage of the apps clone took up to 10 hours before the change, but less than 2 hours afterwards.
  • Actions carried out by the WebLogic Admin Server (startup/shutdown, preclone, deleting managed servers, fs_clone) were all significantly quicker.

Following the initial trial on the first environment, the changes were rolled out to all of the other instances, including production.

The benefits of improving Oracle E-Business Suite release 12.2 performance

In business terms, these changes meant that:

  • Clones could be completed in under a day rather than the 2 days they were taking previously.
  • Less downtime was required for patching and releases because the shutdown/restart of the application was quicker.
  • Much less time was needed from the DBAs for mundane and routine tasks.
  • And ultimately another happy customer.

In the end, the changes that were required were trivial – but these small changes had a big impact. The Claremont team carried out a thorough analysis to identify the issue and ultimately resolve it.

 


Kevin Behan

Kevin Behan is a Managed Services Database Administrator (DBA) at Claremont, with over 20 years of experience in the world of Oracle.

Choosing the right Managed Services Provider

If you are looking for an Oracle Partner who can help you with the performance of your Oracle EBS system, who goes about it the right way, and who can back up the talk, then contact us.

And if you would like to find out more about NFS Mount Points or have a question, you can email us at info@claremont.co.uk or phone us on +44 (0) 1483 549004.

