Tuning the JVM

Caveat: These are recommendations only. Refer to Oracle and other JVM tuning documentation, and use performance testing to configure IG for your deployment. Only an incremental optimisation strategy will determine the optimal configuration for a given deployment and user load.

Before conducting your tuning exercise:

  1. Define performance targets.
  2. Determine which JVM version is available. More recent JVMs usually contain performance improvements of their own, especially in the area of garbage collection.
  3. Select a 64-bit JVM, which can address far more memory than a 32-bit JVM.
  4. Select a garbage collector according to your needs and limitations. Bear in mind that garbage collectors have differing algorithms to suit particular goals and reduce the need to "stop the world". What is right for one implementation may not be right for another. You should test and compare with realistic scenarios and load in a pre-production environment.
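
Steps 2 and 3 can be checked from the command line. The following is a minimal sketch, assuming a POSIX shell and the usual `java -version` banner format; the function names `jvm_major_version` and `jvm_is_64bit` are hypothetical helpers, not part of IG or the JDK.

```shell
# Hypothetical helper: print the major Java version from a banner line such as
#   java version "1.8.0_292"   or   openjdk version "11.0.2" 2019-01-15
jvm_major_version() {
  jv_ver=$(printf '%s\n' "$1" | sed -n 's/.* version "\([^"]*\)".*/\1/p')
  case "$jv_ver" in
    "")  return 1 ;;                        # no version string found
    1.*) jv_ver=${jv_ver#1.}                # legacy scheme: 1.8.0_292 -> 8
         printf '%s\n' "${jv_ver%%.*}" ;;
    *)   printf '%s\n' "${jv_ver%%[.+-]*}" ;;  # modern scheme: 11.0.2 -> 11
  esac
}

# Hypothetical helper: succeed if the banner text reports a 64-bit VM.
jvm_is_64bit() {
  printf '%s\n' "$1" | grep -q '64-Bit'
}

# Usage: only runs when a java binary is on the PATH.
if command -v java >/dev/null 2>&1; then
  banner=$(java -version 2>&1)
  jvm_major_version "$(printf '%s\n' "$banner" | grep ' version "' | head -n 1)"
  jvm_is_64bit "$banner" && echo "64-bit JVM"
fi
```

The major version matters later on, because the non-heap and GC logging options differ between Java 7, Java 8 and Java 9+.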

Memory

As a proxy, IG is generally a low memory consumer, proxying request and response data between the client and server. Enabling caching (notification, session and request) increases memory use, so memory settings should be revisited when caching is in use.

  • IG largely consumes YoungGen space so JVM memory settings should be focussed there.
  • IG is a low user of OldGen memory, which remains largely static after startup. Note that, if IG is proxying large resources then you may see more consumption here - as OldGen is used to house larger objects. Again, caching will likely increase the need for larger available OldGen space.
  • Set PermGen/Metaspace appropriately:

    • PermGen was replaced by Metaspace in Java 8. Metaspace stores class metadata in native memory.

    • Be aware that PermGen has a default maximum size (84 MB on a 64-bit JVM) while Metaspace is unlimited by default.

    • Note also that when the JVM resizes these spaces, a full GC must be done, which is expensive, so it is worth monitoring usage during performance testing.

    • Incremental testing with anticipated load in a pre-production environment should easily identify the maximum required non-heap memory:
      • Setting this too low will result in OutOfMemoryErrors. Setting it too high wastes memory.
      • Before Java 7, interned strings were also held in the PermGen space, adding a degree of variability that made determining the appropriate value difficult. This is not the case with later versions.

Memory Options

While configuring memory is a trial and error process, and best optimised incrementally, the following memory options are worth considering:

  • -server  ensures the JVM uses server-optimised configuration, compilation and execution. This is the default with a 64-bit JVM.
  • -XX:PermSize=<size>M and -XX:MaxPermSize=<size>M to configure PermGen space (Java 7).
  • -XX:MetaspaceSize=<size>M and -XX:MaxMetaspaceSize=<size>M to configure Metaspace (from Java 8).
  • -Xms<size>G  and -Xmx<size>G  to configure the initial and maximum heap space:
    • Oracle recommends setting these to the same value to avoid expensive heap resizing. However, G1GC's adaptive sizing (see below) means that fixing the heap size in this way, which limits vertical scaling, may no longer be advisable. If using G1GC, consider setting distinct initial and maximum values.
    • Incrementally test to optimise, with anticipated concurrent load in pre-production testing. For example, start with an initial size of 5 GB and a maximum of 10 GB, and observe whether the JVM ever needs to grow the heap beyond 5 GB. If it does, increase the initial size step by step until the JVM no longer needs to grow the heap. Use this value to determine your initial and maximum sizes, depending on the garbage collector selected.
    • As a starting point, IG has been shown to operate well in performance testing with a heap size of 5 GB: with a heap of this size, the JVM never needed to grow the heap, even with more memory available.
  • -XX:NewSize=<size>G and -XX:MaxNewSize=<size>G to configure the initial and maximum YoungGen space:
    • This configuration is important to IG as a large consumer of Eden space.
    • Incrementally test to optimise, with anticipated concurrent load in a pre-production environment. The maximum should be within the overall maximum heap size (-Xmx), leaving room for OldGen.
  • -XX:+UseStringDeduplication to deduplicate identical String contents and conserve memory (Java 8u20+; in Java 8 this requires G1GC).
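
The heap and YoungGen guidance above can be sketched as a pair of shell functions. This is an illustration under stated assumptions, not an IG utility: `heap_opts` and `young_gen_opts` are hypothetical names, sizes are whole gigabytes, and for collectors other than G1GC the sketch simply pins both heap values to the maximum.

```shell
# Hypothetical helper: heap_opts <collector> <initial_gb> <max_gb>
# With G1GC, keep distinct initial and maximum values; otherwise set them equal.
heap_opts() {
  ho_collector=$1
  ho_initial=$2
  ho_max=$3
  if [ "$ho_initial" -gt "$ho_max" ]; then
    echo "initial heap (${ho_initial}G) exceeds maximum (${ho_max}G)" >&2
    return 1
  fi
  case "$ho_collector" in
    g1) printf '%s\n' "-Xms${ho_initial}G -Xmx${ho_max}G" ;;
    *)  printf '%s\n' "-Xms${ho_max}G -Xmx${ho_max}G" ;;   # assumption: pin to max
  esac
}

# Hypothetical helper: young_gen_opts <initial_gb> <max_gb> <heap_max_gb>
# Rejects a YoungGen maximum that leaves no room in the heap for OldGen.
young_gen_opts() {
  yg_initial=$1
  yg_max=$2
  yg_heap_max=$3
  if [ "$yg_max" -ge "$yg_heap_max" ]; then
    echo "MaxNewSize (${yg_max}G) must be below -Xmx (${yg_heap_max}G)" >&2
    return 1
  fi
  printf '%s\n' "-XX:NewSize=${yg_initial}G -XX:MaxNewSize=${yg_max}G"
}

# Usage: starting values from the text (5 GB initial, 10 GB maximum heap).
heap_opts g1 5 10
young_gen_opts 2 4 10
```

The values passed in are starting points only; incremental testing under realistic load should drive the final figures.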

The following provides an example set of JVM options. Use either the Java 7 or the Java 8+ non-heap options, not both:

Memory sizing
# memory: heap size - if using G1GC then set individual initial and max values
-Xms5G -Xmx5G
# memory: younggen space size
-XX:NewSize=2G -XX:MaxNewSize=4G
# memory: non-heap sizing - java 7 only
-XX:PermSize=100M -XX:MaxPermSize=200M
# memory: non-heap sizing - java 8+ only
-XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=200M
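
Only one of the two non-heap option pairs applies to a given JVM. As a hedged sketch, the choice could be made from the major Java version; `non_heap_opts` is a hypothetical helper name and sizes are given in megabytes.

```shell
# Hypothetical helper: non_heap_opts <java_major_version> <initial_mb> <max_mb>
# Java 8+ uses Metaspace; earlier versions use PermGen.
non_heap_opts() {
  nh_major=$1
  nh_initial=$2
  nh_max=$3
  if [ "$nh_major" -ge 8 ]; then
    printf '%s\n' "-XX:MetaspaceSize=${nh_initial}M -XX:MaxMetaspaceSize=${nh_max}M"
  else
    printf '%s\n' "-XX:PermSize=${nh_initial}M -XX:MaxPermSize=${nh_max}M"
  fi
}

# Usage: the example sizes from the text, for Java 7 and for Java 11.
non_heap_opts 7 100 200
non_heap_opts 11 100 200
```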

Garbage collection (GC)

Selecting the right garbage collector and tuning collection can reduce expensive "stop the world" pauses. As IG is largely a consumer of YoungGen memory, there is a lot of scope for tuning to avoid expensive major collections (OldGen space consumption).

As mentioned, it is generally advisable to select from the most recent stable garbage collectors available for the given JVM version. That is not to say that the latest collector is necessarily the best, as each has its own goals and algorithms supporting those goals. For example, Parallel GC aims to increase throughput whereas G1GC aims to reduce latency. It is therefore advisable to test and compare.
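
Comparing collectors requires GC logs from each test run. The sketch below picks the GC logging options for a given Java version: Java 9+ uses unified logging (-Xlog), while Java 8 and earlier use -Xloggc with the PrintGC flags. The function name `gc_log_opts` and the log file name are assumptions for illustration.

```shell
# Hypothetical helper: gc_log_opts <java_major_version> <logfile>
gc_log_opts() {
  gl_major=$1
  gl_file=$2
  if [ "$gl_major" -ge 9 ]; then
    # Unified logging: all gc tags, written to a file with time decorations.
    printf '%s\n' "-Xlog:gc*:file=${gl_file}:time,uptime,level,tags"
  else
    printf '%s\n' "-Xloggc:${gl_file} -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
  fi
}

# Usage: print the options for Java 8 and Java 11.
# Keep the -Xlog option quoted when passing it on, since it contains a '*'.
gc_log_opts 8 gc.log
gc_log_opts 11 gc.log
```

Running the same load test once per collector with these options gives pause times and collection counts that can be compared directly.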

Note that the Concurrent Mark Sweep (CMS) collector has been omitted, having been deprecated in Java 9.

Parallel

  • Default HotSpot collector in Java 8 in resource-rich environments (multiprocessor, ample memory).
  • Provides best overall performance in a multiprocessor environment.
  • Multiple threads for managing heap-space speeds up collection process.
  • Freezes application threads during GC.
  • -XX:MaxGCPauseMillis sets the target maximum GC pause time. This is purely a goal and not guaranteed. It does, however, allow a trade-off between throughput (longer pauses) and latency (shorter pauses). Test values between 500 ms and 2000 ms.
  • -XX:GCPauseIntervalMillis sets the target interval between GC pauses. This is purely a goal and not guaranteed. HotSpot documents this flag for G1GC, so verify that it has an effect with Parallel GC before relying on it. Test values between 500 ms and 2000 ms.
  • -XX:GCTimeRatio=<n> sets the target ratio of time in GC to application time, as 1/(1+n). The Parallel GC default of 99 gives a goal of 1% of time in GC. The goal should not be configured to be greater than 5% (that is, n should not be below 19).
  • Adaptive sizing of each generation can be controlled using the parameters -XX:YoungGenerationSizeIncrement, -XX:TenuredGenerationSizeIncrement and -XX:AdaptiveSizeDecrementScaleFactor.
Parallel GC configuration
# gc: use Parallel GC
-XX:+UseParallelGC
# number of GC threads
-XX:ParallelGCThreads=<n>
# target max time in millis to pause (a goal, not a guarantee)
-XX:MaxGCPauseMillis=<ms>
# throughput goal: GC should take at most 1/(1+n) of total time
-XX:GCTimeRatio=<n>
# percentage to increase young generation on increment - default is 20
-XX:YoungGenerationSizeIncrement=<%>
# percentage to increase old generation on increment - default is 20
-XX:TenuredGenerationSizeIncrement=<%>
# divisor D applied to the increase to get the decrease - default is 4, so a 20% increase gives a 5% decrease
-XX:AdaptiveSizeDecrementScaleFactor=<D>
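
Because -XX:GCTimeRatio takes n rather than a percentage, it is easy to set the wrong value. The sketch below converts a target GC share, in whole percent, into the option using n = ceil(100 / percent) - 1, and enforces the 1% to 5% range suggested above. The function name `gc_time_ratio_opt` is hypothetical.

```shell
# Hypothetical helper: gc_time_ratio_opt <target_gc_percent>
# GC share of total time is 1/(1+n), so n = ceil(100 / percent) - 1.
# Rounding up keeps the resulting goal at or below the requested percentage.
gc_time_ratio_opt() {
  gt_pct=$1
  if [ "$gt_pct" -lt 1 ] || [ "$gt_pct" -gt 5 ]; then
    echo "target GC percentage must be between 1 and 5" >&2
    return 1
  fi
  gt_n=$(( (100 + gt_pct - 1) / gt_pct - 1 ))
  printf '%s\n' "-XX:GCTimeRatio=${gt_n}"
}

# Usage: the Parallel GC default (1%) and the suggested upper bound (5%).
gc_time_ratio_opt 1
gc_time_ratio_opt 5
```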

Garbage-First (G1GC) 

  • Available since Java 7u4 and the default HotSpot collector since Java 9 (in resource-rich environments).
  • A "Mostly Concurrent Collector", meaning a parallel, concurrent, and incrementally compacting low-pause collector.
  • Replaces the Concurrent Mark Sweep (CMS) collector (deprecated in Java 9).
  • Concurrent collector, which conducts expensive operations concurrently with the application threads.
  • Designed for multiprocessor environments with large available memory.
  • Partitions the heap into equal-sized regions, each with a contiguous range of virtual memory.
  • Minimises collection pauses in multiprocessor, memory-rich environments.
  • Uses an internal adaptive optimization algorithm to manage available heap space, meaning it adapts to fast memory growth and spikes in load but also scales efficiently with lower load. With G1GC, this adaptive sizing takes over from -Xms increments.
G1GC configuration
# gc: use G1GC
-XX:+UseG1GC
# number of GC threads
-XX:ParallelGCThreads=<n>
# target max time in millis to pause (a goal, not a guarantee)
-XX:MaxGCPauseMillis=<ms>
# throughput goal: GC should take at most 1/(1+n) of total time
-XX:GCTimeRatio=<n>
# make maximum use of available physical memory - but be cautious of wasteful use/reduced vertical scalability
-XX:+AggressiveHeap

References

  1. DZone: Choosing the Right GC
  2. Oracle Garbage Collection Tuning Guide
  3. Oracle G1GC Tuning Guide
  4. JVM Tuning with G1GC by @marknienaber on medium.com