JRuby on Windows: Using the G1 Garbage Collector Helps

When running with JRuby, you have a few options for selecting the garbage collector. This post shares one observation about how memory consumption varies with the garbage collector and the Java version.

Much of the post is based on the conversation in this GitHub issue and on comments by Charles Oliver Nutter.

Background

We compared Apache POI (the Java API for Microsoft Documents) and the xlsxtream gem for generating a simple XLSX file (1M rows x 20 cells). Xlsxtream is a streaming writer for XLSX files, written in pure Ruby, so it runs fine on JRuby.

It appeared that Xlsxtream took a lot more memory on JRuby than on CRuby. Observing memory casually from Windows Task Manager, I noticed that:

  • JRuby consumes 1.6GB – 2.0GB
  • CRuby 3.0 consumes 15MB – 20MB

I’m used to JRuby taking 8x – 10x more memory than CRuby, but this case was closer to 100x. This was with no start-up parameters in place, no limit on the max heap, and the numbers read straight from Task Manager. For the source code, see my rb-xlsx-converter-comparison repository on GitHub; read the README and look specifically at the code in use_xlsxtream.rb.

The value I was looking at in Task Manager is “Memory (active private working set)”, which Windows describes as “The amount of physical memory in use by the process that can’t be used by other processes (excludes suspended UWP processes)”.

Looking towards the GC

Charles ran the code and commented that the overall process size (Real memory) is under 500MB, but the “virtual” memory size is nearly 10GB. This number is what the JVM has requested be available for its maximum heap size, should it need to grow that large.
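To see the gap between the heap ceiling and what is actually in use, a script can ask the JVM directly. The following is a minimal sketch, not code from the repository: `heap_stats_mb` is a hypothetical helper name, and it relies only on the standard `java.lang.Runtime` methods `maxMemory`, `totalMemory`, and `freeMemory`. On CRuby it returns an empty hash, since there is no JVM heap to report.

```ruby
# Sketch: report JVM heap figures from inside a JRuby process.
# heap_stats_mb is a hypothetical helper, not part of JRuby or xlsxtream.
def heap_stats_mb
  return {} unless RUBY_PLATFORM == 'java' # plain CRuby has no JVM heap

  require 'java'
  runtime = java.lang.Runtime.getRuntime
  mb = 1024.0 * 1024.0
  {
    # Ceiling the heap is allowed to grow to (the -Xmx value or its default)
    max_heap: (runtime.maxMemory / mb).round(1),
    # Heap the JVM currently holds
    committed: (runtime.totalMemory / mb).round(1),
    # Portion of the committed heap occupied by objects
    used: ((runtime.totalMemory - runtime.freeMemory) / mb).round(1)
  }
end

stats = heap_stats_mb
if stats.empty?
  puts 'Not running on JRuby; no JVM heap to report.'
else
  stats.each { |name, value| puts "#{name}: #{value} MB" }
end
```

These figures cover the Java heap only, so they will not match the Task Manager value exactly, but they show how far the process is from its maximum.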

He then switched to the parallel GC and saw considerably wilder memory use with high heap spikes approaching 2GB. He then commented that the G1 GC seems to have much better memory-utilization characteristics, and never exceeded a 400MB total heap while usually using less than half of it.

He also mentioned that Java 8 uses the Parallel GC by default, while Java 11+ uses G1 (G1 has been the default since Java 9).
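If you are not sure which collector your JVM picked, you can ask it from inside JRuby. This is a rough sketch using the standard `java.lang.management.ManagementFactory` API; the helper names `active_collectors` and `g1_collector?` are mine. It assumes that G1's collector beans have names starting with "G1" (for example "G1 Young Generation"), which is what HotSpot reports as far as I know.

```ruby
# Sketch: list the garbage collectors the running JVM reports.
# active_collectors and g1_collector? are hypothetical helper names.
def active_collectors
  return [] unless RUBY_PLATFORM == 'java' # only meaningful on JRuby

  require 'java'
  beans = java.lang.management.ManagementFactory.getGarbageCollectorMXBeans
  beans.map { |bean| bean.getName.to_s }
end

# Assumption: HotSpot names its G1 beans "G1 ...", while the Parallel GC
# reports names such as "PS Scavenge" and "PS MarkSweep".
def g1_collector?(names)
  names.any? { |name| name.start_with?('G1') }
end

names = active_collectors
if names.empty?
  puts 'No JVM collectors found (not running on JRuby?).'
else
  puts "Collectors: #{names.join(', ')}"
  puts "G1 in use: #{g1_collector?(names)}"
end
```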

Based on his recommendation, I did both:

  • Try using a newer release of Java – I used Java 11 and Java 16
  • Switch to the G1 GC by passing -J-XX:+UseG1GC to the JVM via JRuby command line
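To repeat the comparison on your own machine, it can help to run the same script once per collector. The driver below is only a sketch: `GC_FLAGS` and `jruby_commands` are names I made up, and it assumes `jruby` is on your PATH. The `-J` prefix is how JRuby hands a flag through to the JVM; `-XX:+UseParallelGC` and `-XX:+UseG1GC` are the standard HotSpot switches.

```ruby
# Sketch: run a script under each collector so the runs can be compared.
# GC_FLAGS and jruby_commands are hypothetical names for this example.
GC_FLAGS = {
  'default'  => [],                        # whatever the JVM picks on its own
  'parallel' => ['-J-XX:+UseParallelGC'],
  'g1'       => ['-J-XX:+UseG1GC']
}.freeze

# Build one argv array per collector for the given script.
def jruby_commands(script)
  GC_FLAGS.transform_values { |flags| ['jruby', *flags, script] }
end

# Runs only when script names are passed on the command line.
ARGV.each do |script|
  jruby_commands(script).each do |label, argv|
    puts "== #{label}: #{argv.join(' ')}"
    system(*argv) # watch Task Manager while each run is in progress
  end
end
```

Saved under a name of your choice, for example compare_gc.rb, it would be started with ruby compare_gc.rb use_xlsxtream.rb.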

The results are in the picture below. Observing the same metric on Windows (the second-to-last column), memory consumption is considerably lower than on Java 8 with the default GC (the first set of numbers), with the exact figure depending on the Java version and the GC chosen.

Conclusion

If you must use Java 8, take a look at the memory consumption and see if the G1 GC helps in your case. It’s easy to do: just start jruby with the flag, as in jruby -J-XX:+UseG1GC use_xlsxtream.rb. Of course, if you’re using a newer version of Java, it will likely use G1 as the default anyway.

I hope you find the post useful. If you have any other information to add, please let me know by commenting below and I will try to update the post.
