Remember the article about memory management? Ruby heap, Garbage collector, malloc and all this stuff? Guess what, it’s back, but with actionable techniques to upgrade your developer game if you do not make use of them already.
This article is part of a small series about optimising Ruby performance through memory management tips.
Let’s have a little ‘memory’ refreshment first.
Ruby memory management
In Ruby, almost everything is an object, so almost everything is stored on the Ruby heap, which is part of the system heap. Once an object is no longer in use, the garbage collector (GC) collects it so that the Ruby VM can reuse the space.
Storing a new object on the Ruby heap has a tiny performance cost, but a tiny cost paid a thousand, a million or a billion times adds up. Avoiding unnecessary memory allocation is therefore a good practice for better-performing programs.
Now that I have covered the ‘why’, let’s discuss the ‘how’.
true, false and nil
true, false and nil are highly optimised objects in Ruby, often called “immediate objects”: their value is encoded directly in the reference rather than stored in a heap slot, so they do not require any additional memory allocation.
To be really precise, true, false and nil are the singleton instances of the TrueClass, FalseClass and NilClass classes (a singleton instance is an object that is instantiated only once).
So when you reference true, false or nil in your code, no new object is created on the heap; the reference points to the corresponding singleton instance. This results in less memory consumption and better performance for these values.
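The singleton behaviour described above can be checked directly from Ruby. This is a minimal sketch, not taken from the original benchmark files; the instantiable? helper is a hypothetical name introduced only for this illustration.

```ruby
# Every reference to true or nil resolves to the very same object.
# equal? compares object identity, not just value equality.
puts true.equal?(true)
puts nil.equal?(nil)

# Hypothetical helper: reports whether a class accepts a call to .new.
def instantiable?(klass)
  klass.new
  true
rescue NoMethodError
  false
end

# TrueClass, FalseClass and NilClass do not expose .new,
# so a second instance can never be created.
[TrueClass, FalseClass, NilClass].each do |klass|
  puts "#{klass} instantiable? #{instantiable?(klass)}"
end

# By contrast, each Hash is a distinct object.
puts Hash.new.equal?(Hash.new)
```

The identity checks succeed for true and nil, while two separately created hashes are never the same object.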
Let me prove it to you using the Ruby core ObjectSpace module extended with the objspace library (which exposes internal statistics about object and memory management):
    require 'objspace'

    puts "True: #{ObjectSpace.memsize_of(true)} bytes"
    puts "False: #{ObjectSpace.memsize_of(false)} bytes"
    puts "Nil: #{ObjectSpace.memsize_of(nil)} bytes"
    puts "Empty hash: #{ObjectSpace.memsize_of({})} bytes"
Resulting in:
    ➜ (tests) ruby memory_true_false_nil.rb
    True: 0 bytes
    False: 0 bytes
    Nil: 0 bytes
    Empty hash: 40 bytes
See? true, false and nil do not consume any heap memory, whereas an empty hash claims some because it’s an all-new object.
Now you probably want an actionable technique to put this to use. One I like concerns default values for optional parameters.
Putting it to use: default values for optional parameters
    require 'benchmark/ips'

    data = []

    def process_data_v1(data, options = {})
      if options[:verbose]
        puts "Verbose mode enabled"
      end
    end

    def process_data_v2(data, options = nil)
      if options
        puts "Verbose mode enabled" if options[:verbose]
      end
    end

    Benchmark.ips do |x|
      x.report("process_data_v1") { process_data_v1(data) }
      x.report("process_data_v2") { process_data_v2(data) }
    end
I would say that process_data_v1 is the easier of the two to read, but is it the most optimised? You guessed it, it’s not, and by a landslide:
    ➜ (tests) ruby performance_true_false_nil.rb
    ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin22]
    Warming up --------------------------------------
         process_data_v1     1.464M i/100ms
         process_data_v2     2.089M i/100ms
    Calculating -------------------------------------
         process_data_v1     14.646M (± 1.3%) i/s -     73.223M in   5.000309s
         process_data_v2     21.169M (± 0.2%) i/s -    106.541M in   5.032972s
(Read this as “the benchmark-ips gem was able to run the process_data_v1 method 14.6 million times per second and the process_data_v2 method 21.2 million times per second on average”)
Here we have witnessed a 45% increase in performance when not using an empty hash as a default value. That’s because each call has to allocate memory on the Ruby heap to instantiate a new Hash, whereas that is not necessary for nil.
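For readers who want to check the percentage, here is a small sketch of the arithmetic behind it, using the two iterations-per-second figures from the benchmark output above. The constant and method names are made up for this illustration. It also converts each figure into time per call, which helps to judge how much a single call actually gains.

```ruby
# Iterations per second reported by benchmark-ips in the run shown above.
V1_IPS = 14.646e6
V2_IPS = 21.169e6

# Relative throughput gain of the candidate over the baseline, in percent.
def speedup_percent(baseline_ips, candidate_ips)
  ((candidate_ips / baseline_ips) - 1.0) * 100.0
end

# Average duration of one call, in nanoseconds.
def nanoseconds_per_call(ips)
  1_000_000_000.0 / ips
end

saved = nanoseconds_per_call(V1_IPS) - nanoseconds_per_call(V2_IPS)

puts "Speedup: #{speedup_percent(V1_IPS, V2_IPS).round(1)}%"
puts "Time saved per call: #{saved.round(1)} ns"
```

The gain per call is a matter of nanoseconds, so it only becomes visible when the method is called very often.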
That’s it, a plain 45% increase in performance, quite easy when you know Ruby internals, isn’t it? And I am not even assessing memory usage here.
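Memory usage can be approximated by counting allocations. The sketch below is an assumption-laden illustration rather than part of the original benchmark: the allocated_objects helper is hypothetical, the method bodies are reduced to the option lookup, and it relies on the total_allocated_objects counter that CRuby exposes through GC.stat.

```ruby
# Same two signatures as in the benchmark, with bodies reduced to the
# option lookup so that nothing else allocates.
def process_data_v1(data, options = {})
  options[:verbose]
end

def process_data_v2(data, options = nil)
  options && options[:verbose]
end

# Hypothetical helper: number of objects allocated while the block runs,
# based on CRuby's cumulative allocation counter.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

data = []
calls = 100_000

v1 = allocated_objects { calls.times { process_data_v1(data) } }
v2 = allocated_objects { calls.times { process_data_v2(data) } }

puts "process_data_v1: #{v1} objects allocated over #{calls} calls"
puts "process_data_v2: #{v2} objects allocated over #{calls} calls"
```

With the empty-hash default, the count should be at least one object per call, while the nil default should stay close to zero. Exact numbers may vary between Ruby versions.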
Wrapping it up
But let’s now put this in perspective: it’s a nice optimisation, sure, but is it really significant?
Well, it depends on the context. If you run this method a lot, like a million times per second, it’s a pretty nice and easy optimisation. But if you run this job once a minute?… Well 🙃
Keep in mind that this optimisation is great, but depending on the context you might be better off investing your coding time in other, more impactful work (like adding some tests). But sure, if the context requires it, that’s an easy one to have in mind 😉.
Resources used:
- Ruby core documentation - objspace - https://ruby-doc.org/3.3.0/exts/objspace/ObjectSpace.html
- Benchmark-ips Github repository - https://github.com/evanphx/benchmark-ips
- Akash Choudhary - Ruby: Memory Internals and Optimization - https://skychoudhary56.medium.com/ruby-memory-internals-and-optimization-1555df696f71