Process management in Ruby

Ruby

Published by Mathieu EUSTACHY

7 minutes reading

You have probably already heard the word "process". But do you know exactly what a process is in your operating system? How is it different from an application? What are the stack and the heap?


Application, process, heap, stack: wow, that's a lot of intimidating words. Ever wondered what they are and how they translate to the Ruby world? 💎


Let’s dive into it!


This is the second article in a broader series about “low-level” computing concepts applied to Ruby.

  1. What is a Ruby implementation?
  2. Process management in Ruby
  3. Concurrency and parallelism in Ruby
  4. Thread management in Ruby
  5. Memory management in Ruby


Always keep in mind that I am intentionally summarising things to give you a quick overview; there is more to each concept.



Application vs Process


Let's start with two fundamental definitions:

  • Application: A file containing a list of instructions stored on the disk (= an executable)
  • Process: An instance of an executing application


To put it simply, when an application is launched, it is loaded into memory and becomes a process. You can launch several processes of the same application, like when you have several Google Chrome windows open. Closer to our daily work, your running web server might use several processes, each being an instance of your web application (but more on that in the next article about concurrency and parallelism).



An application is basically something like a Class, and a process an instance of said Class.
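To make the Class/instance comparison a bit more concrete, here is a minimal sketch that launches the same tiny "application" twice and compares the resulting processes. The `application_code` one-liner is a hypothetical stand-in for a real program, and I am assuming the `ruby` executable reported by `RbConfig.ruby` can be started from your environment.

```ruby
require "rbconfig"

# Hypothetical "application": a one-line Ruby program that prints its own PID.
application_code = "print Process.pid"

# Launch the same application twice. Each launch creates its own process.
pipes = 2.times.map { IO.popen([RbConfig.ruby, "-e", application_code]) }

# Each child reports the process ID the operating system gave it.
child_pids = pipes.map { |pipe| pipe.read.to_i }
pipes.each(&:close)

parent_pid = Process.pid

puts "Parent process: #{parent_pid}"
puts "Child processes: #{child_pids.inspect}"
```

Same code on disk, two different process IDs: two instances of one application, plus the parent process that started them.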



What is a process?


A process encapsulates all the data needed to run the application. It is composed of several states:

  • Static states (they can also be called ‘the static memory’)
    • Text
      • The code of the program (remember that you use a kind of elaborate text editor to write code at first)
    • Data section
      • The global variables and data that are available when the process is first initialised. For an executing Rails app, it could be environment variables, for example. That's why you have to restart your server when you update your environment variables: they are part of the static state, captured when the process starts.


  • Dynamic states, which grow and shrink during execution (they can also be called ‘the request processing memory’)
    • Stack
      • Used for managing several things, such as:
        • the execution flow of the program, which is called the call stack: it ensures that when a function completes its execution, control returns to the point in the program from which the function was called
        • the storage of local variables and function parameters: memory is allocated to store them when the function is called
        • the storage of function call information: the various pieces of information needed to manage the function call and its subsequent return (return address, saved CPU registers, and a few other things)
      • Memory on the stack is automatically allocated and deallocated as functions are called and return (more on memory in a coming article of this series)
      • Analogy: think of the stack as a stack of plates in a kitchen. You add a plate to the top when a function is called, and you always take away the last plate you added. The stack of plates represents the execution of your program. Each plate corresponds to a function currently running. When a function finishes, you remove the top plate.


    • Heap
      • Used for allocating memory dynamically
      • Most of the time, it's basically about storing objects
      • In Ruby, everything is an object, therefore everything is stored on the heap.
      • So the local variables stored on the stack, which we have just mentioned, are references to objects stored on the heap (ouch…)
      • Second ouch: forget my previous statement, not every value is actually stored on the heap in Ruby. nil, false, true, and small integers (the old Fixnum range, from -2**62 to 2**62 - 1 on 64-bit CRuby) are handled with specific optimisations to avoid unnecessary heap allocation: the value is encoded directly in the reference itself (Ouch *3 …). Meaning if you declare several local variables whose values are nil, they all refer to the same nil singleton instance of the NilClass class, without any heap allocation 🙃
      • In general, memory on the heap requires explicit allocation and deallocation, and its lifetime is not bound to the function or block where it is allocated (unlike the stack!).
      • In Ruby we do not manage memory allocation or deallocation ourselves, unlike in programming languages like C or Rust. It’s the role of the Ruby runtime and its garbage collector (GC) to allocate memory and then to identify and reclaim it when it is no longer in use (more on memory in a coming article of this series).
      • Analogy: think of the heap as a storage area where you put objects whose lifetime you cannot predict. It's like having a shelf where you place items you want to keep for a longer time. For example, images you download in an application could be stored on the heap because they might be needed for some time, even after the function that loaded them has finished.
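Here is a small sketch of the reference behaviour described in the list above, assuming CRuby (other Ruby implementations may handle small values differently). The `allocated_objects` helper is a name I made up for this illustration; it relies on `GC.stat(:total_allocated_objects)`, which counts every object allocated on the heap since the process started.

```ruby
# Two local variables can reference the very same heap object.
greeting = String.new("hello")
alias_of_greeting = greeting              # copies the reference, not the string
copy_of_greeting = String.new("hello")    # a second, separate heap object

same_object = greeting.equal?(alias_of_greeting)
distinct_copy = greeting.equal?(copy_of_greeting)

# Mutating through one reference is visible through the other one.
alias_of_greeting << " world"

# nil is a singleton: every nil is the same object.
first_nil = nil
second_nil = nil

# Hypothetical helper: how many heap objects does the block allocate?
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

integer_allocations = allocated_objects { 1_000.times { |i| i + 1 } }
string_allocations = allocated_objects { 1_000.times { String.new("x") } }

puts "same object? #{same_object}, distinct copy? #{distinct_copy}"
puts "greeting is now #{greeting.inspect}"
puts "nil is shared? #{first_nil.equal?(second_nil)}"
puts "allocations for small integers: #{integer_allocations}"
puts "allocations for strings: #{string_allocations}"
```

The exact allocation counts vary between Ruby versions, but the small-integer loop should allocate far fewer objects than the string loop, since small integers do not need their own slot on the heap.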


I hope I did not lose you with the previous statements. They can be quite hard to grasp because they seem very obscure and they are not explicitly available when you write Ruby code.

But they are crucial to understanding the underlying logic in Ruby and, in the end, they make you a much better developer.


Below is the most commonly used diagram to illustrate it:



The arrows represent the dynamic behaviour of the stack and the heap.
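You can watch the stack grow and shrink from Ruby itself. The sketch below is only an illustration: `depth_log` is a hypothetical method that calls itself a few times and records `caller.size`, the number of frames sitting below the current one. The absolute numbers depend on how the script is launched, so only the differences between them matter.

```ruby
# Hypothetical helper: records the stack depth when entering and leaving
# each level of a small recursion.
def depth_log(level, max_level, log)
  log << [:enter, level, caller.size]   # a new frame was just pushed
  depth_log(level + 1, max_level, log) if level < max_level
  log << [:leave, level, caller.size]   # the deeper frames are already gone
  log
end

log = depth_log(1, 3, [])

log.each do |event, level, depth|
  puts "#{event} level #{level}: #{depth} frames below"
end
```

Each nested call adds exactly one frame, and each return removes it, which is the plate analogy in action.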



Ruby example


Okay, now let me summarise everything that we have just learned with a Ruby example:


# process_example.rb

# Global variable
$global_variable = "I am a global variable"

def heap_example(x)
  hash = { anything: x }
  array = [x, hash] # last expression, so this array is the return value
end

def my_program
  x = 5
  heap_example(x)
end

my_program



Here we have a Ruby file that plays the role of our application. When we run ruby process_example.rb in our terminal, the ruby interpreter is loaded into memory, reads this file, and a process is created.


This process has 2 static states, which are:

  • a Text, which is the text content of our process_example.rb
  • a Data section, which is composed of the $global_variable variable
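The "available when the process is first initialised" part is easy to observe with environment variables, which I mentioned earlier. In this rough sketch, `APP_MODE` is a made-up variable name and the child script is a hypothetical stand-in for a long-running server. I am assuming `RbConfig.ruby` points to a usable `ruby` executable.

```ruby
require "rbconfig"

# Hypothetical long-running program: waits a bit, then prints what it sees.
child_script = 'sleep 0.1; print ENV["APP_MODE"]'

ENV["APP_MODE"] = "v1"
running_child = IO.popen([RbConfig.ruby, "-e", child_script])

# Updating the variable now does not reach the child that is already running.
ENV["APP_MODE"] = "v2"
first_output = running_child.read
running_child.close

# A freshly started process picks up the new value.
second_output = IO.popen([RbConfig.ruby, "-e", child_script], &:read)

ENV.delete("APP_MODE") # clean up after the demonstration

puts "already running process saw: #{first_output}"
puts "restarted process saw: #{second_output}"
```

The first child received a copy of the environment when it started, so it never sees the update. Only a new process does, which is why a restart is needed.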


Then, the process will read the file and start growing and shrinking its dynamic states (stack and heap).


It will:

  1. Register the heap_example and my_program methods, so it knows where to find them to execute them later
  2. Execute the my_program method
     • Add the my_program stack frame on top of the stack
       • function call information => it should return to the my_program line after executing the method
       • local variables => it stores x in the stack frame (5 is a small integer, so no separate heap object is needed)
  3. Execute the heap_example method with the x parameter
     • Add the heap_example stack frame on top of the stack
       • function call information => it should return to the heap_example(x) line after executing the method
     • Execute each line of code within the method
       • it stores the hash and array objects on the heap and their references in the stack frame
     • Arrive at the end of the method and return the value of the last line ([5, { anything: 5 }])
     • Remove the heap_example stack frame, and with it the references to the hash and array objects (but the objects still exist on the heap for a short time after!)
  4. Arrive at the end of the my_program method and return the value of its last line (the same [5, { anything: 5 }] array)
     • Remove the my_program stack frame
  5. At some point, the Garbage Collector will remove the hash and array objects from the heap
  6. Arrive at the end of the file, finish execution
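If you want to see the frames being added and removed for this exact example, one option is Ruby's built-in TracePoint API. This is just a sketch: the `events` array and the way I record into it are my own choices for the illustration.

```ruby
def heap_example(x)
  hash = { anything: x }
  array = [x, hash]
end

def my_program
  x = 5
  heap_example(x)
end

events = []

# :call fires when a Ruby method's frame is pushed, :return when it is popped.
trace = TracePoint.new(:call, :return) do |tp|
  events << [tp.event, tp.method_id]
end

# Tracing is only active while the block runs; enable returns the block's value.
result = trace.enable { my_program }

events.each { |event, method_name| puts "#{event} #{method_name}" }
puts "returned: #{result.inspect}"
```

The trace lists my_program being called first and returning last, with heap_example called and returning in between: last in, first out. The value that travels back up is the array built inside heap_example.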



That’s it!


You now have a better understanding of what happens behind the scenes when you run a Ruby program. You probably also have a better grasp of what the stack and the heap are, and why these concepts are so important for the next articles of the series, which will talk about threads, concurrency, memory, etc.


As for the next article of this series, it will delve into concurrency and parallelism, as an introduction before heading to the thread management in Ruby article.



Get ready for it!




