Introduction to the Linux Boot Process

Linux is a popular open-source operating system that developers often choose for its efficiency and versatility. In this article, we’ll take a closer look at how things work “under the hood” when a Linux system boots.

What is the boot process in Linux?

Before we begin, it is crucial to understand what the boot process is in Linux. The boot process in Linux refers to the series of steps that occur from the moment you push your power button right until your Linux distribution is fully loaded. While the following guide is generally applicable to all Linux distributions, keep in mind that some minor variations may exist depending on the hardware and Linux distribution you're using.

Fig 1 The Linux boot process Flowchart

Step 1: BIOS/UEFI Initialization

Before the boot process begins, the power supply converts AC to DC, which is used to power the system. Immediately after you push the power button, the first thing to run is the BIOS, a firmware program stored on a chip on the motherboard.

BIOS is responsible for initializing the hardware and loading the bootloader. It performs the POST (Power-On Self-Test) to check the condition of the hardware on the system. Conceptually, the firmware probes each essential device: if the device responds as expected, it is considered to be working, and if it does not respond, there is an error with that hardware or device. BIOS also keeps its configuration, such as hardware settings and the boot order, in CMOS memory so that it is available the next time the system is booted. If you encounter an error at this step of the boot process, your hardware might be to blame.

BIOS is also responsible for loading the MBR (Master Boot Record). It checks the storage devices one by one, in the configured boot order. If the first device does not contain a valid boot sector, it checks the next one. If no MBR is found, the boot process is halted. Upon successful completion of this step, control of the system is handed to the code in the MBR.
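
As a rough illustration of the boot-order scan described above, the Python sketch below walks a list of simulated devices and picks the first one whose first sector ends with the standard boot signature bytes 0x55 0xAA. The device names and the `find_boot_device` helper are hypothetical; real firmware does this in its own code, so treat this only as a model of the idea.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid boot sector


def has_boot_signature(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the 0x55 0xAA signature."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def find_boot_device(devices):
    """Return the name of the first device, in boot order, with a valid boot sector.

    `devices` is a list of (name, first_sector_bytes) pairs (made-up data).
    Returns None when nothing is bootable, which is where a real BIOS would halt.
    """
    for name, first_sector in devices:
        if has_boot_signature(first_sector):
            return name
    return None


if __name__ == "__main__":
    blank = bytes(SECTOR_SIZE)              # all zeros, no signature
    bootable = bytes(510) + BOOT_SIGNATURE  # zeros followed by the signature
    print(find_boot_device([("usb0", blank), ("disk0", bootable)]))
```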

BIOS is steadily being replaced by UEFI (Unified Extensible Firmware Interface). UEFI acts as a middleman to connect the computer’s firmware to its operating system; it is used to initialize the hardware and start the OS. It stores boot information in special files with an .efi extension, kept on a dedicated partition called the EFI System Partition, which help UEFI boot the OS directly.

Here, both the programs and the architecture are independent of the hardware and the CPU, which extends UEFI's reach to a variety of processors and allows some powerful capabilities that were not possible with BIOS.
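
On a running Linux system you can usually tell which of the two firmware types booted it: the kernel exposes a `/sys/firmware/efi` directory only when it was started through UEFI. The sketch below checks for that directory. The `firmware_mode` helper and its `sysfs_root` parameter are hypothetical names introduced here so the check can be tried against any directory.

```python
from pathlib import Path


def firmware_mode(sysfs_root: str = "/sys") -> str:
    """Return "UEFI" if <sysfs_root>/firmware/efi exists, else "BIOS (or unknown)"."""
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "UEFI" if efi_dir.is_dir() else "BIOS (or unknown)"


if __name__ == "__main__":
    # The result depends on the machine this is run on.
    print(firmware_mode())
```

The fallback is labelled "or unknown" on purpose: a missing directory only tells us that the kernel did not see UEFI, not what the firmware actually is.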

Fig 2 UEFI Process

Fig 3 BIOS process

Step 2: MBR or GPT

The MBR is the first sector of the hard disk. It describes how the disk is partitioned and where the operating system and the root file system are located. It is a crucial component of the system: it contains a small program that determines which partition will be used for booting, and without it the system will not be able to start. The MBR is exactly 512 bytes long: 446 bytes of bootstrap code, a 64-byte partition table and a 2-byte boot signature.
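
To make the 512-byte layout concrete, here is a minimal Python sketch that decodes the four partition table entries of an MBR sector and picks the one flagged as active (bootable). It follows the classic MBR entry layout; the sample sector built by `build_sample_mbr` is made up for illustration, and all function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # bootstrap code occupies bytes 0..445
ENTRY_SIZE = 16                # four 16-byte entries = 64 bytes
BOOT_SIGNATURE = b"\x55\xaa"   # bytes 510..511
# status, CHS start, type, CHS end, first LBA sector, sector count
ENTRY_FORMAT = "<B3sB3sII"


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        status, _, ptype, _, lba_start, sectors = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE]
        )
        if ptype == 0x00:          # type 0 marks an unused slot
            continue
        partitions.append({
            "number": index + 1,
            "bootable": status == 0x80,   # 0x80 is the "active" flag
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions


def find_bootable(partitions):
    """Return the first partition flagged as active, or None."""
    return next((p for p in partitions if p["bootable"]), None)


def build_sample_mbr() -> bytes:
    """Build a made-up MBR with one active Linux (type 0x83) partition."""
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack(
        ENTRY_FORMAT, 0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 204800
    )
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(find_bootable(parse_mbr(build_sample_mbr())))
```

To try it on a real disk you would read the first 512 bytes of the device (which normally requires root privileges) and pass them to `parse_mbr`.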

The main function of the MBR is to find the bootable partition. Once it has been found, the next step is loading the secondary bootloader, which is typically GRUB on modern systems. This secondary bootloader has multiple responsibilities. Firstly, it loads the kernel into memory, which is essential for the operating system to function. Secondly, it gathers real-time information like date and time from CMOS. Additionally, the secondary bootloader loads the initramfs (the successor to the older initrd) into memory as a temporary RAM disk.

The initramfs plays a crucial role in initializing and mounting the root file system. It is a small temporary file system that provides the drivers and tools the kernel needs to locate the storage devices and mount the real root file system.
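
As background that goes slightly beyond the text above: an initramfs image is a cpio archive, usually compressed. The sketch below guesses the format of such an image from its first bytes using well-known magic numbers. The `detect_format` and `detect_file_format` helpers are hypothetical, the list of formats is not exhaustive, and the location of the image on disk varies by distribution, so no path is assumed here.

```python
# (magic bytes, format name) pairs; only a few common formats are listed.
MAGIC_NUMBERS = [
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"BZh", "bzip2"),
    (b"070701", "uncompressed cpio (newc)"),
]


def detect_format(header: bytes) -> str:
    """Guess the format of an image from its leading bytes."""
    for magic, name in MAGIC_NUMBERS:
        if header.startswith(magic):
            return name
    return "unknown"


def detect_file_format(path: str) -> str:
    """Read the first few bytes of a file and guess its format."""
    with open(path, "rb") as image:
        return detect_format(image.read(8))
```

Some images start with a small uncompressed cpio section followed by a compressed one, in which case this simple check only reports the first part.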

Fig 4 Initramfs overview

While MBR is traditionally paired with BIOS, it is slowly being replaced by GPT (GUID Partition Table), which is the scheme used with UEFI. Unlike MBR, which keeps its partition table in a single place at the start of the disk, GPT keeps a primary copy at the start of the disk and a backup copy at the end, both protected by checksums, so the table can often be recovered if one copy gets corrupted or erased. The traditional MBR scheme has a size limitation of 2 terabytes (with 512-byte sectors) and allows a maximum of 4 primary partitions (although extended partitions can be configured), whereas GPT allows 128 partitions by default. Thus GPT is more resilient to corruption and has better partition management, but the downside is that it is more complex than the traditional MBR partition scheme.
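
The 2-terabyte figure follows from simple arithmetic: an MBR partition entry stores sector addresses in 32 bits, so with 512-byte sectors the largest addressable size is 2^32 × 512 bytes. The short sketch below works this out, along with the size of the default GPT entry array (128 entries of 128 bytes each). The function names are hypothetical.

```python
TIB = 2 ** 40  # one tebibyte in bytes


def mbr_max_disk_bytes(sector_size: int = 512, lba_bits: int = 32) -> int:
    """Largest size addressable with 32-bit sector numbers."""
    return (2 ** lba_bits) * sector_size


def gpt_entry_array_bytes(entries: int = 128, entry_size: int = 128) -> int:
    """Size of the GPT partition entry array with the default layout."""
    return entries * entry_size


if __name__ == "__main__":
    print(mbr_max_disk_bytes() // TIB, "TiB addressable with MBR")
    print(gpt_entry_array_bytes(), "bytes of GPT partition entries")
```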

It is important to note that UEFI supports both MBR and GPT, whereas traditional BIOS is designed around MBR.

What is a file system in Linux?

A file system in Linux is the method of storing and organizing data so that it can be accessed as files and directories. The root partition of the file system contains what is needed to bring the system up once the kernel is loaded, including the necessary processes and device drivers.

Step 3: Kernel

The kernel image is usually stored compressed. Once loaded, it decompresses itself into memory and uses the initramfs as a temporary root file system. After the kernel has set itself up and the initramfs has located the real root file system, the kernel mounts it and switches over to it. Control of the device is now given to the kernel.

What is a kernel?

The kernel is often called the heart of the operating system. It acts as an interface between the hardware and the processes running on the computer.

In a system, memory only supports read and write operations, and those are carried out by the CPU (central processing unit).

The kernel is the software program that tells the CPU what to do.

Functions of Kernel in Linux

Aside from the above tasks, the kernel also manages user processes, which together form the user space. The kernel is responsible for configuring I/O (input/output) devices and memory, and for starting the first essential process, known as the init process. Once the kernel has finished loading, the init process is started.

Fig 5 An overview of the architecture

Step 4: Init Systems (Sysvinit or Systemd)

What is the init system?

Init systems are programs responsible for starting services. This usually includes starting the background processes that the system depends on, triggering the loading of the necessary device drivers, and making sure that starting and stopping are done cleanly.

There are two major init systems for Linux:

  1. Sysvinit

  2. Systemd

Sysvinit

Sysvinit is the traditional init system. It starts services sequentially and uses run levels to support different modes of the operating system.

A run level is a mode of operation, and each one has a set of scripts that are run when the system enters or leaves it. Run levels are identified by numbers, and each number has its own purpose. Sysvinit has 7 run levels (the exact meaning of levels 2 to 5 varies between distributions), which are:

  • Level 0: Used to Halt and shut down the system

  • Level 1: Single-user mode, used for maintenance; no network interfaces are configured and only root can log in

  • Level 2: Has multi-user mode but no network interfaces are configured

  • Level 3: Multi-user mode with running daemons and network interfaces; it starts the system normally

  • Level 4: Undefined/ Left for users to customize

  • Level 5: Has all of the functionality of Level 3 plus the X11 graphical mode and a display manager

  • Level 6: Is used to reboot the system

This init system worked fine for early machines that did not require heavy computing. However, as processors became more advanced and the need for faster startup and shutdown increased, sysvinit quickly lost popularity and was replaced by systemd on most distributions.
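
For readers moving between the two systems, the run levels listed above map onto systemd "targets". The sketch below encodes that mapping as a small lookup table. The target names are the compatibility equivalents that systemd documents for the classic run levels; the `target_for_runlevel` helper is a hypothetical name used only for this illustration.

```python
# Run level -> (short description, equivalent systemd target)
RUNLEVELS = {
    0: ("halt / power off", "poweroff.target"),
    1: ("single-user mode", "rescue.target"),
    2: ("multi-user mode", "multi-user.target"),
    3: ("multi-user mode with networking", "multi-user.target"),
    4: ("user-defined", "multi-user.target"),
    5: ("multi-user mode with graphical login", "graphical.target"),
    6: ("reboot", "reboot.target"),
}


def target_for_runlevel(level: int) -> str:
    """Return the systemd target that corresponds to a SysV run level."""
    if level not in RUNLEVELS:
        raise ValueError(f"unknown run level: {level}")
    return RUNLEVELS[level][1]


if __name__ == "__main__":
    for level, (description, target) in RUNLEVELS.items():
        print(f"run level {level}: {description} -> {target}")
```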

What are processes?

If a computer program is a set of instructions, then a process is the execution of those instructions. A program can have more than one process associated with it, and a process can consist of multiple threads depending on the need or the hardware. Systemd is an init system that manages these processes and provides replacements for various utilities such as logging, network connection management, device management and event management, making the life of a system user much less chaotic.

Systemd

Unlike the previous init system, systemd doesn't work sequentially: it uses the power of modern multi-core processors to start services in parallel. This effectively eliminates drawbacks of the previous system, such as the time taken to reach the login prompt, and it also resolves dependencies between services, something which was difficult in the previous system. It also supports hotplug devices and caters to the needs of modern technologies by providing the facility for cloud storage. It records log events and can take snapshots of the system state.
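
The idea of starting things in parallel while respecting dependencies can be sketched with a topological sort. The example below uses Python's standard `graphlib` module (Python 3.9 or later) to group services into batches, where everything in one batch could be started at the same time. This only models the general idea and is not how systemd is implemented; the unit names and the `startup_batches` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def startup_batches(dependencies: dict) -> list:
    """Group units into batches that can be started in parallel.

    `dependencies` maps each unit to the set of units it must wait for.
    A graphlib.CycleError is raised if the dependencies form a loop.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything startable right now
        batches.append(ready)
        sorter.done(*ready)                 # mark the whole batch as started
    return batches


if __name__ == "__main__":
    units = {
        "basic.target": set(),
        "network.service": {"basic.target"},
        "logging.service": {"basic.target"},
        "sshd.service": {"network.service", "logging.service"},
    }
    for number, batch in enumerate(startup_batches(units), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

A purely sequential init system would start these four units one after another; here the network and logging services land in the same batch because neither depends on the other.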

Systemd starts by working out which target it should reach (targets are its equivalent of run levels) and then starts everything that target depends on. Every process is created by a parent process, except the init process, which is started directly by the kernel. When the kernel starts the init process, systemd runs with process ID 1; after bringing the system up, it stays in the background and wakes only when it has work to do. It prepares the user space and brings the OS into an operational state by starting all the required processes on the system. New processes are created using fork and exec.
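
The parent and child relationship means that every process can be traced back to process ID 1. The sketch below follows that chain using a made-up table of process IDs and parent process IDs; the `ancestry` helper and the numbers in the table are hypothetical.

```python
def ancestry(pid: int, parent_of: dict) -> list:
    """Return the chain of process IDs from `pid` up to the init process (PID 1)."""
    chain = [pid]
    while chain[-1] != 1:
        parent = parent_of[chain[-1]]   # KeyError if the process is unknown
        if parent in chain:
            raise ValueError("parent table contains a loop")
        chain.append(parent)
    return chain


if __name__ == "__main__":
    # Made-up table: child PID -> parent PID
    parent_of = {400: 1, 812: 400, 1350: 812}
    print(ancestry(1350, parent_of))
```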

Fork is used to copy the parent process. The child is a copy of the parent, and the parent keeps running, but the two reside in different address spaces. The child does not inherit everything, however: for example, the parent's memory locks and outstanding asynchronous I/O operations are not carried over.
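
A minimal sketch of the separate address spaces, assuming a Unix-like system where `os.fork` is available: the child changes its own copy of a variable and reports it through a pipe, while the parent's copy stays untouched. The `fork_demo` function is a hypothetical name for this illustration.

```python
import os


def fork_demo() -> tuple:
    """Fork a child that modifies its copy of a variable.

    Returns (value_seen_by_parent, value_reported_by_child).
    """
    value = 10
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy of `value`.
        try:
            os.close(read_fd)
            value += 5
            os.write(write_fd, str(value).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately without running parent cleanup
    # Parent process: its `value` is unaffected by the child's change.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return value, child_value


if __name__ == "__main__":
    parent_value, child_value = fork_demo()
    print("parent sees", parent_value, "and child saw", child_value)
```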

Exec does not create a new process. Instead, it replaces the program running in the calling process with a new one. This is why the two are typically used together: fork creates the child, and the child then calls exec to become the new program. Systemd also introduces the concept of units, which are the things that systemd can manage for you. There are several types of units, such as service units (which run services), target units (which group other units), device units, snapshot units, timer units, etc.
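
Returning to exec for a moment, the usual fork-then-exec pattern can be sketched as follows, again assuming a Unix-like system and Python 3.9 or later. The child replaces itself with a new program and the parent waits for its exit code. The `run_program` helper is a hypothetical name; an init system does considerably more than this when it starts a service.

```python
import os
import sys


def run_program(argv: list) -> int:
    """Start `argv` in a child process via fork + exec and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec itself failed
    # Parent: wait for the child and translate its status to an exit code.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    code = run_program([sys.executable, "-c", "print('hello from the new program')"])
    print("child exited with code", code)
```

Note that the process ID of the child stays the same across the exec call; only the program running inside it changes.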

To summarize, systemd is an amazing software suite that can make one’s life easier. If you’re ever curious about systemd, you can always find more information on Wikipedia or systemd’s GitHub.

After the init process is done and the user environment is set up, we are close to our desktop. The typical desktop environment begins with a daemon called the display manager. It starts a graphical environment consisting of a graphical server, which provides the basic underlying graphical stack, and a login manager, which lets you enter credentials and select a session. After the user has entered the correct credentials, the session manager starts a session. A session is a set of programs such as UI elements (panels, desktops, applets, etc.) which together form a complete desktop environment.

Thus, you are booted into your system normally and now can enjoy your favourite Linux distribution!

Conclusion

Let us summarize the flow of the Linux boot process:

  • Power is turned on and the BIOS (or UEFI) runs, performs a hardware check and gives control to the MBR

  • The MBR locates the bootable partition and loads the secondary bootloader, which loads the kernel and the initramfs so that the root file system can be mounted

  • When the kernel is loaded, it starts the init process

  • After the init process, your OS is loaded and you are ready to use it