Boot Procedure

  1. The primary bootloader, which is usually a fixed part of the hardware, recognizes bootable media.
    • On PC compatible systems, UEFI or BIOS are the primary bootloaders.
    • On BIOS systems, the Master Boot Record (MBR) of a connected storage device is searched for a boot executable. The MBR is a special area at the very start of the device, separate from the actual data storage area. BIOS is an old system, dating back to the early 1980s, but not yet obsolete.
    • On a UEFI system, which supports storage media with much larger capacity, the GPT partition table is used to determine whether the storage device provides an EFI System Partition. This partition contains the folder /efi with one subfolder per secondary bootloader, for example /efi/microsoft or /efi/grub. After a Linux system is booted by UEFI hardware, you can open a terminal and inspect the partition table by typing sudo gdisk /dev/sda.
  2. If a secondary bootloader is found on the bootable media, it is loaded.
    • On PC compatibles the secondary bootloader is usually GRUB2.
    • On Arm Cortex-A based systems, this is often U-Boot.
    • Some systems have proprietary bootloaders. Raspberry Pi, for example.
    • To support the bootloader, storage media provide a standard interface called LBA (logical block addressing). This interface is not optimized for the capabilities of the storage device but is sufficient for the bootloader to search for a bootable operating system, which will later load the appropriate driver for the storage media.
  3. The secondary bootloader loads the Linux kernel.
    • In the case of GRUB2, the sequence is as follows: the boot media is searched for the boot partition using LBA mode; stage 2 is loaded from a separate data area of the boot media; the module for working with the boot partition's filesystem is loaded; the GRUB config file (/boot/grub/grub.cfg) is loaded from the boot partition; the user is optionally shown a GRUB menu; finally, the kernel is loaded with the user parameters.
  4. The kernel optionally loads an initial root file system into memory (often initramfs), which includes the device drivers specific to the target hardware, for example display drivers for showing graphical output during the boot sequence.
    • Technically, all hardware-specific device drivers can be compiled into the kernel itself. In order to keep the kernel generic, however, and to avoid mixing components with incompatible distribution licenses (the GPL 2.0 license used by Linux prevents distributing a kernel with built-in proprietary drivers), certain drivers are placed into initramfs, from where they are loaded when needed.
    • After the hardware is initialized, the kernel switches from initramfs to the real filesystem.
  5. The kernel starts the first process, the init process. The init process starts all the system processes (a.k.a. services) configured to run when the system launches.
    • Traditionally, /sbin/init was the System V init program, which started the services by running shell scripts. Nowadays, many Linux distributions use the systemd initialization method, where the systemd binary takes the place of init and services are described by unit files instead of shell scripts.
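
Whether the running system was started through UEFI or through legacy BIOS (step 1 above) can also be checked from a program. The following is a minimal sketch, assuming the common Linux convention that the kernel exposes the directory /sys/firmware/efi only after a UEFI boot. The function name booted_via_uefi and its path parameter are made up for this illustration.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Returns true if efi_dir exists and is a directory.
 * Assumption: on Linux, /sys/firmware/efi is present only after a UEFI boot. */
bool booted_via_uefi(const char *efi_dir)
{
  struct stat info;
  return stat(efi_dir, &info) == 0 && S_ISDIR(info.st_mode);
}
```

Calling booted_via_uefi("/sys/firmware/efi") is the intended use. The path is passed as a parameter only so that the check can be tried against other directories.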


systemd

Unit files

Unit files determine what kind of task is being started. The types of tasks are:

  • service for running daemons
  • mount for mounting storage media, etc.
  • automount for mounting directories when needed
  • timer for starting jobs at specific times
  • target (a group of unit files, or a state that the machine should be in after booting)
  • path (monitoring filesystem events)

The original unit files that are installed by the package manager are placed in /usr/lib/systemd/system and should not be modified. If modifications to unit files are to be made, the modified unit files should be placed into /etc/systemd/system.

Unit relationships

There are various relationships between two units:

  • requires: the listed unit must be activated for the current unit to be activated.
  • wants: the listed unit is activated together with the current unit, but it is not critical: the current unit is activated even if the listed unit fails.
  • requisite: the listed unit must already be active; otherwise the current unit fails to start.
  • conflicts: the listed unit must not be active while the current unit is active; starting one stops the other.
  • before: the current unit must be activated before the given units.
  • after: the current unit must be activated after the given units.


Targets

Targets are groups of unit files. If a target includes the statement AllowIsolate=yes, the target becomes an initialization endpoint at which systemd stops executing other units. Isolate targets are therefore similar to the System V runlevels that were used for initialization before systemd.

When a unit file declares that it is WantedBy or RequiredBy a custom target (or service), systemd establishes the relationship through a symbolic link to the wanted or required unit file once the unit is enabled. These symbolic links are placed in /etc/systemd/system, within a subdirectory whose name starts with the name of the target's (or service's) unit file and ends with the name of the type of relationship, for example multi-user.target.wants.
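
The naming rule for these subdirectories can be written down as a small helper. This is only an illustrative sketch: the function dependency_dir is hypothetical and not part of systemd, and it merely builds the directory name without touching the filesystem.

```c
#include <stdio.h>

/* Writes "<base_dir>/<unit>.<relation>" into out, for example
 * "/etc/systemd/system/multi-user.target.wants".
 * Returns 0 on success, -1 if the result does not fit into out. */
int dependency_dir(char *out, size_t out_size, const char *base_dir,
                   const char *unit, const char *relation)
{
  int written = snprintf(out, out_size, "%s/%s.%s", base_dir, unit, relation);
  return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

For a unit wanted by multi-user.target, the symbolic link would therefore be expected inside the directory produced by dependency_dir(path, sizeof path, "/etc/systemd/system", "multi-user.target", "wants").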

Which Linux distro and version am I running?

cat /etc/os-release
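
The same information can be read from a program, because /etc/os-release consists of KEY=value lines such as PRETTY_NAME="...". The sketch below assumes this simple layout and does not handle escaped characters or single-quoted values. The function name os_release_value is chosen for this example only.

```c
#include <stdio.h>
#include <string.h>

/* Looks up key (for example "PRETTY_NAME") in a file made of KEY=value lines
 * and copies the value, without surrounding double quotes, into out.
 * Returns 0 on success, -1 if the file or the key is missing. */
int os_release_value(const char *path, const char *key, char *out, size_t out_size)
{
  FILE *file = fopen(path, "r");
  if (file == NULL) {
    return -1;
  }
  char line[512];
  size_t key_length = strlen(key);
  int result = -1;
  while (fgets(line, sizeof line, file) != NULL) {
    /* the line must start with the key, directly followed by '=' */
    if (strncmp(line, key, key_length) != 0 || line[key_length] != '=') {
      continue;
    }
    char *value = line + key_length + 1;
    value[strcspn(value, "\n")] = '\0';   /* strip the trailing newline */
    size_t length = strlen(value);
    if (length >= 2 && value[0] == '"' && value[length - 1] == '"') {
      value[length - 1] = '\0';           /* strip the closing quote */
      value++;                            /* skip the opening quote */
    }
    snprintf(out, out_size, "%s", value);
    result = 0;
    break;
  }
  fclose(file);
  return result;
}
```

A call such as os_release_value("/etc/os-release", "PRETTY_NAME", buffer, sizeof buffer) would fill buffer with the human-readable distribution name if the key is present.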


Dynamic Shared Objects (DSO)

Ulrich Drepper's How To Write Shared Libraries is a good resource for learning more about the ELF format, about optimizing your shared objects, and how to control the visibility of program symbols in the generated binaries.

Tracing system calls

strace is a command that lets you trace the system calls made by the command that is passed as the argument. For example,

strace echo "hello"

prints

execve("/usr/bin/echo", ["echo", "hello"], 0x7ffefa8f4928 /* 42 vars */) = 0
brk(NULL)                               = 0x561cf6532000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff61293910) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85c44000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=53267, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 53267, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85c36000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\237\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0i8\235HZ\227\223\333\350s\360\352,\223\340."..., 68, 896) = 68
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2260560, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf85a0e000
mmap(0x7faf85a36000, 1658880, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7faf85a36000
mmap(0x7faf85bcb000, 360448, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bd000) = 0x7faf85bcb000
mmap(0x7faf85c23000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x214000) = 0x7faf85c23000
mmap(0x7faf85c29000, 52816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7faf85c29000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85a0b000
arch_prctl(ARCH_SET_FS, 0x7faf85a0b740) = 0
set_tid_address(0x7faf85a0ba10)         = 1874
set_robust_list(0x7faf85a0ba20, 24)     = 0
rseq(0x7faf85a0c0e0, 0x20, 0, 0x53053053) = 0
mprotect(0x7faf85c23000, 16384, PROT_READ) = 0
mprotect(0x561cf5067000, 4096, PROT_READ) = 0
mprotect(0x7faf85c7e000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7faf85c36000, 53267)           = 0
getrandom("\x01\x62\x8d\xf4\xec\xea\xd8\x4e", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x561cf6532000
brk(0x561cf6553000)                     = 0x561cf6553000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=6070224, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 6070224, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85441000
close(3)                                = 0
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
write(1, "hello\n", 6)                  = 6
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

This output lists the system calls made by echo in the order in which they are made.

Instead of printing system calls as they are made, strace can print a short summary of how many times each system call was made and how much time was spent in it.

strace -c echo "hello"

prints

hello
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0,00    0,000000           0         1           read
  0,00    0,000000           0         1           write
  0,00    0,000000           0         5           close
  0,00    0,000000           0         9           mmap
  0,00    0,000000           0         3           mprotect
  0,00    0,000000           0         1           munmap
  0,00    0,000000           0         3           brk
  0,00    0,000000           0         4           pread64
  0,00    0,000000           0         1         1 access
  0,00    0,000000           0         1           execve
  0,00    0,000000           0         2         1 arch_prctl
  0,00    0,000000           0         1           set_tid_address
  0,00    0,000000           0         3           openat
  0,00    0,000000           0         4           newfstatat
  0,00    0,000000           0         1           set_robust_list
  0,00    0,000000           0         1           prlimit64
  0,00    0,000000           0         1           getrandom
  0,00    0,000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100,00    0,000000           0        43         2 total



Similarly, ltrace lists the calls to library functions made by the command that is passed to it as the argument. On Debian-based distributions, it can be installed with

sudo apt install ltrace



Signals

Signals are software interrupts that the kernel can send to a process. The process may react to a signal or choose to ignore it. The two signals SIGKILL and SIGSTOP, however, cannot be caught, blocked, or ignored by the process.
An overview of the signals can be found in the man page about signals:

man 7 signal


Some command-line tools do useful stuff when certain signals are sent to them. For example, the dd tool prints the progress of the copying task when the SIGUSR1 signal is sent to it via the kill command.

dd if=${HOME}/image.iso of=/dev/sdb &
kill -s USR1 $(pidof dd)
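
A program can react to SIGUSR1 in a similar way by installing a signal handler. The sketch below is not how dd is implemented; it only shows the general pattern with sigaction, where the handler sets a flag and the main loop of the program checks it. The names progress_requested, install_progress_handler and take_progress_request are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the main loop. volatile sig_atomic_t is the
 * type the C standard guarantees can be written safely from a handler. */
static volatile sig_atomic_t progress_requested = 0;

static void on_sigusr1(int signal_number)
{
  (void)signal_number;
  progress_requested = 1;
}

/* Installs the SIGUSR1 handler. Returns 0 on success, -1 on failure. */
int install_progress_handler(void)
{
  struct sigaction action;
  memset(&action, 0, sizeof action);
  action.sa_handler = on_sigusr1;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGUSR1, &action, NULL);
}

/* Returns 1 if a request arrived since the last call and clears the flag. */
int take_progress_request(void)
{
  if (!progress_requested) {
    return 0;
  }
  progress_requested = 0;
  return 1;
}
```

A long-running loop would call install_progress_handler() once at startup and then check take_progress_request() between work items, printing its progress whenever the function returns 1. Keeping the handler itself this small avoids calling functions that are not safe inside a signal handler, such as printf.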


Interprocess Communication Mechanisms

Shared Memory

Linux supports System V shared memory segments, which can be used for interprocess communication.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHM_SIZE 1024 // size of the segment in bytes (example value)

int main(void)
{
  // creating an IPC key from the path of an existing file and a project id
  key_t segmentKey = ftok("path_to_some_existing_file", 'R');
  if (segmentKey == -1) {
    perror("could not create IPC key");
    exit(1);
  }
  // accessing the shared memory segment, creating it if it doesn't exist
  int sharedMemoryID = shmget(segmentKey, SHM_SIZE, 0644 | IPC_CREAT);
  if (sharedMemoryID == -1) {
    perror("could not access (or create) the shared memory segment");
    exit(1);
  }
  // attaching a pointer; shmat returns (void*)-1 on failure
  void* memoryPtr = shmat(sharedMemoryID, NULL, 0);
  if (memoryPtr == (void*)-1) {
    perror("could not attach a pointer to the shared memory segment");
    exit(1);
  }
  return 0;
}
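
The snippet above stops after attaching the segment. A System V segment stays in the system until it is explicitly removed, so a complete program usually also detaches with shmdt and removes the segment with shmctl. The following sketch shows the whole life cycle in one function. It uses IPC_PRIVATE instead of an ftok key only to stay self-contained, and the function name shared_memory_round_trip is made up for this example.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Creates a private segment of `size` bytes, writes message into it, copies
 * the segment contents into out (which must hold `size` bytes), then detaches
 * and removes the segment. Returns 0 on success, -1 on error. */
int shared_memory_round_trip(const char *message, char *out, size_t size)
{
  int id = shmget(IPC_PRIVATE, size, 0600 | IPC_CREAT);
  if (id == -1) {
    return -1;
  }
  char *memory = shmat(id, NULL, 0);
  if (memory == (void *)-1) {
    shmctl(id, IPC_RMID, NULL);   /* do not leave the segment behind */
    return -1;
  }
  strncpy(memory, message, size - 1);
  memory[size - 1] = '\0';        /* always terminate the string */
  memcpy(out, memory, size);

  int detached = shmdt(memory);              /* unmap from this process */
  int removed = shmctl(id, IPC_RMID, NULL);  /* mark segment for removal */
  return (detached == -1 || removed == -1) ? -1 : 0;
}
```

In a real setup with two processes, only one of them would call shmctl with IPC_RMID, typically after the other side has finished. Segments that were left behind can be listed with the ipcs command.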

To learn more about shared memory segments, type

man ftok
man shmget
man shmat


Memory Mapped Files

Linux supports mapping files to memory regions, which makes it possible to work on the file contents as if they were ordinary memory, often more efficiently than with explicit read and write calls. For example, pointers can be used in a C/C++ program to edit the contents. The memory mapped region can be shared between processes. This allows the processes to operate on large files, for example in a writer-reader architecture, while conserving memory. Unlike the shared memory method described above, the contents of the shared memory region are written to a file and thus persist between sessions and even system reboots.

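#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

static const int MAX_INPUT_LENGTH = 50;

int main(int argc, char** argv)
{
  if (argc < 3) {
    fprintf(stderr, "usage: %s <file> read|write\n", argv[0]);
    return 1;
  }
  // O_CREAT requires the permission bits as the third argument
  int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
  struct stat info;
  if (fd == -1 || fstat(fd, &info) == -1) {
    perror("could not open the file");
    return 1;
  }
  // the file must be at least as large as the mapping, otherwise accessing it raises SIGBUS
  if (info.st_size < MAX_INPUT_LENGTH && ftruncate(fd, MAX_INPUT_LENGTH) == -1) {
    perror("could not resize the file");
    return 1;
  }
  char* shared_mem = mmap(NULL, MAX_INPUT_LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (shared_mem == MAP_FAILED) {
    perror("could not map the file");
    return 1;
  }
  close(fd); // the mapping stays valid after the descriptor is closed

  if (!strcmp(argv[2], "read")) {
    // continuously prints the text most recently stored by the 'write' process
    while (1) {
      shared_mem[MAX_INPUT_LENGTH-1] = '\0';
      printf("%s", shared_mem);
      fflush(stdout);
      sleep(1);
    }
  }
  else if (!strcmp(argv[2], "write")) {
    // continuously stores the input from the terminal for the 'read' process
    while (fgets(shared_mem, MAX_INPUT_LENGTH, stdin) != NULL) {
      // nothing else to do, fgets writes directly into the mapping
    }
  }
  else {
    printf("Unsupported command\n");
  }
  return 0;
}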

To learn more about memory mapped files, refer to the man page

man mmap


Pipes

Pipes are another IPC mechanism where one process can send messages to another.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
  int pipeDescriptor[2];

  // reading from pipeDescriptor[0], writing to pipeDescriptor[1]
  if (pipe(pipeDescriptor) == -1) {
    perror("could not create the pipe");
    exit(1);
  }

  pid_t pid = fork();
  if (pid == -1) {
    perror("could not fork");
    exit(1);
  }
  if (pid == 0)
  {
    printf(" CHILD: writing to pipe and exiting\n");
    write(pipeDescriptor[1], "test", 5); // 5 bytes: "test" plus the terminating '\0'
  }
  else
  {
    char buffer[5];
    printf("PARENT: reading from pipe\n");
    read(pipeDescriptor[0], buffer, 5);
    printf("PARENT: read \"%s\"\n", buffer);
    wait(NULL); // collect the exit status of the child
  }

  return 0;
}


To learn more about pipes, refer to the man page

man pipe


dmesg


D-Bus


C++ Notes

Platform detection

To detect whether code is compiled for the Linux platform you can check if __linux__ is defined.

#ifdef __linux__

(here you can find preprocessor definitions for many operating systems)
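
As a small illustration of how such checks are typically combined, the sketch below selects a platform name at compile time. It compiles as both C and C++. The macros _WIN32 and __APPLE__ are the ones commonly predefined by compilers on those platforms, and the function name platform_name is made up for this example.

```c
/* Returns a name for the platform the code was compiled for.
 * The selection happens at compile time, so only one return remains. */
const char *platform_name(void)
{
#if defined(__linux__)
  return "linux";
#elif defined(_WIN32)
  return "windows";
#elif defined(__APPLE__)
  return "apple";
#else
  return "unknown";
#endif
}
```

Because the preprocessor removes the branches that do not apply, platform-specific headers and calls can be placed inside the same kind of #if blocks without breaking the build on other systems.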

