Linux: Difference between revisions


Revision as of 2022-11-27T14:58:16

Typical Boot Procedure

  1. The primary bootloader, which is usually a fixed part of the hardware, recognizes bootable media.
    • On PC compatible systems, UEFI or BIOS are the primary bootloaders.
    • On BIOS systems, the Master Boot Record (MBR), the first sector of a connected storage device, is searched for a boot executable. The MBR is a special area, separate from the partitions that hold the actual data. BIOS is an old system, dating back to the early 1980s, but not yet obsolete.
    • On a UEFI system, which supports storage media with much larger capacity, the GPT partition table is used to determine whether the storage device provides an EFI System Partition containing the folder /efi with a subfolder that holds the secondary bootloader, for example /efi/microsoft or /efi/grub. After a Linux system is booted by UEFI hardware, you can open a terminal and inspect the partition table by typing sudo gdisk /dev/sda.
  2. If a secondary bootloader is found on the bootable media, it is loaded.
    • On PC compatibles the secondary bootloader is usually GRUB.
    • On Arm Cortex-A based systems, this is often U-Boot.
    • Some systems, for example the Raspberry Pi, have proprietary bootloaders.
    • To support the bootloader, a standard interface called LBA (logical block addressing) exists for storage media. This interface is not optimized for the capabilities of the storage device but is sufficient for the bootloader to search for a bootable operating system, which will later load the appropriate driver for the storage media.
  3. The secondary bootloader loads the Linux kernel.
  4. The kernel mounts an initial root file system held in memory (often an initramfs), which includes some basic device drivers, for example display drivers for showing graphical output during the boot sequence.
  5. The kernel starts the first process, the init process, which starts all the system processes (a.k.a. services) configured to run when the system launches.
    • init, located in the filesystem at /sbin/init, used to start services by running shell scripts. Nowadays, many Linux distributions use the systemd initialization method, where a binary executable starts the services from declarative unit files instead of shell scripts.
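
Whether a running system was started through UEFI or through a legacy BIOS can also be checked from user space: the kernel exposes the directory /sys/firmware/efi only when it was booted in UEFI mode. The following is a minimal sketch of that check; the function names booted_via_uefi and print_firmware_type are made up for this illustration.

```c
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if /sys/firmware/efi exists, which is the
// case when the kernel was booted in UEFI mode, and 0 otherwise.
int booted_via_uefi(void)
{
  return access("/sys/firmware/efi", F_OK) == 0;
}

// Prints the detected firmware type to standard output.
void print_firmware_type(void)
{
  printf("%s\n", booted_via_uefi() ? "UEFI" : "BIOS (or other non-UEFI firmware)");
}
```

Calling print_firmware_type() from a main function is enough to try it out.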


Dynamic Shared Objects (DSO)

Ulrich Drepper's How To Write Shared Libraries is a good resource for learning more about the ELF format, about optimizing your shared objects, and how to control the visibility of program symbols in the generated binaries.

Tracing system calls

strace is a command that lets you trace the system calls made by the command that is passed as the argument. For example,

strace echo "hello"

prints

execve("/usr/bin/echo", ["echo", "hello"], 0x7ffefa8f4928 /* 42 vars */) = 0
brk(NULL)                               = 0x561cf6532000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff61293910) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85c44000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=53267, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 53267, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85c36000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\237\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0i8\235HZ\227\223\333\350s\360\352,\223\340."..., 68, 896) = 68
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2260560, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf85a0e000
mmap(0x7faf85a36000, 1658880, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7faf85a36000
mmap(0x7faf85bcb000, 360448, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bd000) = 0x7faf85bcb000
mmap(0x7faf85c23000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x214000) = 0x7faf85c23000
mmap(0x7faf85c29000, 52816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7faf85c29000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85a0b000
arch_prctl(ARCH_SET_FS, 0x7faf85a0b740) = 0
set_tid_address(0x7faf85a0ba10)         = 1874
set_robust_list(0x7faf85a0ba20, 24)     = 0
rseq(0x7faf85a0c0e0, 0x20, 0, 0x53053053) = 0
mprotect(0x7faf85c23000, 16384, PROT_READ) = 0
mprotect(0x561cf5067000, 4096, PROT_READ) = 0
mprotect(0x7faf85c7e000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7faf85c36000, 53267)           = 0
getrandom("\x01\x62\x8d\xf4\xec\xea\xd8\x4e", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x561cf6532000
brk(0x561cf6553000)                     = 0x561cf6553000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=6070224, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 6070224, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85441000
close(3)                                = 0
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
write(1, "hello\n", 6)                  = 6
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

This output lists the system calls made by echo in the order in which they are made.

Instead of printing each system call as it is made, strace can print a short summary showing which system calls were made, how many times each was called, and how much time was spent in them.

strace -c echo "hello"

prints

hello
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0,00    0,000000           0         1           read
  0,00    0,000000           0         1           write
  0,00    0,000000           0         5           close
  0,00    0,000000           0         9           mmap
  0,00    0,000000           0         3           mprotect
  0,00    0,000000           0         1           munmap
  0,00    0,000000           0         3           brk
  0,00    0,000000           0         4           pread64
  0,00    0,000000           0         1         1 access
  0,00    0,000000           0         1           execve
  0,00    0,000000           0         2         1 arch_prctl
  0,00    0,000000           0         1           set_tid_address
  0,00    0,000000           0         3           openat
  0,00    0,000000           0         4           newfstatat
  0,00    0,000000           0         1           set_robust_list
  0,00    0,000000           0         1           prlimit64
  0,00    0,000000           0         1           getrandom
  0,00    0,000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100,00    0,000000           0        43         2 total
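
To see where a line such as write(1, "hello\n", 6) = 6 in the first trace comes from, it can help to issue the same system call from your own program. The sketch below wraps write(2) in a small function; the name say is an assumption made for this example.

```c
#include <string.h>
#include <unistd.h>

// Hypothetical helper: writes msg to standard output with a single write(2)
// system call and returns the number of bytes written, or -1 on error.
ssize_t say(const char* msg)
{
  return write(STDOUT_FILENO, msg, strlen(msg));
}
```

If a program calls say("hello\n") and is run under strace -e trace=write, the trace should contain a single write line similar to the one produced by echo.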



Similarly, ltrace lists the calls to library functions made by the command that is passed to it as the argument. On Debian-based distributions, it can be installed with

sudo apt install ltrace



Signals

Signals are software interrupts that the kernel can deliver to a process. The process may handle a signal, ignore it, or leave the default action in place. The two signals SIGKILL and SIGSTOP, however, cannot be caught, blocked, or ignored. SIGTERM, by contrast, can be handled, which gives a process the chance to shut down cleanly.
An overview of the signals can be found in the man page about signals:

man 7 signal


Some command-line tools do useful stuff when certain signals are sent to them. For example, the dd tool prints the progress of the copying task when the SIGUSR1 signal is sent to it via the kill command.

dd if=${HOME}/image.iso of=/dev/sdb &
kill -s USR1 $(pidof dd)


Interprocess Communication Mechanisms

Shared Memory

Linux supports System V shared memory segments, which can be used for interprocess communication.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHM_SIZE 1024 // segment size in bytes (assumed value, adjust as needed)

int main(void)
{
  // creating an IPC key
  key_t segmentKey = ftok("path_to_some_existing_file", 'R');
  if (segmentKey == -1) {
    perror("could not create IPC key");
    exit(1);
  }
  // accessing the shared memory segment, creating it if it doesn't exist
  int sharedMemoryID = shmget(segmentKey, SHM_SIZE, 0644 | IPC_CREAT);
  if (sharedMemoryID == -1) {
    perror("could not access (or create) the shared memory segment");
    exit(1);
  }
  // attaching a pointer; shmat returns (void*)-1 on failure
  void* memoryPtr = shmat(sharedMemoryID, NULL, 0);
  if (memoryPtr == (void*)-1) {
    perror("could not attach a pointer to the shared memory segment");
    exit(1);
  }
  return 0;
}

To learn more about shared memory segments, type

man ftok
man shmget
man shmat
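
A System V segment stays allocated until it is explicitly removed, so a complete program usually also detaches from the segment (shmdt) and marks it for removal (shmctl with IPC_RMID). The following sketch runs through the whole life cycle in one process. It uses IPC_PRIVATE instead of an ftok key so that it does not depend on an existing file; the function name shm_round_trip is an assumption made for this example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

// Hypothetical helper: creates a private segment of `size` bytes, copies
// `text` into it, reads it back into `out` (which must hold `size` bytes),
// then detaches from the segment and removes it.
// Returns 0 on success and -1 on error.
int shm_round_trip(const char* text, char* out, size_t size)
{
  int id = shmget(IPC_PRIVATE, size, 0600 | IPC_CREAT);
  if (id == -1) {
    perror("shmget");
    return -1;
  }

  char* mem = shmat(id, NULL, 0);
  if (mem == (void*)-1) {
    perror("shmat");
    shmctl(id, IPC_RMID, NULL);
    return -1;
  }

  // write into the segment, always leaving room for the terminating '\0'
  strncpy(mem, text, size - 1);
  mem[size - 1] = '\0';
  // read the contents back out of the segment
  strncpy(out, mem, size);

  int result = 0;
  if (shmdt(mem) == -1) {
    perror("shmdt");
    result = -1;
  }
  // without IPC_RMID the segment would outlive the process
  if (shmctl(id, IPC_RMID, NULL) == -1) {
    perror("shmctl");
    result = -1;
  }
  return result;
}
```

Segments that were left behind by a program can be listed with ipcs -m and removed with ipcrm.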


Memory Mapped Files

Linux can map files into memory regions, so that the file contents can be accessed as if they were ordinary memory, which is often more efficient than explicit read and write calls. For example, pointers can be used in a C/C++ program to edit the contents. A memory-mapped region can be shared between processes. This allows the processes to operate on large files, for example in a publisher-subscriber architecture, while conserving memory. Unlike the shared memory method described above, the contents of the shared region are backed by a file and thus persist between sessions and even system reboots.

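#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

static const int MAX_INPUT_LENGTH = 50;

int main(int argc, char** argv)
{
  if (argc < 3) {
    fprintf(stderr, "usage: %s <file> read|write\n", argv[0]);
    return EXIT_FAILURE;
  }

  // O_CREAT requires a mode argument, which is used if the file is created
  int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
  if (fd == -1) {
    perror("could not open the file");
    return EXIT_FAILURE;
  }
  // the file must be at least as large as the mapping, otherwise accessing
  // the mapped region raises SIGBUS (note: this also shortens a longer file)
  if (ftruncate(fd, MAX_INPUT_LENGTH) == -1) {
    perror("could not resize the file");
    return EXIT_FAILURE;
  }

  char* shared_mem = mmap(NULL, MAX_INPUT_LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (shared_mem == MAP_FAILED) {
    perror("could not map the file");
    return EXIT_FAILURE;
  }
  // the mapping stays valid after the file descriptor is closed
  close(fd);

  if (!strcmp(argv[2], "read")) {
    // continuously prints the text stored by the 'write' process to the terminal
    while (1) {
      shared_mem[MAX_INPUT_LENGTH-1] = '\0';
      printf("%s", shared_mem);
      fflush(stdout);
      sleep(1);
    }
  }
  else if (!strcmp(argv[2], "write")) {
    // continuously copies the input from the terminal into the shared region
    while (fgets(shared_mem, MAX_INPUT_LENGTH, stdin) != NULL) {
    }
  }
  else {
    printf("Unsupported command\n");
  }
  return EXIT_SUCCESS;
}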

To learn more about memory mapped files, refer to the man page

man mmap
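
The persistence of a shared file mapping can be demonstrated in a single process: data stored through the mapped pointer can afterwards be read back from the file with an ordinary read call. The following is a minimal sketch of this; the function names write_through_mapping and read_file are made up for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: stores `text` in the file at `path` by writing
// through a shared memory mapping. Returns 0 on success and -1 on error.
int write_through_mapping(const char* path, const char* text)
{
  size_t length = strlen(text);
  if (length == 0) {
    return -1; // mmap rejects zero-length mappings
  }

  int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
  if (fd == -1) {
    perror("open");
    return -1;
  }
  // grow the file to the size of the mapping before touching the memory
  if (ftruncate(fd, (off_t)length) == -1) {
    perror("ftruncate");
    close(fd);
    return -1;
  }

  char* mem = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd); // the mapping remains valid without the descriptor
  if (mem == MAP_FAILED) {
    perror("mmap");
    return -1;
  }

  memcpy(mem, text, length); // an ordinary memory write modifies the file

  int result = 0;
  if (msync(mem, length, MS_SYNC) == -1) { // flush the changes to the file
    perror("msync");
    result = -1;
  }
  if (munmap(mem, length) == -1) {
    perror("munmap");
    result = -1;
  }
  return result;
}

// Hypothetical helper: reads up to size-1 bytes from `path` with read(2)
// and null-terminates them. Returns the number of bytes read or -1 on error.
ssize_t read_file(const char* path, char* out, size_t size)
{
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    return -1;
  }
  ssize_t n = read(fd, out, size - 1);
  close(fd);
  if (n >= 0) {
    out[n] = '\0';
  }
  return n;
}
```

Calling write_through_mapping followed by read_file on the same path should return the stored text, even though the data was never passed to write.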


Pipes

Pipes are another IPC mechanism where one process can send messages to another.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
  int pipeDescriptor[2];

  // pipeDescriptor[0] is the read end, pipeDescriptor[1] is the write end
  if (pipe(pipeDescriptor) == -1) {
    perror("could not create the pipe");
    exit(1);
  }

  if (!fork())
  {
    printf(" CHILD: writing to pipe and exiting\n");
    // 5 bytes: the four characters of "test" plus the terminating '\0'
    write(pipeDescriptor[1], "test", 5);
  }
  else
  {
    char buffer[5];
    printf("PARENT: reading from pipe\n");
    read(pipeDescriptor[0], buffer, 5);
    printf("PARENT: read \"%s\"\n", buffer);
    // wait for the child to terminate (declared in sys/wait.h)
    wait(NULL);
  }

  return 0;
}


To learn more about pipes, refer to the man page

man pipe
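
When the amount of data is not known in advance, the reader usually reads until end-of-file. For that to work, every process has to close the pipe ends it does not use, because read only reports end-of-file once all copies of the write end are closed. The sketch below illustrates this pattern; the function name read_from_child is an assumption made for the example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that writes `message` into a pipe and
// collects everything the child wrote in `out` (at most size-1 bytes,
// null-terminated). Returns the number of bytes read or -1 on error.
ssize_t read_from_child(const char* message, char* out, size_t size)
{
  int fds[2];
  if (pipe(fds) == -1) {
    perror("pipe");
    return -1;
  }

  pid_t pid = fork();
  if (pid == -1) {
    perror("fork");
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  if (pid == 0) {
    // child: only writes, so it closes the read end first
    close(fds[0]);
    if (write(fds[1], message, strlen(message)) == -1) {
      _exit(1);
    }
    close(fds[1]);
    _exit(0);
  }

  // parent: only reads. Closing its copy of the write end lets read()
  // return 0 (end-of-file) once the child has closed its end.
  close(fds[1]);
  size_t total = 0;
  ssize_t n;
  while (total < size - 1 &&
         (n = read(fds[0], out + total, size - 1 - total)) > 0) {
    total += (size_t)n;
  }
  out[total] = '\0';
  close(fds[0]);
  waitpid(pid, NULL, 0);
  return (ssize_t)total;
}
```

If the parent kept its copy of the write end open, the read loop would block forever after the child's data had been consumed.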


dmesg


D-Bus


C++ Notes

Platform detection

To detect whether code is compiled for the Linux platform you can check if __linux__ is defined.

#ifdef __linux__
// Linux-specific code goes here
#endif

(here you can find preprocessor definitions for many operating systems)
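
The same check can be combined with the macros of other platforms to select code at compile time. The sketch below assumes the commonly predefined macros _WIN32 and __APPLE__ for Windows and Apple platforms; the function name platform_name is made up for this example.

```c
// Hypothetical helper: returns a name for the platform the code was
// compiled for, chosen by the preprocessor at compile time.
const char* platform_name(void)
{
#if defined(__linux__)
  return "Linux";
#elif defined(_WIN32)
  return "Windows";
#elif defined(__APPLE__)
  return "macOS or another Apple platform";
#else
  return "unknown";
#endif
}
```

Because the selection happens in the preprocessor, only one of the return statements is compiled into the binary.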