Linux: Difference between revisions
Revision as of 2022-11-27T14:58:16
Typical Boot Procedure
- The primary bootloader, which is usually a fixed part of the hardware, recognizes bootable media.
- On PC-compatible systems, UEFI or BIOS is the primary bootloader.
- On BIOS systems, the Master Boot Record (MBR) of a connected storage device is searched for a boot executable. The MBR is the first sector of the device, separate from the partitions that hold the actual data. BIOS is an old system, dating back to the early 1980s, but it is not yet obsolete.
- On a UEFI system, which supports storage media with much larger capacity, the GPT partition table is used to determine whether the storage device provides an EFI System Partition containing the folder /efi with a subfolder that contains the secondary bootloader, for example /efi/microsoft or /efi/grub. After a Linux system has been booted by UEFI firmware, you can open a terminal and inspect the partition table by typing sudo gdisk /dev/sda
- If a secondary bootloader is found on the bootable media, it is loaded.
- On PC compatibles the secondary bootloader is usually GRUB.
- On Arm Cortex-A based systems, this is often U-Boot.
- Some systems, such as the Raspberry Pi, have proprietary bootloaders.
- To support the bootloader, a standard interface called LBA (logical block addressing) exists for storage media. This interface is not optimized for the capabilities of the storage device, but it is sufficient for the bootloader to search for a bootable operating system, which will later load the appropriate driver for the storage media.
- The secondary bootloader loads the Linux kernel.
- The kernel loads an initial root file system into memory (often initramfs), which includes some basic device drivers, for example display drivers for showing graphical output during the boot sequence.
- The kernel starts the first process, the init process, which starts all the system processes (a.k.a. services) configured to run when the system launches.
- Traditionally, the init program at /sbin/init (SysV init) started the services by running shell scripts. Nowadays, many Linux distributions use systemd instead, where the services are described in unit files and started by the systemd binary rather than by shell scripts.
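The MBR and GPT checks mentioned in the list above can be sketched in C. This is only a minimal illustration and not how any particular firmware is implemented. It assumes 512-byte logical sectors and works on a buffer that holds the first two sectors of a disk. The function names are hypothetical. The two signatures it looks for are the boot signature bytes 0x55 0xAA at the end of the first sector and the text "EFI PART" at the start of the GPT header in the second sector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 512-byte logical sectors, so LBA 0 occupies bytes 0..511
 * of the buffer and LBA 1 starts at byte offset 512. */
enum { SECTOR_SIZE = 512 };

/* Returns 1 if the first sector ends with the boot signature 0x55 0xAA. */
int has_mbr_boot_signature(const unsigned char *disk, size_t len)
{
    assert(disk != NULL);
    if (len < SECTOR_SIZE)
        return 0;
    return disk[510] == 0x55 && disk[511] == 0xAA;
}

/* Returns 1 if the second sector starts with the GPT header
 * signature "EFI PART". */
int has_gpt_header(const unsigned char *disk, size_t len)
{
    assert(disk != NULL);
    if (len < SECTOR_SIZE + 8)
        return 0;
    return memcmp(disk + SECTOR_SIZE, "EFI PART", 8) == 0;
}
```

To try this on a real device, the first 1024 bytes of the device file (for example /dev/sda, which usually requires root privileges) would have to be read into the buffer first. Note that a GPT disk normally also carries a protective MBR, so both functions may return 1 for the same disk.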
Ulrich Drepper's How To Write Shared Libraries is a good resource for learning more about the ELF format, about optimizing your shared objects, and how to control the visibility of program symbols in the generated binaries.
Tracing system calls
strace is a command that lets you trace the system calls made by the command that is passed as the argument. For example,
strace echo "hello"
prints
execve("/usr/bin/echo", ["echo", "hello"], 0x7ffefa8f4928 /* 42 vars */) = 0
brk(NULL) = 0x561cf6532000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff61293910) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85c44000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=53267, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 53267, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85c36000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\237\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0i8\235HZ\227\223\333\350s\360\352,\223\340."..., 68, 896) = 68
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2260560, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf85a0e000
mmap(0x7faf85a36000, 1658880, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7faf85a36000
mmap(0x7faf85bcb000, 360448, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bd000) = 0x7faf85bcb000
mmap(0x7faf85c23000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x214000) = 0x7faf85c23000
mmap(0x7faf85c29000, 52816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7faf85c29000
close(3) = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf85a0b000
arch_prctl(ARCH_SET_FS, 0x7faf85a0b740) = 0
set_tid_address(0x7faf85a0ba10) = 1874
set_robust_list(0x7faf85a0ba20, 24) = 0
rseq(0x7faf85a0c0e0, 0x20, 0, 0x53053053) = 0
mprotect(0x7faf85c23000, 16384, PROT_READ) = 0
mprotect(0x561cf5067000, 4096, PROT_READ) = 0
mprotect(0x7faf85c7e000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7faf85c36000, 53267) = 0
getrandom("\x01\x62\x8d\xf4\xec\xea\xd8\x4e", 8, GRND_NONBLOCK) = 8
brk(NULL) = 0x561cf6532000
brk(0x561cf6553000) = 0x561cf6553000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=6070224, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 6070224, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf85441000
close(3) = 0
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
write(1, "hello\n", 6) = 6
close(1) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++
The output lists the system calls made by echo in the order in which they are made.
Instead of printing the system calls as they are made, strace can print a short summary of which system calls were made, how many times each was called, and how much time was spent in it.
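To connect the trace with source code, the following minimal sketch issues the write system call directly through the generic syscall() wrapper from glibc instead of going through stdio. The function name raw_write is hypothetical. Called with file descriptor 1 and the text "hello\n" under strace, it should show up as a write(1, "hello\n", 6) = 6 line like the one near the end of the trace above.

```c
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Writes msg to fd through the raw write system call, bypassing stdio
 * buffering. Returns the number of bytes written, or -1 on error
 * (errno is set by the syscall() wrapper). */
long raw_write(int fd, const char *msg)
{
    assert(msg != NULL);
    return syscall(SYS_write, fd, msg, strlen(msg));
}
```

In everyday code the write() function from unistd.h would be used instead. The explicit syscall() form is chosen here only to make the mapping between a C call and a line of strace output visible.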
strace -c echo "hello"
prints
hello
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0,00    0,000000           0         1           read
  0,00    0,000000           0         1           write
  0,00    0,000000           0         5           close
  0,00    0,000000           0         9           mmap
  0,00    0,000000           0         3           mprotect
  0,00    0,000000           0         1           munmap
  0,00    0,000000           0         3           brk
  0,00    0,000000           0         4           pread64
  0,00    0,000000           0         1         1 access
  0,00    0,000000           0         1           execve
  0,00    0,000000           0         2         1 arch_prctl
  0,00    0,000000           0         1           set_tid_address
  0,00    0,000000           0         3           openat
  0,00    0,000000           0         4           newfstatat
  0,00    0,000000           0         1           set_robust_list
  0,00    0,000000           0         1           prlimit64
  0,00    0,000000           0         1           getrandom
  0,00    0,000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100,00    0,000000           0        43         2 total
Similarly, ltrace lists the calls to library functions made by the command that is passed to it as the argument. On Debian-based distributions, it can be installed with
sudo apt install ltrace
Signals
Signals are software interrupts that the kernel can send to a process. The process may react to a signal or choose to ignore it. The two signals SIGKILL and SIGSTOP, however, cannot be caught, blocked, or ignored by the process.
An overview of the signals can be found in the man page about signals:
man 7 signal
Some command-line tools perform useful actions when certain signals are sent to them. For example, the dd tool prints the progress of the copying task when the SIGUSR1 signal is sent to it via the kill command.
dd if=${HOME}/image.iso of=/dev/sdb &
kill -s USR1 $(pidof dd)
Interprocess Communication Mechanisms
Linux supports System V shared memory segments, which can be used for interprocess communication.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHM_SIZE 1024 // size of the segment in bytes (example value)

int main(void)
{
    // creating an IPC key
    key_t segmentKey = ftok("path_to_some_existing_file", 'R');
    if (segmentKey == -1) {
        perror("could not create IPC key");
        exit(1);
    }

    // accessing the shared memory segment, creating it if it doesn't exist
    int sharedMemoryID = shmget(segmentKey, SHM_SIZE, 0644 | IPC_CREAT);
    if (sharedMemoryID == -1) {
        perror("could not access (or create) the shared memory segment");
        exit(1);
    }

    // attaching a pointer; shmat returns (void*)-1 on failure
    void* memoryPtr = shmat(sharedMemoryID, NULL, 0);
    if (memoryPtr == (void*)-1) {
        perror("could not attach a pointer to the shared memory segment");
        exit(1);
    }

    return 0;
}
To learn more about shared memory segments, type
man ftok
man shmget
man shmat
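A System V shared memory segment stays in the system until it is removed explicitly, so a program should also detach and remove it when it is no longer needed. The following minimal sketch walks through the whole life cycle in a single process. It uses IPC_PRIVATE instead of an ftok() key, which is an assumption made here to keep the example self-contained. The function name shm_round_trip is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Creates a private segment, copies msg into it, copies it back into out,
 * then detaches and removes the segment.
 * Returns 0 on success and -1 on any error. */
int shm_round_trip(const char *msg, char *out, size_t out_size)
{
    assert(msg != NULL && out != NULL);
    size_t len = strlen(msg) + 1; /* include the terminating '\0' */
    if (len > out_size)
        return -1;

    /* IPC_PRIVATE requests a fresh segment that is not tied to a key. */
    int id = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    char *mem = shmat(id, NULL, 0);
    if (mem == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    memcpy(mem, msg, len); /* write through the attached pointer */
    memcpy(out, mem, len); /* read the same bytes back */

    int rc = 0;
    if (shmdt(mem) == -1)
        rc = -1;
    /* Without IPC_RMID the segment would outlive the process. */
    if (shmctl(id, IPC_RMID, NULL) == -1)
        rc = -1;
    return rc;
}
```

With two cooperating processes, both would call shmget() with the same ftok() key, and only one of them would remove the segment at the end. Segments that were left behind can be listed with the ipcs command.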
Memory Mapped Files
Linux can map files into memory regions, which lets a program work on the file contents as if they were ordinary memory and is often much more efficient than explicit reads and writes. For example, pointers can be used in a C/C++ program to edit the contents. The memory-mapped region can be shared between processes. This allows the processes to operate on large files, for example in a publisher-subscriber architecture, while conserving memory. Unlike the shared memory method described above, the contents of the shared region are written to a file and thus persist between sessions and even system reboots.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

static const int MAX_INPUT_LENGTH = 50;

int main(int argc, char** argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <file> read|write\n", argv[0]);
        return 1;
    }

    // O_CREAT requires the file mode as a third argument
    int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    // the file must be at least as large as the mapping,
    // so its size is set to the mapping size
    if (ftruncate(fd, MAX_INPUT_LENGTH) == -1) {
        perror("ftruncate");
        return 1;
    }

    char* shared_mem = mmap(NULL, MAX_INPUT_LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    if (shared_mem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    if (!strcmp(argv[2], "read")) {
        // continuously prints the text written by the 'write' process to the terminal
        while (1) {
            shared_mem[MAX_INPUT_LENGTH-1] = '\0';
            printf("%s", shared_mem);
            fflush(stdout);
            sleep(1);
        }
    } else if (!strcmp(argv[2], "write")) {
        // continuously copies the input from the terminal into the shared region
        while (fgets(shared_mem, MAX_INPUT_LENGTH, stdin)) {
        }
    } else {
        printf("Unsupported command\n");
    }
    return 0;
}
To learn more about memory mapped files, refer to the man page
man mmap
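The claim that a mapped file can be edited through a pointer and that the changes end up in the file can be illustrated with a short sketch. It maps an existing file, converts its ASCII letters to upper case in memory, and flushes the change with msync(). The function name uppercase_file_in_place is hypothetical, and the sketch assumes a regular, non-empty file that fits into the address space.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Upper-cases the ASCII letters of a file in place through a shared
 * mapping. Returns the number of bytes processed, or -1 on error. */
long uppercase_file_in_place(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return -1; /* mmap() rejects a length of zero */
    }
    size_t len = (size_t)st.st_size;

    char *data = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    /* Edit the file contents as if they were an ordinary char array. */
    for (size_t i = 0; i < len; i++)
        if (data[i] >= 'a' && data[i] <= 'z')
            data[i] = (char)(data[i] - 'a' + 'A');

    /* MS_SYNC waits until the modified pages have been written back. */
    int rc = msync(data, len, MS_SYNC);
    if (munmap(data, len) == -1)
        rc = -1;
    return rc == -1 ? -1 : (long)len;
}
```

Because the mapping uses MAP_SHARED, other processes that map or read the same file see the modified contents. With MAP_PRIVATE the changes would stay local to the process and would not reach the file.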
Pipes
Pipes are another IPC mechanism where one process can send messages to another.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int pipeDescriptor[2];
    if (pipe(pipeDescriptor) == -1) {
        perror("could not create the pipe");
        exit(1);
    }

    // reading from pipeDescriptor[0], writing to pipeDescriptor[1]
    if (!fork()) {
        printf(" CHILD: writing to pipe and exiting\n");
        write(pipeDescriptor[1], "test", 5);
    } else {
        char buffer[5];
        printf("PARENT: reading from pipe\n");
        read(pipeDescriptor[0], buffer, 5);
        printf("PARENT: read \"%s\"\n", buffer);
        wait(NULL);
    }
    return 0;
}
To learn more about pipes, refer to the man page
man pipe
dmesg
D-Bus
C++ Notes
Platform detection
To detect whether code is compiled for the Linux platform you can check if __linux__ is defined.
#ifdef __linux__
(here you can find preprocessor definitions for many operating systems)
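A complete conditional block might look like the following minimal sketch. The function name platform_name is hypothetical. Besides __linux__, it checks _WIN32 and __APPLE__, which are the macros commonly predefined by compilers for Windows and Apple platforms. Note that Android toolchains also define __linux__, so the first branch covers Android as well.

```c
/* Returns a human-readable name for the platform the code was compiled
 * for, decided entirely at compile time by the preprocessor. */
const char *platform_name(void)
{
#if defined(__linux__)
    return "Linux";
#elif defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "Apple";
#else
    return "unknown";
#endif
}
```

Since the check happens at compile time, only one of the return statements ends up in the binary. Code inside such a block can therefore use headers and functions that exist only on that platform.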