Processes and Signals

How work gets organized, directed and communicated about

Marek Šuppa
Ondrej Jariabka
Adrián Matejov


Why UNIX-like for Data Science?

  • There is a very good chance you'll work with significant data loads

  • Being able to control (and especially stop) processes will be critical


Processes


History of Processes in Operating Systems

  • batch operating systems

  • single-process operating systems

  • multi-process/time-sharing operating systems

    • This is where the modern OSs fit in

Processes in Linux

  • identified by a unique Process ID (PID)

  • and a ton of attributes:

    • the user the process belongs to (the user who runs it)
    • PPID - the PID of its parent process
    • start time (STIME)
    • the terminal it is associated with (TTY)
    • the amount of "CPU time" it consumed (TIME)
    • the actual shell command that started it (CMD)
    • state

Process states

  • R: running or runnable (on run queue)
  • D: uninterruptible sleep
  • S: interruptible sleep (waiting for an event to complete)
  • T: stopped by job control signal
  • Z: defunct ("zombie") process, terminated but not reaped by its parent

Detailed process information

The best source is probably /proc/<PID>/status

$ cat /proc/540038/status | head
Name: ssh
Umask: 0022
State: S (sleeping)
Tgid: 540038
Ngid: 0
Pid: 540038
PPid: 37564
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
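
Individual fields can be pulled out of the same file for use in scripts. A minimal sketch, assuming a Linux system with /proc mounted; the helper name proc_field is made up for this example:

```shell
# proc_field FIELD PID: print the first value of FIELD from /proc/PID/status
# (proc_field is a hypothetical helper, not a standard command)
proc_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "/proc/$2/status"
}

# $$ expands to the PID of the shell running this script
name=$(proc_field Name "$$")
state=$(proc_field State "$$")
ppid=$(proc_field PPid "$$")

echo "PID $$ ($name) is in state $state, parent PID $ppid"
```

The state printed here is the one-letter code only; the file itself also carries the readable form, as in "S (sleeping)" above.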

Listing processes

  • ps
    • by default lists the processes the user is running in the current terminal
$ ps
PID TTY TIME CMD
3583901 pts/39 00:00:00 bash
3583948 pts/39 00:00:00 ps

Listing all processes

  • -e (or -A) lists all processes

  • -f does full-format listing

$ ps -e -f
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Oct12 ? 00:00:05 /usr/lib/systemd/systemd --switched-root --system --deserialize 34
root 2 0 0 Oct12 ? 00:00:00 [kthreadd]
root 3 2 0 Oct12 ? 00:00:00 [rcu_gp]
root 4 2 0 Oct12 ? 00:00:00 [rcu_par_gp]
root 6 2 0 Oct12 ? 00:00:00 [kworker/0:0H-kblockd]
root 9 2 0 Oct12 ? 00:00:00 [mm_percpu_wq]
root 10 2 0 Oct12 ? 00:00:10 [ksoftirqd/0]
root 11 2 0 Oct12 ? 00:02:10 [rcu_sched]
root 12 2 0 Oct12 ? 00:00:00 [migration/0]
[ ... output omitted ... ]
mrshu 550723 550241 0 13:32 pts/9 00:00:00 ps -e -f

Listing specific information

  • -o lists specific fields, such as
    • cmd
    • pid
    • ppid
    • state
    • user
    • ... and many more you can find in the man page
$ ps -o pid,state,cmd
PID S CMD
550241 S -fish
551637 S bash
551709 R ps -o pid,state,cmd
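
Appending = to a field name (as in state=) gives it an empty header, and when all headers are empty ps prints no header line at all. That makes the output easy to feed into other tools. A small sketch that tallies all processes on the system by state (the variable name tally is arbitrary):

```shell
# -e selects every process, -o state= prints only the one-letter state code
# without a header line; sort + uniq -c then count how often each code occurs
tally=$(ps -e -o state= | sort | uniq -c | sort -rn)
echo "$tally"
```

Each output line is a count followed by a state code from the "Process states" slide.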

Listing the process hierarchy

  • -H makes ps show the processes in a "tree" view.
$ ps -H
 47754 pts/13 00:00:00 zsh
 47827 pts/13 00:00:00   bash
553122 pts/13 00:00:00     bash
553182 pts/13 00:00:00       ps

In the listing above, zsh runs bash, which runs another bash, which in turn runs ps.

This can also be visualized using the pstree command.
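
The same parent/child chain can also be reconstructed by hand, by following PPID links upward from the current shell. A sketch using ps -o and ps -p; comm is the ps field holding the bare command name, and the variable names are arbitrary:

```shell
pid=$$          # start at the shell running this script
chain=""
while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
    name=$(ps -o comm= -p "$pid")
    [ -n "$name" ] || break                      # PID not visible from here, stop
    chain="$name($pid)${chain:+ -> $chain}"      # prepend, so the oldest ancestor comes first
    pid=$(ps -o ppid= -p "$pid" | tr -d ' ')     # move up to the parent
done
echo "$chain"
```

The result is a single line of the form name(PID) -> name(PID) -> ..., ending with the current shell.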


Listing processes by PIDs

  • -p pidlist
    • lists processes whose PIDs are in the comma-separated pidlist
$ ps -p 552921,549013,547031
PID TTY TIME CMD
547031 ? 00:00:05 python3
549013 ? 00:00:08 firefox
552921 ? 00:00:00 gnome-calendar

Listing processes of specific users

  • -u userlist
    • lists processes whose users are in the comma-separated userlist
$ ps -u joe123
PID TTY TIME CMD
138565 pts/4 00:04:23 vimx
138580 ? 00:00:51 python3
138594 ? 00:00:38 python3
138595 ? 00:00:16 python3
138596 ? 00:00:25 python3
138597 ? 00:00:18 python3
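
Combined with -o, the user filter is handy for finding out which of your own processes use the most memory. A sketch, assuming the procps version of ps found on most Linux distributions; rss is its field for resident memory in kilobytes, and top5 is just a variable name picked here:

```shell
# -u accepts user names or numeric IDs; id -u prints the current user's ID.
# Separate -o options with empty headers keep the output free of a header line,
# so sort -rn can order the lines numerically by the first column (rss).
top5=$(ps -u "$(id -u)" -o rss= -o pid= -o comm= | sort -rn | head -n 5)
echo "$top5"
```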

Signals


Signals as a concept

  • A way for processes to communicate

  • Takes place on the kernel level (i.e. it's very fast)

  • The bandwidth is limited though (you don't send a video this way)


Signals Overview

  • SIGSTOP (19)

    • suspend the process until it receives SIGCONT (18)
  • SIGHUP (1) -- "signal hang up"

    • originally it signaled that the terminal's modem line had literally been hung up
    • often used to make a long-running process re-initialize (e.g. the Apache server)
    • in modern usage it means "the controlling terminal has closed"
  • SIGTERM (15)

    • terminate a process gracefully
    • the process gets a chance to clean up before it terminates
  • SIGKILL (9)

    • terminate a process
    • this signal cannot be caught -- the process just dies right away

Except for SIGSTOP and SIGKILL, programs can handle these signals in their own way.
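
In a shell script, "handling a signal in its own way" is done with the trap builtin, which registers a handler for a signal. A minimal sketch; worker and log are names chosen for this example, and the kill command used to send the signal is the standard one:

```shell
log=$(mktemp)

worker() {
    # Install a handler: on SIGTERM, record a message and exit cleanly
    trap 'echo "got SIGTERM, cleaning up" >> "$log"; exit 0' TERM
    while true; do
        sleep 1     # the shell runs the handler once the current sleep returns
    done
}

worker &                # run the worker in the background
pid=$!                  # $! holds the PID of the last background command
sleep 1                 # give the worker time to install its handler
kill -s TERM "$pid"
wait "$pid"             # wait returns the worker's exit status
status=$?
cat "$log"
echo "worker exit status: $status"
```

Had the worker been sent SIGKILL instead, it would have ended without the message being logged, since that signal never reaches the handler.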


Sending Signals

On most Linux distributions, this is done via the kill command.

  • kill -s signal pid
    • signal is the name of the signal (like SIGKILL)
    • pid is the PID of the process to send the signal to
$ kill -s SIGKILL 3215

There is also a shorter version:

$ kill -SIGKILL 3215
$ kill -KILL 3215
  • Each signal is identified by its own number (in parentheses on the previous slide).
    • These can be listed via kill -L
$ kill -L
 1 HUP      2 INT      3 QUIT     4 ILL      5 TRAP     6 ABRT     6 IOT
 7 BUS      8 FPE      9 KILL    10 USR1    11 SEGV    12 USR2    13 PIPE
14 ALRM    15 TERM    16 STKFLT  17 CHLD    17 CLD     18 CONT    19 STOP
20 TSTP    21 TTIN    22 TTOU    23 URG     24 XCPU    25 XFSZ    26 VTALRM
27 PROF    28 WINCH   29 IO      29 POLL    30 PWR     31 SYS     34 RTMIN
64 RTMAX
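
The lowercase variant, kill -l, is the form specified by POSIX, and when given a number it prints the matching signal name. A short sketch of looking names up from a script (the variable names are arbitrary):

```shell
# kill -l NUMBER prints the name that belongs to a signal number
name15=$(kill -l 15)
name9=$(kill -l 9)
echo "15 -> $name15, 9 -> $name9"

# When a process is killed by a signal, the shell reports its exit status as
# 128 + signal number, and kill -l can decode such a status as well
from_status=$(kill -l 137)
echo "exit status 137 -> $from_status"
```

This is useful when a job log only shows a bare exit status such as 137 or 143.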

Once again, this is well within the UNIX/POSIX philosophy: short yet expressive beats verbose and redundant, mostly because typing used to be rather expensive.

Process Termination with Signals

  • The standard approach is to first send SIGTERM (15) to a process we want to terminate

  • This is done so that the process can finish up cleanly

$ kill -15 3215
  • And if that does not happen, bring out the big guns by sending SIGKILL (9)
$ kill -9 3215
  • killall process
    • kill processes by name (process)
    • sends SIGTERM by default
    • a specific signal can be given the same way as with kill
$ killall firefox

And if that does not help...

$ killall -9 firefox
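
The "SIGTERM first, SIGKILL only if needed" routine can be wrapped in a small function. A sketch under a few assumptions: is_running and terminate are names invented here, the grace period defaults to 5 seconds, and liveness is checked with ps -o state (a zombie counts as terminated):

```shell
# is_running PID: true if PID exists and is not a zombie
is_running() {
    state=$(ps -o state= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$state" ] && [ "$state" != "Z" ]
}

# terminate PID [GRACE]: SIGTERM first, SIGKILL if still running after GRACE seconds
terminate() {
    target=$1
    grace=${2:-5}
    kill -s TERM "$target" 2>/dev/null || return 0   # already gone
    while [ "$grace" -gt 0 ]; do
        is_running "$target" || return 0             # exited after SIGTERM
        sleep 1
        grace=$((grace - 1))
    done
    is_running "$target" || return 0
    echo "PID $target ignored SIGTERM, sending SIGKILL" >&2
    kill -s KILL "$target"
}

# Demo: a background subshell that ignores SIGTERM
( trap '' TERM; while true; do sleep 1; done ) &
stubborn=$!
sleep 1                     # let the subshell set up its trap first
terminate "$stubborn" 2
status=0
wait "$stubborn" || status=$?
echo "exit status: $status" # expected: 137, i.e. 128 + 9 (SIGKILL)
```

A process that does react to SIGTERM never sees the SIGKILL, because the function returns as soon as the process is gone.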

Job control in Bash

In other words, how to use signals to control processes from Bash


Stop and suspend a running process

  • A "normal" process in bash is said to be started in the foreground

    • that is, it reads its input from and writes its output to the terminal
  • Ctrl+C

    • sends the SIGINT (2) signal (similar to SIGTERM)
    • interrupts and generally makes the running process stop
  • Ctrl+Z

    • sends the SIGTSTP (20) signal (similar to SIGSTOP)
    • suspends the program and returns back to the shell
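
Ctrl+Z only works on the foreground process of an interactive terminal. The same suspend/resume cycle can be sketched non-interactively with SIGSTOP and SIGCONT, checking the state code with ps along the way (state_of is a helper made up for this example, and sleep 30 stands in for a real long-running job):

```shell
# state_of PID: one-letter state code of a process
state_of() { ps -o state= -p "$1" | tr -d ' '; }

sleep 30 &                  # a stand-in for some long-running job
pid=$!

kill -s STOP "$pid"         # suspend it
sleep 1
stopped=$(state_of "$pid")  # expected: T (stopped)
kill -s CONT "$pid"         # let it continue
sleep 1
resumed=$(state_of "$pid")  # no longer T

echo "after SIGSTOP: $stopped, after SIGCONT: $resumed"
kill -s TERM "$pid"         # clean up the demo process
wait "$pid" 2>/dev/null || true
```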

Foreground and background processes

Let's consider a long-running command like cp movie.mp4 ~/Movies

  • cp movie.mp4 ~/Movies
    • runs the command in the foreground
    • until it finishes, it is not possible to run (or even type) any further command
    • Ctrl+C will terminate it
    • Ctrl+Z will "suspend" the process (it will be stopped)
    • once stopped, bg will resume its execution in the background
    • conversely, fg will resume its execution in the foreground
  • cp movie.mp4 ~/Movies &

    • runs the command in the background right from the start
    • the shell is available straight away
    • Ctrl+C won't work on it (it is not in the foreground)
    • fg will bring it to the foreground
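
In scripts, where there is no interactive fg or bg, & is usually paired with the wait builtin and the special variable $!. A minimal sketch with two made-up jobs running side by side:

```shell
# Two stand-in "jobs"; the second one fails on purpose with exit status 3
sleep 2 &
first=$!                    # $! is the PID of the most recent background command
( sleep 1; exit 3 ) &
second=$!

echo "both jobs are running, the script is free to do other work"

# wait PID blocks until that process ends and returns its exit status
first_status=0
wait "$first" || first_status=$?
second_status=0
wait "$second" || second_status=$?
echo "first: $first_status, second: $second_status"
```

Collecting each exit status separately makes it possible to tell which of the parallel jobs failed.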

Job control with jobs

  • jobs
    • Bash builtin command (not a separate program provided by the operating system)
    • lists the jobs started from the current shell that are still running or stopped
    • each job has its ID (in brackets)
    • these can be used to reference it in fg, bg or kill, e.g. fg %1
    • by default fg and bg act on the current job (marked with + in the listing)
$ man ps
[1]+ Stopped man ps
$ eog &
[2] 32165
$ jobs
[1]+ Stopped man ps
[2]- Running eog &
$ kill -15 %2
[2]- Terminated eog
$ fg
man ps

Useful commands


wc

  • Stands for "word count" (despite what the name may suggest...).
  • Shows the number of lines, words and bytes in a file (-m counts characters instead of bytes)
$ wc /etc/passwd
54 134 3062 /etc/passwd
$ wc -l /etc/passwd
54 /etc/passwd
$ wc -w /etc/passwd
134 /etc/passwd
$ wc -m /etc/passwd
3062 /etc/passwd

Works with data piped in from other commands as well:

$ cat /etc/passwd | wc -m
3062
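
This ties in nicely with ps: since a headerless listing has one line per process, wc -l turns it into a process count. A small sketch (total and mine are arbitrary variable names):

```shell
# Number of processes on the system: one headerless line per process
total=$(ps -e -o pid= | wc -l)

# Number of processes owned by the current user
mine=$(ps -u "$(id -u)" -o pid= | wc -l)

echo "$mine of $total processes belong to UID $(id -u)"
```

Both counts include the ps process doing the listing.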