With grep, find and xargs
Marek Šuppa
Ondrej Jariabka
Adrián Matejov
Why so many commands? Isn't GUI nicer, faster and overall better?
The answer is yes: the GUI is nicer and faster for discovering the available options, but when it comes to pure execution, the CLI tends to be unmatched in speed.
In other words, by using a mouse you don't build muscle memory.
Great tools for your toolbox
Parallel processing
"The clumsiness of people who have to engage their brain at every step is unbearably painful to watch, at least to me, and that's what the novice-friendly software makes people do, because there's no elegance in them, it's just a mass of features to be learned by rote... it's sadly obvious that we are moving into a way of working that is predominantly conscious, for which I believe the human brain was never prepared. we no longer have the time to let skills sink into the autonomous nervous system, as it were, and even if we try, the criminal in Redmond, WA, has a new, incompatible version out by the time we learned the last version... One of the joys of learning to ride a bicycle is to stop thinking about it -- the feeling that I had successfully programmed my body to master a bicycle at least thrilled me as a kid (except I didn't know the verb "to program")... we need to communicate to users that learning to use Emacs is like learning to ride a bicycle -- it does take some time and effort, it's a worth-while skill to have, and then you never forget. I firmly believe that the novice-friendly software is like giving people several sets of supporting wheels so they won't tilt, but could get moving right away, and then never taking them off, preferring that they keep using them and moving so slowly that they always need them. of course, if you argue that they should remove the supporting wheels to such users, they will, admittedly correctly, argue that they will fall splat on the ground and ruin their three-piece suits. clearly a no-go."
http://groups.google.com/group/comp.emacs/msg/821a0f04bab91864?dmode=source&output=gplain
grep
grep
Suppose you would like to find a specific line (say "guide to the galaxy") that you know is saved in some file on the disk.
How would you go about doing that?
Well, it turns out grep can be of help!
Up until now we used it in the following way:
$ grep [regex] [file]
But it can also be applied recursively on directories and their content.
grep with recursion
grep -R
$ tree
.
├── a
│   ├── file1.txt
│   └── file2.txt
├── b
│   └── file40.txt
├── c
│   └── file3.txt
├── d
└── g
    └── book.txt

5 directories, 5 files
$ grep -Ri 'guide to the galaxy'
g/book.txt:The Hitchiker's Guide to the Galaxy
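To try the recursive search without touching real files, the sketch below builds a small throwaway tree first. The directory layout and file contents are made up for illustration; only the -R and -i options come from the slide.

```shell
# Build a throwaway tree (hypothetical contents) and search it recursively.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/a" "$demo_dir/g"
echo "nothing to see here" > "$demo_dir/a/file1.txt"
echo "The Hitchhiker's Guide to the Galaxy" > "$demo_dir/g/book.txt"

# -R: descend into directories, -i: ignore case.
# The explicit "." makes grep prefix every match with "./".
matches=$(cd "$demo_dir" && grep -Ri 'guide to the galaxy' .)
echo "$matches"

rm -rf "$demo_dir"
```

Each match is printed as path:line, so a follow-up command can tell which file it came from.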
grep's Extended Regular Expressions
grep uses so-called "basic regular expressions" (BRE) by default.
What we normally consider "regular expressions" are actually "extended regular expressions" (ERE).
The main difference is in backslashing the metacharacters:
BRE: \?, \+, \{, \}
ERE: ?, +, {, }
EREs can be turned on in grep by passing the -E option.
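The difference can be checked directly by running the same pattern in both dialects. This is a minimal sketch: the three sample tokens are invented, and the \+ form in BRE is assumed to be available as a GNU grep extension.

```shell
# Hypothetical sample input: one login-like token per line.
sample=$(printf 'novak123\nroy\nrob5\n')

# BRE (default): quantifiers are backslashed (\+ is a GNU extension).
bre=$(printf '%s\n' "$sample" | grep '[a-z]\+[0-9]\{1,3\}')

# ERE (-E): the same pattern without the backslashes.
ere=$(printf '%s\n' "$sample" | grep -E '[a-z]+[0-9]{1,3}')

echo "BRE matches: $bre"
echo "ERE matches: $ere"
```

Both commands select the same lines; the ERE form is simply easier to read.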
Sample: find all Uniba login IDs in the current directory:
$ grep -Rn -E '[a-z]+[0-9]{1,3}'
l/access.log:11:Unauthorized access attmept from login ID novak123
l/access.log:18:Successfuly authorized roy47
z/login.py:4:log_in(username='rob5')
But what if we only wanted to search in, say, .py files?
find
One-stop solution for figuring out what is where
find .
$ tree
.
├── a
│   ├── file1.txt
│   └── file2.txt
├── b
│   └── file40.txt
├── c
│   └── file3.txt
└── d

4 directories, 4 files
$ find
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
find: looking up filenames
Can be done with the -name option / flag
find [path] -name [pattern]
[pattern] supports the following wildcards:
  *: matches any string (of any length)
  ?: matches a single character
  [ ]: matches a single character from the specified character class
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -name "*.txt"
./c/file3.txt
./b/file40.txt
./a/file2.txt
./a/file1.txt
$ find . -name "file?.txt"
./c/file3.txt
./a/file2.txt
./a/file1.txt
$ find . -name 'file[2-4].txt'
./c/file3.txt
./a/file2.txt
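The wildcard examples above can be reproduced in a scratch directory. The sketch below recreates the slide's tree with empty files; the temporary directory and the use of sort (find gives no ordering guarantee) are additions for the sake of a repeatable demo.

```shell
# Recreate the slide's tree in a temporary directory (file contents do not matter here).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/a" "$demo_dir/b" "$demo_dir/c" "$demo_dir/d"
touch "$demo_dir/a/file1.txt" "$demo_dir/a/file2.txt" \
      "$demo_dir/b/file40.txt" "$demo_dir/c/file3.txt"

# Quote the pattern so the shell does not expand it before find sees it.
single_digit=$(cd "$demo_dir" && find . -name 'file?.txt' | sort)
in_range=$(cd "$demo_dir" && find . -name 'file[2-4].txt' | sort)

echo "file?.txt:"
echo "$single_digit"
echo "file[2-4].txt:"
echo "$in_range"

rm -rf "$demo_dir"
```

Note that file40.txt matches neither pattern: ? and [ ] each stand for exactly one character.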
find: looking up paths
For matching full paths, -path is a good choice
find [path] -path [pattern]
[pattern] is applied on the whole path, not just the name
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -path '*a*'
./a
./a/file2.txt
./a/file1.txt
$ find . -path './?'
./d
./c
./b
./a
$ find . -path '*c/file[0-9].txt'
./c/file3.txt
The pattern needs to match the whole path (this example does not):
$ find . -path '*c/file[0-9]'
find: looking up via regex
If wildcards do not suffice, we can also utilize the "full power of regexes"
find [path] -regex [pattern]
[pattern] can be any "standard" regular expression (GNU find uses Emacs-style syntax by default; -regextype selects another dialect)
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -regex '.*file.*'
./c/file3.txt
./b/file40.txt
./a/file2.txt
./a/file1.txt
$ find . -regex '.*[bc]/file.*'
./c/file3.txt
./b/file40.txt
The regex needs to match the whole path, so the examples below match nothing:
$ find . -regex 'file'
$ find . -regex '[bc]/file.*'
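The whole-path requirement is easy to trip over, so here is a small side-by-side check. It is a sketch on an invented two-file tree and assumes a find that supports -regex (GNU and BSD find do; it is not part of older POSIX).

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/b" "$demo_dir/c"
touch "$demo_dir/b/file40.txt" "$demo_dir/c/file3.txt"

# No leading .* : every path starts with "./", so the regex can never cover it.
partial=$(cd "$demo_dir" && find . -regex '[bc]/file.*' | sort)

# With a leading .* the regex covers the whole path.
whole=$(cd "$demo_dir" && find . -regex '.*[bc]/file.*' | sort)

echo "without leading .*: [$partial]"
echo "with leading .*:"
echo "$whole"

rm -rf "$demo_dir"
```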
find: looking up via attributes
Files and directories have various attributes find can take a look at:
type
timestamps
  last status change (-ctime)
  last access (-atime)
  last modification (-mtime)
file size
owner and group
permissions
find: looking up by file/dir type
find [path] -type [type]
[type] can be one of the following:
  f: "normal" file
  d: directory
  b / c: block/character device
  p: named pipe
  l: symlink
  s: socket
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -type d
.
./d
./c
./b
./a
$ find . -type f
./c/file3.txt
./b/file40.txt
./a/file2.txt
./a/file1.txt
find: looking up by timestamps
find [path] -mmin [n]: modified [n] minutes ago
find [path] -mtime [n]: modified [n] days ago
Flags for other timestamps:
  -cmin / -ctime: last status change
  -amin / -atime: last access
$ ls -al
drwxrwxr-x 7 mrshu mrshu 4096 Nov 9 10:34 .
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 11:39 a
drwxrwx--x 2 mrshu mrshu 4096 Nov 7 11:40 b
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 11:57 c
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 12:36 d
drwxrwxr-x 2 mrshu mrshu 4096 Nov 9 10:34 e
$ date
Mon 09 Nov 2020 10:55:23 AM UTC
Last modified exactly 22 minutes ago:
$ find . -mmin 22
.
./e
Last modified 1 full day ago (that is, between 24 and 48 hours ago):
$ find . -mtime 1
./a
./d
./b
./c
find: looking up by size
find [path] -size [n]
[n] can be followed by various units:
  b for 512-byte blocks (the default)
  c for bytes
  k for kibibytes (units of 1024 bytes)
  M for mebibytes (units of 1048576 bytes)
  G for gibibytes (units of 1073741824 bytes)
find [path] -empty: matches empty files and directories
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -empty
./d
./c/file3.txt
./a/file1.txt
$ find . -size 0
./c/file3.txt
./a/file1.txt
find: a note on [n]
By default, [n] matches the exact value (of time/date or size)
This behaviour can be altered via the + and - prefixes:
  +[n]: more than [n]
  -[n]: less than [n]
  [n]: exactly [n]
$ find . -mtime -3
.
./a
./d
./b
./c
./e
$ find /boot -size +15M
/boot/initrd.img-5.4.0-47-generic
/boot/initrd.img-5.4.0-52-generic
/boot/initrd.img-5.4.0-51-generic
$ ls -hs /boot/initrd.img-5.4.0-47-generic
78M /boot/initrd.img-5.4.0-47-generic
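The prefixes are easiest to understand on files whose size and age are known in advance. The sketch below creates two such files; the file names are hypothetical, and backdating with a relative date in touch -d is assumed to be available (it is a GNU coreutils feature).

```shell
demo_dir=$(mktemp -d)
# Two files of known size: 1 KiB and 20 KiB of zero bytes.
head -c 1024 /dev/zero > "$demo_dir/small.bin"
head -c 20480 /dev/zero > "$demo_dir/big.bin"
# Backdate the small file (relative dates in touch -d assume GNU coreutils).
touch -d '5 days ago' "$demo_dir/small.bin"

# +10k: more than 10 units of 1024 bytes
larger=$(cd "$demo_dir" && find . -type f -size +10k)
# -mtime -3: modified less than 3 days ago
recent=$(cd "$demo_dir" && find . -type f -mtime -3)
# -mtime +3: modified more than 3 full days ago
old=$(cd "$demo_dir" && find . -type f -mtime +3)

echo "larger than 10k: $larger"
echo "modified within 3 days: $recent"
echo "older than 3 days: $old"

rm -rf "$demo_dir"
```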
find: looking up by user/group
find [path] -user [user]: matches files owned by [user]
find [path] -group [group]: matches files belonging to [group]
$ ls -l /etc
[ ... 160 lines omitted ... ]
drwxr-xr-x 2 root root 4096 Sep 19 19:52 sensors.d
-rw-r--r-- 1 root root 14464 Feb 16 2020 services
-rw-r----- 1 root shadow 1463 Sep 19 20:15 shadow
-rw-r----- 1 root shadow 1595 Sep 19 20:14 shadow-
-rw-r--r-- 1 root root 146 Jul 31 16:29 shells
drwxr-xr-x 2 root root 4096 Jul 31 16:28 skel
[ ... 32 lines omitted ... ]
$ find /etc/ -group shadow
/etc/shadow-
/etc/gshadow
/etc/shadow
/etc/gshadow-
find: looking up by permissions
find [path] -readable
find [path] -writable
find [path] -executable
$ find .
.
./d
./c
./c/file3.txt
./b
./b/file40.txt
./a
./a/file2.txt
./a/file1.txt
$ find . -executable
.
./d
./c
./b
./a
find: looking up by permissions II
find [path] -perm [mode]
[mode] can be specified in octal or symbolic format
[mode] can have various prefixes:
  [mode]: exactly [mode] permissions are set
  -[mode]: at least [mode] permissions are set
  /[mode]: some [mode] permissions are set
$ ls -al
total 24
drwxrwxr-x 6 mrshu mrshu 4096 Nov 7 12:36 .
drwxr-xr-x 7 mrshu mrshu 4096 Nov 9 10:16 ..
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 11:39 a
drwxrwx--x 2 mrshu mrshu 4096 Nov 7 11:40 b
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 11:57 c
drwxrwxr-x 2 mrshu mrshu 4096 Nov 7 12:36 d
$ find . -perm 771
./b
$ find . -perm -774
.
./a
./d
./c
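The three forms of -perm can be compared on files whose modes are set explicitly. This is a sketch with hypothetical file names; the /[mode] form is assumed to be available as in GNU find.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/exact" "$demo_dir/more" "$demo_dir/less"
chmod 640 "$demo_dir/exact"   # rw-r-----
chmod 664 "$demo_dir/more"    # rw-rw-r--
chmod 400 "$demo_dir/less"    # r--------

# 640: the mode is exactly rw-r-----
exactly=$(cd "$demo_dir" && find . -type f -perm 640 | sort)
# -640: all bits of 640 are set (extra bits are allowed)
at_least=$(cd "$demo_dir" && find . -type f -perm -640 | sort)
# /022: at least one of the group-write / other-write bits is set
any_of=$(cd "$demo_dir" && find . -type f -perm /022 | sort)

echo "exactly 640: $exactly"
echo "at least 640:"
echo "$at_least"
echo "any of 022: $any_of"

rm -rf "$demo_dir"
```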
find: combining search patterns
Name-level and attribute-level search patterns can easily be combined.
For example:
$ find . -name "*.txt" -empty
$ find . -name "*image*" -mmin -20
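When several tests are listed one after another, find requires all of them to hold. The sketch below checks this on three invented files, only one of which is both a .txt file and empty.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/empty.txt" "$demo_dir/empty.log"
echo "This is 40" > "$demo_dir/full.txt"

# Tests listed one after another are ANDed:
# the name must match *.txt AND the file must be empty.
both=$(cd "$demo_dir" && find . -type f -name '*.txt' -empty)
echo "$both"

rm -rf "$demo_dir"
```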
find: actions on matches
-delete: deletes each matched file/directory
-exec [command] \;: runs [command] for each matched file/directory
  {} is replaced with the matched file/directory
-ok [command] \;: like -exec, but asks for user confirmation before running the command
$ cat c/file3.txt
$ cat b/file40.txt
This is 40
$ cat a/file2.txt
This is us
$ cat a/file1.txt
$ find . -name "*.txt" -exec echo {} \;
./c/file3.txt
./b/file40.txt
./a/file2.txt
./a/file1.txt
$ find . -name "*.txt" -exec cat {} \;
This is 40
This is us
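Besides the \; terminator shown above, find also accepts + as a terminator, which passes many matches to a single command invocation. The slide does not cover it; the sketch below (with invented file names) only illustrates the difference by counting output lines of echo.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# \; runs the command once per match: three echo calls, three output lines.
per_file=$(cd "$demo_dir" && find . -name '*.txt' -exec echo {} \; | wc -l)

# + appends as many matches as fit to one command: one echo call, one output line.
batched=$(cd "$demo_dir" && find . -name '*.txt' -exec echo {} + | wc -l)

echo "lines with \\; : $per_file"
echo "lines with +  : $batched"

rm -rf "$demo_dir"
```

Starting fewer processes is also what makes batching attractive for commands such as rm.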
xargs
Constructing commands on the fly
xargs
Allows us to "parametrize" the commands we run by passing input via a pipe
Reads items from standard input and "applies" a command to them; with -I, the input is processed line by line and the command runs once per line
xargs [command]
[command] can be any command we would like to run on each line
-I {} will replace {} in the [command] with the input line
$ find . -type f | xargs -I{} echo "File: {}"
File: ./c/file3.txt
File: ./b/file40.txt
File: ./a/file2.txt
File: ./a/file1.txt
This can easily be used as a clearer alternative to -delete or -exec:
$ find . -type f -empty | xargs -I{} rm {}
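One caveat with piping file names into xargs is that names containing spaces or newlines can be split incorrectly. A common remedy, sketched below on invented files, is to separate names with NUL bytes using -print0 in find and -0 in xargs (both assumed available, as in GNU and BSD tools).

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/empty one.txt" "$demo_dir/empty_two.txt"
echo "keep me" > "$demo_dir/full.txt"

# -print0 and -0 use NUL as the separator, so the space in "empty one.txt" is harmless.
(cd "$demo_dir" && find . -type f -empty -print0 | xargs -0 rm)

# Only the non-empty file should be left.
remaining=$(cd "$demo_dir" && find . -type f)
echo "$remaining"

rm -rf "$demo_dir"
```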
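xargs II
xargs [command]
-t: prints each command before running it
-p: asks for confirmation before running each command, like -ok in find
-P [max-procs]: runs up to [max-procs] processes at a time
Ask before removing each empty file:
$ find . -type f -empty | xargs -p -I{} rm {}
rm ./c/file3.txt ?...y
rm ./a/file1.txt ?...y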
grep tale
Task: find all Uniba IDs in the *.py files in the current directory (recursively)
Solution:
$ find . -name '*.py' -type f | xargs grep -n -E '[a-z]+[0-9]{1,3}'
z/login.py:4:log_in(username='rob5')
time
Lets you "time" (measure the time needed for) the execution of a specific command
An internal bash command, but a standalone program also exists
Very nice for benchmarking competing approaches to solving the same problem
Suppose we'd like to benchmark the following two commands:
find ./foo -type f -name "*.txt" -exec rm {} \;
find ./foo -type f -name "*.txt" | xargs rm
On a folder with 1000 files in it, here are the results:
$ time find ./foo -type f -name "*.txt" -exec rm {} \;
0.35s user 0.11s system 99% cpu 0.467 total
$ time find ./foo -type f -name "*.txt" | xargs rm
0.00s user 0.01s system 75% cpu 0.016 total
As we can see, the xargs approach is considerably faster here (0.016 s versus 0.467 s in total), since rm is started once for many files rather than once per file (various benchmarks tend to agree).
time-ing the parallel xargs
Let's use time to demonstrate the difference that the -P option in xargs can make.
The program we'll test this on is very simple: just sleep (wait) for a bit (like an extensive computation would).
Sleep for 1, 2, 3, 4 and 5 seconds serially (about 15 sec in total):
$ time echo 1 2 3 4 5 | tr ' ' '\n' | xargs -I{} sleep {}
real 0m15.017s
user 0m0.007s
sys 0m0.012s
Sleep for 1, 2, 3, 4 and 5 seconds in parallel (about 5 sec in total):
$ time echo 1 2 3 4 5 | tr ' ' '\n' | xargs -P 5 -I{} sleep {}
real 0m5.012s
user 0m0.007s
sys 0m0.014s
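The same comparison can be scripted so that both runs are measured in one go. The sketch below is a shortened variant (three one-second sleeps instead of the slide's five) and measures wall-clock time with date +%s rather than time, so the results can be stored in variables; these choices are illustrative, not part of the original demo.

```shell
# Serial: three 1-second sleeps run one after another (about 3 seconds).
start=$(date +%s)
printf '1\n1\n1\n' | xargs -I{} sleep {}
serial_elapsed=$(( $(date +%s) - start ))

# Parallel: -P 3 lets up to three sleeps run at the same time (about 1 second).
start=$(date +%s)
printf '1\n1\n1\n' | xargs -P 3 -I{} sleep {}
parallel_elapsed=$(( $(date +%s) - start ))

echo "serial: ${serial_elapsed}s, parallel: ${parallel_elapsed}s"
```

With whole-second resolution the numbers are coarse, but the gap between the two runs is still visible.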