- Note for L3 Training
- The following commands and associated best practices are covered here: echo, cd, pwd, date, type, file, ls, stat, lsattr, du, sort, head, tail, cat, grep, less, vim, chmod, touch, ln, mkdir, rmdir, cp, mv, rm, tar, and mysqldump.
- View Manual: man bash
- man -k allows you to search all manual pages for a keyword string
- Bash can be invoked as an interactive login shell or as a script. When invoked as an interactive login shell, Bash reads and executes the commands from the /etc/profile file, then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable.
- When Bash is invoked as a script, the kernel calls /bin/bash as the interpreter named in the directive that follows the shebang (#!), as such:
- #!/bin/bash
- OPTIONS
- All of the single-character shell options documented in the description of the set builtin command can be used as options when the shell is invoked. In addition, bash interprets the following options when it is invoked:
- -c If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, they are assigned to the positional parameters, starting with $0.
- -i If the -i option is present, the shell is interactive.
- -l Make bash act as if it had been invoked as a login shell (see INVOCATION below).
- -r If the -r option is present, the shell becomes restricted (see RESTRICTED SHELL below).
- -s If the -s option is present, or if no arguments remain after option processing, then commands are read from the standard input. This option allows the positional parameters to be set when invoking an interactive shell.
- -D A list of all double-quoted strings preceded by $ is printed on the standard output. These are the strings that are subject to language translation when the current locale is not C or POSIX. This implies the -n option; no commands will be executed.
- [-+]O [shopt_option]
- shopt_option is one of the shell options accepted by the shopt builtin (see SHELL BUILTIN COMMANDS below). If shopt_option is present, -O sets the value of that option; +O unsets it. If shopt_option is not supplied, the names and values of the shell options accepted by shopt are printed on the standard output. If the invocation option is +O, the output is displayed in a format that may be reused as input.
- -- A -- signals the end of options and disables further option processing. Any arguments after the -- are treated as filenames and arguments. An argument of - is equivalent to --.
- Commonly used built-ins include echo, help, cd, declare, logout, export, read, unset, type, ulimit, unalias, source, bg, fg, jobs, disown, history, kill, pwd, shopt, and trap. We will cover some of these as they become important; however, you are welcome to review each built-in and test them on your accounts.
- ========================================================
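- bang bang (!!): the entire previous command
- bang caret (!^): the first argument of the previous command
- bang star (!*): all arguments of the previous command
- bang dollar (!$): the last argument of the previous command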
- ========================================================
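- printenv can be used to show the current global (environment) variables
- export is used to set a global variable, which is inherited by child processes. A local variable needs no built-in at all: a plain assignment such as NAME=value sets it for the current shell only. The built-in set, run without arguments, lists the variables that are currently set. We can remove a global or local variable using unset.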
- ========================================================
- peachlet@peachlet.info [~]# export FIRST_NAME=William
- peachlet@peachlet.info [~]# export LAST_NAME=Lam
- peachlet@peachlet.info [~]# echo ${FIRST_NAME} ${LAST_NAME}
- William Lam
- ========================================================
- unset removes a variable
- ========================================================
- peachlet@peachlet.info [~]# unset FIRST_NAME
- peachlet@peachlet.info [~]# unset LAST_NAME
- peachlet@peachlet.info [~]# echo ${FIRST_NAME} ${LAST_NAME}
- ========================================================
- A LOCAL variable is set with a plain assignment (NAME=value)
- The set built-in is NOT needed to set a local variable
- ========================================================
- peachlet@peachlet.info [~]# bash
- peachlet@peachlet.info [~]# RELEASE=`rpm --query centos-release`
- peachlet@peachlet.info [~]# echo ${RELEASE}
- centos-release-6-8.el6.centos.12.3.x86_64
- peachlet@peachlet.info [~]# uname
- Linux
- peachlet@peachlet.info [~]# bash
- peachlet@peachlet.info [~]# KERNEL=`uname`
- peachlet@peachlet.info [~]# echo ${KERNEL}
- Linux
- ========================================================
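- To store the output of a command in a variable, wrap the command in ` (backticks)
- To print a string literally, use ' (single quotes); inside " (double quotes), variables and command substitutions are still expanded
- A safer alternative to backticks is $().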
- RELEASE=$(rpm --query centos-release)
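- The notes above on local variables, export, quoting, and $() can be tried together. This is a minimal sketch, assuming bash; LOCAL_VAR, GLOBAL_VAR, and child_view are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# LOCAL_VAR and GLOBAL_VAR are hypothetical names for this sketch.
LOCAL_VAR="only in this shell"     # plain assignment: visible to the current shell only
export GLOBAL_VAR="inherited"      # export: copied into the environment of child processes

# A child bash process sees the exported variable but not the local one.
child_view=$(bash -c 'echo "local=[${LOCAL_VAR}] global=[${GLOBAL_VAR}]"')
echo "${child_view}"

# Command substitution stores the output of a command in a variable.
KERNEL=$(uname)
echo "double quotes expand: ${KERNEL}"
echo 'single quotes stay literal: ${KERNEL}'

# unset removes both kinds of variable.
unset LOCAL_VAR GLOBAL_VAR
```

- The child shell prints an empty value for LOCAL_VAR because only exported variables are passed down to it.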
- What is head?
- head prints the first lines of a file (10 by default); -n sets how many lines to print
- ========================================================
- peachlet@peachlet.info [~]# head testing.txt
- Testing for bash command
- William Lam
- Today is a good day
- peachlet@peachlet.info [~]# head -n testing.txt
- head: testing.txt: invalid number of lines
- peachlet@peachlet.info [~]# head -n 2 testing.txt
- Testing for bash command
- William Lam
- peachlet@peachlet.info [~]# head -n 3 testing.txt
- Testing for bash command
- William Lam
- Today is a good day
- ========================================================
- If two or more file names share the same beginning or ending, you can use brace expansion ({ and }) to write the common part only once
- ========================================================
- -rw-r----- 1 root mail 24 Dec 26 02:34 trueuserdomains
- -rw-r--r-- 1 root mail 34 Dec 26 02:34 trueuserowners
- peachlet@peachlet.info [/etc]# head -n 2 trueuser{domains,owners}
- head: cannot open `trueuserdomains' for reading: Permission denied
- ==> trueuserowners <==
- #userowners v1
- peachlet: peachlet
- peachlet@peachlet.info [/etc]#
- ========================================================
- Sequences are also permitted within the braces in the form {string1..string2}, provided that the range is actually sequential (e.g. a through z or 1 through 100).
- ========================================================
- root@rstraining.lantstic.com [~]# echo ab{{c..h},{1..3}}d
- abcd abdd abed abfd abgd abhd ab1d ab2d ab3d
- ========================================================
- In the above example, we see our echo command print the strings 'ab' and 'd' with each value in the range 'c' through 'h' or '1' through '3' placed between the 'b' and 'd'. Recall that with this form of expansion we are able to create backup file names without duplicating the constant prefix string:
- ========================================================
- root@rstraining.lantstic.com [~]# echo ./.bashrc{,.backup}
- ./.bashrc ./.bashrc.backup
- ========================================================
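- Brace expansion is purely textual, so it can be explored safely with echo before it is used with cp or mv. A minimal sketch, assuming bash; the report_ file names are hypothetical and do not need to exist.

```shell
#!/usr/bin/env bash
# Brace expansion happens before any file lookup: nothing here touches the disk.
pair=$(echo ./.bashrc{,.backup})            # empty alternative + suffix
letters=$(echo {a..e})                      # character sequence
names=$(echo report_{jan,feb}_{1..2}.txt)   # lists and sequences combine left to right

echo "${pair}"
echo "${letters}"
echo "${names}"
```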
- Tilde Expansion
- As covered in Bash Basics, ~ will expand to the value of $HOME within the current $USER environment. Thus, be aware of which user you are logged in as. For example, note the different expansions of ~ below for certain users:
- ========================================================
- root@rstraining.lantstic.com [~]# echo $USER
- root
- root@rstraining.lantstic.com [~]# echo ~
- /root
- lantstic@lantstic.com [~]# echo $USER
- lantstic
- lantstic@lantstic.com [~]# echo ~
- /home/lantstic
- ========================================================
- Parameter/Variable Expansion
- Variables (or parameters) that are set in our bash sessions can be called using $VARIABLE where we replace 'VARIABLE' with the actual name of the variable. While $VARIABLE is acceptable, when using this form of expansion we recommend the following form (including double quotes): "${VARIABLE}". Additional features can be utilized as documented in man bash. Here are some examples:
- lantstic@lantstic.com [~]# echo ${USER}
- lantstic
- Setting an error message for variables that are not set or are null strings:
- lantstic@lantstic.com [~]# echo ${NOTSET}
- lantstic@lantstic.com [~]# echo ${NOTSET:?unset}
- -bash: NOTSET: unset
- lantstic@lantstic.com [~]# echo ${IFS:?unset}
- lantstic@lantstic.com [~]#
- Removing suffixes from the expanded string:
- lantstic@lantstic.com [~]# echo ${FILENAME}
- index.php
- lantstic@lantstic.com [~]# echo ${FILENAME%%.php}
- index
- lantstic@lantstic.com [~]# echo ${FILENAME%%.php}.html
- index.html
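- The suffix removal shown above has siblings for prefixes and for default values. A minimal sketch, assuming bash; the sample value archive.tar.gz is hypothetical and chosen so that the shortest and longest matches differ.

```shell
#!/usr/bin/env bash
FILENAME="archive.tar.gz"        # hypothetical sample value

short_suffix=${FILENAME%.*}      # remove shortest match from the end   -> archive.tar
long_suffix=${FILENAME%%.*}      # remove longest match from the end    -> archive
short_prefix=${FILENAME#*.}      # remove shortest match from the start -> tar.gz
long_prefix=${FILENAME##*.}      # remove longest match from the start  -> gz

unset NOTSET
fallback=${NOTSET:-default}      # substitute a default when unset or null

echo "${short_suffix} ${long_suffix} ${short_prefix} ${long_prefix} ${fallback}"
```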
- More examples are covered in man bash. I encourage you as always to review them, as you may find the features interesting. In addition to the standard parameters we've seen here, I would like to introduce the concept of arrays.
- Arrays
- Bash allows one-dimensional indexed arrays and associative arrays to be set. These are similar to normal parameters; however, they allow for reference to their components by index. That sounds like a mouthful, so let's use a bit of code. There are two common methods to set an array; I will show both for thoroughness.
- lantstic@lantstic.com [~/globbing]# declare -a omg=(red blue yellow)
- Or
- lantstic@lantstic.com [~/globbing]# omg=(red blue yellow)
- As you can see, this is very similar to setting local parameters; however, where this diverges from basic parameter expansion is in the means to print the member components of the array. Code is worth a thousand paragraphs, so let's look:
- lantstic@lantstic.com [~/globbing]# declare -a omg=(red blue yellow)
- lantstic@lantstic.com [~/globbing]# echo ${omg}
- red
- Not exactly what we wanted considering we set the array to red, blue, and yellow. So we'll have to use a new feature:
- lantstic@lantstic.com [~/globbing]# echo ${omg[@]}
- red blue yellow
- That will get me the whole line; however, as the elements are stored by position, I can reference each separate value. The elements are assigned a numerical index starting with 0.
- lantstic@lantstic.com [~/globbing]# echo ${omg[0]}
- red
- lantstic@lantstic.com [~/globbing]# echo ${omg[1]}
- blue
- lantstic@lantstic.com [~/globbing]# echo ${omg[2]}
- yellow
- Lastly, much like with other parameters, we can remove the array using unset.
- Try it yourself!
- Go ahead and set an array to a variety of strings. Call each one using echo.
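- One way to work through this exercise is sketched below, assuming bash; the array name colors and its values are hypothetical. It also shows why "${colors[@]}" is normally quoted: each element stays one word even when it contains a space.

```shell
#!/usr/bin/env bash
declare -a colors=("red" "dark blue" "yellow")   # hypothetical sample array

count=${#colors[@]}       # number of elements
first=${colors[0]}        # indexes start at 0

# Quoted [@] expands to one word per element, spaces preserved.
listing=""
for color in "${colors[@]}"; do
    listing+="<${color}>"
done
echo "${listing}"

colors+=("green")         # append an element
unset 'colors[1]'         # remove one element; the other indexes are not renumbered
remaining=${#colors[@]}
unset colors              # remove the whole array
```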
- Command Substitution
- Command substitution allows the output of a command to replace the command itself. The preferred syntax is as follows:
- $(command)
- This expansion also has a deprecated syntax you will occasionally see in older scripts and code using back-ticks:
- `command`
- This form does not nest well. The preferred syntax is $(command) as previously shown in Bash Basics. Here are some examples combined with brace and parameter expansion.
- lantstic@lantstic.com [~]# echo $(date +%s).backup
- 1481233947.backup
- lantstic@lantstic.com [~]# echo ${FILENAME%%php}$(date +%s).backup
- index.1481233992.backup
- lantstic@lantstic.com [~]# cp -v ${FILENAME}{,.$(date +%s).backup}
- `index.php' -> `index.php.1481234103.backup'
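- $() is command substitution: it is replaced by the output of the command inside it
- ${} is parameter expansion: it is replaced by the value of the variable inside it
- $(( )) is arithmetic expansion: it is replaced by the result of the expression inside it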
- Arithmetic Expansion and Word Splitting
- Arithmetic Expansion
- Arithmetic expansion evaluates an arithmetic expression and substitutes the result into the command line. The form is as follows:
- $((expression))
- Appropriate expressions are found in the section 'ARITHMETIC EVALUATION' in man bash. All tokens in the expansion undergo parameter expansion, command substitution, and quote removal.
- lantstic@lantstic.com [~]# n=1;((n++));echo $n
- 2
- The $ is left off because we are only evaluating the expression, not substituting its result into the command line. To assign the result to a parameter, the $ is required. Note the difference:
- lantstic@lantstic.com [~]# n=1;x=((n+=1));echo $x
- -bash: syntax error near unexpected token `('
- lantstic@lantstic.com [~]# n=1;x=$((n+=1));echo $x
- 2
- Additional operators can be found in the section 'ARITHMETIC EVALUATION' in man bash.
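- A few more operators in both forms, as a minimal sketch assuming bash; the variable names are hypothetical. Note that bash arithmetic is integer-only.

```shell
#!/usr/bin/env bash
n=5
((n++))                    # evaluation only: n becomes 6, nothing is substituted
doubled=$((n * 2))         # expansion: the result replaces the expression
quotient=$((17 / 5))       # integer division, the fraction is discarded
remainder=$((17 % 5))      # modulo
power=$((2 ** 10))         # exponentiation

echo "${n} ${doubled} ${quotient} ${remainder} ${power}"
```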
- Word Splitting
- Whenever parameter expansion, command substitution, or arithmetic expansion is invoked, bash performs word splitting on the result to separate it into words. Splitting is only performed on expansions that are not double quoted. Words are split on the characters in $IFS, the 'internal field separator', which by default holds a space, tab, and newline. If $IFS is unset, this default is used; if $IFS is set to the null (empty) string, word splitting does not occur. Each token separated by $IFS not in double quotes is treated as a new word or token; consider the difference here:
- lantstic@lantstic.com [~/globbing/test]# for i in * ; do ls $i ; done
- /bin/ls: cannot access test: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access 1.txt: No such file or directory
- /bin/ls: cannot access test: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access 2.txt: No such file or directory
- /bin/ls: cannot access test: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access 3.txt: No such file or directory
- /bin/ls: cannot access test: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access 4.txt: No such file or directory
- /bin/ls: cannot access test: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access 5.txt: No such file or directory
- /bin/ls: cannot access this: No such file or directory
- /bin/ls: cannot access file: No such file or directory
- /bin/ls: cannot access has: No such file or directory
- /bin/ls: cannot access spaces: No such file or directory
- We guard against this unintentional word splitting by using double quotes to keep the expanded parameter as a single word:
- lantstic@lantstic.com [~/globbing/test]# for i in * ; do ls "$i" ; done
- test\ file\ 1.txt
- test\ file\ 2.txt
- test\ file\ 3.txt
- test\ file\ 4.txt
- test\ file\ 5.txt
- this\ file\ has\ spaces
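- The effect of quoting can also be measured directly by counting the words a command receives. A minimal sketch, assuming bash; count_args is a hypothetical helper, not a standard command.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

name="this file has spaces"

unquoted=$(count_args $name)      # split on spaces into separate words
quoted=$(count_args "$name")      # kept together as a single word

# With IFS set to the null string, unquoted expansions are not split at all.
old_ifs=$IFS
IFS=
no_split=$(count_args $name)
IFS=$old_ifs

echo "unquoted=${unquoted} quoted=${quoted} null-IFS=${no_split}"
```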
- This leads us to the final form of expansion we will cover here: pathname expansion.
- Activity
- Take a moment to review the QUOTING section in man bash. Why are double quotes necessary for protection against word splitting and not single quotes or the backslash?
- Pathname Expansion, Globstar, and Optional Shell Behavior
- Pathname Expansion
- Pathname expansion was covered extensively in Bash Basics. The main meta-characters and their meaning should already be familiar to you; however, we should cover it some more so that we understand exactly what it is doing. This form of expansion is how bash matches patterns. Unless the -f option has been set, bash scans each word for the characters ‘*’, ‘?’, and ‘[’. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged. If the nullglob option is set, and no matches are found, the word is removed. If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed. If the shell option nocaseglob is enabled, the match is performed without regard to the case of alphabetic characters. We've already shown in Bash Basics the meaning of each meta-character; man bash offers a brief review. However, there is one meta-character that deserves special attention.
- Globstar
- While we have seen these meta-characters in action in previous courses and likely on the floor, we should really understand what * (referred to here as globstar) represents and what it does not. First, globstar in the context of pathname expansion is not a 'wildcard': never was and never will be. If we only look at the explanation of the meta-character and nothing else, we will think that * is a wildcard, when in fact it is not:
- * Matches any string, including the null string. When the globstar shell option is enabled, and * is used in a pathname expansion context, two adjacent *s used as a single pattern will match all files and zero or more directories and subdirectories. If followed by a /, two adjacent *s will match only directories and subdirectories.
- However, please note the following from man bash:
- When a pattern is used for pathname expansion, the character ``.`` at the start of a name or immediately following a slash must be matched explicitly...The file names ``.`` and ``..`` are always ignored when GLOBIGNORE is set and not null
- Note that GLOBIGNORE is a bash-specific feature and shells generally ignore ``.`` by default, so as not to expand ``.`` and ``..`` unintentionally.
- Thus, '*' will not match ``.`` or filenames starting with ``.`` unless the shell is specifically configured to do so (which can be dangerous). This functionality actually was adopted by Unix shells from the original behavior of ls.
- We can modify the expansion of globstar using shopt. With the built-in shopt, we can actually modify several shell options.
- Optional Shell Settings
- In addition to the default features and loaded environment in bash, there are optional shell behaviors that govern the functionality of cd, window-size checks for your terminal, job management, pathname and globstar expansion, alias expansion, completion, and bash history. You will not be modifying the optional shell behaviors much beyond enabling checkwinsize to ensure text wrapping after screen resizing.
- We can see the list of shell options and their status using shopt:
- lantstic@lantstic.com [~/globbing/test]# shopt
- autocd off
- cdable_vars off
- cdspell off
- checkhash off
- checkjobs off
- checkwinsize off
- cmdhist on
- compat31 off
- compat32 off
- compat40 off
- dirspell off
- dotglob off
- execfail off
- expand_aliases on
- extdebug off
- extglob off
- extquote on
- failglob off
- force_fignore on
- globstar off
- gnu_errfmt off
- histappend off
- histreedit off
- histverify off
- hostcomplete on
- huponexit off
- interactive_comments on
- lithist off
- login_shell on
- mailwarn off
- no_empty_cmd_completion off
- nocaseglob off
- nocasematch off
- nullglob off
- progcomp on
- promptvars on
- restricted_shell off
- shift_verbose off
- sourcepath on
- xpg_echo off
- Options can be enabled using the -s option and disabled using the -u option.
- Activity
- Using shopt, ensure that dotglob is disabled. Run a command such as echo against each of the following patterns and determine which pathnames are matched:
- *
- .*
- What is the difference between the two?
- Consider the following different output:
- lantstic@lantstic.com [~/globbing/hidden]# /bin/ls *
- /bin/ls: cannot access *: No such file or directory
- lantstic@lantstic.com [~/globbing/hidden]# shopt -s dotglob
- lantstic@lantstic.com [~/globbing/hidden]# /bin/ls *
- .these files are hidden10.txt .these files are hidden2.txt .these files are hidden4.txt .these files are hidden6.txt .these files are hidden8.txt
- .these files are hidden1.txt .these files are hidden3.txt .these files are hidden5.txt .these files are hidden7.txt .these files are hidden9.txt
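- The same behavior can be reproduced in a scratch directory without touching real files. A minimal sketch, assuming bash and mktemp; the file names .hidden and visible are hypothetical. It toggles dotglob and nullglob with shopt -s and shopt -u and records what each pattern expands to.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                      # scratch directory, removed at the end
touch "${workdir}/.hidden" "${workdir}/visible"
cd "${workdir}" || exit 1

shopt -u dotglob nullglob                 # start from the defaults shown in the list above
default_match=$(echo *)                   # dot files are skipped

shopt -s dotglob
dotglob_match=$(echo *)                   # dot files are now included (never . or ..)
shopt -u dotglob

literal=$(echo *.nomatch)                 # no match, nullglob off: pattern left unchanged
shopt -s nullglob
removed=$(echo *.nomatch)                 # no match, nullglob on: pattern removed
shopt -u nullglob

cd - > /dev/null && rm -rf "${workdir}"
echo "default=[${default_match}] dotglob=[${dotglob_match}] literal=[${literal}] removed=[${removed}]"
```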
- The $PATH variable stores a list of directories. When you run a command from the command line, bash looks in these directories for a file with the name you typed.
- Program execution path and location
- Locating a program and its information
- When speaking about a ‘program’, I am referring specifically to a binary. A binary is a program that has been compiled to machine code to perform a particular function within a system. These programs will have the binary file, the source, and the manual page stored on the system. To locate these files, we would use the whereis command. whereis has a hard-coded search path and will look for these binaries, sources, or manual pages in standard Linux locations and, if set, the directories listed in $PATH and $MANPATH. However, you can specify a directory for whereis to check.
- lantstic@lantstic.com [~]# whereis php
- php: /usr/bin/php /usr/bin/php.working /usr/lib/php /usr/lib/php.ini /usr/local/bin/php /usr/local/lib/php /usr/local/lib/php.ini,v /usr/local/lib/php.ini /usr/include/php /usr/local/php
- If, instead of listing every binary, source, or manual page file that matches the program name, we want to know exactly which binary would be executed within our current session, we can use the command which:
- lantstic@lantstic.com [~]# which rm
- /bin/rm
- Scanning ${PATH}, which will print out the absolute path of the executable that would be launched on the command-line. This is all very useful if you wanted to determine the exact php-cli binary that would execute within your current bash session:
- lantstic@lantstic.com [~]# which php
- /usr/local/bin/php
- Activity
- Go ahead and run which and whereis on various commands we've previously covered. If you do not get any output, try type to see the difference.
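- As a starting point for this activity, the built-in type can report what kind of command a name refers to, which explains why which finds nothing for some names. A minimal sketch, assuming bash; the variable names are hypothetical.

```shell
#!/usr/bin/env bash
cd_kind=$(type -t cd)       # cd is part of the shell itself, so no file exists for it
if_kind=$(type -t if)       # reserved words are reported separately
ls_kind=$(type -t ls)       # an external program found through $PATH
ls_path=$(type -P ls)       # -P forces a $PATH search and prints the file that would run

echo "cd: ${cd_kind}, if: ${if_kind}, ls: ${ls_kind} at ${ls_path}"
```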
- ========================================================
- 0 = stdin
- 1 = stdout
- 2 = stderr
- ========================================================
- stdout = standard output
- stdin = standard input
- stderr = standard error
- STDIN == standard input; this is received from files, command-line input from peripherals, or other programs.
- STDOUT == standard output; this is the non-error output for programs. To see this in action, you can simply run ls. The list of files printed to terminal represents the standard output.
- STDERR == standard error; the error or diagnostic messages.
- The right chevron > is used to redirect either stdout, stderr, or both. This feature is commonly used to write data to a file such as command > file. If the file did not exist, the > would create a new file. If the file did exist, the > would truncate the data in the file, overwriting it with the stdout from the command. We can instead append our output using double chevrons: >>.
- stdout and stderr can be redirected simultaneously using several methods. Some of them include the following:
- command &> word
- command > word 2>&1
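- These forms can be compared side by side in a scratch directory. A minimal sketch, assuming bash and mktemp; emit is a hypothetical helper that writes one line to each stream, and the log file names are placeholders.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                 # scratch directory, removed at the end
out="${workdir}/out.log"
err="${workdir}/err.log"
both="${workdir}/both.log"

# emit is a hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

emit > "${out}" 2> "${err}"          # stdout and stderr to separate files
emit > "${both}" 2>&1                # stderr follows stdout into the same file
emit >> "${out}" 2> /dev/null        # append stdout, discard stderr

out_lines=$(grep -c 'to stdout' "${out}")
err_text=$(cat "${err}")
both_lines=$(grep -c '' "${both}")

rm -rf "${workdir}"
echo "out=${out_lines} err=[${err_text}] both=${both_lines}"
```

- Order matters in the second form: 2>&1 must come after > word, because redirections are processed from left to right.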
- Test
- root@lam [~]# cat > whatevertest.txt << "eof"
- > This is for testing purpose
- > I love hostgator
- > learning this is confusing but yet very interesting
- > eof
- root@lam [~]# cat whatevertest.txt
- This is for testing purpose
- I love hostgator
- learning this is confusing but yet very interesting
- root@lam [~]#
- ========================================================
- A pipeline is a sequence of one or more commands separated by the operators | or |&. The more common operator you will see is the pipe ‘|’. The stdout of one command is sent as the stdin of the succeeding command: command1 | command2.
- Recall that all programs have at least three standard streams at execution: stdin, stdout, and stderr. These are file handles that are created for every program, every time. Process substitution takes the form <(list) or >(list). When using process substitution we are not redirecting one of these streams; instead, bash runs list with its stdout (or stdin) connected to a file descriptor and replaces the expression with a pathname for that descriptor, so the surrounding command can treat it as a file. Effectively, cat file and cat <(list) are used in the same way; however, <(list) is not a file on the file system in the same manner as file. When <(list) is performed, stdin or stdout is connected to a file in /dev/fd (other systems use named pipes, but ours do not). That pathname is passed to the command as the result of the expansion. It is important to note the subtle distinctions between process substitution, command substitution, and pipelines. Again, code is worth a thousand paragraphs.
- ========================================================
- lantstic@lantstic.com [~/globbing]# echo date +%s
- date +%s
- lantstic@lantstic.com [~/globbing]# echo $(date +%s)
- 1478548741
- lantstic@lantstic.com [~/globbing]# echo <(date +%s)
- /dev/fd/63
- lantstic@lantstic.com [~/globbing]# date +%s|echo
- lantstic@lantstic.com [~/globbing]#
- ========================================================
- Here we compare running echo against a string, command substitution, process substitution, and a pipeline. Note the difference. Specifically as you can see, echo <(list) will merely print the pathname of the file created in /dev/fd when process substitution is invoked.
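- Two practical uses make the distinction clearer. A minimal sketch, assuming bash on a system with /dev/fd; the sample data and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# diff expects two file names; process substitution supplies them without temp files.
differences=$(diff <(printf '%s\n' b a c | sort) <(printf '%s\n' a b d | sort))
echo "${differences}"

# In a pipeline the loop runs in a subshell, so the counter is lost afterwards.
count_pipe=0
printf '%s\n' one two three | while read -r line; do ((count_pipe++)); done

# Reading from <(list) keeps the loop in the current shell, so the counter survives.
count_procsub=0
while read -r line; do ((count_procsub++)); done < <(printf '%s\n' one two three)

echo "pipeline=${count_pipe} process-substitution=${count_procsub}"
```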
- When using redirection, do not truncate client files; there should not be an instance in which you are truncating client files, unless specifically asked. A good rule of thumb: > should not be used on existing files. If you are going to write to an already existing file, appending with tee -a or with the >> redirection is preferred.
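- As an extra guard, bash has a noclobber option that makes > refuse to overwrite an existing file. The sketch below assumes bash and mktemp and uses a hypothetical scratch file; it is one possible safeguard, not a substitute for care.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                          # scratch directory, removed at the end
target="${workdir}/existing.txt"
echo "original line" > "${target}"

set -o noclobber                              # > now fails on files that already exist
if { echo "replacement" > "${target}"; } 2> /dev/null; then
    clobbered=yes
else
    clobbered=no
fi
set +o noclobber

echo "appended line" >> "${target}"           # >> appends and never truncates
echo "tee line" | tee -a "${target}" > /dev/null

line_count=$(grep -c '' "${target}")
first_line=$(head -n 1 "${target}")
rm -rf "${workdir}"
echo "clobbered=${clobbered} lines=${line_count} first=[${first_line}]"
```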
- Finding Files
- A Brief on I/O
- Before we begin, I would like to briefly cover what we mean by I/O. In the previous lessons we mentioned the input and output streams stdin, stdout, and stderr. All programs have these streams. In short, I/O is the transfer of any data from one program or device to another program or device. Commands read input from the command-line, executing against files on the disk. These files and/or the corresponding inode are read from disk as input and written to our output.
- lantstic@lantstic.com [~]# stat ./public_html/
- File: `./public_html/'
- Size: 4096 Blocks: 8 IO Block: 4096 directory
- Device: fc01h/64513d Inode: 4853159 Links: 9
- Access: (0750/drwxr-x---) Uid: ( 541/lantstic) Gid: ( 99/ nobody)
- Access: 2016-12-09 02:03:29.645531559 -0600
- Modify: 2016-11-28 12:48:35.348868800 -0600
- Change: 2016-11-28 12:48:35.348868800 -0600
- The input file here is ./public_html; however, note that stat is a program that must (in this case) execute from /usr/bin/stat. The ELF is first read, producing I/O, then executed. During its execution, necessary files called 'libraries' are accessed, opened, and read to complete the execution. This produces an increase in input on the system as a whole. Once the data is read from the input stream, additional calls are executed and sent to the output stream. In short, the execution of stat alone produces an I/O overhead, and once the program has finished loading, its input argument (a filename, in this case public_html) produces an additional I/O overhead.
- Here stat is dealing with a single file; however, when running a command across an undetermined, large number of files, system I/O will increase, potentially causing performance degradation. Please keep this concept in mind in both this lesson and your work on the floor.
- With that out of the way, let's look at find.
- find is a simple command — at first. find will search and print all (by default) files within a directory hierarchy. If no pathname is provided, the present working directory is used. With no tests or actions, find will print all file system segments (read: files) recursively from the provided directory: this list includes regular files, links, directories, sockets, devices, block devices, and so on.
- lantstic@lantstic.com [~/globbing]# find .
- lantstic@lantstic.com [~/globbing]# find ~+
- Options
- Of course, we usually don't want everything from 'find .'. We can trim this with tests; however, before we begin, we want to be aware of useful options in find. The following is a non-exhaustive list of common options that will help with your commands. Please review the man page for find for more on these options:
- maxdepth
- mindepth
- mount
- daystart (should be used in conjunction with [acm]time)
- regextype
- depth
- Tests and more tests
- The tests that you pass to find will constitute your ‘bread and butter’ for your find commands. We will not cover an extensive list of tests here — that’s what the man page is for. However, below are tests that you should be familiar with and tests that you will use frequently.
- type
- regex
- size
- name
- perm
- user
- inum
- wholename
- [acm]time (we prefer ctime as other values can be spoofed using touch)
- These tests are also “introductory”: more complex tests may arise that will aid in your abilities to search.
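- A few of these options and tests are combined below in a scratch directory. A minimal sketch, assuming bash, mktemp, and a find that supports -maxdepth and -mindepth; the directory layout and file names are hypothetical.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                                 # scratch directory, removed at the end
mkdir -p "${workdir}/site/images"
touch "${workdir}/site/index.php" "${workdir}/site/images/logo.jpg"
head -c 2048 /dev/zero > "${workdir}/site/big.log"   # a 2048-byte file
cd "${workdir}/site" || exit 1

php_files=$(find . -maxdepth 1 -type f -name '*.php')   # top level only
dirs=$(find . -mindepth 1 -type d)                      # directories below the start point
large=$(find . -type f -size +1k)                       # regular files larger than 1 KiB
jpgs=$(find . -type f -name '*.jpg')                    # quoted so find, not bash, expands it

cd - > /dev/null && rm -rf "${workdir}"
printf '%s\n' "${php_files}" "${dirs}" "${large}" "${jpgs}"
```

- Quoting the -name pattern matters: unquoted, bash would try to expand *.php against the current directory before find ever sees it.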
- There are several operators that can be found in man find section 'OPERATORS'. Important operators to keep in mind are -o, !, and the form expr1 expr2 or expr1 -a expr2. The last form is the default behavior of find: find defaults to a logical ‘AND’, so each test is combined with the one before it. You can use -o to set an OR: expr1 -o expr2. This expression means 'evaluate expr2 only if expr1 returned false'. Because AND binds more tightly than OR, parentheses can be used to group expressions so that they are evaluated as a unit:
- ========================================================
- expr1 -o \(...expr2 \)
- ========================================================
- In the above, note the escape character \ used to escape the parentheses so that find, and not bash, interprets the ( and ). Consider this example with the negation operator (!) in place:
- ========================================================
- lantstic@lantstic.com [~/globbing]# find . -type f ! -name '*.jpg' -o ! -name '*.php' -o ! -name '*.js'
- .
- ./this file has spaces.css
- ./lorem
- ./ex1.6
- ./ex1-donotmatch.7
- ./ex1.5
- ./echo
- ./this file has spaces.js
- ./this file has spaces.jpg
- ./.thisfileishidden
- ./ex1.2
- ./ex1.4
- ./ex1.1
- ./this file has spaces.php
- ./this file has spaces.txt
- ./this file has spaces.exe
- lantstic@lantstic.com [~/globbing]# find . -type f ! \( -name '*.jpg' -o -name '*.php' -o -name '*.js' \)
- ./this file has spaces.css
- ./lorem
- ./ex1.6
- ./ex1-donotmatch.7
- ./ex1.5
- ./echo
- ./.thisfileishidden
- ./ex1.2
- ./ex1.4
- ./ex1.1
- ./this file has spaces.txt
- ./this file has spaces.exe
- ========================================================
- In the first example, it would seem that we are trying to find files that do not end with the list of extensions, separated by -o (OR); however, as this is a logical OR, a file is printed as soon as any one of the negated -name tests is true. A .jpg file, for instance, does not match '*.php', so ! -name '*.php' is true and the file is printed anyway. Thus false positives are returned. Note also that -type f applies only to the first -name test, which is why the directory . appears in the output. To correct this, we need to separate the OR-statement of the -name tests from the negation operator. The second example shows this: the OR-statement is evaluated within the parentheses, then the whole expression is negated, thus returning the files that we want: those without the provided extensions.
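- The difference between the two find commands above can be reproduced on a handful of empty files. A minimal sketch, assuming bash and mktemp; the file names are hypothetical.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                       # scratch directory, removed at the end
cd "${workdir}" || exit 1
touch a.jpg b.php c.js d.txt

# Ungrouped: parsed as (-type f AND NOT jpg) OR (NOT php), so everything matches,
# including the directory "." itself.
ungrouped_count=$(find . -type f ! -name '*.jpg' -o ! -name '*.php' | grep -c '')

# Grouped: the OR is evaluated inside the parentheses first, then negated as a whole.
grouped=$(find . -type f ! \( -name '*.jpg' -o -name '*.php' -o -name '*.js' \))

cd - > /dev/null && rm -rf "${workdir}"
echo "ungrouped matched ${ungrouped_count} entries; grouped matched: ${grouped}"
```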
- ========================================================
- Actions and Find
- Once we have our tests and a filtered list of file system segments that we need, we can perform a number of actions. Actions should follow after all tests have completed. find evaluates tests and actions in order from left to right. If an action precedes all tests, the file list will not be trimmed prior to the action, which is very dangerous.
- There are several actions detailed in the ‘ACTIONS’ section of the man page. We will list the less concerning actions here for you.
- print
- printf
- ls
- prune (note: -prune has no effect when -depth is in use, and the more dangerous action -delete implies -depth, so -prune and -delete cannot usefully be combined)
- Activity
- Please review and take notes on the above actions before we proceed. You can find an explanation of these actions in the man page.
- I sense great danger here, young Jedi
- If you reviewed the other actions listed in the section ‘ACTIONS’ in the man page, you likely noticed some other actions that are obviously problematic — or at least dangerous. We will cover these here with caveats and notes on best practices:
- -delete
- -exec (-execdir)
- -ok (same as exec but with prompt)
- Any use of the -delete option should be tempered with care. Our recommendation when using this action with find is to (obviously) have -delete be the last action added to the find command. It should be the final string in the command line. Additionally, your find statement should first be run without -delete to ensure that the list does not contain any false positives. Review this list in full. You may need to further sanitize the list such that changes to said list do not result in more files removed than intended. For this, we would recommend using tee to send your find output to a file for easy review. Additionally, you can also send find's output to less.
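- The review-then-delete workflow described above might look like the following. This is a minimal sketch, assuming bash, mktemp, and a find that supports -delete; the *.tmp files and the review.list name are hypothetical, and everything happens in a scratch directory.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                        # scratch directory, removed at the end
mkdir "${workdir}/files"
cd "${workdir}/files" || exit 1
touch keep.txt old1.tmp old2.tmp

# Step 1: dry run. Same tests, no -delete; tee saves the list for review.
find . -type f -name '*.tmp' | sort | tee "${workdir}/review.list"
reviewed=$(grep -c '' "${workdir}/review.list")

# Step 2: only after reviewing the list, repeat the identical tests with -delete last.
find . -type f -name '*.tmp' -delete

remaining=$(find . -type f)
cd - > /dev/null && rm -rf "${workdir}"
echo "reviewed ${reviewed} files; remaining: ${remaining}"
```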
- The action -exec allows you to call any command to act upon all segments matched by your tests. The general form is -exec [command] {} \; . The {} is replaced by the name of the file currently being processed. find reads the command up to the terminating ; which must be escaped with \ (or quoted) so that the shell passes it through to find instead of treating it as its own command separator.
- ========================================================
- lantstic@lantstic.com [~/globbing]# find . -type f -exec grep -q 'lorem' {} \; -ls
- 2228307 4 -rw-rw-r-- 1 lantstic lantstic 66 Oct 7 11:23 ./lorem
- In the above construction, -exec runs the command once per segment or file, one at a time. In the form -exec [command] {} + the selected file names are appended to the command line, so that each invocation of the command processes as many files as possible. This behavior is not recommended for use on our clients' accounts or our production servers.
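- To see the difference between the two terminators without touching real data, here is a small sketch that uses echo as a harmless stand-in command. The directory and file names are hypothetical.

```shell
# Hypothetical sandbox with three files.
batch_dir=$(mktemp -d)
touch "$batch_dir/one" "$batch_dir/two" "$batch_dir/three"

# '\;' form: echo runs once per file, so three lines are printed.
per_file=$(find "$batch_dir" -type f -exec echo {} \; | wc -l | tr -d ' ')

# '+' form: the file names are appended to one echo, so one line is printed.
batched=$(find "$batch_dir" -type f -exec echo {} + | wc -l | tr -d ' ')

echo "per-file invocations: $per_file, batched invocations: $batched"
rm -rf "$batch_dir"
```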
- -exec will execute the command provided from the starting directory of the find command. This can introduce security issues that while often not observed on our farm are potential risks. Full explanation of this security risk can be provided in the workshops or in correspondence with the admin trainers, as it is out of the scope of this document (not enough space to explain race conditions, unfortunately). However, find offers a mitigation for the behavior of -exec in the form of -execdir. This allows for the command to be executed in the directory containing the matched file. Considerations for $PATH must be made, however. These are detailed in the man page.
- ========================================================
- With all of the considerations and warnings presented so far, let's get into some of the usefulness of find. Below are examples, each with an explanation of what find is accomplishing in that scenario. Following this section is an activity for you to complete.
- lantstic@lantstic.com [~]# find -type l -size +10c -ls
- 2238813 0 lrwxrwxrwx 1 lantstic lantstic 34 Dec 8 2015 ./access-logs -> /usr/local/apache/domlogs/lantstic
- 5384103 0 lrwxrwxrwx 1 lantstic lantstic 22 Dec 8 2015 ./mail/.webmaster@lantstic_com -> lantstic.com/webmaster
- 5384104 0 lrwxrwxrwx 1 lantstic lantstic 18 Dec 8 2015 ./mail/.admin@lantstic_com -> lantstic.com/admin
- 3933078 0 lrwxrwxrwx 1 lantstic lantstic 58 Feb 2 2016 ./var/cpanel/styled/current_style -> /usr/local/cpanel/base/frontend/paper_lantern/styled/retro
- 2237329 0 lrwxrwxrwx 1 lantstic lantstic 11 Dec 8 2015 ./www -> public_html
- 6825002 0 lrwxrwxrwx 1 lantstic lantstic 25 Aug 9 00:19 ./.cphorde/meta/latest -> horde.backup.sql.20160809
- ========================================================
- In the above example, no path is explicitly provided, so the current directory '.' is assumed and relative pathing is used. My first test is -type l, specifying that I am looking for symbolic links. The second test, -size +10c, matches any member of the first test that is larger than 10 bytes (for a symbolic link, the size is the length of the target path). Once the list is trimmed, we use the action -ls, which lists in ls -dils format.
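- A short sketch of how the sign in front of the number changes the -size test. It uses regular files of a known length rather than symbolic links; the file names are hypothetical.

```shell
# Hypothetical sandbox: one file of exactly 10 bytes, one of 11 bytes.
size_dir=$(mktemp -d)
printf '0123456789'  > "$size_dir/ten"
printf '0123456789A' > "$size_dir/eleven"

over_ten=$(cd "$size_dir" && find . -type f -size +10c)                     # strictly more than 10 bytes
exactly_ten=$(cd "$size_dir" && find . -type f -size 10c)                   # exactly 10 bytes
at_least_ten=$(cd "$size_dir" && find . -type f -size +9c | LC_ALL=C sort)  # 10 bytes or more

printf '+10c: %s\n10c: %s\n+9c:\n%s\n' "$over_ten" "$exactly_ten" "$at_least_ten"
rm -rf "$size_dir"
```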
- lantstic@lantstic.com [~]# find ~+ ! -user lantstic -perm 0644 -ls
- 2230477 0 -rw-r--r-- 1 root root 0 Jul 22 18:31 /home/lantstic/this\ file\ has\ spaces
- 2237291 4 -rw-r--r-- 1 root root 29 Jul 25 10:49 /home/lantstic/LIST
- 6828474 0 -rw-r--r-- 1 root root 0 Jul 28 17:47 /home/lantstic/public_html/this\ file\ has\ spaces
- 6824825 0 -rw-r--r-- 1 root root 0 May 12 11:20 /home/lantstic/public_html/......5D.....
- 6828464 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test1
- 6828468 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test5
- 6828467 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test4
- 6828472 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test9
- 6828466 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test3
- 6828469 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test6
- 6828470 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test7
- 6828465 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test2
- 6828473 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test10
- 6828471 0 -rw-r--r-- 1 root root 0 Jul 28 15:45 /home/lantstic/public_html/cache/test8
- 2237297 236 -rw-r--r-- 1 root root 238302 Jun 6 11:05 /home/lantstic/strace.16546
- 2228278 7776 -rw-r--r-- 1 root root 7961036 Sep 7 09:59 /home/lantstic/latest.tar.gz
- 2237292 40 -rw-r--r-- 1 root root 38569 Jul 28 11:25 /home/lantstic/php.ini
- ========================================================
- In the above example, we use tilde expansion (~+ expands to the current working directory, $PWD) to set our pathing to absolute. Our first test is -user lantstic, which matches those files that are owned by lantstic; however, it is preceded by the operator '!', which negates the match. That list is then further trimmed by the test -perm 0644 to match only those segments that have exactly 0644 permissions. Lastly, we perform the -ls action.
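- The two tests above can be tried safely with the sketch below. It assumes the sandbox files are owned by whoever runs it, so the negated -user test is expected to return nothing; the file names are hypothetical and the numeric user ID is used in place of a user name.

```shell
# Hypothetical sandbox: two files with different permission bits.
perm_dir=$(mktemp -d)
touch "$perm_dir/world-readable" "$perm_dir/private"
chmod 0644 "$perm_dir/world-readable"
chmod 0600 "$perm_dir/private"

# -perm 0644 matches only files whose permission bits are exactly 0644.
exact_0644=$(cd "$perm_dir" && find . -type f -perm 0644)

# '!' negates -user: list files NOT owned by the current user (none here).
not_mine=$(cd "$perm_dir" && find . -type f ! -user "$(id -u)")

printf 'exactly 0644: %s\nnot owned by me: [%s]\n' "$exact_0644" "$not_mine"
rm -rf "$perm_dir"
```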
- lantstic@lantstic.com [~]# find -maxdepth 2 -type f ! \( -name '*.php' -o -name 'ex1*' \) -size 0c -ctime -5 -exec stat -c%i" "%n {} \;
- 2228356 ./newfile.txt
- 2228574 ./globbing/this file has spaces.css
- 2228575 ./globbing/this file has spaces.js
- 2228576 ./globbing/this file has spaces.jpg
- 2228577 ./globbing/.thisfileishidden
- 2228572 ./globbing/this file has spaces.txt
- 2228573 ./globbing/this file has spaces.exe
- The above example may appear complex; however, we'll take it a step at a time. First, we limit our depth to the current directory and one sub-directory down. Next, we use -type f to match only regular files. The operator '!' is used to negate the following set of tests. We escape the parentheses using \ and set up our OR-statement for the tests -name '*.php' and -name 'ex1*'. Once that list is trimmed, we want only those files that are zero-byte files whose change time occurred within the last five 24-hour periods (without the - preceding the '5', -ctime would only match a file changed exactly five 24-hour periods ago). Once we have that list, the action -exec stat -c%i" "%n {} \; will print the inode number and the pathname. The " " is necessary to print those values with a space between them.
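- The same chain of tests can be rehearsed in a sandbox, as in this sketch. It assumes a find that supports -maxdepth (GNU and BSD find do) and prints the names directly instead of calling stat; the directory layout is hypothetical.

```shell
# Hypothetical tree: empty files at depths 1, 2, and 3, plus one non-empty file.
tree=$(mktemp -d)
mkdir -p "$tree/sub/deep"
touch "$tree/top.txt" "$tree/page.php" "$tree/ex1.log" \
      "$tree/sub/mid.txt" "$tree/sub/deep/low.txt"
echo "not empty" > "$tree/sub/full.txt"

# Depth <= 2, regular files, not *.php or ex1*, zero bytes, changed < 5 days ago.
matches=$(cd "$tree" && find . -maxdepth 2 -type f \
    ! \( -name '*.php' -o -name 'ex1*' \) -size 0c -ctime -5 | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$tree"
```

- low.txt is skipped because it sits three levels down, full.txt because it is not empty, and page.php and ex1.log because of the negated OR-statement.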
- ========================================================
- G/re/p 2.0
- G/re/p Control
- grep has several controls: match control, output control, output prefix control, and context line control. In addition to this, grep allows the use of basic, extended, and Perl regular expressions. We will be covering particular options in depth throughout this lesson. For this lesson, use your shell account on training-ce.internal.hostgator.com, connecting on port 2222. We will be using sample files primarily from the ./regex directory therein. If you do not have that directory, please let an admin trainer know.
- Let's start our lesson with match control.
- Match Control
- The following flags will be the most common pertaining to match control. Please review man grep for a full explanation of these. I recommend taking detailed notes explaining these actions in your notebooks.
- -e
- -i
- -v
- The -e option specifies the PATTERN for which we are searching. The default structure of grep is as follows:
- grep PATTERN [FILE]
- The -e PATTERN form mirrors the above behavior; however, when explicitly provided as an option, repeated use allows multiple patterns to be searched for in a file:
- [root@training-ce regex]# grep -e 'kiwi' -e 'apple' fruit.txt
- apple
- pineapple
- kiwi
- Of course, multiple search strings can also be matched using alternation, escaping the pipe character in the search string (in basic regular expressions, \| is a GNU extension):
- [root@training-ce regex]# grep 'kiwi\|apple' fruit.txt
- apple
- pineapple
- kiwi
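- As a quick check that repeated -e and alternation select the same lines, here is a sketch against a hypothetical four-line fruit list (not the fruit.txt from the training account). It uses -E so that the pipe needs no escaping.

```shell
# Hypothetical sample file.
fruit=$(mktemp)
printf '%s\n' apple banana pineapple kiwi > "$fruit"

with_e=$(grep -e 'kiwi' -e 'apple' "$fruit")   # two patterns via repeated -e
with_ere=$(grep -E 'kiwi|apple' "$fruit")      # same patterns via extended-regex alternation

printf '%s\n' "$with_e"
rm -f "$fruit"
```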
- Option -i toggles case insensitivity:
- [root@training-ce regex]# grep -i 'jeff' names.txt
- Jeffrey
- Jeffery
- This is useful for instances when case is either not controlled or unknown.
- Next, we have the -v option. This option will invert the match. By default grep will print the lines that contain a match of the pattern; -v inverts this such that lines that do not match are sent to stdout:
- [root@training-ce regex]# cat repeat_example.txt
- This sentence has a repeated repeated word.
- This sentence does not.
- [root@training-ce regex]# grep -v repeat repeat_example.txt
- This sentence does not.
- These options are rather straightforward. You can practice a few of these options on your account.
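- For practice away from the training account, a sketch like the following combines -i and -v. The three-name list is hypothetical.

```shell
# Hypothetical sample file.
name_file=$(mktemp)
printf '%s\n' Jeffrey jeff Bob > "$name_file"

case_sensitive=$(grep -c 'jeff' "$name_file")     # only the lowercase line
case_insensitive=$(grep -ci 'jeff' "$name_file")  # both Jeffrey and jeff
inverted=$(grep -vi 'jeff' "$name_file")          # every line that does NOT match

echo "sensitive: $case_sensitive, insensitive: $case_insensitive, inverted: $inverted"
rm -f "$name_file"
```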
- ========================================================
- Output Control of GREP
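- The following options will cover our output control:
- -c
- -L
- -l
- -o
- -s
- -c prints a count of matching lines for each file instead of the lines themselves. Combined with -v, it counts the lines that do not match: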
- [root@training-ce regex]# grep -c apple fruit.txt
- 2
- [root@training-ce regex]# grep -vc apple fruit.txt
- 14
- When used in conjunction with multiple files, -L will print only those files that do not have a corresponding match to the provided pattern:
- [root@training-ce regex]# grep -L 'apple' *.txt
- backreference.txt
- correct_emails.txt
- correct_names.txt
- correct_servers2.txt
- correct_servers.txt
- dot_example.txt
- ips.txt
- names.txt
- repeat_example.txt
- servers.txt
- times.txt
- -l will merely print the names of the files that contain a match, if multiple have been provided. This is useful with find, for example:
- [root@training-ce regex]# grep -l 'nn' *.txt
- ./correct_names.txt
- ./names.txt
- ./correct_errorlog.txt
- ./test2.txt
- ./correct_emails.txt
- ./test1.txt
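- The three file-level options covered so far can be compared side by side with a sketch like this one. The two sample files are hypothetical, and -L is assumed to be available (it is in GNU and BSD grep).

```shell
# Hypothetical sandbox with one matching and one non-matching file.
out_dir=$(mktemp -d)
printf '%s\n' apple pineapple kiwi > "$out_dir/fruit.txt"
printf '%s\n' carrot leek > "$out_dir/veg.txt"

match_count=$(cd "$out_dir" && grep -c 'apple' fruit.txt)            # number of matching lines
with_match=$(cd "$out_dir" && grep -l 'apple' fruit.txt veg.txt)     # files that contain a match
without_match=$(cd "$out_dir" && grep -L 'apple' fruit.txt veg.txt)  # files that contain no match

echo "count: $match_count, -l: $with_match, -L: $without_match"
rm -rf "$out_dir"
```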
- -o will format the output to show only the matched string or pattern. Consider the difference here:
- [root@training-ce regex]# grep Zoidberg foo
- what, you didn't know that Zoidberg likes sandwiches?
- [root@training-ce regex]# grep -o Zoidberg foo
- Zoidberg
- This can be very useful in providing us only the information that we want from a log file without the full output. Consider the following (this example uses basic regex, which we will cover more in-depth later):
- blackest@blackest.info [~]# grep -o '^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' ./access-logs/blackest.info |sort -u
- 157.55.39.125
- 167.114.118.4
- 180.76.15.138
- 180.76.15.147
- 192.185.1.20
- 216.172.180.21
- 220.181.108.180
- 91.210.146.52
- In the above example, with just a bit of simple regex, I am able to print only the unique IPs that have hit the site 'blackest.info' from the access logs.
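- The same pipeline can be tried on a hand-made log, as sketched here. The three log lines are invented for illustration and use documentation-only IP addresses; real access logs will have more fields.

```shell
# Hypothetical access log with one repeated client address.
access_log=$(mktemp)
cat > "$access_log" <<'EOF'
192.0.2.10 - - [07/Oct/2016:11:23:01] "GET / HTTP/1.1" 200 512
198.51.100.7 - - [07/Oct/2016:11:23:05] "GET /blog HTTP/1.1" 200 2048
192.0.2.10 - - [07/Oct/2016:11:24:12] "GET /about HTTP/1.1" 404 128
EOF

# -o keeps only the leading dotted quad; sort -u removes the duplicates.
unique_ips=$(grep -o '^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' "$access_log" | LC_ALL=C sort -u)

printf '%s\n' "$unique_ips"
rm -f "$access_log"
```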
- Lastly, -s will suppress error messages about nonexistent files or unreadable files. There are portability concerns for shell scripting, which can be reviewed in man grep.
- [root@training-ce regex]# grep 'apple' fruit.txt doesnotexist.txt
- fruit.txt:apple
- fruit.txt:pineapple
- grep: doesnotexist.txt: No such file or directory
- [root@training-ce regex]# grep -s 'apple' fruit.txt doesnotexist.txt
- fruit.txt:apple
- fruit.txt:pineapple
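- One way to confirm what -s changes is to count the lines grep emits on stdout and stderr together, as in this sketch. The file names are hypothetical, and missing.txt is deliberately never created.

```shell
# Hypothetical sandbox: one real file; missing.txt does not exist.
s_dir=$(mktemp -d)
printf '%s\n' apple pineapple > "$s_dir/fruit.txt"

# Without -s: two matching lines plus one error message.
noisy=$(cd "$s_dir" && grep 'apple' fruit.txt missing.txt 2>&1 | wc -l | tr -d ' ')

# With -s: the error message about the missing file is suppressed.
quiet=$(cd "$s_dir" && grep -s 'apple' fruit.txt missing.txt 2>&1 | wc -l | tr -d ' ')

echo "lines without -s: $noisy, with -s: $quiet"
rm -rf "$s_dir"
```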
- The output prefixing shown in the above examples leads us to our next topic on output prefix control.
- Output Prefix Control
- These control mechanisms allow us to manage the prefixing for our output:
- -H
- -h
- -n
- -H will provide us with each matching line, prefixed with the pathname of the file containing that match. This is the default behavior when more than one file is searched, as here:
- [root@training-ce regex]# grep '10:' *.txt
- times.txt:10:42 PM
- times.txt:10:24 AM
- times.txt:10:19 PM
- -h will do the opposite of the above. Filename prefixing will be suppressed, providing us only with the line matching the string:
- [root@training-ce regex]# grep -h '10:' *.txt
- 10:42 PM
- 10:24 AM
- 10:19 PM
- -n will prefix each match with its line number. This can be used in conjunction with -H for greater utility when working with multiple files. Let's use this feature with -exec in a find command:
- lantstic@lantstic.com [~]# find . -type f -name ".htaccess" -exec grep -Hn "AddHandler" {} \;
- ./public_html/.htaccess:1:#AddHandler application/x-http-php53 .php
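- Since -exec hands grep one file at a time, -H is what keeps the file name in the output. The sketch below shows the prefixes on a single hypothetical file (sample.conf is an invented name and content).

```shell
# Hypothetical sandbox with a two-line config file.
p_dir=$(mktemp -d)
printf '%s\n' '# comment' 'AddHandler example .php' > "$p_dir/sample.conf"

single_default=$(cd "$p_dir" && grep 'AddHandler' sample.conf)          # one file: no prefix
with_name=$(cd "$p_dir" && grep -H 'AddHandler' sample.conf)            # force the file name
with_name_and_line=$(cd "$p_dir" && grep -Hn 'AddHandler' sample.conf)  # file name and line number

printf '%s\n%s\n%s\n' "$single_default" "$with_name" "$with_name_and_line"
rm -rf "$p_dir"
```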
- Next we will move on to context line control.
- Context Line Control and File Selection
- Our last options concerning grep control are context line control and file selection.
- Context Line Control
- Context Line Control allows us to gather information about the lines surrounding our initial matches. The options for these are as follows:
- -A
- -B
- -C
- All of these options take NUM, a non-negative whole number. This specifies the number of lines we would like as context. -A will specify "lines after context" or lines after the string we have matched, printing the provided number of lines after matching lines:
- lantstic@lantstic.com [~/public_html/blog]# grep -A 2 'BEGIN WordPress' ./.htaccess
- # BEGIN WordPress
- <IfModule mod_rewrite.c>
- RewriteEngine On
- With -B we are able to print a given number of lines of context before each matching line:
- lantstic@lantstic.com [~/public_html/blog]# grep -B 2 'END WordPress' ./.htaccess
- RewriteRule . /blog/index.php [L]
- </IfModule>
- # END WordPress
- Lastly, we have -C. When provided with an integer, we get the full context with matching lines before and after according to the number provided:
- lantstic@lantstic.com [~/public_html/blog]# grep -C1 'special' this-is-a-list.txt
- This is a file.
- Each line is special.
- kthx.
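- All three context options can be compared on one small file, as in this sketch. The five-line file is hypothetical.

```shell
# Hypothetical sample file, one word per line.
ctx=$(mktemp)
printf '%s\n' one two three four five > "$ctx"

after=$(grep -A 1 'three' "$ctx")    # the match plus 1 line after it
before=$(grep -B 1 'three' "$ctx")   # 1 line before the match, then the match
around=$(grep -C 1 'three' "$ctx")   # 1 line before and 1 line after

printf -- '-A 1:\n%s\n-B 1:\n%s\n-C 1:\n%s\n' "$after" "$before" "$around"
rm -f "$ctx"
```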
- File Selection
- grep is able to search multiple files and exclude certain files from being searched. These options are as follows:
- -R
- --exclude
- -R will search recursively through a directory or pathname provided, searching each file for the provided string. Here is an example in conjunction with -l:
- lantstic@lantstic.com [~/public_html/blog]# grep -l -R 'lantstic'
- error_log
- file-list.txt
- grep
- wp-config.php
- .qidb/Dirnames
- .qidb/__db.003
- .qidb/Packages
- wp-admin/error_log
- --exclude allows us to remove files matching a certain glob from our searches. Specifying --exclude-dir will also allow us to exclude a directory entirely. Here are a few examples:
- lantstic@lantstic.com [~/public_html/blog]# grep -l -R 'lantstic' --exclude=__*
- error_log
- file-list.txt
- grep
- wp-config.php
- .qidb/Dirnames
- .qidb/Packages
- wp-admin/error_log
- lantstic@lantstic.com [~/public_html/blog]# grep -l -R 'lantstic' --exclude-dir=.qidb
- error_log
- file-list.txt
- grep
- wp-config.php
- wp-admin/error_log
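- The effect of the two exclusion options can be reproduced in a sandbox, as sketched below. It assumes a grep that supports -R, --exclude, and --exclude-dir (GNU grep does); the file and directory names are hypothetical. The glob is quoted so that the shell cannot expand it before grep sees it, and the options are placed before the pattern for portability.

```shell
# Hypothetical tree: three files that all contain the search string.
r_dir=$(mktemp -d)
mkdir "$r_dir/cache"
echo 'needle' > "$r_dir/a.txt"
echo 'needle' > "$r_dir/__tmp"
echo 'needle' > "$r_dir/cache/c.txt"

all_files=$(cd "$r_dir" && grep -R -l 'needle' . | LC_ALL=C sort)
skip_glob=$(cd "$r_dir" && grep -R -l --exclude='__*' 'needle' . | LC_ALL=C sort)
skip_dir=$(cd "$r_dir" && grep -R -l --exclude-dir=cache 'needle' . | LC_ALL=C sort)

printf 'all:\n%s\n--exclude:\n%s\n--exclude-dir:\n%s\n' "$all_files" "$skip_glob" "$skip_dir"
rm -rf "$r_dir"
```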
- With these options and features covered, we can move on to basic regular expressions. For this next lesson, we will focus primarily on grep for these examples.
- ========================================================
- print only matching string
- = -o
- counts number of line matches
- = -c
- Provides n number of lines before context
- = -B
- Prints only the matching file name.
- = -l
- That's correct!
- ========================================================
- regex metacharacters = characters that have a special meaning beyond their literal meaning
- ^
- $
- .
- [
- [^
- ?
- *
- +
- \
- ========================================================
- Anchors = to search from either the BEGINNING or the END of a line. Using ^ (caret), we can match the BEGINNING (the start) of the line.
- Ex:
- grep '^#' ./.htaccess
- This will output the lines starting with #
- Output:
- lantstic@lantstic.com [~/public_html/blog]# grep '^#' ./.htaccess
- # BEGIN WordPress
- # END WordPress
- To search for lines ENDING with a certain string, place $ after the string
- Ex:
- grep 'WordPress$' ./.htaccess
- Output:
- lantstic@lantstic.com [~/public_html/blog]# grep 'WordPress$' ./.htaccess
- # BEGIN WordPress
- # END WordPress
- Same output as before, because both lines also END with WordPress; grep prints the ENTIRE line that contains the match. Note that the match is case-sensitive: 'Wordpress$' would not match these lines.
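- A small sketch of both anchors against a hypothetical three-line file (a cut-down stand-in for the .htaccess above), including what happens when the case of the search string does not match.

```shell
# Hypothetical sample file.
ht=$(mktemp)
printf '%s\n' '# BEGIN WordPress' 'RewriteEngine On' '# END WordPress' > "$ht"

starts_with_hash=$(grep -c '^#' "$ht")             # lines beginning with '#'
ends_with_wp=$(grep -c 'WordPress$' "$ht")         # lines ending with 'WordPress'
wrong_case=$(grep -c 'Wordpress$' "$ht" || true)   # lowercase 'p': no line matches

echo "^#: $starts_with_hash, WordPress\$: $ends_with_wp, Wordpress\$: $wrong_case"
rm -f "$ht"
```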
- ========================================================
- The dot (.) stands for any one, single character. The dot is usually used with other meta-characters and quantifiers. This becomes an issue when we would like to match a literal dot and not any other character. We can use \ to escape the dot.
- Use \ to escape a character so that it is read as a LITERAL character: \^ will be read as a literal ^ instead of anchoring the search to the start of the line.
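- The difference between the bare dot and the escaped dot shows up clearly on a two-line sample, as in this sketch. The file contents are hypothetical.

```shell
# Hypothetical sample file: one line with a real dot, one without.
dots=$(mktemp)
printf '%s\n' 'a.c' 'abc' > "$dots"

any_char=$(grep -c 'a.c' "$dots")    # '.' matches any single character: both lines
literal_dot=$(grep 'a\.c' "$dots")   # '\.' matches only a literal dot

echo "a.c matches $any_char line(s); a\\.c matches: $literal_dot"
rm -f "$dots"
```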
- [...] matches any ONE of the characters you put in there
- so if you run grep 'DB_[UNH]' ./wp-config.php
- it would search for
- DB_ followed by U, N, or H inside wp-config.php
- Output
- define('DB_NAME', 'lantstic_wrdp1');
- define('DB_USER', 'lantstic_wrdp1');
- define('DB_HOST', 'localhost');
- You can also use [...] to declare ranges
- [0-9]
- [a-z]
- [A-Z]
- lantstic@lantstic.com [~/public_html/blog]# grep '^[a-d]' ./lulzstext.txt
- dollar sign $, but I don't want to match
- any of these anchors. Like here
- This greps all the lines that START with a letter from a to d within the file called lulzstext.txt
- ========================================================
- We can negate a character or set of characters by including the caret (^) as the first character within our brackets, as such: [^...]. The class then matches any single character that is not one of the listed characters or ranges:
- lantstic@lantstic.com [~/public_html/blog]# grep 'DB_[^PC]' ./wp-config.php
- define('DB_NAME', 'lantstic_wrdp1');
- define('DB_USER', 'lantstic_wrdp1');
- define('DB_HOST', 'localhost');
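- The listed and negated classes can be compared on a stand-in config file, as sketched here. The file below is hypothetical (invented values, not a real wp-config.php) and includes the DB_ constants that the negated class is meant to skip.

```shell
# Hypothetical config with six DB_ constants.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
define('DB_NAME', 'example_db');
define('DB_USER', 'example_user');
define('DB_PASSWORD', 'not-a-real-password');
define('DB_HOST', 'localhost');
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
EOF

in_set=$(grep 'DB_[UNH]' "$cfg")        # DB_ followed by U, N, or H
not_in_set=$(grep 'DB_[^PC]' "$cfg")    # DB_ followed by anything except P or C
in_set_count=$(grep -c 'DB_[UNH]' "$cfg")

printf '%s\n' "$in_set"
rm -f "$cfg"
```

- For this particular file, both patterns select the same three lines (NAME, USER, and HOST).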
- ========================================================
- * matches the prior item any number of times, including zero.
- ? matches the prior item at most once (zero or one time). + matches the prior item at least once (one or more times). In grep, ? and + require extended regular expressions (-E), as used below.
- [root@training-ce regex]# grep '^gator[789]*\..*' ./servers.txt
- gator798.hostgator.com
- gator888.hostgator.com
- gator978.hostgator.com
- gator9.hostgator.com
- [root@training-ce regex]# grep -E '^gator[789]?\.' ./servers.txt
- gator9.hostgator.com
- [root@training-ce regex]# grep -E '^gator[789]+\.' ./servers.txt
- gator798.hostgator.com
- gator888.hostgator.com
- gator978.hostgator.com
- gator9.hostgator.com
- negating a character class does not cancel or remove anything from the line
- so
- for example
- [^lam] matches any ONE character that is not l, a, or m
- a line is printed in full, unchanged, if it contains at least one such character
- to print only the matched part of a line, use -o
- To grep between TWO conditions
- For example
- Finding all the names STARTING with E and ENDING with N in the names.txt file would be
- grep -i '^e.*n$' names.txt
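- Where the $ sits matters: outside brackets it anchors the END of the line, while inside brackets it is just a literal dollar sign. The sketch below contrasts the two on a hypothetical list of names (one per line).

```shell
# Hypothetical sample file, one name per line.
name_list=$(mktemp)
printf '%s\n' Ethan Erin Eric Ernest Nathan > "$name_list"

# Anchored on both ends: starts with e/E and ends with n/N.
anchored=$(grep -i '^e.*n$' "$name_list")

# [n$] means "an n or a literal $" anywhere after the start, so Ernest slips through.
loose=$(grep -i '^e.*[n$]' "$name_list")

printf 'anchored:\n%s\nloose:\n%s\n' "$anchored" "$loose"
rm -f "$name_list"
```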
- Whenever you see a ! in a find command, it means NOT
- lantstic@lantstic.com [~]# find . -maxdepth 2 -type f ! -name '*.txt' -size +1G -perm 0444 ! -user ${USER}
- This find command starts from the
- current directory and descends at most one sub folder down, locating all regular files NOT ending with .txt that are larger than 1G in size, have exactly the permissions 0444, and are NOT owned by the current user
- Which basic regex pattern will match only a literal dot.
- c. \.
- ========================================================
- Select all commands that will provide you with the exact binary path of php as it will execute from your command-line.
- The correct answer is: which php, type php
- ========================================================
- find ./mail -maxdepth 2 -delete -type f -size +0c -name "1*" -exec grep -il 'SPAM:' {} \;
- The above will only remove the files larger than 0 bytes, matching the filename pattern "1*" that contains the string 'spam'.
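- FALSE. -delete comes before the tests, so find evaluates it first for every entry it visits under ./mail (down to -maxdepth 2), before -type, -size, -name, or -exec grep can trim the list.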
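- The effect of putting an action ahead of the tests can be observed safely by substituting -print for -delete, as in this sketch. The sandbox directory and file names are hypothetical.

```shell
# Hypothetical sandbox: one file that matches "1*" and one that does not.
mail_dir=$(mktemp -d)
touch "$mail_dir/1abc" "$mail_dir/2def"

# Action first: -print fires for every entry (the directory and both files).
action_first=$(find "$mail_dir" -print -name '1*' | wc -l | tr -d ' ')

# Tests first: only the entry that passes the tests reaches -print.
tests_first=$(find "$mail_dir" -type f -name '1*' -print | wc -l | tr -d ' ')

echo "action first: $action_first entries, tests first: $tests_first entry"
rm -rf "$mail_dir"
```

- Had the action been -delete, every one of those entries would have been a removal attempt.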
- ========================================================
- In as much detail as possible, explain what this command will do:
- lantstic@lantstic.com [~]# find . -maxdepth 2 -type f ! -name '*.txt' -size +1G -perm 0444 ! -user ${USER}
- This find commands is locating starting from
- current directory and the sub folder and the sub folder of the subfolder for all files NOT ending with .txt and not have more than 1G disk space, and not have the permission 444 and not user
- Incorrect. The size must be greater than 1gb
- ========================================================
- Output designated as errors. stderr
- Non-error output. stdout
- Input received from hardware interrupts (peripherals) or other programs. stdin
- ========================================================
- The correct answer is:
- Redirecting only stderr to /dev/null –
- 2>/dev/null
- Appending stdout and stderr to a log file –
- &>> output.txt,
- Truncating stdout and stderr to a log file –
- &> output.txt,
- Redirecting only stdout to /dev/null –
- 1>/dev/null
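- The four redirections can be checked with a tiny function that writes one line to each stream, as sketched below. The function name emit and the log file are hypothetical. The sketch spells the combined redirections in the portable form (> file 2>&1 and >> file 2>&1); in bash these behave the same as &> file and &>> file.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }
log=$(mktemp)

only_out=$(emit 2>/dev/null)        # stderr discarded, stdout captured
only_err=$(emit 2>&1 1>/dev/null)   # stdout discarded, stderr captured

emit > "$log" 2>&1                  # truncate the log, then write both streams
emit >> "$log" 2>&1                 # append both streams to the log
log_lines=$(wc -l < "$log" | tr -d ' ')

echo "only_out=[$only_out] only_err=[$only_err] log_lines=$log_lines"
rm -f "$log"
```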
- ========================================================
- Commands executed in a pipeline are executed in the parent PID and environment.
- FALSE
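- One visible consequence is that a variable set inside a pipeline segment does not survive it. The sketch below assumes bash with default settings (other shells, or bash with the lastpipe option, may behave differently); the variable names are hypothetical.

```shell
# The while loop on the right side of the pipe runs in a subshell.
seen=0
printf 'line\n' | while read -r line; do seen=1; done
after_pipeline=$seen    # the parent's copy is unchanged

# With a redirect instead of a pipe, the loop runs in the current shell.
seen=0
while read -r line; do seen=1; done <<'EOF'
line
EOF
after_redirect=$seen    # the assignment is visible here

echo "after pipeline: $after_pipeline, after redirect: $after_redirect"
```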
- ========================================================
- Explain why there is a difference in output of these two commands:
- root@rstraining.lantstic.com [~]# lynx --dump http://galtse.cx|grep -co '[\.?,\-]$'
- 496
- root@rstraining.lantstic.com [~]# lynx --dump http://galtse.cx|grep -co '[\.?,\-].$'
- 15
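- The first one counts the lines that END with one of the punctuation characters listed in the brackets.
- The second one counts the lines where one of those characters is followed by exactly one more character (the . outside the brackets matches any single character) before the END of the line, which happens far less often.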
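- The two patterns can be compared on a handful of hand-made lines, as in this sketch. The sample lines are hypothetical, and the bracket is written as [.?,-] without backslashes, since the characters need no escaping inside brackets.

```shell
# Hypothetical sample: three lines end in punctuation, one has a character after it.
page=$(mktemp)
printf '%s\n' 'one.' 'two,' 'three?' 'four.)' 'five' > "$page"

ends_with_punct=$(grep -c '[.?,-]$' "$page")        # punctuation is the LAST character
punct_then_one_char=$(grep -c '[.?,-].$' "$page")   # punctuation, then any one character, then end

echo "ends with punctuation: $ends_with_punct, punctuation plus one character: $punct_then_one_char"
rm -f "$page"
```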
- ========================================================