Linux shell command-line processing

For each command it reads, the shell goes through the following steps:

1. Splits the command into tokens, separated by a fixed set of metacharacters: SPACE, TAB, NEWLINE, ;, (, ), <, >, |, and &. Token types include words, keywords, I/O redirectors, and semicolons.
2. Checks the first token of each command to see whether it is a keyword with no quotes or backslashes. If it is an opening keyword, such as if and other control-structure openers, function, {, or (, then the command is actually a compound command. The shell sets things up internally for the compound command, reads the next command, and starts the process again. If the keyword is not a compound-command opener (e.g., it is a control-structure "middle" like then, else, or do, an "end" like fi or done, or a logical operator), the shell signals a syntax error.
3. Checks the first word of each command against the list of aliases. If a match is found, it substitutes the alias's definition and goes back to Step 1; otherwise, it goes on to Step 4. This scheme allows recursive aliases (see Chapter 3). It also allows aliases for keywords to be defined, e.g., alias aslongas=while or alias procedure=function.
4. Performs brace expansion. For example, a{b,c} becomes ab ac.
5. Substitutes the user's home directory ($HOME) for a tilde at the beginning of a word, and user's home directory for ~user.[7]
6. Performs parameter (variable) substitution for any expression that starts with a dollar sign ($).
7. Performs command substitution for any expression of the form $(string).
8. Evaluates arithmetic expressions of the form $((string)).
9. Takes the parts of the line that resulted from parameter, command, and arithmetic substitution and splits them into words again. This time it uses the characters in $IFS as delimiters instead of the set of metacharacters in Step 1.
10. Performs pathname expansion, a.k.a. wildcard expansion, for any occurrences of *, ?, and [ ] pairs.
11. Uses the first word as a command by looking up its source according to the rest of the list in Chapter 4, i.e., as a function, then as a built-in, then as a file in any of the directories in $PATH.
12. Runs the command after setting up I/O redirection and other such things.

[7] Two obscure variations on this: the shell substitutes the current directory ($PWD) for ~+ and the previous directory ($OLDPWD) for ~-. In bash 2.0 there are two more: ~+N and ~-N. These are replaced by the corresponding element in the directory stack as given by the dirs command.
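
The expansion stages described above, from brace expansion through pathname expansion, can be tried out directly in bash. The following is only a minimal sketch and is not taken from the original text: count_args is a hypothetical helper, and the variable names and sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many words it
# received once all expansions and word splitting are done.
count_args() { echo "$#"; }

# Brace expansion generates the two words ab and ac.
braces=$(echo a{b,c})

# Brace expansion runs before parameter substitution, so braces that
# arrive through a variable are left alone.
pattern='a{b,c}'
late_braces=$(echo $pattern)

# Tilde substitution: an unquoted leading ~ is replaced, a quoted one
# is not. ~+ stands for the current directory ($PWD).
tilde=$(echo ~)
quoted_tilde=$(echo "~")
current=$(echo ~+)

# Parameter, command, and arithmetic substitution.
greeting='hello world'
upper=$(printf '%s' "$greeting" | tr 'a-z' 'A-Z')   # command substitution
doubled=$(( ${#greeting} * 2 ))                      # arithmetic evaluation

# Word splitting: the unquoted result of an expansion is split on $IFS;
# double quotes suppress the splitting.
unquoted_count=$(count_args $greeting)
quoted_count=$(count_args "$greeting")

# Pathname expansion, tried in a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/one.txt" "$scratch/two.txt" "$scratch/notes.md"
txt_count=$(count_args "$scratch"/*.txt)
rm -r "$scratch"

echo "braces=$braces late_braces=$late_braces"
echo "tilde=$tilde quoted_tilde=$quoted_tilde current=$current"
echo "upper=$upper doubled=$doubled"
echo "unquoted=$unquoted_count quoted=$quoted_count txt=$txt_count"
```

The late_braces line is the one that shows the ordering most clearly: because the braces only appear after parameter substitution, the earlier brace-expansion stage never sees them.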

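My own understanding:

1. The command line is split on metacharacters into small tokens. A token can be a word, a keyword, an I/O redirector, or a semicolon. The metacharacters are SPACE, TAB, NEWLINE, ;, (, ), <, >, |, and &.
2. The first token of the command is checked. If it is an opening keyword, such as if or another control-structure opener, the shell sets up its internal state and then processes the next command line from the beginning.
3. The first word of the command line is checked to see whether it is an alias. If it is, the alias is substituted and processing restarts at step 1; if not, processing continues with step 4.
4. Brace expansion is performed.
5. If ~ is at the beginning of a word, it is replaced with the user's home directory ($HOME).
6. Parameter substitution is performed for every expression that starts with $.
7. Command substitution is performed for every expression of the form $(string).
8. Expressions of the form $((string)) are evaluated arithmetically (arithmetic evaluation).
9. The results of steps 6-8 are split into words again, using the characters in $IFS as delimiters.
10. Pathname expansion (wildcard substitution) is performed.
11. The first word of the command is looked up as a function, then as a built-in, then as an executable file in $PATH.
12. The command is run after I/O redirection and the rest of its environment have been set up.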
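
The alias stage and the command lookup order can also be checked in bash. This is a sketch under a few assumptions: the alias name aslongas comes from the text above, while the counting loop, the cd wrapper function, and the variable names are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Aliases are expanded only in interactive shells by default, so a script
# has to turn the feature on explicitly.
shopt -s expand_aliases

# An alias may stand in for a keyword.
alias aslongas=while

# The loop is passed through eval so that it is tokenized only after the
# alias has been defined.
count=0
eval 'aslongas [ "$count" -lt 3 ]; do count=$((count + 1)); done'

# Command lookup: a function is found before a built-in of the same name.
kind_before=$(type -t cd)      # no function yet, so cd is the built-in

cd() {
    # Hypothetical wrapper: record the target, then call the real built-in.
    last_target=$1
    builtin cd "$@"
}

kind_after=$(type -t cd)       # now the function is found first

start_dir=$PWD
cd /                           # runs the function, which runs the built-in
reached=$PWD
builtin cd "$start_dir"        # bypass the function explicitly
unset -f cd

echo "count=$count kind_before=$kind_before kind_after=$kind_after"
echo "last_target=$last_target reached=$reached"
```

The builtin keyword is what keeps the wrapper from calling itself: it skips the function lookup and goes straight to the built-in.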

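From a translation found online:

How the shell processes a command

1. The shell first finds the special characters (metacharacters) on the command line and treats them as separators. The metacharacters divide the command line into small pieces called tokens. The shell's metacharacters are:

SPACE, TAB, NEWLINE, &, ;, (, ), <, >, |

2. The tokens are processed and checked to see whether they are shell keywords.

3. Once the tokens have been identified, the shell checks the first word of the command against the list of aliases. If the word appears in the list, the alias is substituted and processing goes back to step 1 to split the line into tokens again.

4. The shell performs tilde (~) substitution.

5. The shell substitutes every variable that is preceded by $.

6. The shell replaces embedded command expressions with the output of the command; these usually use the $(command) notation.

7. The shell evaluates arithmetic expressions written in the $((expression)) notation.

8. The shell splits the command string into tokens again. This time the split is based on the field separators, known as IFS. The default value of IFS contains SPACE, TAB, and NEWLINE.

9. The shell performs wildcard expansion for *, ?, and [ ].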

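10. The shell removes any comments left in the result of the processing so far, and then looks up the command in the following order:

A. Built-in commands

B. Shell functions (defined by the user)

C. Executable script files (found by searching the directories in PATH)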

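11. The last step before execution is to set up all input and output redirections.

12. Finally, the command is executed.

The command that is finally executed may look very different from the command that was originally typed. This is the power of the POSIX shell: very short commands can produce remarkable results.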
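
The difference between what is typed and what is finally run can be made visible with a small bash sketch. It also touches the IFS-based splitting and the point that redirections are set up before the command starts. Everything here is illustrative: show_words is a hypothetical helper, and the sample record and file names are made up.

```shell
#!/usr/bin/env bash
# show_words is a hypothetical helper that prints each word it receives
# in angle brackets, making the final form of the command visible.
show_words() { printf '<%s>' "$@"; echo; }

# With the default IFS (SPACE, TAB, NEWLINE) the expanded value is split
# on whitespace.
record='alice:x:1000 bob:y:1001'
default_split=$(show_words $record)

# With IFS set to a colon, the same value is split on colons instead.
# The change is confined to a subshell so it does not leak.
colon_split=$(IFS=:; show_words $record)

# A short command line can turn into a longer one by the time it runs.
typed='show_words file{1,2}.$((6*7))'
final=$(show_words file{1,2}.$((6*7)))

# Redirections are set up before the command starts, so the output file
# already exists when ls lists the directory.
scratch=$(mktemp -d)
( cd "$scratch" && ls > listing.txt )
listing=$(cat "$scratch/listing.txt")
rm -r "$scratch"

echo "default: $default_split"
echo "colon:   $colon_split"
echo "typed:   $typed"
echo "final:   $final"
echo "listing: $listing"
```

In the colon case the space is no longer a separator, so "1000 bob" survives as a single word.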

Keep Practising as Kobe does!