Bash - The GNU shell*

Chet Ramey
Case Western Reserve University
chet@po.cwru.edu

1. Introduction

Bash is the shell, or command language interpreter, that will appear in the GNU operating system. The name is an acronym for the "Bourne-Again SHell", a pun on Steve Bourne, the author of the direct ancestor of the current UNIX(R) shell /bin/sh, which appeared in the Seventh Edition Bell Labs Research version of UNIX.

Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) and the C shell (csh), described later in this article. It is ultimately intended to be a conformant implementation of the IEEE POSIX Shell and Utilities specification (IEEE Working Group 1003.2). It offers functional improvements over sh for both interactive and programming use.

While the GNU operating system will most likely include a version of the Berkeley shell csh, Bash will be the default shell. Like other GNU software, Bash is quite portable. It currently runs on nearly every version of UNIX and a few other operating systems - an independently-supported port exists for OS/2, and there are rumors of ports to DOS and Windows NT. Ports to UNIX-like systems such as QNX and Minix are part of the distribution.

The original author of Bash was Brian Fox, an employee of the Free Software Foundation. The current developer and maintainer is Chet Ramey, a volunteer who works at Case Western Reserve University.

2. What's POSIX, anyway?

POSIX is a name originally coined by Richard Stallman for a family of open system standards based on UNIX. There are a number of aspects of UNIX under consideration for standardization, from the basic system services at the system call and C library level to applications and tools to system administration and management. Each area of standardization is assigned to a working group in the 1003 series.
*An earlier version of this article appeared in The Linux Journal.

The POSIX Shell and Utilities standard has been developed by IEEE Working Group 1003.2 (POSIX.2). It concentrates on the command interpreter interface and utility programs commonly executed from the command line or by other programs. An initial version of the standard has been approved and published by the IEEE, and work is currently underway to update it. There are four primary areas of work in the 1003.2 standard:

o    Aspects of the shell's syntax and command language. A number of special builtins such as cd and exec are being specified as part of the shell, since their functionality usually cannot be implemented by a separate executable;

o    A set of utilities to be called by shell scripts and applications. Examples are programs like sed, tr, and awk. Utilities commonly implemented as shell builtins are described in this section, such as test and kill. An expansion of this section's scope, termed the User Portability Extension, or UPE, has standardized interactive programs such as vi and mailx;

o    A group of functional interfaces to services provided by the shell, such as the traditional system() C library function. There are functions to perform shell word expansions, perform filename expansion (globbing), obtain values of POSIX.2 system configuration variables, retrieve values of environment variables (getenv()), and other services;

o    A suite of "development" utilities such as c89 (the POSIX.2 version of cc), and yacc.

Bash is concerned with the aspects of the shell's behavior defined by POSIX.2. The shell command language has of course been standardized, including the basic flow control and program execution constructs, I/O redirection and pipelining, argument handling, variable expansion, and quoting.
The special builtins, which must be implemented as part of the shell to provide the desired functionality, are specified as being part of the shell; examples of these are eval and export. Other utilities appear in the sections of POSIX.2 not devoted to the shell which are commonly (and in some cases must be) implemented as builtin commands, such as read and test. POSIX.2 also specifies aspects of the shell's interactive behavior as part of the UPE, including job control and command line editing. Interestingly enough, only vi-style line editing commands have been standardized; emacs editing commands were left out due to objections.

[Footnote: IEEE, IEEE Standard for Information Technology -- Portable Operating System Interface (POSIX) Part 2: Shell and Utilities, 1992.]

While POSIX.2 includes much of what the shell has traditionally provided, some important things have been omitted as being "beyond its scope." There is, for instance, no mention of a difference between a login shell and any other interactive shell (since POSIX.2 does not specify a login program). No fixed startup files are defined, either - the standard does not mention .profile.

3. Basic Bash features

Since the Bourne shell provides Bash with most of its philosophical underpinnings, Bash inherits most of its features and functionality from sh. Bash implements all of the traditional sh flow control constructs (for, if, while, etc.). All of the Bourne shell builtins, including those not specified in the POSIX.2 standard, appear in Bash. Shell functions, introduced in the SVR2 version of the Bourne shell, are similar to shell scripts, but are defined using a special syntax and are executed in the same process as the calling shell.
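The same-process behavior of shell functions can be illustrated with a small sketch. The function name set_greeting and the variable greeting are made up for this example; the only point is that an assignment made inside a function remains visible to the caller, whereas one made in a subshell does not.

```shell
# Hypothetical names, for illustration only.
set_greeting() {
    # Runs in the calling shell, so this assignment persists after return.
    greeting="hello, $1"
}

set_greeting world
echo "$greeting"              # prints: hello, world

# A parenthesized subshell is a separate process; its assignment is lost.
( greeting="changed in a subshell" )
echo "$greeting"              # still prints: hello, world
```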
Bash has shell functions which behave in a fashion upward-compatible with sh functions.

There are certain shell variables that Bash interprets in the same way as sh, such as PS1, IFS, and PATH. Bash implements essentially the same grammar, parameter and variable expansion semantics, redirection, and quoting as the Bourne shell. Where differences appear between the POSIX.2 standard and traditional sh behavior, Bash follows POSIX.

The Korn Shell (ksh) is a descendent of the Bourne shell written at AT&T Bell Laboratories by David Korn. It provides a number of useful features that POSIX and Bash have adopted. Many of the interactive facilities in POSIX.2 have their roots in the ksh: for example, the POSIX and ksh job control facilities are nearly identical. Bash includes features from the Korn Shell for both interactive use and shell programming. For programming, Bash provides variables such as RANDOM and REPLY, the typeset builtin, the ability to remove substrings from variables based on patterns, and shell arithmetic. RANDOM expands to a random number each time it is referenced; assigning a value to RANDOM seeds the random number generator. REPLY is the default variable used by the read builtin when no variable names are supplied as arguments. The typeset builtin is used to define variables and give them attributes such as readonly. Bash arithmetic allows the evaluation of an expression and the substitution of the result. Shell variables may be used as operands, and the result of an expression may be assigned to a variable. Nearly all of the operators from the C language are available, with the same precedence rules:

    $ echo $((3 + 5 * 32))
    163

[Footnote: Morris Bolsky and David Korn, The KornShell Command and Programming Language, Prentice Hall, 1989.]

For interactive use, Bash implements ksh-style aliases and builtins such as fc (discussed below) and jobs. Bash aliases allow a string to be substituted for a command name. They can be used to create a mnemonic for a UNIX command name (alias del=rm), to expand a single word to a complex command (alias news='xterm -g 80x45 -title trn -e trn -e -S1 -N &'), or to ensure that a command is invoked with a basic set of options (alias ls="/bin/ls -F").

The C shell (csh), originally written by Bill Joy while at Berkeley, is widely used and quite popular for its interactive facilities. Bash includes a csh-compatible history expansion mechanism ("! history"), brace expansion, access to a stack of directories via the pushd, popd, and dirs builtins, and tilde expansion, to generate users' home directories. Tilde expansion has also been adopted by both the Korn Shell and POSIX.2.

There were certain areas in which POSIX.2 felt standardization was necessary, but no existing implementation provided the proper behavior. The working group invented and standardized functionality in these areas, which Bash implements. The command builtin was invented so that shell functions could be written to replace builtins; it makes the capabilities of the builtin available to the function. The reserved word "!" was added to negate the return value of a command or pipeline; it was nearly impossible to express "if not x" cleanly using the sh language. There exist multiple incompatible implementations of the test builtin, which tests files for type and other attributes and performs arithmetic and string comparisons. POSIX considered none of these correct, so the standard behavior was specified in terms of the number of arguments to the command. POSIX.2 dictates exactly what will happen when four or fewer arguments are given to test, and leaves the behavior undefined when more arguments are supplied.
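As a rough sketch of how the command builtin and the "!" reserved word fit together, the fragment below wraps cd in a function of the same name and then tests for failure with "!". The counter cd_count, the variable status, and the directory /hypothetical/missing/dir are invented for the example, and the directory is assumed not to exist.

```shell
cd_count=0

# Replace cd with a function; `command cd` skips function lookup, so the
# wrapper reaches the real builtin instead of calling itself.
cd() {
    command cd "$@" && cd_count=$((cd_count + 1))
}

cd /                                   # succeeds, so the counter becomes 1

# "!" negates the exit status: the branch runs when cd fails.
if ! cd /hypothetical/missing/dir 2>/dev/null; then
    status="missing"
fi

echo "$cd_count $status"               # prints: 1 missing
```

Without "!", the same test would need an empty "then" branch followed by "else", which is the awkwardness the reserved word was added to remove.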
Bash uses the POSIX.2 algorithm, which was conceived by David Korn.

3.1. Features not in the Bourne Shell

There are a number of minor differences between Bash and the version of sh present on most other versions of UNIX. The majority of these are due to the POSIX standard, but some are the result of Bash adopting features from other shells. For instance, Bash includes the new "!" reserved word, the command builtin, the ability of the read builtin to correctly return a line ending with a backslash, symbolic arguments to the umask builtin, variable substring removal, a way to get the length of a variable, and the new algorithm for the test builtin from the POSIX.2 standard, none of which appear in sh.

[Footnote: Bill Joy, An Introduction to the C Shell, UNIX User's Supplementary Documents, University of California at Berkeley, 1986.]

Bash also implements the "$(...)" command substitution syntax, which supersedes the sh `...` construct. The "$(...)" construct expands to the output of the command contained within the parentheses, with trailing newlines removed. The sh syntax is accepted for backwards compatibility, but the "$(...)" form is preferred because its quoting rules are much simpler and it is easier to nest.

The Bourne shell does not provide such features as brace expansion, the ability to define a variable and a function with the same name, local variables in shell functions, the ability to enable and disable individual builtins or write a function to replace a builtin, or a means to export a shell function to a child process.

Bash has closed a long-standing shell security hole by not using the $IFS variable to split each word read by the shell, but splitting only the results of expansion (ksh and the 4.4 BSD sh have fixed this as well). Useful behavior such as a means to abort execution of a script read with the "." command using the return builtin or automatically exporting variables in the shell's environment to children is also not present in the Bourne shell. Bash provides a much more powerful environment for both interactive use and programming.

4. Bash-specific Features

This section details a few of the features which make Bash unique. Most of them provide improved interactive use, but a few programming improvements are present as well. Full descriptions of these features can be found in the Bash documentation.

4.1. Startup Files

Bash executes startup files differently than other shells. The Bash behavior is a compromise between the csh principle of startup files with fixed names executed for each shell and the sh "minimalist" behavior. An interactive instance of Bash started as a login shell reads and executes ~/.bash_profile (the file .bash_profile in the user's home directory), if it exists. An interactive non-login shell reads and executes ~/.bashrc. A non-interactive shell (one begun to execute a shell script, for example) reads no fixed startup file, but uses the value of the variable $ENV, if set, as the name of a startup file. The ksh practice of reading $ENV for every shell, with the accompanying difficulty of defining the proper variables and functions for interactive and non-interactive shells or having the file read only for interactive shells, was considered too complex. Ease of use won out here. Interestingly, the next release of ksh will change to reading $ENV only for interactive shells.

4.2. New Builtin Commands

There are a few builtins which are new or have been extended in Bash. The enable builtin allows builtin commands to be turned on and off arbitrarily. To use the version of echo found in a user's search path rather than the Bash builtin, enable -n echo suffices.
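A minimal sketch of the enable builtin in use, assuming Bash and assuming an echo executable exists somewhere in the search path (true on most systems, but not guaranteed). The variable names before, after, and restored are illustrative only.

```shell
#!/usr/bin/env bash
before=$(type -t echo)            # "builtin" while the builtin is enabled

enable -n echo                    # turn the builtin off
after=$(type -t echo || true)     # typically "file": the search path is used instead

enable echo                       # turn the builtin back on
restored=$(type -t echo)

printf '%s -> %s -> %s\n' "$before" "$after" "$restored"
```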
The help builtin provides quick synopses of the shell facilities without requiring access to a manual page. Builtin is similar to command in that it bypasses shell functions and directly executes builtin commands. Access to a csh-style stack of directories is provided via the pushd, popd, and dirs builtins. Pushd and popd insert and remove directories from the stack, respectively, and dirs lists the stack contents. On systems that allow fine-grained control of resources, the ulimit builtin can be used to tune these settings. Ulimit allows a user to control, among other things, whether core dumps are to be generated, how much memory the shell or a child process is allowed to allocate, and how large a file created by a child process can grow. The suspend command will stop the shell process when job control is active; most other shells do not allow themselves to be stopped like that. Type, the Bash answer to which and whence, shows what will happen when a word is typed as a command:

    $ type export
    export is a shell builtin
    $ type -t export
    builtin
    $ type bash
    bash is /bin/bash
    $ type cd
    cd is a function
    cd ()
    {
        builtin cd ${1+"$@"} && xtitle $HOST: $PWD
    }

Various modes tell what a command word is (reserved word, alias, function, builtin, or file) or which version of a command will be executed based on a user's search path. Some of this functionality has been adopted by POSIX.2 and folded into the command utility.

4.3. Editing and Completion

One area in which Bash shines is command line editing. Bash uses the readline library to read and edit lines when interactive. Readline is a powerful and flexible input facility that a user can configure to individual tastes. It allows lines to be edited using either emacs or vi commands, where those commands are appropriate.
The full capability of emacs is not present - there is no way to execute a named command with M-x, for instance - but the existing commands are more than adequate. The vi mode is compliant with the command line editing standardized by POSIX.2.

Readline is fully customizable. In addition to the basic commands and key bindings, the library allows users to define additional key bindings using a startup file. The inputrc file, which defaults to the file ~/.inputrc, is read each time readline initializes, permitting users to maintain a consistent interface across a set of programs. Readline includes an extensible interface, so each program using the library can add its own bindable commands and program-specific key bindings. Bash uses this facility to add bindings that perform history expansion or shell word expansions on the current input line.

Readline interprets a number of variables which further tune its behavior. Variables exist to control whether or not eight-bit characters are directly read as input or converted to meta-prefixed key sequences (a meta-prefixed key sequence consists of the character with the eighth bit zeroed, preceded by the meta-prefix character, usually escape, which selects an alternate keymap), to decide whether to output characters with the eighth bit set directly or as a meta-prefixed key sequence, whether or not to wrap to a new screen line when a line being edited is longer than the screen width, the keymap to which subsequent key bindings should apply, or even what happens when readline wants to ring the terminal's bell. All of these variables can be set in the inputrc file.

The startup file understands a set of C preprocessor-like conditional constructs which allow variables or key bindings to be assigned based on the application using readline, the terminal currently being used, or the editing mode.
Users can add program-specific bindings to make their lives easier: I have bindings that let me edit the value of $PATH and double-quote the current or previous word:

    # Macros that are convenient for shell interaction
    $if Bash
    # edit the path
    "\C-xp": "PATH=${PATH}\e\C-e\C-a\ef\C-f"
    # prepare to type a quoted word -- insert open and close double
    # quotes and move to just after the open quote
    "\C-x\"": "\"\"\C-b"
    # Quote the current or previous word
    "\C-xq": "\eb\"\ef\""
    $endif

There is a readline command to re-read the file, so users can edit the file, change some bindings, and begin to use them almost immediately.

Bash implements the bind builtin for more dynamic control of readline than the startup file permits. Bind is used in several ways. In list mode, it can display the current key bindings, list all the readline editing directives available for binding, list which keys invoke a given directive, or output the current set of key bindings in a format that can be incorporated directly into an inputrc file. In batch mode, it reads a series of key bindings directly from a file and passes them to readline. In its most common usage, bind takes a single string and passes it directly to readline, which interprets the line as if it had just been read from the inputrc file. Both key bindings and variable assignments may appear in the string given to bind.

The readline library also provides an interface for word completion. When the completion character (usually TAB) is typed, readline looks at the word currently being entered and computes the set of filenames of which the current word is a valid prefix. If there is only one possible completion, the rest of the characters are inserted directly, otherwise the common prefix of the set of filenames is added to the current word.
A second TAB character entered immediately after a non-unique completion causes readline to list the possible completions; there is an option to have the list displayed immediately. Readline provides hooks so that applications can provide specific types of completion before the default filename completion is attempted. This is quite flexible, though it is not completely user-programmable. Bash, for example, can complete filenames, command names (including aliases, builtins, shell reserved words, shell functions, and executables found in the file system), shell variables, usernames, and hostnames. It uses a set of heuristics that, while not perfect, is generally quite good at determining what type of completion to attempt.

4.4. History

Access to the list of commands previously entered (the command history) is provided jointly by Bash and the readline library. Bash provides variables ($HISTFILE, $HISTSIZE, and $HISTCONTROL) and the history and fc builtins to manipulate the history list. The value of $HISTFILE specifies the file where Bash writes the command history on exit and reads it on startup. $HISTSIZE is used to limit the number of commands saved in the history. $HISTCONTROL provides a crude form of control over which commands are saved on the history list: a value of ignorespace means to not save commands which begin with a space; a value of ignoredups means to not save commands identical to the last command saved. $HISTCONTROL was named $history_control in earlier versions of Bash; the old name is still accepted for backwards compatibility. The history command can read or write files containing the history list and display the current list contents. The fc builtin, adopted from POSIX.2 and the Korn Shell, allows display and re-execution, with optional editing, of commands from the history list.
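The history variables described above are usually set in a startup file. The fragment below is a sketch of what that might look like in ~/.bashrc; the file name and the particular values are assumptions chosen for illustration, not recommended defaults.

```shell
# Hypothetical ~/.bashrc fragment; the values are examples only.
HISTFILE=~/.bash_history     # file written on exit and read on startup
HISTSIZE=500                 # limit on the number of commands saved
HISTCONTROL=ignoredups       # skip a command identical to the last one saved
```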
The readline library offers a set of commands to search the history list for a portion of the current input line or a string typed by the user. Finally, the history library, generally incorporated directly into the readline library, implements a facility for history recall, expansion, and re-execution of previous commands very similar to csh ("bang history", so called because the exclamation point introduces a history substitution):

    $ echo a b c d e
    a b c d e
    $ !! f g h i
    echo a b c d e f g h i
    a b c d e f g h i
    $ !-2
    echo a b c d e
    a b c d e
    $ echo !-2:1-4
    echo a b c d
    a b c d

The command history is only saved when the shell is interactive, so it is not available for use by shell scripts.

4.5. New Shell Variables

There are a number of convenience variables that Bash interprets to make life easier. These include FIGNORE, which is a set of filename suffixes identifying files to exclude when completing filenames; HOSTTYPE, which is automatically set to a string describing the type of hardware on which Bash is currently executing; command_oriented_history, which directs Bash to save all lines of a multiple-line command such as a while or for loop in a single history entry, allowing easy re-editing; and IGNOREEOF, whose value indicates the number of consecutive EOF characters that an interactive shell will read before exiting - an easy way to keep yourself from being logged out accidentally. The auto_resume variable alters the way the shell treats simple command names: if job control is active, and this variable is set, single-word simple commands without redirections cause the shell to first look for and restart a suspended job with that name before starting a new process.

4.6. Brace Expansion

Since sh offers no convenient way to generate arbitrary strings that share a common prefix or suffix (filename expansion requires that the filenames exist), Bash implements brace expansion, a capability picked up from csh. Brace expansion is similar to filename expansion, but the strings generated need not correspond to existing files. A brace expression consists of an optional preamble, followed by a pair of braces enclosing a series of comma-separated strings, and an optional postamble. The preamble is prepended to each string within the braces, and the postamble is then appended to each resulting string:

    $ echo a{d,c,b}e
    ade ace abe

As this example demonstrates, the results of brace expansion are not sorted, as they are by filename expansion.

4.7. Process Substitution

On systems that can support it, Bash provides a facility known as process substitution. Process substitution is similar to command substitution in that its specification includes a command to execute, but the shell does not collect the command's output and insert it into the command line. Rather, Bash opens a pipe to the command, which is run in the background. The shell uses named pipes (FIFOs) or the /dev/fd method of naming open files to expand the process substitution to a filename which connects to the pipe when opened. This filename becomes the result of the expansion. Process substitution can be used to compare the outputs of two different versions of an application as part of a regression test:

    $ cmp <(old_prog) <(new_prog)

4.8. Prompt Customization

One of the more popular interactive features that Bash provides is the ability to customize the prompt. Both $PS1 and $PS2, the primary and secondary prompts, are expanded before being displayed.
Parameter and variable expansion is performed when the prompt string is expanded, so any shell variable can be put into the prompt (e.g., $SHLVL, which indicates how deeply the current shell is nested). Bash specially interprets characters in the prompt string preceded by a backslash. Some of these backslash escapes are replaced with the current time, the date, the current working directory, the username, and the command number or history number of the command being entered. There is even a backslash escape to cause the shell to change its prompt when running as root after an su. Before printing each primary prompt, Bash expands the variable $PROMPT_COMMAND and, if it has a value, executes the expanded value as a command, allowing additional prompt customization. For example, this assignment causes the current user, the current host, the time, the last component of the current working directory, the level of shell nesting, and the history number of the current command to be embedded into the primary prompt:

    $ PS1='\u@\h [\t] \W($SHLVL:\!)\$ '
    chet@odin [21:03:44] documentation(2:636)$ cd ..
    chet@odin [21:03:54] src(2:637)$

The string being assigned is surrounded by single quotes so that if it is exported, the value of $SHLVL will be updated by a child shell:

    chet@odin [21:17:35] src(2:638)$ export PS1
    chet@odin [21:17:40] src(2:639)$ bash
    chet@odin [21:17:46] src(3:696)$

The \$ escape is displayed as "$" when running as a normal user, but as "#" when running as root.

4.9. File System Views

Since Berkeley introduced symbolic links in 4.2 BSD, one of their most annoying properties has been the "warping" to a completely different area of the file system when using cd, and the resultant non-intuitive behavior of "cd ..". The UNIX kernel treats symbolic links physically.
When the kernel is translating a pathname in which one component is a symbolic link, it replaces all or part of the pathname while processing the link. If the contents of the symbolic link begin with a slash, the kernel replaces the pathname entirely; if not, the link contents replace the current component. In either case, the symbolic link is visible. If the link value is an absolute pathname, the user finds himself in a completely different part of the file system.

Bash provides a logical view of the file system. In this default mode, command and filename completion and builtin commands such as cd and pushd which change the current working directory transparently follow symbolic links as if they were directories. The $PWD variable, which holds the shell's idea of the current working directory, depends on the path used to reach the directory rather than its physical location in the local file system hierarchy. For example:

    $ cd /usr/local/bin
    $ echo $PWD
    /usr/local/bin
    $ pwd
    /usr/local/bin
    $ /bin/pwd
    /net/share/sun4/local/bin
    $ cd ..
    $ pwd
    /usr/local
    $ /bin/pwd
    /net/share/sun4/local
    $ cd ..
    $ pwd
    /usr
    $ /bin/pwd
    /usr

One problem with this, of course, arises when programs that do not understand the shell's logical notion of the file system interpret ".." differently. This generally happens when Bash completes filenames containing ".." according to a logical hierarchy which does not correspond to their physical location. For users who find this troublesome, a corresponding physical view of the file system is available:

    $ cd /usr/local/bin
    $ pwd
    /usr/local/bin
    $ set -o physical
    $ pwd
    /net/share/sun4/local/bin

4.10. Internationalization

One of the most significant improvements in version 1.13 of Bash was the change to "eight-bit cleanliness". Previous versions used the eighth bit of characters to mark whether or not they were quoted when performing word expansions.
While this did not affect the majority of users, most of whom used only seven-bit ASCII characters, some found it confining. Beginning with version 1.13, Bash implemented a different quoting mechanism that did not alter the eighth bit of characters. This allowed Bash to manipulate files with "odd" characters in their names, but did nothing to help users enter those names, so version 1.13 introduced changes to readline that made it eight-bit clean as well. Options exist that force readline to attach no special significance to characters with the eighth bit set (the default behavior is to convert these characters to meta-prefixed key sequences) and to output these characters without conversion to meta-prefixed sequences. These changes, along with the expansion of keymaps to a full eight bits, enable readline to work with most of the ISO-8859 family of character sets, used by many European countries.

4.11. POSIX Mode

Although Bash is intended to be POSIX.2 conformant, there are areas in which the default behavior is not compatible with the standard. For users who wish to operate in a strict POSIX.2 environment, Bash implements a POSIX mode. When this mode is active, Bash modifies its default operation where it differs from POSIX.2 to match the standard. POSIX mode is entered when Bash is started with the -posix option. This feature is also available as an option to the set builtin, set -o posix. For compatibility with other GNU software that attempts to be POSIX.2 compliant, Bash also enters POSIX mode if the variable $POSIXLY_CORRECT is set when Bash is started or assigned a value during execution. $POSIX_PEDANTIC is accepted as well, to be compatible with some older GNU utilities. When Bash is started in POSIX mode, for example, it sources the file named by the value of $ENV rather than the "normal" startup files, and does not allow reserved words to be aliased.

5. New Features and Future Plans

There are several features introduced in the current version of Bash, version 1.14, and a number under consideration for future releases. This section will briefly detail the new features in version 1.14 and describe several features that may appear in later versions.

5.1. New Features in Bash-1.14

The new features available in Bash-1.14 answer several of the most common requests for enhancements. Most notably, there is a mechanism for including non-visible character sequences in prompts, such as those which cause a terminal to print characters in different colors or in standout mode. There was nothing preventing the use of these sequences in earlier versions, but the readline redisplay algorithm assumed each character occupied physical screen space and would wrap lines prematurely.

Readline has a few new variables, several new bindable commands, and some additional emacs mode default key bindings. A new history search mode has been implemented: in this mode, readline searches the history for lines beginning with the characters between the beginning of the current line and the cursor. The existing readline incremental search commands no longer match identical lines more than once. Filename completion now expands variables in directory names. The history expansion facilities are now nearly completely csh-compatible: missing modifiers have been added and history substitution has been extended.

Several of the features described earlier, such as set -o posix and $POSIX_PEDANTIC, are new in version 1.14. There is a new shell variable, OSTYPE, to which Bash assigns a value that identifies the version of UNIX it's running on (great for putting architecture-specific binary directories into the $PATH).
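One way OSTYPE might be used for this is sketched below, under the assumption that per-platform binaries are kept in a directory named after its value. The $HOME/bin/<ostype> layout and the arch_bin variable are hypothetical, and the fallback to "unknown" only covers shells that do not set OSTYPE.

```shell
# Hypothetical layout: platform-specific binaries live in $HOME/bin/<ostype>.
arch_bin="$HOME/bin/${OSTYPE:-unknown}"

# Prepend the directory only if it is not already in the search path.
case ":$PATH:" in
    *":$arch_bin:"*) ;;                  # already present, nothing to do
    *) PATH="$arch_bin:$PATH" ;;
esac
```

The same startup file can then be shared across machines of different types, since each one picks up its own binary directory.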
Two variables have been renamed: $HISTCONTROL replaces $history_control, and $HOSTFILE replaces $hostname_completion_file. In both cases, the old names are accepted for backwards compatibility. The ksh select construct, which allows the generation of simple menus, has been implemented. New capabilities have been added to existing variables: $auto_resume can now take values of exact or substring, and $HISTCONTROL understands the value ignoreboth, which combines the two previously acceptable values. The dirs builtin has acquired options to print out specific members of the directory stack. The $nolinks variable, which forces a physical view of the file system, has been superseded by the -P option to the set builtin (equivalent to set -o physical); the variable is retained for backwards compatibility. The version string contained in $BASH_VERSION now includes an indication of the patch level as well as the "build version". Some little-used features have been removed: the bye synonym for exit and the $NO_PROMPT_VARS variable are gone. There is now an organized test suite that can be run as a regression test when building a new version of Bash.

The documentation has been thoroughly overhauled: there is a new manual page on the readline library and the info file has been updated to reflect the current version. As always, as many bugs as possible have been fixed, although some surely remain.

5.2. Other Features

There are a few features that I hope to include in later Bash releases. Some are based on work already done in other shells.

In addition to simple variables, a future release of Bash will include one-dimensional arrays, using the ksh implementation of arrays as a model. Additions to the ksh syntax, such as varname=( ... ) to assign a list of words directly to an array and a mechanism to allow the read builtin to read a list of values directly into an array, would be desirable. Given those extensions, the ksh set -A syntax may not be worth supporting (the -A option assigns a list of values to an array, but is a rather peculiar special case).

Some shells include a means of programmable word completion, where the user specifies on a per-command basis how the arguments of the command are to be treated when completion is attempted: as filenames, hostnames, executable files, and so on. The other aspects of the current Bash implementation could remain as-is; the existing heuristics would still be valid. Only when completing the arguments to a simple command would the programmable completion be in effect.

It would also be nice to give the user finer-grained control over which commands are saved onto the history list. One proposal is for a variable, tentatively named HISTIGNORE, which would contain a colon-separated list of commands. Lines beginning with these commands, after the restrictions of $HISTCONTROL have been applied, would not be placed onto the history list. The shell pattern-matching capabilities could also be available when specifying the contents of $HISTIGNORE.

One thing that newer shells such as wksh (also known as dtksh) provide is a command to dynamically load code implementing additional builtin commands into a running shell. This new builtin would take an object file or shared library implementing the "body" of the builtin (xxx_builtin() for those familiar with Bash internals) and a structure containing the name of the new command, the function to call when the new builtin is invoked (presumably defined in the shared object specified as an argument), and the documentation to be printed by the help command (possibly present in the shared object as well).
It would manage the details of extending the internal table of builtins.

A few other builtins would also be desirable: two are the POSIX.2 getconf command, which prints the values of system configuration variables defined by POSIX.2, and a disown builtin, which causes a shell running with job control active to "forget about" one or more background jobs in its internal jobs table. Using getconf, for example, a user could retrieve a value for $PATH guaranteed to find all of the POSIX standard utilities, or find out how long filenames may be in the file system containing a specified directory.

There are no implementation timetables for any of these features, nor are there concrete plans to include them. If anyone has comments on these proposals, feel free to send me electronic mail.

6. Reflections and Lessons Learned

The lesson that has been repeated most often during Bash development is that there are dark corners in the Bourne shell, and people use all of them. In the original description of the Bourne shell, quoting and the shell grammar are both poorly specified and incomplete; subsequent descriptions have not helped much. The grammar presented in Bourne's paper describing the shell distributed with the Seventh Edition of UNIX is so far off that it does not allow the command who|wc. In fact, as Tom Duff states:

    Nobody really knows what the Bourne shell's grammar is. Even examination of the source code is little help.

The POSIX.2 standard includes a yacc grammar that comes close to capturing the Bourne shell's behavior, but it disallows some constructs which sh accepts without complaint - and there are scripts out there that use them. It took a few versions and several bug reports before Bash implemented sh-compatible quoting, and there are still some "legal" sh constructs which Bash flags as syntax errors. Complete sh compatibility is a tough nut.

-----------
S. R. Bourne, "UNIX Time-Sharing System: The UNIX Shell", Bell System Technical Journal, 57(6), July-August, 1978, pp. 1971-1990.
Tom Duff, "Rc - A Shell for Plan 9 and UNIX systems", Proc. of the Summer 1990 EUUG Conference, London, July, 1990, pp. 21-33.

The shell is bigger and slower than I would like, though the current version is substantially faster than previously. The readline library could stand a substantial rewrite. A hand-written parser to replace the current yacc-generated one would probably result in a speedup, and would solve one glaring problem: the shell could parse commands in "$(...)" constructs as they are entered, rather than reporting errors when the construct is expanded.

As always, there is some chaff to go with the wheat. Areas of duplicated functionality need to be cleaned up. There are several cases where Bash treats a variable specially to enable functionality available another way ($notify vs. set -o notify and $nolinks vs. set -o physical, for instance); the special treatment of the variable name should probably be removed. A few more things could stand removal; the $allow_null_glob_expansion and $glob_dot_filenames variables are of particularly questionable value. The $[...] arithmetic evaluation syntax is redundant now that the POSIX-mandated $((...)) construct has been implemented, and could be deleted. It would be nice if the text output by the help builtin were external to the shell rather than compiled into it. The behavior enabled by $command_oriented_history, which causes the shell to attempt to save all lines of a multi-line command in a single history entry, should be made the default and the variable removed.

7. Availability

As with all other GNU software, Bash is available for anonymous FTP from prep.ai.mit.edu:/pub/gnu and from other GNU software mirror sites. The current version is in bash-1.14.1.tar.gz in that directory. Use archie to find the nearest archive site. The latest version is always available for FTP from bash.CWRU.Edu:/pub/dist. Bash documentation is available for FTP from bash.CWRU.Edu:/pub/bash.

The Free Software Foundation sells tapes and CD-ROMs containing Bash; send electronic mail to gnu@prep.ai.mit.edu or call +1-617-876-3296 for more information.

Bash is also distributed with several versions of UNIX-compatible systems. It is included as /bin/sh and /bin/bash on several Linux distributions (more about the difference in a moment), and as contributed software in BSDI's BSD/386* and FreeBSD.

The Linux distribution deserves special mention. There are two configurations included in the standard Bash distribution: a "normal" configuration, in which all of the standard features are included, and a "minimal" configuration, which omits job control, aliases, history and command line editing, the directory stack and pushd/popd/dirs, process substitution, prompt string special character decoding, and the select construct. This minimal version is designed to be a drop-in replacement for the traditional UNIX /bin/sh, and is included as the Linux /bin/sh in several packagings.

8. Conclusion

Bash is a worthy successor to sh. It is sufficiently portable to run on nearly every version of UNIX from 4.3 BSD to SVR4.2, and several UNIX workalikes. It is robust enough to replace sh on most of those systems, and provides more functionality. It has several thousand regular users, and their feedback has helped to make it as good as it is today - a testament to the benefits of free software.
----------- *BSD/386 is a trademark of Berkeley Software Design, Inc.