summary refs log tree commit diff
path: root/src/parser.c (follow)
Commit message (Collapse)AuthorAge
* parser: Allow newlines within parameter substitutionHerbert Xu2018-04-02
| | | | | | | | | | | | | | | | | | | | | | | | On Fri, Mar 16, 2018 at 11:27:22AM +0800, Herbert Xu wrote: > On Thu, Mar 15, 2018 at 10:49:15PM +0100, Harald van Dijk wrote: > > > > Okay, it can be trivially modified to something that does work in other > > shells (even if it were actually executed), but gets rejected at parse time > > by dash: > > > > if false; then > > : ${$+ > > } > > fi > > That's just a bug in dash's parser with ${} in general, because > it bombs out without the if clause too: > > : ${$+ > } This patch fixes the parsing of newlines with parameter substitution. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* parser: Fix backquote support in here-document EOF markHerbert Xu2018-03-22
| | | | | | | | | | Currently using backquotes in a here-document EOF mark is broken because dash tries to do command substitution on it. This patch fixes it by checking whether we're looking for an EOF mark during tokenisation. Reported-by: Harald van Dijk <harald@gigawatt.nl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* parser: Fix single-quoted patterns in here-documentsHerbert Xu2018-03-22
| | | | | | | | | | | | | | | | The script x=* cat <<- EOF ${x#'*'} EOF prints * instead of nothing as it should. The problem is that when we're in sqsyntax context in a here-document, we won't add CTLESC as we should. This patch fixes it: Reported-by: Harald van Dijk <harald@gigawatt.nl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* parser: Add syntax stack for recursive parsingHerbert Xu2018-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | Without a stack of syntaxes we cannot correctly these two cases together: "${a#'$$'}" "${a#"${b-'$$'}"}" A recursive parser also helps in some other corner cases such as nested arithmetic expansion with paratheses. This patch adds a syntax stack allocated from the stack using alloca. As a side-effect this allows us to remove the naked backslashes for patterns within double-quotes, which means that EXP_QPAT also has to go. This patch also fixes removes any backslashes that precede right braces when they are present within a parameter expansion context, and backslashes that precede double quotes within inner double quotes inside a parameter expansion in a here-document context. The idea of a recursive parser is based on a patch by Harald van Dijk. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* parser: use pgetc_eatbnl() in more placesHarald van Dijk2018-03-22
| | | | | | | | | | dash has a pgetc_eatbnl function in parser.c which skips any backslash-newline combinations. It's not used everywhere it could be. There is also some duplicated backslash-newline handling elsewhere in parser.c. Replace most of the calls to pgetc() with calls to pgetc_eatbnl() and remove the duplicated backslash-newline handling. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Catch variable length expansions on non-existant specialsHerbert Xu2014-10-30
| | | | | | | | | Currently we only check special variable names that follow directly after $ or ${. So errors such as ${#&} are not caught. This patch fixes that by moving the is_special check to just before we print out the special variable name. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Simplify EOF/newline handling in list parserHerbert Xu2014-10-28
| | | | | | | | | | This patch simplifies the EOF and new handling in the list parser. In particular, it eliminates a case where we may leave here-documents unfinished upon EOF. It also removes special EOF/newline handling from parsecmd. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Removed unnecessary pungetc on EOF from parserHerbert Xu2014-10-28
| | | | | | | | Doing a pungetc on an EOF is a noop and is only useful when we don't know what character we're putting back. This patch removes an unnecessary pungetc when we know it's EOF. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Add nlprompt/nlnoprompt helpersHerbert Xu2014-09-29
| | | | | | | This patch adds the nlprompt/nlnoprompt helpers to isolate code dealing with newlines and prompting. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Handle backslash newlines properly after dollar signHerbert Xu2014-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Tue, Aug 26, 2014 at 12:34:42PM +0000, Eric Blake wrote: > On 08/26/2014 06:15 AM, Oleg Bulatov wrote: > > Hi! > > > > While playing with sh generators I found that dash and bash have different > > interpretations for <slash><newline> sequence. > > > > $ dash -c 'EDIT=xxx; echo $EDIT\ > >> OR' > > xxxOR > > Buggy. > > > $ bash -c 'EDIT=xxx; echo $EDIT\ > > OR' > > /usr/bin/vim > > Correct behavior. > > > > > $ dash -c 'echo "$\ > > (pwd)"' > > $(pwd) > > > > Is it undefined behaviour in POSIX? > > No, it's well-defined, and dash is buggy. POSIX says: > > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03 > > "the shell shall break its input into tokens by applying the first > applicable rule below to the next character in its input" > > Rule 4 covers backslash handling, while rule 5 covers locating the end > of a word to be subject to $ expansion. Therefore, rule 4 should happen > first. Rule 4 defers to the section on quoting, with the caveat that > <newline> joining is the only substitution that happens immediately as > part of the parsing: > > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02 > > "If a <newline> follows the <backslash>, the shell shall interpret this > as line continuation. The <backslash> and <newline> shall be removed > before splitting the input into tokens. Since the escaped <newline> is > removed entirely from the input and is not replaced by any white space, > it cannot serve as a token separator." > > So the fact that dash is treating the elided backslash-newline as a > token separator, and parsing your input as if ${EDIT}OR instead of > ${EDITOR} is a bug in dash. I agree. This patch should resolve this problem and similar ones affecting blackslash newlines after we encounter a dollar sign. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [INPUT] Kill pgetc_macroHerbert Xu2014-09-29
| | | | | | | | pgetc_macro is identical to pgetc except that it's a macro and pgetc isn't. Since there is very little performance difference on modern systems it's time to kill pgetc_macro. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Avoid overflow for very long variable nameJim Meyering2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise, this: $ perl -le 'print "v"x(2**31+1) ."=1"' | dash provokes integer overflow: (gdb) bt #0 doformat (dest=0x61d580, f=0x416a08 "%s: %d: %s: ", ap=0x7fffffffd308) at output.c:310 #1 0x00000000004128c1 in outfmt (file=0x61d580, fmt=0x416a08 "%s: %d: %s: ") at output.c:257 #2 0x000000000040382e in exvwarning2 (msg=0x417339 "Out of space", ap=0x7fffffffd468) at error.c:125 #3 0x000000000040387e in exverror (cond=1, msg=0x417339 "Out of space", ap=0x7fffffffd468) at error.c:156 #4 0x0000000000403938 in sh_error (msg=0x417339 "Out of space") at error.c:172 #5 0x000000000040c970 in ckmalloc (nbytes=18446744071562067984) at memalloc.c:57 #6 0x000000000040ca78 in stalloc (nbytes=18446744071562067972) at memalloc.c:132 #7 0x000000000040ece9 in grabstackblock (len=18446744071562067972) at memalloc.h:67 #8 0x00000000004106b5 in readtoken1 (firstc=118, syntax=0x419522 "", eofmark=0x0, striptabs=0) at parser.c:1040 #9 0x00000000004101a4 in xxreadtoken () at parser.c:826 #10 0x000000000040fe1d in readtoken () at parser.c:697 #11 0x000000000040edcc in parsecmd (interact=0) at parser.c:145 #12 0x000000000040c679 in cmdloop (top=1) at main.c:224 #13 0x000000000040c603 in main (argc=2, argv=0x7fffffffd9f8) at main.c:178 #8 0x00000000004106b5 in readtoken1 (firstc=118, syntax=0x419522 "", eofmark=0x0, striptabs=0) at parser.c:1040 1040 grabstackblock(len); (gdb) p len $30 = -2147483644 Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [SHELL] Optimize dash -c "command" to avoid a forkHerbert Xu2011-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Sun, Apr 10, 2011 at 07:36:49AM +0000, Jonathan Nieder wrote: > From: Jilles Tjoelker <jilles@stack.nl> > Date: Sat, 13 Jun 2009 16:17:45 -0500 > > This change only affects strings passed to -c, when the -s option is > not used. > > Use the EV_EXIT flag to inform the eval machinery that the string > being passed is the entirety of input. This way, a fork may be > omitted in many special cases. > > If there are empty lines after the last command, the evalcmd will not > see the end early enough and forks will not be omitted. The same thing > seems to happen in bash. > > Example: > sh -c 'ps lT' > No longer shows a shell process waiting for ps to finish. > > [jn: ported from FreeBSD SVN r194128. Bugs are mine.] > > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Instead of detecting EOF using the input layer, I'm going to use the parser instead. In either case, we always have to read ahead in order to complete the parsing of the previous node. Therefore we always know whether there is more to come, except in the case where we see a newline/semicolon or similar. For the purposes of sh -c, this should be sufficient. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Fix clobbering of checkkwdHerbert Xu2011-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Sun, Nov 07, 2010 at 02:21:21AM +0000, Jonathan Nieder wrote: > > Just ran into some strange behavior: > > $ cat test.sh > #!/bin/sh > echo hello >greeting > cat <<EOF && > $(cat greeting) > EOF > { > echo $? > cat greeting > } >/dev/null > > > $ sh test.sh > hello > test.sh: 7: {: not found > 127 > hello > test.sh: 10: Syntax error: "}" unexpected > > bash, mksh, pdksh, and ksh93 all print hello as expected. The problem > is reproducible with all versions of dash in the git repo. This is caused by the clobbering of checkkwd due to readtoken recursion while parsing a here document. This patch fixes it by saving the original value of checkkwd. Reported-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [SHELL] Improve LINENO supportHarald van Dijk2011-03-15
| | | | | | | | | | | | This patch improves LINENO support by storing line numbers in the parse tree, for commands as well as for function definitions. It makes LINENO behaves properly when calling functions, and has the added benefit of improved line numbers in error messages when the last-parsed command is not the last-executed one. It removes the earlier LINENO support, and instead sets LINENO from evaltree when a command is executed Signed-off-by: Harald van Dijk <harald@gigawatt.nl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [SHELL] Add preliminary LINENO supportRocky Bernstein2009-08-11
| | | | | | | | | | | | | | | | | | | | | | | Looks like in contrast to what the dash.1 manual page says, expansion of PS{1,2,4} does work. Here is a little patch to set LINENO. The ways in that it is less than ideal mirror the ways that the line number error reporting is also less than ideal. For example if you run this: ( x=$((1/0)) # Just to add another line # And another ) # error reports this line The error reported will be the closing parenthesis even though I think most people would prefer the error to be the one where x was set. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [EXPAND] Fix quoted pattern patch breakageHerbert Xu2009-06-27
| | | | | | | | | | | | | | The change [EXPAND] Do not quote back slashes in parameter expansions outside quotes broke quote removal after parameter expansion. This is because its effecte extended beyond that of quoted patterns. This patch fixes this by limiting the change to just quoted patterns. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Use CHKNL to parse case statementsHerbert Xu2009-02-22
| | | | | | | | Instead of open-coding the newline loop, use the CHKNL flag to get readtoken to eat the newlines before the in keyword for the case statement. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Allow newlines after var name in for statementsHerbert Xu2009-02-22
| | | | | | | | | POSIX allows newlines before the "in" keyword in for statements so we should too. Thanks to Maximilian Bernöcker for reporting this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [BUILD] Fixed build on NetBSDAleksey Cheusov2009-01-13
| | | | | | | | | | Hi, I propose to apply the following patch for dash. The problem is alloca.h is absent on many platforms including NetBSD I'm running. Also, NetBSD's version of mktemp doesn't work without temporary filename pattern. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [SHELL] Use uninitialized_var to silence bogus warningsHerbert Xu2008-05-03
| | | | | | | | gcc generates bogus warnings about uninitialised variables in parser.c. This patch borrows the uninitialized_var macro from Linux to silence them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [SHELL] Fixed klibc/klcc build problemsDan McGee2008-05-03
| | | | | | | | | | | | klibc does not have mempcpy, so system.h must be included where this is used to provide the replacement. glob.h doesn't exist, so we need to guard this include with the HAVE_GLOB definition. Finally, klcc didn't like the syntax of the main definition in mksignames, and the resulting program segfaulted when trying to dereference any part of the argv array. Updating the main function definition solved the problem. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Do not show prompts in expandstrHerbert Xu2007-12-27
| | | | | | | | | | Once I fixed the previous problem it became apparent that we never dealt with prompts with new-lines in them correctly. The problem is that we showed a secondary prompt for each of them. This patch disables prompt generation in expandstr. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Add FAKEEOFMARK for expandstrHerbert Xu2007-12-27
| | | | | | | | | | | | | | | | Previously expandstr used the string "" to indicate that it needs to be treated just like a here-doc except that there is no terminator. However, the string "" is in fact a valid here-doc terminator so now that we deal with it correctly expandstr no longer works in the presence of new-lines in the prompt. This patch introduces the FAKEEOFMARK macro which does not equal any real EOF marker but is distinct from the NULL pointer which is used to indicate non-here-doc contexts. Thanks to Markus Triska for reporting this regression. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PARSER] Removed noexpand/length check on eofmarkHerbert Xu2007-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Tue, Oct 30, 2007 at 04:23:35AM +0000, Oleg Verych wrote: > > } 8<<"" > ====================== Actually this (the empty delim) only works with dash by accident. I've tried bash and pdksh and they both terminate on the first empty line which is what you would expect rather than EOF. The real Korn shell does something completely different. I've fixed this in dash to conform to bash/pdksh. > In [0] it's stated, that delimiter isn't evaluated (expanded), only > quoiting must be checked. That if() seems to be completely bogus. OK I agree. The reason it was there is because the parser would have already replaced the dollar sign by an internal representation. I've fixed it properly with this patch. Test case: cat <<- $a OK $a cat <<- "" OK echo OK Old result: dash: Syntax error: Illegal eof marker for << redirection OK echo OK New result: OK OK OK
* [PARSER] Fix here-doc corruptionHerbert Xu2007-10-20
| | | | | | | | | | | | | | | | | | | | | | | The change [PARSER] Recognise here-doc delimiters terminated by EOF introduced a regerssion whereby lines starting with eofmark but are not equal to eofmark would be corrupted. This patch fixes it. Test case: cat << _ACEOF _ASBOX _ACEOF Old result: SASBOX New result: _ASBOX
* [PARSER] Report substition errors at expansion timeHerbert Xu2007-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Wed, Apr 11, 2007 at 01:24:21PM -0700, Micah Cowan wrote: > Package: dash > Version: 0.5.3-3 > > Bug first reported against Ubuntu at > https://bugs.launchpad.net/ubuntu/+source/dash/+bug/105634 > by Paul Smith > > The description and some comments from that bug report follow. > > ----- > > This operation fails on Ubuntu: > > $ /bin/sh -c 'if false; then d="${foo/bar}"; fi' > /bin/sh: Syntax error: Bad substitution > > When used with other POSIX shells it succeeds. While semantically the > variable reference ${foo/bar} is not valid, this is not a syntax error > according to POSIX, and since the variable assignment expression is > never invoked (because it's within an "if false") it should not be seen > as an error. > > I ran into this because after restarting my system I could no longer log > in. It turns out that the problem was (a) I had edited .gnomerc to > source my .bashrc file so that my environment would be set properly, and > (b) I had added some new code to my .bashrc WITHIN A CHECK FOR BASH! > that used bash's ${var/match/sub} feature. Even though this code was > within a "case $BASH_VERSION; in *[0-9]*) ... esac (so dash would never > execute it since that variable is not set), it still caused dash to > throw up. > > ----- > > FYI, some relevant details from POSIX: > > Section 2.3, Token Recognition: > > 5. If the current character is an unquoted '$' or '`', the shell shall > identify the start of any candidates for parameter expansion ( Parameter > Expansion), command substitution ( Command Substitution), or arithmetic > expansion ( Arithmetic Expansion) from their introductory unquoted > character sequences: '$' or "${", "$(" or '`', and "$((", respectively. > The shell shall read sufficient input to determine the end of the unit > to be expanded (as explained in the cited sections). > > Section 2.6.2, Parameter Expansion: > > The format for parameter expansion is as follows: > > ${expression} > > where expression consists of all characters until the matching '}'. Any > '}' escaped by a backslash or within a quoted string, and characters in > embedded arithmetic expansions, command substitutions, and variable > expansions, shall not be examined in determining the matching '}'. > > [...] > > The parameter name or symbol can be enclosed in braces, which are > optional except for positional parameters with more than one digit or > when parameter is followed by a character that could be interpreted as > part of the name. The matching closing brace shall be determined by > counting brace levels, skipping over enclosed quoted strings, and > command substitutions. > > --- > > In addition to bash I've checked Solaris /bin/sh and ksh and they don't > report an error. > > ----- > Micah Cowan: > > The applicable portion of POSIX is in XCU 2.10.1: > > "The WORD tokens shall have the word expansion rules applied to them > immediately before the associated command is executed, not at the time > the command is parsed." > > This seems fairly clear to me. This patch moves the error detection to expansion time. Test case: if false; then echo ${a!7} fi echo OK Old result: dash: Syntax error: Bad substitution New result: OK
* [MEMALLOC] Add pushstackmarkHerbert Xu2007-10-06
| | | | | | | | This patch gets rid of the stack mark tracking hack by allocating a little bit of stack memory if we're at risk of planting a stack mark which may be grown later. To do this a new function pushstackmark is added which lets the user pick a bigger amount to allocate since some users do that anyway after setting a stack mark.
* [PARSER] Size optimisations in parameter expansion parserHerbert Xu2007-10-04
| | | | | | | Merge flags into subtype. Do not write subtype out twice. Add likely flag on ${ vs. $NAME. Kill unnecessary (and bogus) PEOA check.
* [PARSER] Fix parsing of ${##1}Herbert Xu2007-10-04
| | | | | | | | | | | | | | | | | Previously dash treated ${##1} as a length operation. This patch fixes that. Test case: set -- a echo ${##1}OK Old result: 1OK New result: OK
* [PARSER] Recognise here-doc delimiters terminated by EOFHerbert Xu2007-09-26
| | | | | | | | | | | | | | | | | | | | | | | | Previously dash required a <newline> character to be present in order for a here-document delimiter to be detected. Allowing EOF in the absence of a <newline> to play the same purpose allows some intuitive scripts to succeed. POSIX seems to be silence on this so this should be OK. Test case: eval 'cat <<- NOT test NOT' echo OK Old result: test NOTOK New result: test OK
* [EXPAND] Move parse-time quote flag detection to run-timeHerbert Xu2007-09-25
| | | | | | | | | | | | | | | | | | | | Because the parser does not recursively parse parameter expansion with respect to quotes, we can't accurately determine quote status at parse time. This patch works around this by moving the quote detection to run-time where we do interpret it recursively. Test case: foo=\\ echo "<${foo#[\\]}>" Old result: <\> New result: <>
* [PARSER] Remove arithmetic expansion collapsing at parse timeHerbert Xu2007-09-24
| | | | | | | | | | | | | | | | | | | Collapsing arithmethc expansion is incorrect when the inner arithmetic expansion is a part of a parameter expansion. Test case: unset a echo $((3 + ${a:=$((4 + 5))})) echo $a Old result: (4 + 5) New result: 9
* [PARSER] Remove superfluous dblquote settings when ending arithHerbert Xu2007-09-24
| | | | | | When an arithmetic expansion terminates and we restore the syntax to the previous one, we don't need to set dblquote because we never changed upon entering the arithmetic expansion.
* [PARSER] Remove superfluous arinest test for dqvarnestHerbert Xu2007-09-24
| | | | | | dqvarnest is only used to determine whether CENDQUOTE should terminate the double-quote syntax. Since CENDQUOTE can never occur while arinest is set, we don't need to take arinest into account for dqvarnest.
* [PARSER] Remove superfluous arinest test in CENDQUOTEHerbert Xu2007-09-24
| | | | | If arinest is set then the syntax must be ARISYNTAX. As such CENDQUOTE can never occur while arinest is set so we don't need to test for it.
* [EXPAND] Do not quote back slashes in parameter expansions outside quotesHerbert Xu2007-09-24
| | | | | | | | | | | | | | | | Test case: a=/b/c/* b=\\ echo ${a%$b*} Old result: /b/c/* New result: /b/c/
* [PARSER] Remove unnecessary inclusion of redir.hHerbert Xu2007-05-06
|
* [PARSER] Only use signed char for syntax arraysHerbert Xu2006-04-23
| | | | | | | | | The existing scheme of using the native char for syntax array indicies makes cross-compiling difficult. Therefore it makes sense to choose one specific sign for everyone. Since signed chars are native to most platforms and i386, it makes more sense to use that if we are to choose one type for everyone.
* [PARSER] Use alloca to get rid of setjmpHerbert Xu2006-03-29
| | | | | Now that the only thing protected by setjmp/longjmp is the saved string, we can allocate it on the stack to get rid of the jump.
* [PARSER] Removed useless parsebackquote flagHerbert Xu2006-03-29
| | | | | The parsebackquote flag is only used in a test where it always has the value zero. So we can remove it altogether.
* Copyright/licence updates and remove all traces of sys/cdefs.hHerbert Xu2005-10-29
| | | | | | | | | | | This change updates the BSD licence to the three-clause version since NetBSD has already done so. This makes dash GPL-compatible. It also adds Christos Zoulas (NetBSD ash maintainer) to the COPYING file. I've added "copyright by Herbert Xu" to most files. Finally all CVS IDs and inclusion of sys/cdefs.h have been removed. The latter is needed for support of klibc.
* Removed unnecessary inclusion of eval.h from parser.c.herbert2005-09-26
|
* Renamed error to sh_error.herbert2005-09-26
|
* Initial import.Herbert Xu2005-09-26