0000-00-00 00:00:00
Just today I ran into something shiny that piqued my interest. A shell script I'd written in Bash didn't work like I expected it to, with regard to the scope of a variable. I thought the incident was interesting enough to report, although I won't go into the whole scoping story too deeply.
What it basically boils down to is that there was a difference in the way two shells handle a certain situation. A difference that I didn't expect to be there. Not that exciting, but still very educational.
In most programming languages variables have a certain scope: the part of your program within which they can be used. Some variables only exist within one subroutine, while others exist across the whole program or even across multiple parts of the whole.
In shell scripting things aren't that complicated, luckily. In most cases a variable that's set in one part of the script can be used in every other part of the script. There are some notable exceptions, one of which I ran into today without realising it.
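As a minimal sketch of that rule and of one well-known exception: a variable set inside a function stays visible afterwards, while an assignment made inside a sub-shell (here forced with parentheses) is gone once the sub-shell exits. The names `set_greeting`, `GREETING` and `COUNT` are made up for this illustration.

```shell
#!/bin/sh
# A variable assigned inside a function is visible in the rest of the script.
set_greeting() {
    GREETING="hello"
}
set_greeting
echo "After function: GREETING=$GREETING"

# An assignment inside a sub-shell only changes the sub-shell's own copy.
COUNT=0
( COUNT=5 )
echo "After sub-shell: COUNT=$COUNT"
```

The first line printed shows `hello`, the second still shows `0`.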
My situation:
I have a command that outputs a number of lines, some of which I need. The lines that I'm interested in consist of various fields, two of which I need as variables. Depending on the value of one of these variables, a counter needs to be incremented.
I guess that sounds kinda complicated, so here's the real code snippet:
function check_transport_paths
{
    TOTAL=`scstat -W | grep "Transport path:" | wc -l`
    let COUNT=0
    # TPATH rather than PATH, so the shell's own search path is never overwritten
    scstat -W | grep "Transport path:" | awk '{print $3" "$6}' | while read TPATH STATUS
    do
        if [ "$STATUS" = "online" ]
        then
            let COUNT=$COUNT+1
        fi
    done
    if [ $COUNT -lt 1 ]
    then
        echo "NOK - No transport paths online."
        exit $STATE_CRITICAL
    elif [ $COUNT -lt $TOTAL ]
    then
        echo "NOK - One or more transport paths offline."
        exit $STATE_WARNING
    fi
}
While testing my script, I found out that $COUNT would never retain the value it gained in the while-loop. This of course led to the script always failing the check. After some fiddling about, I found out that the problem lay in the use of the while loop: it was being used at the end of a pipe.
To illustrate, the following -does- work.
let COUNT=0
while read i
do
let COUNT=$COUNT+$i
echo $COUNT
done
echo "Total is $COUNT."
This leads to the following output.
$ ./baka.sh
1
1
2
3
3
6
4
10
^D
Total is 10.
However, if I were to create a script called neko.sh that outputs the numbers one through four on separate lines, which is then used in baka.sh... well... it doesn't work :D Regardez!
let COUNT=0
./neko.sh | while read i
do
let COUNT=$COUNT+$i
echo $COUNT
done
echo "Total is $COUNT."
This gives the following output
1
3
6
10
Total is 0.
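After discussing the matter with two of my colleagues (one of them as puzzled as I was, and the other knowing what was going wrong) we came to the following conclusion: because the while loop sits at the end of a pipe, our Bash runs it in a sub-shell, and anything that sub-shell does to $COUNT is lost once the loop ends.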
This conclusion is supported by an example in the "Advanced Bash-Scripting Guide" by Mendel Cooper. In that example an additional comment is made about the scoping of variables with redirected while loops. The comment warns that older shells branch a redirected while into a sub-shell, but also says that Bash and Ksh handle this properly.
I guess our version of Bash is too old :3
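A minimal sketch of one way around this that does not depend on the shell version: let the sub-shell print its result and capture that with command substitution. Current Bash releases still run every part of a pipeline in a sub-shell by default, so the behaviour is not limited to old versions. The `fake_scstat` function and its sample lines are assumptions standing in for the real `scstat -W` output, which may well look different.

```shell
#!/bin/sh
# Hypothetical stand-in for `scstat -W`; the field layout is an assumption.
fake_scstat() {
    printf '%s\n' \
        "Transport path: node1:qfe0 node2:qfe0 Path online" \
        "Transport path: node1:qfe1 node2:qfe1 Path online" \
        "Transport path: node1:qfe2 node2:qfe2 Path faulted"
}

# The loop still runs in a sub-shell, but its final count is printed
# and captured by $( ... ), so the value reaches the main script.
COUNT=$(fake_scstat | grep "Transport path:" | awk '{print $3" "$6}' | {
    n=0
    while read -r tpath status
    do
        if [ "$status" = "online" ]
        then
            n=$((n + 1))
        fi
    done
    echo "$n"
})

echo "Paths online: $COUNT"
```

With the three sample lines above this prints `Paths online: 2`.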
I'd like to thank my colleagues Dennis Roos and Tom Scholten for spending a spare hour with me, hacking at this problem. And I'd like to thank Ondrej Jombik for pointing out the fact that this article didn't make my conclusions very clear in its original version.
kilala.nl tags: unix, sysadmin, programming,
You are free to use this specific work, to share and distribute it and to adapt it for your own purposes. However, you must attribute this work as mine and you must share all of your alterations.
2008-01-15 20:51:00
Posted by ron
There is another way to get around this problem.
while read line
do
let COUNT=$COUNT+1
done < filename.txt
echo $COUNT
Basically if you redirect the input into the end of the loop like this, instead of piping it into the while loop, you will get around the scoping issue. And, yes, this would work in bash.
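The same redirection idea can be applied to the output of a command rather than a file, using Bash's process substitution. This is a sketch under assumptions: it needs Bash (plain `sh` will not accept `<( ... )`), and the `neko` function is a made-up stand-in for the article's `./neko.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for ./neko.sh: prints the numbers 1 to 4, one per line.
neko() {
    printf '%s\n' 1 2 3 4
}

COUNT=0
# The loop runs in the current shell; only neko runs in a separate process.
while read -r i
do
    COUNT=$((COUNT + i))
done < <(neko)

echo "Total is $COUNT."
```

This prints `Total is 10.`, because the loop itself is no longer part of a pipeline.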
2008-01-17 06:38:00
Posted by Cailin Coilleach
Hey, thanks Ron! That's real helpful! I didn't know about your solution, since the < into a loop isn't my style of scripting. I've literally never used it :)
So: thanks!
2008-02-21 17:05:00
Posted by Pat
Very elegant work-around, Ron ... I, too, was unaware of this option for feeding input into a while-loop. Thanks a million!
2008-04-15 00:20:00
Posted by Weidong
Thanks to everyone! Great post and great work-around!
2008-05-01 21:19:00
Posted by Andrew
Ron,
Great workaround. Thanks.
2008-06-26 13:16:00
Posted by George
Top stuff - this exact issue had me totally bamboozled this morning until I found this page. And this is on what I thought was proper ksh but in fact turns out to be PD KSH which I now know also must spawn a new sub-shell when piping.
2008-07-01 17:51:00
Posted by Toni Schlichting(website)
Very often I can't get around /bin/sh. And /bin/sh very often is only a link to bash. So if I need to deal with the problem of limited scope I write these variables to a file
echo "VALUE_1 blah" >> file1
echo "VALUE_2 blub" >> file1
cat file1 | while read line; do
process line
key=`echo $line | cut -d " " -f1`
value=`echo $line | cut -d " " -f2`
echo export $key=$value >> /tmp/cfg.tmp
done
#
# outside the loop I source the shortly written file
#
. /tmp/cfg.tmp
rm /tmp/cfg.tmp
and Now I have my values.
But I must admit, ksh is more elegant.
2008-10-25 17:31:00
Posted by NSK Nikolaos S. Karastathis(website)
Hi, nice post thanks, I just wanted to notify you that your blog appears to output the date and time wrong, perhaps you would like to take care of it. Here's what I see: 0000-00-00 00:00:00
2008-10-29 19:19:00
Posted by Cailin Coilleach
Heh, no, that's not a bug... It's just that I don't know the original date and time at which I wrote the article :p This specific article was written before I converted my website to PHP+MySQL and it didn't include a postdate. But thanks ;)
2008-11-13 19:12:00
Posted by Chip
Great solution Ron, thanks!
2009-02-23 18:42:00
Posted by pakrat
I so enjoy seeing complicated shell scripts that invoke awk that don't actually make use of awk.
scstat | awk -v warn="$STATE_WARNING" -v crit="$STATE_CRITICAL" '
/Transport path:/ { total++; if ($6 == "online") count++ }
END {
if (count == 0) { print "NOK - No transport paths online."; exit crit }
if (count < total) { print "NOK - One or more transport paths offline."; exit warn }
exit 0
}'
The rule of thumb is "If you have a '| grep | awk' pipeline, you don't know what you're doing."
The second rule of thumb is "If you only have one execution block in your awk script, you probably should be using cut."
The third rule of thumb is for anything where you're dealing with a collection of functions that then get fed into logic to then do something... you probably should be using perl or python or ruby.
2009-02-24 04:27:00
Posted by Cailin Coilleach
Hi pakrat, thanks for your code snippet. Also, to discuss your other points:
1. Thanks for implying that I've been wasting the last ten years :)
2. Fair point, but then again: one uses what one's used to.
3. Sure, but then I'd have to learn yet another programming language. Again, one uses what one's used to and personally I can't get a good feel for Perl.
2009-03-20 19:14:00
Posted by Wes
I knew I didn't know what I was doing
2009-04-03 02:30:00
Posted by Mack
This had me going nuts until I found your request - Really useful answer - many thanks for taking the time to post Cailin.
Ron - brilliant answer, saved me doing dirty environment variable sets to detect what was happening in the loop.
2009-04-07 14:45:00
Posted by Holger
Thanks for the workaround, Ron. This just happened to me with ksh on Solaris 10 so using ksh instead of bash probably wouldn't have solved the problem. ;-)
The funny thing is that it used to work the "... | while read line" way at first, but the body of the loop is currently some 120 lines long, so I suppose it just grew too big for ksh to not start a subshell to handle it.
2009-07-08 21:19:00
Posted by dannynoonan
does anybody have another work around that doesn't involve the file system? i'm writing a health check that i want to be able to power through a read-only filesystem -they happen.
i'm piping the output of curl through a while loop and i need to set a variable indicating that a line of stdin triggered an event. is there no way to know that the while loop broke ie. i hit a 'break' statement? if it's a subshell, can i use an exit code somehow? how would one check the exit code of a shell pipeline that's launched via a while loop?
2009-07-10 12:45:00
Posted by Cailin Coilleach
Well, I reckon you could do something like:
whileloop ()
{
curl | while .....
do
...
...
if [[ test ]]
then
return 1
fi
# in Bash the return above only leaves the loop's sub-shell, so pass its status on
done || return 1
...
return 0
}
whileloop
ERROR=$?
[[ $ERROR -gt 0 ]] && echo "There was an error"
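A runnable sketch of that exit-code idea, using only a pipe and no files: the loop is wrapped in an explicit sub-shell and reports the event through its exit status, which the main script reads from `$?`. The `fetch` function and its sample lines are assumptions standing in for the real `curl` call.

```shell
#!/bin/sh
# Hypothetical stand-in for the real `curl ...` call.
fetch() {
    printf '%s\n' "status ok" "status ok" "ERROR disk full" "status ok"
}

# The explicit ( ... ) makes the sub-shell deliberate, so `exit` only
# ever leaves the loop and never the whole script.
fetch | (
    while read -r line
    do
        case $line in
            ERROR*) exit 1 ;;   # event seen: stop reading and report it
        esac
    done
    exit 0                      # end of input reached without the event
)
FOUND=$?                        # exit status of the last part of the pipeline

if [ "$FOUND" -ne 0 ]
then
    echo "There was an error"
fi
```

A single exit status can only carry a small number, so this suits a yes/no flag rather than arbitrary data.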
2010-03-26 22:35:00
Posted by ao2(website)
Thanks for the explanation.
@dannynoonan: if you don't want to use external files you can generate the values to loop on inside a HEREDOC string, see:
http://ao2.it/en/blog/2010/03/26/piping-shell-scripts-and-var-scoping
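One way to read that suggestion, as a minimal sketch: the command's output is expanded inside a here-document that feeds the loop, so the loop runs in the current shell. The `neko` function is a made-up stand-in for `./neko.sh`. Some shells implement here-documents with a temporary file behind the scenes, so this is not guaranteed to avoid the filesystem entirely.

```shell
#!/bin/sh
# Hypothetical stand-in for ./neko.sh: prints the numbers 1 to 4, one per line.
neko() {
    printf '%s\n' 1 2 3 4
}

COUNT=0
# Only $(neko) runs in a sub-shell; the loop reads its output from the
# here-document and updates COUNT in the main shell.
while read -r i
do
    COUNT=$((COUNT + i))
done <<EOF
$(neko)
EOF

echo "Total is $COUNT."
```

This prints `Total is 10.`.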
All content, with exception of "borrowed" blogpost images, or unless otherwise indicated, is copyright Thomas Sluyter.