Process positional parameters non-destructively in Bash

Read individual positional parameters without changing the argument list itself: neither the count ($#) nor the values ($@) are modified.

# See:
# - http://tldp.org/LDP/abs/html/string-manipulation.html#AEN5117
# - http://tldp.org/LDP/abs/html/internalvariables.html#POSPARAMREF
# - http://tldp.org/LDP/abs/html/internalvariables.html#IFSREF
# - http://tldp.org/LDP/abs/html/parameter-substitution.html#PARAMSUBREF
# - cmdparser, http://codesnippets.joyent.com/posts/show/1697

export IFS=$' \t\n'

# create a fake command line (the second set -- replaces the first; its third argument contains a newline)
set -- "First one" "second" "third:one" "" "Fifth: :one"
set -- "First one" "second" "third:"$'\n'"one" "" "Fifth: :one"

echo $#                             # number of arguments

printf "%s\n" "${@}"                # all arguments         

printf "%s\n" "${1}"                # first argument          
printf "%s\n" "${3}"                # third argument
printf "%s\n" "${5}"                # fifth argument (the last one here)

printf "%s\n" "${@:1}"              # all arguments starting with the first
printf "%s\n" "${@:2}"              # all arguments starting with the second
printf "%s\n" "${@:3}"              # all arguments starting with the third

printf "%s\n" "${@:(-$#):1}"        # first argument          
printf "%s\n" "${@:$#:1}"           # last argument          
printf "%s\n" "${!#}"               # last argument

printf "%s\n" "${@:1:1}"            # first argument
printf "%s\n" "${@:3:1}"            # third argument
printf "%s\n" "${@:5:1}"            # fifth argument
printf "%s\n" "${@:(-1):1}"         # last argument
printf "%s\n" "${@:(-2):1}"         # second-to-last argument



# process one positional parameter at a time without modifying $# or $@
for (( i=1; i <= $#; i++ )); do printf "%s\n" "${@:${i}:1}"; done

echo $#    # 5


# $# and $@ get modified; the loop also ends early because $# shrinks while i grows
for (( i=1; i <= $#; i++ )); do echo $i; printf "%s\n" "$1"; shift; done

echo $#   # 2


set -- "First one" "second" "third:"$'\n'"one" "" "Fifth: :one"

# $# and $@ get modified
while [[ $# -gt 0 ]]; do printf "%s\n" "$1"; shift; done

echo $#   # 0

Snip - extract a named element from an html file using bash

This is a primitive way of getting the kind of data extraction usually associated with real XML tools out of any reasonably modern HTML file (i.e. one that is well-formed and makes proper use of the id attribute). The purpose is mainly simple, fast text browsing, which is especially useful for quick look-ups in dictionaries, thesauruses (thesauri?), encyclopedias and the like. Since the data you're interested in usually sits in one specific element, text browsing gets much better if you extract that element and discard the rest. You run the script with the element given in the standard CSS way (element#id) followed by the file to be parsed, and the script responds by piping that element (and only that element) through html2text, which does a really nice job of turning HTML code into legible console text.

EDIT: Added a quick check for whether the element type occurs in the line at all (before the grep operations). This greatly increases speed with large elements like #content on Wikipedia.
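
The element#id argument can also be split without spawning echo and cut. A minimal sketch using plain parameter expansion, assuming a hypothetical helper called split_selector (the script below does not use it) that sets the same two variable names:

```shell
#!/bin/bash
# split_selector SELECTOR   (hypothetical helper, not part of snip itself)
# Splits "element#id" into $elementtype and $id using parameter expansion only.
split_selector () {
    local selector=$1
    case $selector in
        ?*'#'?*) ;;                        # needs text on both sides of the '#'
        *) echo "expected element#id, got: $selector" >&2; return 1 ;;
    esac
    elementtype=${selector%%'#'*}          # everything before the first '#'
    id=${selector#*'#'}                    # everything after the first '#'
}

split_selector 'div#bodyContent' && printf '%s\n' "$elementtype" "$id"
```

This prints div and bodyContent on separate lines, and it rejects an argument with no '#' instead of silently using the same string for both parts.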

#! /bin/bash

printhelp () {
echo "snip is a simple bash html cutter that works by extracting a specific element 
from an html file and feeding it to html2text. It presupposes well-formed html
and that you know the kind of element you want and its id.

Syntax:
snip <element type>#<element id> <file to be parsed>

Example:
snip div#bodyContent /tmp/index.html
"
exit
}

quitter () {
echo "Element id not found. Quitting." >&2; exit 1
}

[[ "$1" == "-h" || "$1" == "--help" || -z "$1" ]] && printhelp

elementtype="$(echo "$1" | cut -d '#' -f 1)"
id="$(echo "$1" | cut -d '#' -f 2)"
htmlfile="$2"
# line number of the first line containing the id (only the first match is used)
thebegin=$(grep -nioE "id=\"$id\"" "$htmlfile" | head -n 1 | cut -d ':' -f 1)
# echo $thebegin
[ -n "$thebegin" ] || quitter

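# copy everything from the opening tag onwards; anything before the tag on its own line is dropped
sed -n "${thebegin}p" "$htmlfile" | sed -re "s/^.*id=\"$id\"/<$elementtype id=\"$id\"/g" > /tmp/snipfile
sed -n "$((thebegin+1)),\$p" "$htmlfile"  >> /tmp/snipfile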

i=0
element=0
cat /tmp/snipfile | while IFS= read -r line; do
	let i++
	# quick check: only run the greps on lines that mention the element type
	if [[ "$line" =~ "$elementtype" ]]; then
		elementbegincount="$(printf '%s\n' "$line" | grep -io "<$elementtype" | grep -c .)"
		elementendcount="$(printf '%s\n' "$line" | grep -io "</$elementtype" | grep -c .)"
		element=$(($element+$elementbegincount-$elementendcount))
		# nesting depth back at zero: the element is complete
		if [ "$element" -le 0 ]; then
			sed -n "1,${i}p" /tmp/snipfile | html2text
			exit
		fi
	fi
done
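
The loop above boils down to tracking nesting depth: each opening tag of the element type adds one, each closing tag subtracts one, and the element is complete when the count is back at zero. Here is a minimal sketch of that idea in isolation, assuming a hypothetical function called element_end_line that reads HTML on stdin. Unlike the script, it counts tags with parameter expansion instead of grep and only matches the tag name in the exact case given:

```shell
#!/bin/bash
# element_end_line TAG   (hypothetical helper; reads HTML on stdin)
# Prints the number of the line on which the first <TAG ...> element closes.
# Assumes well-formed HTML and that the input starts at the opening tag's line.
element_end_line () {
    local tag=$1 line rest n=0 depth=0 opens closes
    while IFS= read -r line || [[ -n $line ]]; do
        n=$(( n + 1 ))
        [[ $line == *"$tag"* ]] || continue   # cheap check: skip lines without the tag name
        # Count occurrences by deleting them and comparing string lengths.
        rest=${line//"<$tag"/}
        opens=$(( (${#line} - ${#rest}) / (${#tag} + 1) ))
        rest=${line//"</$tag"/}
        closes=$(( (${#line} - ${#rest}) / (${#tag} + 2) ))
        depth=$(( depth + opens - closes ))
        if (( depth <= 0 )); then
            echo "$n"
            return 0
        fi
    done
    return 1                                   # the element never closed
}

element_end_line div <<'EOF'
<div id="content">
  <p>intro</p>
  <div class="inner">nested</div>
</div>
<div id="footer"></div>
EOF
```

The example prints 4: the nested div on line 3 opens and closes on the same line, so the depth only returns to zero on line 4. Like the grep version, this also counts tags that merely begin with the name (e.g. "<divider" for div).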


As an example of how the script can be put to use, here's my Wikipedia lookup (the script above is referred to as 'snip' here):

#! /bin/bash

useragent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071619 Firefox/3.0.1"
if wget -q -U "$useragent" -O /tmp/wpfile "http://en.wikipedia.org/wiki/Special:Search?search=$*"; then
	clear
	echo "Page downloaded..."
	snip div#content /tmp/wpfile | less
else
	echo "No connection, sorry. Please try again."
fi
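
The lookup puts "$*" straight into the URL, so a search term containing characters like & or # can end up changing the query. Here is a minimal sketch of percent-encoding the term first, assuming a hypothetical helper called urlencode (not part of the script above). It is only meant for ASCII search terms; how non-ASCII input behaves depends on the Bash version and locale.

```shell
#!/bin/bash
# urlencode STRING   (hypothetical helper, not part of the original script)
# Percent-encodes every character outside the RFC 3986 unreserved set.
urlencode () {
    local LC_ALL=C                 # make a-z, A-Z and 0-9 mean ASCII ranges only
    local s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;                    # unreserved: keep as is
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;      # everything else: %XX
        esac
    done
    printf '%s\n' "$out"
}

urlencode "Bash (Unix shell)"     # Bash%20%28Unix%20shell%29
```

In the lookup script the encoded term would then go after "search=" in place of the raw $*, e.g. via query=$(urlencode "$*").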