Download free ebooks with wget

Mask the user agent as Firefox, span hosts (-H), recursively download 2 levels deep with a maximum of 1 redirect, wait a random interval between requests, and dump all PDF files into myBooksFolder without creating any other directories. The host sees a browser-like user agent, although the request pattern may still give away that this is a grabber script.

wget -erobots=off --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" -H -r -l2 --max-redirect=1 -w 5 --random-wait -PmyBooksFolder -nd --no-parent -A.pdf http://amazoo.com
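
The one-liner above is hard to audit at a glance. A minimal sketch that wraps the same flags in a function, one group per line, might look like this. The function name grab_pdfs and its two arguments are hypothetical; the wget options are the ones from the command above.

```shell
#!/bin/bash
# Hypothetical wrapper: grab_pdfs is an illustrative name, not part of wget.
# Usage: grab_pdfs URL [DEST_DIR]
grab_pdfs() {
    local url="$1"
    local dest="${2:-myBooksFolder}"
    local ua="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3"

    # Build the argument list step by step so each group can carry a comment.
    set -- -e robots=off               # do not honor robots.txt
    set -- "$@" --user-agent="$ua"     # present a browser-like user agent
    set -- "$@" -H -r -l2              # span hosts, recurse two levels deep
    set -- "$@" --max-redirect=1       # follow at most one redirect
    set -- "$@" -w 5 --random-wait     # randomized pause based on 5 seconds
    set -- "$@" -P "$dest" -nd         # save into DEST_DIR, no directory tree
    set -- "$@" --no-parent -A.pdf     # stay below the start URL, keep only PDFs

    wget "$@" "$url"
}
```

Calling grab_pdfs http://amazoo.com should behave like the original command; a second argument changes the target folder.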

wget tips

Download the entire contents of a website
wget -mk -p -N "http://www.yahoo.com"

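-m mirror (recursive download with timestamping and unlimited depth)
-k convert links for local browsing
-w20 wait 20 seconds between retrievals (optional, not used in the command above)
-np no-parent, never ascend to the parent directory (optional, not used in the command above)
-nd no-directories, save all files in a single directory (optional, not used in the command above)
-p download all page requisites (images, stylesheets and so on)
-N only download files newer than the local copy (already implied by -m)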
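
The option list mentions -w and -np, which the mirror command itself does not use. A small sketch of a gentler mirror that adds both could look like this; mirror_site and its optional delay argument are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: mirror_site is an illustrative name, not part of wget.
# Usage: mirror_site URL [SECONDS_BETWEEN_REQUESTS]
mirror_site() {
    local url="$1"
    local delay="${2:-20}"

    # -m  mirror, -k  convert links, -p  page requisites,
    # -np stay below the start URL, -w  pause between retrievals
    wget -m -k -p -np -w "$delay" "$url"
}
```

-N is left out here because -m already turns timestamping on.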

Resume downloading large files
wget -c --output-document=file.zip "http://www.hostname.com/files/testfile1"

-c continue a partially downloaded file instead of starting over
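
On a flaky connection it can help to re-run the resume command until it succeeds. A minimal sketch, assuming a fixed pause between attempts is acceptable; resume_download and the RETRY_DELAY variable are hypothetical names, and wget's own --tries option may already be enough in many cases.

```shell
#!/bin/bash
# Hypothetical helper: resume_download and RETRY_DELAY are illustrative names.
# Usage: resume_download URL OUTPUT_FILE [MAX_ATTEMPTS]
resume_download() {
    local url="$1" out="$2" max="${3:-5}"
    local attempt=1

    while [ "$attempt" -le "$max" ]; do
        # -c picks up where the partial OUTPUT_FILE left off
        if wget -c --output-document="$out" "$url"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        # Pause only if another attempt will follow.
        if [ "$attempt" -le "$max" ]; then
            sleep "${RETRY_DELAY:-10}"
        fi
    done
    return 1
}
```

For example, resume_download "http://www.hostname.com/files/testfile1" file.zip 10 would try up to ten times before giving up with a non-zero exit status.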

Download by file type
Downloads all .avi files and ignores everything else.
wget -m -nd -A.avi -erobots=off -i urls.txt

-A acclist, comma-separated list of file name suffixes or patterns to accept
-R rejlist, comma-separated list of file name suffixes or patterns to reject
-i input file with URLs
-erobots=off ignore robots.txt from the web server
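
If urls.txt is maintained by hand, it may pick up blank lines and notes. A sketch that filters those out before handing the list to wget on standard input (-i -) is shown below. fetch_by_type is a hypothetical name, and treating lines that start with # as comments is an assumption about how the list is kept, not something wget does itself.

```shell
#!/bin/bash
# Hypothetical helper: fetch_by_type is an illustrative name, not part of wget.
# Usage: fetch_by_type URL_LIST ACCEPT_SUFFIXES
#   e.g. fetch_by_type urls.txt avi,mkv
fetch_by_type() {
    local list="$1" accept="$2"

    if [ ! -r "$list" ]; then
        echo "cannot read URL list: $list" >&2
        return 1
    fi

    # Drop blank lines and '#' comment lines, then feed the rest to wget on stdin.
    grep -Ev '^[[:space:]]*(#|$)' "$list" |
        wget -m -nd -A "$accept" -e robots=off -i -
}
```

Passing several suffixes, such as avi,mkv, uses the comma-separated form of -A described above.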


Check web site header information (e.g. the server software)
wget -S yahoo.com
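
wget -S prints the response headers but also downloads the page. A minimal sketch that only asks for the headers (--spider) and pulls out the Server line follows. server_header is a hypothetical name, and when a site redirects, only the first Server header seen is reported.

```shell
#!/bin/bash
# Hypothetical helper: server_header is an illustrative name, not part of wget.
# Usage: server_header URL
server_header() {
    # --spider checks the URL without saving the body; -S prints the headers
    # on stderr, so stderr is merged into the pipe for awk to read.
    wget -S --spider "$1" 2>&1 |
        awk 'tolower($1) == "server:" { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}
```

Not every server sends a Server header, so an empty result does not mean the request failed.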




Mangavolume Downloader

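Downloads the full-quality zip for one chapter, or a range of chapters, of a manga series from mangavolume.com. Arguments: series name, first chapter number, and an optional last chapter number.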

#!/bin/bash
base_uri="http://www.mangavolume.com"

# define chapter
chapter_number=$2
last_chapter_number=$3

# define manga_series
# manga_series="gantz"
manga_series=$1

if [[ "${last_chapter_number}" == "" ]]; then
  last_chapter_number=${chapter_number}
fi

while [[ ${chapter_number} -le ${last_chapter_number} ]]
do
	# define current chapter
	manga_chapter="chapter-${manga_series}-${chapter_number}"

	# define current_page
	current_page="${base_uri}/${manga_series}/${manga_chapter}/page-1"
	echo "page: ${current_page}"

	# find download page (first double-quoted value on the "Full Quality.zip" line)
	download_page=$(curl -s "${current_page}" | grep -i "full quality.zip" | awk -F'"' '{ print $2 }')
	download_page="${base_uri}${download_page}"
	echo "lookup: ${download_page}"

	# find download link (match a literal ".zip", not "any character followed by zip")
	curl -s "${download_page}" > tmp_page
	download_location=$(grep -F ".zip" < tmp_page | awk -F'"' '{ print $2 }')
	if [[ "$download_location" != "" ]]; then
		wget "${download_location}"
	fi

	# goto next chapter
	chapter_number=$((chapter_number + 1))
done

# remove temp
rm -f tmp_page
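
The script relies on awk splitting each matching line on double quotes and taking the second field. A small sketch of that step in isolation is below. The sample HTML line is made up for illustration, and the real page markup may differ.

```shell
#!/bin/bash
# Hypothetical sample of the kind of line the script looks for.
sample='<a href="/download/gantz-1">Full Quality.zip</a>'

# Keep lines mentioning "full quality.zip" (case-insensitive), split them on
# double quotes, and print the second field, i.e. the first quoted value.
extract_first_quoted() {
    grep -i "full quality.zip" | awk -F'"' '{ print $2 }'
}

printf '%s\n' "$sample" | extract_first_quoted
# prints: /download/gantz-1
```

This is fragile: if another quoted attribute comes before href on the line, the second field is that attribute's value instead of the link.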

Get all files in an FTP server directory using wget

Recursively fetching every file in a directory with a plain ftp client is a pain. Use wget instead.

wget -r ftp://account_name:password@example.com/directoryname
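
Putting the password in the URL leaves it in the shell history and the process list. A sketch that passes the account name with --user and lets wget prompt for the password is shown below. ftp_get_dir is a hypothetical name, and this assumes a wget build recent enough to support --ask-password.

```shell
#!/bin/bash
# Hypothetical helper: ftp_get_dir is an illustrative name, not part of wget.
# Usage: ftp_get_dir HOST DIRECTORY ACCOUNT_NAME
ftp_get_dir() {
    local host="$1" dir="$2" user="$3"

    # -r recurse, -np stay below DIRECTORY; wget prompts for the password
    wget -r -np --user="$user" --ask-password "ftp://${host}/${dir}/"
}
```

For example: ftp_get_dir example.com directoryname account_name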