User Manual for


wGetGUI


"GUI"ding the first steps

This is the manual for version 0.90 of wGetGUI; the manual for the new version will follow.
Until then, take this one as a good starting point.

This manual assumes that you already have downloaded and installed a version of wGetGUI, preferably the newest. If not, download it now and install it. Instructions for this and where to download wGet itself are given on the main page.

Before we start the manual on how to use wGetGUI, let me briefly explain how it works. wGet is a command-line program that runs in a DOS box on Windows systems (if you compile it for Windows or take the binaries mentioned on the main page). You call it with a command line like wget -nc -r -l0 -Q1000 http://www.JensRoesner.de. Easy to use? Well, once you know all the switches, yes. But you either have to go through Start->Run, create a .bat file, or find some other solution, and none of these is really fast.
wGetGUI now lets you build your own command line by checking the options you want. wGetGUI translates them into a wGet command line. These command lines (you can create many) are stored in the wGetStart.bat file. After "Start wGetStart.bat", the work for wGetGUI is over. Now wGet does the work and you can close wGetGUI, or you can add new URLs, see below.
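
To make the translation step more concrete, here is a minimal sketch in Python. It is not the actual wGetGUI source; the function name build_command and its parameter names are hypothetical. Only the switches themselves (-nc, -r, -l, -Q) are taken from the example command line above.

```python
# Hypothetical sketch: turn checked options into a wGet command line.
# Function and parameter names are illustrative, not taken from wGetGUI.
def build_command(url, no_clobber=False, recursive=False, depth=None, quota=None):
    parts = ["wget"]
    if no_clobber:
        parts.append("-nc")            # "No clobber" checked
    if recursive:
        parts.append("-r")             # "Recursive retrieval" checked
    if depth is not None:
        parts.append(f"-l{depth}")     # "Depth" field, 0 means unlimited
    if quota is not None:
        parts.append(f"-Q{quota}")     # "Quota" field
    parts.append(url)                  # the URL always comes last here
    return " ".join(parts)


line = build_command("http://www.JensRoesner.de", no_clobber=True,
                     recursive=True, depth=0, quota=1000)
print(line)
```

With these four options set, the sketch reproduces the command line quoted above.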

The URLs presented in the following are my own (apart from the ones like http://host.com/dir1/ :) ). They really exist and should work.

Right after you first start wGetGUI, you should see a window very similar to this one. The picture is actually an image map. Just click on the option you are interested in. When you are using wGetGUI, you can do something similar by holding the mouse pointer over an option. After a few seconds, a brief tool tip will be displayed.
This manual is much more extensive than the tool tips. However, if you want to dig even deeper into wGet, you should also read the manual that comes with every copy of wGet.

main window - image map - please enable pictures
 

URL

Just drag'n'drop or cut'n'paste or type the URL you want to grab. Both Internet Explorer and Netscape have an icon beside their location bar. Click on it and drag it into the URL field of wGetGUI. If wGetGUI is idling in the task bar, just drag the link onto the taskbar icon of wGetGUI, wait a bit until the window pops up and proceed as usual. After your desired URL is shown in the URL field, you can double-click the field to add the address to the list of URLs that will be grabbed by wGet. Or you may choose to modify some settings before that. If the URL does not start with http:// or ftp://, wGetGUI will add the http:// for you.
You can also use the "Add this to wGetStart.bat" button.     -map-
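
The automatic http:// prefix can be pictured like this. This is only a sketch of the rule as described above; the function name normalize_url is hypothetical, and whether wGetGUI trims spaces or compares case-sensitively is an assumption.

```python
# Hypothetical sketch of the "put the http:// for you" rule.
def normalize_url(text):
    url = text.strip()                          # assumption: stray spaces are dropped
    if url.startswith(("http://", "ftp://")):   # already has a supported prefix
        return url
    return "http://" + url                      # otherwise assume plain HTTP


print(normalize_url("www.JensRoesner.de"))
print(normalize_url("ftp://host.com/dir1/"))
```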


always on top

Pressing the "-" makes wGetGUI small and floating on top of your other windows, for example your browser.
You can still add new URLs and start wGet. The normal way is to adjust the options as you want them today, then make wGetGUI small and visit the sites/pages/directories you want to grab. Then just drag&drop&doubleclick into the small window that is now always on top. If you need to change settings, just enlarge wGetGUI by clicking the "+".     -map-


Span

By default, wGet will stay at the host you initiated the grabbing from. If you started from any directory of www.AudiStory.com, it will normally stay at this host. Only if www.AudiStory.de has the same IP address will wGet also download pages from www.AudiStory.de that are linked from www.AudiStory.com (see No DNS look-up).     -map-


All

By default, links to other servers are ignored. If you want them to be followed, you can "Span all hosts". Now, wGet will follow all links it encounters, spanning all hosts. And I mean all. Only use this setting if you know what you are doing, for example with a "closed" directory, or you might actually be trying to download the whole www!     -map-


List

It is much more useful to span only a selection of hosts. You will find that many people use an easy-to-remember URL from which they link to webspace on a free provider on another server. So, if you want to get the real data, you need to span the server(s) where most of the files are. Type that host name into this field, or only a part of it, in which case all hosts with that part in their name will be spanned. But, wait a moment, many free webspace providers have thousands of homepages! So, type in host.com/username/. Ok!
Typing too tedious? Try it the "smart" way. Right-click on the link that leads to the main section, copy the link location (or whatever your browser calls that), click into the field of wGetGUI and press CTRL+V. Or, drag&drop the link from your browser to your wGetGUI field. Sweet, hm? Sadly some browsers (IE 5.5 for example) do not support this. Do not worry about the file name at the end of the URL. wGetGUI truncates everything behind the last "/" because it assumes that it is a file name. So if you enter a URL that does not end with a file name, be sure to enter a closing "/".     -map-
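
The truncation rule explains why the closing "/" matters. Here is a small sketch of it; the function name truncate_to_directory is hypothetical, and what wGetGUI does with an entry that contains no "/" at all is an assumption (the sketch simply keeps it).

```python
# Hypothetical sketch: drop everything behind the last "/" of a list entry.
def truncate_to_directory(entry):
    cut = entry.rfind("/")
    if cut == -1:
        return entry            # assumption: a bare host name is kept as typed
    return entry[:cut + 1]      # keep the text up to and including the last "/"


print(truncate_to_directory("host.com/username/index.html"))
print(truncate_to_directory("host.com/username/"))
print(truncate_to_directory("host.com/username"))
```

The last call shows the trap: without the closing "/", the directory name "username" is taken for a file name and cut off.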


Clear list

Guess what? That clears the list of hosts to span.     -map-


Retrieval options

This section groups together various options that are related to how the files are downloaded.     -map-


No clobber

If this option is set, wGet will download only files that are not already in the target directory. If the file is already there, even if it is old or broken, wGet will not download it. I recommend that setting unless you plan to visit the server multiple times.     -map-


Timestamping

With this option, wGet will add a time stamp to the downloaded file. When you download from the server again, wGet will download the files only if they differ in size and/or date from those on the server. I recommend that setting if you have a server you constantly download stuff from. If you use neither "No Clobber" nor "Timestamping", wGet should make backup copies of the files already in your directory. Please check the wGet manual for complete information on this topic.      -map-
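
The difference between "No clobber" and "Timestamping" can be sketched as a small decision function. This is a simplified model of the behaviour described above, not wGet's real code: the function name should_download, the (size, date) tuples and the rule that "No clobber" wins if both are set are assumptions. Check the wGet manual for the exact rules.

```python
# Simplified, hypothetical model of the two options.
# local is None if the file is not on disk yet; otherwise both local and
# remote are (size_in_bytes, modification_time) tuples.
def should_download(local, remote, no_clobber=False, timestamping=False):
    if local is None:
        return True                      # nothing on disk yet: always fetch
    if no_clobber:
        return False                     # file exists: never touch it
    if timestamping:
        local_size, local_time = local
        remote_size, remote_time = remote
        # fetch only if the server copy is newer or the sizes differ
        return remote_time > local_time or remote_size != local_size
    return True                          # neither option set: fetch again


print(should_download((1000, 50), (1000, 50), timestamping=True))
print(should_download((1000, 50), (1200, 60), timestamping=True))
```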


Continue file download

Ever had a crashed download thanks to the reliability of Windows? Perhaps you can find the incomplete file where you planned to save it. Browsers currently do not support resumed downloads (don't ask me why). But wGet does! Copy the incomplete file to your wGetGUI/wGet directory, add the URL of the wanted file to the URL field of wGetGUI, check "Continue file download", uncheck the other options (highly recommended; otherwise, for example if you have Force directories checked, please read the wGet instructions!) and start the thing. I am surprised every time that this works, a really cool feature.      -map-


Quota

This is simply the amount of Kilobytes wGet will download. Try it with sites and single files.     -map-


Spider

A spider checks whether the linked file works. This can be useful for site owners. It works with a host list and other options.      -map-


No directories

If this option is checked, all files wget downloads will be put in the directory where the wget.exe is. This is useful if you are not downloading many files.      -map-


Force directories

As a default, wget will put a single file into the base directory of wget.exe and multiple files from a website into folders. If this option is checked, directories are created always. I like this option, because it keeps all download results in the same way. But you might find too many directories disturbing.      -map-


Clear server cache

Necessary for the poor people having to deal with notoriously outdated server caches. This option sends a request to the server that will load the newest copy of the file into the cache. This is similar to the reload/refresh we know from browsers.      -map-


Recursive retrieval

This is where the action is. If this is checked, wGet will download not only the file you invoked it with, it will also follow the links on this page. Using the different filters for host names and/or file names, you can steer this in the desired direction.      -map-


Depth

This number gives the depth (or distance) of recursion. 2 will do two hops from the starting page, and so on. If you type 0, the recursion depth is infinite: wGet will download all referenced files until it hits a stopping condition like Quota or until all the allowed hosts are traversed.      -map-
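
What "hops" means is easiest to see on a toy example. The sketch below walks a made-up link table instead of a real website; the function name crawl, the page names and the order in which pages are visited are assumptions for illustration and say nothing about how wGet itself is implemented.

```python
from collections import deque


# Hypothetical sketch: which pages are reached within a given depth?
# links maps a page name to the pages it links to; depth 0 means unlimited.
def crawl(links, start, depth):
    limit = float("inf") if depth == 0 else depth
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])          # (page, hops from the start page)
    while queue:
        page, hops = queue.popleft()
        if hops >= limit:
            continue                     # depth reached: do not follow links
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, hops + 1))
    return order


# Made-up site: index links to a and b, a links to c, c links to d.
site = {"index": ["a", "b"], "a": ["c"], "c": ["d"]}
print(crawl(site, "index", 1))
print(crawl(site, "index", 2))
print(crawl(site, "index", 0))
```

With depth 2, page d is never reached because it is three hops away; with depth 0 everything linked is collected.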


Download "as-is"

Perhaps you know Internet Explorer and one of its cooler features: saving a webpage complete with pictures and sounds. This is what "downloading as-is" is about. In wGet it is called "page-requisites". With the latest wGet, version 1.7, this option ignores "Only go deeper", which makes things a lot easier. So, always use the latest wGet version :)      -map-


Mirror site

This is basically only a shortcut for recursive retrieval with Time Stamping and infinite recursion depth.      -map-


Only go deeper

This is very useful, I think. wget will only go deeper into the directory tree. If you start it at http://host.com/dir1/dir1b/ it will also download from http://host.com/dir1/dir1b/dir1b1/ but not from http://host.com/dir2/      -map-
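
The "only go deeper" rule boils down to a prefix test on the directory part of the URL. The following sketch shows the idea; the function name is_deeper is hypothetical, and requiring the same host is an assumption made to keep the example simple (spanning hosts is handled by the Span options).

```python
from urllib.parse import urlsplit


# Hypothetical sketch: is candidate_url at or below the start directory?
def is_deeper(start_url, candidate_url):
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)
    # directory of the start URL: everything up to and including the last "/"
    start_dir = start.path[:start.path.rfind("/") + 1]
    same_host = candidate.netloc == start.netloc    # assumption, see above
    return same_host and candidate.path.startswith(start_dir)


start = "http://host.com/dir1/dir1b/"
print(is_deeper(start, "http://host.com/dir1/dir1b/dir1b1/"))
print(is_deeper(start, "http://host.com/dir2/"))
```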


Special

This collects options that especially deal with the way wGet interacts with the servers. It does not affect (well, ok, indirectly) the way wGet looks for files to download.      -map-


Retries

Number of retries that wGet will do until it gives up for that file.     -map-


No DNS look-up

This can be veeeery important. Many IP addresses (you know, these numbers like 123.456.123.10) today have more than one host name (www.somenames.com) associated with them. If this option is not checked, wGet will treat all host names of this IP as being the same host. This is of course not wrong, but it may be that you really only want files from www.name.com and not from www.somenames2.com (these are examples, these two URLs do not fit the description).      -map-


Additional Parameters

You know a wGet command line parameter that is not offered in wGetGUI? Then you can enter it here. Do not be fooled by the short length of the field, it will hold your parameters ;)      -map-


Identify as browser

Sometimes a server will not like wGet. In order to avoid this, you can have wGet behave like a browser. This includes telling the server that wGet is actually Mozilla 4.03 and that it was referred to the server by a page on this server and not by an external link.     -map-
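
On the wGet side, this kind of disguise is done with the --user-agent and --referer parameters. The sketch below shows how such parameters could be put together; the function name browser_args, the exact agent string "Mozilla/4.03" and the choice of the server's start page as referring page are assumptions, since the exact values wGetGUI sends are not documented here.

```python
from urllib.parse import urlsplit


# Hypothetical sketch: build the two parameters that make wGet look like
# a browser that followed a link from the same server.
def browser_args(url, agent="Mozilla/4.03"):       # agent string is assumed
    parts = urlsplit(url)
    referer = f"{parts.scheme}://{parts.netloc}/"  # assumed: server start page
    return [f"--user-agent={agent}", f"--referer={referer}"]


print(browser_args("http://host.com/dir1/"))
```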


Convert links

Extremely useful. You know, many webpages have absolute links. If you do not convert them to local ones, this is what happens: every time you want to read the page offline, all the absolutely linked pictures and files will be loaded from the net! When converted, they will be loaded from your disk, which is most often what you want.     -map-


Ignore robots.txt

wGet is normally considered a robot. I would describe a robot as a stupid thing that crawls the web (or a server), mapping and searching. Many search engines use robots to get the data for their searches. However, server admins sometimes use a robots.txt file to keep robots out of certain directories, so that only browsers get in. This applies to wGet even when it identifies as a browser. To get at these files, wGet has to ignore robots.txt, and this is what this option does.     -map-


Running options

These options are basically the way wGet can interact with you, and presents itself on the screen or in a logfile.      -map-


Go to background

Sends wGet to the background once started.     -map-


No info

wGet will print no info on the screen about what it is doing at the moment. Don't use this if you keep asking yourself "Is it still working?"      -map-


All info

This lets wget print much info. This is the way I have it on my machine. This includes a nice representation of how much of a file has been downloaded.      -map-


Some info

Right! This is a compromise between No Info and All Info.     -map-


Append to logfile

Instead of printing on the screen, append the output to a file. If you are using All Info this can get veeery large!      -map-


Overwrite logfile

Starting from the beginning of the log file, writes the new data over the old.      -map-


Logfile

Here you can enter a custom log file name. This has to include the .txt or .log (recommended) ending.      -map-
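
For the curious, the running options above correspond to a handful of wGet switches: -q (quiet), -nv (non-verbose), -v (verbose), -a (append to log file) and -o (overwrite log file). The sketch below shows one plausible mapping; the function name logging_args and the assumption that "No info", "Some info" and "All info" map to -q, -nv and -v in exactly this way are mine, not confirmed by wGetGUI.

```python
# Hypothetical sketch: map the "Running options" to wGet switches.
def logging_args(info="all", logfile=None, append=True):
    # assumed mapping of the three info levels
    level = {"none": "-q", "some": "-nv", "all": "-v"}[info]
    args = [level]
    if logfile:
        # -a appends to the log file, -o starts it from scratch
        args += ["-a" if append else "-o", logfile]
    return args


print(logging_args("some", "wget.log"))
print(logging_args("all", "wget.log", append=False))
```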


Accept/Reject

This lets you include or exclude certain files only.     -map-


Accept

If Accept is checked, the filters are applied to accepting files.      -map-


Reject

If Reject is checked, the filters are applied to rejecting files.      -map-


Filetypes

Here I included a few of the more common file types you might want to include/exclude.      -map-


Custom file list

If you do not like my selection of file types or need more or also want to exclude *thumb*s then write it in here and do not forget the wildcard: *      -map-
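
How a wildcard list acts as an Accept or Reject filter can be tried out with a few lines of Python. This is a sketch of the idea only: the function name is_wanted and the file names are made up, and wGet's own matching may differ in details (see the wGet manual).

```python
from fnmatch import fnmatchcase


# Hypothetical sketch: apply a custom file list as accept or reject filter.
# patterns use * as the wildcard; mode is "accept" or "reject".
def is_wanted(filename, patterns, mode):
    matched = any(fnmatchcase(filename, pattern) for pattern in patterns)
    if mode == "accept":
        return matched        # keep only files that match a pattern
    return not matched        # reject: keep only files that match nothing


print(is_wanted("photo_thumb.jpg", ["*thumb*"], "reject"))
print(is_wanted("photo.jpg", ["*thumb*"], "reject"))
print(is_wanted("song.mp3", ["*.jpg", "*.gif"], "accept"))
```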


Clear file list

Exactly! This clears the custom file list.      -map-


The buttons

The following buttons broaden the way you interact with wGetGUI.     -map-


Start wGetStart.bat

Let's get it on! Once started, you can either close wGetGUI, leave it running or continue to add URLs. As long as wGet is not finished, the new URLs are added to the end of wGetStart.bat. As these lines are executed one after the other, this works.      -map-
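
The batch file is nothing more than a list of command lines, one per line, and adding a URL means appending one more line at the end. A minimal sketch of that idea follows; the function name append_command is hypothetical, the two command lines are examples, and the file is written to a temporary folder so that a real wGetStart.bat is not touched.

```python
import os
import tempfile


# Hypothetical sketch: add one more command line to the end of a batch file.
def append_command(bat_path, command_line):
    with open(bat_path, "a") as bat:     # "a" = append, existing lines stay
        bat.write(command_line + "\n")


# Demo in a temporary folder instead of the real wGetGUI directory.
bat_path = os.path.join(tempfile.mkdtemp(), "wGetStart.bat")
append_command(bat_path, "wget -nc http://host.com/dir1/")
append_command(bat_path, "wget -nc http://host.com/dir2/")

with open(bat_path) as bat:
    lines = bat.read().splitlines()
print(lines)
```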


Add this to wGetStart.bat

Adds the URL currently in the URL field to the wGetStart.bat file. Read more about it at URL.      -map-


Empty wGetStart.bat

Once wGet has worked through all lines of your wGetStart.bat, you should make room for the next run, normally. In rare cases you might want to keep the wgetStart.bat for mirroring. Then either leave the old command lines intact or (recommended) use your Windows Explorer to rename wGetStart.bat to something different.     -map-


I am a pro

uuuuh! This will open the Pro Mode window, where you can modify the generated command lines, empty wGetStart.bat, and append the modified lines to wGetStart.bat. Drag&Drop should be possible on those systems supporting it. Even if you are not a Pro, this may be interesting for you. But you have been warned! :>      -map-


Save settings

Well, all these settings I just explained can be saved. The default name is wGetGUI.ini, but you can choose whatever you want. But wGetGUI.ini is the one that is loaded automatically if wGetGUI starts up.      -map-


Load settings

Load an ini file different from the automatically loaded wGetGUI.ini.      -map-


Exit

Click here or the "X" in the upper right corner to end wGetGUI.      -map-


About

More or less important information about wGetGUI.      -map-


Something important missing? Just send it in!

 

 


Copyright 2001 by Jens Roesner
Page built with Phase5.