This manual assumes that you have already downloaded and installed a version of wGetGUI, preferably the newest. If not, download and install it now. Instructions for this, and for where to download wGet itself, are given on the main page.

Before we start on how to use wGetGUI, let me briefly explain how it works. wGet is a command-line program that runs in a DOS box on Windows systems (if you compile it for Windows or take the binaries mentioned on the main page). You call it with a command line like wget -nc -r -l0 -Q1000 http://www.JensRoesner.de. Easy to use? Well, once you know all the switches, yes. But you either have to go through Start->Run, create a .bat file, or use some other solution that is not really fast.

The URLs presented in the following are my own (apart from the ones like http://host.com/dir1/ :) ). They really exist and should work.

Right after you first start wGetGUI, you should see a window very similar to this one. The picture is actually an image map: just click on the option you are interested in. When you are using wGetGUI, you can do something similar by hovering the mouse pointer over an option. After a few seconds, a brief tool tip will be displayed.
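To make the sample command line from the introduction easier to read, here it is again with each switch commented, using the standard wget meaning of each switch. This is only a sketch for a POSIX-style shell: it prints the command instead of running it, so nothing is downloaded until you execute the printed line yourself.

```shell
# -nc     no clobber: skip files that already exist locally
# -r      recursive retrieval: follow the links on the page
# -l0     recursion depth 0, i.e. unlimited
# -Q1000  download quota; plain wget reads a bare number as bytes,
#         a k or m suffix means kilobytes or megabytes
cmd="wget -nc -r -l0 -Q1000 http://www.JensRoesner.de"

# Print the command; paste the output into a DOS box or shell to run it.
echo "$cmd"
```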
URL | Just drag'n'drop, cut'n'paste, or type the URL you want to grab. Both Internet Explorer and Netscape have an icon beside their location bar. Click on it and drag it into the URL field of wGetGUI. If wGetGUI is idling in the task bar, just drag the link onto the taskbar icon of wGetGUI, wait a moment until the window pops up, and proceed as usual. Once your desired URL is shown in the URL field, you can double-click the field to add the address to the list of URLs that will be grabbed by wGet. Or you may choose to modify some settings before that. If the URL does not start with http:// or ftp://, wGetGUI will prepend http:// for you. |
always on top | Pressing the "-" makes wGetGUI small and floating on top of your other windows, for example your browser. |
Span | By default, wGet will stay at the host you initiated the grabbing from. If you started from any directory of www.AudiStory.com, it will normally stay at this host. Only if www.AudiStory.de has the same IP address will wGet also download pages from www.AudiStory.de that are linked from www.AudiStory.com (see No DNS look-up). |
All | If you do want links to other servers to be followed instead of ignored, you can "Span all hosts". wGet will then follow all links it encounters, spanning all hosts. And I mean all. Only use this setting if you know what you are doing, for example on a "closed" directory, or you might actually end up trying to download the whole WWW! |
List | It is much more useful to span only a selection of hosts. You will find that many people use an easy-to-remember URL from which they link to webspace at a free provider on another server. So, if you want to get the real data, you need to span the server(s) where most of the files are. Type that host name into this field, or only a part of it, in which case all hosts containing that part will be spanned. But wait a moment, many free webspace providers host thousands of homepages! So, type in host.com/username/. Ok! |
Clear list | Guess what? That clears the list of hosts to span. |
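On the raw wget command line, host spanning is controlled by -H (span hosts) and -D (a comma-separated list of domains). The sketch below shows roughly what spanning a selected host looks like. It is an assumption about the equivalent wget call, not a record of what wGetGUI writes: how wGetGUI translates a List entry with a path, such as host.com/username/, is not shown here, and host.com is only a placeholder. The command is printed, not run.

```shell
# -H  span hosts: allow recursion to leave the starting host
# -D  comma-separated list of domains that may be followed
# host.com is a placeholder for the server that holds the real files.
start="http://www.AudiStory.com/"
cmd="wget -r -l2 -H -Dwww.AudiStory.com,host.com $start"

# Print the command instead of executing it.
echo "$cmd"
```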
Retrieval options | This section groups together various options that relate to how the files are downloaded. |
No clobber | If this option is set, wGet will download only files that are not already in the target directory. If the file is already there, even if it is old or broken, wGet will not download it. I recommend this setting unless you plan to visit the server multiple times. |
Timestamping | With this option, wGet will add a time stamp to the downloaded file. When you download from the server again, wGet will download a file only if your local copy differs in size and/or date from the one on the server. I recommend this setting if you have a server you constantly download stuff from. If you set neither "No clobber" nor "Timestamping", wGet should make backup copies of the files already in your directory. Please check the wGet manual for complete information on this topic. |
Continue file download | Ever had a download crash thanks to Windows reliability? Perhaps you can find the incomplete file where you planned to save it. Browsers currently do not support resumed downloads (don't ask me why). But wGet does! Copy the incomplete file to your wGetGUI/wGet directory, add the URL of the wanted file to the URL field of wGetGUI, check "Continue file download", uncheck the other options, and start the thing. Unchecking the other options is highly recommended; otherwise please read the wGet instructions first, for example if you have "Force directories" checked. I am surprised every time that this works, a really cool feature. |
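For comparison, the three options above correspond to three standard wget switches. The sketch below prints one command line for each; file.zip is a hypothetical file name used only for illustration, and nothing is downloaded until you run a printed line yourself.

```shell
# file.zip is a hypothetical file name used only for illustration.
url="http://host.com/dir1/file.zip"

no_clobber="wget -nc $url"   # skip the download if file.zip already exists locally
timestamping="wget -N $url"  # download only if the server copy is newer or differs in size
resume="wget -c $url"        # continue a partial file.zip found in the current directory

# Print the three variants, one per line.
printf '%s\n' "$no_clobber" "$timestamping" "$resume"
```

For the resume case, run the command from the directory that holds the incomplete file, since wget looks for it there.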
Quota | This is simply the number of kilobytes wGet will download. Try it with sites and single files. |
Spider | A spider checks whether the linked files work. This can be useful for site owners. Works with a host list and other options. |
No directories | If this option is checked, all files wGet downloads will be put in the directory where wget.exe is. This is useful if you are not downloading many files. |
Force directories | By default, wGet will put a single file into the base directory of wget.exe and multiple files from a website into folders. If this option is checked, directories are always created. I like this option because it keeps all download results organized the same way. But you might find the many directories annoying. |
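The two directory options map onto the wget switches -nd and -x. A minimal sketch, assuming the placeholder host from this manual and a hypothetical index.html; the commands are printed rather than executed.

```shell
url="http://host.com/dir1/"

# No directories: every downloaded file lands in the current directory.
flat="wget -r -l1 -nd $url"

# Force directories: even a single file is saved under host.com/dir1/.
# index.html is a hypothetical file name.
tree="wget -x ${url}index.html"

printf '%s\n' "$flat" "$tree"
```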
Clear server cache | Necessary for the poor people who have to deal with notoriously outdated server caches. This option sends a request to the server that makes it load the newest copy of the file into the cache. This is similar to the reload/refresh we know from browsers. |
Recursive retrieval | This is where the action is. If this is checked, wGet will download not only the file you invoked it with, but will also follow the links on that page. Using the different filters for host names and/or file names, you can steer this in the desired direction. |
Depth | This number gives the depth (or distance) of recursion. 2 will do two hops from the starting page, and so on. If you type 0, this means infinite recursion depth: wGet will download all referred files until it hits a stopping condition like Quota or until all the allowed hosts are traversed. |
Download "as-is" | Perhaps you know Internet Explorer and one of its cooler features: saving a webpage complete with pictures and sounds. This is what downloading "as-is" is about. It is also called "page-requisites". With the latest wGet 1.7, this option ignores "Only go deeper", which makes things a lot easier. So, always use the latest wGet version :) |
Mirror site | This is basically just a shortcut for Timestamping and infinite recursion depth. |
Only go deeper | This is very useful, I think. wGet will only go deeper into the directory tree. If you start it at http://host.com/dir1/dir1b/ it will also download from http://host.com/dir1/dir1b/dir1b1/ but not from http://host.com/dir2/ |
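The recursion options above can be tried directly on the wget command line. The sketch below prints three typical combinations using the standard switches -r, -l, -np, -p and -m. The starting URL is the placeholder from the "Only go deeper" example, and the mapping from wGetGUI check boxes to these switches is my assumption.

```shell
url="http://host.com/dir1/dir1b/"

deeper="wget -r -l2 -np $url"  # recurse two hops, never ascend above dir1b/
page="wget -p $url"            # one page plus the images and other files it needs
mirror="wget -m $url"          # mirror: recursion with timestamping, unlimited depth

# Print the three variants, one per line.
printf '%s\n' "$deeper" "$page" "$mirror"
```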
Special | This collects options that deal especially with the way wGet interacts with servers. They do not affect (well, ok, indirectly) the way wGet looks for files to download. |
Retries | The number of retries wGet will make before it gives up on a file. |
No DNS look-up | This can be veeeery important. Many IP addresses (you know, those numbers like 123.456.123.10) today have more than one host name (www.somenames.com) associated with them. If this option is not checked, wGet will treat all host names that resolve to the same IP address as the same host. This is of course not wrong, but it may be that you really only want files from www.name.com and not from www.somenames2.com (these are examples; these two names do not actually fit the description). |
Additional Parameters | You know a wGet command line parameter that is not offered in wGetGUI? Then you can enter it here. Do not be fooled by the short length of the field, it will hold your parameters ;) |
Identify as browser | Sometimes a server will not like wGet. To get around this, you can have wGet behave like a browser. This includes telling the server that wGet is actually Mozilla 4.03 and that it was referred to the server by a page on this server and not by an external link. |
Convert links | Extremely useful. You know, many webpages have absolute links. If you do not convert them to local ones, this is what happens: every time you want to read the page offline, all the absolutely linked pictures and files will be downloaded from the net! When converted, they will be loaded from your disk, which is most often what you want. |
Ignore robots.txt | wGet is normally considered a robot. I would describe a robot as a stupid thing that crawls the web (or a server), mapping and searching. Many search engines use robots to get the data for their searches. However, server admins sometimes use robots.txt to restrict access to directories to browsers only. This applies even when wGet identifies as a browser. To get at those directories, wGet has to ignore robots.txt, and that is what this option does. |
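Several of the options in this group have well-known wget equivalents. The following sketch combines them into one printed command line. The exact User-Agent string and referer that wGetGUI sends are not documented here, so the values below are assumptions for illustration only, and some switches may not be available in every wget version.

```shell
url="http://host.com/dir1/"

# -t 5           give up on a file after five tries
# -U ...         send a browser-like User-Agent header (value is an assumption)
# --referer=...  claim the request came from a page on the same server
# -k             convert links in downloaded pages for offline reading
# -e robots=off  do not honour robots.txt
cmd="wget -r -l1 -t 5 -U Mozilla/4.03 --referer=$url -k -e robots=off $url"

# Print the command instead of executing it.
echo "$cmd"
```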
Running options | These options control how wGet interacts with you and presents itself on the screen or in a logfile. |
Go to background | Sends wGet to the background once started. |
No info | wGet will print no info on the screen about what it is doing at the moment. Don't use this if you always ask yourself "Is it still working?" |
All info | This lets wGet print a lot of info, including a nice representation of how much of a file has been downloaded. This is the way I have it on my machine. |
Some info | Right! This is a compromise between No Info and All Info. |
Append to logfile | Instead of printing on the screen, append the output to a file. If you are using All Info this can get veeery large! |
Overwrite logfile | Writes the new output over the old, starting from the beginning of the log file. |
Logfile | Here you can enter a custom log file name. This has to include the ending, .txt or .log (recommended). |
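For readers who also use wget by hand, the running options above correspond roughly to the switches below. The pairing of wGetGUI labels with switches is my assumption, and wget.log is a hypothetical log file name. The commands are printed, not run.

```shell
url="http://host.com/dir1/"

quiet="wget -q $url"                   # no info
verbose="wget -v $url"                 # all info (this is wget's default)
brief="wget -nv $url"                  # some info
background="wget -b -a wget.log $url"  # go to background, append output to wget.log
overwrite="wget -o wget.log $url"      # write output to wget.log, replacing old contents

# Print the five variants, one per line.
printf '%s\n' "$quiet" "$verbose" "$brief" "$background" "$overwrite"
```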
Accept/Reject | This lets you include or exclude certain files. |
Accept | If Accept is checked, the filters are applied to accepting files. |
Reject | If Reject is checked, the filters are applied to rejecting files. |
Filetypes | Here I included a few of the more common file types you might want to include/exclude. |
Custom file list | If you do not like my selection of file types, need more, or also want to exclude *thumb*s, then write it in here, and do not forget the wildcard: * |
Clear file list | Exactly! This clears the custom file list. |
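On the wget command line, accept and reject filters are given with -A and -R, each followed by a comma-separated list of patterns. A small sketch with illustrative patterns (the file types chosen here are examples, not wGetGUI's built-in selection); the commands are only printed.

```shell
url="http://host.com/dir1/"

# -A keeps only files matching the patterns, -R skips files matching them.
# The double quotes keep a Unix shell from expanding the wildcards itself.
accept="wget -r -l1 -A \"*.jpg,*.gif\" $url"
reject="wget -r -l1 -R \"*thumb*\" $url"

printf '%s\n' "$accept" "$reject"
```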
The buttons | The following buttons broaden the way you interact with wGetGUI. |
Start wGetStart.bat | Let's get it on! Once started, you can either close wGetGUI, leave it running or continue to add URLs. As long as wGet is not finished, the new URLs are added to the end of wGetStart.bat. As these lines are executed one after the other, this works. |
Add this to wGetStart.bat | Adds the URL currently in the URL field to the wGetStart.bat file. Read more about it at URL. |
Empty wGetStart.bat | Once wGet has worked through all lines of your wGetStart.bat, you should normally make room for the next run. In rare cases you might want to keep the wGetStart.bat for mirroring. Then either leave the old command lines intact or (recommended) use your Windows Explorer to rename wGetStart.bat to something different. |
I am a pro | Uuuuh! This will open the Pro Mode window, where you can modify the generated command lines, empty wGetStart.bat, and append the modified lines to wGetStart.bat. Drag&Drop should be possible on those systems supporting it. Even if you are not a pro, this may be interesting for you. But you have been warned! :> |
Save settings | Well, all these settings I just explained can be saved. The default name is wGetGUI.ini, but you can choose whatever you want. However, wGetGUI.ini is the one that is loaded automatically when wGetGUI starts up. |
Load settings | Load an ini file different from the automatically loaded wGetGUI.ini. |
Exit | Click here or the "X" in the upper right corner to end wGetGUI. |
About | More or less important information about wGetGUI. |
Something important missing? Just send it in!