________________________________________________________________ 
SiteSnagger (VERSION 1.0)
Copyright (c) 1998 Ziff-Davis Publishing Company
Written by Steven E. Sipe
First Published in PC Magazine, US Edition, February 24, 1998.
________________________________________________________________ 

About SiteSnagger:
SiteSnagger lets you download the contents of a Web site to your PC's hard disk. You can download as much or as little of the site as you wish, and browse the information anytime or anywhere. SiteSnagger organizes the site information in a tree display of the pages and multimedia files that you've snagged. In addition, a Table of Contents page provides links to each of the pages you've downloaded, so you can quickly jump to any of them. SiteSnagger requires Windows 95 or Windows NT 4.0, and an Internet connection.

Usage:
To install SiteSnagger, copy its three program files (SiteSnag.exe, SiteSnag.cnt and SiteSnag.hlp) to a subdirectory on your hard disk. SiteSnagger's main window holds the tree display of the site you have snagged. A panel below the tree and just above the status bar displays statistical information as you snag a Web site.

To begin working with SiteSnagger, create a new project or open a previously saved one. When you select Project|New to create a new project, SiteSnagger displays a dialog box for entering the project's name. The project name can be up to 20 characters and may contain any combination of alphabetic and numeric characters, as well as the space character. When you press OK, SiteSnagger creates a new project information file using the name that you specified and a .sng extension. This project file stores your project options as well as specific information about the pages snagged from a Web site.

Project files are stored in a subdirectory called "Projects" that is automatically created under the directory that holds the SiteSnagger executable. SiteSnagger also creates a new subdirectory under the Projects subdirectory with the same name as your new project. This new subdirectory will hold all of the HTML pages and multimedia files that you snag from a Web site.
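
The naming rule and directory layout described above can be sketched in a few lines of Python. This is only an illustration of the layout, not SiteSnagger's own code: the function names and the install directory used in the example are hypothetical, and the log file location is taken from the description of sitesnag.log later in this document.

```python
from pathlib import PureWindowsPath

MAX_NAME_LENGTH = 20  # limit stated in the text above


def is_valid_project_name(name: str) -> bool:
    """Return True if name is 1-20 characters of letters, digits, or spaces."""
    if not 0 < len(name) <= MAX_NAME_LENGTH:
        return False
    return all(ch.isalnum() or ch == " " for ch in name)


def project_paths(install_dir: str, name: str) -> dict:
    """Return where a project's files would live under install_dir.

    install_dir is the (hypothetical) directory holding SiteSnag.exe.
    """
    projects = PureWindowsPath(install_dir) / "Projects"
    return {
        "project_file": projects / f"{name}.sng",       # options and page info
        "download_dir": projects / name,                # snagged pages and media
        "log_file": projects / name / "sitesnag.log",   # diagnostic log
    }
```

For example, project_paths(r"C:\SiteSnag", "PCMag") places the project file at C:\SiteSnag\Projects\PCMag.sng and the downloaded files under C:\SiteSnag\Projects\PCMag.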

SiteSnagger's Options:
Once you've created or opened a project, you can customize the way in which SiteSnagger snags the Web site. To configure the project's options, choose Options from the Project menu. The Max Levels setting lets you specify how many levels of linked pages SiteSnagger should download. If you set Max Levels to 1, SiteSnagger will download only the page you specify and none of its links. If you specify 2, you'll get the page you specify, plus all pages that have links on that page. You can tell SiteSnagger to download up to 20 levels, but be careful not to set Max Levels too high, especially if you're planning to snag a large Web site. If you set Max Levels to 3 and snagged PC Magazine's Web site, for example, you would download over a thousand files.

The Max Pages option lets you control the maximum number of HTML pages downloaded. When SiteSnagger's downloaded page count reaches this maximum, it stops the snagging operation immediately. Note that the Max Pages setting will override the Max Levels setting. Once the maximum number of pages has been downloaded, SiteSnagger will stop downloading, even if it hasn't yet drilled down to the Max Levels setting. If you don't want to limit the number of pages, you can check the No Limit checkbox and SiteSnagger will use the Max Levels setting to limit the download.
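
The way Max Levels and Max Pages interact can be illustrated with a small Python sketch. It assumes a breadth-first walk over an in-memory table of links; the document does not say in what order SiteSnagger actually visits pages, and the function name and the sample site are hypothetical.

```python
from collections import deque


def plan_snag(links, start, max_levels, max_pages=None):
    """Return pages in download order, honoring both limits.

    links      -- hypothetical dict mapping a page to the pages it links to
    max_levels -- 1 means the start page only, 2 adds the pages it links to
    max_pages  -- None plays the role of the No Limit checkbox
    """
    downloaded = []
    seen = {start}
    queue = deque([(start, 1)])  # (page, level); the start page is level 1
    while queue:
        if max_pages is not None and len(downloaded) >= max_pages:
            break  # Max Pages overrides Max Levels
        page, level = queue.popleft()
        downloaded.append(page)
        if level < max_levels:  # only follow links above the level limit
            for target in links.get(page, []):
                if target not in seen:
                    seen.add(target)
                    queue.append((target, level + 1))
    return downloaded


# A made-up four-page site used only for illustration.
site = {"home": ["news", "reviews"], "news": ["story"], "reviews": [], "story": []}
print(plan_snag(site, "home", max_levels=2))               # start page plus its links
print(plan_snag(site, "home", max_levels=3, max_pages=2))  # stops early at two pages
```

With Max Levels at 3 and Max Pages at 2, the walk stops after two pages even though deeper levels remain, which matches the override rule described above.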

Many Web sites contain links that point to Web sites in different domains: for example, links to advertiser pages or the latest version of your favorite browser. In most cases, you won't want to follow links to another server, but SiteSnagger lets you control this behavior in case you do. If you check the Follow Offsite Links option, SiteSnagger will download the initial page from offsite servers. Note that SiteSnagger only gets the first page from an offsite link, regardless of the Max Levels setting. Of course, you can always create another SiteSnagger project to snag the other Web site.

The Get Multimedia Files option tells SiteSnagger whether to download any associated multimedia files it encounters on a particular Web page. When this option is checked, SiteSnagger downloads the various multimedia files that a particular page requires. Downloading multimedia files improves the appearance of the page in your browser, and may even be necessary for sites that make extensive use of image maps, but it can dramatically increase the amount of time required to snag a Web site. If your only interest in a Web site is its textual content, you can save a lot of download time by leaving this option unchecked.

To view a Web site from your local hard disk, you must tell SiteSnagger to fix up the HTML so that all of each page's references to images and other pages refer to local copies of these files rather than copies on the server. If the "Fix up for Browsing" option isn't checked, you won't be able to browse a site from your hard disk and you'll have to snag the site again with the option checked. There are some cases where you'll want to keep the pages in their original condition and leave this option unchecked. For example, you may want to archive the pages or transfer them to another server.
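
As a rough picture of what fixing up a page involves, the Python sketch below rewrites href and src attributes so they point at local copies. It is a simplified assumption about the technique, not SiteSnagger's implementation: it handles only double-quoted attribute values, and the function name, the lookup table, and the file names are hypothetical.

```python
import re

# Matches href="..." or src="..." with double-quoted values only (a simplification).
ATTR = re.compile(r'(href|src)="([^"]*)"', re.IGNORECASE)


def fix_up(html, local_copies):
    """Point references at local files when a local copy exists.

    local_copies -- hypothetical dict mapping an original URL to a local filename
    """
    def swap(match):
        attr, url = match.group(1), match.group(2)
        # Leave the reference untouched if the target was never downloaded.
        return f'{attr}="{local_copies.get(url, url)}"'

    return ATTR.sub(swap, html)


page = '<a href="http://www.pcmag.com/news.htm">News</a>'
print(fix_up(page, {"http://www.pcmag.com/news.htm": "news.htm"}))
```

Because the original text of the page is changed by this kind of rewrite, leaving the option unchecked is the better choice when you want an exact archive copy.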

Snagging A Web Site:
Once you've configured your project's options, you're ready to begin snagging a Web site. You can start snagging by choosing the "Snag a site" option from the Project menu, or by pressing the Snag button on the toolbar. SiteSnagger will respond by prompting you for the base address of the site you want to download. For instance, to snag PC Magazine's Web site, you would enter http://www.pcmag.com.

The "Snag a site" screen displays your main project options in the lower part of the screen as a helpful reminder. If you need to change your options, click Cancel and bring up the Options dialog. Otherwise, once you've entered the base address, you can press the Start button to start snagging. The statistics area at the bottom of the main window displays how many text pages SiteSnagger is currently planning to download (Queued), the number of HTML pages already downloaded (Pages), the total number of files downloaded (Tot Files) and the total size of downloaded files in kilobytes (Tot Size).

Most of SiteSnagger's menu options aren't available while snagging is in progress. For instance, you can't change project options or open a new project. One option is only enabled during snagging: the Stop button on the toolbar. Pressing Stop tells SiteSnagger to stop the download process after the current page is finished. It can take several seconds before the current page is finished, so SiteSnagger may not appear to respond immediately. When the current page is finished, SiteSnagger will ask you whether you still want to fix up the pages for browsing (if you have "Fix up for Browsing" checked).

Note that snagged files are always saved along with the project file, even if you interrupt the download. If you no longer want a project and its associated files on your hard disk, select Delete from the Project menu, or click the Delete button on the toolbar.

As SiteSnagger downloads a site, it fills a tree in the main window with the names of each of the downloaded pages along with each of the downloaded multimedia files. It also creates a table of contents. You can view any of the files in the tree by double-clicking on its name. This causes Windows to load the program associated with the specified file type. For example, clicking on a snagged .htm page causes Windows to load your browser and display that page.

SiteSnagger's Table of Contents is an HTML page with links to every page downloaded. The list is identical to the list you see under the Pages node of the tree, but since it is an HTML file, you can use the Back button to return to it after viewing each downloaded page. This lets you browse through the pages more quickly.

If you decide you don't like the name you chose for a project, you can rename it by choosing Rename from the Project menu. This renames the current project's .sng file as well as the subdirectory where any downloaded files are stored.

When you've finished viewing snagged files, you can exit SiteSnagger by choosing Exit from the Project menu. The active project is automatically saved when you exit. If you want to open this saved project the next time you run SiteSnagger, select Open from the Project menu.

For diagnostic purposes, SiteSnagger generates a log of every file it downloads and the level at which it encountered the item. The log file, called sitesnag.log, is stored in the subdirectory of each new project. The log is a standard text file that you can view using a text editor such as Notepad.

Note that SiteSnagger is a resource-intensive utility. Some basic page information must be saved in memory, so if you're downloading a site with several thousand files, SiteSnagger will require at least 2MB of free memory. Hard disk requirements can also be heavy since downloaded pages and multimedia files can consume a large amount of disk space. SiteSnagger's status window shows the total size of all files currently downloaded so that you can see the amount of disk space the download files have consumed so far.

Note also that there are certain types of Web sites that SiteSnagger can't download. Some servers require you to enter a user ID and password before you can access their contents. SiteSnagger doesn't provide a way to enter this information, so you can't access these sites. And while you can download CGI and Java scripts, they will not run properly on your local machine if they require resources on the server. Also, you can't follow links that rely on Java or CGI program logic.

Although SiteSnagger does not in itself include scheduling capability, it provides command line equivalents for its options so you can use it with scheduling utilities such as System Agent, which comes with the Windows 95 Plus! pack. The command line syntax for SiteSnagger is as follows:

sitesnag projectname URL [/lx] [/px] [/o+|/o-] [/m+|/m-] [/b+|/b-] [/c+|/c-]

where:

              /lx (slash el) is the maximum number of levels, for example /l2.
              /px is the maximum number of pages, zero (/p0) for no limit.
              /o+ (slash oh plus) means follow offsite links;  /o- means don't follow them.
              /m+ means get multimedia files; /m- means don't get them.
              /b+ means fix up files for browsing; /b- means don't fix them up.
              /c+ means create a table of contents; /c- means don't create one.

If you leave out an option, the default setting for that option will be used. The default option settings are:

sitesnag projectname URL /l2 /p0 /o- /m+ /b+ /c+
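
If you drive SiteSnagger from a script rather than typing the switches by hand, the command line can be assembled programmatically. The Python sketch below builds the argument list from the syntax and defaults given above; the function and parameter names are hypothetical, and it only constructs the arguments without running anything.

```python
def build_command(project, url, levels=2, pages=0, offsite=False,
                  multimedia=True, browse=True, toc=True):
    """Return the sitesnag argument list; defaults mirror the documented ones."""
    def switch(letter, on):
        # /x+ turns an option on, /x- turns it off
        return f"/{letter}{'+' if on else '-'}"

    return [
        "sitesnag", project, url,
        f"/l{levels}",            # maximum number of levels
        f"/p{pages}",             # maximum number of pages, 0 for no limit
        switch("o", offsite),     # follow offsite links
        switch("m", multimedia),  # get multimedia files
        switch("b", browse),      # fix up files for browsing
        switch("c", toc),         # create a table of contents
    ]


print(" ".join(build_command("projectname", "URL")))
```

Returning a list rather than a single string leaves quoting to whatever launches the program. Whether SiteSnagger accepts a quoted project name containing spaces on the command line is not stated in this document.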

Support for SiteSnagger:
Support for the free utilities offered by PC Magazine can be obtained electronically in the discussion area of PC Magazine's Web site. Go to the URL http://www.pcmag.com/discuss.htm/ and select the Utilities area. You can also access the Utilities discussion area from the utility's download page. The authors of current utilities generally monitor the discussion area every day. You may find an answer to your question simply by reading the messages previously posted. If the author is not available and you have a question that the sysops can't answer, the editor of the Utilities column, who also checks the area each day, will contact the author for you.

Steven E. Sipe, the author of SiteSnagger, is a developer based in Wilmington, North Carolina.
Sheryl Canter is the editor of the Utilities column and a contributing editor of PC Magazine.


