Example 2 - Wired Magazine.
Posted: 27 August 2001 by Spidersoft Team, Melbourne, Australia
Aim:
Capture the August 2001 issue of Wired magazine.
Content organization:
Each monthly issue is organized within its own sub-directory.
e.g.
http://www.wired.com/wired/archive/9.07/
http://www.wired.com/wired/archive/9.08/
Method:
Since each issue is contained within its own sub-directory, we set the page location filter to "Within current directory". This keeps WebZIP from following page links into other issues.
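To illustrate what "Within current directory" means for this task, here is a minimal Python sketch. It is not WebZIP's own code: the function name `within_current_directory` and the page name `example.html` are hypothetical, and the sketch assumes the filter simply compares the host and the directory part of the start URL.

```python
from urllib.parse import urlsplit


def within_current_directory(start_url: str, link_url: str) -> bool:
    """Return True if link_url lies in the start URL's directory or below it."""
    start = urlsplit(start_url)
    link = urlsplit(link_url)
    # Directory part of the start URL: everything up to and including the last "/".
    directory = start.path[: start.path.rfind("/") + 1]
    return link.netloc == start.netloc and link.path.startswith(directory)


start = "http://www.wired.com/wired/archive/9.08/"
# Hypothetical article page inside the August issue: would be followed.
print(within_current_directory(start, "http://www.wired.com/wired/archive/9.08/example.html"))
# The July issue lives in a sibling directory: would not be followed.
print(within_current_directory(start, "http://www.wired.com/wired/archive/9.07/"))
```

Under these assumptions, a link into the 9.07 directory is rejected because its path does not begin with `/wired/archive/9.08/`.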
Each magazine article also has a printer-friendly version. We can prevent WebZIP from downloading these redundant pages by using a URL filter.
If you hover over each "print" link, you'll find that every such URL contains the common term "_pr".
So we add this term to our URL Exclude Filters:
[PL]_pr
This simply means 'exclude all page links containing "_pr" in their URL'.
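The effect of this exclude filter can be sketched in a few lines of Python. This is an illustration only, not WebZIP's implementation: the helper `is_excluded` and the two article file names are hypothetical, the `[PL]` prefix is left out because every URL here is treated as a page link, and the match is assumed to be a plain case-sensitive substring test.

```python
def is_excluded(url: str, exclude_terms: list[str]) -> bool:
    """Return True if the URL contains any of the exclude terms."""
    return any(term in url for term in exclude_terms)


# Hypothetical page links found in the August issue.
links = [
    "http://www.wired.com/wired/archive/9.08/example.html",
    "http://www.wired.com/wired/archive/9.08/example_pr.html",
]

# Keep only the links that do not match the "_pr" exclude term.
kept = [url for url in links if not is_excluded(url, ["_pr"])]
print(kept)
```

Because a substring test matches anywhere in the URL, a term as short as "_pr" would also exclude any unrelated page whose address happens to contain it, so it is worth checking a few links before running the task.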
Task Summary:
Task Name: | wired_mag_Aug2001 |
Task Folder: | E-Zines |
Start URL(s): | http://www.wired.com/wired/archive/9.08/ |
Save to folder: | D:\My Intranet\wired_mag_Aug2001\ |
Profile: | |
Filetypes: | All |
Followed links - Levels: | All levels |
Followed page links - Location: | Within current directory |
Followed media links - Location: | Within current site |
Include Filters: | |
Exclude Filters: | [PL]_pr |
Link Conversion - Followed Links: | Convert ALL followed links to relative links |
Link Conversion - Unfollowed Links: | Convert unfollowed links to absolute links |
Schedule: | Don't schedule this task |
Task File:
wired_mag_Aug2001.wzt