Example 2 - Wired Magazine.
Posted: 27 August 2001 by Spidersoft Team, Melbourne, Australia
This example captures the August 2001 issue of Wired magazine.
Each monthly issue is stored in its own sub-directory on the site, so we set the page location filter to "Within current directory". WebZIP then follows only page links that stay inside the directory of the start URL, and pages from other issues are left alone.
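The effect of the "Within current directory" setting can be pictured with a short Python sketch. This is only an illustration of the idea, not WebZIP's actual matching code: the function name `within_current_directory` and the `example.html` page names are hypothetical, and the sketch assumes "current directory" means the path of the start URL up to its last slash on the same host.

```python
from urllib.parse import urlsplit


def within_current_directory(start_url, link):
    """Return True if link is on the same host and inside the start URL's directory."""
    start, target = urlsplit(start_url), urlsplit(link)
    # Directory of the start URL: everything up to and including the last "/".
    base_dir = start.path[: start.path.rfind("/") + 1]
    same_host = target.netloc.lower() == start.netloc.lower()
    return same_host and target.path.startswith(base_dir)


START = "http://www.wired.com/wired/archive/9.08/"

# Hypothetical page names, used only to show the comparison.
print(within_current_directory(START, "http://www.wired.com/wired/archive/9.08/example.html"))
print(within_current_directory(START, "http://www.wired.com/wired/archive/9.07/example.html"))
```

Under these assumptions the first link (inside `9.08/`) is accepted and the second (an earlier issue in `9.07/`) is rejected.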
Each magazine article also has a printer-friendly version. We can prevent WebZIP from downloading these redundant pages by using a URL filter.
If you hover over the "print" link on each article, you'll find that every such URL contains "_pr" as a common term.
So we add "_pr" to our URL Exclude Filters.
This simply means 'exclude all page links containing "_pr" in their URL'.
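As a rough illustration of that rule, the following Python sketch drops any link whose URL contains an excluded term. It is not WebZIP's implementation: the helper `is_excluded` and the sample page names are hypothetical, and the sketch assumes a plain, case-sensitive substring match.

```python
def is_excluded(url, exclude_terms):
    """Return True if the URL contains any of the excluded terms (assumed substring match)."""
    return any(term in url for term in exclude_terms)


EXCLUDE_TERMS = ["_pr"]

# Hypothetical links: an article page and its printer-format twin.
links = [
    "http://www.wired.com/wired/archive/9.08/example.html",
    "http://www.wired.com/wired/archive/9.08/example_pr.html",
]

# Keep only the links that no exclude term matches.
to_download = [url for url in links if not is_excluded(url, EXCLUDE_TERMS)]
print(to_download)
```

Because the match is a plain substring test in this sketch, any URL that happens to contain "_pr" anywhere would be skipped, so a short term like this is worth checking against a few real links before running a large capture.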
| Start URL(s): | http://www.wired.com/wired/archive/9.08/ |
| Save to folder: | D:\My Intranet\wired_mag_Aug2001\ |
| Followed links - Levels: | All levels |
| Followed page links - Location: | Within current directory |
| Followed media links - Location: | Within current site |
| Include Filters: | |
| Exclude Filters: | [PL]_pr |
| Link Conversion - Followed Links: | Convert ALL followed links to relative links |
| Link Conversion - Unfollowed Links: | Convert unfollowed links to absolute links |
| Schedule: | Don't schedule this task |