Site details
| Name | Magpie |
| Short name | MAG |
| URL | http://magpie-data.magpie.net |
| Update frequency (how often indexing and category extraction should be performed) | 00:05:00 |
| Process As RSS? | false |
| Index Raw Content? | false |
| Content Expiry | Never expires |
eg. www.example.com/document/textonly
| No URL filters - all parts of this site will be indexed. |
| No Domain Filters. Only pages from the same domain as the starting URL will be included. |
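The default domain rule above can be illustrated with a small check. This is only a sketch: the page does not say how Searchbox compares domains, so the function name `same_domain` and the choice of an exact hostname match (subdomains are not treated as the same domain) are assumptions.

```python
from urllib.parse import urlsplit


def same_domain(start_url, candidate_url):
    """Return True if candidate_url is on the same host as start_url.

    Assumption: "same domain" means an exact hostname match.
    urlsplit() already lower-cases the hostname, so the comparison
    is case-insensitive.
    """
    start_host = urlsplit(start_url).hostname
    candidate_host = urlsplit(candidate_url).hostname
    # A URL without a scheme has no parsed hostname, so it never matches.
    return start_host is not None and start_host == candidate_host


start = "http://magpie-data.magpie.net"
print(same_domain(start, "http://magpie-data.magpie.net/news"))
print(same_domain(start, "http://www.example.com/news"))
```

Under this assumption the first call matches and the second does not. A domain filter would be the place to widen the rule if other hosts need to be included.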
Choose which methods to use to extract content from a page
| No content extractors - the whole page will be used as content. |
These are values passed to the server through a page's address.
Searchbox will ignore pages distinguished by URL parameters unless they are specified here.
eg. www.example.com?page=12
| There are no URL parameters |
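One way to picture the URL parameter rule is as a normalisation step applied to each address before it is indexed. The sketch below is an assumption about the behaviour, not Searchbox's actual implementation: `canonical_url` and `significant_params` are hypothetical names, and the idea is that any parameter not listed is dropped, so pages that differ only in unlisted parameters collapse to one address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, significant_params=()):
    """Drop every query parameter that is not listed as significant.

    Assumption: pages that differ only in unlisted parameters are
    treated as the same page. The fragment is discarded as well.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in significant_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# With no parameters configured, both addresses collapse to the same page.
print(canonical_url("http://www.example.com/news?page=12"))
print(canonical_url("http://www.example.com/news?page=13"))

# Once "page" is listed, the two addresses stay distinct.
print(canonical_url("http://www.example.com/news?page=12&sort=asc", ["page"]))
```

With the configuration shown on this page (no URL parameters), every variant of an address with a query string would be treated as the same page under this interpretation.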
Searchbox will start spidering the site at these points, eg. www.example.com/news, www.example.com/business or even www.example.com/
| No starting points specified. |
Site-category extraction methods
The site's taxonomy can be derived from its URLs or its meta tags.
| Not looking for site categories |
| Categories are not being extracted from this site. To choose a category extraction technique click the edit button above. |
Site category to taxonomy mappings
| No Subscription model is currently defined |
Allows you to extract the title from certain pages
eg. <meta name="keywords">.*</meta>
| No Title filters - default title extraction will be applied. |
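A title filter can be thought of as a regular expression tried against the page source before falling back to the default. The sketch below is illustrative only: the page does not describe how Searchbox applies the pattern, so `extract_title`, the use of the first capture group, and the fallback to the `<title>` element are all assumptions.

```python
import re

# Assumed default: the text of the page's <title> element.
DEFAULT_TITLE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html, title_filters=()):
    """Return the first title matched by a filter, else the default title.

    Assumption: a filter's first capture group holds the title; if the
    pattern has no group, the whole match is used.
    """
    for pattern in title_filters:
        match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
        if match:
            if match.re.groups and match.group(1) is not None:
                return match.group(1).strip()
            return match.group(0).strip()
    match = DEFAULT_TITLE.search(html)
    return match.group(1).strip() if match else None


page = (
    "<html><head><title>Home</title></head>"
    '<body><h1 class="headline"> Magpie news </h1></body></html>'
)
print(extract_title(page))
print(extract_title(page, [r'<h1 class="headline">(.*?)</h1>']))
```

With no filters configured, as on this page, only the default branch would run.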
Specify whether a given page is to be ignored, eg. a 404 page that does not actually return the 404 code
eg. textonly
| No Ignorable Page filters - all parts of this site will be indexed. |
eg. textonly
| No Address filters - all parts of the link URL will be used. |
Change Filters allow you to specify URLs that do not trigger alert hits
| No Change filters - no URLs will be prevented from triggering alerts. |