JCrawler
Name | JCrawler |
---|---|
Cover URL | http://www.nihongo.org/jcrawler/ |
Details URL | |
Status | active |
Description | JCrawler is currently used to build the Vietnam topic-specific WWW index for VietGATE <URL:http://www.vietgate.net/>. It schedules visits randomly but will not visit a site more than once every two minutes. It uses a subject-matter relevance pruning algorithm to determine which pages to crawl and index, and generally will not index pages with no Vietnam-related content. It uses Unicode internally, and detects and converts several different Vietnamese character encodings. |
Purpose | indexing |
Type | standalone |
Platform | unix |
Language | perl5 |
Availability | none |
Owner Name | Benjamin Franz |
Owner URL | http://www.nihongo.org/snowhare/ |
Owner Email | snowhare@netimages.com |
Exclusion | yes |
Exclusion User-Agent | jcrawler |
NOINDEX | yes |
Host | db.netimages.com |
From | yes |
UserAgent | JCrawler/0.2 |
History | |
Environment | service |
ID | jcrawler |
Modified Date | Wed, 08 Oct 1997 00:09:52 GMT |
Modified By | Benjamin Franz |
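
The scheduling rule in the description (random visit order, but no site visited more than once every two minutes) can be illustrated with a minimal sketch. This is not JCrawler's code: the robot is written in perl5 and its source is not available, so the class name `PoliteRandomScheduler`, its methods, and the example hosts below are all hypothetical, and Python is used only for illustration.

```python
import random

# Assumption: "once every two minutes" is read as a 120-second minimum
# gap between two visits to the same host.
MIN_REVISIT_SECONDS = 120


class PoliteRandomScheduler:
    """Hypothetical sketch: pick the next URL at random, but never
    return a URL for a host visited less than min_interval seconds ago."""

    def __init__(self, min_interval=MIN_REVISIT_SECONDS, rng=None):
        self.min_interval = min_interval
        self.rng = rng or random.Random()
        self.last_visit = {}  # host -> time of the most recent visit
        self.pending = {}     # host -> list of URLs still to fetch

    def add(self, host, url):
        """Queue a URL under the host it belongs to."""
        self.pending.setdefault(host, []).append(url)

    def next_url(self, now):
        """Return a randomly chosen URL from an eligible host, or None
        if every host with pending URLs was visited too recently."""
        eligible = [
            host
            for host, urls in self.pending.items()
            if urls
            and now - self.last_visit.get(host, float("-inf")) >= self.min_interval
        ]
        if not eligible:
            return None
        host = self.rng.choice(eligible)
        urls = self.pending[host]
        url = urls.pop(self.rng.randrange(len(urls)))
        self.last_visit[host] = now  # start the two-minute window for this host
        return url


if __name__ == "__main__":
    scheduler = PoliteRandomScheduler()
    scheduler.add("a.example", "http://a.example/1")
    scheduler.add("a.example", "http://a.example/2")
    scheduler.add("b.example", "http://b.example/1")
    # Simulated clock: each host becomes eligible again 120 seconds
    # after its previous visit.
    for now in (0, 30, 60, 120, 150):
        print(now, scheduler.next_url(now))
```

Passing the clock value in as `now` keeps the sketch deterministic to test; a real crawler would more likely read a monotonic clock and sleep until some host becomes eligible again.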