Are there any robot books?

Yes. Here is a selection:

Spidering Hacks

By Kevin Hemenway, Tara Calishain

This discusses web robots, robots.txt, LWP programming, and has many examples of scraping web pages and extracting information. Recommended.

More information at O'Reilly, including Table of Contents with content preview, and the ability to read the whole book online through Safari Books Online.

Published by O'Reilly, 2003. ISBN 0596005776

Perl & LWP

By Sean M. Burke

An O'Reilly book that thoroughly explains how to use LWP, the standard web library for Perl. It has a chapter on spiders. Recommended.

More information at O'Reilly, including Table of Contents with content preview, and the ability to read the whole book online through Safari Books Online.

Disclosure of interest: the author sent me a copy for review, and I'm a co-author of LWP.

Published by O'Reilly, 2002. ISBN 0596001789

Web Client Programming with Perl

By Clinton Wong

This book is now out of print, but is freely available through the O'Reilly Open Books Project.

Published by O'Reilly, 1997. ISBN 156592214X

Bots and Other Internet Beasties

By Joseph Williams

I haven't seen this myself, but someone said:

"The Williams book 'Bots and other Internet Beasties' was quite disappointing. It claims to be a 'how to' book on writing robots, but my impression is that it is nothing more than a collection of chapters, written by various people involved in this area and subsequently bound together."

Published by Pearson Education, 1996. ISBN 1575210169.

Internet Agents: Spiders, Wanderers, Brokers, and Bots

By Fah-Chun Cheong

I believe this book is out of print. This book covers Web robots, commerce transaction agents, MUD agents, and a few others. It includes source code for a simple Web robot built on top of libwww-perl4.

Its coverage of HTTP, HTML, and Web libraries is a bit too thin to be a "how to write a web robot" book, but it provides useful background reading and a good overview of the state of the art, especially if you haven't got the time to find all the information yourself on the Web.

Published by New Riders, 1995. ISBN 1-56205-463-5.
