bot - Nutch robot

Nutch robot

Table of contents
1 Sysadmins/robots.txt
2 Webmasters/Robots META
3 Contact us

Copyright © 2010 The Apache Software Foundation. All rights reserved.

If you're reading this, chances are you've seen a Nutch-based robot visiting your site while looking through your server logs. Our software obeys robots.txt files and robot META tags in HTML. These are the standard mechanisms for webmasters to tell web robots which portions of a site a robot is welcome to access.

1. Sysadmins/robots.txt

We're a software project, not a service, so please understand that a misbehaving crawler appearing with our Agent string is not run by us. Our software may be run by anyone.
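
As a rough illustration of how a robots.txt rule is evaluated for a given agent, here is a minimal Python sketch using the standard library's urllib.robotparser. It does not show Nutch's own implementation. The robots.txt content, the agent token "Nutch", the host example.com, and the helper name is_allowed are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The agent token "Nutch" is an
# assumption for this sketch; a given crawler may announce another name.
ROBOTS_TXT = """\
User-agent: Nutch
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt permits agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        url = "http://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Nutch", url))
```

With these example rules, the agent matching "Nutch" is kept out of /private/ only, while every other agent falls through to the wildcard entry, whose empty Disallow line permits everything.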