questions - Question:How to speed up crawling in Nutch

I am trying to develop an application in which I'll give a constrained set of urls to the urls file in Nutch. I am able to crawl these urls and get the contents of them by reading the data from the segments.

I have crawled by giving the depth 1 as I am no way concerned about the outlinks or inlinks in the webpage. I only need the contents of that webpages in the urls file.

But performing this crawl takes time. So, suggest me a way to decrease the crawl time and increase the speed of crawl. I also dont need indexing because I am not concerned about the search part.

asked Sep 13, 2013 in Crawl by rajesh
edited Sep 12, 2013
0 votes

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
Anti-spam verification:
To avoid this verification in future, please log in or register.