
Question: How to speed up crawling in Nutch


I am developing an application in which I give Nutch a constrained set of URLs in the urls file. I am able to crawl these URLs and get their contents by reading the data from the segments.

I crawl with depth 1 because I am not concerned with the outlinks or inlinks of the pages. I only need the contents of the pages listed in the urls file.

However, this crawl takes a long time. Please suggest a way to reduce the crawl time and increase the crawl speed. I also don't need indexing, because I am not concerned with the search part.
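
For context, the settings that usually govern fetch speed in Nutch 1.x are overridden in conf/nutch-site.xml. The fragment below is a minimal sketch, assuming Nutch 1.x and that the target servers can tolerate the extra load; the values are illustrative assumptions, not tuned recommendations.

```xml
<!-- Sketch of conf/nutch-site.xml overrides; values are assumptions. -->
<configuration>
  <!-- Total number of fetcher threads (default is 10). -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
  </property>
  <!-- Threads allowed to hit the same host queue at once (default is 1).
       Matters most when many URLs in the urls file share a host. -->
  <property>
    <name>fetcher.threads.per.queue</name>
    <value>2</value>
  </property>
  <!-- Seconds to wait between requests to the same server (default is 5.0).
       Lowering it speeds up the fetch but is less polite to the server. -->
  <property>
    <name>fetcher.server.delay</name>
    <value>1.0</value>
  </property>
</configuration>
```

With the default single thread per queue and a 5 second delay, a urls file dominated by one host is fetched roughly one page every 5 seconds regardless of the total thread count, so the per-queue and delay settings are typically the ones that matter for a small, fixed URL list.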
asked Sep 13, 2013 in Crawl by rajesh
edited Sep 12, 2013
0 votes
48 views


