Question: How to speed up crawling in Nutch
I am developing an application in which I give Nutch a constrained set of URLs through its urls file. I am able to crawl these URLs and get their contents by reading the data from the segments.
I crawl with a depth of 1, because I am not concerned with the outlinks or inlinks of the pages. I only need the contents of the pages listed in the urls file.
However, this crawl takes a long time. Please suggest a way to reduce the crawl time and speed up the crawl. I also don't need indexing, because I am not concerned with the search part.
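To make the setup concrete, here is a rough sketch of a single depth-1 round with no indexing step, written as the individual Nutch 1.x step commands (inject, generate, fetch, parse, readseg). It only builds the command lines and does not run them. The function name `crawl_commands`, the directory layout, the segment name, and the thread count are all assumptions for illustration, not values from my actual setup, and the exact options may differ between Nutch versions.

```python
from typing import List


def crawl_commands(nutch: str, crawl_dir: str, urls_dir: str,
                   segment: str, threads: int = 50) -> List[List[str]]:
    """Build the commands for one fetch round (depth 1) with no index step.

    Hypothetical helper: `segment` is the timestamped directory that the
    generate step creates, so in practice it is only known after that step.
    """
    crawldb = f"{crawl_dir}/crawldb"
    segments = f"{crawl_dir}/segments"
    seg_path = f"{segments}/{segment}"
    return [
        # Seed the crawl database from the urls directory.
        [nutch, "inject", crawldb, urls_dir],
        # Create one segment containing the injected URLs.
        [nutch, "generate", crawldb, segments],
        # Fetch the segment; the thread count here is an assumed value.
        [nutch, "fetch", seg_path, "-threads", str(threads)],
        # Parse the fetched pages so their text can be read back.
        [nutch, "parse", seg_path],
        # Dump the segment contents to a directory; no indexing is run.
        [nutch, "readseg", "-dump", seg_path, f"{crawl_dir}/dump"],
    ]


if __name__ == "__main__":
    # Print the commands for inspection instead of executing them.
    for cmd in crawl_commands("bin/nutch", "crawl", "urls", "SEGMENT_NAME"):
        print(" ".join(cmd))
```

Since only one round is needed, there is no updatedb, invertlinks, or index step in this sketch. The fetch step is where I assume most of the time goes.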
Sep 13, 2013
Sep 12, 2013