From the Nutch Wiki:
How do I index my local file system?
1) crawl-urlfilter.txt needs a change to allow file: URLs while not following http: ones. Otherwise it either won't index anything, or it will jump off your disk onto web sites. Change this line:
2) crawl-urlfilter.txt may have rules at the bottom to reject some URLs. If it has this fragment it's probably ok:
  # accept anything else
  +.*
3) I changed my nutch.xml to include the following:
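
Separately from the nutch.xml change in step 3, it may help to see why the rule order in steps 1 and 2 matters. Here is a minimal Python sketch (not Nutch code, and not the nutch.xml fragment) of how a crawl-urlfilter.txt style rule list is evaluated. It assumes the usual behaviour of the Nutch regex filter: rules are tried top to bottom, the first pattern found in the URL decides, "+" accepts, "-" rejects, and a URL that matches nothing is rejected. The rule text in RULES and the function names parse_rules and accepts are made up for illustration; they are not the wiki's exact file contents.

```python
import re

# Hypothetical rule set for illustration only (an assumption, not the
# actual lines from the wiki or from a shipped crawl-urlfilter.txt).
RULES = """
# skip http:, https:, ftp:, and mailto: URLs
-^(http|https|ftp|mailto):
# accept anything else
+.*
"""


def parse_rules(text):
    """Turn rule text into a list of (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no rule
        sign, pattern = line[0], line[1:]
        if sign not in ("+", "-"):
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose pattern is found in url."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # no rule matched, so the URL is dropped


if __name__ == "__main__":
    rules = parse_rules(RULES)
    print(accepts(rules, "file:///home/user/notes.txt"))  # True
    print(accepts(rules, "http://example.com/"))           # False
```

With rules like these, file: URLs fall through to the final "+.*" and get indexed, while http: links are rejected before the crawler can follow them off the disk. If the trailing "+.*" were missing, nothing would match the file: URLs and nothing would be indexed, which is the failure described in step 1.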