Why does submitting a job to MapReduce take so much time in general?


Tags: hadoop, mapreduce

On a 20-node cluster, submitting a job to process 3 GB of data (200 splits) usually takes about 30 seconds, and the actual execution takes about 1 minute.
I want to understand what the bottleneck in the job submission process is, and to understand the following quote:


  Per-MapReduce overhead is significant: Starting/ending MapReduce job costs time


Some steps I'm aware of:
1. Data splitting
2. JAR file sharing
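
To put the figures above in perspective, here is a minimal sketch that works out the average split size and the share of wall-clock time spent on submission. It uses only the numbers from this question. The function names are hypothetical. It assumes 1 GB = 1024 MB and that the 1-minute execution time does not include the 30-second submission.

```python
# Figures quoted in the question.
DATA_GB = 3
NUM_SPLITS = 200
SUBMIT_SECONDS = 30
EXECUTION_SECONDS = 60


def average_split_mb(data_gb, num_splits):
    """Average input split size in MB (assumes 1 GB = 1024 MB)."""
    return data_gb * 1024 / num_splits


def submission_share(submit_seconds, execution_seconds):
    """Fraction of total time spent on submission.

    Assumes the total is submission time plus execution time.
    """
    return submit_seconds / (submit_seconds + execution_seconds)


if __name__ == "__main__":
    split_mb = average_split_mb(DATA_GB, NUM_SPLITS)
    share = submission_share(SUBMIT_SECONDS, EXECUTION_SECONDS)
    print(f"average split size: {split_mb:.2f} MB")
    print(f"submission share of total time: {share:.0%}")
```

Under these assumptions each split averages roughly 15 MB, and submission accounts for about a third of the total time, which is why the per-job overhead in the quote matters for a job of this size.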
    
asked Oct 11, 2015 by rolvyrf


