Why does submitting a job to MapReduce take so much time in general?


Tags: hadoop, mapreduce

On a 20-node cluster, submitting a job to process 3 GB of data (200 splits) usually takes about 30 seconds, while the actual execution takes about 1 minute.
I want to understand what the bottleneck in the job submission process is, and to make sense of the following quote:


  Per-MapReduce overhead is significant: Starting/ending MapReduce job costs time


Some steps I'm aware of:
1. Data splitting
2. JAR file sharing
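
To put the figures quoted above in perspective, here is a minimal sketch of the arithmetic in plain Java (no Hadoop dependency). The class and method names are made up for illustration, the inputs are the numbers from my cluster, and 1 GB is assumed to be 1024 MB.

```java
public class SubmitOverhead {
    // Fraction of total wall-clock time spent on submission, before the job runs.
    public static double overheadFraction(double submitSeconds, double runSeconds) {
        return submitSeconds / (submitSeconds + runSeconds);
    }

    // Average amount of input covered by one split, in megabytes.
    public static double megabytesPerSplit(double totalMegabytes, int splits) {
        return totalMegabytes / splits;
    }

    public static void main(String[] args) {
        double submitSeconds = 30.0;          // submission time from the question
        double runSeconds = 60.0;             // execution time from the question
        double totalMegabytes = 3 * 1024.0;   // 3 GB, assuming 1 GB = 1024 MB
        int splits = 200;                     // number of splits from the question

        System.out.println("submission share of total time: "
                + overheadFraction(submitSeconds, runSeconds));
        System.out.println("megabytes per split: "
                + megabytesPerSplit(totalMegabytes, splits));
    }
}
```

With these figures, submission accounts for roughly a third of the total wall-clock time, and each split covers only about 15 MB, so the fixed per-job cost is large relative to the work each task does.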
    

asked Oct 11, 2015 by rolvyrf
0 votes
8 views




2 Answers

0 votes
NULL
answered Oct 11, 2015 by gauravsinghal83
0 votes
NULL
answered Oct 11, 2015 by sujata naik

...