Question: How to access logs through the command line?


Here is an example of the logs in ~/hadoop/logs (on the machine used here, the log directory is /mnt/win/data/logs/):
cd  /mnt/win/data/logs/
ls -ltr
-rwxrwxrwx 1 root root 15862 Jan  6 09:58 job_201012161155_0004_conf.xml
drwxrwxrwx 1 root root  4096 Jan  6 09:58 history
drwxrwxrwx 1 root root   368 Jan  6 09:58 userlogs
cd history
ls -ltr
-rwxrwxrwx 1 root root  15862 Jan  6 09:58 hadoop110_1292518522985_job_201012161155_0004_conf.xml
-rwxrwxrwx 1 root root 102324 Jan  6 10:02 hadoop110_1292518522985_job_201012161155_0004_hadoop_wordcount
The last log listed in the history directory is the interesting one. It records the start and end times of all the tasks that ran during the execution of our Hadoop program.
It contains several different types of lines:
Lines starting with "Job" refer to the job itself and list information about it (priority, submit time, configuration, number of map tasks, number of reduce tasks, etc.).
Job JOBID="job_201004011119_0025" LAUNCH_TIME="1270509980407" TOTAL_MAPS="12" TOTAL_REDUCES="1" JOB_STATUS="PREP" 
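To read these lines from a script rather than by eye, the KEY="value" pairs can be pulled apart with a regular expression. This is only a minimal sketch based on the sample line above: the helper name parse_history_line is made up for illustration, and it assumes each record sits on a single line and that the timestamps are milliseconds since the Unix epoch.

```python
import re
from datetime import datetime, timezone

# Matches KEY="value" pairs as they appear in the sample line above.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_history_line(line):
    """Return (record_type, fields) for one history-log line.

    Hypothetical helper: the record type is taken to be the first word.
    """
    record_type, _, rest = line.strip().partition(" ")
    return record_type, dict(PAIR_RE.findall(rest))

sample = ('Job JOBID="job_201004011119_0025" LAUNCH_TIME="1270509980407" '
          'TOTAL_MAPS="12" TOTAL_REDUCES="1" JOB_STATUS="PREP" ')

record_type, fields = parse_history_line(sample)

# Assumption: LAUNCH_TIME is in milliseconds since the Unix epoch.
launched = datetime.fromtimestamp(int(fields["LAUNCH_TIME"]) / 1000, tz=timezone.utc)

print(record_type, fields["JOBID"], fields["TOTAL_MAPS"], launched.isoformat())
```

All values come back as strings, so numeric fields such as TOTAL_MAPS need an explicit int() before doing arithmetic on them.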
Lines starting with "Task" refer to the creation or completion of map or reduce tasks, indicating which host they start on and which split they work on. On completion, all the counters associated with the task are listed.
Task TASKID="task_201012161155_0004_m_000000" TASK_TYPE="MAP" START_TIME="1294325917422"\
MapAttempt TASK_TYPE="MAP" TASKID="task_201012161155_0004_m_000000" \
                      TASK_ATTEMPT_ID="attempt_201012161155_0004_m_000000_0" TASK_STATUS="SUCCESS"
                      FINISH_TIME="1294325918358" HOSTNAME="/default-rack/hadoop110" 
                      [(MAP_OUTPUT_BYTES)(Map output bytes)(66441)][(MAP_INPUT_BYTES)(Map input bytes)(39285)]
                      [ (COMBINE_INPUT_RECORDS)(Combine input records)(7022)][(MAP_OUTPUT_RECORDS)
                      (Map output records)(7022)]}" .
Lines starting with "MapAttempt" mostly report status updates. When they contain the keywords SUCCESS and/or FINISH_TIME, they indicate that the task has completed, and the line includes the time at which the task finished.
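As a rough illustration of how the Task and MapAttempt lines can be combined, the sketch below computes how long each successful map task took. It assumes each record is on one line (the samples above are wrapped for display), that START_TIME comes from the "Task" line and FINISH_TIME from the successful "MapAttempt" line as in the samples, and that both are in milliseconds. The function name map_task_durations is made up for this example.

```python
import re

# Matches KEY="value" pairs as they appear in the sample lines.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

def map_task_durations(lines):
    """Return {task_id: duration_ms} for map tasks that finished with SUCCESS."""
    starts, durations = {}, {}
    for line in lines:
        record_type, _, rest = line.strip().partition(" ")
        fields = dict(PAIR_RE.findall(rest))
        task_id = fields.get("TASKID")
        if record_type == "Task" and "START_TIME" in fields:
            # Remember when the task was created.
            starts[task_id] = int(fields["START_TIME"])
        elif (record_type == "MapAttempt"
              and fields.get("TASK_STATUS") == "SUCCESS"
              and "FINISH_TIME" in fields
              and task_id in starts):
            # A successful attempt carries the finish time of the task.
            durations[task_id] = int(fields["FINISH_TIME"]) - starts[task_id]
    return durations

# The two sample records from above, each joined onto a single line.
lines = [
    'Task TASKID="task_201012161155_0004_m_000000" TASK_TYPE="MAP" '
    'START_TIME="1294325917422"',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_201012161155_0004_m_000000" '
    'TASK_ATTEMPT_ID="attempt_201012161155_0004_m_000000_0" TASK_STATUS="SUCCESS" '
    'FINISH_TIME="1294325918358" HOSTNAME="/default-rack/hadoop110"',
]

print(map_task_durations(lines))
# → {'task_201012161155_0004_m_000000': 936}
```

For the sample task this gives 936 ms between the start recorded on the Task line and the finish recorded on the MapAttempt line.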
Lines starting with "ReduceAttempt" are similar to the MapAttempt lines: they report on the intermediate status of the tasks, and when the keyword SUCCESS is included, the finish times of the sort and shuffle phases are also included.
 ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201012161155_0004_r_000005" 
                     TASK_ATTEMPT_ID="attempt_201012161155_0004_r_000005_0" START_TIME="1294325924281"
                     TRACKER_NAME="tracker_hadoop102:localhost/127.0.0.1:40971" HTTP_PORT="50060" .
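To see where each reduce attempt started, the same key/value approach can be applied to the ReduceAttempt lines. This is a sketch only: reduce_attempt_summary is a made-up helper, and it assumes that TRACKER_NAME always has the form "tracker_<host>:<address>" seen in the sample and that the record is on a single line.

```python
import re

# Matches KEY="value" pairs as they appear in the sample line.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

def reduce_attempt_summary(line):
    """Return attempt id, start time (ms), tracker host and HTTP port,
    or None if the line is not a ReduceAttempt record."""
    record_type, _, rest = line.strip().partition(" ")
    if record_type != "ReduceAttempt":
        return None
    fields = dict(PAIR_RE.findall(rest))
    # Assumption: TRACKER_NAME looks like "tracker_<host>:<address>".
    host = fields.get("TRACKER_NAME", "").split(":", 1)[0]
    if host.startswith("tracker_"):
        host = host[len("tracker_"):]
    return {
        "attempt": fields.get("TASK_ATTEMPT_ID"),
        "start_ms": int(fields["START_TIME"]) if "START_TIME" in fields else None,
        "host": host,
        "http_port": fields.get("HTTP_PORT"),
    }

# The sample record from above, joined onto a single line.
sample = (' ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201012161155_0004_r_000005" '
          'TASK_ATTEMPT_ID="attempt_201012161155_0004_r_000005_0" '
          'START_TIME="1294325924281" '
          'TRACKER_NAME="tracker_hadoop102:localhost/127.0.0.1:40971" '
          'HTTP_PORT="50060" .')

summary = reduce_attempt_summary(sample)
print(summary["attempt"], summary["host"], summary["http_port"])
# → attempt_201012161155_0004_r_000005_0 hadoop102 50060
```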

asked Sep 13, 2013 in Hadoop by anonymous
edited Sep 12, 2013