If a TaskTracker is not performing properly, it can be blacklisted so that no new tasks are scheduled to run on it. There are two types of TaskTracker blacklisting:
- Per-job blacklisting, which prevents scheduling new tasks from a particular job.
- Cluster-wide blacklisting, which prevents scheduling new tasks from all jobs.
Per-Job Blacklisting
The configuration value mapred.max.tracker.failures in mapred-site.xml (MapReduce v1) specifies a number of task failures in a specific job after which the TaskTracker is blacklisted for that job. The TaskTracker can still accept tasks from other jobs, as long as it is not blacklisted cluster-wide (see below).
A job can only blacklist up to 25% of TaskTrackers in the cluster.
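To check which threshold is configured on a node, the property can be read directly from the configuration file. The following is a minimal sketch, not a supported tool: get_property is a hypothetical helper name, and it assumes the usual Hadoop layout in which each <name> element is followed by its <value> element, each on a single line.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name>
# in a Hadoop-style XML configuration file.
# Usage: get_property FILE NAME
# Assumes <name> and <value> each sit on one line; this is not a full XML parser.
get_property() {
  awk -v name="$2" '
    # Remember that the requested property name has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> element at or after that point, then stop.
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character opening tag and the 8-character closing tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}
```

For example, get_property /path/to/mapred-site.xml mapred.max.tracker.failures prints the configured failure threshold, or nothing if the property is not set in that file (in which case the built-in default applies). The path is a placeholder; the location of mapred-site.xml depends on the installation.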
Cluster-Wide Blacklisting
A TaskTracker can be blacklisted cluster-wide for any of the following reasons:
- The number of blacklists from successful jobs (the fault count) exceeds mapred.max.tracker.blacklists.
- The TaskTracker has been manually blacklisted using hadoop job -blacklist-tracker <host>.
- The status of the TaskTracker (as reported by a user-provided health-check script) is not healthy.
The parameter mapred.job.impact.blacklisting in the mapred-site.xml file lets you specify whether job failures should count toward the threshold set with mapred.max.tracker.blacklists. This parameter can be helpful when you are testing and know that jobs are likely to fail.
If a TaskTracker is blacklisted, any currently running tasks are allowed to finish, but no further tasks are scheduled. If a TaskTracker has been blacklisted due to mapred.max.tracker.blacklists or using the hadoop job -blacklist-tracker <host> command, un-blacklisting requires a TaskTracker restart.
Only 50% of the TaskTrackers in a cluster can be blacklisted at any one time.
After 24 hours, the TaskTracker is automatically removed from the blacklist and can accept tasks again.
Blacklisting a TaskTracker Manually
To blacklist a TaskTracker manually, run the following command as the administrative user:
hadoop job -blacklist-tracker <hostname>
Manually blacklisting a TaskTracker prevents additional tasks from being scheduled on the TaskTracker. Any currently running tasks are allowed to finish.
Un-blacklisting a TaskTracker Manually
If a TaskTracker is blacklisted per job, you can un-blacklist it by running the following command as the administrative user:
hadoop job -unblacklist <jobid> <hostname>
If a TaskTracker has been blacklisted cluster-wide due to mapred.max.tracker.blacklists or using the hadoop job -blacklist-tracker <host> command, use the following command as the administrative user to remove that node from the blacklist:
hadoop job -unblacklist-tracker <hostname>
If a TaskTracker has been blacklisted cluster-wide due to a non-healthy status, correct the indicated problem and run the health check script again. When the script picks up the health status, the TaskTracker is un-blacklisted.
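The choice between the two un-blacklist commands above can be sketched as a small shell function. This is only an illustration: unblacklist_cmd and its per-job and cluster keywords are hypothetical names, and the function prints the command instead of running it so that it can be reviewed first. It does not cover the health-check case, which is resolved by fixing the reported problem.

```shell
# Hypothetical helper: print (do not run) the un-blacklist command that
# matches the blacklist type.
# Usage: unblacklist_cmd per-job <jobid> <hostname>
#        unblacklist_cmd cluster <hostname>
unblacklist_cmd() {
  case "$1" in
    per-job)
      # Per-job blacklisting needs both the job ID and the host name.
      [ "$#" -eq 3 ] || { echo "usage: unblacklist_cmd per-job <jobid> <hostname>" >&2; return 2; }
      printf 'hadoop job -unblacklist %s %s\n' "$2" "$3"
      ;;
    cluster)
      # Cluster-wide blacklisting (fault count or manual) needs only the host name.
      [ "$#" -eq 2 ] || { echo "usage: unblacklist_cmd cluster <hostname>" >&2; return 2; }
      printf 'hadoop job -unblacklist-tracker %s\n' "$2"
      ;;
    *)
      echo "unknown blacklist type: $1" >&2
      return 2
      ;;
  esac
}
```

Once the printed command has been checked, it can be run as the administrative user.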