- The FIFO Scheduler
- Places applications in a queue and runs them in the order of submission.
- Requests for the first application in the queue are allocated first, then once its requests have been satisfied the next application in the queue is served, and so on.
- Advantages:
- simple to understand
- needs no configuration
- Drawbacks:
- not suitable for shared clusters
- A large application will use all the resources in the cluster,
- so every other application has to wait its turn.
- On a shared cluster it is better to use the Capacity Scheduler or the Fair Scheduler.
- Both of these allow long-running jobs to complete in a timely manner,
- while still allowing users who are running concurrent smaller ad hoc queries to get results back in a reasonable time.
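The waiting problem with FIFO can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how YARN is implemented: every job is submitted at time 0, a job's size is given in container-seconds, and the job at the head of the queue gets the whole cluster until it is done. The function name `fifo_finish_times` and the job sizes are made up for illustration.

```python
from typing import Dict, List, Tuple


def fifo_finish_times(jobs: List[Tuple[str, float]], containers: int) -> Dict[str, float]:
    """Toy FIFO model: jobs are (name, work in container-seconds), listed in
    submission order and all submitted at t=0. The head of the queue uses
    every container until it completes, then the next job is served."""
    clock = 0.0
    finish = {}
    for name, work in jobs:
        clock += work / containers  # the job runs alone on the whole cluster
        finish[name] = clock
    return finish


# Hypothetical workload: one large job submitted just ahead of a small query.
fifo_times = fifo_finish_times([("large", 1000.0), ("small", 10.0)], containers=10)
print(fifo_times)
```

In this model the small job needs only one second of cluster time, yet it cannot finish until the large job ahead of it has completed.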
- The Capacity Scheduler
- A separate dedicated queue allows a small job to start as soon as it is submitted,
- although this comes at the cost of
- overall cluster utilization,
- since the queue capacity is reserved for jobs in that queue.
- This means that a large job finishes later than it would under the FIFO Scheduler.
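The trade-off can be sketched with a toy model of reserved queue capacity. The assumptions are loud ones: the queue names `batch` and `adhoc`, the 8/2 container split, and the job sizes are all hypothetical; each queue runs a single job; and reserved capacity is never lent to another queue (the real Capacity Scheduler can be configured to let queues grow beyond their share, which this sketch ignores).

```python
from typing import Dict, Tuple


def capacity_finish_times(
    jobs: Dict[str, Tuple[str, float]],
    queue_containers: Dict[str, int],
) -> Dict[str, float]:
    """Toy model: jobs maps name -> (queue, work in container-seconds).
    Each job runs only on the containers reserved for its own queue,
    starting at t=0, with one job per queue."""
    finish = {}
    for name, (queue, work) in jobs.items():
        finish[name] = work / queue_containers[queue]
    return finish


# Hypothetical 10-container cluster split into two dedicated queues.
queues = {"batch": 8, "adhoc": 2}
cap_times = capacity_finish_times(
    {"large": ("batch", 1000.0), "small": ("adhoc", 10.0)},
    queues,
)
# Time the large job would need if it could use all 10 containers.
large_alone = 1000.0 / sum(queues.values())
print(cap_times, large_alone)
```

Here the small job starts immediately on its own two containers, while the large job is limited to eight containers and so takes longer than it would with the whole cluster to itself.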
- The Fair Scheduler
- There is no need to reserve a set amount of capacity, since the scheduler dynamically balances resources between all running jobs.
- When a second (small) job starts while a large job is running, it is allocated half of the cluster resources, so that each job uses its fair share.
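The dynamic balancing can be sketched as a small event-driven simulation. This is an idealized model, not YARN's actual algorithm: it assumes resources are split equally among running jobs and handed over instantly, with no wait for containers to be released. The function name `fair_finish_times`, the arrival times, and the job sizes are illustrative only.

```python
from typing import Dict, List, Tuple


def fair_finish_times(jobs: List[Tuple[str, float, float]], containers: int) -> Dict[str, float]:
    """Toy fair-share model: jobs are (name, arrival time, work in
    container-seconds). At any moment the cluster is divided equally
    among the jobs that are currently running."""
    pending = sorted(jobs, key=lambda job: job[1])
    remaining: Dict[str, float] = {}
    finish: Dict[str, float] = {}
    clock = 0.0
    i = 0
    while i < len(pending) or remaining:
        if not remaining:
            clock = max(clock, pending[i][1])  # idle until the next arrival
        while i < len(pending) and pending[i][1] <= clock:
            remaining[pending[i][0]] = pending[i][2]  # admit arrived jobs
            i += 1
        share = containers / len(remaining)  # equal split of the cluster
        to_finish = min(remaining.values()) / share
        to_arrival = pending[i][1] - clock if i < len(pending) else float("inf")
        step = min(to_finish, to_arrival)  # advance to the next event
        clock += step
        for name in list(remaining):
            remaining[name] -= share * step
            if remaining[name] <= 1e-9:
                finish[name] = clock
                del remaining[name]
    return finish


# Hypothetical workload: the small job arrives 10 s after the large one.
fair_times = fair_finish_times([("large", 0.0, 1000.0), ("small", 10.0, 10.0)], containers=10)
print(fair_times)
```

In this run the large job has the whole cluster for the first 10 seconds, both jobs then get five containers each until the small one completes, and the large job takes the full cluster back afterwards.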
- Delay Scheduling
13.5.15
Apache YARN Scheduler