We had come up with the following requirements for the Scheduler Extender:
- The apiserver should remain the API endpoint for pods, and existing tools like kubectl should work as is
- Any networking, storage, or quality-of-service requirements should be specified in the pod spec
- Should leverage the CPU- and memory-based scheduling available in k8s
- Should work with the k8s binaries delivered by OS vendors (like Red Hat or CoreOS)
- Should implement a generic interface that the community can benefit from
The Details (From #13580)
There are three ways to add new scheduling rules (predicates and priority functions) to Kubernetes:
- by adding these rules to the scheduler and recompiling (described here: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/scheduler.md),
- implementing your own scheduler process that runs instead of, or alongside, the standard Kubernetes scheduler,
- implementing a “scheduler extender” process that the standard Kubernetes scheduler calls out to as a final pass when making scheduling decisions.
The third approach is needed for use cases where scheduling decisions need to be made on resources not directly managed by the standard Kubernetes scheduler. The extender helps make scheduling decisions based on such resources. (Note that the three approaches are not mutually exclusive.)
When scheduling a pod, the extender allows an external process to filter and prioritize nodes. The extender is configured through the k8s scheduler policy file.
A sample scheduler policy file with extender configuration:
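As a minimal sketch of what such a policy file might look like, assuming an extender process listening at a hypothetical local address: the URL, port, and the particular predicates and priorities listed are illustrative placeholders, and the exact field names should be checked against the scheduler version in use.

```json
{
  "predicates": [
    {"name": "HostName"},
    {"name": "MatchNodeSelector"},
    {"name": "PodFitsResources"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1}
  ],
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:12345/api/scheduler",
      "filterVerb": "filter",
      "prioritizeVerb": "prioritize",
      "weight": 1,
      "enableHttps": false
    }
  ]
}
```

In this sketch the scheduler would reach the extender's "filter" and "prioritize" calls by appending the verb to the URL prefix.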
The “filter” call returns a list of nodes (api.NodeList). The “prioritize” call returns priorities for each node (schedulerapi.HostPriorityList).
The “filter” call may prune the set of nodes based on its predicates. Scores returned by the “prioritize” call are added to the k8s scores (computed through its priority functions) and used for final host selection.
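The filter-then-add-scores flow described above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the scheduler's actual code: `HostPriority` is a simplified stand-in for an entry of schedulerapi.HostPriorityList, the function names are hypothetical, and the tie-break by node name is an assumption made only to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// HostPriority is a simplified stand-in for one entry of
// schedulerapi.HostPriorityList: a node name and its score.
type HostPriority struct {
	Host  string
	Score int
}

// filter keeps only the nodes accepted by the extender's predicate,
// mirroring how the "filter" call may prune the candidate set.
func filter(nodes []string, fits func(node string) bool) []string {
	kept := []string{}
	for _, n := range nodes {
		if fits(n) {
			kept = append(kept, n)
		}
	}
	return kept
}

// combine adds the extender's scores to the scheduler's own scores, per node.
func combine(k8s, ext []HostPriority) map[string]int {
	total := make(map[string]int)
	for _, p := range k8s {
		total[p.Host] += p.Score
	}
	for _, p := range ext {
		total[p.Host] += p.Score
	}
	return total
}

// selectHost returns the node with the highest combined score.
// Ties go to the lexicographically first name (an assumption for this sketch).
func selectHost(total map[string]int) string {
	hosts := make([]string, 0, len(total))
	for h := range total {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts)
	best := ""
	for _, h := range hosts {
		if best == "" || total[h] > total[best] {
			best = h
		}
	}
	return best
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	// Hypothetical extender predicate: node-c lacks the external resource.
	kept := filter(nodes, func(n string) bool { return n != "node-c" })

	k8s := []HostPriority{{Host: "node-a", Score: 7}, {Host: "node-b", Score: 5}}
	ext := []HostPriority{{Host: "node-a", Score: 1}, {Host: "node-b", Score: 8}}
	total := combine(k8s, ext)

	fmt.Println(kept, total, selectHost(total))
}
```

Here node-a leads on the built-in priority functions alone, but the extender's score for node-b tips the final selection, which is the point of letting an external process weigh in on resources the scheduler does not manage itself.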
Multiple extenders can be configured in the scheduler policy.
Please send us any feedback at [email protected].