Limiting service resources

So far, we have not spent much time talking about service isolation with regard to the resources available to the services, but it is a very important topic to cover. Without resource limits, a malicious or misbehaving service could bring the whole cluster down, depending on the severity of the issue, so great care needs to be taken to specify exactly what resource allowance individual service tasks are permitted to use.

The generally accepted strategy for handling cluster resources is as follows:

  • Any resource that can cause errors or failures in other services when consumed beyond its intended allotment should be limited at the service level (see the sketch after this list). This is usually the RAM allocation, but may also include CPU or other resources.
  • Any resource, particularly a hardware one, for which you have an external limit should also be limited for the Docker containers (for example, when you are only allowed to use a specific portion of a 1-Gbps NAS connection).
  • Anything that needs to run on a specific device, machine, or host should be locked to those resources in the same fashion. This kind of setup is very common when only a certain number of machines have the right hardware for a service, such as in GPU computing clusters.
  • Any resource that you would like specifically rationed within the cluster should generally have a limit applied. This includes things such as lowering the CPU time percentage for low-priority services.
  • In most cases, the rest of the resources should be fine using normal allocations of the available resources of the host.
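
As a concrete illustration of the first and third points above, the following is a minimal sketch of how such limits can be expressed on a Docker Swarm service; the service name (web), the image (nginx:alpine), the node name (node-1), and the gpu node label are hypothetical placeholders:

    # Label the nodes that actually have the required hardware
    # (the "gpu" label here is a hypothetical example)
    $ docker node update --label-add gpu=true node-1

    # Create a service capped at half a CPU core and 256 MB of RAM,
    # reserving a quarter core and 128 MB for the scheduler, and
    # pinned to nodes that carry the gpu label
    $ docker service create \
        --name web \
        --limit-cpu 0.5 \
        --limit-memory 256m \
        --reserve-cpu 0.25 \
        --reserve-memory 128m \
        --constraint 'node.labels.gpu == true' \
        nginx:alpine

The limits cap what the running tasks may consume, while the reservations tell the scheduler how much capacity to set aside for them on each node.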

By applying these rules, we ensure that our cluster is more stable and secure, with exactly the division of resources that we want among the services. Also, if the exact resources required by a service are specified, the orchestration tool can usually make better decisions about where to schedule newly created tasks, so that the service density per Engine is maximized, as the sketch after this paragraph illustrates.
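
To make the scheduling point concrete, here is a hedged sketch of how memory reservations guide placement; the service name (worker), the image, and the replica count are assumptions for illustration. Swarm only places a task on a node whose unreserved memory can satisfy the reservation, so declaring it lets the scheduler pack each Engine as densely as is actually safe:

    # Each replica reserves 512 MB of RAM; the scheduler will skip any
    # node whose unreserved memory cannot cover that amount, and a node
    # with, say, 2 GB of unreserved RAM will receive at most 4 tasks
    $ docker service create \
        --name worker \
        --replicas 8 \
        --reserve-memory 512m \
        alpine:latest sleep 3600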
