
Improve upgrade mechanisms to keep service as healthy as possible #8

Open
s4ke opened this issue Jan 9, 2023 · 4 comments

Comments

@s4ke
Member

s4ke commented Jan 9, 2023

Currently we only wait until the node is drained. We should investigate whether it is feasible to also wait for all stacks to finish being moved over. Should we wait for all services to stop scheduling new tasks during a cluster upgrade?

Maybe we need to take a snapshot of all services and their replica counts before the upgrade, and then wait until the same replica counts are reached again?

@s4ke
Member Author

s4ke commented Mar 4, 2023

Something along these lines should help:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright 2023 NeuroForge GmbH & Co. KG <https://neuroforge.de>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
from dataclasses import dataclass
from datetime import datetime
from typing import List

import docker
from docker.models.services import Service

def print_timed(msg):
    to_print = '{} [{}]: {}'.format(
        datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'docker_events',
        msg)
    print(to_print)


@dataclass
class StateInfo:
    service: Service
    target_replicas: int
    actual_replicas: int


def has_long_restart_policy(service: Service):
    """
    detects services with a long restart policy such as
    cron style services with a restart condition
    """
    try:
        restart_policy = service.attrs["Spec"]["TaskTemplate"]["RestartPolicy"]
        delay_ns = restart_policy["Delay"]

        # 10 minutes in nanoseconds
        cutoff_ns = 10 * 60 * 1_000_000_000

        return delay_ns > cutoff_ns
    except (KeyError, TypeError):
        # no restart policy or no delay configured
        return False
    

def is_oneshot(service: Service):
    """
    detects services that are intended as one shot
    """
    try:
        restart_policy = service.attrs["Spec"]["TaskTemplate"]["RestartPolicy"]
        return restart_policy["Condition"] == "none"
    except (KeyError, TypeError):
        # no restart policy configured
        return False


def get_state_infos(client: docker.DockerClient) -> List[StateInfo]:
    state_info: List[StateInfo] = []
    services = client.services.list()
    service: Service
    for service in services:
        mode = service.attrs["Spec"]["Mode"]

        if is_oneshot(service):
            # TODO: if its a one shot, check if the task is still
            #       running
            continue
        if has_long_restart_policy(service):
            continue

        if "Replicated" in mode:
            target_replicas = mode["Replicated"]["Replicas"]
        elif "Global" in mode:
            target_replicas = len(client.nodes.list())
        else: 
            continue

        desired_running_tasks = service.tasks(filters={"desired-state": "running"})
        actually_running_tasks = [
            elem for elem in desired_running_tasks
            if elem["Status"]["State"] == "running"
        ]

        actually_running_tasks_count = len(actually_running_tasks)

        state_info.append(StateInfo(
            service=service,
            target_replicas=target_replicas,
            actual_replicas=actually_running_tasks_count
        ))

    return state_info


def is_settled() -> bool:
    client = docker.DockerClient()
    
    state_info = get_state_infos(client)

    settled_services = [elem for elem in state_info 
                        if elem.actual_replicas == elem.target_replicas]
    unsettled_services = [elem for elem in state_info 
                          if elem.actual_replicas != elem.target_replicas]
    
    unsettled_count = len(unsettled_services)

    for elem in settled_services:
        print_timed(f"OK: service {elem.service.name} ({elem.service.id}) has settled")
    for elem in unsettled_services:
        print_timed(f"NOK: service {elem.service.name} ({elem.service.id}) has not settled yet")
    
    return unsettled_count == 0


if __name__ == '__main__':
    if is_settled():
        print_timed("swarm has settled")
        sys.exit(0)
    else:
        print_timed("swarm has not settled yet")
        sys.exit(1)
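
In an upgrade pipeline this check would presumably be wrapped in a polling loop. A minimal sketch (the function name, timeout and interval below are made up, not part of the script above):

import time

def wait_until_settled(timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    # retry the one-shot check until the swarm converges or the timeout hits
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_settled():
            return True
        time.sleep(poll_s)
    return False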

@s4ke
Member Author

s4ke commented Jun 19, 2023

see moby/moby#34139 (comment)

@s4ke
Member Author

s4ke commented Jun 19, 2023

or moreover moby/moby#34139 (comment)

@s4ke
Member Author

s4ke commented Jun 30, 2023

leaving this here as well

As an alternative approach to moving services off nodes that are about to be drained, it would be worth trying to update services with "--constraint-add 'node.hostname!=$(hostname)'" (or any other constraint) on a per-need basis, instead of deploying them with the constraint from the get-go. I haven't tried this on a multi-node swarm yet, but trying it on a local one-node swarm suggests it is worth exploring further.

This could work:

  1. docker node update $(hostname) --label-add draining=yes
  2. For each service, run: docker service update --constraint-add "node.labels.draining!=yes" <service_name>
  3. actually drain the node
  4. docker node update $(hostname) --label-rm draining
  5. For each service, run: docker service update --constraint-rm "node.labels.draining!=yes" <service_name>

To avoid forcing tasks off all nodes at once, run these steps for one node at a time, in this order. A sketch of the procedure follows below.
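
For illustration, a minimal sketch of steps 1-3 with the Docker SDK for Python (untested on a real multi-node swarm; the helper name drain_node_gracefully is made up, and the draining label is the one from the steps above):

import docker

CONSTRAINT = "node.labels.draining!=yes"

def drain_node_gracefully(client: docker.DockerClient, hostname: str) -> None:
    # the nodes endpoint resolves both IDs and hostnames
    node = client.nodes.get(hostname)

    # step 1: label the node as draining
    spec = node.attrs["Spec"]
    spec.setdefault("Labels", {})["draining"] = "yes"
    node.update(spec)

    # step 2: push tasks off labelled nodes by adding the placement
    # constraint to every service that does not have it yet
    for service in client.services.list():
        placement = service.attrs["Spec"]["TaskTemplate"].get("Placement", {})
        constraints = placement.get("Constraints", [])
        if CONSTRAINT not in constraints:
            service.update(constraints=constraints + [CONSTRAINT])

    # step 3: actually drain the node
    node.reload()
    spec = node.attrs["Spec"]
    spec["Availability"] = "drain"
    node.update(spec)

    # steps 4 and 5 (removing the label and the constraint again)
    # would be the symmetric inverse once the node is back up.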

@s4ke s4ke changed the title Wait for all services to stop scheduling new things during cluster upgrade Improve upgrade mechanisms Dec 29, 2023
@s4ke s4ke changed the title Improve upgrade mechanisms Improve upgrade mechanisms to keep service as healthy as possible Dec 29, 2023