Skip to content

liveness_probe isn't running sometimes #15874

@urucoder

Description

@urucoder
  • Package Name: azure-mgmt-containerinstance
  • Package Version: 2.0.0
  • Operating System: Docker image based on Ubuntu 18.04
  • Python Version: 3.8.6

Describe the bug
I'm using liveness_probe to run a script every hour to monitor my ACI (Azure container instance) but there are some times, where liveness_probe even doesn't run on the init of the ACI. It's not consistent and there's no error message to track, given there's happening on the ACI I'm not sure how to provide a better report.

To Reproduce
Steps to reproduce the behavior:

  1. Write a bash script and save it on a docker image with execution permissions
  2. Launch an ACI with liveness_probe configured to run the script every hour

Expected behavior
The script runs every time an ACI is launched and every hour until it stops

Additional context
I'll let you a piece of code of how I'm running this
Bash script

    #!/bin/bash

   function getCpuUsage {
      echo `top -b -n1 | grep "Cpu(s)" | awk '{print int($2 + $4)}'`
   }

   function getRamUsage {
       echo `free -m| grep  Mem | awk '{ print int($3/$2*100) }'`
   }

  resPath='/tmp/resources'

  if [[ ! -f "${resPath}" ]]; then 
    getCpuUsage >> $resPath && getRamUsage >> $resPath
    echo "initial usage: " && cat $resPath
  else
    n=1
    cpu=0
    ram=0
    while read line; do
        if [[ $n -eq 1 ]]; then
            cpu=$line
        else
            ram=$line
	    fi
        n=$((n+1))
    done < $resPath
    
    curCpu=$(getCpuUsage)
    curRam=$(getRamUsage)
    echo "
    current cpu: $curCpu
    current ram: $curRam"
    
    if [[ $((cpu+2)) -ge $curCpu && $((ram+2)) -ge $curRam ]]; then
        exit 1
    else
        exit 0
    fi
fi

Piece of python code

    from azure.mgmt.containerinstance import ContainerInstanceManagementClient
    from azure.mgmt.containerinstance.models import (ContainerGroup, Container, ContainerGroupRestartPolicy,
         ResourceRequests, ResourceRequirements, OperatingSystemTypes, ContainerProbe, ContainerExec)

    aciclient = ContainerInstanceManagementClient(credential, subscription.subscription_id)
    container_resource_requests = ResourceRequests(memory_in_gb=16, cpu=4.0)
    container_resource_requirements = ResourceRequirements(requests=container_resource_requests)

    start_command_line = f"python3 /app/test.py 2>&1 | tee /mnt/logs/logger.log"
    container_exec = ContainerExec(command=["sh", "-c", "/app/monitor.sh"])
    liveness_probe = ContainerProbe(exec_property=container_exec, period_seconds=3600)
    container = Container(name="test_container",
                          image="test_image",
                          resources=container_resource_requirements,
                          command=["sh", "-c", start_command_line],
                          liveness_probe=liveness_probe,)

    # Configure the container group
    group = ContainerGroup(location="westus",
                           containers=[container],
                           os_type=OperatingSystemTypes.linux,
                           restart_policy=ContainerGroupRestartPolicy.on_failure,)
    aciclient.container_groups.create_or_update(resource_group.name, cgname, group)

Let me know if I can provide more information on this, thanks.

Metadata

Metadata

Assignees

Labels

Container InstancesMgmtThis issue is related to a management-plane library.customer-reportedIssues that are reported by GitHub users external to the Azure organization.issue-addressedWorkflow: The Azure SDK team believes it to be addressed and ready to close.questionThe issue doesn't require a change to the product in order to be resolved. Most issues start as that

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions