queue stuck in INIT state on tarantool 3 #226

Closed
vanyarock01 opened this issue Mar 18, 2024 · 0 comments · Fixed by #227
Labels: 1sp, bug, teamE

Comments

@vanyarock01

Problem

I tried to run a cluster on Tarantool 3 with a queue running on the shards.
Everything was fine until I added replicas to the queue replicaset.
[image attachment]

Sometimes, after the cluster starts, I see strange behavior: the queue gets stuck in the INIT state on the RW instance.
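
The symptom above can be checked with a small helper. This is only a sketch: `is_stuck_in_init` is a hypothetical name, `info` is assumed to have the shape of `box.info` (fields `ro` and `status`), and `queue_state` is assumed to be the string returned by the queue module's `queue.state()`.

```lua
-- Hypothetical helper, not part of the queue module.
-- `info` is assumed to look like box.info (fields `ro` and `status`);
-- `queue_state` is assumed to be the string returned by queue.state().
local function is_stuck_in_init(info, queue_state)
    -- A writable, running instance is not expected to stay in INIT.
    return info.ro == false
        and info.status == 'running'
        and queue_state == 'INIT'
end
```

In a Tarantool console it could be called as `is_stuck_in_init(box.info, queue.state())`. A single `true` shortly after startup may be transient, so the check is only meaningful if it keeps returning `true`.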

Reproduce

Versions:

  • queue 1.3.3
  • Tarantool 3.0.1-0-g31c2ddb on Darwin-x86_64-Release
  • Tarantool CLI version 2.1.2, darwin/amd64. commit: a63ba34 (v2.1.2-37-ga63ba34)

config.yaml

app: {}
credentials:
  users:
    client:
      password: super-client
      roles:
        - super
    replicator:
      password: usage-storage-replicator-secret
      roles:
        - replication
    storage:
      password: usage-storage-storage-secret
      roles:
        - super
        - sharding
groups:
  routers:
    replicasets:
      router_001:
        bootstrap_leader: router_001
        instances:
          router_001:
            iproto:
              listen:
                - uri: 127.0.0.1:6100
        leader: router_001
    roles:
    sharding:
      roles:
        - router
  storages:
    app:
      module: app.storage
    replicasets:
      storage_001:
        bootstrap_leader: storage_001_01
        database:
          replicaset_uuid: 45a6e000-0000-0001-0000-57012a6e0000
        instances:
          storage_001_01:
            database:
              instance_uuid: 45a6e000-0000-0001-0001-57012a6e0000
            iproto:
              listen:
                - uri: 127.0.0.1:6011
          storage_001_02:
            database:
              instance_uuid: 45a6e000-0000-0001-0002-57012a6e0000
            iproto:
              listen:
                - uri: 127.0.0.1:6012
        leader: storage_001_01
      storage_002:
        bootstrap_leader: storage_002_01
        database:
          replicaset_uuid: 45a6e000-0000-0002-0000-57012a6e0000
        instances:
          storage_002_01:
            database:
              instance_uuid: 45a6e000-0000-0002-0001-57012a6e0000
            iproto:
              listen:
                - uri: 127.0.0.1:6021
          storage_002_02:
            database:
              instance_uuid: 45a6e000-0000-0002-0002-57012a6e0000
            iproto:
              listen:
                - uri: 127.0.0.1:6022
        leader: storage_002_01
    roles:
      - app.roles.queue

    sharding:
      roles:
        - storage
iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage
log:
  level: info
  modules:
    roles.queue: verbose
replication:
  bootstrap_strategy: config
  connect_timeout: 3
  failover: manual
roles_cfg:
  app.roles.queue:
    take_timeout: 0
    ttl: 86400
    ttr: 3

app/roles/billing.lua

local log = require 'log'.new('roles.billing-queue')
local fiber = require 'fiber'

local queue = require 'queue'
queue.cfg({ in_replicaset = true })
rawset(_G, 'queue', queue)

local M = {
    role_name = ...,
    defaults = {
        netbox_timeout = 1.5,
        ttr = 2,
        ttl = 30 * 86400,
        take_timeout = 1,
    },
    tube_name = 'billing',
}

function M.validate(cfg)
    cfg = cfg or {}

    if cfg.ttr then
        assert(type(cfg.ttr) == 'number', 'ttr must be a number')
        assert(cfg.ttr > 0, 'ttr must be a positive number')
    end

    if cfg.ttl then
        assert(type(cfg.ttl) == 'number', 'ttl must be a number')
        assert(cfg.ttl > 0, 'ttl must be a positive number')
    end
end

function M.apply(cfg)
    log.info('[queue_debug] apply role')

    -- workaround for correct queue bootstrap
    -- fiber.create(function ()
    --     log.info('[queue_debug] wait master')
    --     box.ctl.wait_rw()
    --     -- trigger box.cfg wrapper
    --     log.info('[queue_debug] box.cfg {}')
    --     box.cfg {}
    --     -- create queue
    --     log.info('[queue_debug] ro=%s', box.info.ro)
    --     if not queue.create_tube(M.tube_name, 'fifottl', { if_not_exists = true }) then
    --         log.info('[queue_debug] failed tube creation')
    --     end
    -- end)

    -- without workaround
    if not queue.create_tube(M.tube_name, 'fifottl', { if_not_exists = true }) then
        log.info('[queue_debug] failed tube creation')
    end
end

function M.stop()
end

return M
vanyarock01 added the bug label Mar 18, 2024
0x501D added the 1sp label Mar 18, 2024
0x501D assigned 0x501D and unassigned 0x501D Mar 18, 2024
DerekBum added a commit that referenced this issue Apr 4, 2024
Sometimes an instance could enter queue initialization
while still in orphan mode. This resulted in a "lazy
start". But Tarantool does not call `box.cfg {}` after
leaving orphan mode, so the queue was stuck in the `INIT` state.

Now we wait for all orphan instances at the init stage of the queue.

Closes #226
DerekBum added a commit that referenced this issue Apr 9, 2024
Sometimes an instance could enter queue initialization
while still in orphan mode. This resulted in a "lazy
start". But Tarantool does not call `box.cfg {}` after
leaving orphan mode, so the queue was stuck in the `INIT` state.

Now we wait in the background for all orphan instances.
It is similar to lazy init for read-only instances.

Closes #226
DerekBum added a commit that referenced this issue Apr 9, 2024
Sometimes an instance could enter queue initialization
while still in orphan mode. This resulted in a "lazy
start". But Tarantool does not call `box.cfg {}` after
leaving orphan mode, so the queue was stuck in the `INIT` state.

Now we wait in the background for all orphan instances.
It is similar to lazy init for read-only instances.

Closes #226
DerekBum added a commit that referenced this issue Apr 9, 2024
Sometimes an instance could enter queue initialization
while still in orphan mode. This resulted in a "lazy
start". But Tarantool does not call `box.cfg {}` after
leaving orphan mode, so the queue was stuck in the `INIT` state.

Now we wait in the background for all orphan instances.
It is similar to lazy init for read-only instances.

Note that this fix works only for Tarantool versions >= 2.10.0,
because it relies on watchers.

Closes #226
DerekBum added a commit that referenced this issue Apr 9, 2024
Sometimes an instance could enter queue initialization
while still in orphan mode. This resulted in a "lazy
start". But Tarantool does not call `box.cfg {}` after
leaving orphan mode, so the queue was stuck in the `INIT` state.

Now we wait in the background for all orphan instances.
It is similar to lazy init for read-only instances.

Note that this fix works only for Tarantool versions >= 2.10.0,
because it relies on watchers.

Closes #226
DerekBum added a commit that referenced this issue Apr 9, 2024
Sometimes an instance could enter queue initialization
while still not running (for example, while left in orphan mode).
This resulted in a "lazy start". But Tarantool does not call
`box.cfg {}` after leaving orphan mode, so the queue could get stuck in the
`INIT` state.

Now we wait in the background for instances that are not running.
It is similar to lazy init for read-only instances.

Note that this fix works only for Tarantool versions >= 2.10.0,
because it relies on watchers.

Closes #226
DerekBum added a commit that referenced this issue Apr 15, 2024
Sometimes an instance could enter queue initialization
while still not running (for example, while left in orphan mode).
This resulted in a "lazy start". But Tarantool does not call
`box.cfg {}` after leaving orphan mode, so the queue could get stuck in the
`INIT` state.

Now, if the instance is read-only, a separate fiber watches for
updates of its mode.

Note that this fix works only for Tarantool versions >= 2.10.0,
because it relies on watchers.

Closes #226
oleg-jukovec pushed a commit that referenced this issue Apr 15, 2024
Sometimes an instance could enter queue initialization
while still not running (for example, while left in orphan mode).
This resulted in a "lazy start". But Tarantool does not call
`box.cfg {}` after leaving orphan mode, so the queue could get stuck in the
`INIT` state.

Now, if the instance is read-only, a separate fiber watches for
updates of its mode.

Note that this fix works only for Tarantool versions >= 2.10.0,
because it relies on watchers.

Closes #226
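
The watcher-based approach described in the commit message can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code merged in #227: `start_when_writable` and `on_writable` are hypothetical names, and `watch` is assumed to behave like Tarantool's `box.watch` (available since 2.10), which invokes the callback with `(key, value)` whenever the `box.status` event changes.

```lua
-- Sketch only: not the queue module's actual implementation.
-- `watch` is assumed to behave like box.watch (Tarantool >= 2.10).
-- `on_writable` is a hypothetical callback that finishes initialization.
local function start_when_writable(watch, on_writable)
    local started = false
    -- The box.status event value is assumed to carry the fields
    -- `is_ro` and `status`, mirroring box.info.ro and box.info.status.
    return watch('box.status', function(_, status)
        if started or status == nil then
            return
        end
        if status.is_ro == false and status.status == 'running' then
            -- Run the deferred initialization exactly once.
            started = true
            on_writable()
        end
    end)
end
```

Inside Tarantool this could be wired as `start_when_writable(box.watch, function() ... end)`. The returned handle can be unregistered by the caller once it is no longer needed. Passing `watch` as a parameter keeps the sketch runnable outside Tarantool with a stub.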