Read start_permanent from mix.exs #118

Closed
AndrewDryga opened this issue Oct 19, 2016 · 7 comments
Labels
investigating:can't reproduce Need help building a reproduction case for this issue so that it can be fixed

Comments

@AndrewDryga
Contributor

Right now Distillery (and exrm) ignores the start_permanent option in mix.exs. Most developers I know find this behavior surprising, and a few projects in production have had issues because of it.

Of course, we could add a note in bold that you should set the application type in rel/config.exs, but another option is more developer-friendly: read the application start type from mix.exs and use it as the default value.

@bitwalker
Owner

The top-level application in a release will always be started as permanent, so this setting doesn't actually apply to releases. Was there a specific issue you encountered?

@AndrewDryga
Contributor Author

In some of our workers, when the connection pool is overloaded, a GenServer (added to the application supervisor as a worker) fails with a pool connection timeout. These failures lead to a "reached max restart intensity" error. At that point I would expect the application supervisor to fail and take down the Erlang VM, but instead the VM keeps running with a dead GenServer and the application does not work.

@bitwalker
Owner

@AndrewDryga Do you have an example app I can use to reproduce?

@bitwalker bitwalker added the investigating:can't reproduce Need help building a reproduction case for this issue so that it can be fixed label Oct 31, 2016
@AndrewDryga
Contributor Author

AndrewDryga commented Nov 9, 2016

Unfortunately I don't have a way to reproduce it, but I have logs from production:

12:05:58.289 [info]  Raise processing flag: 5. Batch: "74ff28ea9920dc6ddccbbb25ed823b69d99e8bca".
12:06:05.672 [error] Postgrex.Protocol (#PID<0.1397.0>) failed to connect: ** (Postgrex.Error) tcp connect: non-existing domain - :nxdomain
12:06:05.672 [error] Postgrex.Protocol (#PID<0.1398.0>) failed to connect: ** (Postgrex.Error) tcp connect: non-existing domain - :nxdomain
12:06:05.672 [error] Postgrex.Protocol (#PID<0.1399.0>) failed to connect: ** (Postgrex.Error) tcp connect: non-existing domain - :nxdomain
12:06:05.672 [error] Postgrex.Protocol (#PID<0.1400.0>) failed to connect: ** (Postgrex.Error) tcp connect: non-existing domain - :nxdomain
12:06:05.673 [error] Postgrex.Protocol (#PID<0.1401.0>) failed to connect: ** (Postgrex.Error) tcp connect: non-existing domain - :nxdomain
12:06:05.673 [warn]  Loan from batch 09c0d1688a1ed0bd9b46831ec210da4024190ab7 not found. Removing task.
12:05:38.874 [error] GenServer #PID<0.1325.0> terminating
** (stop) no process
    (stdlib) proc_lib.erl:794: :proc_lib.stop/3
    (ap_master) lib/workers/master.ex:222: AssetProcessor.Workers.MasterProcess.get_staging_loan/2
    (ap_master) lib/workers/master.ex:32: AssetProcessor.Workers.MasterProcess.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
=CRASH REPORT==== 2-Nov-2016::12:05:38 ===
  crasher:
    initial call: Elixir.AssetProcessor.Workers.MasterProcess:init/1
    pid: <0.1325.0>
    registered_name: []
    exception exit: noproc
      in function  gen_server:terminate/7 (gen_server.erl, line 826)
    ancestors: ['Elixir.AssetProcessor.Workers.Supervisor',
                  'Elixir.AssetProcessor.MasterProcess.Supervisor',<0.1231.0>]
    messages: []
    links: [<0.1291.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 990
  neighbours:
12:05:38.877 [error] GenServer #PID<0.1321.0> terminating
** (stop) exited in: :sys.terminate(#PID<0.1311.0>, :normal, :infinity)
    ** (EXIT) normal
    (stdlib) proc_lib.erl:796: :proc_lib.stop/3
    (ap_master) lib/workers/master.ex:222: AssetProcessor.Workers.MasterProcess.get_staging_loan/2
    (ap_master) lib/workers/master.ex:32: AssetProcessor.Workers.MasterProcess.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
=CRASH REPORT==== 2-Nov-2016::12:05:38 ===
  crasher:
    initial call: Elixir.AssetProcessor.Workers.MasterProcess:init/1
    pid: <0.1321.0>
    registered_name: []
    exception exit: {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
      in function  gen_server:terminate/7 (gen_server.erl, line 826)
    ancestors: ['Elixir.AssetProcessor.Workers.Supervisor',
                  'Elixir.AssetProcessor.MasterProcess.Supervisor',<0.1231.0>]
    messages: []
    links: [<0.1291.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 954
  neighbours:
12:05:39.301 [error] GenServer #PID<0.1322.0> terminating
** (stop) exited in: :sys.terminate(#PID<0.1311.0>, :normal, :infinity)
    ** (EXIT) normal
    (stdlib) proc_lib.erl:796: :proc_lib.stop/3
    (ap_master) lib/workers/master.ex:222: AssetProcessor.Workers.MasterProcess.get_staging_loan/2
    (ap_master) lib/workers/master.ex:32: AssetProcessor.Workers.MasterProcess.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
=CRASH REPORT==== 2-Nov-2016::12:05:39 ===
  crasher:
    initial call: Elixir.AssetProcessor.Workers.MasterProcess:init/1
    pid: <0.1322.0>
    registered_name: []
    exception exit: {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
      in function  gen_server:terminate/7 (gen_server.erl, line 826)
    ancestors: ['Elixir.AssetProcessor.Workers.Supervisor',
                  'Elixir.AssetProcessor.MasterProcess.Supervisor',<0.1231.0>]
    messages: []
    links: [<0.1291.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 962
  neighbours:
12:05:39.303 [error] GenServer #PID<0.1323.0> terminating
** (stop) exited in: :sys.terminate(#PID<0.1311.0>, :normal, :infinity)
    ** (EXIT) normal
    (stdlib) proc_lib.erl:796: :proc_lib.stop/3
    (ap_master) lib/workers/master.ex:222: AssetProcessor.Workers.MasterProcess.get_staging_loan/2
    (ap_master) lib/workers/master.ex:32: AssetProcessor.Workers.MasterProcess.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
=CRASH REPORT==== 2-Nov-2016::12:05:39 ===
  crasher:
    initial call: Elixir.AssetProcessor.Workers.MasterProcess:init/1
    pid: <0.1323.0>
    registered_name: []
    exception exit: {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
      in function  gen_server:terminate/7 (gen_server.erl, line 826)
    ancestors: ['Elixir.AssetProcessor.Workers.Supervisor',
                  'Elixir.AssetProcessor.MasterProcess.Supervisor',<0.1231.0>]
    messages: []
    links: [<0.1291.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 970
  neighbours:
12:05:39.304 [error] GenServer #PID<0.1324.0> terminating
** (stop) exited in: :sys.terminate(#PID<0.1311.0>, :normal, :infinity)
    ** (EXIT) normal
    (stdlib) proc_lib.erl:796: :proc_lib.stop/3
    (ap_master) lib/workers/master.ex:222: AssetProcessor.Workers.MasterProcess.get_staging_loan/2
    (ap_master) lib/workers/master.ex:32: AssetProcessor.Workers.MasterProcess.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
=CRASH REPORT==== 2-Nov-2016::12:05:39 ===
  crasher:
    initial call: Elixir.AssetProcessor.Workers.MasterProcess:init/1
    pid: <0.1324.0>
    registered_name: []
    exception exit: {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
      in function  gen_server:terminate/7 (gen_server.erl, line 826)
    ancestors: ['Elixir.AssetProcessor.Workers.Supervisor',
                  'Elixir.AssetProcessor.MasterProcess.Supervisor',<0.1231.0>]
    messages: []
    links: [<0.1291.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 978
  neighbours:
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.Workers.Supervisor'}
     Context:    child_terminated
     Reason:     noproc
     Offender:   [{pid,<0.1325.0>},
                  {id,'Elixir.AssetProcessor.Workers.MasterProcess'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.MasterProcess',
                          start_link,
                          [#{<<"batch_id">> => <<"8a08296000703cd36aabd3dd6a5ed48daa0aa289">>,
                             <<"id">> => 7},
                           7]}},
                  {restart_type,transient},
                  {shutdown,infinity},
                  {child_type,supervisor}]
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.Workers.Supervisor'}
     Context:    child_terminated
     Reason:     {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
     Offender:   [{pid,<0.1321.0>},
                  {id,'Elixir.AssetProcessor.Workers.MasterProcess'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.MasterProcess',
                          start_link,
                          [#{<<"batch_id">> => <<"8a08296000703cd36aabd3dd6a5ed48daa0aa289">>,
                             <<"id">> => 3},
                           3]}},
                  {restart_type,transient},
                  {shutdown,infinity},
                  {child_type,supervisor}]
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.Workers.Supervisor'}
     Context:    child_terminated
     Reason:     {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
     Offender:   [{pid,<0.1322.0>},
                  {id,'Elixir.AssetProcessor.Workers.MasterProcess'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.MasterProcess',
                          start_link,
                          [#{<<"batch_id">> => <<"8a08296000703cd36aabd3dd6a5ed48daa0aa289">>,
                             <<"id">> => 4},
                           4]}},
                  {restart_type,transient},
                  {shutdown,infinity},
                  {child_type,supervisor}]
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.Workers.Supervisor'}
     Context:    child_terminated
     Reason:     {normal,{sys,terminate,[<0.1311.0>,normal,infinity]}}
     Offender:   [{pid,<0.1323.0>},
                  {id,'Elixir.AssetProcessor.Workers.MasterProcess'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.MasterProcess',
                          start_link,
                          [#{<<"batch_id">> => <<"8a08296000703cd36aabd3dd6a5ed48daa0aa289">>,
                             <<"id">> => 5},
                           5]}},
                  {restart_type,transient},
                  {shutdown,infinity},
                  {child_type,supervisor}]
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.Workers.Supervisor'}
     Context:    shutdown
     Reason:     reached_max_restart_intensity
     Offender:   [{pid,<0.1323.0>},
                  {id,'Elixir.AssetProcessor.Workers.MasterProcess'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.MasterProcess',
                          start_link,
                          [#{<<"batch_id">> => <<"8a08296000703cd36aabd3dd6a5ed48daa0aa289">>,
                             <<"id">> => 5},
                           5]}},
                  {restart_type,transient},
                  {shutdown,infinity},
                  {child_type,supervisor}]
=SUPERVISOR REPORT==== 2-Nov-2016::12:05:39 ===
     Supervisor: {local,'Elixir.AssetProcessor.MasterProcess.Supervisor'}
     Context:    child_terminated
     Reason:     shutdown
     Offender:   [{pid,<0.1291.0>},
                  {id,'Elixir.AssetProcessor.Workers.Supervisor'},
                  {mfargs,
                      {'Elixir.AssetProcessor.Workers.Supervisor',start_link,
                          []}},
                  {restart_type,permanent},
                  {shutdown,infinity},
                  {child_type,supervisor}]

mix.exs:

defmodule AssetProcessor.MasterProcess.Mixfile do
  use Mix.Project

  @version "0.1.9"

  def project do
    [app: :ap_master,
     description: "OrangeSky Asset Processor - Master Process Worker.",
     package: package(),
     version: @version,
     elixir: "~> 1.3",
     elixirc_paths: elixirc_paths(Mix.env),
     compilers: [] ++ Mix.compilers,
     build_embedded: Mix.env == :prod,
     start_permanent: Mix.env == :prod,
     aliases: aliases(),
     deps: deps(),
     test_coverage: [tool: ExCoveralls],
     preferred_cli_env: [coveralls: :test],
     docs: [source_ref: "v#{@version}", main: "readme", extras: ["README.md"]]]
  end
...

rel/config.exs:

use Mix.Releases.Config,
  default_release: :default,
  default_environment: :default

environment :default do
  set pre_start_hook: "bin/hooks/pre-start.sh"
  set dev_mode: false
  set include_erts: false
  set include_src: false
end

release :ap_master do
  set version: current_version(:ap_master)
  set applications: [
    ap_master: :permanent
  ]
end

The container is built with MIX_ENV=prod.

@AndrewDryga
Contributor Author

AndrewDryga commented Nov 9, 2016

At the moment it's a Docker container on Kubernetes with a stopped application supervisor but a live Erlang VM. So it's neither working nor restarting.

@AndrewDryga
Contributor Author

Another one:

=SUPERVISOR REPORT==== 7-Dec-2016::15:42:24 ===
     Supervisor: {local,'Elixir.Trader.Workers.Supervisor'}
     Context:    child_terminated
     Reason:     {#{'__exception__' => true,
                    '__struct__' => 'Elixir.Postgrex.Error',
                    connection_id => 16553,
                    message => nil,
                    postgres => #{code => undefined_column,
                      file => <<"parse_relation.c">>,
                      line => <<"3090">>,
                      message => <<"column b1.loans_invest_whole does not exist">>,
                      pg_code => <<"42703">>,
                      position => <<"350">>,
                      routine => <<"errorMissingColumn">>,
                      severity => <<"ERROR">>,
                      unknown => <<"ERROR">>}},
                  [{'Elixir.Ecto.Adapters.SQL',execute_and_cache,7,
                       [{file,"lib/ecto/adapters/sql.ex"},{line,415}]},
                   {'Elixir.Ecto.Repo.Queryable',execute,5,
                       [{file,"lib/ecto/repo/queryable.ex"},{line,121}]},
                   {'Elixir.Ecto.Repo.Queryable',all,4,
                       [{file,"lib/ecto/repo/queryable.ex"},{line,35}]},
                   {'Elixir.Ecto.Repo.Queryable',one,4,
                       [{file,"lib/ecto/repo/queryable.ex"},{line,59}]},
                   {'Elixir.Trader.Workers.Analyzer',do_task,2,
                       [{file,"lib/workers/analyzer.ex"},{line,95}]},
                   {'Elixir.Trader.Workers.Analyzer',handle_info,2,
                       [{file,"lib/workers/analyzer.ex"},{line,50}]},
                   {gen_server,try_dispatch,4,
                       [{file,"gen_server.erl"},{line,615}]},
                   {gen_server,handle_msg,5,
                       [{file,"gen_server.erl"},{line,681}]}]}
     Offender:   [{pid,<0.1425.0>},
                  {id,'Elixir.Trader.Workers.Analyzer'},
                  {mfargs,
                      {'Elixir.Trader.Workers.Analyzer',start_link,
                          [#{<<"buckets">> => [#{<<"actual_volume">> => 0,<<"bucket_id">> => 1}],
                             <<"portfolio_subscription_id">> => 1},
                           1]}},
                  {restart_type,transient},
                  {shutdown,5000},
                  {child_type,worker}]
=SUPERVISOR REPORT==== 7-Dec-2016::15:42:24 ===
     Supervisor: {local,'Elixir.Trader.Workers.Supervisor'}
     Context:    shutdown
     Reason:     reached_max_restart_intensity
     Offender:   [{pid,<0.1425.0>},
                  {id,'Elixir.Trader.Workers.Analyzer'},
                  {mfargs,
                      {'Elixir.Trader.Workers.Analyzer',start_link,
                          [#{<<"buckets">> => [#{<<"actual_volume">> => 0,<<"bucket_id">> => 1}],
                             <<"portfolio_subscription_id">> => 1},
                           1]}},
                  {restart_type,transient},
                  {shutdown,5000},
                  {child_type,worker}]
=SUPERVISOR REPORT==== 7-Dec-2016::15:42:24 ===
     Supervisor: {local,'Elixir.Trader.GapAnalyzer.Supervisor'}
     Context:    child_terminated
     Reason:     shutdown
     Offender:   [{pid,<0.1245.0>},
                  {id,'Elixir.Trader.Workers.Supervisor'},
                  {mfargs,{'Elixir.Trader.Workers.Supervisor',start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,infinity},
                  {child_type,supervisor}]

@AndrewDryga
Contributor Author

It seems that this is not Distillery's fault, but rather the (for me) unexpected behaviour of a transient supervisor.
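
For readers who land here with the same symptom, here is a minimal sketch of one way a tree can end up with a live VM but no workers. It assumes a supervisor that is itself supervised with restart type :transient; the logs above do not confirm that this is what happened in these apps. The Demo.* modules are hypothetical, and the code uses the current child-spec API rather than the Supervisor.Spec helpers from the Elixir 1.3 era. A supervisor that reaches its max restart intensity exits with reason :shutdown, and a :transient child that exits with :shutdown counts as a normal termination, so its parent does not restart it.

```elixir
# Hypothetical worker that crashes right after it starts.
defmodule Demo.Crasher do
  use GenServer, restart: :transient

  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok) do
    send(self(), :boom)
    {:ok, nil}
  end

  # Abnormal exit reason, so the supervisor tries to restart this worker.
  @impl true
  def handle_info(:boom, state), do: {:stop, :boom, state}
end

# Hypothetical inner supervisor with a low restart intensity.
defmodule Demo.Inner do
  use Supervisor

  def start_link(_arg), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    Supervisor.init([Demo.Crasher], strategy: :one_for_one, max_restarts: 1, max_seconds: 5)
  end
end

# Assumption under test: the inner supervisor is a :transient child.
{:ok, outer} =
  Supervisor.start_link(
    [Supervisor.child_spec(Demo.Inner, restart: :transient)],
    strategy: :one_for_one
  )

# Give the worker time to exhaust the restart intensity.
Process.sleep(500)

IO.inspect(Process.alive?(outer), label: "outer alive")
IO.inspect(Supervisor.count_children(outer), label: "outer children")
```

In this sketch the outer supervisor should stay alive with no active children, which matches the "neither working nor restarting" state described earlier. Changing the child's restart type to :permanent lets the failure propagate upwards instead.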
