Recipe: elkstack::logstash #178

Open
jnganga opened this issue Aug 9, 2016 · 14 comments

@jnganga

jnganga commented Aug 9, 2016

Hi,
I'm running into the error below when converging. Without fail, it then converges on the second attempt. Is it that elasticsearch is not running yet when logstash, which depends on it, is restarted, hence the failure? How do we resolve this?

       Recipe: elkstack::elasticsearch
         * service[elasticsearch] action start (up to date)
       Recipe: elkstack::logstash
         * logstash_service[server] action restart
           * runit_service[logstash_server] action restart

             ================================================================================
             Error executing action `restart` on resource 'runit_service[logstash_server]'
             ================================================================================

             Mixlib::ShellOut::ShellCommandFailed
             ------------------------------------
             Expected process to exit with [0], but received '1'
             ---- Begin output of /usr/bin/sv restart /etc/service/logstash_server ----
             STDOUT: timeout: run: /etc/service/logstash_server: (pid 20646) 797s, got TERM
             STDERR: 
             ---- End output of /usr/bin/sv restart /etc/service/logstash_server ----
             Ran /usr/bin/sv restart /etc/service/logstash_server returned 1

@TheSeubert

It is a bit hard to say from that output alone, without some manual troubleshooting. From the output, the elasticsearch service had already been started, and a restart was then issued to the logstash_server runit service. The service started, received a PID, but then terminated itself.

After this failed run, were you able to log in to the instance and check the status of the elasticsearch service? It would also help to look at the runit logs under /var/log/logstash and see what the process logged there. If you can hunt down this and any other information, that may help us.
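
For reading the `sv` output in the error above, here is a minimal sketch of a shell helper that splits such a line into its parts. The helper name `parse_sv_line` is made up for illustration, and the field meanings are my reading of runit's status format (a `timeout:` prefix when `sv` gave up waiting, then the service state, the PID, and flags such as `got TERM`), so treat them as assumptions rather than a definitive interpretation.

```shell
#!/bin/sh
# Hypothetical helper: split one line of `sv` output into its parts.
# Assumed input format: "[timeout: ]STATE: SERVICE: (pid N) Ns[, FLAG...]"
parse_sv_line() {
    line=$1

    # An optional "timeout: " prefix is stripped and remembered.
    timed_out=no
    case $line in
        "timeout: "*)
            timed_out=yes
            line=${line#timeout: }
            ;;
    esac

    # Text before the first colon is the service state, e.g. "run" or "down".
    state=${line%%:*}

    # Pull the number out of "(pid N)"; empty if no PID is present.
    pid=$(printf '%s\n' "$line" | sed -n 's/.*(pid \([0-9][0-9]*\)).*/\1/p')

    # Note whether the line carries the "got TERM" flag.
    case $line in
        *"got TERM"*) got_term=yes ;;
        *)            got_term=no ;;
    esac

    printf 'timed_out=%s state=%s pid=%s got_term=%s\n' \
        "$timed_out" "$state" "$pid" "$got_term"
}
```

Applied to the line from the failed run, this reports `timed_out=yes state=run pid=20646 got_term=yes`. If the assumed format is right, the process was still in the `run` state when `sv` stopped waiting, which would be worth confirming with `sv status /etc/service/logstash_server` on the instance.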

@jnganga
Author

jnganga commented Aug 9, 2016

Thanks for responding.

I see elasticsearch running after the failed run. But I can't tell if it was running right at the moment of failure.

$ curl 'http://localhost:9200/?pretty'
{
  "status" : 200,
  "name" : "default-ubuntu-1404",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}

Also, in my log directory, this is what I have:

vagrant@default-ubuntu-1404:/var/log/logstash_server$ ll
total 8
drwxr-xr-x  2 root root   4096 Aug  9 21:29 ./
drwxrwxr-x 10 root syslog 4096 Aug  9 21:38 ../
lrwxrwxrwx  1 root root     34 Aug  9 21:29 config -> /etc/sv/logstash_server/log/config
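
Since a single `curl` after the fact cannot show whether Elasticsearch was answering at the moment the restart failed, one option is to poll it from a second shell while the converge runs. The sketch below is only an illustration: `retry_until` is a hypothetical helper name, and the attempt count and delay are arbitrary values, not settings from the cookbook.

```shell
#!/bin/sh
# Hypothetical helper: retry_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted,
# sleeping DELAY seconds between tries.
retry_until() {
    attempts=$1
    delay=$2
    shift 2

    i=1
    while [ "$i" -le "$attempts" ]; do
        # Discard the command's own output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            echo "ok after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done

    echo "gave up after $attempts attempt(s)"
    return 1
}
```

For example, `retry_until 30 2 curl -sf 'http://localhost:9200/?pretty'` would try the same URL as above up to 30 times, two seconds apart, and report how many attempts it took before Elasticsearch answered.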

@jnganga jnganga closed this as completed Aug 9, 2016
@jnganga jnganga reopened this Aug 10, 2016
@martinb3
Contributor

martinb3 commented Aug 10, 2016

Can you show us what settings you're applying or give us a reproducible example? I don't think we have enough information; I suspect this is an issue with configuration.

@jnganga
Author

jnganga commented Aug 10, 2016

I'm actually cloning the entire repo into a new folder:
https://github.com/rackspace-cookbooks/elkstack.git
and then running 'kitchen converge' without any additional modifications.

Please see the logs below for the first and second runs. FYI, a colleague got the same issue on his machine.

first_run_log_elkstack.txt
second_run_log_elkstack.txt

@martinb3
Contributor

Hi @jnganga -- I just cloned elkstack and ran the same command, & it converged for me on the first and second attempts.

> Please see the logs below for the first and second runs. FYI, a colleague got the same issue on his machine.

These logs contain different run lists. cic_elkstack::packer is not part of elkstack, so I think there's something else going on here (these logs aren't from the same runlist, it seems).

Could you share your Berksfile.lock and Gemfile.lock so I can get on the same versions you're using, and re-test?

@jnganga
Author

jnganga commented Aug 10, 2016

Sorry, I attached the log files from my earlier run where I'm wrapping your cookbook. In both cases, with or without the wrapper, it errors out at the same place.
Please see the requested files below. I only had to generate these on the first run with the wrapper cookbook. I probably should have repeated this with the cloned elkstack. Will do that tonight.

Gemfile.lock.txt
Berksfile.lock.txt

@martinb3
Contributor

Yes, please let us know when you have something with elkstack itself so we can try to reproduce it. I'm interested in both the logs and the lock files for elkstack specifically, not your wrapper. Thanks.

@jnganga
Author

jnganga commented Aug 11, 2016

Sure. Please find the logs and .lock files below.
My steps:

$ git clone https://github.com/rackspace-cookbooks/elkstack.git
$ cd elkstack/
$ berks install
$ bundle install
$ kitchen list
$ kitchen create default-ubuntu-1404
$ kitchen converge default-ubuntu-1404   # see attached log "first_run_log_cloned_elkstack"
$ kitchen converge default-ubuntu-1404   # see attached log "second_run_log_cloned_elkstack"

Thank you.

second_run_log_cloned_elkstack.txt
first_run_log_cloned_elkstack.txt
Cloned_elkstack_Berksfile.lock.txt
Cloned_elkstack_Gemfile.lock.txt

@TheSeubert

TheSeubert commented Aug 12, 2016

For what it's worth, and I realize this may not add much, I went through the same steps and compared the Gemfile.lock and Berksfile.lock. I was able to converge with no errors, and the only difference I found was that I had a slightly newer ohai gem locally.

In your commands above, I see you did not use bundle exec, so kitchen was actually running from your system/user gemset. That said, you are running the latest kitchen, the same as I am, so I don't see an issue there.

We'll continue to dig into this, but so far I don't see a smoking gun.
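
To make the lock file comparison repeatable, a small helper can print the version a Gemfile.lock pins for a given gem. This is a rough sketch: `lock_gem_version` is a hypothetical name, and it assumes the usual Gemfile.lock layout in which resolved gems sit under `specs:` indented by exactly four spaces as `name (version)`, with their dependencies indented further.

```shell
#!/bin/sh
# Hypothetical helper: lock_gem_version LOCKFILE GEM
# Prints the version pinned for GEM, or nothing if GEM is not a resolved spec.
lock_gem_version() {
    # Match only lines indented by exactly four spaces ("    name (version)"),
    # so dependency lines (indented six spaces) are skipped.
    awk -v gem="$2" '
        $0 ~ /^    [^ ]/ && $1 == gem {
            gsub(/[()]/, "", $2)   # strip the parentheses around the version
            print $2
            exit
        }
    ' "$1"
}
```

Running `lock_gem_version Gemfile.lock ohai` in each checkout, or under `bundle exec` versus the system gemset, would show whether the two environments really resolve the same ohai version.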

@jnganga
Author

jnganga commented Aug 12, 2016

@dude051 What exact commands, and in what order, should I run with bundle exec?

@TheSeubert

TheSeubert commented Aug 12, 2016

The same order works, just prepend your kitchen commands with bundle exec so as to run it from the bundled gems. So to answer your question directly:

$ git clone https://github.com/rackspace-cookbooks/elkstack.git
$ cd elkstack/
$ bundle install
$ bundle exec berks install
$ bundle exec kitchen list
$ bundle exec kitchen create default-ubuntu-1404
$ bundle exec kitchen converge default-ubuntu-1404
$ bundle exec kitchen converge default-ubuntu-1404

@jnganga
Author

jnganga commented Aug 12, 2016

I get the same results, sir! Please see the attached logs.

second_run_log_bundle_elkstack.txt
first_run_log_bundle_elkstack.txt

@martinb3
Contributor

@jnganga Do you get anything in the logstash logs when you run this? I'm wondering if the logstash service just isn't starting for some reason.

@imewish

imewish commented Sep 23, 2016

I guess this is related to my issue. I ran into the same thing with the latest logstash versions:

lusis/chef-logstash#459
