WIP alpine: support alpine 3.5 #989

Closed
wants to merge 2 commits into from

rvagg
Member

@rvagg rvagg commented Nov 8, 2017

This should work in theory, but there's a problem with the containers on Joyent once you upgrade the packages. The upgrade pulls in a new sshd that doesn't like the UsePrivilegeSeparation option (deprecated), but sandbox setup appears to be why any additional ssh connections fail, i.e. once the ansible script updates the packages on the host, you can no longer establish new ssh connections. I've tried a bunch of things with sshd config to make it work, with no success. Googling doesn't help much either, although the one interesting result is https://smartos.org/bugview/OS-4407, which looks connected, although old. The Resource temporarily unavailable [preauth] message is the same and the DEBUG3 output is very similar, but the UsePrivilegeSeparation deprecation seems to be the problem. My guess is that since OpenSSH 7.5 privilege separation is mandatory, so you can't turn it off, but the underlying platform (SmartOS?) doesn't support what it's trying to do.

@misterdjules any thoughts on who we could ping for support on this?

@chorrell

chorrell commented Nov 8, 2017

Would pinning the OpenSSH package to a version prior to 7.5 be viable?

Something like:

apk add 'openssh<7.5'

That should, in theory, install the latest available OpenSSH package that's less than version 7.5. Any apk upgrade after that should stick to the same version constraint. I'm more familiar with apt-pinning, so I don't know how well this works in practice:

https://wiki.alpinelinux.org/wiki/Alpine_Linux_package_management#Holding_a_specific_package_back
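To check whether such a pin actually held after a later upgrade, the reported package version could be compared against 7.5. This is a minimal sketch, not part of the PR: the helper name `version_below` is hypothetical, and it assumes version strings that start with `major.minor` and may carry a suffix (for example `7.4_p1-r0`). How the installed version string is obtained is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeeds when the version in $1
# is below the limit in $2. Only the major and minor numbers are compared;
# any suffix after the minor number (e.g. "_p1-r0") is ignored.
version_below() {
    v_major=${1%%.*}              # text before the first dot
    v_rest=${1#*.}                # text after the first dot
    v_minor=${v_rest%%[!0-9]*}    # leading digits of the remainder
    l_major=${2%%.*}
    l_rest=${2#*.}
    l_minor=${l_rest%%[!0-9]*}
    [ "$v_major" -lt "$l_major" ] ||
        { [ "$v_major" -eq "$l_major" ] && [ "$v_minor" -lt "$l_minor" ]; }
}

# Example with an assumed version string.
if version_below "7.4_p1-r0" "7.5"; then
    echo "pin held"
else
    echo "pin violated"
fi
```

The comparison is deliberately coarse: it only answers "is this older than 7.5", which is all the constraint above needs.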

@misterdjules
Contributor

@rvagg Unfortunately I don't have the bandwidth to assist you with this right now :(

@cjihrig
Contributor

cjihrig commented Nov 8, 2017

I've opened an internal support ticket. I'll update here if I receive any news.

@rvagg
Member Author

rvagg commented Nov 8, 2017

wew. Thanks for chiming in @chorrell, I saw you in that original ticket but noticed you're not at Joyent anymore so didn't want to bother you by pinging you here. Great to see you're still with us!
@misterdjules no problems, thanks for letting us know, I hope you're doing well though.
@cjihrig I forgot you were at Joyent now. I'll keep that in mind for future support needs.

@cjihrig
Contributor

cjihrig commented Nov 8, 2017

We also have @geek on the build team 😄

@refack
Contributor

refack commented Nov 8, 2017

Are there any security issues with sshd<7.5? Otherwise pinning seems like the simplest solution.

refack

This comment was marked as off-topic.

@geek
Member

geek commented Nov 8, 2017

@rvagg I'll plan to take a look tomorrow morning.

@rvagg
Member Author

rvagg commented Nov 9, 2017

apk add 'openssh<7.5' worked, hosts up and running and I'm trying them out now. Added another commit here.

I had to make a change in Jenkins because it was forcing use of $(getconf _NPROCESSORS_ONLN) for JOBS even if you have JOBS set, which means that in these containers it was defaulting to the host CPU count, which is 48 on Joyent. So I've now got it using JOBS if that exists and falling back to $(getconf _NPROCESSORS_ONLN) if it doesn't. In Ansible I've allowed server_jobs to be included in inventory.yml so you don't have to know to put it in the host_vars file for all alpine jobs.
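The fallback described here can be sketched in plain shell. This is an illustration of the idea only, not the actual Jenkins job configuration; the function name `resolve_jobs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not the real Jenkins job script): prefer a JOBS value
# supplied by the environment and only fall back to the online CPU count
# when JOBS is unset or empty.
resolve_jobs() {
    echo "${JOBS:-$(getconf _NPROCESSORS_ONLN)}"
}

JOBS=$(resolve_jobs)
export JOBS
echo "building with JOBS=$JOBS"
```

With this shape, a container that exports a small JOBS value keeps it, instead of inheriting the CPU count that getconf reports for the host.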

Unfortunately one of our Alpine 3.4 hosts is uncontactable via ssh. My guess is that it's the same problem as this OpenSSH upgrade. It's still connected to Jenkins and able to run jobs but we can't perform maintenance. This will likely mean that as soon as it becomes a problem we'll just have to yank Alpine 3.4 from the list completely.

@rvagg
Member Author

rvagg commented Nov 9, 2017

So, it seems that we have a very firm limit on CPU usage in these containers on Joyent. They get part way through build & test and then just freeze up. I can log in to the machine and do stuff on it myself, but the processes doing the work are sitting idle.

@jbergstroem we never fully enabled the alpine34 Joyent containers in CI replacing the single DO Docker alpine34 host we have going. What was the blocker to getting that done, was it this same issue I'm seeing here? Is this even going to be possible or do we need to go full Docker ourselves for these? I just deleted my WIP standalone Docker/alpine Ansible script PR from this repo, maybe I need to open that back up and go that route again?

joaocgreis

This comment was marked as off-topic.

@rvagg
Member Author

rvagg commented Nov 9, 2017

@geek perhaps you could have a look at these, it's just weird, I've left them hanging and they are in the middle of a run, just at the start of test: https://ci.nodejs.org/job/node-test-commit-linux/13958/nodes=alpine34-x64/console & https://ci.nodejs.org/job/node-test-commit-linux/13958/nodes=alpine35-x64/console, it's as if they are just paused but they are still responsive and you can log in and mess around. You should have access to the build/test key to get into them I think. If it helps, the former is infrastructure container UUID 8b68e6f8-587f-4936-8b61-c45fa22c8cf1 and the latter is UUID 1b909000-11d6-69b2-f145-b97b6fdc4092.

@rvagg
Member Author

rvagg commented Nov 9, 2017

Just noticed this in the ansible/README.md:

- [ ] remove native alpine34 vm's on joyent since the joyent host
      is not mature enough to provide linux emulation. use docker instead.

I think that answers my question so perhaps this PR can sit idle or be closed.

Joyent folks: if our work here is helpful at all to you we can proceed to tinker, otherwise we'll just move on to running our own Docker hosts (PR incoming for that).

@rvagg
Member Author

rvagg commented Nov 9, 2017

Docker approach active according to this: #992

@mhart

mhart commented Nov 9, 2017

How does 3.6 support look? That's been out for a while now

@mhart

mhart commented Nov 9, 2017

Ah, nevermind, I see that's referenced in #992

@geek
Member

geek commented Nov 9, 2017

@rvagg it looks like you were able to make progress in the #992 PR. Do you need anything from me for 989?

@rvagg
Member Author

rvagg commented Nov 9, 2017

yep, I'll consider this closed and power down the joyent containers, perhaps we can revisit this in the future

@rvagg rvagg closed this Nov 9, 2017
@maclover7 maclover7 deleted the rvagg/alpine3.5 branch November 10, 2017 00:47
8 participants