Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pun_pre_hook #535

Merged
merged 16 commits into from
Nov 30, 2020
Merged

add pun_pre_hook #535

merged 16 commits into from
Nov 30, 2020

Conversation

johrstrom
Copy link
Contributor

Partial fix for #496.

Adds a pun pre_hook. --user $USER is given as a command line argument to the hook and OIDC_ACCESS_TOKEN is set as an environment variable if it exists in the parent process (the lua scripts in ood_mod_proxy).

I can provide a script that exchanges OIDC tokens and sets the kuberenetes config in this PR too. We'll eventually need to put it somewhere to share, I'm just not sure where. The k8s exchange script relies on environment specific things like which IDP to look at for example (OSC's dev/test/prod) and it's unclear where these should be source from (/etc/ood/profile?)

@ericfranz
Copy link
Contributor

What does the Apache config change look like in this case? It would seem that we would want to include the appropriate ood-portal-generator modifications to support that change as well.

Is there is a way to pass all claim headers to stdin instead of cherry-picking OIDC access token?

Is there a way to do this without being OIDC specific?

@johrstrom
Copy link
Contributor Author

What does the Apache config change look like in this case?

Same. We hold our oidc configs separately from what we distribute anyway. Mod auth openidc defaults to passing these items as both header and environment variables, so we'd need to document that you need at least the environment variable for this to work.

Is there is a way to pass all claim headers to stdin instead of cherry-picking OIDC access token?

Maybe? We could loop through r:subprocess_env_table() and export anything matching oidc_*. We just don't need any more yet.

Is there a way to do this without being OIDC specific?

Sure, but our use case (along with SDSC's) is to use the access token. I guess we could blindly export the entire subprocess_env_table?

@treydock
Copy link
Contributor

What does the Apache config change look like in this case?

Same. We hold our oidc configs separately from what we distribute anyway. Mod auth openidc defaults to passing these items as both header and environment variables, so we'd need to document that you need at least the environment variable for this to work.

Once we support Dex the OIDC settings will be exposed via ood-portal-generator. That was part of #474 . Part of that PR was to make it possible to define OIDC settings as part of ood-portal.conf because they are needed by Dex.

@ericfranz
Copy link
Contributor

blindly export the entire subprocess_env_table

Yeah, especially if you don't modify the nginx_stage process's environment (see inline comment above). I think that is the best solution. Other Apache modules set env vars that would end up in this table and this should provide us access to all the OIDC claim headers. So it would be a win-win - our hook scripts get access all the claim headers and it is no longer OIDC specific.

One question though: who is this hook script running as? root or the user? We may want to have two hooks: a root hook script and a per user hook script, unless both ours and SDSC can be handled by a per user hook script.

@johrstrom
Copy link
Contributor Author

One question though: who is this hook script running as? root or the user?

This runs as root. In my k8s script I sudo -u "$UNPRIV_USER" kubectl config. Indeed we don't even need root (SDSC may not either). There could be some way to use passenger hooks, but we'd need to pass the right environment to it.

@johrstrom
Copy link
Contributor Author

It's not immediately clear how to copy the environment. I've got to find docs on how to iterate through an Apr.Table (the type returned by r.subprocess_env). Apparently there's a read-only version available in 2.5.x but, that's years away. So there's some work to be done trying to iterate through this table.

@johrstrom
Copy link
Contributor Author

After some digging I'm fairly certain we can't export the r.subprocess_env table until this commit on the httpd source becomes available to us. Even in that commit message is said 'only get and set APIs are available' which I believe are defined through this c function that exposes them, thought it's not 100% clear to me how C code translates to LUA APIs.

So unless these things are reachable from some other mechanism, I'm not sure if there is a way to do this generically yet.

@ericfranz
Copy link
Contributor

Darn. Maybe then we make configurable somehow (comma delimited list) the list of vars in r.subprocess_env to export? Then we can specify all the oidc claim headers we want to pass

@ericfranz ericfranz added this to the OOD1.8 milestone Jul 2, 2020
@johrstrom
Copy link
Contributor Author

Updated with a configuration. Unfortunately it's in the ood portal generator so there's one config in nginx_stage for the command to run and one config in the ood_portal.yml for what variables to export to the it. I'm not really sure there's a way around it because that's where the environment comes from.

If this strategy seems OK I can work on some test cases for both the ood portal and the nginx stage.

@ericfranz
Copy link
Contributor

We configure exports in Apache, and lua sets stdin with this when calling nginx_stage
We then configure nginx_stage configuration to tell nginx_stage to execute a prehook, converting this stdin to env for the prehook

Instead of utilizing nginx_stage configuration to tell what prehook to run, what if we just added support for more arguments to nginx stage - for example --pre-hook=/path/to/pre/hook.sh? Then we could handle configuration for the exports and the prehook in ood-portal-generator config.

@johrstrom
Copy link
Contributor Author

That seems fine to me. Do we want to change the name to pun_root_pre_hook. I can't find a spot for this to run as a user.

(also I see the apache tests failing now, so I'll get on that).

@ericfranz
Copy link
Contributor

Hold off on the per-user one till this is merged - we can revisit. Maybe the root/user is appended to the end of the name? Instead of --pre-hook it is --pre-hook-root?

Give a command and some standard in, get a string for merged stdout and stderr
--]]
function capture2e_with_stdin(cmd, stdin)
local output_file = os.tmpname()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are back to using temporary filenames, and every single PUN launch will now use this, should we do io.tmpfile() instead? Also, this file we should ensure the permissions to be 600 before writing sensitive data to this.

Copy link
Contributor Author

@johrstrom johrstrom Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was testing this, I saw it was creating 600 files, though that could be platform/configuration dependent. You think a os.execute(chmod ... ) is enough?

Looks like I was using os.tmpname because it's actually a string, so we don't have to cast it.

> t = os.tmpname()
> os.execute('chmod 600 ' .. t .. ' >/dev/null 2>&1')
true	exit	0
> f = io.tmpfile()
> os.execute('chmod 600 ' .. f .. ' >/dev/null 2>&1')
stdin:1: attempt to concatenate a FILE* value (global 'f')
stack traceback:
	stdin:1: in main chunk
	[C]: in ?

Then I'm not sure how to cast it to a string.

> tostring(f)
file (0x56056fb7ea90)
> tostring(t)
/tmp/lua_8GVcnl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can rely on io.tmpfile() to create 600 files. It is either that or introduce a tiny race condition with the use of os.tmpname.

if app_init_url then
cmd = cmd .. " -a '" .. r:escape(app_init_url) .. "'"
end

local err = capture2e(cmd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we take advantage of this STDIN any other way besides a prehook? Are we comfortable with changing every Execution of nginx_stage by Apache by providing STDIN like we are doing below.

If it could only ever be used for the pre_hook maybe we should consider maintaining the default local err = capture2e(cmd) and only passing stdin when a pre_hook_root_cmd is provided reduce the scope of this change. I have mixed feelings on this, though.


os.remove(output_file)

return output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will need to document that the pre hook when executed by Apache should not output any data to stdout or stderr since we report both as an error.

@ericfranz ericfranz modified the milestones: OOD1.8, OOD1.8-patch Aug 10, 2020
@ericfranz ericfranz modified the milestones: OOD1.8-patch, OOD2.0 Sep 22, 2020
@ericfranz
Copy link
Contributor

In #496 we discussed being able to add luaposix. Would that be useful here at all?

@johrstrom
Copy link
Contributor Author

In #496 we discussed being able to add luaposix. Would that be useful here at all?

It would help us pass the environment variables directly into the pre hook instead of parsing them from std in. So yes it would be helpful, but I'm not sure if the package management is worth it. We'd have to build and distribute a version for centos/RHEL 7 to say nothing about other platforms.

@treydock
Copy link
Contributor

PowerTools for CentOS 8 is very common to have to enable and so is EPEL for EL7. I see no problem with depending on those repos. For CentOS they are very simple to get enabled, for RHEL it's a bit more involved but still common to have enabled.

@johrstrom johrstrom removed the ready label Oct 21, 2020
@johrstrom johrstrom force-pushed the nginx_stage_prehook branch from 5af0688 to a8ef0d3 Compare October 26, 2020 20:47
@johrstrom
Copy link
Contributor Author

Rebased off of current main and added syslog to nginx_stage. We should be good with this now, although it's been so long it's worth looking over it again.

@treydock
Copy link
Contributor

@johrstrom Were you going to try and use lua posix to handle environment variables?

@johrstrom
Copy link
Contributor Author

I could try, and I'm sure it would work, but I'm hesitant to add the dependency.

Of course if folks feel strongly about it, I can change this PR, but my vote would be to defer it at least until we know we want it also for some other use case (like the regex exploration in #710). The user/admin interface would look the same even if the internals change.

pun_pre_hook runs the configured executable before the PUN starts.
This addition also enables logic to pass the OIDC_ACCESS_TOKEN if
it exists into the nginx_stage script's standard in (in the form
key=value).
move pun_pre_hook configuration from the nginx_stage to the
ood-portal-generator to minimize edit locations for this feature.
now we pass in the script to the 'nginx_stage pun' command with
the -P option.
most installations will continue to use the same code path as before.
@johrstrom johrstrom force-pushed the nginx_stage_prehook branch from b74eaac to 328568e Compare November 2, 2020 15:55
@ericfranz
Copy link
Contributor

I'd like to see how the code would change/possibly simplify if we used the lua posix module. If it is less code for us to maintain then I'd say go for it.

@johrstrom johrstrom merged commit 28da29d into master Nov 30, 2020
@johrstrom johrstrom deleted the nginx_stage_prehook branch October 12, 2021 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Find and document a solution for before PUN start and after PUN stop hook scripts
3 participants