Skip to content

Conversation

@gaborgsomogyi
Copy link
Contributor

@gaborgsomogyi gaborgsomogyi commented Mar 25, 2019

What changes were proposed in this pull request?

Krb5LoginModule supports debug parameter which is not yet supported from Spark side. This configuration makes it easier to debug authentication issues against Kafka.

In this PR Krb5LoginModule debug flag controlled by either sun.security.krb5.debug or com.ibm.security.krb5.Krb5Debug.

Additionally found some hardcoded values like ssl.truststore.location, etc... which could be error prone if Kafka changes it so in such cases Kafka define used.

How was this patch tested?

Existing + additional unit tests + on cluster.

@gaborgsomogyi
Copy link
Contributor Author

cc @HeartSaVioR

@SparkQA
Copy link

SparkQA commented Mar 25, 2019

Test build #103915 has finished for PR 24204 at commit bc210c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vanzin
Copy link
Contributor

vanzin commented Mar 25, 2019

Is this different from setting the sun.security.krb5.debug system property? That doesn't require any changes to Spark.

@HeartSaVioR
Copy link
Contributor

I'm seeing some value to enable the option to only Kafka side instead of global (since that option normally brings verbose messages), but no strong opinion.

@gaborgsomogyi
Copy link
Contributor Author

It provides different kind of debug log entries just like HADOOP_JAAS_DEBUG. Please see doc (I know its CM but not found better) and the code.

I'm seeing some value to enable the option to only Kafka side instead of global (since that option normally brings verbose messages), but no strong opinion.

My change turns on debug messages only on Kafka related authentications. Not really sure what flag do you mean.

@HeartSaVioR
Copy link
Contributor

HeartSaVioR commented Mar 26, 2019

My change turns on debug messages only on Kafka related authentications. Not really sure what flag do you mean.

That's what I meant in my comment, vs turning on/off global configuration.

@vanzin
Copy link
Contributor

vanzin commented Mar 29, 2019

It provides different kind of debug log entries just like HADOOP_JAAS_DEBUG

I didn't ask about that option. I asked about sun.security.krb5.debug which is the system property to enable the exact thing you're enabling here with another config.

The only advantage I see here is to only enable that option for the kafka login, but is that really an advantage that justifies adding a new config option that people won't necessarily know about, when everybody who has to debug kerberos one time or another has to deal with that system property?

Did you try setting that system property and seeing whether it shows the things you want to see when debugging?

@dongjoon-hyun
Copy link
Member

I also agree with @vanzin 's point of views. For the security stuff, we need to see all logs in many cases. I believe the current documentation is enough.

@gaborgsomogyi
Copy link
Contributor Author

gaborgsomogyi commented Apr 1, 2019

Maybe my explanation was not enough/clean. Let me give a little bit more details.
I've created a small standalone application where these things can be tested easily.
The application connect to a secure Kafka cluster and tries to do authentication with dynamic JAAS configuration (where debug=true can be set). I've re-tested everything to give exact logs.

Test with debug=false:
Command: java -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true ...
Output:

Java config name: null
Native config name: /etc/krb5.conf
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 70; type: 16
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 62; type: 23
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 8
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 3
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 1
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 1
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
>>> KdcAccessibility: reset
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
Using builtin default etypes for default_tkt_enctypes
default etypes for default_tkt_enctypes: 17 16 23.
>>> KrbAsReq creating message
getKDCFromDNS using UDP
getKDCFromDNS using TCP
>>> KrbKdcReq send: kdc=gsomogyi-cdh6x-secure-1.gce.company.com UDP:88, timeout=30000, number of retries =3, #bytes=148
>>> KDCCommunication: kdc=gsomogyi-cdh6x-secure-1.gce.company.com UDP:88, timeout=30000,Attempt =1, #bytes=148
>>> KrbKdcReq send: #bytes read=775
>>> KdcAccessibility: remove gsomogyi-cdh6x-secure-1.gce.company.com
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
>>> EType: sun.security.krb5.internal.crypto.Des3CbcHmacSha1KdEType
>>> KrbAsRep cons in KrbAsReq.getReply systest
>>> 19/04/01 10:08:56 INFO authenticator.AbstractLogin: Successfully logged in.
>>> 19/04/01 10:08:56 INFO kerberos.KerberosLogin: [[email protected]]: TGT refresh thread started.
>>> 19/04/01 10:08:56 INFO kerberos.KerberosLogin: [[email protected]]: TGT valid starting at: Mon Apr 01 10:08:55 CEST 2019
>>> 19/04/01 10:08:56 INFO kerberos.KerberosLogin: [[email protected]]: TGT expires: Wed May 01 10:08:55 CEST 2019
>>> 19/04/01 10:08:56 INFO kerberos.KerberosLogin: [[email protected]]: TGT refresh sleeping until: Fri Apr 26 05:03:45 CEST 2019
>>> 19/04/01 10:08:56 INFO utils.AppInfoParser: Kafka version : 2.0.0
>>> 19/04/01 10:08:56 INFO utils.AppInfoParser: Kafka commitId : 3402a8361b734732

Test with debug=true:
Command: java -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true ...
Output:

Java config name: null
Native config name: /etc/krb5.conf
Debug is  true storeKey true useTicketCache false useKeyTab true doNotPrompt false ticketCache is null isInitiator true KeyTab is /Users/gaborsomogyi/kafka-delegation-token/kafka-consumer/systest.keytab refreshKrb5Config is false principal is [email protected] tryFirstPass is false useFirstPass is false storePass is false clearPass is false
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 70; type: 16
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 62; type: 23
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 8
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 3
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 1
>>> KeyTabInputStream, readName(): GCE.COMPANY.COM
>>> KeyTabInputStream, readName(): systest
>>> KeyTab: load() entry length: 54; type: 1
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
>>> KdcAccessibility: reset
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
Using builtin default etypes for default_tkt_enctypes
default etypes for default_tkt_enctypes: 17 16 23.
>>> KrbAsReq creating message
getKDCFromDNS using UDP
getKDCFromDNS using TCP
>>> KrbKdcReq send: kdc=gsomogyi-cdh6x-secure-1.gce.company.com UDP:88, timeout=30000, number of retries =3, #bytes=148
>>> KDCCommunication: kdc=gsomogyi-cdh6x-secure-1.gce.company.com UDP:88, timeout=30000,Attempt =1, #bytes=148
>>> KrbKdcReq send: #bytes read=775
>>> KdcAccessibility: remove gsomogyi-cdh6x-secure-1.gce.company.com
Looking for keys for: [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (1) for [email protected]
Found unsupported keytype (3) for [email protected]
Found unsupported keytype (8) for [email protected]
Added key: 23version: 1
Added key: 16version: 1
>>> EType: sun.security.krb5.internal.crypto.Des3CbcHmacSha1KdEType
>>> KrbAsRep cons in KrbAsReq.getReply systest
principal is [email protected]
Will use keytab
Commit Succeeded

>>> 19/04/01 10:11:25 INFO authenticator.AbstractLogin: Successfully logged in.
>>> 19/04/01 10:11:25 INFO kerberos.KerberosLogin: [[email protected]]: TGT refresh thread started.
>>> 19/04/01 10:11:25 INFO kerberos.KerberosLogin: [[email protected]]: TGT valid starting at: Mon Apr 01 10:11:24 CEST 2019
>>> 19/04/01 10:11:25 INFO kerberos.KerberosLogin: [[email protected]]: TGT expires: Wed May 01 10:11:24 CEST 2019
>>> 19/04/01 10:11:25 INFO kerberos.KerberosLogin: [[email protected]]: TGT refresh sleeping until: Fri Apr 26 02:37:27 CEST 2019
>>> 19/04/01 10:11:25 INFO utils.AppInfoParser: Kafka version : 2.0.0
>>> 19/04/01 10:11:25 INFO utils.AppInfoParser: Kafka commitId : 3402a8361b734732

As a final conclusion even if global krb debug flag is enabled Krb5LoginModule debug messages not shown with dynamic JAAS configuration. Please see from the first execution the following messages are missing:

Debug is  true storeKey true useTicketCache false useKeyTab true doNotPrompt false ticketCache is null isInitiator true KeyTab is /Users/gaborsomogyi/kafka-delegation-token/kafka-consumer/systest.keytab refreshKrb5Config is false principal is [email protected] tryFirstPass is false useFirstPass is false storePass is false clearPass is false
...
principal is [email protected]
Will use keytab
Commit Succeeded
...

These messages is just a sample and Krb5LoginModule provides much more debug information which may be helpful for debugging.

@gaborgsomogyi
Copy link
Contributor Author

To make life easier and testable by others as well the app can be found here.

@vanzin
Copy link
Contributor

vanzin commented Apr 1, 2019

Please see from the first execution the following messages are missing:

I guess I'm not seeing what's so useful about those messages. All the other stuff that is printed when you set the system property seems way more useful.

@gaborgsomogyi
Copy link
Contributor Author

That said the module contains 22 debug messages not only the 2 shown up.
The advantages what I see:

  • It provides messages from higher level of abstraction
  • Prints out exceptions for example here

@gaborgsomogyi
Copy link
Contributor Author

If there are still concerns maybe the debug flag can be turned on by sun.security.krb5.debug which will end up in not introducing new Spark config.

@vanzin
Copy link
Contributor

vanzin commented Apr 2, 2019

If you really think those extra debug messages are super important, I'd reuse the known system property for that purpose.

@vanzin
Copy link
Contributor

vanzin commented Apr 2, 2019

Prints out exceptions for example here

The exception is also thrown, so it would show up anyway.

@gaborgsomogyi
Copy link
Contributor Author

After dealing several weeks with multi-cluster delegation token and facing constant kerberos problems I have 2 conclusions:

  • I see your point and from user perspective having a new flag is overkill and not convenient at all
  • Kerberos core provided messages are sometimes cryptic and Krb5LoginModule messages shed some lights when got stuck (though this gets better with experience).

Considering these if no objection I would turn debug on with the mentioned global flag.

@SparkQA
Copy link

SparkQA commented Apr 5, 2019

Test build #104322 has finished for PR 24204 at commit b3c467b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Apr 11, 2019

retest this please

@SparkQA
Copy link

SparkQA commented Apr 11, 2019

Test build #104523 has finished for PR 24204 at commit b3c467b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Apr 11, 2019

Merging to master.

@vanzin vanzin closed this in 94adffa Apr 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants