Skip to content

Conversation

@squito
Copy link
Contributor

@squito squito commented Aug 27, 2018

This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm. Also it gives consistent ipv6 and error handling. Two other incidental changes, that shouldn't matter:

  1. python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
  2. for rdd._load_from_socket, the timeout is only increased after authentication.

@squito
Copy link
Contributor Author

squito commented Aug 27, 2018

@SparkQA
Copy link

SparkQA commented Aug 27, 2018

Test build #95303 has finished for PR 22247 at commit c232ec6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

raise Exception("Unexpected reply from iterator server.")


def local_connect_and_auth(sock_info):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@squito, not a big deal but how about local_connect_and_auth (port, auth_secret) and ..

(sockfile, sock) = local_connect_and_auth(port, auth_secret)
(sock_file, _) = local_connect_and_auth(java_port, auth_secret)
port, auth_secret = sock_info
(sockfile, sock) = local_connect_and_auth(port, auth_secret)

or

(sockfile, sock) = local_connect_and_auth(*sock_info)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks -- I used the varargs version *sock_info for the last one, i forgot about that in python.

# Support for both IPv4 and IPv6.
# On most of IPv6-ready systems, IPv6 will take precedence.
for res in socket.getaddrinfo("127.0.0.1", port, socket.AF_UNSPEC, socket.SOCK_STREAM):
af, socktype, proto, canonname, sa = res
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: af, socktype, proto, canonname, sa = res -> af, socktype, proto, _, sa = res

auth_secret = os.environ["PYTHON_WORKER_FACTORY_SECRET"]
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", java_port))
sock_file = sock.makefile("rwb", 65536)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanzin, BTW, did you test this on Windows too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly tested and seems working fine. Please ignore this comment.

Copy link
Contributor

@jiangxb1987 jiangxb1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

# do_server_auth() function and data serialization methods.
sockfile = sock.makefile("rwb", 65536)

(sockfile, sock) = local_connect_and_auth(port, auth_secret)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must set sock timeout to None to allow barrier() call blocking forever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, thanks! updated

Copy link
Contributor

@vanzin vanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

sock.close()
sock = None
continue
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight shorter (and more "python-compliant"?):

  • move the socket initialization (and the return) inside the try
  • get rid of the continue
  • use an else instead of the condition below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@SparkQA
Copy link

SparkQA commented Aug 28, 2018

Test build #95354 has finished for PR 22247 at commit d07d21d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 28, 2018

Test build #95365 has finished for PR 22247 at commit b0c4483.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 29, 2018

Test build #95380 has finished for PR 22247 at commit 65ed777.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master.

@asfgit asfgit closed this in 38391c9 Aug 29, 2018
@mengxr
Copy link
Contributor

mengxr commented Aug 29, 2018

@squito Thanks for the refactor!

squito added a commit to squito/spark that referenced this pull request Sep 12, 2018
This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm.  Also it gives consistent ipv6 and error handling.  Two other incidental changes, that shouldn't matter:
1) python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
2) for `rdd._load_from_socket`, the timeout is only increased after authentication.

Closes apache#22247 from squito/py_connection_refactor.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 38391c9)
asfgit pushed a commit that referenced this pull request Sep 13, 2018
This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm.  Also it gives consistent ipv6 and error handling.  Two other incidental changes, that shouldn't matter:
1) python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
2) for `rdd._load_from_socket`, the timeout is only increased after authentication.

Closes #22247 from squito/py_connection_refactor.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 38391c9)
asfgit pushed a commit that referenced this pull request Sep 25, 2018
This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm.  Also it gives consistent ipv6 and error handling.  Two other incidental changes, that shouldn't matter:
1) python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
2) for `rdd._load_from_socket`, the timeout is only increased after authentication.

Closes #22247 from squito/py_connection_refactor.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 38391c9)
(cherry picked from commit a2a54a5)
Willymontaz pushed a commit to criteo-forks/spark that referenced this pull request Sep 26, 2019
This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm.  Also it gives consistent ipv6 and error handling.  Two other incidental changes, that shouldn't matter:
1) python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
2) for `rdd._load_from_socket`, the timeout is only increased after authentication.

Closes apache#22247 from squito/py_connection_refactor.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 38391c9)
(cherry picked from commit a2a54a5)
Willymontaz pushed a commit to criteo-forks/spark that referenced this pull request Sep 27, 2019
This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm.  Also it gives consistent ipv6 and error handling.  Two other incidental changes, that shouldn't matter:
1) python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator)
2) for `rdd._load_from_socket`, the timeout is only increased after authentication.

Closes apache#22247 from squito/py_connection_refactor.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit 38391c9)
(cherry picked from commit a2a54a5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants