[K8s] Fix rsync Not Found
#7844
/smoke-test -k test_docker_storage_mounts --kubernetes
86e64fa to e8ee5c7
/smoke-test -k test_docker_storage_mounts --kubernetes
/smoke-test -k test_docker_storage_mounts --kubernetes
cg505
left a comment
In the happy path (rsync already installed), this will introduce an additional roundtrip while uploading files - what is the overhead of that?
If significant, can we instead execute a bash script that does the waiting on the remote cluster and then execs `"$@"` once rsync is installed?
Also, this doesn't block the PR, but we should understand why we are not already waiting for rsync to be installed using the signal files in /tmp.
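The wait-then-exec idea in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the function name `wait_for_then_exec` and the `TIMEOUT_SECS` and `POLL_SECS` variables are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: wait until a tool is on PATH, then run the given command.
# wait_for_then_exec, TIMEOUT_SECS and POLL_SECS are illustrative names only.

wait_for_then_exec() {
  local tool="$1"
  shift
  local timeout="${TIMEOUT_SECS:-60}"  # whole seconds
  local poll="${POLL_SECS:-1}"         # whole seconds
  local waited=0

  # Poll until the tool resolves. In the happy path the loop body never runs,
  # so no extra roundtrip or sleep is added.
  until command -v "$tool" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $tool" >&2
      return 1
    fi
    sleep "$poll"
    waited=$((waited + poll))
  done

  # Replace the current process with the real command so that its exit code
  # and output pass through unchanged.
  exec "$@"
}
```

Run on the pod as, for example, `wait_for_then_exec rsync rsync <args>`, the wait happens remotely in the same exec call, so the client does not need a separate check before uploading.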
The overhead of the extra roundtrip was pretty significant, so I switched to your suggestion: wait on the remote cluster and check there whether rsync is available. With that approach I saw no difference in happy-path rsync time between our current code and this code. Both were within 10ms of each other.
So yes, we do currently touch a file after we've finished installing packages. The setup commands in the Kubernetes Ray YAML file wait until this file has been created, so setting up the Ray runtime waits for all of the packages to be installed. But the rsync code comes from
/smoke-test -k test_docker_storage_mounts --kubernetes
SeungjinYang
left a comment
Thanks @lloyd-brown for the catch!
We saw an error last week (#7761) where using rsync to set up the files required by a cluster in Kubernetes would fail with

OCI runtime exec failed: exec failed: unable to start container process: exec: "rsync": executable file not found in $PATH: unknown

showing that rsync was not available on the Kubernetes pod.

After some digging I found that, while we do use a file to denote the end of the package installation process and we do check that the file exists before running cluster setup, we don't check for this file in `internal_file_mounts`. So if package installation is sufficiently slow, we will run this function and error out with rsync not being found.

Now, when we exec into the pod, we check on the pod that rsync exists before we start running it. This should help cases where the package installation is slightly slow.
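For context, the marker-file gating mentioned above (touch a file once packages are installed, then wait for that file before proceeding) might be sketched as follows. The marker path and both function names are assumptions made for illustration. The actual file name and setup commands are not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of gating on a marker file; the path below is an assumption.
MARKER_FILE="${MARKER_FILE:-/tmp/example_packages_installed}"

# Installer side: create the marker only after every package step has finished.
mark_install_done() {
  touch "$MARKER_FILE"
}

# Consumer side: block until the marker exists or the timeout (seconds) expires.
wait_for_marker() {
  local timeout="${1:-60}"
  local waited=0
  while [ ! -f "$MARKER_FILE" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "marker $MARKER_FILE not found after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

Any step that skips `wait_for_marker` can race with package installation, which is the kind of gap described for `internal_file_mounts`.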
I tested this by
Tested (run the relevant ones):
- bash format.sh
- /smoke-test (CI) or pytest tests/test_smoke.py (local)
- /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
- /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)