Support localhost connection to k8s api server #1077
Signed-off-by: Yanjun Zhou <[email protected]>
Force-pushed from a8cdaff to f6c0470
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1077      +/-   ##
==========================================
+ Coverage   75.77%   75.79%   +0.01%
==========================================
  Files         145      146       +1
  Lines       19708    19740      +32
==========================================
+ Hits        14934    14962      +28
- Misses       3863     3866       +3
- Partials      911      912       +1
 func main() {
 	log.Info("Starting NSX Operator")
-	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
+	mgr, err := ctrl.NewManager(pkgutil.GetConfig(), ctrl.Options{
So the API server address switch can only occur at startup, right? If eth1 goes down while NSX Operator is running, what happens?
Yes, it only occurs at startup.
The current case is when WCP is enabled between backup and restore: cpvm eth1 will be down after the NSX restore, and we rely on NSX Operator to recover it. In this case, NSX Operator always restarts, because the NSX connection goes down during the restore. In other cases where eth1 goes down, should we always expect the NSX or WCP side to bring it back, and is it acceptable that NSX Operator does not work during that time?
If there is a use case where NSX Operator should switch from the cluster IP to localhost at runtime, maybe we can leverage the liveness probe to force NSX Operator to restart. Actually, we need to refactor the liveness probe in a follow-up PR, because it currently checks eth1, i.e. it sends a GET request to a URL like http://172.26.0.3:8384/healthz
I've checked this in HA mode and found that the operator restarts automatically after eth1 goes down, because the lease renewal fails.
Updated: In non-HA mode, however, the operator does not restart, and API server calls fail with errors like {"error": "Put \"https://172.24.0.1:443/apis/crd.nsx.vmware.com/v1alpha1/namespaces/ns-1/subnetsets/pod-default/status\": http2: client connection lost"}
Can one of the admins verify this patch?
This PR adds support for connecting to the k8s API server through localhost when cpvm eth1 is down.
Testing done:
Overrode the Kubernetes service host with an inaccessible address and observed that NSX Operator
runs as expected by connecting to the k8s API server through localhost.