feat: Direct connection support between provider clusters#3115
feat: Direct connection support between provider clusters#3115Gabbe64 wants to merge 12 commits intoliqotech:masterfrom
Conversation
|
Hi @Gabbe64. Thanks for your PR! I am @adamjensenbot.
Make sure this PR appears in the liqo changelog, adding one of the following labels:
|
07a8051 to
0c46130
Compare
|
/test |
1 similar comment
|
/test |
| ) | ||
|
|
||
| // MapEndpointsWithConfiguration maps the endpoints of the shadowendpointslice. | ||
| func MapEndpointsWithConfiguration(ctx context.Context, cl client.Client, |
There was a problem hiding this comment.
This function is has a parametr called clusterID whic is overrided by the GetConnectionDataByIP function. The options are 2:
- the parameter is useless
- the GetConnectionDataByIP is shadowing the value
There was a problem hiding this comment.
Yes, that was a mistake.
The name of the parameter was changed and it stores the clusterID of the local cluster.
| if addrHasBeenRemapped { | ||
| break |
There was a problem hiding this comment.
The break statement exits the inner address loop (for j := range endpoints[i].Addresses) after the first address is remapped. If an endpoint has multiple addresses, only the first one gets processed
| func (l *InfoList) GetConnectionDataByIP(ip string) (clusterID, originalIP string, found bool) { | ||
| for _, entry := range l.Items { | ||
| for i, remappedIP := range entry.RemappedIPs { | ||
| if ip == remappedIP { | ||
| return entry.ClusterID, entry.IPs[i], true | ||
| } | ||
| } | ||
| } | ||
| return "", "", false | ||
| } |
There was a problem hiding this comment.
Assumes len(entry.IPs) == len(entry.RemappedIPs) without validation. If these slices have different lengths (which is possible via the Add() method), accessing entry.IPs[i] will cause a panic.
| continue | ||
| } | ||
|
|
||
| objectName := endpoint.TargetRef.Name |
There was a problem hiding this comment.
endpoint.TargetRef is an optional pointer (*v1.ObjectReference). According to Kubernetes API specifications, it can be nil. This code will panic if an endpoint has no TargetRef
There was a problem hiding this comment.
Added a check at the start of the block.
| err := remoteConnectionsData.FromJSON([]byte(val)) | ||
|
|
||
| if err != nil { | ||
| klog.Errorf("failed to unmarshal direct connection data for shadowendpointslice %q: %v", nsName, err) |
There was a problem hiding this comment.
In the previous implementation the error was not returned because in case of problems during the unmarshalling, the controller could just skip the remapping of the direct address and use the standard external CIDR address (of the consumer) provided by the ShadowEndpointSlice.
It thought it to be some sort of fallback mechanism.
With the new implementation, since the address in the ShadowEPS is not the one from the external CIDR, it's mandatory to return an error, because there is no "usable" endpoint in case the unmarshalling goes wrong.
| type Info struct { | ||
| ClusterID string `json:"ID"` | ||
| IPs []string `json:"IPs"` | ||
| RemappedIPs []string `json:"rIPs"` | ||
| } |
There was a problem hiding this comment.
It's not clear to me why you need all these info in the annotation. I think just an annotation with the clusterID should be enough.
In theory you should just reflect the endpointslie as it is on the remote cluster, then the remote cluster will reconcile the shadow_endpointslice and will use the "configuration" resource related to the clusterID contained inside the annotation, to remap the IP
| klog.Errorf("failed to unmarshal direct connection data for shadowendpointslice %q: %v", nsName, err) | ||
| } | ||
| // JSON is not propagated to the EndpointSlice | ||
| delete(shadowEps.Annotations, directConnectionAnnotationLabel) |
There was a problem hiding this comment.
Can you remove it in a more explicit way? For example create a function that "forge" the endpointslice annotations starting from the shadowendpointslice
There was a problem hiding this comment.
|
Given your feedback I decided to refactor this feature. The core logic remains the same, but the exchanged data has changed slightly. Changes:
I also added a few utilities, including a helper for annotation forging and an optimized lookup for the data structure embedded within the annotations. |
Description
Fixes #3011
This PR introduces the possibility to make pods deployed on provider clusters communicate directly by using an established network connection (
liqoctl network connect).This feature is opt-in: users can enable it by adding the annotation key
use-direct-connections: trueto theServiceexposing the pods that should communicate directly. This way, the default behavior remains unchanged, while also giving users the possibility to choose whether to route traffic through the consumer cluster or directly.Implementation
As discussed during the last community meetings, the modified components are the virtual-kubelet and the ShadowEndpointslice controller.
Transmission - virtual-kubelet
When reflecting
Endpointslicesand the correspondingServicehas the right annotation, some data is collected from the informers (without direct the API server calls) then serialized as JSON and reflected in the annotation field of theShadowEndpointslice.This data is:
clusterIDwhere the endpoint is deployed,Endpointsliceof the consumer,Reception
The ShadowEndpointslice controller (in the provider cluster) receives this data and uses it to remap the addresses so to use the direct connections towards other provider clusters (if available).
To achieve this:
podCIDR(point 1) and the host part extracted from the local IP (point 2); this required the implementation of a "forced mapping".