Conversation
/hold
/payload 4.14 ci blocking
/payload 4.14 nightly blocking
@tssurya: trigger 4 job(s) of type blocking for the ci release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f5f796f0-ffb9-11ed-919e-9c52ea160e54-0
@tssurya: trigger 7 job(s) of type blocking for the nightly release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1f58b2e0-ffba-11ed-8e7b-b523aac04171-0
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: tssurya. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
Force-pushed: 422ba8d to 01be7fc (Compare)
/test e2e-gcp-ovn
Route manager unit test is failing. Ginkgo Skip is called because we aren't running as root, but it's panicking instead of being caught by Ginkgo and simply skipped. Will fix. Override is fine until then.
This commit adds a controller - the zone cluster controller - for managing the zone-related node annotations. An upcoming commit will add support for zones. Right now the zone cluster controller allocates a unique ID for each node and stores it in a node annotation. Future commits will make use of the node ID to
- allocate node gateway router port IPs
- support the interconnect feature
Co-authored-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
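A quick way to check the allocated ID on a node once this controller runs (illustrative only: the annotation key is not named above, so node-id is assumed, and ovn-worker is a placeholder node name):
  kubectl describe node ovn-worker | grep -i node-id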
This patch adds the concept of a "zone", wherein nodes in a k8s
deployment can be grouped into one or more zones.
Some of the properties of a zone are:
- A zone can have one or more nodes
- A node can be part of only one zone.
- Each zone becomes an independent OVN deployment, i.e. OVN DB servers
(standalone or raft) and ovn-northd for each zone, with the
ovn-controllers running on each node of a zone connecting to the
zone's OVN DB servers.
For each zone, a deployment should set the zone name in the OVN Northbound
database's "name" column of the NB_Global table singleton row
and in the "options:name" column of the NB_Global table.
E.g. to set the zone name to 'foo':
ovn-nbctl set NB_Global . name=foo
ovn-nbctl set NB_Global . options:name=foo
ovn-northd will copy the options:name from NB_Global to the Southbound
database's SB_Global options.
During startup, ovnkube-node will read the zone name from the
Southbound database with the command
"ovn-sbctl get SB_Global . options:name" and store it in the
node annotation - k8s.ovn.org/zone-name.
If the zone name is not set in the Southbound database, the default
value "global" is used.
ovnkube network controller manager in the upcoming commits
will read this annotation value to figure out if a node
belongs to its local zone or to a remote zone.
This commit only supports a single global zone. Upcoming commits
will add support for multiple zones.
Signed-off-by: Numan Siddique <numans@ovn.org>
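The per-node result can be checked across the cluster with an illustrative command like the one below (assumes the k8s.ovn.org/zone-name annotation described above has been written):
  kubectl get nodes -o custom-columns='NAME:.metadata.name,ZONE:.metadata.annotations.k8s\.ovn\.org/zone-name'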
Presently the logical switch manager (of the default network controller) allocates the addresses for each node gateway router port connecting to the join logical switch from config.Gateway.V4JoinSubnet/V6JoinSubnet. In order to support multiple zones, this needs to be centralized so that the addresses are unique across the cluster. Cluster manager now allocates these addresses. It derives the addresses from the V4JoinSubnet/V6JoinSubnet config and the node ID, and stores them in the existing node annotation - "k8s.ovn.org/node-gateway-router-lrp-ifaddr". Network controller manager will now read the addresses for the node gateway router port from this node annotation instead. Also, 'JoinSwitchIPManager' is no longer needed and is removed from the code base. Note: This commit still doesn't support multiple zones. Upcoming commits will add the support. When multiple zones are supported, each zone's ovn_cluster_router port connecting to the join logical switch will have the same IP (e.g. 100.64.0.1/16). This is fine though.
Signed-off-by: Numan Siddique <numans@ovn.org>
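The allocated addresses can be inspected directly on the node object (illustrative; ovn-worker is a placeholder node name):
  kubectl get node ovn-worker -o jsonpath='{.metadata.annotations.k8s\.ovn\.org/node-gateway-router-lrp-ifaddr}'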
This patch adds the concept of zones where a zone can have one or more
nodes grouped in it. To start with, this patch only supports one global
zone (all the nodes belong to the global zone); upcoming patches will
add support for multiple zones.
Each zone requires
- OVN Northbound and Southbound database cluster
(either a single node or a raft cluster)
- ovnkube network-controller-manager which connects to its local zone OVN db
cluster and creates the OVN logical resources in the Northbound
database. network-controller-manager can be run on a single node or
multiple nodes with active/passive HA (etcd leader election).
- ovn-northd which connects to the local zone OVN dbs.
- ovnkube-node and ovn-controller on each node of the zone.
network-controller-manager creates the node logical switch and other Northbound
resources for a node only if the node belongs to its zone. Nodes in other zones
are ignored. It also creates pod resources only if the pod is scheduled
on a node in its local zone.
network-controller-manager gets its zone name from the NB_Global.name of the
OVN Northbound database. If the name is empty then the default value "global" is used.
Signed-off-by: Numan Siddique <numans@ovn.org>
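For reference, the read side of the earlier ovn-nbctl set example - i.e. the value the controller effectively looks up - can be checked with (illustrative):
  ovn-nbctl get NB_Global . name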
In order to interconnect multiple zones (i.e. for traffic to flow from one zone to another), network controller manager needs to create a transit switch in the OVN Northbound database and create a logical port for each node. In this commit, cluster manager generates transit switch port IPs for each node if the interconnect feature is enabled (a feature flag "--enable-interconnect" is added for this) and stores them in the node annotation - "k8s.ovn.org/ovn-node-transit-switch-port-ifaddr". It derives the node transit switch port IPs for each node from the config - ClusterManager.V4TransitSwitchSubnet/V6TransitSwitchSubnet (newly added by this patch) - and the node ID. In the next commit, network controller manager will create the transit switch and make use of these generated transit switch port IPs. Note: This commit still doesn't support multiple zones. Upcoming commits will add the support.
Signed-off-by: Numan Siddique <numans@ovn.org>
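A quick way to dump the generated annotation for a node (illustrative; assumes jq is available and ovn-worker is a placeholder node name):
  kubectl get node ovn-worker -o json | jq -r '.metadata.annotations["k8s.ovn.org/ovn-node-transit-switch-port-ifaddr"]'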
This patch creates interconnect resources in the Northbound
database to inter-connect the configured zones if interconnect
feature is enabled.
ovnkube network controller manager is expected to run in each
configured zone. Network controller manager does a job similar
to that of the `ovn-ic` service provided by OVN. An interconnect
Transit switch is created and connected to the OVN cluster router.
For each local node of a zone
- a logical switch port - logical router port pair is created.
The logical switch port is created on the Transit switch
and the router port is created on the OVN cluster router.
For each remote node (not part of the local zone)
- a remote chassis entry is created in the Southbound database
- a remote port is created in the Transit switch and this remote
port is bound to the remote chassis.
- A static route is added in the OVN cluster router to send the
traffic to the Transit switch if the packet is destined to
the remote node.
This facilitates the packet traversal from a local zone node to a
remote zone node. When the packet is received by the remote node,
it resumes the egress stage of the Transit switch. From the
Transit switch the packet enters the OVN cluster router, then
the node logical switch, and finally reaches the destination pod.
Interconnection is presently restricted to the default network.
With this patch, multiple zones are supported for the default network,
with a few other features still missing. Upcoming commits will add that support.
Co-authored-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
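As a rough illustration of the objects involved, the equivalent manual OVN CLI steps for one remote node could look like the lines below (the controller does this through its OVSDB client rather than the CLI; the switch/port names, chassis name, MAC, IPs and subnets are placeholders, not values taken from the code):
  # remote chassis entry in the Southbound database
  ovn-sbctl chassis-add remote-node-1 geneve 172.18.0.3
  # remote port on the transit switch, bound to that chassis
  ovn-nbctl lsp-add transit_switch ts-remote-node-1
  ovn-nbctl lsp-set-type ts-remote-node-1 remote
  ovn-nbctl lsp-set-addresses ts-remote-node-1 "0a:58:64:58:00:03 100.88.0.3"
  ovn-nbctl lsp-set-options ts-remote-node-1 requested-chassis=remote-node-1
  # static route on the cluster router sending the remote node's pod subnet via the transit switch
  ovn-nbctl lr-route-add ovn_cluster_router 10.244.1.0/24 100.88.0.3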
Also track remote zone pod IPs in their namespaces. We need them in order to be able to select remote zone pods by namespace in a network policy. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Co-authored-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
Service controller will consider only the zone's nodes when building the cluster LBs and per-node LBs. Note that when a service is created, each zone's network controller manager will create an OVN load balancer for it. Signed-off-by: Numan Siddique <numans@ovn.org>
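Each zone's Northbound database then carries its own load balancer for the service; a quick way to eyeball this in a given zone (illustrative; run against that zone's NB database):
  ovn-nbctl lb-list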
For ovnkube-network-controller-manager, the passed zone name is validated against the zone name stored in the Northbound database. ovnkube-network-controller-manager will exit if there is a mismatch between the two. This ensures that ovnkube-network-controller-manager is connected to the proper Northbound database. For ovnkube-node, the passed zone name is validated against the zone name stored in the Southbound database. ovnkube-node will exit if there is a mismatch between the two. This ensures that ovnkube-node is connected to the proper Southbound database. This patch also changes the LE lock name. If ovnkube is started in "master" mode (both cluster manager and network controller manager) or in "network controller manager" mode, then the LE lock name will be "ovn-kubernetes-master-<zone_name>". Signed-off-by: Numan Siddique <numans@ovn.org>
A transit switch is created for each secondary layer3 network, just like it is created for the default network. Cluster manager generates a network ID for each network and stores it in the node annotation - "k8s.ovn.org/network-ids". This is used by network controller manager to generate a unique tunnel key for each secondary layer3 transit switch it creates. With this patch, E-W traffic within each secondary layer3 network works when interconnect is enabled. With this patch multiple zones are supported both for the default network and secondary layer3 networks (except for secondary layer2 and localnet networks). Although a user can deploy multiple zones, it is still not recommended as this feature is not yet tested in the CI. Upcoming patches will add the support in kind to deploy multiple zones and test them in the CI. Signed-off-by: Numan Siddique <numans@ovn.org>
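The per-network IDs land on each node object; one way to read them (illustrative; ovn-worker is a placeholder node name):
  kubectl get node ovn-worker -o go-template='{{index .metadata.annotations "k8s.ovn.org/network-ids"}}'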
…ature is enabled. Interconnect feature still doesn't support interconnecting these networks with multiple zones. It's not a limitation of the interconnect feature, and upcoming patches will add this support. Signed-off-by: Numan Siddique <numans@ovn.org>
Run the latest mockery to regenerate the mocked version of SriovnetOps using the following command after changing directory to ./go-controller/pkg/util/ (for future reference): ~/go/bin/mockery --name=SriovnetOps Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Network policies targeting IPAM-less networks can only have `ipBlock` peers; while this behavior already existed, it now fails a lot earlier, when translating the multi-net policies into regular `NetworkPolicy` objects, thus making the flow more explicit, efficient, and readable. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
…icies Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
When on a multi-homing lane, create a secondary network **without** external access (the ovnkube-node gateway init code would pick an interface with a default route as the interface to create `br-ex` on top of). Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
This commit adds code to configure the cluster underlay to provide east/west connectivity between pods using a localnet secondary network, which allows us to test the underlay topology - up to now we weren't actually testing anything meaningful, since all pods using a secondary network were scheduled on the same node and the underlay was not being used. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
The shift instruction when setting the multi-network flag is not required. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
On a normal, green unit test run the ovn package is already close to the 10m timeout limit for running the unit tests, with the timeout actually being hit sometimes: ok github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn 593.856s Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
When we create a hybrid overlay node we pass the LocalPodInformer, which only knows about pods local to the node it is running on. The hybrid overlay was assuming it was listening to all pods and manually filtering out those that are not on the local node. This commit fixes that assumption and renames the variables to reflect that they are localPodInformers and localPodListers. Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
This code in AddPod() is not required and duplicates effort for the hybrid overlay. Code was added to AddNode() that, once the drIP and drMAC are set, will loop through all pods and initialize them. This check is not required. Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
This is the metrics refactor for interconnect following the work done in ovn-kubernetes/ovn-kubernetes#3386. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Ensuring the work done here (martinkennelly/ovn-kubernetes-1@c47ed89) sees the light of day. Co-Authored-by: Martin Kennelly <mkennell@redhat.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Recently we added the network-controller-manager flag/mode for supporting deployments where master will run separately from cluster-manager. From ovn-kubernetes/ovn-kubernetes#3366 we have renamed the container to be more generic: `ovnkube-controller`. Since we are still early and only merged this flag a few weeks ago, let's make sure we stay consistent moving forward and rename this flag to ovnkube-controller before bringing it downstream into CNO. NOTE: All existing internal code can call this NetworkControllerManager. On a user-facing level, I'd like to keep this simple and say anything programming ovnkube and the OVN DB is the ovnkube-controller, similar to the ovn-controller container. If we don't do this change now we are going to end up with confusion and a mismatch between the ncm flag and the ovnkube-controller container. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
We have 3 modes that we can run in:
a) cluster-manager
b) ovnkube-controller (internally ncm)
c) node
When IC=true and multiple zones are used, cluster-manager and ovnkube-controller are run separately. CloudPrivateIPConfigType is used only from cluster-manager in that case and there is no need to initialize it from ovnkube-controller. However, when these modes are run together we use the master watch factory, and this creates confusion. Here is what this PR proposes:
a) CMWatch Factory when CM is run independently
b) NCMWatch Factory when NCM is run independently
c) NodeWatch Factory when node is run independently
d) MasterWatch Factory when CM+NCM are run together
e) MasterWatch Factory when NCM+Node are run together
f) MasterWatch Factory when NCM+Node+CM are run together
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Force-pushed: a399777 to 593577e (Compare)
Route manager doesn't make any use of the waitgroup it was created with. Additionally it was only calling `wg.Done()` without calling `wg.Add(1)` first. Signed-off-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit a1b8aef)
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale
@tssurya: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. /lifecycle rotten
/hold