17 changes: 12 additions & 5 deletions _topic_maps/_topic_map.yml
@@ -2947,14 +2947,21 @@ Topics:
File: what-huge-pages-do-and-how-they-are-consumed-by-apps
Distros: openshift-origin,openshift-enterprise
- Name: Low latency tuning
File: cnf-low-latency-tuning
Dir: low_latency_tuning
Distros: openshift-origin,openshift-enterprise
- Name: Performing latency tests for platform verification
File: cnf-performing-platform-verification-latency-tests
Topics:
- Name: Understanding low latency
File: cnf-understanding-low-latency
- Name: Tuning nodes for low latency with the performance profile
File: cnf-tuning-low-latency-nodes-with-perf-profile
- Name: Provisioning real-time and low latency workloads
File: cnf-provisioning-low-latency-workloads
- Name: Debugging low latency tuning
File: cnf-debugging-low-latency-tuning-status
- Name: Performing latency tests for platform verification
File: cnf-performing-platform-verification-latency-tests
- Name: Improving cluster stability in high latency environments using worker latency profiles
File: scaling-worker-latency-profiles
- Name: Creating a performance profile
File: cnf-create-performance-profiles
Distros: openshift-origin,openshift-enterprise
- Name: Workload partitioning
File: enabling-workload-partitioning
6 changes: 2 additions & 4 deletions edge_computing/ztp-advanced-policy-config.adoc
@@ -33,9 +33,7 @@ include::modules/ztp-using-pgt-to-configure-power-states.adoc[leveloffset=+1]
[role="_additional-resources"]
.Additional resources

* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#cnf-understanding-workload-hints_cnf-master[Understanding workload hints]

* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#configuring-workload-hints_cnf-master[Configuring workload hints manually]
* xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#configuring-workload-hints_cnf-low-latency-perf-profile[Configuring node power consumption and realtime processing with workload hints]

include::modules/ztp-using-pgt-to-configure-performance-mode.adoc[leveloffset=+2]

@@ -46,7 +44,7 @@ include::modules/ztp-using-pgt-to-configure-power-saving-mode.adoc[leveloffset=+
[role="_additional-resources"]
.Additional resources

* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#node-tuning-operator-pod-power-saving-config_cnf-master[Enabling critical workloads for power saving configurations]
* xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#cnf-configuring-power-saving-for-nodes_cnf-low-latency-perf-profile[Configuring power saving for nodes that run colocated high and low priority workloads]

* xref:../edge_computing/ztp-reference-cluster-configuration-for-vdu.adoc#ztp-du-configuring-host-firmware-requirements_sno-configure-for-vdu[Configuring host firmware for low latency and high performance]

2 changes: 1 addition & 1 deletion installing/installing-preparing.adoc
@@ -114,7 +114,7 @@ For a production cluster, you must configure the following integrations:
[id="installing-preparing-cluster-for-workloads"]
== Preparing your cluster for workloads

Depending on your workload needs, you might need to take extra steps before you begin deploying applications. For example, after you prepare infrastructure to support your application xref:../cicd/builds/build-strategies.adoc#build-strategies[build strategy], you might need to make provisions for xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#cnf-low-latency-tuning[low-latency] workloads or to xref:../nodes/pods/nodes-pods-secrets.adoc#nodes-pods-secrets[protect sensitive workloads]. You can also configure xref:../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[monitoring] for application workloads.
Depending on your workload needs, you might need to take extra steps before you begin deploying applications. For example, after you prepare infrastructure to support your application xref:../cicd/builds/build-strategies.adoc#build-strategies[build strategy], you might need to make provisions for xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#cnf-low-latency-perf-profile[low-latency] workloads or to xref:../nodes/pods/nodes-pods-secrets.adoc#nodes-pods-secrets[protect sensitive workloads]. You can also configure xref:../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[monitoring] for application workloads.
If you plan to run xref:../windows_containers/enabling-windows-container-workloads.adoc#enabling-windows-container-workloads[Windows workloads], you must enable xref:../networking/ovn_kubernetes_network_provider/configuring-hybrid-networking.adoc#configuring-hybrid-networking[hybrid networking with OVN-Kubernetes] during the installation process; hybrid networking cannot be enabled after your cluster is installed.

[id="supported-installation-methods-for-different-platforms"]
@@ -42,4 +42,5 @@ After you perform preinstallation tasks, install your cluster by following the m
* Consult the following references after you deploy your cluster to improve its performance:
** xref:../../networking/hardware_networks/using-dpdk-and-rdma.adoc#nw-openstack-ovs-dpdk-testpmd-pod_using-dpdk-and-rdma[A test pod template for clusters that use OVS-DPDK on OpenStack].
** xref:../../networking/hardware_networks/add-pod.adoc#nw-openstack-sr-iov-testpmd-pod_add-pod[A test pod template for clusters that use SR-IOV on OpenStack].
** xref:../../scalability_and_performance/cnf-create-performance-profiles.adoc#installation-openstack-ovs-dpdk-performance-profile_cnf-create-performance-profiles[A performance profile template for clusters that use OVS-DPDK on OpenStack].
** xref:../../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#installation-openstack-ovs-dpdk-performance-profile_cnf-low-latency-perf-profile[A performance profile template for clusters that use OVS-DPDK on OpenStack]
.
@@ -0,0 +1,16 @@
// Module included in the following assemblies:
//
// * scalability_and_performance/low_latency_tuning/cnf-understanding-low-latency.adoc

:_mod-docs-content-type: CONCEPT
[id="cnf-about-hyper-threading-for-low-latency-and-real-time-applications_{context}"]
= About Hyper-Threading for low latency and real-time applications

Hyper-Threading is an Intel processor technology that allows a physical processor core to function as two logical cores, executing two independent threads simultaneously. Hyper-Threading allows for better system throughput for certain workload types where parallel processing is beneficial. The default {product-title} configuration expects Hyper-Threading to be enabled.

For telecommunications applications, it is important to design your application infrastructure to minimize latency as much as possible. Hyper-Threading can increase processing times and negatively affect throughput for compute-intensive workloads that require low latency. Disabling Hyper-Threading ensures predictable performance and can decrease processing times for these workloads.

[NOTE]
====
Hyper-Threading implementation and configuration differ depending on the hardware that you run {product-title} on. Consult the relevant host hardware tuning information for details about the Hyper-Threading implementation specific to that hardware. Disabling Hyper-Threading can increase the cost per core of the cluster.
====
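To check whether Hyper-Threading is active on a node, you can inspect the simultaneous multithreading (SMT) state of the host. The following is a minimal sketch that assumes the standard Linux SMT control interface is available on the node; the node name is a placeholder:

[source,terminal]
----
$ oc debug node/<node_name>
sh-4.4# chroot /host
sh-4.4# cat /sys/devices/system/cpu/smt/control   # "on" indicates that Hyper-Threading (SMT) is active
on
sh-4.4# lscpu | grep 'Thread(s) per core'         # more than 1 thread per core also indicates SMT
Thread(s) per core:  2
----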
7 changes: 4 additions & 3 deletions modules/cnf-about-irq-affinity-setting.adoc
@@ -1,10 +1,11 @@
// Module included in the following assemblies:
//
// scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

:_mod-docs-content-type: CONCEPT
[id="about_irq_affinity_setting_{context}"]
= About support of IRQ affinity setting
= Finding the effective IRQ affinity setting for a node

Some IRQ controllers lack support for IRQ affinity setting and will always expose all online CPUs as the IRQ mask. These IRQ controllers effectively run on CPU 0.

@@ -60,4 +61,4 @@ $ find /proc/irq -name effective_affinity -printf "%p: " -exec cat {} \;
/proc/irq/34/effective_affinity: 2
----

Some drivers use `managed_irqs`, whose affinity is managed internally by the kernel and userspace cannot change the affinity. In some cases, these IRQs might be assigned to isolated CPUs. For more information about `managed_irqs`, see link:https://access.redhat.com/solutions/4819541[Affinity of managed interrupts cannot be changed even if they target isolated CPU].
4 changes: 2 additions & 2 deletions modules/cnf-about-the-profile-creator-tool.adoc
@@ -1,6 +1,6 @@
// Module included in the following assemblies:
// Epic CNF-792 (4.8)
// * scalability_and_performance/cnf-create-performance-profiles.adoc
//
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

:_mod-docs-content-type: CONCEPT
[id="cnf-about-the-profile-creator-tool_{context}"]

This file was deleted.

@@ -1,6 +1,6 @@
// Module included in the following assemblies:
//CNF-1483 (4.8)
// * scalability_and_performance/low-latency-tuning.adoc
//
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

:_mod-docs-content-type: PROCEDURE
[id="adjusting-nic-queues-with-the-performance-profile_{context}"]
@@ -165,4 +165,4 @@ spec:
[source,terminal]
----
$ oc apply -f <your_profile_name>.yaml
----
6 changes: 3 additions & 3 deletions modules/cnf-allocating-multiple-huge-page-sizes.adoc
@@ -1,7 +1,7 @@
// CNF-538 Promote Multiple Huge Pages Sizes for Pods and Containers to beta
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

[id="cnf-allocating-multiple-huge-page-sizes_{context}"]
= Allocating multiple huge page sizes
@@ -22,4 +22,4 @@ spec:
- count: 4
node: 1
size: 1G
----
@@ -1,7 +1,7 @@
// CNF-643 Support and debugging tools for CNF
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/low_latency_tuning/cnf-debugging-low-latency-tuning-status.adoc

:_mod-docs-content-type: PROCEDURE
[id="cnf-collecting-low-latency-tuning-debugging-data-for-red-hat-support_{context}"]
158 changes: 6 additions & 152 deletions modules/cnf-configure_for_irq_dynamic_load_balancing.adoc
@@ -1,10 +1,10 @@
// Module included in the following assemblies:
//
// scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

:_mod-docs-content-type: PROCEDURE
[id="configuring_for_irq_dynamic_load_balancing_{context}"]
= Configuring a node for IRQ dynamic load balancing
= Configuring node interrupt affinity

Configure a cluster node for IRQ dynamic load balancing to control which cores can receive device interrupt requests (IRQs).

@@ -34,154 +34,8 @@ spec:
+
[NOTE]
====
When you configure reserved and isolated CPUs, the infra containers in pods use the reserved CPUs and the application containers use the isolated CPUs.
When you configure reserved and isolated CPUs, operating system processes, kernel processes, and systemd services run on reserved CPUs.
Infrastructure pods run on any CPU except where the low latency workload is running.
Low latency workload pods run on exclusive CPUs from the isolated pool.
For more information, see "Restricting CPUs for infra and application containers".
====
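+
For reference, the following is a minimal `PerformanceProfile` sketch that defines the reserved and isolated CPU pools described in this note. The CPU ranges are illustrative, and the profile name matches the `runtimeClassName` that is used in the next step:
+
[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: dynamic-irq-profile
spec:
  cpu:
    isolated: 2-5 # example CPUs dedicated to low latency workloads
    reserved: 0-1 # example CPUs reserved for housekeeping processes
  globallyDisableIrqLoadBalancing: false # keep IRQ dynamic load balancing enabled
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
# ...
----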

. Create the pod that uses exclusive CPUs, and set `irq-load-balancing.crio.io` and `cpu-quota.crio.io` annotations to `disable`. For example:
+
[source,yaml,subs="attributes+"]
----
apiVersion: v1
kind: Pod
metadata:
name: dynamic-irq-pod
annotations:
irq-load-balancing.crio.io: "disable"
cpu-quota.crio.io: "disable"
spec:
securityContext:
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
containers:
- name: dynamic-irq-pod
image: "registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version}"
command: ["sleep", "10h"]
resources:
requests:
cpu: 2
memory: "200M"
limits:
cpu: 2
memory: "200M"
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: [ALL]
nodeSelector:
node-role.kubernetes.io/worker-cnf: ""
runtimeClassName: performance-dynamic-irq-profile
# ...
----

. Enter the pod `runtimeClassName` in the form `performance-<profile_name>`, where `<profile_name>` is the `name` from the `PerformanceProfile` YAML, in this example, `performance-dynamic-irq-profile`.
. Set the node selector to target a node with the `worker-cnf` role.
. Verify that the pod is running correctly. The status should be `Running`, and the correct `worker-cnf` node should be set:
+
[source,terminal]
----
$ oc get pod -o wide
----
+
.Expected output
+
[source,terminal]
----
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dynamic-irq-pod 1/1 Running 0 5h33m <ip-address> <node-name> <none> <none>
----
. Get the CPUs that the pod configured for IRQ dynamic load balancing runs on:
+
[source,terminal]
----
$ oc exec -it dynamic-irq-pod -- /bin/bash -c "grep Cpus_allowed_list /proc/self/status | awk '{print $2}'"
----
+
.Expected output
+
[source,terminal]
----
Cpus_allowed_list: 2-3
----
. Ensure the node configuration is applied correctly. Log in to the node to verify the configuration.
+
[source,terminal]
----
$ oc debug node/<node-name>
----
+
.Expected output
+
[source,terminal]
----
Starting pod/<node-name>-debug ...
To use host binaries, run `chroot /host`

Pod IP: <ip-address>
If you do not see a command prompt, try pressing enter.

sh-4.4#
----

. Verify that you can use the node file system:
+
[source,terminal]
----
sh-4.4# chroot /host
----
+
.Expected output
+
[source,terminal]
----
sh-4.4#
----

. Ensure the default system CPU affinity mask does not include the `dynamic-irq-pod` CPUs, for example, CPUs 2 and 3.
+
[source,terminal]
----
$ cat /proc/irq/default_smp_affinity
----
+
.Example output
+
[source,terminal]
----
33
----
. Ensure the system IRQs are not configured to run on the `dynamic-irq-pod` CPUs:
+
[source,terminal]
----
$ find /proc/irq/ -name smp_affinity_list -exec sh -c 'i="$1"; mask=$(cat $i); file=$(echo $i); echo $file: $mask' _ {} \;
----
+
.Example output
+
[source,terminal]
----
/proc/irq/0/smp_affinity_list: 0-5
/proc/irq/1/smp_affinity_list: 5
/proc/irq/2/smp_affinity_list: 0-5
/proc/irq/3/smp_affinity_list: 0-5
/proc/irq/4/smp_affinity_list: 0
/proc/irq/5/smp_affinity_list: 0-5
/proc/irq/6/smp_affinity_list: 0-5
/proc/irq/7/smp_affinity_list: 0-5
/proc/irq/8/smp_affinity_list: 4
/proc/irq/9/smp_affinity_list: 4
/proc/irq/10/smp_affinity_list: 0-5
/proc/irq/11/smp_affinity_list: 0
/proc/irq/12/smp_affinity_list: 1
/proc/irq/13/smp_affinity_list: 0-5
/proc/irq/14/smp_affinity_list: 1
/proc/irq/15/smp_affinity_list: 0
/proc/irq/24/smp_affinity_list: 1
/proc/irq/25/smp_affinity_list: 1
/proc/irq/26/smp_affinity_list: 1
/proc/irq/27/smp_affinity_list: 5
/proc/irq/28/smp_affinity_list: 1
/proc/irq/29/smp_affinity_list: 0
/proc/irq/30/smp_affinity_list: 0-5
----
55 changes: 55 additions & 0 deletions modules/cnf-configuring-high-priority-workload-pods.adoc
@@ -0,0 +1,55 @@
// Module included in the following assemblies:
//
// * scalability_and_performance/low_latency_tuning/cnf-provisioning-low-latency-workloads.adoc

:_mod-docs-content-type: PROCEDURE
[id="cnf-configuring-high-priority-workload-pods_{context}"]
= Disabling power saving mode for high priority pods

You can configure pods to ensure that high priority workloads are unaffected when you configure power saving for the node that the workloads run on.

When you configure a node with a power saving configuration, you must configure high priority workloads with performance configuration at the pod level, which means that the configuration applies to all the cores used by the pod.

By disabling P-states and C-states at the pod level, you can configure high priority workloads for best performance and lowest latency.

.Configuration for high priority workloads
[cols="1,2,3", options="header"]
|===
| Annotation | Possible Values | Description

|`cpu-c-states.crio.io:` a| * `"enable"`
* `"disable"`
* `"max_latency:microseconds"` | This annotation allows you to enable or disable C-states for each CPU. Alternatively, you can also specify a maximum latency in microseconds for the C-states. For example, enable C-states with a maximum latency of 10 microseconds with the setting `cpu-c-states.crio.io`: `"max_latency:10"`. Set the value to `"disable"` to provide the best performance for a pod.

| `cpu-freq-governor.crio.io:` | Any supported `cpufreq` governor. | Sets the `cpufreq` governor for each CPU. The `"performance"` governor is recommended for high priority workloads.
|===
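
For example, the following is a sketch of the annotation block for a pod that keeps C-states enabled but caps the C-state exit latency; the 10 microsecond value and the governor choice are illustrative:

[source,yaml]
----
metadata:
  annotations:
    cpu-c-states.crio.io: "max_latency:10" # allow C-states with at most 10 microseconds of exit latency
    cpu-freq-governor.crio.io: "performance" # use the performance cpufreq governor for the pod CPUs
#...
----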

.Prerequisites

* You have configured power saving in the performance profile for the node where the high priority workload pods are scheduled.

.Procedure

. Add the required annotations to your high priority workload pods. The annotations override the `default` settings.
+
.Example high priority workload annotation
[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
#...
annotations:
#...
cpu-c-states.crio.io: "disable"
cpu-freq-governor.crio.io: "performance"
#...
#...
spec:
#...
runtimeClassName: performance-<profile_name>
#...
----

. Restart the pods to apply the annotation.