
This page contains an analysis of the test cases in the CNCF CNF Testsuite, to determine whether RA2 should contain related workload requirements.


Each test should be clearly documented - there is no documentation currently. The test case descriptions should state the expectation clearly: e.g. "Test if the CNF crashes when disk fill occurs" should be written as "Test that the CNF does not crash when disk fill occurs".


Issues raised to the CNCF CNF Testsuite during this work

The analysis


Each entry below lists the Test, the Note(s) made during the analysis, and the Verdict (where one was recorded).
Test: To test the increasing and decreasing of capacity
Note: Do we request horizontal scaling from all CNFs? Most (data plane, signalling, etc.) but not all (e.g. OSS). See the scaling sketch below.
Verdict: Should be optional, or should fail only if the CNF does scale but scales incorrectly.
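
For reference, horizontal scaling of capacity is typically exercised through a HorizontalPodAutoscaler (or by patching the replica count directly). A minimal sketch, where the Deployment name, CPU target and replica bounds are illustrative assumptions:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: example-cnf          # hypothetical workload name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: example-cnf
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 80   # scale out above, scale in below this load
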
Test: Test if the Helm chart is published
Note: At the moment RA2 does not mandate the usage of Helm. We should first decide on CNF packaging: RA2 can stay neutral, follow the O-RAN/ONAP ASD path, or propose its own solution.
Verdict: Should be fine - there are no Helm specs in RA2 today, unless some incompatible CNF packaging specs appear (unlikely).

Test: Test if the Helm chart is valid
Note: At the moment RA2 does not mandate the usage of Helm.

Test: Test if the Helm chart deploys
Note: At the moment RA2 does not mandate the usage of Helm. This should be more generic, like testing if the CNF deploys.

Test: Test if the install script uses Helm v3
Note: At the moment RA2 does not mandate the usage of Helm.
Test: To test if the CNF can perform a rolling update
Note: Since some CNFs perform a rolling update without keeping the service alive (because they require some post-configuration), the test should make sure that there is service continuity. This might just be a health probe, a test against the K8s Service, or something similarly straightforward. In other words, the CNF's service/traffic should keep working through the whole process (before, during and after the rolling upgrade). See the sketch below.
Verdict: Needed
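
On the workload side, service continuity during a rolling update is usually achieved with an update strategy that never removes serving capacity, gated by a readiness probe. A minimal sketch, where all names, the image and the probe endpoint are illustrative assumptions:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-cnf
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: example-cnf
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # never take a serving pod down first
          maxSurge: 1         # start the new pod before removing an old one
      template:
        metadata:
          labels:
            app: example-cnf
        spec:
          containers:
          - name: app
            image: registry.example.com/example-cnf:1.2.3
            readinessProbe:   # traffic shifts only once the new pod is ready
              httpGet:
                path: /healthz
                port: 8080
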
Test: To check if a CNF version can be downgraded through a rolling_version_change
Note: It is not clear what the difference is between a rolling downgrade and a rolling version change. Maybe the latter is when you request an arbitrary version?

Test: To check if a CNF version can be downgraded through a rolling_downgrade
Note: Same as above?
Verdict: Needed

Test: To check if a CNF version can be rolled back (rollback)
Note: It is not clear what the difference is between a rolling downgrade and a rollback.
Test: To check if the CNF is compatible with different CNIs
Note: This covers only the default CNI; it does not cover the meta-plugin part. Additional tests are needed for cases with multiple interfaces (see the sketch below).
Verdict: OK, but needs additional tests for multiple interfaces.
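
Multiple interfaces are commonly attached through a CNI meta-plugin such as Multus, which is the part the current test does not cover. A sketch of what such a setup looks like, where the network name, master interface and subnet are illustrative assumptions:

    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      name: secondary-net          # hypothetical secondary network
    spec:
      config: '{ "cniVersion": "0.3.1", "type": "macvlan", "master": "eth1",
                 "ipam": { "type": "host-local", "subnet": "192.168.1.0/24" } }'
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: example-cnf
      annotations:
        k8s.v1.cni.cncf.io/networks: secondary-net   # adds a second interface
    spec:
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
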
Test: (PoC) To check if a CNF uses Kubernetes alpha APIs
Note: Alpha APIs are not recommended by ra2.k8s.012. It is not clear what the pass criteria of this test are.
Verdict: OK if it fails when alpha APIs are used.

Test: To check if the CNF has a reasonable image size
Note: It passes if the image size is smaller than 5GB.
Verdict: OK, but the threshold should be documented or configurable?

Test: To check if the CNF has a reasonable startup time
Note: It is not clear what a reasonable startup time is.
Verdict: OK, but the threshold should be documented or configurable?

Test: To check if the CNF has multiple process types within one container
Note: Containers in the CNF should have only one process type.
Verdict: What's the rationale?

Test: To check if the CNF exposes any of its containers as a service
Note: A Service of what type? RA2 mandates that clusters must support LoadBalancer and ClusterIP, and should support NodePort and ExternalName (see the sketch below). Should there be a test for the CNF using Ingress or Gateway objects as well?
Verdict: May need tweaking to add Ingress?
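
For context, exposing a container through one of the RA2-supported Service types looks like the following. The name, ports and selector are illustrative assumptions:

    apiVersion: v1
    kind: Service
    metadata:
      name: example-cnf
    spec:
      type: LoadBalancer    # RA2: LoadBalancer and ClusterIP must be supported
      selector:
        app: example-cnf
      ports:
      - port: 443           # port exposed by the Service
        targetPort: 8443    # port the container listens on
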

Test: To check if the CNF has multiple microservices that share a database
Note: Clarify the rationale? In some cases it is good for multiple microservices to share a DB, e.g. when restoring the state of a transaction from a failed service. A DB shared across multiple services is also useful for things like an HSS.
Verdict: Clarify
Test: Test if the CNF crashes when node drain and rescheduling occurs. All configuration should be stateless
Note: The CNF should react gracefully (no loss of context/sessions/data/logs, and the service continues to run) to eviction and node draining (see the sketch below). The statelessness test should be made independent, and should be skipped for stateful pods, e.g. DNS.
Verdict: Needed, but replace "crash" with "react gracefully" (no loss of context/sessions/data/logs, and the service continues to run). The statelessness test should be separate.
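
On the workload side, reacting gracefully to a drain usually means a PodDisruptionBudget that keeps a minimum number of replicas serving while the eviction proceeds. A minimal sketch, with an illustrative name and selector:

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: example-cnf-pdb
    spec:
      minAvailable: 2          # the drain is throttled so this many pods stay up
      selector:
        matchLabels:
          app: example-cnf
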

Test: To test if the CNF uses a volume host path
Note: Should pass if the CNF doesn't have a hostPath volume (see the sketch below).
Verdict: What's the rationale?
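
Presumably the test flags a volume like the following (all names and paths illustrative), since a hostPath volume ties the pod's data to one specific node and opens access to the host file system:

    apiVersion: v1
    kind: Pod
    metadata:
      name: bad-example
    spec:
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
        volumeMounts:
        - name: host-data
          mountPath: /data
      volumes:
      - name: host-data
        hostPath:
          path: /var/lib/example   # flagged: node-local, host-level access
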


Test: To test if the CNF uses local storage
Note: Should fail if a local storage configuration is found.
Verdict: What's the rationale?


Test: To test if the CNF uses elastic volumes
Note: Should pass if the CNF uses an elastic volume. There should be a definition of what an elastic volume is (besides ELASTIC_PROVISIONING_DRIVERS_REGEX).
Verdict: What's an elastic volume? Does it mean ephemeral? Or is this an AWS-specific test?

Test: To test if the CNF uses a database with either statefulsets, elastic volumes, or both
Note: A database may use statefulsets along with elastic volumes to achieve a high level of resiliency. Any database in K8s should at least use elastic volumes to achieve a minimum level of resilience, regardless of whether a statefulset is used. Statefulsets without elastic volumes are not recommended, especially if local storage is explicitly used. The least optimal storage configuration for a database managed by K8s is local storage with no statefulsets, as this is not tolerant to node failure. There should be a definition of what an elastic volume is (besides ELASTIC_PROVISIONING_DRIVERS_REGEX).
Verdict: What's an elastic volume? Does it mean ephemeral? Or is this an AWS-specific test?

Test: Test if the CNF crashes when network latency occurs
Note: How is this tested, and where does the test run - some traffic against a service? Should the latency be configurable (the default is 2s)? What should happen if the latency is exceeded - should this be more stringent than "not crashing"? What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)
Verdict: Needed, but needs clarification.
Test: Test if the CNF crashes when disk fill occurs
Note: What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)
Verdict: Needed

Test: Test if the CNF crashes when pod delete occurs
Note: What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)
Verdict: Needed

Test: Test if the CNF crashes when pod memory hog occurs
Note: What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)

Test: Test if the CNF crashes when pod io stress occurs
Note: What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)
Verdict: Needed

Test: Test if the CNF crashes when pod network corruption occurs
Note: It is not clear what network corruption means in this context. What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)

Test: Test if the CNF crashes when pod network duplication occurs
Note: It is not clear what network duplication means in this context. What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)
Test: To test if there is a liveness entry in the Helm chart
Note: A liveness probe should be mandatory, but RA2 does not mandate Helm at the moment. (The probe lives in the pod definition rather than in the Helm chart - maybe fix the title. See the sketch below.)
Verdict: Needed

Test: To test if there is a readiness entry in the Helm chart
Note: A readiness probe should be mandatory, but RA2 does not mandate Helm at the moment. (The probe lives in the pod definition rather than in the Helm chart - maybe fix the title.)
Verdict: Needed
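
As the notes say, both probes sit in the pod spec, whatever packaging is used. A minimal sketch, where the image, endpoint paths and ports are illustrative assumptions:

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-cnf
    spec:
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
        livenessProbe:        # restart the container when it stops responding
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
        readinessProbe:       # drop the pod from Service endpoints while not ready
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
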
Test: To check if logs are being sent to stdout/stderr
Verdict: Optional, as there is no way to accurately figure out whether something is missing from stdout/stderr.

Test: To check if Prometheus is installed and configured for the CNF
Note: There is a chapter for Additional required components (4.10), but without any content. Should RA2 mandate Prometheus?

Test: To check if logs and data are being routed through Fluentd
Note: There is a chapter for Additional required components (4.10), but without any content. Should RA2 mandate Fluentd?

Test: To check if OpenMetrics is being used and/or compatible
Note: There is a chapter for Additional required components (4.10), but without any content. Should RA2 mandate OpenMetrics?

Test: To check if tracing is being used with Jaeger
Note: There is a chapter for Additional required components (4.10), but without any content. Should RA2 mandate Jaeger?
Test: To check if a CNF is using container socket mounts
Note: What is being tested - that /var/run/docker.sock, /var/run/containerd.sock and /var/run/crio.sock are not mounted into the containers? (See the sketch below.)
Verdict: Needed
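
If that reading is right, the anti-pattern being scanned for would look like this (names illustrative); mounting a runtime socket gives the pod root-equivalent control over every container on the node:

    apiVersion: v1
    kind: Pod
    metadata:
      name: bad-example
    spec:
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
        volumeMounts:
        - name: runtime-sock
          mountPath: /var/run/docker.sock
      volumes:
      - name: runtime-sock
        hostPath:
          path: /var/run/docker.sock   # flagged: container runtime socket
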
Test: To check if containers are using any Tiller images
Note: I.e. test that it's NOT Helm v2?
Verdict: OK if not Helm v2.

Test: To check if any containers are running in privileged mode
Note: I.e. they must NOT be privileged?

Test: To check if a CNF is running services with external IPs
Note: Does this mean a K8s Service? RA2 mandates that clusters must support LoadBalancer and ClusterIP, and should support NodePort and ExternalName.

Test: To check if any containers are running as a root user
Note: I.e. they must not run as root?

Test: To check if any containers allow for privilege escalation
Note: I.e. it must not be allowed?

Test: To check if an attacker can use a symlink for arbitrary host file system access
Verdict: OK if not.
Test: To check if there are service accounts that are automatically mapped
Note: What is the expectation?
Application credentials: developers store secrets in Kubernetes configuration files, such as environment variables in the pod configuration. Such behavior is commonly seen in clusters that are monitored by Azure Security Center. Attackers who have access to those configurations, by querying the API server or by accessing those files on the developer's endpoint, can steal the stored secrets and use them.
The check: does the pod have sensitive information in environment variables (using a list of known sensitive key names), and are there configmaps with sensitive information?
Remediation: use Kubernetes secrets or Key Management Systems to store credentials.
See more at ARMO-C0012.


Test: To check if there is a host network attached to a pod
Note: Should be OK with or without - e.g. when exposing services via the cluster network as opposed to NodePort?
Test: To check if there are service accounts that are automatically mapped
Note: Disable automatic mounting of service account tokens to pods, either at the service account level or at the individual pod level, by specifying automountServiceAccountToken: false. Note that the pod-level setting takes precedence (see the sketch below).
See more at ARMO-C0034.
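
A minimal sketch of both levels, with illustrative names:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: example-cnf
    automountServiceAccountToken: false     # service-account-level default
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: example-cnf
    spec:
      serviceAccountName: example-cnf
      automountServiceAccountToken: false   # pod-level setting takes precedence
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
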


Test: To check if there is an ingress and egress policy defined
Verdict: OK - maybe make it more stringent? (See the sketch below.)
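
The usual baseline such a test could demand is a default-deny NetworkPolicy, on top of which specific allows are added. A minimal sketch:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny
    spec:
      podSelector: {}        # matches every pod in the namespace
      policyTypes:           # with no rules listed, all traffic is denied
      - Ingress
      - Egress
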
Test: To check if there are any privileged containers
Note: Duplicate?

Test: To check for insecure capabilities
Note: What is the expectation?

Test: To check for dangerous capabilities
Note: What is the expectation?

Test: To check if namespaces have network policies defined
Verdict: OK - maybe more stringent? Duplicate?

Test: To check if containers are running with a non-root user with non-root membership
Note: Duplicate?

Test: To check if containers are running with hostPID or hostIPC privileges
Verdict: OK if not.
Test: To check if security services are being used to harden containers
Note: Which services? This should be configurable or optional.
Linux hardening: check whether AppArmor, Seccomp, SELinux or capabilities are defined in the securityContext of the container and the pod. If none of these fields are defined for either level, alert (see the sketch below).
Remediation: in order to reduce the attack surface, it is recommended to harden your application using security services such as SELinux®, AppArmor®, and seccomp. Starting from Kubernetes version 22, SELinux is enabled by default.
Read more at ARMO-C0055.
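
A minimal sketch of the securityContext fields this check appears to look for (names and image illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-cnf
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault        # pod-level seccomp hardening
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]             # drop capabilities the app does not need
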


Test: To check if containers have resource limits defined
Verdict: OK (see the combined sketch below).

Test: To check if containers have immutable file systems
Verdict: OK.
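
Both checks map to a few fields on the container spec; a combined sketch with illustrative values:

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-cnf
    spec:
      containers:
      - name: app
        image: registry.example.com/example-cnf:1.2.3
        resources:
          requests:
            cpu: 500m
            memory: 256Mi
          limits:
            cpu: "1"
            memory: 512Mi              # bounds the blast radius of a runaway pod
        securityContext:
          readOnlyRootFilesystem: true # the "immutable file system" check
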
Test: To check if containers have hostPath mounts
Verdict: OK if not.

Test: To check if containers are using labels
Verdict: OK - maybe define some mandatory labels?

Test: To test if there are versioned tags on all images, using OPA Gatekeeper
Verdict: OK.

Test: To test if there are any (non-declarative) hardcoded IP addresses or subnet masks
Verdict: OK - there shouldn't be any hardcoded internal addressing anyway.

Test: To test if there are node ports used in the service configuration
Verdict: OK, but Service type LoadBalancer would be better.

Test: To test if there are host ports used in the service configuration
Note: Duplicate? Are host ports the same as node ports?

Test: To test if there are any (non-declarative) hardcoded IP addresses or subnet masks in the K8s runtime configuration
Note: Duplicate?

Test: To check if a CNF version uses immutable configmaps
Verdict: OK (see the sketch below).
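
An immutable ConfigMap is just a ConfigMap with the immutable flag set; once created, the API server rejects any update to its data. A minimal sketch with illustrative content:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: example-cnf-config
    immutable: true        # updates to data are rejected; replace by re-creating
    data:
      LOG_LEVEL: info
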

Test: Test if the CNF crashes when pod dns error occurs
Note: What is the expectation? (Not crashing = not exiting with an error code or, better, not ceasing to process traffic.)


