Deploy a CPU‑intensive demo app (autoscale-probe) and a separate Grafana deployment.
Each component is exposed through its own OCI Layer‑7 Load Balancer on port 80.
- App: `http://<APP_LB_IP>/` (port 80 → container 8080)
- Health: `http://<APP_LB_IP>/healthz`
- Grafana: `http://<GRAFANA_LB_IP>/` (port 80 → container 3000; default credentials `admin`/`admin`, change them in production)
The manifest includes Metrics Server, Prometheus, and a HorizontalPodAutoscaler (HPA). Ensure the OCI Cluster Autoscaler add‑on is enabled.
- Overview
- Architecture
- What’s in the Manifest
- Prerequisites
- Deploy
- Grab the Load Balancer IPs
- Use the App and Grafana
- Sample Grafana Dashboard
- Generate Load & Watch Autoscaling
- Enable the OCI Cluster Autoscaler
- Troubleshooting
- Cleanup
This setup targets OKE and demonstrates HPA‑driven scaling of a CPU‑bound app while visualizing metrics in Grafana.
Grafana runs in its own Deployment and is published by a separate Service of type LoadBalancer.
Metrics sources used by the bundled Grafana dashboard:
- CPU & Memory Utilization (per‑node + cluster): from kubelet cAdvisor (`container_*`, `machine_*`).
- Node readiness & capacity/allocatable: custom gauges exported by `autoscale-probe` (`k8s_node_*`).
- Pods Up & Pod Health: from Prometheus `up{job="autoscale-probe"}` (labels added via relabeling).
```
                     +-----------------------------+     +-----------------------------+
                     |        OCI LB (App)         |     |      OCI LB (Grafana)       |
Internet ----------->| Port 80 ---> / (App)        |     | Port 80 ---> / (Grafana)    |
                     +-----------------------------+     +-----------------------------+
                                  |                                   |
                     Service: autoscale-probe-lb          Service: grafana
                     (ns: autoscale)                      (ns: monitoring)
                                  |                                   |
                     +-------------------------+         +-------------------------+
                     |  Pod: autoscale-probe   |         |      Pod: grafana       |
                     |  Container: app (8080)  |         |   Container: grafana    |
                     +-------------------------+         +-------------------------+
```
Namespaces:
- autoscale : app + Services (LB + metrics) + HPA
- monitoring : Grafana + Prometheus + Grafana provisioning ConfigMaps
- kube-system : CertManager + metrics-server + Cluster Autoscaler (enable addons in OKE)
- Namespace `autoscale`
  - Deployment `autoscale-probe` (container: `app`)
  - Service `autoscale-probe-lb` (type LoadBalancer, port 80 → `app`)
  - Service `autoscale-probe-metrics` (ClusterIP for Prometheus scraping)
  - HorizontalPodAutoscaler `autoscale-probe-hpa` (60% CPU, 1→20 replicas)
  - The Cluster Autoscaler adds nodes when pods become unschedulable and removes them when nodes are underused.
- Namespace `monitoring`
  - Deployment `grafana` + Service `grafana` (LoadBalancer on port 80)
  - Grafana provisioning ConfigMaps:
    - `grafana-datasource` → Prometheus datasource
    - `grafana-dashboard` → “OKE Autoscale” dashboard JSON
    - `grafana-dash-provider` → dashboard file provider
  - Deployment `prometheus` + Service `prometheus` (ClusterIP)
- Namespace `kube-system`
  - `cert-manager` (enable the add-on in OKE)
  - `metrics-server` (enable the add-on in OKE)
  - Cluster Autoscaler (enable the add-on in OKE)
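For orientation, the HPA described by the manifest could be expressed roughly like this (a sketch based on the values above, assuming the `autoscaling/v2` API; the actual `oke-autoscale.yaml` is authoritative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: autoscale-probe-hpa
  namespace: autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autoscale-probe
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale when average pod CPU exceeds 60%
```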
- OKE cluster with the OCI Cloud Controller Manager (for load balancers)
- `kubectl` configured to point at your cluster
- (Optional) `hey` for load generation
- Egress access for image pulls (ghcr.io, Grafana, Prometheus, Kubernetes images)

Security: Grafana defaults to `admin`/`admin` in this demo. For production, inject credentials via a `Secret` and environment variables, or use an IdP.
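One way to replace the default credentials: a `Secret` plus Grafana's standard `GF_SECURITY_ADMIN_*` environment variables. This is a sketch; the Secret name is illustrative, and you would wire it into the `grafana` Deployment in the manifest:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin          # illustrative name
  namespace: monitoring
type: Opaque
stringData:
  admin-user: admin
  admin-password: change-me    # set a real password here
---
# Then, in the grafana container spec:
#   env:
#     - name: GF_SECURITY_ADMIN_USER
#       valueFrom:
#         secretKeyRef: { name: grafana-admin, key: admin-user }
#     - name: GF_SECURITY_ADMIN_PASSWORD
#       valueFrom:
#         secretKeyRef: { name: grafana-admin, key: admin-password }
```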
Clone the repo:

```
git clone https://github.com/cj667113/OCI_K8_AUTOSCALING_APP.git
cd OCI_K8_AUTOSCALING_APP
```

Apply the manifest:

```
kubectl apply -f oke-autoscale.yaml
```

Watch for the load balancer IPs:

```
kubectl -n autoscale get svc autoscale-probe-lb -w
kubectl -n monitoring get svc grafana -w
```

When an EXTERNAL-IP appears on each Service, use those values as `APP_LB_IP` and `GRAFANA_LB_IP`.
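If you prefer scripting to watching, the IPs can be captured once provisioning completes (a sketch; assumes the OCI load balancers publish an IP rather than a hostname):

```
APP_LB_IP=$(kubectl -n autoscale get svc autoscale-probe-lb \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
GRAFANA_LB_IP=$(kubectl -n monitoring get svc grafana \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "App:     http://$APP_LB_IP/"
echo "Grafana: http://$GRAFANA_LB_IP/"
```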
```
App (root)   : http://<APP_LB_IP>/
Health       : http://<APP_LB_IP>/healthz
Grafana (UI) : http://<GRAFANA_LB_IP>/   (admin / admin)
```

Prometheus is internal (ClusterIP):

```
kubectl -n monitoring get svc prometheus
```

If you update the dashboard ConfigMap and don’t see changes, reload Grafana:

```
kubectl -n monitoring rollout restart deploy/grafana
```
A sample Grafana dashboard is included via ConfigMaps (see `grafana-dashboard` and `grafana-datasource` in the manifest).
The HPA targets 60% CPU with `minReplicas: 1`, `maxReplicas: 20`.
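The scaling decision follows the standard HPA formula, desired = ceil(currentReplicas × currentUtilization / target), clamped to the 1–20 replica range. A quick sketch with made-up numbers (90% observed average CPU against the 60% target):

```shell
current_replicas=4
current_cpu=90    # observed average CPU utilization, percent (illustrative)
target_cpu=60     # HPA target from the manifest

# ceil(current_replicas * current_cpu / target_cpu) via integer arithmetic
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "desired replicas: $desired"   # -> 6
```

The controller then clamps the result to `[minReplicas, maxReplicas]`, i.e. 1 to 20 here.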
Generate sustained CPU load for 500s with concurrency 72:
```
hey -z 500s -c 72 -disable-keepalive "http://<APP_LB_IP>/burn?cpu_ms=100"
```

Observe metrics and scaling:

```
watch -n 1 kubectl top pods -A
kubectl -n autoscale get hpa autoscale-probe
kubectl get nodes -w
```

Docs:
OCI CLI example:
```
oci ce cluster update-addon --addon-name ClusterAutoscaler --from-json file://<path-to-config-file> --cluster-id <cluster-ocid>
```

Restart the autoscaler after changing its configuration:

```
kubectl -n kube-system rollout restart deploy/cluster-autoscaler
```

Tail the autoscaler logs:

```
kubectl -n kube-system logs -f deploy/cluster-autoscaler | egrep -i "scale-down|unneeded|removing|utilization|NoScaleDown"
```

No EXTERNAL-IP on Service

```
kubectl -n autoscale describe svc autoscale-probe-lb
kubectl -n monitoring describe svc grafana
```

- Verify the OCI Cloud Controller Manager is running and LB quota/permissions are OK
- Ensure subnet/security lists allow inbound port 80
Grafana not loading
- Confirm security lists / firewall allow inbound traffic to `<GRAFANA_LB_IP>:80`
- Check the logs: `kubectl -n monitoring logs deploy/grafana`

HPA shows unknown metrics

- `kubectl get apiservices | grep metrics` → `v1beta1.metrics.k8s.io` must be Available
- `kubectl top pods -A` should return CPU/Memory (metrics-server healthy)
- `kubectl -n autoscale describe hpa autoscale-probe` for details
```
kubectl delete -f oke-autoscale.yaml
```

To redeploy and verify everything comes back up:

```
kubectl apply -f oke-autoscale.yaml
kubectl -n kube-system rollout restart deploy/cluster-autoscaler
kubectl get nodes
watch -n 1 kubectl top pods -A
kubectl get pods -n kube-system
kubectl get pods -n autoscale
kubectl get pods -n monitoring
kubectl top pods -A | head
kubectl -n kube-system logs -f deploy/cluster-autoscaler | egrep -i "scale-down|unneeded|removing|utilization|NoScaleDown"
kubectl get nodes -w

# Load test (replace APP_LB_IP)
hey -z 500s -c 72 -disable-keepalive "http://<APP_LB_IP>/burn?cpu_ms=100"
```

Enjoy! 🎉