Upgrading

This guide covers upgrading your RootCause.ai deployment to new versions. Following the procedures below helps you avoid downtime during an upgrade and keeps a safe rollback path available.

Upgrade Process Overview

Review release notes – Understand what changed
Backup data – Ensure recovery options
Test in staging – Validate before production
Upgrade platform – Deploy new version
Upgrade dependencies – If required by the new version
Verify – Confirm everything works
Rollback if needed – Quick recovery option

Checking Current Version

# View installed chart version
helm list -n perceptura

# View running image versions
kubectl get pods -n perceptura -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'

Checking Available Versions

# Show chart info (includes version)
helm show chart oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform

# Pull and inspect a specific version
helm pull oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform --version 0.3.0
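Before picking a target, it can help to confirm that the version you plan to pull is actually newer than the one installed. The following is a minimal sketch, not part of the chart: `version_lt` is a hypothetical helper name, the two version strings are placeholders, and the comparison assumes a `sort` that supports `-V` (version sort), which GNU coreutils provides but some minimal images may not.

```shell
# version_lt A B: succeed (exit 0) when version A sorts strictly before B.
# Hypothetical helper; assumes `sort -V` (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

installed="0.2.5"   # placeholder: e.g. taken from the CHART column of `helm list`
target="0.3.0"      # placeholder: the version you intend to pass to --version

if version_lt "$installed" "$target"; then
  echo "upgrade: $installed -> $target"
else
  echo "no upgrade needed: $installed is not older than $target"
fi
```

Version sort handles plain dotted versions such as `0.9.0` versus `0.10.0`; pre-release suffixes may not order the way semantic versioning expects, so treat the result as a hint rather than a gate.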

Pre-Upgrade Checklist

1. Backup MongoDB

# Create backup using mongodump
kubectl exec -it perceptura-mongo-0 -n perceptura -- \
  mongodump --out=/backup/$(date +%Y%m%d)

# Or use your backup solution

2. Backup PostgreSQL

# Omit -it here: allocating a TTY can corrupt redirected output with carriage returns
kubectl exec postgres-0 -n perceptura -- \
  pg_dumpall -U postgres > backup-$(date +%Y%m%d).sql

3. Export current values

helm get values perceptura-platform -n perceptura > current-values.yaml

4. Check pod health

kubectl get pods -n perceptura
# All pods should be Running
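A backup is only useful if it finished. As a rough sanity check for the PostgreSQL dump from step 2, the sketch below tests that the file is non-empty and ends with the completion comment that `pg_dumpall` normally writes last. `backup_looks_complete` is a hypothetical helper, and the exact marker text is an assumption to confirm against a known-good dump from your PostgreSQL version.

```shell
# backup_looks_complete FILE: succeed when FILE is non-empty and its last
# lines contain the completion comment written by pg_dumpall.
# Hypothetical helper; the marker text is an assumption, verify it once
# against a dump you know is good.
backup_looks_complete() {
  [ -s "$1" ] &&
    tail -n 5 "$1" | grep -q 'PostgreSQL database cluster dump complete'
}

# Usage (file name follows the backup step above):
#   backup_looks_complete "backup-$(date +%Y%m%d).sql" ||
#     echo "backup looks truncated; do not upgrade yet" >&2
```

This does not prove the dump can be restored. Restoring it into a scratch database remains the only real test.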

Upgrading from OCI Registry

Standard Upgrade (Latest):

helm upgrade perceptura-platform \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform \
  --namespace perceptura \
  --values production-values.yaml

Upgrade to Specific Version:

helm upgrade perceptura-platform \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform \
  --version 0.3.0 \
  --namespace perceptura \
  --values production-values.yaml

Dry Run First:

helm upgrade perceptura-platform \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform \
  --version 0.3.0 \
  --namespace perceptura \
  --values production-values.yaml \
  --dry-run
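The dry run and the real upgrade can be chained so the upgrade only starts when the dry run succeeds. This is a sketch under stated assumptions: `upgrade_with_dry_run` is a hypothetical wrapper, `--atomic`, `--wait` and `--timeout` are Helm 3 flags, and the `10m` timeout is an arbitrary value to tune for your cluster.

```shell
# Chart reference copied from the commands above.
CHART="oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform"

# upgrade_with_dry_run VERSION: run a dry run first and proceed to the real
# upgrade only when the dry run succeeds. Hypothetical wrapper.
upgrade_with_dry_run() {
  version="$1"
  helm upgrade perceptura-platform "$CHART" \
    --version "$version" --namespace perceptura \
    --values production-values.yaml --dry-run >/dev/null || {
      echo "dry run failed; not upgrading" >&2
      return 1
    }
  # --atomic asks Helm to roll the release back if the upgrade fails.
  helm upgrade perceptura-platform "$CHART" \
    --version "$version" --namespace perceptura \
    --values production-values.yaml --atomic --wait --timeout 10m
}

# Usage: upgrade_with_dry_run 0.3.0
```

The dry run output is discarded here to keep the sketch short; in practice you may prefer to save it to a file and review the rendered manifests before continuing.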

Upgrading from Local Chart

# Pull latest changes
git pull

# Update dependencies
helm dependency update ./perceptura-platform

# Upgrade
helm upgrade perceptura-platform ./perceptura-platform \
  --namespace perceptura \
  --values production-values.yaml

Upgrading Dependencies

When upgrading dependencies (MongoDB, PostgreSQL, etc.), follow their specific upgrade paths:

Dependencies Chart:

helm upgrade perceptura-dependencies \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-dependencies/helm/perceptura-dependencies \
  --namespace perceptura \
  --values dependencies-values.yaml

⚠️ Database upgrades require special care:

Always backup first
Check for schema migrations
Follow vendor upgrade guides
Test data integrity after upgrade

Monitoring the Upgrade

Watch the rollout:

# Watch pods
kubectl get pods -n perceptura -w

# Watch deployment rollout
kubectl rollout status deployment/platform -n perceptura
kubectl rollout status deployment/data-service -n perceptura

# Check events
kubectl get events -n perceptura --sort-by='.lastTimestamp'
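For scripted upgrades, the rollout checks above can be wrapped so the script stops at the first deployment that fails to roll out. A minimal sketch, assuming the deployment names used above; `wait_for_rollouts` is a hypothetical helper and the `300s` timeout is an assumed value.

```shell
# wait_for_rollouts NAMESPACE DEPLOYMENT...: wait for each deployment to
# finish rolling out, stopping at the first one that does not.
# Hypothetical helper; the 300s timeout is an assumed value.
wait_for_rollouts() {
  ns="$1"
  shift
  for deploy in "$@"; do
    kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=300s || {
      echo "rollout of $deploy did not complete" >&2
      return 1
    }
  done
}

# Usage (deployment names taken from the commands above):
#   wait_for_rollouts perceptura platform data-service
```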

Post-Upgrade Verification

1. Check pod status

kubectl get pods -n perceptura
# All pods should be Running, with every container ready (for example, READY 1/1)

2. Check logs for errors

kubectl logs -l app=platform -n perceptura --tail=50
kubectl logs -l app=data-service -n perceptura --tail=50

3. Test application functionality

Log in to the UI
Load a Data View
Open a Digital Twin
Run a simple simulation

4. Check health endpoints

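# port-forward blocks, so run it in the background (or in a second terminal)
kubectl port-forward svc/data-service 8080:80 -n perceptura &
sleep 2 && curl http://localhost:8080/health
kill $!  # stop the port-forward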
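The pod status check in step 1 is easy to misread by eye on a busy namespace. The sketch below lists only the pods that need attention. `unhealthy_pods` is a hypothetical helper, and it assumes the default `kubectl get pods` column order (NAME, READY, STATUS, ...).

```shell
# unhealthy_pods NAMESPACE: print the name of every pod that is not fully
# ready (READY n/n with STATUS Running), ignoring Completed job pods.
# Hypothetical helper; parses the default column layout of
# `kubectl get pods` and prints nothing when everything is healthy.
unhealthy_pods() {
  kubectl get pods -n "$1" --no-headers | awk '
    $3 == "Completed" { next }
    {
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) print $1
    }'
}

# Usage:
#   bad=$(unhealthy_pods perceptura)
#   [ -z "$bad" ] || { echo "unhealthy pods:"; echo "$bad"; }
```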

Rollback Procedures

If something goes wrong:

Quick Rollback:

# Roll back to the previous revision
helm rollback perceptura-platform -n perceptura

# Roll back to a specific revision (pick the number from the history below)
helm rollback perceptura-platform 2 -n perceptura

View revision history:

helm history perceptura-platform -n perceptura

Example output:

REVISION  UPDATED                   STATUS      CHART                      APP VERSION
1         Mon Jan 15 10:00:00 2024  superseded  perceptura-platform-0.2.0  1.0.0
2         Tue Jan 16 14:30:00 2024  superseded  perceptura-platform-0.2.5  1.0.5
3         Wed Jan 17 09:00:00 2024  deployed    perceptura-platform-0.3.0  1.1.0
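When scripting a rollback, the target revision can be read from the history instead of being typed by hand. A minimal sketch, assuming the default table output shown above; `last_superseded_revision` is a hypothetical helper that reads `helm history` output on stdin.

```shell
# last_superseded_revision: read `helm history` output on stdin and print
# the last listed revision whose STATUS is "superseded", i.e. the most
# recent release that was replaced by a later one.
# Hypothetical helper; assumes the default table output.
last_superseded_revision() {
  awk 'NR > 1 {
    for (i = 2; i <= NF; i++)
      if ($i == "superseded") { rev = $1; break }
  }
  END { if (rev != "") print rev }'
}

# Usage:
#   rev=$(helm history perceptura-platform -n perceptura | last_superseded_revision)
#   [ -n "$rev" ] && helm rollback perceptura-platform "$rev" -n perceptura
```

With the example history above, this prints `2`. The helper prints nothing when no superseded revision exists, for example on a fresh install, so check for an empty result before calling `helm rollback`.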

Zero-Downtime Upgrades

For zero-downtime upgrades, ensure:

1. Rolling update strategy (default):

platform:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0

2. Sufficient replicas:

platform:
  replicaCount: 2  # Minimum for zero-downtime

3. Pod disruption budgets:

platform:
  podDisruptionBudget:
    enabled: true
    minAvailable: 1

4. Proper health checks:

platform:
  livenessProbe:
    httpGet:
      path: /api/health
      port: 3000
    initialDelaySeconds: 30
  readinessProbe:
    httpGet:
      path: /api/health
      port: 3000
    initialDelaySeconds: 10
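Values files and live state can drift apart, so it may be worth confirming the replica count on the cluster before relying on a rolling update. The sketch below is one way to do that; `has_min_replicas` is a hypothetical helper, and the deployment name `platform` is taken from the rollout commands earlier in this guide.

```shell
# has_min_replicas NAMESPACE DEPLOYMENT [MIN]: succeed when the deployment
# spec requests at least MIN replicas (default 2).
# Hypothetical helper; reads .spec.replicas from the live deployment.
has_min_replicas() {
  replicas=$(kubectl get deployment "$2" -n "$1" \
    -o jsonpath='{.spec.replicas}') || return 1
  [ "${replicas:-0}" -ge "${3:-2}" ]
}

# Usage:
#   has_min_replicas perceptura platform ||
#     echo "platform has fewer than 2 replicas; expect downtime" >&2
```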

Handling Breaking Changes

For major version upgrades with breaking changes:

1. Read migration guide – Check release notes for migration steps

2. Blue-green deployment – Deploy new version alongside old:

# Deploy new version with different release name
helm install perceptura-platform-v2 \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform \
  --version 2.0.0 \
  --namespace perceptura \
  --values production-values.yaml

# Update ingress to point to new version
# Verify new version works
# Remove old version

helm uninstall perceptura-platform -n perceptura

3. Database migrations – Run any required migrations:

# If migrations are needed, run them before switching traffic
kubectl exec -it deploy/data-service -n perceptura -- python manage.py migrate

Upgrade Schedule

Recommended approach:

Environment   Timing                   Validation Period
Development   Immediately              N/A
Staging       1 day after release      1-2 days
Production    3-5 days after release   After staging validation

Maintenance windows:

Schedule during low-usage periods
Notify users in advance
Have support team available

Troubleshooting Upgrades

Pods stuck in Pending:

kubectl describe pod <pod-name> -n perceptura
# Check for resource constraints or scheduling issues

CrashLoopBackOff:

kubectl logs <pod-name> -n perceptura --previous
# Check logs from crashed container

Image pull errors:

# Confirm the pull secret exists, then read the pod events for the exact pull error
kubectl get secret regcred -n perceptura -o yaml
kubectl describe pod <pod-name> -n perceptura

Configuration errors:

# Compare current vs expected config
helm get values perceptura-platform -n perceptura
helm show values oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform
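If you exported `current-values.yaml` during the pre-upgrade checklist, comparing it with the live release is a quick way to spot values that changed during the upgrade. A minimal sketch; `values_drift` is a hypothetical helper name.

```shell
# values_drift FILE: diff a saved values export against the values of the
# live release. Exit status follows diff: 0 when identical, 1 when the
# two differ. Hypothetical helper.
values_drift() {
  helm get values perceptura-platform -n perceptura | diff "$1" -
}

# Usage:
#   values_drift current-values.yaml && echo "values unchanged"
```

Lines prefixed with `<` come from the saved file and lines prefixed with `>` come from the live release.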

Disaster Recovery

If rollback doesn't work:

1. Restore from backup:

# MongoDB restore
kubectl exec -it perceptura-mongo-0 -n perceptura -- \
  mongorestore --drop /backup/20240115

# PostgreSQL restore
kubectl exec -i postgres-0 -n perceptura -- \
  psql -U postgres < backup-20240115.sql

2. Reinstall from known-good version:

helm uninstall perceptura-platform -n perceptura
helm install perceptura-platform \
  oci://registry.gitlab.com/perceptura/client-deployments/helm-charts-platform/helm/perceptura-platform \
  --version <last-known-good-version> \
  --namespace perceptura \
  --values production-values.yaml

Next Steps

With upgrade procedures established:

Set up automated backup schedules
Configure monitoring alerts for deployment status
Document your specific upgrade runbook
Practice rollback procedures regularly
