# Requirements

Everything you need before deploying. Read this first, hand the checklist to your infrastructure team, and move on to the Deployment guide once everything is in place.

***

## How the platform is deployed

The RootCause Platform is deployed into your Kubernetes cluster via the **RootCause Operator**. The operator manages the full lifecycle — install, configure, upgrade — through an Admin UI.

The deployment flow is:

1. Your infra team installs the operator (one Helm command)
2. You open the Admin UI and fill in a bootstrap wizard
3. The operator deploys everything: databases, message queues, identity, LLM proxy, and the platform itself

Once installed, data scientists can check for updates and apply them through the Admin UI without involving the infra team.

### What the operator deploys

```
┌─────────────────────────────────────────────────────┐
│                   Admin UI (:3000)                  │
│  Bootstrap │ Overview │ Secrets │ Users │ Releases  │
└──────────────────────┬──────────────────────────────┘
                       │ writes CR
                       ▼
┌─────────────────────────────────────────────────────┐
│              RootCauseInstallation CR               │
└──────────────────────┬──────────────────────────────┘
                       │ reconciles
                       ▼
┌─────────────────────────────────────────────────────┐
│              Operator Controller                    │
└──────────────────────┬──────────────────────────────┘
                       │ manages
          ┌────────────┼────────────┐
          ▼            ▼            ▼
   Dependencies    Platform     Secrets
   ├ PostgreSQL    ├ Platform   ├ Auth keys
   ├ Redis         ├ Data Svc   ├ DB credentials
   ├ MongoDB       ├ Data Fusion├ Storage credentials
   ├ Temporal      ├ ML Jobs
   ├ LiteLLM       └ Ingresses
   └ FusionAuth
```

**Platform components:**

| Component    | What it does                                                                   |
| ------------ | ------------------------------------------------------------------------------ |
| Platform     | UI and backend-for-frontend (Next.js + API)                                    |
| Data Service | Core backend for data processing, orchestration, and LLM integration (FastAPI) |
| Data Fusion  | Data transformation and aggregation engine                                     |
| ML Jobs      | Headless compute engine for batch and algorithmic workloads                    |

**Infrastructure dependencies** (deployed by the operator into your cluster):

| Dependency | Purpose                                                        | Minimum version or status |
| ---------- | -------------------------------------------------------------- | ------------------------- |
| PostgreSQL | Relational database for Temporal, FusionAuth, and LiteLLM      | 15.0+                     |
| MongoDB    | Document database for application data                         | 8.0+                      |
| Redis      | Sync layer                                                     | 7.2.4+                    |
| RabbitMQ   | Message queue for async processing                             | 3.12.2+                   |
| Temporal   | Durable workflow execution                                     | 1.27.2+                   |
| LiteLLM    | LLM API proxy and management                                   | Deployed by default       |
| FusionAuth | Identity and access management (optional; see Identity below)  | Deployed on request       |

***

## Infrastructure requirements

### Kubernetes and tooling

| Requirement | Version or configuration                  |
| ----------- | ----------------------------------------- |
| Kubernetes  | 1.26+ (tested on AKS, EKS, GKE, and kind) |
| Helm        | 3.12+                                     |
| kubectl     | Configured with cluster-admin access      |
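A pre-flight check for the version minimums above can be sketched in a few lines of Python. This is an illustration only: the helper names (`parse_version`, `meets_minimum`) are hypothetical, and the version strings are sample values that you would replace with the output of your own tooling.

```python
import re

# Minimums taken from the table above.
MINIMUMS = {"kubernetes": (1, 26), "helm": (3, 12)}


def parse_version(text):
    """Extract (major, minor) from strings such as 'v1.28.3' or 'v3.12.0+g1234'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(tool, reported):
    """Return True when the reported version is at or above the documented minimum."""
    return parse_version(reported) >= MINIMUMS[tool]


if __name__ == "__main__":
    # Sample strings only; substitute what your cluster and Helm client report.
    for tool, reported in [("kubernetes", "v1.28.3"), ("helm", "v3.11.2")]:
        status = "ok" if meets_minimum(tool, reported) else "too old"
        print(f"{tool} {reported}: {status}")
```

Comparing `(major, minor)` tuples rather than raw strings avoids the usual pitfall where `"1.9"` sorts after `"1.26"`.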

### Cluster sizing

**Minimum** (supports datasets up to \~5 GB, 50 columns, \~1M rows):

| Resource       | Spec                          |
| -------------- | ----------------------------- |
| Nodes          | 3-4                           |
| CPU per node   | 8 vCPU                        |
| RAM per node   | 64 GB                         |
| Storage        | SSD-backed persistent volumes |
| GPU (optional) | 48 GB VRAM (local LLMs only)  |

**Recommended for production** (supports datasets up to \~10 GB, 100+ columns, \~20M rows):

| Resource       | Spec                          |
| -------------- | ----------------------------- |
| Nodes          | 4+                            |
| CPU per node   | 64 vCPU                       |
| RAM per node   | 128 GB                        |
| Storage        | SSD-backed persistent volumes |
| GPU (optional) | 96 GB VRAM (local LLMs only)  |

All nodes must be amd64 architecture.
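To decide which sizing table applies to a given workload, the thresholds above can be encoded as a small lookup. The sketch below is a rough guide, not an official sizing tool: the `Dataset` type and `sizing_tier` function are hypothetical, and the cut-offs are the approximate figures quoted in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    size_gb: float
    columns: int
    rows: int


def sizing_tier(ds):
    """Map a dataset profile to one of the sizing tiers documented above.

    The thresholds are approximate. Behavior beyond the recommended tier
    is not specified in this section, so it is reported separately.
    """
    if ds.size_gb <= 5 and ds.columns <= 50 and ds.rows <= 1_000_000:
        return "minimum"
    if ds.size_gb <= 10 and ds.rows <= 20_000_000:
        return "recommended"
    return "beyond documented tiers"


if __name__ == "__main__":
    print(sizing_tier(Dataset(size_gb=8, columns=120, rows=15_000_000)))
```

A workload that exceeds any single minimum-tier limit moves to the recommended tier, since the minimum figures are stated as upper bounds.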

***

## Cloud-specific prerequisites

The platform runs on Azure, AWS, and GCP. The core deployment is identical; only the items below differ.

| Requirement            | Azure                                     | AWS                               | GCP                                   |
| ---------------------- | ----------------------------------------- | --------------------------------- | ------------------------------------- |
| **Managed Kubernetes** | AKS                                       | EKS                               | GKE                                   |
| **Storage class**      | `managed-csi`                             | `gp3`                             | `standard` or `premium-rwo`           |
| **Object storage**     | Azure Blob Storage                        | S3                                | GCS                                   |
| **Ingress options**    | Azure Application Gateway (AGIC) or nginx | nginx or ALB Ingress Controller   | nginx or GCE Ingress                  |
| **Identity (OIDC)**    | Microsoft Entra ID                        | Okta, Auth0, or any OIDC provider | Google Workspace or any OIDC provider |

> **Note:** Azure Application Gateway requires additional ingress annotations and path configuration. The Deployment guide covers this in detail.

***

## What you need to provide

Hand this checklist to your infrastructure team. Everything must be in place before deployment begins.

### Cluster and access

* [ ] Kubernetes cluster provisioned (1.26+, amd64 nodes, SSD-backed storage)
* [ ] `kubectl` configured with cluster-admin access
* [ ] Helm 3.12+ installed

### Networking

* [ ] Ingress controller installed (nginx, Azure Application Gateway, or equivalent)
* [ ] Base domain with wildcard DNS record pointing to the ingress controller IP
* [ ] Subdomains planned for: platform, auth, and litellm (e.g., `platform.rootcause.example.com`, `auth.rootcause.example.com`, `litellm.rootcause.example.com`)
* [ ] TLS certificate — wildcard or per-subdomain
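The networking items above can be checked before deployment with a short script. This sketch derives the three planned hostnames from a base domain and reports any that a set of certificate names would not cover. The function names are hypothetical, and the matching follows the common rule that a wildcard covers exactly one left-most label.

```python
# Subdomains listed in the checklist above.
SERVICES = ("platform", "auth", "litellm")


def planned_hostnames(base_domain):
    """Return the hostnames the deployment will need under base_domain."""
    return [f"{service}.{base_domain}" for service in SERVICES]


def covered_by(cert_name, hostname):
    """Return True if a certificate name covers the hostname.

    A wildcard such as '*.example.com' matches a single left-most label,
    so it covers 'auth.example.com' but not 'auth.rootcause.example.com'.
    """
    cert_name = cert_name.lower()
    hostname = hostname.lower()
    if cert_name.startswith("*."):
        suffix = cert_name[1:]  # e.g. '.rootcause.example.com'
        if not hostname.endswith(suffix):
            return False
        label = hostname[: -len(suffix)]
        return label != "" and "." not in label
    return cert_name == hostname


def uncovered(base_domain, cert_names):
    """List planned hostnames that none of the certificate names cover."""
    return [
        host
        for host in planned_hostnames(base_domain)
        if not any(covered_by(name, host) for name in cert_names)
    ]


if __name__ == "__main__":
    print(uncovered("rootcause.example.com", ["*.example.com"]))
```

An empty list from `uncovered` means either a single wildcard or the set of per-subdomain certificates is sufficient for the planned hosts.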

### Storage

* [ ] Object storage bucket(s) created with credentials:
  * Datasets bucket/container
  * Digital twins bucket/container
  * ML models bucket/container
* [ ] Storage class available for persistent volumes (e.g., `managed-csi`, `gp3`)

### Registry access

* [ ] Container registry credentials from RootCause (GitLab deploy token with `read_registry` scope)

### Identity (choose one)

* [ ] **OIDC (recommended):** Issuer URL, client ID, client secret, and well-known configuration URL from your identity provider
* [ ] **SAML:** Metadata URL or XML, entity ID, and certificate from your identity provider
* [ ] **Managed FusionAuth:** No preparation needed — the operator provisions it automatically. Best for POCs or organizations without an existing IdP.
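For the OIDC option, the well-known configuration URL is usually derivable from the issuer URL: OpenID Connect Discovery places the document at `/.well-known/openid-configuration` under the issuer. The sketch below builds that URL. The function name is hypothetical, and some providers publish the document elsewhere, so confirm the result against your identity provider's documentation.

```python
from urllib.parse import urlsplit


def discovery_url(issuer):
    """Build the OIDC discovery URL for an issuer.

    Assumes the provider follows OpenID Connect Discovery, which requires
    an https issuer and appends the well-known path to it.
    """
    parts = urlsplit(issuer)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"issuer must be an absolute https URL: {issuer!r}")
    # Drop any trailing slash so the path is not doubled.
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


if __name__ == "__main__":
    # idp.example.com is a placeholder issuer.
    print(discovery_url("https://idp.example.com/"))
```

Fetching the resulting URL in a browser or with an HTTP client is a quick way to confirm the issuer value before entering it in the bootstrap wizard.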

### LLM access

* [ ] API keys for at least one LLM provider (OpenAI, Anthropic, Google AI Studio, AWS Bedrock, or Azure OpenAI). LiteLLM is included in every deployment and proxies requests to your LLM providers.

***

## What RootCause provides

* **RootCause Operator** Helm chart (OCI registry)
* **Platform and dependencies** Helm charts (OCI registry, pulled automatically by the operator)
* **Admin UI** for configuration, deployment, user management, and upgrades
* **Deployment support** via Slack, email, and scheduled check-ins

***

## Decision points

Answer these before starting the Deployment guide. They determine which sections you'll fill in during the bootstrap wizard.

### 1. Identity provider

| Option                          | When to choose it                                                                             |
| ------------------------------- | --------------------------------------------------------------------------------------------- |
| **External OIDC** (recommended) | Your organization has an existing identity provider (Entra ID, Okta, Auth0, Google Workspace) |
| **External SAML**               | Your IdP only supports SAML, or your security team requires it                                |
| **Managed FusionAuth**          | POC, eval, or no existing IdP. The operator deploys and configures FusionAuth automatically.  |

### 2. Ingress controller

| Option                        | When to choose it                                                                                                                                                     |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **nginx**                     | Most common. Works on all clouds. Simpler configuration.                                                                                                              |
| **Azure Application Gateway** | Required by your Azure networking team, or you need WAF/DDoS protection at the ingress layer. Requires additional annotations and path config — see Deployment guide. |

### 3. Object storage

| Cloud | Service            | What you need                                        |
| ----- | ------------------ | ---------------------------------------------------- |
| Azure | Azure Blob Storage | Storage account name, connection string, account key |
| AWS   | S3                 | Bucket names, access key, secret key, region         |
| GCP   | GCS                | Bucket names, service account JSON key               |
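The table above can double as a completeness check while gathering credentials. The sketch below is illustrative: the key names (`storage_account_name`, `bucket_names`, and so on) are hypothetical labels for the items in the table, not the field names used by the bootstrap wizard.

```python
# Items required per cloud, mirroring the table above.
# Key names are illustrative and not taken from the product.
REQUIRED_FIELDS = {
    "azure": ("storage_account_name", "connection_string", "account_key"),
    "aws": ("bucket_names", "access_key", "secret_key", "region"),
    "gcp": ("bucket_names", "service_account_json_key"),
}


def missing_storage_fields(cloud, provided):
    """Return the required items that are absent or empty for the given cloud."""
    try:
        required = REQUIRED_FIELDS[cloud.lower()]
    except KeyError:
        raise ValueError(f"unknown cloud: {cloud!r}") from None
    return [name for name in required if not provided.get(name)]


if __name__ == "__main__":
    gathered = {"bucket_names": ["datasets", "twins", "models"], "region": "eu-west-1"}
    print(missing_storage_fields("aws", gathered))
```

Keep real credentials out of scripts like this one; the check only needs to know whether each value has been collected.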

***

## Next step

Once your infrastructure team has checked off the list above, proceed to the **Deployment guide**.

