# Digital Twin & Simulations

Once a causal graph has been discovered and validated, RootCause.ai creates a Digital Twin: a live, data-driven model of your system. The twin acts as a sandbox where you can test interventions, counterfactuals, and optimizations and evaluate their impact before applying them to the real system.

***

#### Definition & Purpose

The Digital Twin is the execution layer of RootCause.ai. It translates causal structure into decision support by:

* Simulating the effects of interventions in a controlled environment
* Providing explainable reasoning behind KPI changes
* Balancing trade-offs across multiple outcomes

This makes it possible to move beyond descriptive analytics and into prescriptive, causally sound decision-making.

***

#### How It Works

1. Baseline World – The twin samples outcomes from the learned causal model with no intervention applied, giving a reference distribution.
2. Intervention – One or more variables are modified by setting hard values, applying relative changes, or targeting specific segments.
3. Propagation – Effects flow through the causal graph, updating downstream nodes according to their dependencies.
4. Simulation Runs – Monte Carlo sampling produces distributions of possible futures.
5. Comparison – Baseline and intervention scenarios are compared, with uncertainty intervals provided.
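
The five steps above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not RootCause.ai's implementation or API: the three-variable graph (`price -> demand -> revenue`), its coefficients, and the function names `sample_world`, `simulate`, and `interval` are all hypothetical.

```python
import random
import statistics

def sample_world(rng, do=None):
    """Draw one sample from a toy structural causal model.

    Hypothetical graph: price -> demand -> revenue (price also feeds revenue).
    `do` maps variable names to fixed values, i.e. a hard intervention.
    """
    do = do or {}
    # An intervened variable ignores its usual causes and takes the fixed value.
    price = do.get("price", rng.gauss(10.0, 1.0))
    # Propagation: demand depends on price plus its own noise term.
    demand = do.get("demand", 200.0 - 8.0 * price + rng.gauss(0.0, 5.0))
    revenue = price * demand
    return {"price": price, "demand": demand, "revenue": revenue}

def simulate(n_runs, do=None, seed=0):
    """Monte Carlo runs: return the sampled revenue for each simulated world."""
    rng = random.Random(seed)
    return [sample_world(rng, do)["revenue"] for _ in range(n_runs)]

def interval(samples, lo=0.05, hi=0.95):
    """Empirical uncertainty interval from sorted samples."""
    s = sorted(samples)
    return s[int(lo * (len(s) - 1))], s[int(hi * (len(s) - 1))]

# Step 1: baseline world. Steps 2-4: intervene on price and re-sample.
baseline = simulate(5000)
intervened = simulate(5000, do={"price": 12.0}, seed=1)

# Step 5: compare the two scenarios, with uncertainty intervals.
base_mean = statistics.fmean(baseline)
int_mean = statistics.fmean(intervened)
print("baseline mean revenue:", round(base_mean, 1), interval(baseline))
print("intervened mean revenue:", round(int_mean, 1), interval(intervened))
print("estimated effect:", round(int_mean - base_mean, 1))
```

In this toy setup the intervention is a hard value (`price = 12.0`). A relative change or a segment-specific change would be expressed by transforming the sampled value instead of replacing it.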

***

#### Advanced Capabilities

* Bayesian Foundations – The twin runs on a causal Bayesian model with posterior sampling, additive regression trees, and ensembles for time-series data.
* Ontology Integration – Variable dependencies are inferred from the ontology, ensuring simulations respect real-world structure.
* Scalability – Optimized search and independence testing avoid quadratic blowup, allowing analysis of high-dimensional datasets.
* Segment Analysis – Simulations can run across sub-populations (regions, customer cohorts, product lines) to uncover heterogeneous effects.
* Optimization – The system can recommend levers that maximize or minimize a target while accounting for secondary impacts.
* Natural Language Interface – Simulations can be configured through a structured UI or with plain-language queries.
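
As an illustration of the Optimization capability, the sketch below searches a grid of lever values for the one that maximizes a target KPI while keeping a secondary KPI above a floor. It assumes a hypothetical one-lever model (`price` driving `demand` and `revenue`) and a plain grid search. The names `expected_kpis` and `best_lever` are illustrative, and the search strategy RootCause.ai actually uses is not described here.

```python
import random
import statistics

def expected_kpis(price, n_runs=2000, seed=0):
    """Estimate mean revenue and mean demand at a given price by simulation.

    Hypothetical model: demand = 200 - 8 * price + noise, revenue = price * demand.
    """
    rng = random.Random(seed)
    demand = [200.0 - 8.0 * price + rng.gauss(0.0, 5.0) for _ in range(n_runs)]
    revenue = [price * d for d in demand]
    return statistics.fmean(revenue), statistics.fmean(demand)

def best_lever(candidates, min_demand):
    """Return the candidate price with the highest expected revenue among
    those whose expected demand stays at or above `min_demand`."""
    feasible = []
    for price in candidates:
        revenue, demand = expected_kpis(price)
        if demand >= min_demand:  # secondary-impact constraint
            feasible.append((revenue, price))
    return max(feasible)[1] if feasible else None

candidates = [8.0 + 0.5 * i for i in range(13)]  # prices 8.0, 8.5, ..., 14.0

unconstrained = best_lever(candidates, min_demand=0.0)
constrained = best_lever(candidates, min_demand=110.0)
print("best price ignoring demand:", unconstrained)
print("best price keeping demand >= 110:", constrained)
```

The constrained search settles on a lower price than the unconstrained one, which is the kind of trade-off the recommendation has to surface when a lever improves the target but degrades another outcome.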

***

#### Deployment & Performance

* Enterprise-Ready – Runs self-hosted for sensitive environments, with optional cloud execution for evaluation.
* High-Volume Data – Handles tens of gigabytes of multivariate time-series data within hours.
* Efficiency – Designed to operate in high-dimensional feature spaces where traditional causal inference becomes infeasible.

***

#### Types of Simulations

* Interventions – Test the effect of changing a driver variable.
* Counterfactuals – Explore alternative outcomes for past events.
* Explanations – Identify which drivers most influenced a KPI shift.
* Optimizations – Automatically search for the best intervention to reach a desired outcome.
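
A counterfactual differs from an intervention in that it conditions on what actually happened. The sketch below uses the standard abduction, action, prediction recipe on a hypothetical one-equation model. The equation, its coefficients, and the names `demand_model` and `counterfactual_demand` are assumptions for illustration, not the product's API.

```python
def demand_model(price, noise):
    """Hypothetical structural equation: demand as a function of price and noise."""
    return 200.0 - 8.0 * price + noise

def counterfactual_demand(observed_price, observed_demand, new_price):
    """Answer: what would demand have been for this past event at `new_price`?"""
    # Abduction: recover the noise term that explains the observed event.
    noise = observed_demand - demand_model(observed_price, 0.0)
    # Action + prediction: change the price, keep the recovered noise fixed.
    return demand_model(new_price, noise)

# Past event: price was 10.0 and demand came in at 126.0.
cf = counterfactual_demand(observed_price=10.0, observed_demand=126.0, new_price=9.0)
print("counterfactual demand at price 9.0:", cf)
```

Because the recovered noise is held fixed, the result describes this specific past event rather than the population average an intervention simulation would report.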
