# Reviewing Model Quality

Causal discovery finds relationships; evaluation tells you how trustworthy the model actually is. The Evaluation panel — switched on from the right-hand panel in the Digital Twin Management view — shows overall predictive accuracy, per-variable performance, and side-by-side version comparisons.

Read this before you trust simulation results. A model that predicts poorly will also simulate poorly.

For context, see [Step 5: Build Digital Twin](/user-guide/creating-digital-twin.md) and [Exploring the Causal Model](/more-details/digital-twin/exploring-causal-model.md).

***

## Opening the Evaluation panel

The Overview panel's **Evaluation** section shows a single accuracy bar. Click the *View Evaluation* link on the right of that section to open the full view.

<figure><img src="/files/tZ0Aqx3cCftR84RlbJ1C" alt="The Overview panel of a Digital Twin, with an arrow pointing at the link in the Evaluation section that opens the full Evaluation view"><figcaption><p>The <em>View Evaluation</em> link on the right of the Evaluation section opens the full panel.</p></figcaption></figure>

The graph stays visible on the left; the right panel switches its content to metrics. A link at the top returns you to Overview.

<figure><img src="/files/NvbZB3vUzE7jFZKb05uA" alt="The Evaluation panel for a 19-node Churn model, showing the predictive accuracy banner, two bar charts, and a per-node metrics table"><figcaption><p>The full Evaluation panel against the DAG view.</p></figcaption></figure>

***

## Predictive accuracy at a glance

The banner at the top is the summary verdict:

* A single headline accuracy number (here, 72%).
* How many nodes scored high, medium, and low (3 above 80%, 12 between 50% and 80%, 1 below 50%).
* A note that deterministic variables are excluded from the aggregate score.
* A reminder that the variable you care most about may score very differently — check the per-node table.

If you recently removed redundant or derived columns, the headline number can drop because near-perfect-fit nodes are no longer in the average. That's not a degradation in quality; it's a change in what's being averaged.
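The averaging effect is easy to see with a few made-up numbers. This is a minimal sketch, assuming the headline is a plain mean of per-node scores; the variable names and values are hypothetical, and the product may aggregate differently.

```python
from statistics import mean

# Hypothetical per-node scores (illustrative only, not taken from a real twin).
scores = {
    "total_charges": 0.99,  # derived column with a near-perfect fit
    "tenure_months": 0.70,
    "contract_type": 0.65,
    "churn": 0.60,
}

# Assumption: the headline number is a simple mean of the per-node scores.
before = mean(scores.values())

# Remove the derived column. Every remaining node keeps exactly the same score.
remaining = {name: s for name, s in scores.items() if name != "total_charges"}
after = mean(remaining.values())

print(f"before: {before:.3f}  after: {after:.3f}")
```

No individual node got worse, yet the average fell because the easiest node left the calculation.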

***

## Best predicted variables

Two bar charts highlight the model's strengths:

* **Best Predicted Categories** — ranked by accuracy: the percentage of correct predictions. 100% means every prediction was right.
* **Best Predicted Numeric Variables** — ranked by R²: how much of each variable's variation the model can explain. 1.0 is perfect; 0 is no better than guessing the mean; negative means worse than that.
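To make the two ranking metrics concrete, here is a short sketch using their standard textbook definitions. The function names and sample values are illustrative assumptions; the panel's exact computation is not documented here.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares around the mean)."""
    avg = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Category example: three of four labels predicted correctly.
cat_score = accuracy(["yes", "no", "no", "yes"], ["yes", "no", "yes", "yes"])

# Numeric examples covering the three cases described above.
actual = [10.0, 20.0, 30.0, 40.0]
perfect = r_squared(actual, actual)                       # predictions equal actuals
mean_guess = r_squared(actual, [25.0, 25.0, 25.0, 25.0])  # always guess the mean
worse = r_squared(actual, [40.0, 30.0, 20.0, 10.0])       # worse than the mean

print(cat_score, perfect, mean_guess, worse)
```

The last case shows why R² can go negative: predictions that are further from the truth than the mean would be produce a residual error larger than the variable's own spread.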

***

## Per-variable metrics

The table at the bottom of the panel shows every variable and its performance. Click a row to expand a per-class accuracy breakdown.

| Variable type         | Metrics shown                                           |
| --------------------- | ------------------------------------------------------- |
| Boolean / Category    | Accuracy, Precision, Recall, F1, Weighted Accuracy, AUC |
| Numeric               | MSE, MAE, R², Log Likelihood                            |
| Numeric (time series) | MAPE per forecast horizon                               |
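For the numeric rows, the error metrics can be sketched with their standard definitions. This is an illustration only: the function names, sample values, and horizon keys are hypothetical, and Log Likelihood is omitted because it depends on the model's predicted distribution.

```python
def mse(actual, predicted):
    """Mean squared error: penalises large misses more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the same units as the variable."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]

# Hypothetical forecasts, keyed by how many steps ahead they were made.
# For simplicity the same actual values are reused for every horizon.
forecasts_by_horizon = {
    1: [110.0, 190.0, 400.0],
    2: [120.0, 160.0, 440.0],
}

mape_by_horizon = {h: mape(actual, f) for h, f in forecasts_by_horizon.items()}
print(mape_by_horizon)
print(mse(actual, forecasts_by_horizon[1]), mae(actual, forecasts_by_horizon[1]))
```

In this made-up data the error grows with the horizon, which is the pattern a per-horizon MAPE breakdown is meant to expose.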

Reading the classification metrics:

* **Accuracy** — overall correctness.
* **Precision** — when the model says "yes", how often is it right?
* **Recall** — of all true "yes" cases, how many did the model find?
* **F1** — the harmonic mean of precision and recall.
* **AUC** — discrimination ability. 1.0 is perfect; 0.5 is chance.

For rare-event variables, precision and recall are usually more informative than accuracy alone.
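A small worked example shows why. The sketch below uses the standard definitions on hypothetical data: 100 cases, of which only 5 are true "yes" cases. Reporting a metric as 0.0 when its denominator is zero is a convention chosen for this example, not a documented product behaviour.

```python
def classification_metrics(actual, predicted):
    """Return accuracy, precision, recall, and F1 for boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)              # true positives
    fp = sum((not a) and p for a, p in pairs)        # false positives
    fn = sum(a and (not p) for a, p in pairs)        # false negatives
    tn = sum((not a) and (not p) for a, p in pairs)  # true negatives

    # Convention for this sketch: a zero denominator yields 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical rare event: 5 true cases out of 100.
actual = [True] * 5 + [False] * 95
# The model flags two cases: one real event and one false alarm.
predicted = [True] + [False] * 4 + [True] + [False] * 94

metrics = classification_metrics(actual, predicted)
print(metrics)
```

Accuracy here is 95% even though the model found only one of the five real events (recall 0.2) and was right in only half of its alerts (precision 0.5).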

***

## Comparing versions

The **Version Comparison** dropdown selects one or more versions to chart side by side. Use it to confirm that a configuration change improved the model, and to spot versions that improved some metrics while quietly degrading others.

***

## When a variable scores poorly

A low metric is information, not a verdict. Common causes:

* **Missing causes.** The variable's true drivers aren't in the model.
* **Data quality.** Noise, errors, or too many missing values.
* **Wrong model type.** A time-dependent variable in a static twin.
* **Too little data.** Not enough examples to learn the pattern.

If you can add the missing drivers, do so in the Config panel and retrain. If the variable is genuinely hard to predict, that's a constraint to remember for any simulation that touches it.

***

## Other pages in Working with a Digital Twin

* [Exploring the Causal Model](/more-details/digital-twin/exploring-causal-model.md) — graph layouts and variable details.
* [Inspecting Causal Relationships](/more-details/digital-twin/causal-relationships.md) — individual edges and their statistics.
* [Configuration](/more-details/digital-twin/configuration.md) — model settings, included variables, constraints.
* **Version History** *(coming soon)* — multiple versions of the same twin.

For a general introduction, see the [Digital Twin overview](/more-details/digital-twin.md).

