Build Ontology

When your data finishes importing, RootCause automatically builds an ontology. It scans every dataset, identifies each column's type and role, and maps connections between columns that represent the same concept across different datasets. You do not need to create it — it is ready by the time your data is processed.


What the ontology does

The ontology is the semantic layer that the later steps depend on: automatic joins, temporal ordering, and the choice of variables for the causal model. When "customer_id" in your sales data and "cust_id" in your support data are recognised as the same concept, RootCause can join those datasets automatically. When a column is classified as a Time concept, causal discovery respects temporal ordering. When a column is classified as an Identifier, it is used as a join key rather than treated as a variable in the causal model.
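To make the join behaviour concrete, here is a minimal sketch in plain Python. It is not RootCause's implementation: the CONCEPT_OF mapping, the sample rows, and the helper functions shared_key and inner_join are all hypothetical, and the sketch only assumes that two differently named columns have already been mapped to the same concept.

```python
# Hypothetical column-to-concept mapping; RootCause builds its own internally.
CONCEPT_OF = {
    ("sales", "customer_id"): "Customer Id",
    ("support", "cust_id"): "Customer Id",
}

# Made-up sample rows, for illustration only.
sales = [
    {"customer_id": 1, "revenue": 120.0},
    {"customer_id": 2, "revenue": 80.0},
]
support = [
    {"cust_id": 1, "tickets": 3},
    {"cust_id": 3, "tickets": 1},
]


def shared_key(left_name, right_name):
    """Return (left_col, right_col) for the first concept both datasets share, or None."""
    left = {c: col for (ds, col), c in CONCEPT_OF.items() if ds == left_name}
    right = {c: col for (ds, col), c in CONCEPT_OF.items() if ds == right_name}
    for concept, left_col in left.items():
        if concept in right:
            return left_col, right[concept]
    return None


def inner_join(left_rows, right_rows, left_col, right_col):
    """Inner-join two lists of dicts whose key columns have different names."""
    index = {}
    for row in right_rows:
        index.setdefault(row[right_col], []).append(row)
    joined = []
    for row in left_rows:
        for match in index.get(row[left_col], []):
            merged = dict(row)
            # Drop the right-hand key column; the left-hand one already carries the value.
            merged.update({k: v for k, v in match.items() if k != right_col})
            joined.append(merged)
    return joined


left_col, right_col = shared_key("sales", "support")
joined = inner_join(sales, support, left_col, right_col)
print(joined)
```

Only customer 1 appears in both sample datasets, so the join yields a single combined row. The point of the sketch is that the join key is looked up by concept, not by column name.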


Reviewing your ontology

From Data Management, click the Ontology tab. The left pane shows a network graph: datasets appear as coloured clusters, concepts as labelled nodes, and edges between them show which concepts are shared across sources.

Ontology network view showing two datasets connected through a shared Customer Id concept
The ontology network. Each coloured cluster is a dataset; the node where they meet is a shared identifier concept — in this case, Customer Id.

The right panel — Ontology Home — shows three things:

  • Recommended Data Views: joins the platform has detected based on shared identifier concepts. This is the most important section of the panel, because it is the bridge between your ontology and Step 3.

  • Data Views: views already created for this workspace.

  • Query your data: ad-hoc query entry points for exploring your data directly.
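
As a rough illustration of how join recommendations could follow from shared identifier concepts, the sketch below lists every pair of datasets that has an Identifier concept in common. The ontology dictionary, its layout, and the recommend_joins function are assumptions made for this example; the platform's actual detection logic is not described here.

```python
from itertools import combinations

# Hypothetical ontology snapshot: dataset -> {column: (concept, classification)}.
ontology = {
    "sales": {
        "customer_id": ("Customer Id", "Identifier"),
        "order_date": ("Order Date", "Time"),
        "revenue": ("Revenue", "Entity"),
    },
    "support": {
        "cust_id": ("Customer Id", "Identifier"),
        "tickets": ("Tickets", "Entity"),
    },
    "products": {
        "sku": ("Product SKU", "Identifier"),
    },
}


def identifier_columns(columns):
    """Map each Identifier concept in a dataset to the column that carries it."""
    return {
        concept: column
        for column, (concept, classification) in columns.items()
        if classification == "Identifier"
    }


def recommend_joins(ontology):
    """List dataset pairs that share at least one Identifier concept."""
    recommendations = []
    for left, right in combinations(sorted(ontology), 2):
        left_ids = identifier_columns(ontology[left])
        right_ids = identifier_columns(ontology[right])
        for concept in sorted(left_ids.keys() & right_ids.keys()):
            recommendations.append(
                (left, left_ids[concept], right, right_ids[concept], concept)
            )
    return recommendations


recommendations = recommend_joins(ontology)
for left, left_col, right, right_col, concept in recommendations:
    print(f"{left}.{left_col} <-> {right}.{right_col} (via {concept})")
```

In this made-up snapshot only sales and support share an identifier, so one join is recommended. The products dataset has an identifier of its own but nothing to pair it with.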


Concept classifications

Every concept is assigned one of four classifications:

Identifier: unique keys that link records across datasets, such as Customer ID, Product SKU, and Order Number. These serve as join keys and are excluded from causal analysis.

Time: temporal columns, such as Order Date, Timestamp, and Created At. These define event ordering for time-series analysis and temporal Digital Twins.

Location: geographic columns, such as City, Region, and Postal Code. These enable location-based filtering and analysis.

Entity: everything else, such as Revenue, Quantity, Churn, and Monthly Charges. These are the variables that participate in causal relationships.
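
The four classifications can be pictured with a toy classifier that looks only at column names. This is a deliberately naive sketch, not how RootCause classifies concepts (the first section says it also considers each column's type and role). The RULES patterns and the classify function are hypothetical.

```python
import re

# Hypothetical name-based rules, checked in order; the first match wins.
RULES = [
    ("Identifier", re.compile(r"(^|_)(id|sku|number|key)$")),
    ("Time", re.compile(r"(^|_)(date|timestamp|time|at)$")),
    ("Location", re.compile(r"(^|_)(city|region|country|postal_code)$")),
]


def classify(column_name):
    """Guess a classification from the column name alone; fall back to Entity."""
    # Normalise "Customer ID" or "customer-id" to "customer_id".
    name = re.sub(r"[^a-z0-9]+", "_", column_name.strip().lower()).strip("_")
    for classification, pattern in RULES:
        if pattern.search(name):
            return classification
    return "Entity"


for column in ["Customer ID", "Order Date", "Postal Code", "Monthly Charges"]:
    print(column, "->", classify(column))
```

A name-only heuristic like this misfires easily (a column called "Flat" ends in "at" but is not a time field, and the underscore boundary in the pattern is what keeps it from matching). That is one reason to review the detected classifications in the Card view.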

You can browse concepts grouped by dataset or by classification type using the Card view. Switch to it with the toggle at the top of the Ontology tab.

Ontology card view showing concepts grouped by dataset
Card view groups concepts by dataset. Each card shows the concept name, its data type, and its classification.

Ontology card view scrolled to show Entity and Identifier concept groups
Scrolling down shows concepts grouped by classification, which is useful for checking that identifiers and time fields have been correctly detected.

Refining your ontology

For most projects the auto-generated ontology is accurate enough to proceed. If a concept has been misclassified, or if two columns that represent the same thing were created as separate concepts, you can correct this at any time — reclassify a concept, merge two into one, or split one that incorrectly combines two different things. A detailed guide to ontology refinement will be added here in a future update.
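
Until that guide is available, the three corrections can be pictured as operations on a simple data structure. The sketch below is purely conceptual: the concepts dictionary and the reclassify, merge, and split functions are hypothetical names for illustration, not a RootCause API, and in the product these changes are made on the Ontology tab.

```python
VALID = {"Identifier", "Time", "Location", "Entity"}

# Hypothetical in-memory ontology: concept name -> classification and member columns.
concepts = {
    "Customer Id": {"classification": "Identifier", "columns": {"sales.customer_id"}},
    "Cust Id": {"classification": "Entity", "columns": {"support.cust_id"}},
    "Date": {"classification": "Time", "columns": {"sales.order_date", "support.closed_date"}},
}


def reclassify(concepts, name, classification):
    """Change a concept's classification to one of the four valid values."""
    if classification not in VALID:
        raise ValueError(f"unknown classification: {classification}")
    concepts[name]["classification"] = classification


def merge(concepts, keep, absorb):
    """Fold the columns of `absorb` into `keep` and drop `absorb`."""
    absorbed = concepts.pop(absorb)
    concepts[keep]["columns"] |= absorbed["columns"]


def split(concepts, name, new_name, columns):
    """Move `columns` out of `name` into a new concept with the same classification."""
    source = concepts[name]
    moved = set(columns)
    if not moved <= source["columns"]:
        raise ValueError("columns to split must belong to the source concept")
    source["columns"] -= moved
    concepts[new_name] = {"classification": source["classification"], "columns": moved}


# Two columns for the same thing were created as separate concepts: merge them.
merge(concepts, "Customer Id", "Cust Id")
# One concept incorrectly combines two different dates: split it.
split(concepts, "Date", "Closed Date", {"support.closed_date"})
# Reclassify is shown for completeness; "Closed Date" already inherited Time.
reclassify(concepts, "Closed Date", "Time")
print(concepts)
```

In this sketch, merging keeps the classification of the concept that survives, so the misclassified "Cust Id" becomes part of an Identifier without a separate reclassify step.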


Next step

With the ontology in place, RootCause knows how your datasets connect. Step 3: Build 360 Data Table walks through creating the single analysis-ready dataset that the causal engine trains on.
