Ontology Concepts
The ontology is your organization's semantic layer: a map of what your data actually means, independent of how it's stored. When "customer_id" in your sales database and "cust_id" in your support system refer to the same concept, the ontology makes that explicit.
This matters for causal discovery. When RootCause.ai builds a Digital Twin, it needs to know that columns like these represent the same thing. Without an ontology, you would have to specify joins manually every time. With one, the platform automatically knows how your datasets connect.
Think of it as the bridge between the physical reality of your data (column names, table structures, database schemas) and the business reality of what that data represents (customers, transactions, products, events).
(SCREENSHOT: Ontology concepts page showing network view with datasets and concepts interconnected)
Why Ontology Matters
Unified Data Understanding
Data rarely lives in one place. Customer information might span your CRM, billing system, support tickets, and web analytics. The ontology unifies these sources—one concept can map to fields across dozens of datasets.
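One way to picture "one concept, many fields" is a small data structure. The sketch below is illustrative only: the class names, dataset names, and field names are assumptions made for this example and are not RootCause.ai's internal model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldMapping:
    """One physical column that a concept maps to (hypothetical structure)."""
    dataset: str
    field_name: str


@dataclass
class Concept:
    """A semantic concept and the physical fields that carry it."""
    name: str
    classification: str = "Entity"
    mappings: list[FieldMapping] = field(default_factory=list)

    def datasets(self) -> set[str]:
        # Every dataset that contains this concept under some field name.
        return {m.dataset for m in self.mappings}


customer = Concept(
    name="Customer ID",
    classification="Identifier",
    mappings=[
        FieldMapping("sales", "customer_id"),
        FieldMapping("support", "cust_id"),
    ],
)

print(sorted(customer.datasets()))  # ['sales', 'support']
```

The point of the structure is that the concept name stays stable while the list of mappings grows as new sources are connected.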
Automatic Join Discovery
When you create a Data View or build a Digital Twin, RootCause.ai uses the ontology to automatically discover how datasets connect. Select a customer identifier concept, and the platform finds all datasets that share it.
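Conceptually, join discovery amounts to looking up which datasets share a concept and pairing their field names. The following is a minimal sketch of that idea, not the platform's actual algorithm. The `ontology` dictionary, the dataset names, and the `join_keys` helper are all hypothetical.

```python
from itertools import combinations

# Hypothetical ontology: concept name -> {dataset: field name in that dataset}
ontology = {
    "Customer ID": {
        "crm": "customer_id",
        "billing": "cust_id",
        "support": "cust_identifier",
    },
    "Order Date": {"billing": "order_dt"},
}


def join_keys(concept):
    """Return (left_dataset, left_field, right_dataset, right_field) for
    every pair of datasets that share the given concept."""
    sources = sorted(ontology.get(concept, {}).items())
    return [
        (left_ds, left_field, right_ds, right_field)
        for (left_ds, left_field), (right_ds, right_field) in combinations(sources, 2)
    ]


for pair in join_keys("Customer ID"):
    print(pair)
```

A concept that lives in only one dataset, such as "Order Date" here, yields no join pairs.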
Consistent Semantics
Is "revenue" gross or net? Does "created_date" mean when the record was created or when the event occurred? The ontology provides a single source of truth for what each concept means in your organization.
Causal Model Quality
Causal discovery algorithms work better when they understand the semantic structure of your data. Marking a column as a "time" concept tells the model to respect temporal ordering. Marking something as an "identifier" prevents it from being treated as a causal variable.
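To make the effect of classifications concrete, here is a hedged sketch of how columns might be sorted into roles before modeling. The role names and the rule that Location and Entity columns both remain candidate variables are assumptions for illustration. The platform's real preprocessing may differ.

```python
from enum import Enum


class Classification(Enum):
    IDENTIFIER = "Identifier"
    TIME = "Time"
    LOCATION = "Location"
    ENTITY = "Entity"


# Hypothetical columns and their classifications.
columns = {
    "customer_id": Classification.IDENTIFIER,
    "order_date": Classification.TIME,
    "region": Classification.LOCATION,
    "price": Classification.ENTITY,
    "quantity": Classification.ENTITY,
}


def plan_columns(columns):
    """Split columns into join keys, ordering columns, and candidate variables."""
    roles = {"join_keys": [], "ordering": [], "candidates": []}
    for name, classification in columns.items():
        if classification is Classification.IDENTIFIER:
            roles["join_keys"].append(name)   # used for joins, not modeled
        elif classification is Classification.TIME:
            roles["ordering"].append(name)    # defines temporal order
        else:
            roles["candidates"].append(name)  # assumption: everything else is modeled
    return roles


roles = plan_columns(columns)
print(roles)
```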
Concept Classifications
Concepts can be tagged with classifications that carry semantic meaning:
Identifier
Unique keys that link records across datasets, such as Customer ID, Product SKU, Transaction ID, and Order Number. These serve as join keys and are typically excluded from causal analysis (you don't model "customer_id causes revenue"; the customer causes revenue, and the ID only labels the customer).
Time
Temporal attributes that enable time-based analysis, such as Order Date, Timestamp, Created At, and Event Time. They are essential for temporal Digital Twins and time-series operations, and they define the ordering of events.
Location
Geographic or spatial attributes, such as City, Region, Country, Postal Code, and Latitude/Longitude. They enable location-based filtering and geographic analysis.
Entity (Default)
General business concepts without a special classification, such as Revenue, Quantity, Status, Price, and Rating. These are the variables typically involved in causal relationships.
(SCREENSHOT: Classification dropdown showing Identifier, Time, Location options)
View Modes
The ontology can be explored in three views:
Network View
Visualizes relationships between concepts and datasets as an interactive graph. Concepts appear as nodes, with edges showing which datasets contain them. Datasets sharing concepts are visually connected.
This view is best for:
Understanding how datasets relate to each other
Finding concepts that bridge multiple data sources
Discovering the "shape" of your data landscape
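The graph behind a view like this can be thought of as a set of concept-to-dataset edges. The sketch below builds that structure from a flat list of mappings and picks out the concepts that bridge more than one dataset. The mapping data is made up for the example.

```python
from collections import defaultdict

# Hypothetical (concept, dataset) edges, one per field mapping.
mappings = [
    ("Customer ID", "crm"),
    ("Customer ID", "billing"),
    ("Customer ID", "support"),
    ("Revenue", "billing"),
    ("Ticket Status", "support"),
]

# Concept node -> set of dataset nodes it connects to.
concept_to_datasets = defaultdict(set)
for concept, dataset in mappings:
    concept_to_datasets[concept].add(dataset)

# A bridging concept appears in more than one dataset.
bridges = {
    concept: sorted(datasets)
    for concept, datasets in concept_to_datasets.items()
    if len(datasets) > 1
}

print(bridges)  # {'Customer ID': ['billing', 'crm', 'support']}
```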
Table View
A traditional list of concepts with sortable columns: name, classification, data type, number of sources, creation date. Supports filtering and search.
This view is best for:
Quickly finding a specific concept
Bulk operations
Detailed metadata inspection
Card View
Concepts organized by dataset or classification in a card layout. Each card shows the concept name, type, and connected sources.
This view is best for:
Browsing concepts by category
Visual exploration
Understanding what data each dataset contributes
(SCREENSHOT: Toggle between Network, Table, and Card views)
Network Anchor Modes
In Network View, you can change how the graph is organized using anchor modes:
Dataset Anchor
Groups concepts around their source datasets. Each dataset appears as a hub, with its concepts radiating outward. Concepts shared between datasets create bridges.
Use this when you want to understand what data each source contributes.
Concept Type Anchor
Groups concepts by their classification (Identifier, Time, Location, Entity). See all your identifier concepts in one cluster, all time concepts in another.
Use this when you want to understand the semantic structure of your ontology.
Identifier Anchor
Centers the view around identifier concepts. Shows how entities (customers, products, orders) connect to their attributes across datasets.
Use this when you want to understand the key entities in your data and what you know about them.
Time Anchor
Displays a timeline view showing when data is available for each concept. Useful for understanding temporal coverage and identifying gaps.
Use this when planning time-series analysis or checking data availability.
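Checking temporal coverage comes down to sorting the observed timestamps and looking for intervals longer than expected. This is a small sketch of that check, assuming daily data. The `coverage_gaps` helper and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def coverage_gaps(dates, max_gap=timedelta(days=1)):
    """Return (last_seen, next_seen) pairs where consecutive observations
    are further apart than max_gap."""
    ordered = sorted(set(dates))
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap
    ]


observed = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5), date(2024, 1, 6)]
gaps = coverage_gaps(observed)
print(gaps)  # one gap: between 2024-01-02 and 2024-01-05
```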
(SCREENSHOT: Network view with Dataset anchor mode showing datasets as hubs)
Concept Details
Click any concept to open the detail panel:
Basic Information
Name: The semantic name for this concept
Description: What this concept represents (editable)
Classification: Identifier, Time, Location, or Entity
Data Type: String, Number, DateTime, Boolean, Category
Schema Field Name: The canonical field name used in Data Views
Metadata
Additional properties depending on the concept type:
Categories: For categorical concepts, the list of valid values
Min/Max Values: For numeric concepts, the expected range
Is Unique: Whether values should be unique (useful for identifiers)
Is Monotonically Increasing/Decreasing: For time-ordered values
Location Type: For location concepts—country, city, postal code, coordinates
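Metadata like this can double as a set of expectations to check incoming values against. The sketch below shows one way such checks could look. The function name, parameters, and messages are assumptions for illustration and do not describe a RootCause.ai feature.

```python
def check_values(values, *, is_unique=False, min_value=None, max_value=None,
                 categories=None, increasing=False):
    """Compare values against concept metadata and list any violations."""
    problems = []
    if is_unique and len(set(values)) != len(values):
        problems.append("duplicate values")
    if min_value is not None and any(v < min_value for v in values):
        problems.append("value below minimum")
    if max_value is not None and any(v > max_value for v in values):
        problems.append("value above maximum")
    if categories is not None and any(v not in categories for v in values):
        problems.append("unknown category")
    # Non-strict check: equal neighbours are allowed, decreases are not.
    if increasing and any(later < earlier for earlier, later in zip(values, values[1:])):
        problems.append("not monotonically increasing")
    return problems


print(check_values([101, 102, 102], is_unique=True))       # ['duplicate values']
print(check_values([5, 42], min_value=0, max_value=10))    # ['value above maximum']
print(check_values(["open", "closed"], categories={"open", "closed"}))  # []
```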
Data Sources
A table showing every dataset that contains this concept, with:
Dataset name
Field name in that dataset
Actions (Split)
Data Preview
Sample rows from each connected dataset, showing actual values for this concept.
(SCREENSHOT: Concept detail panel showing metadata, sources, and data preview)
Merge and Split Operations
Merging Concepts
When two concepts represent the same thing but were created separately, merge them:
Open the concept you want to keep
Click "Merge"
Select the concept to merge into this one
Confirm
All field mappings from the concept you selected (the source) transfer to the concept you opened (the target). The source concept is then deleted.
Example: You have "customer_id" from your sales data and "cust_identifier" from support tickets. After merging, one concept maps to both fields.
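In data-structure terms, a merge moves every mapping from the source concept to the target and then removes the source. The following is a minimal sketch using a plain dictionary. The `merge_concepts` function and the dataset names are hypothetical, and the real operation happens in the UI.

```python
# Hypothetical ontology: concept name -> list of (dataset, field) mappings.
ontology = {
    "customer_id": [("sales", "customer_id")],
    "cust_identifier": [("support_tickets", "cust_identifier")],
}


def merge_concepts(ontology, target, source):
    """Transfer all mappings from source to target, then delete source."""
    if target == source:
        raise ValueError("cannot merge a concept into itself")
    for mapping in ontology[source]:
        if mapping not in ontology[target]:
            ontology[target].append(mapping)
    del ontology[source]


merge_concepts(ontology, target="customer_id", source="cust_identifier")
print(ontology)
# {'customer_id': [('sales', 'customer_id'), ('support_tickets', 'cust_identifier')]}
```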
Splitting Concepts
When a concept incorrectly combines fields that mean different things, split them:
Open the concept
In the Data Sources table, find the mapping that should be separate
Click "Split" on that row
A new concept is created with just that mapping
Example: An "id" concept accidentally maps to both customer IDs and product IDs. Split the product mapping into its own concept.
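A split is the reverse of a merge: one mapping leaves the original concept and becomes the only mapping of a new one. Here is a small sketch under the same dictionary representation. The `split_mapping` function and the name "Product ID" are assumptions for this example.

```python
# Hypothetical ontology: concept name -> list of (dataset, field) mappings.
ontology = {
    "id": [("customers", "id"), ("products", "id")],
}


def split_mapping(ontology, concept, mapping, new_concept):
    """Move a single mapping out of concept into a newly created concept."""
    if new_concept in ontology:
        raise ValueError(f"concept {new_concept!r} already exists")
    ontology[concept].remove(mapping)  # raises ValueError if the mapping is absent
    ontology[new_concept] = [mapping]


split_mapping(ontology, "id", ("products", "id"), new_concept="Product ID")
print(ontology)
# {'id': [('customers', 'id')], 'Product ID': [('products', 'id')]}
```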
(SCREENSHOT: Merge dialog showing concept selection and preview)
Data View Composer
The ontology includes a visual composer for creating Data Views:
Click "New View" in the sidebar
The graph enters composer mode
Click concepts to add them as anchors
The system automatically discovers all connected concepts transitively
Configure join behavior for each anchor
Preview the resulting Data View
Save
Anchor Concepts
When you select a concept as an anchor, you're saying "I want data about this thing." The composer then finds all datasets containing that concept and all other concepts that can be joined through identifier relationships.
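Transitive discovery of this kind can be sketched as a breadth-first search that hops from dataset to dataset through shared identifier concepts. The code below is an illustration under assumed data, not the composer's actual implementation. The datasets, concepts, and the `reachable` function are hypothetical.

```python
from collections import deque

# Hypothetical datasets and the concepts each one contains.
dataset_concepts = {
    "orders": {"Customer ID", "Order ID", "Order Date"},
    "order_items": {"Order ID", "Product SKU", "Quantity"},
    "products": {"Product SKU", "Price"},
    "weather": {"Station ID", "Temperature"},
}
identifiers = {"Customer ID", "Order ID", "Product SKU", "Station ID"}


def reachable(anchor):
    """Return (datasets, concepts) reachable from the anchor concept by
    following identifier concepts shared between datasets."""
    seen_datasets = set()
    seen_concepts = {anchor}
    queue = deque([anchor])
    while queue:
        concept = queue.popleft()
        for dataset, concepts in dataset_concepts.items():
            if concept in concepts and dataset not in seen_datasets:
                seen_datasets.add(dataset)
                for other in concepts:
                    if other not in seen_concepts:
                        seen_concepts.add(other)
                        # Only identifiers can act as join keys to further datasets.
                        if other in identifiers:
                            queue.append(other)
    return seen_datasets, seen_concepts


datasets, concepts = reachable("Customer ID")
print(sorted(datasets))  # ['order_items', 'orders', 'products']
print(sorted(concepts))
```

In this example the "weather" dataset is never reached, because it shares no identifier concept with the others.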
Join Configuration
For each anchor, you can configure:
Which fields to include
Join type (inner, left, right)
Aggregation if needed
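The join type decides what happens to rows without a match on the other side. The sketch below implements inner, left, and right joins over lists of dictionaries to show the difference. It assumes both sides use the same key name, and the `join_rows` helper and sample rows are made up for this example.

```python
def join_rows(left, right, key, how="inner"):
    """Join two lists of row dicts on a shared key column."""
    if how not in {"inner", "left", "right"}:
        raise ValueError(f"unsupported join type: {how}")
    if how == "right":
        # A right join is a left join with the two sides swapped.
        left, right, how = right, left, "left"

    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)
    right_columns = {col for row in right for col in row} - {key}

    joined = []
    for row in left:
        matches = right_index.get(row[key], [])
        for match in matches:
            joined.append({**row, **match})
        if not matches and how == "left":
            # Keep the unmatched row and fill the other side's columns with None.
            joined.append({**row, **{col: None for col in right_columns}})
    return joined


customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Grace"}]
tickets = [{"customer_id": 1, "ticket": "T-9"}]

print(join_rows(customers, tickets, "customer_id", how="inner"))
print(join_rows(customers, tickets, "customer_id", how="left"))
```

An inner join drops the customer with no tickets, while a left join keeps that customer with an empty ticket value.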
(SCREENSHOT: Composer mode with selected anchor concepts and join configuration)
Automatic Concept Creation
RootCause.ai automatically creates ontology concepts when you upload data:
Each column becomes a concept
Data types are inferred from values
Columns with similar names across datasets may be suggested for merging
Common patterns (emails, dates, IDs) are detected and classified
You can always edit, merge, or reclassify concepts after automatic creation.
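To give a feel for what type inference and pattern detection involve, here is a deliberately simple sketch. The heuristics, the regular expression, and the function names are assumptions for illustration. RootCause.ai's actual detection rules are not documented here and are likely more thorough.

```python
import re
from datetime import datetime

# Assumed naming pattern for identifier-like columns (illustrative only).
ID_NAME = re.compile(r"(^|_)(id|sku|key)$", re.IGNORECASE)


def infer_type(values):
    """Guess a data type from string values: Number, DateTime, or String."""
    if not values:
        return "String"
    for label, parser in (("Number", float), ("DateTime", datetime.fromisoformat)):
        try:
            for value in values:
                parser(value)
        except ValueError:
            continue  # at least one value did not parse, try the next type
        return label
    return "String"


def suggest_classification(column_name, values):
    """Suggest Time, Identifier, or Entity from the column name and values."""
    if infer_type(values) == "DateTime":
        return "Time"
    if ID_NAME.search(column_name):
        return "Identifier"
    return "Entity"


print(suggest_classification("order_date", ["2024-01-05", "2024-02-11"]))  # Time
print(suggest_classification("customer_id", ["1001", "1002"]))             # Identifier
print(suggest_classification("price", ["9.99", "12.50"]))                  # Entity
```

Heuristics like these misfire on edge cases, which is one reason to review concepts after every upload.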
Best Practices
Name Concepts Clearly
Use business terminology, not technical column names. "Customer Lifetime Value" is better than "cust_ltv_amt".
Add Descriptions
Future you (and your colleagues) will thank you. Describe what the concept represents, not just what column it came from.
Classify Intentionally
Identifier, Time, and Location classifications carry semantic meaning that affects how RootCause.ai treats the data. Don't leave everything as Entity.
Merge Aggressively
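If two concepts mean the same thing, merge them. The more fields a concept maps to, the more datasets the platform can connect through it. If a merge turns out to be wrong, you can split the mapping back out.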
Review After Uploads
Automatic concept creation is smart but not perfect. Always review new concepts after uploading data.
Link to Core Technology
For a deeper understanding of how ontologies enable causal discovery, see Ontology in Core Technologies.