Selecting Data
RootCause.ai works with your data in its current format, without requiring lengthy data engineering work.
Upload a file, and the platform automatically detects column types, identifies patterns, and prepares your data for analysis.
Connect a database, and your data stays in sync without manual exports.

If the database connector you need isn't supported yet, let us know; we develop new connectors based on customer requirements. In the meantime, you can export the data from its original source and upload it directly to the platform.
Dataset Management
Once your data is in RootCause.ai, you can explore it, keep it fresh, and organize it for your team.
Viewing a Dataset
Click on any dataset to see its full details:
Schema (columns and data types)
Data preview (first rows)
Statistics (row count, column distributions)
Connection details (for connected sources)
(SCREENSHOT: Dataset detail view with schema panel, data preview table, and statistics sidebar)
Refreshing Data
For connected data sources, keeping data current is straightforward:
Sync Now – Click to manually refresh the data immediately
Schedule Sync – Set automatic refresh intervals (hourly, daily, weekly)
When a sync runs, RootCause.ai pulls fresh data from the source and updates all Data Views and analyses that depend on it.
(SCREENSHOT: Sync settings panel showing schedule options with last sync timestamp)
Renaming and Organizing
As your workspace grows, organization becomes important:
Click the dataset name to rename it—use names that describe the contents, not the source
Use folders to organize datasets by project, department, or topic
Add descriptions to help team members understand what each dataset contains and where it came from
Schema Detection
RootCause.ai automatically analyzes your data to detect column types. This matters because causal discovery algorithms treat numbers, categories, and dates differently.
Number
Integers and decimals (revenue, counts, measurements)
Text
Strings and categorical values (names, IDs, labels)
DateTime
Dates and timestamps (order dates, event times)
Boolean
True/false values (flags, binary indicators)
Category
Columns with limited unique values (status, region, tier)
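To make the five types concrete, here is a minimal sketch of how type detection of this kind can work on raw string values. It is not RootCause.ai's actual detection logic: the function name `infer_column_type`, the accepted date formats, the boolean tokens, and the cutoff of 10 unique values for a Category are all assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical settings, chosen only for illustration.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%m/%d/%Y")
BOOLEAN_TOKENS = {"true", "false", "yes", "no"}
MAX_CATEGORY_VALUES = 10  # assumed cutoff for "limited unique values"


def _is_number(value: str) -> bool:
    """True if the string parses as an integer or decimal."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_datetime(value: str) -> bool:
    """True if the string matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_column_type(values: list[str]) -> str:
    """Guess one of: Number, Text, DateTime, Boolean, Category."""
    cleaned = [v.strip() for v in values if v.strip()]  # ignore blanks
    if not cleaned:
        return "Text"
    if all(v.lower() in BOOLEAN_TOKENS for v in cleaned):
        return "Boolean"
    if all(_is_number(v) for v in cleaned):
        return "Number"
    if all(_is_datetime(v) for v in cleaned):
        return "DateTime"
    unique = set(cleaned)
    # Few distinct values that repeat across rows suggest a category.
    if len(unique) <= MAX_CATEGORY_VALUES and len(unique) < len(cleaned):
        return "Category"
    return "Text"


print(infer_column_type(["19.99", "5", "120.5"]))
print(infer_column_type(["gold", "silver", "gold", "bronze"]))
```

A purely value-based heuristic like this one classifies a ZIP code column as Number, which is why detected types sometimes need a manual correction.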
Adjusting Types
Automatic detection is usually correct, but sometimes context matters. A column of ZIP codes might be detected as numbers when it should be categories. A date stored as text might need conversion.
If a column is detected incorrectly:
Open the dataset
Click on the column type
Select the correct type from the dropdown
Changes are applied to the dataset
(SCREENSHOT: Column type dropdown showing available type options)
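The ZIP code case is worth seeing in miniature. The sketch below uses plain Python with made-up sample values to show what goes wrong when postal codes are treated as numbers; it illustrates the reasoning only, not how the platform stores either type.

```python
zip_codes = ["02134", "10001", "00501"]  # sample values for illustration

# Treating ZIP codes as numbers silently drops leading zeros.
as_numbers = [int(z) for z in zip_codes]
round_tripped = [str(n) for n in as_numbers]

# It also permits arithmetic that has no meaning for a postal code.
mean_zip = sum(as_numbers) / len(as_numbers)

# Treating them as categories keeps each code as an opaque label.
as_categories = sorted(set(zip_codes))

print(round_tripped)
print(mean_zip)
print(as_categories)
```

The numeric version can no longer distinguish "00501" from "501", and its average is a value that corresponds to no real grouping, so Category is the safer type for identifiers like these.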
Best Practices
File Size
RootCause.ai handles large files well, but format matters:
Files up to several GB can be uploaded directly
For very large files, use Parquet format—it's compressed and faster to process
For massive datasets (tens of GB), consider using cloud storage connectors (S3, Azure) which stream data more efficiently
Data Quality
Better data in means better insights out:
Clean your data before uploading when possible—remove test records, fix obvious errors
Use consistent date formats within columns
Ensure column headers are meaningful names that describe the content
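For the date format advice, one option is to normalize a mixed column to a single format before uploading. The following is a minimal sketch using only the Python standard library; the helper name `to_iso_date` and the list of input formats are assumptions, so replace the formats with the ones that actually occur in your data.

```python
from datetime import datetime

# Assumed input formats, for illustration; order matters when formats overlap.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def to_iso_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


mixed = ["2024-03-01", "03/02/2024", "03.03.2024"]
normalized = [to_iso_date(d) for d in mixed]
print(normalized)
```

Note that a value such as 03/02/2024 is ambiguous between month-first and day-first conventions. This sketch reads it as month-first; decide which convention your source uses before converting, and let unrecognized values fail loudly rather than guessing.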
Naming Conventions
Future you (and your teammates) will thank you:
Use descriptive dataset names: Sales_Transactions_2024_Q1 beats data_export_final_v2
Include date ranges or versions when relevant
Avoid special characters that might cause issues
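If you generate dataset names in a script, a small helper can enforce these conventions. This is a sketch under a stated assumption: it keeps only letters, digits, and underscores, which is a conservative choice for illustration and not a documented RootCause.ai rule. The function name `sanitize_dataset_name` is hypothetical.

```python
import re


def sanitize_dataset_name(name: str) -> str:
    """Replace runs of other characters with one underscore and trim the ends."""
    # Assumed safe set: ASCII letters, digits, and underscores.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", name.strip())
    # Collapse repeated underscores and drop any at the start or end.
    return re.sub(r"_+", "_", cleaned).strip("_")


print(sanitize_dataset_name("Sales Transactions (2024/Q1)!"))
print(sanitize_dataset_name("data export: final v2"))
```

A helper like this only removes problem characters. It cannot make a vague name descriptive, so the choice of what to call the dataset still matters.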
Next Steps
Once your data is uploaded, you're ready to start preparing it for analysis:
Create a Data View to transform and combine your datasets
Tag columns with Ontology Concepts to link related data across sources
Build a Digital Twin to discover causal relationships

