Selecting Data

RootCause.ai works with your data in its current format, so you can get started without lengthy data engineering work.

Upload a file, and the platform automatically detects column types, identifies patterns, and prepares your data for analysis.

Connect a database, and your data stays in sync without manual exports.

If the database connector you need isn't supported, let us know: we develop new connectors based on customer requirements. In the meantime, you can export the data from its original source and upload it directly to the platform.

Dataset Management

Once your data is in RootCause.ai, you can explore it, keep it fresh, and organize it for your team.

Viewing a Dataset

Click on any dataset to see its full details:

Schema (columns and data types)
Data preview (first rows)
Statistics (row count, column distributions)
Connection details (for connected sources)

(SCREENSHOT: Dataset detail view with schema panel, data preview table, and statistics sidebar)

Refreshing Data

For connected data sources, keeping data current is straightforward:

Sync Now – Click to manually refresh the data immediately
Schedule Sync – Set automatic refresh intervals (hourly, daily, weekly)

When a sync runs, RootCause.ai pulls fresh data from the source and updates all Data Views and analyses that depend on it.

(SCREENSHOT: Sync settings panel showing schedule options with last sync timestamp)

Renaming and Organizing

As your workspace grows, organization becomes important:

Click the dataset name to rename it—use names that describe the contents, not the source
Use folders to organize datasets by project, department, or topic
Add descriptions to help team members understand what each dataset contains and where it came from

Schema Detection

RootCause.ai automatically analyzes your data to detect column types. This matters because causal discovery algorithms treat numbers, categories, and dates differently.

Detected Type    Description
Number           Integers and decimals (revenue, counts, measurements)
Text             Strings and categorical values (names, IDs, labels)
DateTime         Dates and timestamps (order dates, event times)
Boolean          True/false values (flags, binary indicators)
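
To build intuition for how type detection of this kind can work, here is a minimal Python sketch. It is not RootCause.ai's actual detection logic, which is not documented here; the function names, the list of date formats, and the set of boolean tokens are all assumptions made for illustration. The sketch checks every non-empty value in a column and falls back to Text when no stricter type fits.

```python
from datetime import datetime

# Hypothetical formats and tokens, chosen only for this illustration.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%m/%d/%Y")
BOOLEAN_TOKENS = {"true", "false", "yes", "no"}


def _is_number(value):
    """True if Python's float() can parse the string."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_datetime(value):
    """True if the string matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_column_type(values):
    """Return 'Boolean', 'Number', 'DateTime', or 'Text' for raw string values."""
    # Skip empty cells so missing values do not force a column to Text.
    cleaned = [v.strip() for v in values if v is not None and v.strip()]
    if not cleaned:
        return "Text"
    if all(v.lower() in BOOLEAN_TOKENS for v in cleaned):
        return "Boolean"
    if all(_is_number(v) for v in cleaned):
        return "Number"
    if all(_is_datetime(v) for v in cleaned):
        return "DateTime"
    return "Text"


def infer_schema(rows):
    """Map each column name to an inferred type, given a list of dict rows."""
    columns = list(rows[0].keys()) if rows else []
    return {col: infer_column_type([row.get(col) for row in rows]) for col in columns}


rows = [
    {"order_id": "A-100", "revenue": "19.99", "order_date": "2024-01-15", "is_refund": "false"},
    {"order_id": "A-101", "revenue": "5", "order_date": "2024-01-16", "is_refund": "true"},
]
print(infer_schema(rows))
# {'order_id': 'Text', 'revenue': 'Number', 'order_date': 'DateTime', 'is_refund': 'Boolean'}
```

Running a check like this on a sample of your file before uploading can flag columns that may be detected as Text because of a few stray values.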

Best Practices

File Size

RootCause.ai handles large files well, but format matters:

Files up to several GB can be uploaded directly
For very large files, use Parquet, a compressed columnar format that is typically faster to process than CSV
For massive datasets (tens of GB), consider using cloud storage connectors (S3, Azure), which stream data more efficiently

Data Quality

Better data in means better insights out:

Clean your data before uploading when possible—remove test records, fix obvious errors
Use consistent date formats within columns
Ensure column headers are meaningful names that describe the content
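
As one way to apply the cleanup steps above before uploading, the following Python sketch drops test records and rewrites a date column into a single format (ISO 8601, YYYY-MM-DD). It is only an illustration: the list of input date formats, the TEST_ prefix used to spot test records, and the function names are assumptions, so adapt them to your own data.

```python
from datetime import datetime

# Assumed input formats; adjust to whatever your source system emits.
# Note that a value like 03/04/2024 is ambiguous, so list only formats you know are in use.
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def to_iso_date(value):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows, date_column, is_test_record):
    """Drop test records and rewrite one date column in a consistent format."""
    cleaned = []
    for row in rows:
        if is_test_record(row):
            continue
        iso = to_iso_date(row[date_column])
        if iso is None:
            # Fail loudly on unparseable values instead of guessing.
            raise ValueError(f"Unrecognized date: {row[date_column]!r}")
        cleaned.append({**row, date_column: iso})
    return cleaned


raw = [
    {"customer": "Acme", "order_date": "01/15/2024"},
    {"customer": "TEST_USER", "order_date": "2024-01-16"},
    {"customer": "Globex", "order_date": "17.01.2024"},
]
cleaned = clean_rows(raw, "order_date", lambda r: r["customer"].startswith("TEST_"))
print(cleaned)
# [{'customer': 'Acme', 'order_date': '2024-01-15'}, {'customer': 'Globex', 'order_date': '2024-01-17'}]
```

Raising an error on unrecognized dates is a deliberate choice in this sketch: it surfaces inconsistent values early rather than uploading them silently.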

Naming Conventions

Future you (and your teammates) will thank you:

Use descriptive dataset names: Sales_Transactions_2024_Q1 beats data_export_final_v2
Include date ranges or versions when relevant
Avoid special characters that might cause issues
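
If you name datasets programmatically, for example in an export script, a small helper can enforce these conventions. The sketch below is a hypothetical helper, not part of RootCause.ai; it assumes that letters, digits, and underscores are the safe characters, which this guide does not specify.

```python
import re


def make_dataset_name(*parts):
    """Join descriptive parts into a name using only letters, digits, and underscores."""
    cleaned = []
    for part in parts:
        # Replace each run of other characters with a single underscore.
        token = re.sub(r"[^A-Za-z0-9]+", "_", str(part)).strip("_")
        if token:
            cleaned.append(token)
    return "_".join(cleaned)


print(make_dataset_name("Sales Transactions", 2024, "Q1"))
# Sales_Transactions_2024_Q1
```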

Next Steps

Once your data is uploaded, you're ready to start preparing it for analysis:

Create a Data View to transform and combine your datasets
Tag columns with Ontology Concepts to link related data across sources
Build a Digital Twin to discover causal relationships
