Find any data asset in your stack
Search across every database, view, owner and lineage edge — with PII tags, freshness signals and access trails out of the box. The catalog your analysts actually use.
// stats
126 results · 10.4k scanned
Discovery that explains itself
Every hit comes with owner, freshness, lineage and policy context — no spelunking required.
Auto-scan every source
A continuous catalog scanner walks your databases, file stores and views, collecting schemas, owners and freshness into one index.
Classify what matters
Detect PII, owners and quality signals automatically. Tag once and every consumer respects the tag.
Lineage as a fact
Trace any column from raw source to published view. Search returns lineage breadcrumbs, not just rows.
From sprawl to a searchable index
Continuous scan, classify and rank — the catalog updates itself.
Scan
10+ engines · cron or webhook driven
Index
schemas · views · owners · freshness
Classify
pii · ownership · quality · sensitivity
Search
find · rank · explain · open in studio
One search API across every engine
Same ranking, same filters — whether you call it from Python, Node, REST or SQL.
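from sofi import Sofi

# Authenticate with your API key.
sofi = Sofi(api_key="YOUR_KEY")

# Search views and tables owned by data-platform that carry the pii tag.
hits = sofi.search(
    query="customer 360",
    scope=["views", "tables"],
    filters={"owner": "data-platform", "tag": "pii"},
    limit=20,
)

# Each hit exposes its path, score and lineage.
for hit in hits:
    print(hit.path, hit.score, hit.lineage)

Search powers the boring-but-critical work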
The unglamorous wins — onboarding, audits, deprecation — that compound month over month.
// pattern
Onboarding
A new analyst types a question, finds the canonical view, opens lineage and queries it in one click. No more #data-help threads.
// pattern
Audit & compliance
List every column tagged pii across the stack, prove who has access, export the trail. LGPD reviews stop being projects.
// pattern
Reuse over rebuild
Search ranks existing views ahead of raw tables. Stop re-deriving customer_ltv in five different dashboards.
// pattern
Impact analysis
Before deprecating a table, see every downstream view, dashboard and consumer. Ship breaking changes with confidence.
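Impact analysis boils down to a downstream walk over lineage edges. The sketch below is illustrative only: it assumes lineage is available as (upstream, downstream) pairs, and the asset names and the `downstream_of` helper are hypothetical, not part of the Sofi SDK.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (upstream asset, downstream asset).
EDGES = [
    ("raw.orders", "staging.orders_clean"),
    ("staging.orders_clean", "marts.customer_ltv"),
    ("marts.customer_ltv", "dash.revenue_overview"),
    ("raw.customers", "marts.customer_ltv"),
]

def downstream_of(asset, edges):
    """Return every asset reachable downstream of `asset`, breadth-first."""
    children = defaultdict(list)
    for upstream, downstream in edges:
        children[upstream].append(downstream)

    seen, order, queue = {asset}, [], deque([asset])
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in seen:  # guard against diamonds and cycles
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

impacted = downstream_of("raw.orders", EDGES)
print(impacted)
```

Anything in `impacted` is a consumer to notify or migrate before the table goes away.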
Search that keeps up with your stack
Sub-25 ms latency, automatic classification, lineage at the edges.
<24 ms
search latency p95
Inverted index lives in Postgres + Redis. Queries return ranked results before the user finishes typing.
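To make the inverted-index idea concrete, here is a minimal in-memory sketch. It is not the production Postgres + Redis layout, only an illustration of the technique; the asset paths and the `build_index` and `search` helpers are hypothetical.

```python
from collections import defaultdict
import re

# Hypothetical catalog entries: asset path -> description text.
ASSETS = {
    "marts.customer_360": "customer 360 view with orders and support tickets",
    "raw.orders": "raw orders table from the checkout service",
    "marts.customer_ltv": "customer lifetime value per customer",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(assets):
    """Map each token to the set of asset paths that contain it."""
    index = defaultdict(set)
    for path, text in assets.items():
        for token in tokenize(path + " " + text):
            index[token].add(path)
    return index

def search(index, query):
    """Rank assets by how many distinct query tokens they match (ties by path)."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for path in index.get(token, ()):
            scores[path] += 1
    return sorted(scores, key=lambda p: (-scores[p], p))

index = build_index(ASSETS)
print(search(index, "customer 360"))
```

Lookups touch only the posting sets for the query tokens, which is why this structure stays fast as the catalog grows.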
10k+
indexed assets
Tables, views, columns, owners, dashboards and lineage edges — all in one searchable graph.
Auto
PII classification
Pattern + name heuristics tag email, cpf, phone and document fields out of the box. Override per tenant.
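As a rough illustration of pattern + name heuristics, the sketch below tags a column from its name and, when sampled values are available, from their shape. The regexes, the `classify_column` helper and the sample values are assumptions made for this example and are not the product's actual rules; the CPF pattern checks formatting only, not check digits.

```python
import re

# Assumed name hints, for illustration only.
NAME_HINTS = {
    "email": re.compile(r"e[-_]?mail", re.I),
    "cpf": re.compile(r"cpf", re.I),
    "phone": re.compile(r"phone", re.I),
    "document": re.compile(r"document", re.I),
}

# Assumed value patterns, applied only when sample values are supplied.
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "cpf": re.compile(r"^\d{3}\.\d{3}\.\d{3}-\d{2}$"),
}

def classify_column(name, samples=()):
    """Return sorted PII tags inferred from the column name and optional samples."""
    tags = {tag for tag, pattern in NAME_HINTS.items() if pattern.search(name)}
    for tag, pattern in VALUE_PATTERNS.items():
        # Tag by value only if every sampled value matches the pattern.
        if samples and all(pattern.match(str(value)) for value in samples):
            tags.add(tag)
    return sorted(tags)

print(classify_column("customer_email"))
print(classify_column("tax_id", samples=["123.456.789-09"]))
print(classify_column("order_total"))
```

A per-tenant override could be modeled as a second dictionary merged over `NAME_HINTS`, though how overrides are actually configured is not described here.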
Live
freshness signals
Each asset shows last-seen-at, owner ping status and source health — no more querying stale tables.
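A freshness signal can be derived from the last-seen timestamp alone. This is a minimal sketch: the 24-hour threshold, the `freshness` helper and the "fresh"/"stale" labels are assumptions for illustration, not documented product behavior.

```python
from datetime import datetime, timedelta, timezone

def freshness(last_seen_at, now, max_age=timedelta(hours=24)):
    """Label an asset 'fresh' or 'stale' from its last-seen timestamp."""
    age = now - last_seen_at
    return "fresh" if age <= max_age else "stale"

# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
recent = freshness(datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc), now)
old = freshness(datetime(2023, 12, 30, 6, 0, tzinfo=timezone.utc), now)
print(recent, old)
```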
Questions about search
What data leads ask before adopting Search as the catalog backbone.
A worker connects to each registered datasource on a schedule or a webhook trigger, reads metadata only (table names, columns, types, last_modified) and updates the catalog. No row data is read unless you opt in to sampling.
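A metadata-only scan can be sketched with Python's built-in `sqlite3` module. SQLite stands in for a registered datasource here, and `scan_metadata` is a hypothetical helper rather than the scanner's real code. It reads table names, column names and declared types from the schema and never selects from the tables themselves; SQLite does not expose a last_modified value, so that field is left out.

```python
import sqlite3

def scan_metadata(conn):
    """Collect table names, column names and declared types without reading row data."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog

# Stand-in datasource: an in-memory database with one table and no rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, cpf TEXT)")
catalog = scan_metadata(conn)
print(catalog)
```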
// ready to discover
Make every dataset findable, this week.
Connect a source, run the scanner, watch the catalog populate. Lineage and PII tags appear automatically.