New · SOFI private deployment is ready for enterprise rollout
Talk to us
[ SEARCH ][ CATALOG ][ LINEAGE ][ < 24 MS ]
endpoint · /search

Find any data asset in your stack

Search across every database, view, owner and lineage edge — with PII tags, freshness signals and access trails out of the box. The catalog your analysts actually use.

https://private.sofi.local/catalog/search?q=customer+360
200 OK · 18 ms
customer 360

// facets

scope · views, tables
tenant · acme
tag · pii, governed
owner · any

// stats

results · 126

scanned · 10.4k

// hits

customer_360 · prod.views · 92 · view, pii, governed
customer_v2 · staging.views · 78 · view, draft
customer_raw · postgres.tables · 64 · table, pii
customer_events · mongodb.tables · 48 · table, high-volume
top hit · 3 sources federated · 6 columns · 1 policy
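
As a small illustration, the query string in the sample request above can be reproduced with Python's standard library. Only the host, path and the `q` parameter come from the request shown; the helper name `build_search_url` is hypothetical, and no other query parameters are assumed.

```python
from urllib.parse import urlencode

# Base URL taken from the sample request shown on this page.
BASE_URL = "https://private.sofi.local/catalog/search"

def build_search_url(query: str) -> str:
    """Return the search URL for a free-text query (hypothetical helper)."""
    # urlencode uses quote_plus by default, so spaces become "+".
    return f"{BASE_URL}?{urlencode({'q': query})}"

url = build_search_url("customer 360")
print(url)
```
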
[ 01 / 06 ] What you get
// Capabilities //

Discovery that explains itself

Every hit comes with owner, freshness, lineage and policy context — no spelunking required.

Auto-scan every source

Continuous catalog scanner walks your databases, file stores and views — schemas, owners and freshness in one index.

Classify what matters

Detect PII, owners and quality signals automatically. Tag once, every consumer respects it.

Lineage as a fact

Trace any column from raw source to published view. Search returns lineage breadcrumbs, not just rows.

[ 02 / 06 ] How it works
// Flow //

From sprawl to a searchable index

Continuous scan, classify and rank — the catalog updates itself.

step · 01

Scan

10+ engines · cron or webhook driven

step · 02

Index

schemas · views · owners · freshness

step · 03

Classify

pii · ownership · quality · sensitivity

step · 04

Search

find · rank · explain · open in studio

[ 03 / 06 ] Developer surface
// Query the catalog //

One search API across every engine

Same ranking, same filters — whether you call it from Python, Node, REST or SQL.

# pip install sofi
from sofi import Sofi

sofi = Sofi(api_key="YOUR_KEY")

hits = sofi.search(
    query="customer 360",
    scope=["views", "tables"],
    filters={"owner": "data-platform", "tag": "pii"},
    limit=20,
)

for hit in hits:
    print(hit.path, hit.score, hit.lineage)
[ 04 / 06 ] Use cases
// What teams build //

Search powers the boring-but-critical work

The unglamorous wins — onboarding, audits, deprecation — that compound month over month.

// pattern

Onboarding

New analyst types a question, finds the canonical view, opens lineage and queries in one click. No more #data-help threads.

<1 day to first query

// pattern

Audit & compliance

List every column tagged pii across the stack, prove who has access, export the trail. LGPD reviews stop being projects.

100% PII discoverability

// pattern

Reuse over rebuild

Search ranks existing views ahead of raw tables. Stop re-deriving customer_ltv in five different dashboards.

−42% duplicate views
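
One way to picture "views ahead of raw tables" is a per-kind boost applied to a base relevance score. This is only a sketch under stated assumptions: the `Hit` class, the `KIND_BOOST` weights and the base scores are hypothetical and are not SOFI's actual ranking function.

```python
from dataclasses import dataclass

# Hypothetical boost per asset kind; the weights are illustrative only.
KIND_BOOST = {"view": 1.25, "table": 1.0}

@dataclass
class Hit:
    path: str
    kind: str     # "view" or "table"
    score: float  # base text-relevance score before boosting

def rank(hits):
    """Sort hits by boosted score, highest first."""
    return sorted(
        hits,
        key=lambda h: h.score * KIND_BOOST.get(h.kind, 1.0),
        reverse=True,
    )

hits = [
    Hit("postgres.tables.customer_raw", "table", 80.0),
    Hit("prod.views.customer_360", "view", 70.0),
]
ranked = rank(hits)
for hit in ranked:
    print(hit.path, hit.kind)
```

With these example weights the view is listed first even though its base score is lower than the raw table's.
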

// pattern

Impact analysis

Before deprecating a table, see every downstream view, dashboard and consumer. Ship breaking changes with confidence.

0 blind deprecations
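
Impact analysis of this kind amounts to a graph traversal over lineage edges. The sketch below assumes lineage is available as a list of `(upstream, downstream)` pairs; the edge list, the `dashboards.churn` asset and the `downstream` function are hypothetical and only illustrate the idea.

```python
from collections import deque

def downstream(edges, start):
    """Return every asset reachable from `start` by following lineage edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    seen = {start}           # guards against cycles in the lineage graph
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen - {start}

# Illustrative edges only; real lineage would come from the catalog.
edges = [
    ("postgres.tables.customer_raw", "staging.views.customer_v2"),
    ("staging.views.customer_v2", "prod.views.customer_360"),
    ("prod.views.customer_360", "dashboards.churn"),
]
affected = downstream(edges, "postgres.tables.customer_raw")
print(sorted(affected))
```
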
[ 05 / 06 ] Performance
// Numbers //

Search that keeps up with your stack

Sub-25 ms latency, automatic classification, lineage at the edges.

<24 ms

search latency p95

Inverted index lives in Postgres + Redis. Queries return ranked results before the user finishes typing.
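
For readers unfamiliar with the term, an inverted index maps each token to the set of assets that contain it, so a query becomes a set intersection. The sketch below is a plain in-memory version for illustration; it does not use Postgres or Redis, and the function names and sample asset descriptions are assumptions.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(assets):
    """Map each token to the set of asset ids whose text contains it."""
    index = defaultdict(set)
    for asset_id, text in assets.items():
        for token in tokenize(text):
            index[token].add(asset_id)
    return index

def search(index, query):
    """Return asset ids that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Illustrative asset texts; a real index would also cover columns and owners.
assets = {
    "prod.views.customer_360": "customer_360 view",
    "postgres.tables.customer_raw": "customer_raw table",
    "mongodb.tables.customer_events": "customer_events table",
}
index = build_index(assets)
print(search(index, "customer 360"))
```
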

10k+

indexed assets

Tables, views, columns, owners, dashboards and lineage edges — all in one searchable graph.

Auto

PII classification

Pattern + name heuristics tag email, cpf, phone and document fields out of the box. Override per tenant.
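
A minimal sketch of what "pattern + name heuristics" could look like: one set of regular expressions over column names and another over optional sampled values. The rule tables, the `classify` function and the specific regexes are assumptions for illustration, not the product's actual rules.

```python
import re

# Hypothetical name-based hints: a match on the column name adds the tag.
NAME_HINTS = {
    "email": re.compile(r"e?mail", re.I),
    "cpf": re.compile(r"cpf", re.I),
    "phone": re.compile(r"phone|telefone|celular", re.I),
}

# Hypothetical value patterns: applied only when samples are provided.
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "cpf": re.compile(r"^\d{3}\.?\d{3}\.?\d{3}-?\d{2}$"),
}

def classify(column_name, samples=()):
    """Return the set of PII tags suggested by the name and sample values."""
    tags = set()
    for tag, pattern in NAME_HINTS.items():
        if pattern.search(column_name):
            tags.add(tag)
    for tag, pattern in VALUE_PATTERNS.items():
        # Tag only if every sampled value matches the pattern.
        if samples and all(pattern.match(s) for s in samples):
            tags.add(tag)
    return tags

print(classify("customer_email"))
print(classify("doc", ["123.456.789-09"]))
print(classify("created_at"))
```

Name hints work on metadata alone; the value patterns only apply when sampled values are available.
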

Live

freshness signals

Each asset shows last-seen-at, owner ping status and source health — no more querying stale tables.
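
A freshness check of this kind can be as simple as comparing a last-seen timestamp against a threshold. The sketch below is illustrative: the `is_stale` function and the 24-hour default are assumptions, not documented product behavior.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_seen_at, now, max_age=timedelta(hours=24)):
    """Return True when the asset has not been seen within `max_age`."""
    return now - last_seen_at > max_age

now = datetime.now(timezone.utc)
fresh = is_stale(now - timedelta(hours=1), now)
stale = is_stale(now - timedelta(hours=48), now)
print(fresh, stale)
```
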

[ 06 / 06 ] FAQ
// FAQ //

Questions about search

What data leads ask before adopting Search as the catalog backbone.

A worker connects to each registered datasource on a schedule (or on a webhook trigger), reads metadata only (table names, columns, types, last_modified) and updates the catalog. No row data is read unless you opt in to sampling.
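
To make "metadata only" concrete, here is a sketch of a scan against an in-memory SQLite database using Python's standard `sqlite3` module. SQLite stands in for a real datasource, the `scan_metadata` function is hypothetical, and last_modified is omitted because SQLite does not track it per table. No rows are read: the scan touches only `sqlite_master` and `PRAGMA table_info`.

```python
import sqlite3

def scan_metadata(conn):
    """Return {table_name: [(column_name, declared_type), ...]} without reading rows."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        quoted = table.replace('"', '""')  # escape quotes in identifiers
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_raw (id INTEGER PRIMARY KEY, email TEXT)")
catalog = scan_metadata(conn)
print(catalog)
```
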

// ready to discover

Make every dataset findable, this week.

Connect a source, run the scanner, watch the catalog populate. Lineage and PII tags appear automatically.