For academics, PhD students & independent researchers

Research-grade SEC data — without the $24K terminal.

Point-in-time, survivorship-free, and traceable to the filing — with a published, re-derivable accuracy baseline. The catastrophe this removes: the result that doesn't replicate because the vendor quietly revised the data under your paper.

Point-in-time, as-first-reported data — kill look-ahead bias before it inflates your results.
Full survivorship-free universe including delisted, bankrupt, and merged entities.
Every fact traceable to its source filing for citation and peer review.
A published accuracy baseline — 19,607 S&P 500 annual filings pass all 35 accounting-identity checks, 0 failures, CI-gated and re-derivable from one DuckDB script.

Start free, no card · Read the methodology

Built for

Researchers

Point-in-time accurate
Survivorship-bias-free
Every number cited to its filing

Works where you do

Python SDK · Bulk Data API · MCP Server

Recommended plan

Pro

$24K+

terminal cost, avoided

1993→now

reproducible history

100%

of S&P 500 annual filings pass all 35 accounting-identity checks

The pain points we remove

Rigorous research needs clean, point-in-time, reproducible data — but the standard sources are expensive, gated, and quietly mutable. Valuein is built for the opposite.

The cost wall

Bloomberg is ~$24K+/user/yr; Compustat and CRSP come through WRDS, gated to whoever holds a university subscription. Independent researchers are locked out.

Look-ahead bias baked into vendor data

Look-ahead bias is present in common Compustat products: a backtest that uses the wrong vintage sees figures that were not yet public on the test date, which silently inflates results in studies of fundamentals and returns.

Survivorship bias

Testing on current constituents overstates returns because the underperformers dropped out. You need the delisted and bankrupt names present.
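
The arithmetic behind this can be sketched with a toy universe. The names and returns below are invented for illustration and are not Valuein data; the point is only that dropping the delisted names moves the average.

```python
from statistics import mean

# Hypothetical one-year returns for a five-name universe.
# Two of the names were later delisted after large losses.
universe = [
    {"name": "A", "ret": 0.12, "delisted": False},
    {"name": "B", "ret": 0.08, "delisted": False},
    {"name": "C", "ret": 0.10, "delisted": False},
    {"name": "D", "ret": -0.60, "delisted": True},
    {"name": "E", "ret": -0.90, "delisted": True},
]

# Average over today's survivors only (what a current-constituent test sees).
survivors_only = mean(r["ret"] for r in universe if not r["delisted"])

# Average over everything that was investable at the start of the period.
full_universe = mean(r["ret"] for r in universe)

# The gap between the two is the survivorship bias in this toy sample.
bias = survivors_only - full_universe

print(f"survivors only: {survivors_only:.2%}")
print(f"full universe:  {full_universe:.2%}")
print(f"overstatement:  {bias:.2%}")
```

In this made-up sample the survivor-only average is 10% while the full-universe average is -24%, so the same strategy looks 34 points better than it was.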

Reproducibility broken by silent revisions

When a vendor revises its time series after the fact, the dataset behind your published paper changes and replication breaks.

EDGAR is free but not usable

EDGAR is free but not trivial to scrape, and raw XBRL needs heavy processing. DIY normalization eats months you'd rather spend on the research.

The grind we take off your plate

From the daily check-ins to the month-end scramble — this is the recurring work Valuein automates so you spend your hours on the thesis, not the data.

Every day

Write and run analysis code
Clean and normalize raw data
Debug coverage gaps and tag mismatches

Every week

Construct datasets and factor panels
Run regressions and backtests
Validate against look-ahead and survivorship traps

Month-end & earnings

Refresh panels with new filings
Version data for reproducibility
Document provenance for submission and peer review

What you can do with Valuein

Each job you need done, mapped to the exact capability that delivers it.

Affordable research-grade access

Free sample and S&P 500 tiers, then Pro at $49/mo. No $24K terminal, no university WRDS gate.

Datasets · all tiers

Point-in-time, as-first-reported data

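Every fact carries an accepted_at timestamp, and the SDK enforces an as_of point-in-time cutoff, so a query only sees what had been filed by that date. Together they kill look-ahead bias.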

Python SDK · PIT
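
A minimal sketch of the point-in-time idea, written in plain Python rather than with the Valuein SDK. Only the accepted_at and as_of names come from this page; the row layout, the other field names, and the as_first_reported helper are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical fact rows: the FY2019 figure appears twice, first as
# originally filed and then as restated in a later filing.
facts = [
    {"cik": 1, "concept": "Revenue", "fiscal_year": 2019, "value": 100.0,
     "accepted_at": datetime(2020, 2, 20)},
    {"cik": 1, "concept": "Revenue", "fiscal_year": 2019, "value": 94.0,
     "accepted_at": datetime(2020, 11, 5)},   # later restatement
    {"cik": 1, "concept": "Revenue", "fiscal_year": 2020, "value": 120.0,
     "accepted_at": datetime(2021, 2, 18)},
]

def as_first_reported(rows, as_of):
    """Keep, per (cik, concept, fiscal_year), the earliest-accepted row
    among those accepted on or before as_of."""
    visible = [r for r in rows if r["accepted_at"] <= as_of]
    first = {}
    for row in sorted(visible, key=lambda r: r["accepted_at"]):
        key = (row["cik"], row["concept"], row["fiscal_year"])
        first.setdefault(key, row)  # the first accepted row wins
    return first

# A backtest dated mid-2020 can only see the original FY2019 filing.
snapshot = as_first_reported(facts, as_of=datetime(2020, 6, 30))
for key, row in snapshot.items():
    print(key, row["value"], row["accepted_at"].date())
```

With an as_of of 30 June 2020, the restated 94.0 and the FY2019-to-FY2020 growth it implies are invisible, which is exactly what a researcher trading on that date would have experienced.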

Full survivorship-free universe

The complete SEC population keyed on CIK — active plus inactive — back to 1993.

Survivorship-free universe

Reproducible, provenance-tracked results

verify_fact_lineage traces each number to its filing; versioned Parquet vintages plus deterministic, typed tools mean the same inputs give the same output — years later.

verify_fact_lineage · versioned schema
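
One way to pin a vintage in a replication package is to record a content hash of the file you downloaded. The sketch below uses only the Python standard library; it is not part of the Valuein SDK, and the file name and stand-in bytes are placeholders.

```python
import hashlib
import tempfile
from pathlib import Path

def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a stand-in file; in practice this would be the downloaded vintage.
with tempfile.TemporaryDirectory() as tmp:
    vintage = Path(tmp) / "facts_vintage.parquet"  # hypothetical file name
    vintage.write_bytes(b"stand-in bytes, not real Parquet")

    recorded = file_sha256(vintage)          # store this next to your results
    unchanged = file_sha256(vintage) == recorded  # re-check before re-running

print(recorded)
print("vintage unchanged:", unchanged)
```

Publishing the digest alongside the paper lets a replicator confirm they are running against byte-identical inputs before comparing outputs.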

A data-quality claim you can check

The accuracy baseline (19,607 S&P 500 annual filings passing all 35 published accounting identities, 0 failures) is published and re-derivable from one DuckDB script — cite it, or reproduce it.

Published accuracy baseline
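
To make "accounting identity" concrete, here is a sketch of a single check in plain Python. The 35 published identities are not listed on this page, so the balance-sheet equation, the field names, the rounding tolerance, and the two filings below are all assumptions for illustration; the published baseline itself is re-derived with a DuckDB script, not with this code.

```python
# Hypothetical filings with standardized balance-sheet totals.
filings = [
    {"accession": "demo-0001", "Assets": 500.0,
     "Liabilities": 320.0, "StockholdersEquity": 180.0},
    {"accession": "demo-0002", "Assets": 750.0,
     "Liabilities": 400.0, "StockholdersEquity": 340.0},  # off by 10
]

def balance_sheet_identity_holds(filing, tolerance=0.5):
    """Check Assets = Liabilities + StockholdersEquity within a tolerance.

    The tolerance is an assumed allowance for rounding in reported totals.
    """
    rhs = filing["Liabilities"] + filing["StockholdersEquity"]
    return abs(filing["Assets"] - rhs) <= tolerance

# Collect the accession IDs of filings that violate the identity.
failures = [f["accession"] for f in filings
            if not balance_sheet_identity_holds(f)]

print(f"{len(filings) - len(failures)} passed, {len(failures)} failed: {failures}")
```

A CI gate in this style would fail the build whenever the failures list is non-empty.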

Pre-normalized XBRL

~11,966 raw tags mapped to 292 canonical concepts — comparable out of the box.

Standardized concepts
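
The mapping idea can be sketched as a lookup table. The three revenue tags below are real us-gaap element names, but the canonical concept names, the table itself, and the normalize helper are hypothetical and far smaller than the 292-concept mapping described above.

```python
# Illustrative mapping from raw XBRL tags to canonical concepts.
TAG_TO_CONCEPT = {
    "Revenues": "Revenue",
    "SalesRevenueNet": "Revenue",
    "RevenueFromContractWithCustomerExcludingAssessedTax": "Revenue",
    "NetIncomeLoss": "NetIncome",
}

def normalize(raw_facts):
    """Split raw (tag, value) pairs into mapped facts and unmapped tags."""
    mapped, unmapped = [], []
    for tag, value in raw_facts:
        concept = TAG_TO_CONCEPT.get(tag)
        if concept is None:
            unmapped.append(tag)   # surfaced for review, never silently dropped
        else:
            mapped.append((concept, value))
    return mapped, unmapped

# Two companies reporting revenue under different tags, plus a custom tag.
raw = [
    ("SalesRevenueNet", 1200.0),
    ("RevenueFromContractWithCustomerExcludingAssessedTax", 950.0),
    ("AcmeCustomAdjustedRevenue", 10.0),  # made-up extension tag
]

mapped, unmapped = normalize(raw)
print(mapped)
print(unmapped)
```

After normalization both companies' revenue lands under one concept, so a cross-sectional query needs one column name instead of one per tagging habit.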

Works where you do

One Bearer token reaches the same point-in-time data from your AI agent, your notebook, or your browser. Use the surface that fits the job.

Python SDK

Reproducible, PIT-enforced queries with citable provenance for every fact.

Bulk Data API

Download the full Parquet universe for offline panel construction.

MCP Server

Explore and prototype hypotheses conversationally before you code the study.

Point-in-time and survivorship-free by default — kill look-ahead bias before it inflates your Sharpe.

Reproducible by design: same inputs, same output, and every fact traces back to its filing.

Frequently asked

Can I cite Valuein data in a paper, and is it reproducible?

Yes. Every fact resolves to its source filing via verify_fact_lineage, and the Parquet schema is versioned so a given vintage is immutable — you can re-run the exact dataset that backed your results.

How accurate is the standardization — and can I verify the claim?

We publish a measured baseline: all 19,607 S&P 500 annual filings pass every one of 35 active published accounting identities, with 0 failures. It's CI-gated and re-derivable from one DuckDB script, so you can check it rather than take it on faith.

Do you offer academic or student access?

The sample and S&P 500 tiers are free (the S&P 500 tier covers the index's full history, 1993 to present). Pro at $49/mo opens the full universe of 19,000+ companies, at a fraction of the cost of a WRDS seat. Reach out for classroom or research-group needs.

How do you handle look-ahead and survivorship bias?

Point-in-time acceptance timestamps prevent look-ahead, and the universe includes delisted/bankrupt/merged entities so it's survivorship-free — the two biases most likely to invalidate an empirical finance result.

What's the difference from raw SEC EDGAR?

EDGAR is free but raw — inconsistent XBRL tags, no standardization, painful to scrape at scale. We normalize ~11,966 raw tags into 292 canonical concepts and serve them point-in-time as columnar Parquet.

Research-grade SEC fundamentals without a $24,000 terminal or a WRDS login.

111M+ standardized SEC facts across 19,000+ companies, 1993–present. Free to start — no credit card.

Start free, no card · View pricing

Methodology · Point-in-Time guide · Data catalog