Point-in-Time Backtesting
Construct survivorship-bias-free factor portfolios using knowledge_at timestamps and PIT universe snapshots.
The Look-Ahead Bias Problem
Most financial datasets don't tell you when a data point became available. If you use a company's 2023 annual results (filed March 2024) in a January 2024 backtest, you've cheated: that information wasn't available yet. Valuein's knowledge_at field records the exact SEC acceptance timestamp for every fact, which makes rigorous point-in-time (PIT) backtesting possible.
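The leak described above is easy to reproduce on a toy table. The following is a minimal sketch using pandas and made-up numbers for one hypothetical company; the period_end and fiscal_year columns are assumptions for illustration and are not confirmed fields of the fact table.

```python
import pandas as pd

# Made-up facts for one hypothetical company. knowledge_at mirrors the field
# described in this guide; period_end and fiscal_year are assumed columns.
facts = pd.DataFrame({
    "entity_id": [1, 1],
    "fiscal_year": [2022, 2023],
    "period_end": pd.to_datetime(["2022-12-31", "2023-12-31"]),
    "knowledge_at": pd.to_datetime(["2023-03-10 16:05", "2024-03-08 16:12"]),
    "revenue": [100.0, 120.0],
})

as_of = pd.Timestamp("2024-01-15")

# Naive filter: any fiscal period that has ended looks "available".
naive = facts[facts["period_end"] <= as_of]

# PIT-safe filter: only facts already published by the backtest date.
pit = facts[facts["knowledge_at"] <= as_of]

# Rows the naive filter lets through even though they were not yet public.
leaked = naive[naive["knowledge_at"] > as_of]

print("naive sees fiscal years:", naive["fiscal_year"].tolist())
print("PIT sees fiscal years:", pit["fiscal_year"].tolist())
print("rows leaking future data:", len(leaked))
```

The naive filter admits the FY2023 figure two months before it was filed, while the knowledge_at filter returns only FY2022.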
Filtering by knowledge_at
Always filter by knowledge_at <= rebalance_date when selecting signals for a historical portfolio.
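Two details of this filter are worth seeing on a small example: restatements published before the cutoff replace the original value, and a bare date is compared as midnight at the start of that day, so a filing accepted later on the rebalance day is excluded. The sketch below uses pandas on made-up data; latest_known is a hypothetical helper, not part of the SDK, and the timezone of knowledge_at is not stated in this guide, so treat the boundary behaviour as an assumption.

```python
import pandas as pd

def latest_known(facts: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Return one row per entity_id: the most recently published fact
    with knowledge_at <= as_of (midnight at the start of that day)."""
    cutoff = pd.Timestamp(as_of)
    known = facts[facts["knowledge_at"] <= cutoff]
    return (
        known.sort_values("knowledge_at")
        .drop_duplicates("entity_id", keep="last")
        .reset_index(drop=True)
    )

# Made-up facts for two hypothetical companies.
facts = pd.DataFrame({
    "entity_id": [1, 1, 2, 2],
    "revenue": [100.0, 98.0, 50.0, 55.0],
    "knowledge_at": pd.to_datetime([
        "2021-02-20 16:10",  # original filing
        "2021-11-05 09:30",  # restatement, published before the cutoff
        "2021-03-01 17:00",
        "2022-01-01 16:30",  # accepted later on the rebalance day itself
    ]),
})

result = latest_known(facts, "2022-01-01")
print(result)
```

Company 1 resolves to the restated 98.0, and company 2 stays at 50.0 because its newer value arrived after midnight on the rebalance date. Against the live data, the same selection is expressed in SQL through the SDK: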
from valuein_sdk import ValueinClient, ValueinError

rebalance_date = "2022-01-01"

try:
    with ValueinClient() as client:
        # PIT-safe revenue for all companies as of 2022-01-01
        revenue_pit = client.query(f"""
            SELECT entity_id, numeric_value AS revenue, knowledge_at
            FROM fact
            WHERE standard_concept = 'TotalRevenue'
              AND fiscal_period = 'FY'
              AND knowledge_at <= TIMESTAMP '{rebalance_date}'
            QUALIFY ROW_NUMBER() OVER (
                PARTITION BY entity_id ORDER BY knowledge_at DESC
            ) = 1
        """)
        print(revenue_pit.head())
except ValueinError as e:
    print(f"Error: {e}")

Survivorship-Bias-Free Universe
Never use today's S&P 500 list for historical backtests. Use get_pit_universe to get the exact constituents at each rebalance date.
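The membership test behind a PIT universe is an interval check: a security belongs to the index on a date if it joined on or before that date and has not yet left. Here is a minimal pandas sketch with made-up intervals; the tickers are placeholders rather than real index history, and members_on is a hypothetical helper, not an SDK function.

```python
import pandas as pd

# Made-up membership intervals. NaT in end_date means "still a member".
membership = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC"],
    "start_date": pd.to_datetime(["2010-01-04", "2015-06-01", "2021-09-20"]),
    "end_date": pd.to_datetime(["2020-06-22", None, None]),
})

def members_on(membership: pd.DataFrame, as_of: str) -> list[str]:
    """Symbols whose membership interval contains the as_of date."""
    d = pd.Timestamp(as_of)
    active = (membership["start_date"] <= d) & (
        membership["end_date"].isna() | (membership["end_date"] > d)
    )
    return sorted(membership.loc[active, "symbol"])

print(members_on(membership, "2020-01-01"))  # ['AAA', 'BBB']
print(members_on(membership, "2024-01-01"))  # ['BBB', 'CCC']
```

A backtest for 2020 built from the 2024 list would silently drop AAA, which left the index later, and include CCC, which had not joined yet. With the SDK, the same interval test runs against index_membership: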
from valuein_sdk import ValueinClient, ValueinError

# MCP: get the exact S&P 500 on Jan 1, 2020
# get_pit_universe(as_of_date="2020-01-01", index="SP500")

# Python SDK:
try:
    with ValueinClient() as client:
        universe_2020 = client.query("""
            SELECT r.entity_id, r.symbol, r.name, r.sector
            FROM references r
            JOIN index_membership im ON r.security_id = im.security_id
            WHERE im.index_name = 'SP500'
              AND im.start_date <= DATE '2020-01-01'
              AND (im.end_date IS NULL OR im.end_date > DATE '2020-01-01')
        """)
        print(f"S&P 500 universe on 2020-01-01: {len(universe_2020)} companies")
except ValueinError as e:
    print(f"Error: {e}")

Full Monthly Rebalance Loop
Combine the PIT universe with PIT signals to build a proper factor backtest.
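One piece worth isolating before the loop is the signal itself. A growth signal needs two PIT-safe annual values per company: the latest known fiscal year and the one before it. The sketch below shows one way to compute that in pandas on made-up data; the fiscal_year column and the pit_revenue_growth helper are assumptions for illustration, and the sketch does not check that the two fiscal years are consecutive.

```python
import pandas as pd

def pit_revenue_growth(facts: pd.DataFrame, as_of: str) -> pd.Series:
    """Year-over-year revenue growth per entity, using only facts known by as_of."""
    cutoff = pd.Timestamp(as_of)
    known = facts[facts["knowledge_at"] <= cutoff]
    # Keep the latest published version of each (entity, fiscal year).
    latest = (
        known.sort_values("knowledge_at")
        .drop_duplicates(["entity_id", "fiscal_year"], keep="last")
        .sort_values(["entity_id", "fiscal_year"])
    )
    # Previous fiscal year's revenue within each entity.
    prev = latest.groupby("entity_id")["revenue"].shift(1)
    latest = latest.assign(growth=latest["revenue"] / prev - 1)
    # The newest known fiscal year per entity carries the signal.
    newest = latest.groupby("entity_id").tail(1).set_index("entity_id")
    return newest["growth"].dropna()

# Made-up annual revenue for two hypothetical companies.
facts = pd.DataFrame({
    "entity_id": [1, 1, 2, 2, 2],
    "fiscal_year": [2020, 2021, 2019, 2020, 2021],
    "revenue": [100.0, 110.0, 160.0, 200.0, 150.0],
    "knowledge_at": pd.to_datetime([
        "2021-02-20", "2022-02-18", "2020-03-02", "2021-03-01", "2022-03-05",
    ]),
})

growth = pit_revenue_growth(facts, "2022-03-01")
print(growth)
```

On 2022-03-01, company 2 has not yet published FY2021, so its signal is still the 25% growth into FY2020 rather than the decline that becomes visible four days later. The full loop against the SDK follows; it ranks on the revenue level to stay short.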
import pandas as pd
from valuein_sdk import ValueinClient, ValueinError

# "MS" = month start, giving one rebalance date per month
rebalance_dates = pd.date_range("2018-01-01", "2023-12-01", freq="MS")
portfolio_returns = []

try:
    with ValueinClient() as client:
        for date in rebalance_dates:
            date_str = date.strftime("%Y-%m-%d")

            # Step 1: PIT universe on this date
            universe = client.query(f"""
                SELECT r.entity_id FROM references r
                JOIN index_membership im ON r.security_id = im.security_id
                WHERE im.index_name = 'SP500'
                  AND im.start_date <= DATE '{date_str}'
                  AND (im.end_date IS NULL OR im.end_date > DATE '{date_str}')
            """)
            if len(universe) == 0:
                continue  # no constituents found; avoids an empty IN () list below

            # Step 2: Latest known revenue for universe
            ids = ','.join(repr(x) for x in universe.entity_id.tolist())
            signals = client.query(f"""
                SELECT entity_id,
                       numeric_value AS revenue,
                       knowledge_at
                FROM fact
                WHERE standard_concept = 'TotalRevenue'
                  AND fiscal_period = 'FY'
                  AND knowledge_at <= TIMESTAMP '{date_str}'
                  AND entity_id IN ({ids})
                QUALIFY ROW_NUMBER() OVER (
                    PARTITION BY entity_id ORDER BY knowledge_at DESC
                ) = 1
            """)

            # Step 3: Rank by revenue level (a simplified stand-in for revenue growth)
            signals['rank'] = signals['revenue'].rank(pct=True)
            top_quintile = signals[signals['rank'] > 0.8]

            # Record the selection size; computing returns is left out of this example
            portfolio_returns.append({'date': date, 'n': len(top_quintile)})
except ValueinError as e:
    print(f"Error: {e}")