
Why most PE firms are still getting portfolio data architecture wrong.

The edge no longer comes from model access—it comes from proprietary context. After a decade of accumulating portfolio intelligence in spreadsheets and disconnected systems, the firms rebuilding their data architecture for the agentic AI era are pulling ahead.

Every private equity firm we speak with says roughly the same thing: they know their portfolio data is valuable, they know it should be working harder for them, and they know their current approach (spreadsheets, disconnected databases, systems that don't talk to each other) is not sustainable. What they cannot agree on is what to do about it.

The problem is not new. For twenty years, PE firms have been accumulating data: operating metrics from portfolio companies, financial statements, deal histories, market comps, diligence documents, post-close integration plans. Most of it lives in email attachments, shared drives, or the institutional memory of analysts who left three years ago.

What is new is that the cost of this disorganization has gone up. When models were proprietary, a good analyst with Excel could still add value. Now that models are commoditized, the edge goes to the analyst with better context: richer historical data, cleaner comps, a system that remembers every deal the firm has ever done.

The firms that treated portfolio data as a strategic asset five years ago are not scrambling now. They are pulling ahead.

The three mistakes we see most often

After working with eighteen PE firms over the past two years, we have identified three architectural decisions that consistently cause problems down the line.

1. Building for reporting, not for intelligence

Most firms build their portfolio data systems to answer a known set of questions: What is the revenue of Company X? What are the EBITDA margins across the portfolio? How are our healthcare investments performing relative to industrials?

These are good questions. But they are backward-looking questions. The system optimizes for producing the quarterly LP report, not for surfacing insights that change how the team thinks about the next deal.

A data architecture built for intelligence looks different. It preserves granularity. It links deals to the analysts who ran them, the theses that drove them, the diligence questions that mattered, the post-close surprises that no one saw coming. It treats every data point as a potential training input for the next decision.
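To make the contrast concrete, here is a minimal sketch of what a deal record might look like when it keeps those links intact. It uses Python dataclasses; every class name, field name, and sample value below is our own illustration (assumed, not taken from any particular firm's system), and it needs Python 3.9 or later.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DiligenceQuestion:
    text: str
    answer: str = ""
    mattered: bool = False  # hypothetical flag set by the deal team in hindsight


@dataclass
class Surprise:
    noted_on: date
    description: str


@dataclass
class Deal:
    # All field names here are illustrative assumptions.
    deal_id: str
    company: str
    sector: str
    close_date: date
    lead_analysts: list[str] = field(default_factory=list)
    thesis: str = ""
    diligence: list[DiligenceQuestion] = field(default_factory=list)
    surprises: list[Surprise] = field(default_factory=list)


def questions_that_mattered(deals: list[Deal], sector: str) -> list[tuple[str, str]]:
    """Return (deal_id, question text) pairs flagged as decisive in one sector."""
    return [
        (d.deal_id, q.text)
        for d in deals
        if d.sector == sector
        for q in d.diligence
        if q.mattered
    ]


# Made-up sample data, for illustration only.
deals = [
    Deal(
        deal_id="D-001",
        company="ExampleCo",
        sector="consumer",
        close_date=date(2021, 3, 15),
        lead_analysts=["analyst_a"],
        thesis="Roll-up of regional brands",
        diligence=[
            DiligenceQuestion("How concentrated is the customer base?", mattered=True),
            DiligenceQuestion("Is the ERP migration on schedule?"),
        ],
        surprises=[Surprise(date(2021, 9, 1), "Key supplier exited the market")],
    ),
]

print(questions_that_mattered(deals, "consumer"))
```

The point of the shape is that the thesis, the diligence questions, and the post-close surprises live on the same record as the deal, so a question like "which diligence questions turned out to matter in consumer?" is a lookup rather than an archaeology project.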

2. Treating portfolio companies as data sources, not collaborators

The default model is extraction: the PE firm asks the portfolio company CFO for a spreadsheet, the CFO emails it, an analyst copies it into the firm's system, and everyone moves on until next quarter.

This works until it doesn't. The format changes. The CFO leaves. The definitions drift. The firm ends up with five years of "revenue" that does not mean the same thing from one year to the next.

The firms that get this right treat portfolio companies as collaborators in a shared data architecture. They build lightweight APIs or standardized templates that make reporting easier for the CFO, not harder. They invest in making data contribution a service to the portfolio company, not just a compliance obligation to the GP.
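One way to picture a standardized template is as a small set of required fields plus an explicit, versioned definition for each metric, so that a change in what "revenue" means is recorded rather than discovered years later. The sketch below is a hedged illustration only: the field names, the definition wording, and the version labels are all hypothetical.

```python
# Hypothetical metric definitions; the wording and version labels are illustrative.
METRIC_DEFINITIONS = {
    ("revenue", "v1"): "Gross billings recognized in the period",
    ("revenue", "v2"): "Net revenue after returns and discounts",
}

# Assumed minimum set of fields a portfolio company submits per metric per period.
REQUIRED_FIELDS = ("company", "period", "metric", "definition_version", "value")


def validate_submission(row: dict) -> list[str]:
    """Return a list of problems; an empty list means the row is accepted."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in row]
    if problems:
        return problems
    key = (row["metric"], row["definition_version"])
    if key not in METRIC_DEFINITIONS:
        problems.append(f"unknown definition: {key[0]} {key[1]}")
    value = row["value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value must be a number")
    return problems


def definition_changes(rows: list[dict]) -> list[tuple[str, str, str]]:
    """Flag periods where a company's definition version differs from the prior period."""
    changes = []
    last_seen = {}
    # Periods are assumed to be strings such as "2023-Q1" that sort chronologically.
    for row in sorted(rows, key=lambda r: (r["company"], r["metric"], r["period"])):
        key = (row["company"], row["metric"])
        previous = last_seen.get(key)
        if previous is not None and previous != row["definition_version"]:
            changes.append((row["company"], row["metric"], row["period"]))
        last_seen[key] = row["definition_version"]
    return changes


# Made-up sample submissions.
rows = [
    {"company": "ExampleCo", "period": "2022-Q4", "metric": "revenue",
     "definition_version": "v1", "value": 12.5},
    {"company": "ExampleCo", "period": "2023-Q1", "metric": "revenue",
     "definition_version": "v2", "value": 11.0},
]

print([validate_submission(r) for r in rows])
print(definition_changes(rows))
```

A check like this can run at the moment the CFO submits, which is what turns the template into a service: the portfolio company hears about a problem immediately, instead of the analyst finding it a quarter later.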

3. Assuming the AI will fix it later

The most common thing we hear from firms that have deferred the data architecture work: "We'll wait for AI to get better at handling messy data."

This is backwards. AI does not fix bad data architecture; it amplifies it. If your system cannot answer "show me every consumer deal we've done in the last seven years where revenue growth exceeded 40% in year two," then an LLM built on top of that system has nothing reliable to retrieve, and it is liable to hallucinate an answer that sounds confident and is completely wrong.
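For a sense of what "can answer" means in practice, the consumer-deal question reduces to a short query once deals and annual revenue sit in linked tables. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample figures are assumptions made for illustration, and "year two growth" is taken here to mean revenue in the second year of ownership divided by revenue in the first, minus one; a firm may well define it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per deal, one row per deal per year of ownership.
conn.executescript("""
CREATE TABLE deals (deal_id TEXT PRIMARY KEY, sector TEXT, close_year INTEGER);
CREATE TABLE annual_revenue (deal_id TEXT, hold_year INTEGER, revenue REAL);
""")

# Made-up sample data.
conn.executemany("INSERT INTO deals VALUES (?, ?, ?)", [
    ("D-001", "consumer", 2020),
    ("D-002", "consumer", 2021),
    ("D-003", "industrials", 2021),
    ("D-004", "consumer", 2012),
])
conn.executemany("INSERT INTO annual_revenue VALUES (?, ?, ?)", [
    ("D-001", 1, 100.0), ("D-001", 2, 150.0),  # +50% in year two
    ("D-002", 1, 80.0), ("D-002", 2, 96.0),    # +20%, below the threshold
    ("D-003", 1, 50.0), ("D-003", 2, 90.0),    # +80%, but not a consumer deal
    ("D-004", 1, 40.0), ("D-004", 2, 70.0),    # +75%, but outside the lookback window
])


def high_growth_deals(conn, sector, as_of_year, lookback_years=7, min_growth=0.40):
    """Deals in a sector, closed within the lookback window, with year-two growth above min_growth."""
    query = """
        SELECT d.deal_id, (y2.revenue / y1.revenue) - 1.0 AS growth
        FROM deals d
        JOIN annual_revenue y1 ON y1.deal_id = d.deal_id AND y1.hold_year = 1
        JOIN annual_revenue y2 ON y2.deal_id = d.deal_id AND y2.hold_year = 2
        WHERE d.sector = ?
          AND d.close_year >= ?
          AND y1.revenue > 0
          AND (y2.revenue / y1.revenue) - 1.0 > ?
        ORDER BY d.deal_id
    """
    params = (sector, as_of_year - lookback_years, min_growth)
    return conn.execute(query, params).fetchall()


result = high_growth_deals(conn, "consumer", as_of_year=2025)
print(result)
```

If the underlying tables exist and the definitions are stable, a model can be pointed at a query like this and return a grounded answer. If they do not, there is no query to write, and the model is left to guess.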

The firms that are getting value from AI in diligence, deal sourcing, and portfolio monitoring all have one thing in common: they spent the previous three years cleaning up their data architecture so the models have something reliable to work with.

⁂ ⁂ ⁂

What the best firms are doing differently

The firms that are ahead did not wait for a perfect solution. They started with a minimum viable data architecture—usually a single source of truth for portfolio operating metrics—and iterated from there.

They also made a cultural decision: data quality is not the responsibility of the data team. It is the responsibility of the investment team. Every number that goes into the system carries a partner's sign-off, so if it goes in wrong, the partner owns the error. This changes behavior fast.

Most importantly, they recognized that portfolio data is not a back-office problem. It is a competitive advantage that compounds. The firm that can tell an LP exactly how its last twelve consumer deals performed, broken down by vintage, check size, and sector, in under thirty seconds, is the firm that raises the next fund.

— Eleanor

Trace Analysis Private Equity Data Architecture
Eleanor Voss
Partner · New York

Eleanor co-founded Semperr in 2021. She leads the firm's work with private-equity clients and writes the Charter on matters concerning Trace and the House. Before Semperr she was a partner at a quantitative research firm in midtown.