A pension contract signed in 2005 is a promise that outlives every system, every architect, and every data dictionary that touched it. The policy is still on the books in 2026. The schema it was born into is not.
This is the part that data teams underestimate. In e-commerce or SaaS, a schema migration is a Friday afternoon Liquibase script. In pension and life insurance, the schema you used to issue a policy is part of the contractual and regulatory record. When the regulator asks how you computed the technical reserve in 2011, or when a beneficiary disputes a payout calculation from 2014, you do not get to answer with today's model.
Why long-dated contracts break schema discipline
A term life or unit-linked pension product has three structural traits that make schema versioning uniquely painful:
- Duration: 20 to 50 years of active state. The system that originated the contract is almost certainly decommissioned before the contract matures.
- Regulatory layering: Solvency II, IFRS 17, FATCA, CRS, local pension authority circulars, MASAK in Turkey, BaFin in Germany — each introduces fields, recalculations, or reclassifications that change what a "premium" or "reserve" means.
- Retrospective application: Many regulations require you to restate historical positions under the new framework. IFRS 17's transition was the obvious example; few teams realize they will do this exercise again.
The combined effect is that the data model is not a snapshot. It is a moving target with legal weight at every point in time.
The silent compounding
Most insurers handle schema change the same way:
- Add a new column when the regulator asks.
- Backfill it with a best-effort default for existing policies.
- Document the change in a Confluence page that nobody reads.
- Move on.
Each of these steps is individually defensible. The compounding effect is not. After ten years of this, you have a policy table where:
tax_residency_countrywas added in 2014 for FATCA and backfilled asTRfor everyone, including the German expat who became reportable in 2016.product_categorywas redefined in 2018 when the pension authority merged two product classes, but the old codes still appear in policies issued before the cutover.surrender_value_methodexists as a string field with seven distinct spellings of the same algorithm because three different teams wrote to it.- The
valid_from/valid_tocolumns on the reference tables were added in 2020, so anything before 2020 has no temporal boundary at all.
None of this is visible until a FATCA audit asks you to reproduce the reportable population as of December 31, 2017, using the rules and the data model that were in force on that date.
What "reconstruct the model" actually means
In a cross-border audit — FATCA, CRS, or a regulator-driven restatement — you are asked three things:
- What did you know on date X? Not what is in the system today, but what the operational systems contained on that date.
- Under what rules did you classify it? The schema version, the code lists, the calculation engines.
- Can you reproduce the output? Bit-for-bit, ideally. Approximately, at minimum.
If your only answer is "we have a backup tape from 2017 somewhere," you have already lost. Restoring a tape gives you data, not a runnable model. You need the DDL, the reference data as of that date, the calculation logic version, and the mapping from then-fields to now-fields.
Most insurers I have seen cannot do this past three years back. Some cannot do it past one.
What actually works
This is not solved by picking a fancier database. It is solved by treating schema as a first-class, versioned, time-bound artifact. Concretely:
- Bitemporal storage on every contract-relevant table. Two time axes:
valid_time(when the fact was true in the world) andsystem_time(when we recorded it). Not just on the policy table — on every reference table, code list, and rate table that feeds a calculation. - Schema-as-data. The DDL and the code list values for each regulatory regime should be queryable. When you compute reserves under IFRS 17 transition, you should be able to ask "what was the product taxonomy in force on 2015-06-30" the same way you ask any other question.
- Calculation engine versioning tied to schema versioning. The reserve calculation from 2011 referenced fields and code lists that existed in 2011. The engine version, the schema version, and the regulatory regime are a triplet. Store them together, deploy them together, retire them together.
- No silent backfills. When you add
tax_residency_countryin 2014, the value for a 2005 policy isunknown, notTR. The day you collect it, you record it with avalid_fromof the collection date, not the policy inception date. This is the single change that prevents the most damage. - Migration records, not migration scripts. A migration is a historical event. Store it: what changed, when, why, who approved, what the rollback would look like. The Liquibase changelog is necessary but not sufficient.
The honest part
None of this is greenfield work. If you are running a portfolio with policies from the 1990s on a core system that was modernized twice, you will not retrofit bitemporal storage across the board. You will not reconstruct the 2008 code list from memory.
What you can do is stop the compounding. Pick the next regulatory change — there will be one within twelve months — and treat it as the first entry in a properly versioned schema register. Backfill what you can defend. Mark the rest as unknown. The goal is not a perfect history. The goal is that five years from now, when the audit comes, the period after today is reconstructible.
The period before today is a negotiation. The period after today is a choice.