Every engineering team eventually writes a runbook that says something like: "On failure, restore to last known good snapshot and replay." It is the foundational assumption of modern disaster recovery. Database point-in-time restore, event sourcing with replay, blue/green deployments with rollback — they all rest on the same premise: yesterday's state is recoverable, and the world will tolerate you re-deriving the present from it.
In pension and insurance pipelines, that premise is wrong. Not partially wrong. Structurally wrong.
The Prior State Has Already Left the Building
When a BES (Bireysel Emeklilik Sistemi) contribution batch is processed, the resulting position is not a private fact inside your data warehouse. Within hours, sometimes minutes, that state has been:
- Submitted to EGM (Emeklilik Gözetim Merkezi) as part of the daily reconciliation file
- Used by the state contribution engine to calculate the 30% devlet katkısı entitlement
- Reflected in the participant's mobile app as a legally binding balance
- Referenced by intermediary distribution channels for commission accrual
- Captured in the fund's NAV calculation for that valuation date
If at 14:00 you discover that the 09:00 batch double-counted a transfer-in for 4,000 participants, the conventional playbook says: restore the 08:55 snapshot, fix the bug, replay. That playbook assumes you control all consumers of the intermediate state. You do not. The state contribution request is already in the queue. The NAV is already published. The participant has already seen the wrong number and possibly screenshotted it.
Rollback, in the e-commerce sense, would mean retracting facts that are no longer yours to retract.
What "Rollback" Actually Means Under Regulation
Regulators do not recognize the concept of rollback. They recognize correction. The distinction is not semantic.
A rollback says: "This never happened." A correction says: "This happened, here is what it should have been, here is the adjusting entry, here is the audit trail tying the two together." Insurance accounting under TFRS 17 and the pension regulations under the EGM framework are both built on the second model. Every state that ever became visible to a downstream system must remain reconstructable, with its lineage intact, even after it has been superseded.
This has concrete consequences for how pipelines must be designed:
- No destructive updates on positions that have been reported. Once a balance has been sent to EGM or used in a devlet katkısı calculation, the row is immutable. Corrections happen via offsetting entries with their own effective and booking dates.
- Bitemporal storage is not optional. You need both the business date (when the economic event occurred) and the system date (when you knew about it). A standard CDC pipeline that overwrites the current row is a regulatory liability.
- Replay is not idempotent in the way engineers assume. Replaying a contribution batch after a fix does not produce the same downstream effects, because some of those effects (state contribution claims, commission payments) have already been consumed.
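The first two points above can be made concrete with a small sketch. It assumes a simplified in-memory ledger; the `Entry` and `Ledger` names, the fields, and the dates are hypothetical illustrations, not part of any EGM specification. The point is that a correction is a new signed entry with its own booking time, so both "what we reported at noon" and "what we know now" stay answerable.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    participant_id: str
    amount: Decimal                 # signed; a negative amount offsets an earlier entry
    effective_date: date            # business date: when the economic event occurred
    booked_at: datetime             # system date: when the pipeline recorded it
    corrects: Optional[int] = None  # position of the entry this one adjusts, if any


class Ledger:
    """Append-only store: entries are added, never updated or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def append(self, entry: Entry) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1

    def balance(self, participant_id: str, effective: date, known_at: datetime) -> Decimal:
        """Balance for business date `effective`, as the system knew it at `known_at`."""
        return sum(
            (e.amount for e in self._entries
             if e.participant_id == participant_id
             and e.effective_date <= effective
             and e.booked_at <= known_at),
            Decimal("0"),
        )


# Hypothetical scenario: the 09:00 batch double-counts a transfer-in,
# and an offsetting entry is booked at 14:30 the same day.
ledger = Ledger()
day = date(2024, 3, 1)
ledger.append(Entry("P1", Decimal("1000"), day, datetime(2024, 3, 1, 9, 0)))
dup = ledger.append(Entry("P1", Decimal("1000"), day, datetime(2024, 3, 1, 9, 0)))
ledger.append(Entry("P1", Decimal("-1000"), day, datetime(2024, 3, 1, 14, 30), corrects=dup))

as_reported = ledger.balance("P1", day, datetime(2024, 3, 1, 12, 0))   # what was visible at noon
as_corrected = ledger.balance("P1", day, datetime(2024, 3, 1, 15, 0))  # what is known after the fix
```

Nothing in the ledger was overwritten, so the erroneous noon balance can still be reproduced and tied to the entry that corrected it.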
The Failure Modes I Have Watched Teams Walk Into
A few patterns repeat across pension and life insurance organizations:
The "clean replay" trap. A team detects a bad batch, takes the platform offline, restores from snapshot, fixes the ETL, and replays. The participant balances now match the corrected truth — but the EGM submission from the bad run is still on file, and the daily reconciliation now shows a mismatch the team cannot explain because they deleted the evidence of what they originally sent.
The retroactive NAV problem. Fund accounting discovers a pricing error from three valuation dates ago. The instinct is to recalculate NAV from that date forward and update unit counts. But participants who exited between the error date and discovery date transacted at the wrong NAV, and their cash settlements are final. The "correct" history is not implementable; what is implementable is a compensation entry, booked today, with a clear reference to the erroneous valuation.
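As a rough illustration of such a compensation entry, the sketch below computes the amount owed to a participant who redeemed units at an understated NAV. The function name, the fields, the figures, and the two-decimal rounding rule are all assumptions made for the example; actual rounding and the treatment of overpayments are policy decisions outside its scope.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class CompensationEntry:
    participant_id: str
    amount: Decimal                 # positive: the participant was underpaid at exit
    booking_date: date              # booked today, not backdated
    erroneous_valuation_date: date  # reference to the valuation being compensated


def compensation_for_exit(
    participant_id: str,
    units_redeemed: Decimal,
    nav_used: Decimal,
    nav_corrected: Decimal,
    valuation_date: date,
    booking_date: date,
) -> CompensationEntry:
    """Build an adjusting entry instead of rewriting the settled redemption."""
    difference = units_redeemed * (nav_corrected - nav_used)
    # Assumption: amounts are rounded half-up to two decimal places.
    amount = difference.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return CompensationEntry(participant_id, amount, booking_date, valuation_date)


# Hypothetical figures: 1,000 units redeemed at 10.20 when the NAV should have been 10.25.
entry = compensation_for_exit(
    "P1",
    units_redeemed=Decimal("1000"),
    nav_used=Decimal("10.20"),
    nav_corrected=Decimal("10.25"),
    valuation_date=date(2024, 3, 1),
    booking_date=date(2024, 3, 6),
)
```

The original redemption stays exactly as it settled; the entry carries the difference and points back at the valuation date that caused it.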
The deployment rollback that breaks compliance. A new pricing engine version goes live, processes a day of premiums, and a defect is found. Reverting the deployment is trivial. Reverting the policies that were issued under the new version's calculations — already bound, already on the regulator's product registry — is not. The deployment artifact and the data artifact have different rollback semantics, and treating them the same is how you end up issuing policies you cannot legally cancel.
What Actually Works
The pipelines that survive audit are built on a different mental model:
- Append-only by default. Corrections are new events, not mutations of old ones. The entire transaction log is the source of truth, not the latest snapshot of it.
- A clear boundary between internal and externalized state. Anything that has crossed the boundary to EGM, the state contribution engine, the participant-facing channel, or the fund accounting NAV is treated as fact. Internal staging state can be rebuilt; externalized state can only be corrected.
- Compensating workflows as first-class citizens. Every batch process needs a documented reversal procedure that produces offsetting entries, not deletions. This is closer to how accounting has always worked than to how data engineering usually thinks.
- Replay tested against externalized consumers. Before you replay a batch, confirm which downstream systems will accept the replay as a correction and which will treat it as a duplicate. EGM, in particular, has specific reversal record types that must be used; sending a new positive submission to cancel an old one is not the same thing.
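One way to encode the internal/externalized boundary is to make the recovery path depend on whether a batch has crossed it. The sketch below is a minimal illustration under that assumption: `Batch`, `Posting`, `plan_recovery`, and the `-REV` identifier suffix are hypothetical names, and the reversal postings stand in for whatever reversal record type the receiving system actually requires.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Posting:
    posting_id: str
    participant_id: str
    amount: Decimal
    reverses: Optional[str] = None  # id of the posting this one offsets


@dataclass
class Batch:
    batch_id: str
    postings: List[Posting]
    # Names of external consumers that have already received this batch.
    externalized_to: Set[str] = field(default_factory=set)


def plan_recovery(batch: Batch) -> Tuple[str, List[Posting]]:
    """Choose a recovery path based on whether the batch has left the building."""
    if not batch.externalized_to:
        # Purely internal staging state: safe to discard and rebuild.
        return "rebuild", []
    # Externalized state: keep the originals and emit offsetting postings.
    reversals = [
        Posting(
            posting_id=f"{p.posting_id}-REV",
            participant_id=p.participant_id,
            amount=-p.amount,
            reverses=p.posting_id,
        )
        for p in batch.postings
    ]
    return "compensate", reversals


staged = Batch("B-1", [Posting("p1", "P1", Decimal("500"))])
sent = Batch("B-2", [Posting("p2", "P2", Decimal("500"))], externalized_to={"EGM"})

action_staged, _ = plan_recovery(staged)
action_sent, reversals = plan_recovery(sent)
```

Here the staged batch is eligible for a rebuild, while the batch already sent to EGM yields one offsetting posting per original, each linked back to the posting it reverses.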
The Uncomfortable Conclusion
Most recovery tooling sold to enterprises today — snapshot-based DR, point-in-time restore, GitOps-style deployment rollback — was built for systems where the database is the system of record and nobody outside has acted on its intermediate states. Pension and insurance data does not work that way. The system of record is jointly held with the regulator, the participant, and several intermediaries, and they have already moved.
Designing a pension pipeline as if you can roll it back is not a technical shortcut. It is a compliance bet that you will never actually need to recover. That bet eventually loses, and the cleanup is not an engineering problem at that point — it is a legal one.