Every data engineer working in regulated industries learns to build for compliance. What most of them don't learn until they've shipped a few years of reports is that compliance has a time dimension that breaks naive architectures.
When you submit a GEV report to EGM, a HAYMER record to the insurance association, or a state contribution (devlet katkısı) calculation to the Treasury, you are not making a single permanent statement. You are making a statement that the regulator may reinterpret later, sometimes years later, using rules that did not exist when you submitted it. When that happens, you are expected to produce the corrected version under the new rules, not simply re-send the original.
This is the retroactive regulatory change problem, and almost every pipeline I have seen in Turkish finance is architecturally unprepared for it.
Two Different Compliance Problems
There are two distinct questions a regulator can ask about your historical data:
- Was this submission correct under the rules in force at submission time?
- Does this submission match the rules as currently defined?
Most teams build systems that answer the first question well. They log what was sent, when, with what schema, and against which validation logic. They can prove they were compliant on the day they submitted.
The second question is the one that actually shows up in audits, reconciliation requests, and correction cycles. And it is a fundamentally different engineering problem, because it requires reconstructing historical inputs against rule sets that did not exist at the time those inputs were created.
A Concrete Example from State Contribution
The devlet katkısı pipeline has been through several interpretation cycles since 2013. A simple case: the definition of what counts as an "eligible contribution" for matching purposes has been clarified, narrowed, and re-expanded multiple times. Edge cases around partial-month entries, employer-sponsored conversions, and refund reversals have all been redefined retroactively.
When a new interpretation lands, you typically get a circular that says, in effect: recalculate the last N months under the new rules, identify discrepancies, submit corrections, and reconcile the matching amounts already credited to participants.
If your pipeline architecture looks like this:
- Source systems → transformation → submission file → archive
then you are in trouble, because the archive contains the output. It does not contain the inputs in their original form, and certainly not in a form that can be replayed against a different rule set. You end up reconstructing from production OLTP systems whose state has moved forward. That is usually the moment teams discover that someone modified a participant record three months ago and that no audit trail is granular enough to undo the change for replay.
What HAYMER and GEV Taught Me
HAYMER and GEV have similar dynamics. The technical schema stays relatively stable, but the semantic interpretation of fields drifts. A policy status code that meant one thing in 2019 gets a clarified definition in 2022, and now your 2019–2021 submissions are technically wrong by the 2022 rules — even though they were correct at the time.
When the regulator asks for a corrected resubmission, you need three things that most pipelines do not preserve:
- The raw input state as it existed at submission time, not just the transformed output
- The transformation logic version that produced the original submission
- A way to apply the new transformation logic to the old raw input state without contaminating it with current production state
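One way to make the first two items concrete is to write a small manifest at submission time that ties the frozen inputs to the rule version that processed them. The sketch below is illustrative only: `SubmissionManifest`, `freeze_submission`, `verify_snapshot`, and the field names are hypothetical, and it assumes the raw inputs can be serialized as JSON rows.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import date


def canonical_bytes(rows):
    """Serialize rows deterministically so identical inputs always hash the same."""
    return json.dumps(
        rows, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


@dataclass(frozen=True)
class SubmissionManifest:
    # Hypothetical record written once per submission and never updated.
    submission_id: str
    submitted_on: date
    rule_version: str    # transformation logic that produced the file
    input_sha256: str    # fingerprint of the frozen raw inputs
    output_sha256: str   # fingerprint of the file actually sent


def freeze_submission(submission_id, submitted_on, rule_version, raw_rows, output_rows):
    """Build the manifest and return the raw blob to be stored alongside it."""
    raw_blob = canonical_bytes(raw_rows)
    out_blob = canonical_bytes(output_rows)
    manifest = SubmissionManifest(
        submission_id=submission_id,
        submitted_on=submitted_on,
        rule_version=rule_version,
        input_sha256=hashlib.sha256(raw_blob).hexdigest(),
        output_sha256=hashlib.sha256(out_blob).hexdigest(),
    )
    # Assumption: the caller writes raw_blob to write-once storage keyed by the hash.
    return manifest, raw_blob


def verify_snapshot(manifest, raw_blob):
    """Check that a stored snapshot is still the one the submission was built from."""
    return hashlib.sha256(raw_blob).hexdigest() == manifest.input_sha256
```

With a manifest like this, a correction request starts from a lookup ("which inputs, which rule version") rather than from a reconstruction against live production tables.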
Missing any one of these turns a one-week correction project into a six-month archaeology dig.
The Architectural Shift
The lesson, after going through enough of these cycles, is that submission pipelines should be modeled as bitemporal systems. You need to track two timelines independently:
- Business time: when the event actually occurred (when the contribution was made, when the policy was issued)
- System time: when your system learned about it, and which rule version was applied
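A minimal sketch of the two timelines, assuming each version of a fact is stored as its own row and a correction closes the old row instead of overwriting it. The `ContributionFact` type, its field names, and the kuruş integer amounts are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ContributionFact:
    participant_id: str
    amount_kurus: int
    business_date: date                   # when the contribution was actually made
    recorded_on: date                     # when the system learned this version
    superseded_on: Optional[date] = None  # when a later version replaced it


def as_known_on(facts: List[ContributionFact], system_date: date) -> List[ContributionFact]:
    """Return the facts exactly as the system believed them on system_date."""
    return [
        f for f in facts
        if f.recorded_on <= system_date
        and (f.superseded_on is None or f.superseded_on > system_date)
    ]


def in_business_period(facts: List[ContributionFact], start: date, end: date) -> List[ContributionFact]:
    """Filter on business time only, independent of when the rows were recorded."""
    return [f for f in facts if start <= f.business_date <= end]
```

If a March contribution recorded as 50000 kuruş is corrected to 45000 in June, `as_known_on` with an April date still returns the 50000 row, which is the value an April submission was actually built on.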
Concretely, this means:
- Immutable raw input snapshots at submission time. Not the transformed output — the inputs. Stored separately from production OLTP so subsequent updates don't corrupt them.
- Versioned transformation logic, deployed as code with explicit version tags attached to every submission. "This file was produced by rule set v4.7" should be queryable.
- Replay capability as a first-class feature. You should be able to point any version of the transformation logic at any historical input snapshot and produce a deterministic output. This is non-negotiable and almost nobody builds it from day one.
- Diff tooling that compares the original submission against a replay under new rules, so you can produce defensible correction reports rather than wholesale resubmissions.
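The replay and diff items above can be sketched as a registry of pure functions keyed by version. Everything here is hypothetical: the two rule sets, the `kind` and `amount_kurus` fields, and the "refund reversals no longer count" reinterpretation are invented for illustration and do not describe any actual circular.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
RuleSet = Callable[[List[Row]], Dict[str, int]]

RULE_SETS: Dict[str, RuleSet] = {}


def rule_set(version: str):
    """Register a transformation under an explicit, queryable version tag."""
    def register(fn: RuleSet) -> RuleSet:
        RULE_SETS[version] = fn
        return fn
    return register


@rule_set("v1")
def eligible_v1(rows: List[Row]) -> Dict[str, int]:
    # Hypothetical original rule: every contribution counts toward matching.
    totals: Dict[str, int] = {}
    for r in rows:
        pid = r["participant_id"]
        totals[pid] = totals.get(pid, 0) + r["amount_kurus"]
    return totals


@rule_set("v2")
def eligible_v2(rows: List[Row]) -> Dict[str, int]:
    # Hypothetical reinterpretation: refund reversals no longer count.
    totals: Dict[str, int] = {}
    for r in rows:
        if r["kind"] == "refund_reversal":
            continue
        pid = r["participant_id"]
        totals[pid] = totals.get(pid, 0) + r["amount_kurus"]
    return totals


def replay(version: str, snapshot_rows: List[Row]) -> Dict[str, int]:
    """Pure function of (rule version, frozen inputs); reads nothing from production."""
    return RULE_SETS[version](snapshot_rows)


def diff_submissions(original: Dict[str, int], replayed: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return only the participants whose eligible total changed, as (before, after)."""
    changed: Dict[str, Tuple[int, int]] = {}
    for pid in sorted(set(original) | set(replayed)):
        before, after = original.get(pid, 0), replayed.get(pid, 0)
        if before != after:
            changed[pid] = (before, after)
    return changed
```

Calling `diff_submissions(replay("v1", snapshot), replay("v2", snapshot))` on the same frozen snapshot yields only the affected participants, which is the shape a correction report needs. Because the rule sets take the snapshot as their only input, replaying the same version against the same snapshot gives the same result each time.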
None of this is exotic technology. It is the same pattern as event sourcing, with the regulatory rule set treated as a versioned reducer. The hard part is not technical; it is convincing the business that the extra storage and complexity are worth it, before the first retroactive change forces the conversation.
What This Costs If You Skip It
The teams that build only for submission-time compliance pay for it in three predictable ways:
- Correction cycles become engineering projects, not operational tasks. Every retroactive rule change triggers a multi-week effort involving production data extracts and manual reconciliation, with a real risk of introducing new errors during the reconstruction itself.
- Audit defensibility erodes over time. You can explain what you submitted, but you cannot easily demonstrate why it was the correct output of a specific rule set applied to a specific input state. "Trust us, this was the logic at the time" is not an acceptable answer.
- Reconciliation with the regulator becomes adversarial. When their recalculation disagrees with yours and you cannot replay your own history, you lose the argument by default.
The Practical Takeaway
If you are building or maintaining a regulatory pipeline in Turkish pension, insurance, or any similarly regulated domain, ask one question of your current architecture: can you, today, take the inputs from a submission made eighteen months ago and replay them through a rule set you will write next year, producing a deterministic and defensible output?
If the answer is no, you are not building for compliance. You are building for submission. Those are not the same thing, and the regulator is the one who decides when the difference matters.