
2026-07-01

The Re-enrollment Problem: Why Automatic Enrollment Campaigns Are the Hardest Batch Operation in BES

Every two years, when the BES auto-enrollment cycle comes around, someone in a management meeting will say it out loud: "It's just adding new participants. The enrollment logic already works."

That sentence is where the incident bridge starts.

Regulatory auto-enrollment and re-enrollment mandates are not a marketing operation. They are a synchronized, legally timestamped mass policy inception event, and they expose every optimistic assumption your contribution pipeline, EGM reporting stack, and state contribution (devlet katkısı) reconciliation layer were built on. The enrollment logic itself is rarely the problem. The problem is everything downstream of it.

What Actually Happens in the Batch Window

On a normal business day, a BES operator ingests maybe a few hundred new participant records across dozens of employer files. Contribution files arrive throughout the month. State contribution requests are generated on a predictable monthly cadence. EGM feeds are incremental.

During an auto-enrollment or re-enrollment cycle, all of that changes in one night:

The enrollment engine handles it. It was designed to. The rest of the stack was not.

Where Things Actually Break

1. The Contribution Pipeline Assumes Temporal Spread

Most contribution processing pipelines were designed under the implicit assumption that policy inception dates are distributed across the calendar. Indexes, partitioning strategies, and staging table designs all quietly depend on this.

When 200,000 policies share the same effective date, you get:

The pipeline doesn't fail. It just gets very, very slow, and it gets slow at the exact moment the operations team needs it to be fast.
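
A cheap way to see whether this assumption is about to bite is to measure how concentrated the effective dates are in an incoming batch. What follows is a minimal Python sketch, not code from any real BES stack: the function name `effective_date_concentration` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date


def effective_date_concentration(effective_dates):
    """Return (most_common_date, share_of_batch) for a list of effective dates."""
    if not effective_dates:
        raise ValueError("no effective dates supplied")
    counts = Counter(effective_dates)
    top_date, top_count = counts.most_common(1)[0]
    return top_date, top_count / len(effective_dates)


# Hypothetical data: a normal month spread evenly over 20 days...
normal_month = [date(2026, 6, 1 + (i % 20)) for i in range(1000)]
# ...versus an enrollment night where almost every policy shares one date.
enrollment_night = [date(2026, 7, 1)] * 950 + normal_month[:50]

for label, dates in (("normal", normal_month), ("enrollment", enrollment_night)):
    top_date, share = effective_date_concentration(dates)
    print(f"{label}: busiest effective date {top_date} holds {share:.0%} of policies")
```

If the busiest date holds most of the batch, any index, partition, or staging design keyed on effective date is worth load-testing at that concentration before the real run.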

2. EGM Reporting Was Built for Deltas

EGM (Emeklilik Gözetim Merkezi) reporting stacks are typically designed around incremental change. New policy today, status change tomorrow, contribution posted the day after. The reporting layer builds its extracts around "what changed since the last successful run."

An auto-enrollment batch inverts this. Everything changed. The delta is the population. Reports that normally emit a few megabytes now emit gigabytes. Validation routines that iterate row-by-row hit timeouts. And because EGM submission windows are regulatory, you cannot simply "try again tomorrow."

I have seen teams discover, at 3 AM on the morning after enrollment, that their EGM extract job has a hardcoded row limit somewhere in a stored procedure written seven years ago by someone who no longer works at the company. That limit was 100,000. The batch was 340,000.
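
One way to make a limit like that fail loudly instead of truncating silently is to read one row past the cap and raise if that extra row exists. This is a minimal Python sketch under that assumption. `take_with_cap` and `ExtractTruncated` are hypothetical names, and a real extract would apply the same idea inside whatever query layer it already uses.

```python
from itertools import islice


class ExtractTruncated(Exception):
    """Raised when the source holds more rows than the configured cap."""


def take_with_cap(rows, cap):
    """Return all rows if there are at most `cap`; raise instead of truncating."""
    # Read one row beyond the cap so "exactly at the cap" and "over the cap"
    # can be told apart.
    taken = list(islice(rows, cap + 1))
    if len(taken) > cap:
        raise ExtractTruncated(
            f"source has more than {cap} rows; refusing to truncate silently"
        )
    return taken


# Hypothetical usage with the numbers from the incident above.
try:
    take_with_cap(range(340_000), cap=100_000)
except ExtractTruncated as exc:
    print(f"extract aborted: {exc}")
```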

3. State Contribution Reconciliation Assumes Steady-State

Devlet katkısı reconciliation is the quiet killer. The logic that matches state contribution requests against Treasury responses, and reconciles rejections back to participants, was built assuming a normal monthly volume with predictable rejection patterns.

Auto-enrollment produces:

That last point is the one that catches everyone. A participant who opts out within the legal window is entitled to a full refund including any state contribution accrual. If your reconciliation layer processed the state contribution request before the opt-out was recorded (and it will, because the opt-out window extends past the first contribution cycle), you now have a manual unwind for every single opt-out.

At scale, "manual unwind for every opt-out" means a dedicated team for six months.
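
The ordering problem can at least be measured before it becomes a staffing problem. The sketch below flags participants whose state contribution request was processed on or before the day their opt-out was recorded. It is a minimal illustration in Python: the `Participant` fields, the function names, and the sample cohort are all hypothetical, and treating a same-day request as needing an unwind is my own conservative assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Participant:
    participant_id: str
    state_request_date: Optional[date]  # day the state contribution request was processed
    opt_out_date: Optional[date]        # day the opt-out was recorded, if any


def needs_unwind(p: Participant) -> bool:
    """True when a state contribution request was processed on or before the opt-out date."""
    if p.opt_out_date is None or p.state_request_date is None:
        return False
    return p.state_request_date <= p.opt_out_date


def unwind_candidates(participants):
    """Return the ids of participants whose state contribution must be unwound."""
    return [p.participant_id for p in participants if needs_unwind(p)]


# Hypothetical cohort.
cohort = [
    Participant("A", date(2026, 7, 20), None),              # stayed enrolled
    Participant("B", date(2026, 7, 20), date(2026, 8, 5)),  # opted out after the request
    Participant("C", None, date(2026, 7, 10)),              # opted out before any request
]
print(unwind_candidates(cohort))
```

Running the same check against a modeled opt-out rate at production volume gives a rough count of unwinds to plan for.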

The Assumptions That Fail Silently

Across incident postmortems from several enrollment cycles, the pattern is consistent. The failures are almost never in code that was written for enrollment. They are in code that was written years earlier under assumptions that were true at the time:

What Actually Works

From experience, the operational patterns that survive an auto-enrollment cycle look like this:

  1. Treat the enrollment batch as a first-class release event, not a business-as-usual run. Freeze other changes, staff the bridge, pre-warm caches, pre-allocate sequence ranges.
  2. Shadow-run the batch against production-scale synthetic data at least twice before the real date. Not UAT volume. Production volume. The bugs only appear at scale.
  3. Decouple the enrollment write from the downstream fan-out. Get the participant records committed with correct effective dates first. Let EGM notification, document generation, and welcome communications flow through async queues with backpressure. Regulatory clocks care about the record; the SMS can wait an hour.
  4. Pre-calculate the state contribution reconciliation impact. Model the opt-out unwind scenario before the batch runs. Know your worst case.
  5. Instrument the batch with cohort-level metrics, not just row counts. "350,000 rows processed" tells you nothing. "Cohort X: 12,000 policies, 98.2% with valid TCKN, 87 flagged for manual review" tells you what to do next.
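
Point 3 above can be sketched with a bounded queue. The enrollment write finishes first, and only then are committed participant ids fed to downstream workers through a queue whose fixed size provides the backpressure. This is a minimal Python `asyncio` illustration, not a production design: `commit_enrollments`, `fan_out`, `fake_send`, and the record shape are hypothetical stand-ins for the real transactional write and notification calls.

```python
import asyncio


def commit_enrollments(records):
    """Stand-in for the transactional write; returns committed participant ids."""
    return [r["participant_id"] for r in records]


async def fan_out(committed_ids, send, queue_size=100, workers=4):
    """Push committed ids to `send` through a bounded queue; return (sent, failed)."""
    queue = asyncio.Queue(maxsize=queue_size)
    sent, failed = [], []

    async def worker():
        while True:
            pid = await queue.get()
            try:
                await send(pid)
                sent.append(pid)
            except Exception as exc:  # keep the worker alive, record the failure
                failed.append((pid, repr(exc)))
            finally:
                queue.task_done()

    tasks = [asyncio.create_task(worker()) for _ in range(workers)]
    for pid in committed_ids:
        # put() waits while the queue is full, so slow consumers throttle the
        # producer instead of letting a backlog grow without bound.
        await queue.put(pid)
    await queue.join()  # wait until every queued id has been handled
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return sent, failed


async def fake_send(pid):
    """Hypothetical downstream call (EGM notification, document, SMS)."""
    await asyncio.sleep(0)


records = [{"participant_id": i} for i in range(1000)]
committed = commit_enrollments(records)  # the record exists before any fan-out
sent, failed = asyncio.run(fan_out(committed, fake_send, queue_size=50))
print(f"committed={len(committed)} sent={len(sent)} failed={len(failed)}")
```

Because the commit step never waits on the queue, a slow SMS gateway delays only the notifications, not the legally relevant record.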

The Real Lesson

Auto-enrollment is not hard because the enrollment logic is complex. It is hard because it is the only operation in BES that simultaneously exercises every downstream system at peak load with a hard regulatory deadline and legally binding effective dates.

Everything that has ever been approximate, quietly under-tested, or built around "normal daily volume" surfaces in the same batch window. And unlike most production incidents, you cannot roll back. The effective dates are legally established the moment the batch commits.

The teams that treat auto-enrollment like a marketing campaign spend the following quarter cleaning up. The teams that treat it like a coordinated mass-inception release, with the same rigor as a core system migration, spend that quarter working on something else.