Microsoft Graph API throttling for contact sync: what IT admins need to know

How Graph API throttling actually works for /users and /contacts endpoints, why naive sync loops fail at 500+ mailboxes, and the four patterns (batching, delta, backoff, concurrency) that keep you under the limits.

Updated 2026-04-20 · 6 min read

How Graph throttling actually works

Microsoft Graph doesn’t expose a single “requests per second” number. It enforces a tiered set of limits that vary by endpoint, tenant size, app permissions, and Microsoft’s current capacity on the backend. For contact sync you care about three of them:

  • Per-app, per-mailbox: ~10,000 requests / 10 minutes against Outlook endpoints, plus a cap of 4 concurrent requests per mailbox.
  • Per-app, per-tenant: concurrent-request and aggregate ceilings that scale with tenant size.
  • Resource-specific: separate limits for /users directory reads vs. /users/{id}/contacts writes.

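When you hit any limit, Graph returns HTTP 429 (Too Many Requests) with a Retry-After header telling you how many seconds to wait (usually between a few seconds and a few minutes). Ignoring Retry-After and retrying immediately is the single fastest way to stay throttled: rejected requests still count against your usage limits, so a tight retry loop can keep an app locked out long after the original spike has passed.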
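One way to avoid hitting the per-mailbox ceiling in the first place is to count requests on the client side. The sketch below is a minimal sliding-window counter, assuming the ~10,000 requests / 10 minutes figure above and assuming a simple rolling window (Graph's real accounting is not documented in that detail). MailboxBudget and its methods are hypothetical names for illustration, not part of any SDK.

```python
import time
from collections import deque


class MailboxBudget:
    """Client-side sliding-window counter for one app + mailbox pair (illustrative)."""

    def __init__(self, max_requests=10_000, window_seconds=600.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the class can be tested without sleeping
        self._sent = deque()        # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self):
        """Record a request and return True if the budget allows it, else return False."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True

    def seconds_until_free(self):
        """Seconds until the oldest recorded request leaves the window (0.0 if there is room)."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])
```

A caller would keep one MailboxBudget per mailbox and check try_acquire() before each request. This is a safety margin only; the server's 429 and Retry-After remain the source of truth.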

The naive pattern that fails at scale

Most first-pass contact-sync scripts look like this:

foreach ($mailbox in $allMailboxes) {
  foreach ($contact in $allContacts) {
    POST /users/{mailbox}/contacts  # 1 request per contact per mailbox
  }
}

With 500 mailboxes and 500 contacts per mailbox, that’s 250,000 requests every sync cycle. You will be throttled within minutes.

The four patterns that keep you under the limit

1. JSON batching

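Graph supports batching up to 20 operations per HTTP request via /$batch. A 500×500 sync drops from 250,000 HTTP requests to 12,500, a 20× reduction in round trips with zero logic change to the individual operations. Batching is not a throttling loophole, though: Graph evaluates each operation inside a batch against the limits individually, so a batch can return 200 overall while some of its sub-responses carry 429 and need their own retry.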
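As a rough illustration, the chunking step might look like the sketch below. It builds request bodies in the documented /$batch shape (a "requests" array whose entries have id, method, url, headers, and body) but sends nothing. The function names, the contact payloads, and the mailbox address are assumptions made up for the example.

```python
GRAPH_BATCH_LIMIT = 20  # maximum operations per /$batch request


def build_contact_batches(mailbox_id, contacts, batch_size=GRAPH_BATCH_LIMIT):
    """Yield /$batch request bodies that create `contacts` in one mailbox."""
    for start in range(0, len(contacts), batch_size):
        chunk = contacts[start:start + batch_size]
        yield {
            "requests": [
                {
                    "id": str(offset),  # must be unique within a batch; echoed in the response
                    "method": "POST",
                    "url": f"/users/{mailbox_id}/contacts",
                    "headers": {"Content-Type": "application/json"},
                    "body": contact,
                }
                for offset, contact in enumerate(chunk)
            ]
        }


def throttled_ids(batch_response):
    """Return the ids of sub-requests that came back 429 inside a batch response."""
    return [
        item["id"]
        for item in batch_response.get("responses", [])
        if item.get("status") == 429
    ]


# Example: 45 contacts for one mailbox become 3 batches of 20, 20, and 5 operations.
contacts = [{"givenName": f"User{i}"} for i in range(45)]
batches = list(build_contact_batches("alice@contoso.com", contacts))
```

Each body would be POSTed to the /$batch endpoint by whatever HTTP client you use. throttled_ids shows why the batch-level status is not enough: sub-requests that were throttled have to be picked out of the response and retried on their own.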

2. Delta queries

Never do full scans. /users/delta and /users/{id}/contacts/delta return only what changed since the delta token saved from the previous sync. The first sync returns everything; subsequent syncs typically return 0–5% of the dataset, so in steady state delta alone eliminates 95% or more of read traffic.
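The paging loop behind a delta sync can be sketched as follows. Delta responses are paged with @odata.nextLink, and the last page carries an @odata.deltaLink that becomes the starting URL of the next cycle. run_delta_sync and the fetch_json callable are hypothetical names; fetch_json stands in for your own authenticated GET that returns parsed JSON.

```python
def run_delta_sync(fetch_json, start_url):
    """Follow one round of a delta query to completion.

    fetch_json: callable that takes a URL and returns the parsed JSON page
                (hypothetical; supply your own authenticated HTTP GET).
    start_url:  the @odata.deltaLink saved from the previous cycle, or the bare
                .../delta URL for the initial sync.
    Returns (changes, next_delta_link).
    """
    changes = []
    url = start_url
    while True:
        page = fetch_json(url)
        # Deleted items appear in "value" too, marked with an "@removed" property.
        changes.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            url = page["@odata.nextLink"]  # more pages remain in this round
        else:
            # The final page carries the link to persist for the next cycle.
            return changes, page["@odata.deltaLink"]
```

Persist the returned link only after the changes have been applied successfully, so a crashed cycle replays the same changes instead of silently skipping them.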

3. Retry-After-aware exponential backoff

When a 429 comes back, sleep for the Retry-After value. For transient 503/504 responses (not explicit throttling), use exponential backoff: 1s, 2s, 4s, 8s, capped at 60s, with full jitter. Never retry on 400-class errors other than 429.
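A minimal sketch of that decision logic, assuming Retry-After arrives as a number of seconds and that a 429 without the header should fall back to the same exponential schedule. retry_delay is a hypothetical helper, not a Graph SDK function.

```python
import random

TRANSIENT_STATUSES = {503, 504}


def retry_delay(status, attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return the seconds to wait before retrying, or None if the request must not be retried.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if status != 429 and status not in TRANSIENT_STATUSES:
        return None  # other 4xx errors will not succeed on retry
    if retry_after is not None:
        return float(retry_after)  # the server's hint always wins
    # Exponential ceiling of 1s, 2s, 4s, 8s ... capped at `cap`;
    # full jitter then picks uniformly between 0 and that ceiling.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

Full jitter matters most when many workers are throttled at once: without it they all wake at the same instant and trigger the next 429 together.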

4. Bounded concurrency

Parallelism helps throughput up to a point, after which it just increases the chance of 429s. A bounded semaphore that allows roughly 10 mailboxes to be processed concurrently is the sweet spot for most tenants. CYNC uses 10 by default, a value sized to stay comfortably under Graph’s per-app, per-mailbox ceiling even in directories with thousands of users.
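A bounded semaphore takes only a few lines with asyncio. This is a generic sketch, not CYNC's implementation: sync_all and the sync_one coroutine are hypothetical names, and sync_one stands in for whatever does the per-mailbox work.

```python
import asyncio


async def sync_all(mailboxes, sync_one, max_concurrent=10):
    """Run sync_one(mailbox) for every mailbox, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(mailbox):
        async with semaphore:  # waits here while max_concurrent mailboxes are already active
            return await sync_one(mailbox)

    # return_exceptions=True collects failures as results instead of raising the first one,
    # so a single bad mailbox does not hide the outcome of the others.
    return await asyncio.gather(*(guarded(m) for m in mailboxes), return_exceptions=True)
```

Results come back in the same order as the input list, so failed mailboxes can be matched up and re-queued for the next cycle.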

What CYNC does out of the box

Implementing these four patterns correctly takes real engineering effort and ongoing maintenance as Graph evolves. CYNC does all four by default: 20-op JSON batching, delta on both /users and /contacts, Retry-After-aware retry handlers, and a bounded concurrency semaphore. See the features page for the full implementation breakdown.

What to monitor

  • Ratio of 429 responses to total responses (should be < 0.1%).
  • Total delta-query payload size per cycle (should stabilize after initial sync).
  • Batch fill rate (are you averaging close to 20 ops per batch, or wasting slots?).
  • P95 sync cycle duration — spikes usually mean backoff events.
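Three of these metrics fall straight out of data a sync job already has. The sketch below assumes you log the status of every response (including sub-responses inside batches), the size of each batch, and the duration of each cycle. sync_health is a hypothetical helper, and the nearest-rank P95 is one common definition among several.

```python
import math


def sync_health(status_codes, batch_sizes, cycle_seconds, batch_limit=20):
    """Summarize one reporting window of sync telemetry.

    status_codes:  HTTP status of every response, batch sub-responses included
    batch_sizes:   number of operations in each /$batch request sent
    cycle_seconds: duration of each completed sync cycle
    """
    throttled = sum(1 for status in status_codes if status == 429)
    ordered = sorted(cycle_seconds)
    # Nearest-rank P95: the smallest sample with at least 95% of samples at or below it.
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1] if ordered else None
    return {
        "throttle_ratio": throttled / len(status_codes) if status_codes else 0.0,
        "batch_fill_rate": (
            sum(batch_sizes) / (batch_limit * len(batch_sizes)) if batch_sizes else 0.0
        ),
        "p95_cycle_seconds": p95,
    }
```

A throttle_ratio above 0.001 corresponds to the 0.1% threshold in the list above, and a batch_fill_rate well below 1.0 means slots are being wasted.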

Stop fighting Graph throttling

Batching, delta, retry, and bounded concurrency — all built in. Free for up to 10 users.