
API Polling Best Practices: How to Avoid Rate Limits

David Alford · 8 min read

Postman’s 2023 State of the API Report found that only about a third of public APIs offer webhooks.[1] That means for most integrations you’ll ever build, polling isn’t a design choice; it’s a requirement. And most polling code gets rate-limited within days of real traffic.

I’ve spent too many weeknights debugging a cron job that hammered a partner’s API until the token got banned. These five API polling best practices separate a production-grade poller from a cron loop. Follow them and you’ll stop getting banned.

Why Cron-Loop Polling Gets Rate-Limited

Cron-loop polling gets rate-limited because it does three things wrong at once: it polls at fixed intervals regardless of what the API is telling it, it re-fetches data it already has, and it runs on the minute so every tenant’s request lands on the API at the same instant. Production pollers fix all three.

The failure mode is almost always the same. A developer writes a setInterval or a cron entry, picks a number that feels reasonable (one minute, five minutes, whatever), and ships it. The code works fine in testing. It works fine on day one. Then traffic ramps, tenants get added, the API in question has a quiet hour followed by a busy hour, and suddenly every request is coming back 429 or the token is temporarily suspended.

The fix isn’t to slow the interval down until the bans stop. The fix is to build a poller that actually listens to what the API is saying. Five techniques do the heavy lifting.

Five API Polling Best Practices That Prevent Rate Limits

A production-grade poller tracks state with cursors so it never re-fetches seen data. It adapts its interval based on response signals. It uses conditional requests when the API supports them. It adds jitter so concurrent tenants don’t cluster. And it retries 429 responses exactly the way the HTTP spec says to. The rest of this post walks through each one.

1. Track State With Cursors, Not Timestamps

A cursor is a pointer the API gives you that says “start here next time.” Using a cursor instead of a since=<now-5min> timestamp means you never ask for data you already have, never miss an item during clock drift, and you can restart safely after a crash. There are three common cursor shapes.

  • Incremental ID. The API returns auto-incrementing numeric IDs and you store the highest one you’ve seen. Next poll, you ask for everything greater than that. Simple, easy to verify, and by construction the cursor only moves forward.
  • Timestamp. The cursor is an ISO-8601 string. Lexical comparison works because ISO-8601 sorts correctly as text. Fine until the API’s clock drifts against yours, which is why timestamp cursors belong to the API, not your server.
  • Opaque. The API hands you a string that encodes its own paging state. You can’t compare two opaque cursors to know which one is newer. You trust the API’s ordering and store exactly what it gave you.

The subtle bug cursors introduce: they have to move forward. If your code accidentally replays a stale cursor after a crash, you re-process the same items as fresh events. A poller has to validate that the new cursor is actually newer than the old one before persisting it. For opaque cursors that validation is impossible by inspection, so you fall back to trusting the API and making sure your dedup layer catches anything that slips through.
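That forward-motion check is small enough to sketch. The snippet below assumes an incremental numeric ID cursor; the function name is illustrative, not an API from any particular library:

```typescript
type Cursor = number;

// Decide which cursor to persist after a poll. Never accept a cursor that
// moves backward: replaying a stale cursor after a crash would re-process
// already-seen items as fresh events.
function advanceCursor(stored: Cursor | null, incoming: Cursor): Cursor {
  if (stored !== null && incoming <= stored) {
    return stored; // keep the newer cursor; the dedup layer catches stragglers
  }
  return incoming;
}
```

For timestamp cursors the same shape works with lexical comparison of ISO-8601 strings; for opaque cursors there is nothing to compare, so this guard simply doesn’t apply.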

2. Let the API Tell You When to Poll

Adaptive polling means your interval changes based on what the API’s responses are telling you. When items arrive, poll faster. When responses come back empty, poll slower. When you hit N consecutive empty responses, reset to a base interval. Clamp both ends so you never exceed the API’s sustained rate or stretch the interval so wide that fresh events go stale.

The algorithm is four lines of logic. Divide the current interval by a factor when data arrives. Multiply it by a factor when the response is empty. Reset to the base after a configurable number of consecutive empties. Clamp to [minIntervalMs, maxIntervalMs] so you never oscillate out of the allowed band.
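Those four lines look like this in practice. The tuning values below are illustrative, and the option names mirror the clamp band described above:

```typescript
interface AdaptiveOptions {
  baseIntervalMs: number; // interval to reset to after repeated empties
  minIntervalMs: number;  // never poll faster than this
  maxIntervalMs: number;  // never poll slower than this
  factor: number;         // speed-up / slow-down multiplier
  resetAfter: number;     // consecutive empties before resetting to base
}

interface AdaptiveState {
  intervalMs: number;
  emptyStreak: number;
}

function nextState(s: AdaptiveState, gotData: boolean, o: AdaptiveOptions): AdaptiveState {
  const clamp = (ms: number) =>
    Math.min(o.maxIntervalMs, Math.max(o.minIntervalMs, ms));
  if (gotData) {
    // Data arrived: poll faster, clear the empty streak.
    return { intervalMs: clamp(s.intervalMs / o.factor), emptyStreak: 0 };
  }
  const streak = s.emptyStreak + 1;
  if (streak >= o.resetAfter) {
    // Enough consecutive empties: reset to the base interval.
    return { intervalMs: o.baseIntervalMs, emptyStreak: 0 };
  }
  // Empty response: back off, clamped to the allowed band.
  return { intervalMs: clamp(s.intervalMs * o.factor), emptyStreak: streak };
}
```

The state is a plain value, so it serializes next to the cursor and survives a restart with the interval it had already learned.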

Shopify’s leaky bucket is the textbook case.[2] Standard Shopify stores give each app a 40-token bucket that refills at 2 tokens per second. A burst of 40 requests drains the bucket, and every subsequent call returns 429 until the bucket refills. An adaptive poller reads the X-Shopify-Shop-Api-Call-Limit header after each response, sees it climbing toward the ceiling, and widens its interval before hitting the wall. A cron-loop poller doesn’t look at the header and walks straight into the ban.
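Reading that header is a one-liner. Assuming the value is a used/bucket-size pair such as 32/40 (check Shopify’s docs for your plan), a sketch of the check, with an illustrative 80% threshold:

```typescript
// Decide whether to widen the polling interval before the bucket empties,
// based on the X-Shopify-Shop-Api-Call-Limit header ("32/40" means 32 of
// 40 bucket tokens are in use).
function shouldBackOff(headerValue: string, threshold = 0.8): boolean {
  const [used, limit] = headerValue.split("/").map(Number);
  return used / limit >= threshold;
}
```

When this returns true, the adaptive loop widens its interval even if the last response carried data, because the bucket, not the data stream, is the binding constraint.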

3. Use Conditional Requests to Skip Empty Polls

Conditional requests send a fingerprint of the last response (an ETag or a Last-Modified timestamp) and let the API respond with 304 Not Modified if nothing changed. The 304 is tiny, costs almost nothing, and on some APIs it doesn’t count against your rate limit at all. It’s the single biggest rate-limit saver available when the API supports it.

GitHub’s REST API is explicit about this. Responses with a 304 status don’t count against your primary rate limit.[3] If you’re polling a repository for new issues and nothing has changed since your last check, you send an If-None-Match header with the ETag from the previous response, the server returns 304 with an empty body, and your rate-limit budget is untouched. The poller can also short-circuit its extract, dedup, and dispatch cycle because there’s nothing to process.
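A sketch of that round trip, with the header-building and 304 handling split into pure helpers so they’re easy to test; the Accept value follows GitHub’s REST conventions and the fetch wrapper assumes a Node 18+ global fetch:

```typescript
// Build request headers, attaching the previous response's ETag when we have one.
function conditionalHeaders(etag: string | null): Record<string, string> {
  const headers: Record<string, string> = { Accept: "application/vnd.github+json" };
  if (etag !== null) headers["If-None-Match"] = etag; // fingerprint of last response
  return headers;
}

// Interpret the status: 304 means nothing changed, keep the old ETag and
// skip the extract/dedup/dispatch cycle entirely.
function interpretPoll(status: number, prevEtag: string | null, newEtag: string | null) {
  if (status === 304) return { changed: false, etag: prevEtag };
  return { changed: true, etag: newEtag };
}

async function pollForIssues(url: string, etag: string | null) {
  const res = await fetch(url, { headers: conditionalHeaders(etag) });
  const { changed, etag: next } = interpretPoll(res.status, etag, res.headers.get("ETag"));
  return { changed, etag: next, body: changed ? await res.json() : null };
}
```

The ETag persists next to the cursor; losing it costs you one full-priced request, nothing worse.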

Not every API supports conditional requests. Payment APIs usually don’t. A lot of niche SaaS APIs don’t. But when an API does support them, using them is non-negotiable. You’re leaving a free exemption on the table otherwise.

4. Add Jitter Before You Add Workers

Jitter is a small random offset added to each poll’s schedule. Without jitter, every subscription polling on a five-minute interval fires at :00, :05, :10, and every tenant’s request lands on the API at the same instant. With jitter, those requests spread across the interval and the API never sees a spike.

Amazon’s engineering team popularized the “full jitter” algorithm: sleep for a random value between 0 and the full backoff window, not for the backoff plus a small random offset.[4] It sounds counterintuitive. You’re spreading requests across the whole window instead of clustering near the target time. But the simulation in the AWS post shows full jitter substantially reduces both client work and server contention, and it clearly beats any fixed-offset schedule.
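Full jitter is a few lines. This sketch follows the AWS post’s formula — a uniform random delay in [0, min(cap, base · 2^attempt)] — with the random source injectable so the behavior is testable:

```typescript
// Full-jitter backoff delay: uniform random in [0, min(cap, base * 2^attempt)).
function fullJitterDelay(
  attempt: number,
  baseMs = 1_000,
  capMs = 60_000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}
```

The same shape works for spreading scheduled polls: replace the exponential ceiling with the poll interval and every tenant lands at a random point inside it instead of at the top of the minute.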

This matters more for agencies than anyone. If you’re polling Salesforce on behalf of 50 clients and every poll fires at the top of the minute, you’ve built a self-inflicted DDoS against your own business. Stagger the schedule per tenant with a random offset, and the problem disappears without adding a single line of infrastructure.

5. Retry 429s the Way the RFC Says

When a 429 response arrives, don’t guess. The server tells you what to do. RFC 9110 defines the Retry-After header as either a number of seconds or an HTTP date.[5] Read it, wait exactly that long plus a bit of jitter, and retry. Cap your retry count so a persistently throttled API doesn’t queue up requests forever.

The three common retry bugs that turn a temporary throttle into a banned token:

  • Retrying immediately. Most APIs increase the rate-limit penalty on repeated offenders. A retry inside the cooldown window deepens the hole.
  • Ignoring Retry-After. The server is telling you exactly when it will accept the next request. Respect it. Guessing shorter is how tokens get suspended.
  • Retrying forever. Without a budget, a persistently failing API turns into an ever-growing retry queue. Cap the retry count at two or three and surface the failure.
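Parsing Retry-After correctly means handling both forms RFC 9110 allows. A minimal sketch; the caller is expected to add a little jitter to the returned wait and cap total retries at two or three, per the bullets above:

```typescript
// Convert a Retry-After header value to milliseconds to wait.
// RFC 9110 allows two forms: delta-seconds ("120") or an HTTP-date
// ("Wed, 21 Oct 2015 07:28:00 GMT").
function retryAfterMs(value: string, now: Date = new Date()): number {
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000); // delta-seconds form
  const at = Date.parse(value); // HTTP-date form
  return Number.isNaN(at) ? 0 : Math.max(0, at - now.getTime());
}
```

Returning 0 on an unparseable value is a deliberate fail-open choice here; a stricter poller might treat it as a hard error and surface the response for inspection.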

If your workflow fires a write in response to what polling found, the same rule applies in reverse. Send an idempotency key on every write, so a retry after a network timeout doesn’t double-charge a customer or create a duplicate record. Polling is usually a read operation, but the downstream steps it triggers rarely are.
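One way to derive that key — this is a sketch of the pattern, not any particular API’s requirement — is to hash the identifiers of the triggering event, so the same logical operation always produces the same key no matter how many times it’s retried:

```typescript
import { createHash } from "node:crypto";

// Deterministic idempotency key: a retried write for the same tenant and
// triggering event yields the same key, so a server that deduplicates on
// it executes the write at most once.
function idempotencyKey(tenantId: string, eventId: string): string {
  return createHash("sha256").update(`${tenantId}:${eventId}`).digest("hex");
}
```

The key then travels in whatever header the target API specifies (Idempotency-Key is the common convention, popularized by Stripe).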

Frequently Asked Questions

How often should you poll an API?

Start at the interval the API documentation recommends, or 60 seconds if there isn’t one, and let adaptive polling adjust from there. Don’t pick a fixed interval based on how “real-time” you want the data to feel. Pick one based on what the API tolerates, read the rate-limit headers on every response, and widen or narrow the interval based on what the headers say.

What does HTTP 429 Too Many Requests mean?

A 429 response means you’ve exceeded the API’s rate limit and the server is refusing the request. It’s not an error in your code, it’s the API telling you to slow down. Respect the Retry-After header in the response, wait that long plus a small random jitter, and retry with a capped budget. Do not retry immediately.

How do ETags reduce rate limit consumption?

An ETag is a fingerprint the API returns with each response, like a hash of the content. On your next request, you send the ETag back in an If-None-Match header. If nothing has changed, the API responds 304 Not Modified with an empty body. On GitHub, 304 responses don’t count against your primary rate limit at all, which makes polling almost free when nothing is happening.

Polling gets a bad reputation because cron-loop code gets banned and the postmortem writes itself. But a production poller isn’t a cron loop. It tracks state with cursors, adapts its interval from what the API is actually returning, uses conditional requests where available, staggers across tenants with jitter, and retries 429s the way the spec tells it to.

TaskJuice handles cursor progression, adaptive intervals, and 429-aware retries with jitter out of the box. You pick the endpoint and the cursor shape, and the platform tracks state, widens or narrows the interval from response signals, and retries rate-limited responses the way the RFC specifies. Conditional request headers and per-tenant staggered start times are available as configuration when the API you’re integrating supports them.

References

[1] The State of the API Report, Postman: postman.com/state-of-api/api-technologies/

[2] REST Admin API rate limits, Shopify: shopify.dev/docs/api/usage/rate-limits

[3] Best practices for using the REST API, GitHub Docs: docs.github.com/en/rest/using-the-rest-api/best-practices-for-using-the-rest-api

[4] Exponential Backoff And Jitter, AWS Architecture Blog: aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/

[5] RFC 9110, HTTP Semantics, Retry-After: www.rfc-editor.org/rfc/rfc9110#name-retry-after
