Batch Overnight, or Real-Time Right Now: Choosing the Right Tempo for AI Processing

A thought experiment on why one of the costliest engineering mistakes in an AI system is often processing something in real time that never needed to be instant.

This one's speculative — a thought experiment about design principles, not a report on something I've built.

Real-time is the default assumption, and it's often wrong

It's easy to build every AI-driven process as a real-time, on-demand call, because that's the simplest mental model: a request comes in, the system responds. But a meaningful share of AI workloads don't actually need that immediacy. Processing them as if they did means paying real-time infrastructure costs and accepting real-time failure modes for work that had hours, or a full overnight window, to complete.

Batch processing buys real engineering slack

Work that can run overnight in a batch, rather than on demand in real time, gets three kinds of slack. It can use cheaper compute pricing. It can tolerate a slow or retried model call, because nobody is waiting on it. And it can recover cleanly from a transient failure by simply retrying before the batch window closes. None of that is available on a real-time path, where a slow or failed call is a customer-visible problem the instant it happens.
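To make the "retry before the batch window closes" idea concrete, here is a minimal sketch in Python. Everything named in it (`run_before_deadline`, `TransientError`, `BatchWindowClosed`, the backoff defaults) is hypothetical and chosen for illustration; it assumes the job signals retryable failures with a dedicated exception type and that the window is expressed as a deadline on a monotonic clock.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


class BatchWindowClosed(Exception):
    """Raised when the job could not succeed before the window closed."""


def run_before_deadline(
    job: Callable[[], T],
    deadline: float,
    base_delay: float = 1.0,
    max_delay: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `job` with exponential backoff until it succeeds or the window closes.

    `deadline` is on the same scale as `clock()` (seconds). Only TransientError
    is retried; any other exception propagates immediately.
    """
    attempt = 0
    while True:
        try:
            return job()
        except TransientError as exc:
            delay = min(max_delay, base_delay * (2 ** attempt))
            attempt += 1
            # Stop if waiting out the backoff would run past the window.
            if clock() + delay >= deadline:
                raise BatchWindowClosed(
                    f"gave up after {attempt} attempt(s)"
                ) from exc
            sleep(delay)
```

The clock and sleep functions are injected so the behavior can be checked without actually waiting. A real-time path has no equivalent of that loop: its deadline is the caller's patience, which is usually shorter than the first backoff delay.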

The actual question is when the answer needs to exist

The right tempo for a given piece of AI processing isn't a technology preference. It's a direct consequence of when the result actually needs to be available. If nobody needs the answer until tomorrow morning, computing it at 2am in a batch and computing it the instant the triggering event happens produce the same outcome, at very different levels of engineering risk and infrastructure spend.
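That decision rule can be written down in a few lines. The following is a sketch under assumptions, not a prescription: `Tempo`, `choose_tempo`, and the one-hour `safety_margin` default are all hypothetical names and values, and it assumes you can estimate when the next batch run will finish.

```python
from datetime import datetime, timedelta
from enum import Enum


class Tempo(Enum):
    BATCH = "batch"
    REAL_TIME = "real_time"


def choose_tempo(
    needed_by: datetime,
    next_batch_finishes_at: datetime,
    safety_margin: timedelta = timedelta(hours=1),
) -> Tempo:
    """Pick batch whenever the next batch run, plus a margin for retries,
    finishes before anyone needs the result."""
    if next_batch_finishes_at + safety_margin <= needed_by:
        return Tempo.BATCH
    return Tempo.REAL_TIME


# A report read at 8am can wait for a batch that finishes at 4am.
morning_report = choose_tempo(
    needed_by=datetime(2024, 1, 2, 8, 0),
    next_batch_finishes_at=datetime(2024, 1, 2, 4, 0),
)

# A reply someone is waiting on five seconds from now cannot.
chat_reply = choose_tempo(
    needed_by=datetime(2024, 1, 1, 15, 0, 5),
    next_batch_finishes_at=datetime(2024, 1, 2, 4, 0),
)
```

Note that the inputs are all about time and none are about technology, which is the point: the function never asks which model or queue is involved.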

Audit the system for fake urgency

A useful exercise for any AI-heavy system is to go through each real-time call and ask, honestly, whether anything downstream is actually waiting on the result in that moment, or whether the call was built real-time by default and nobody ever revisited the assumption. The answer is often that a meaningful share of those calls could move to batch with no one downstream ever noticing the difference, and with a real reduction in cost and operational risk.
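One way such an audit might be sketched in Python: list each call site, note how long its result sits before anything reads it, and flag the real-time ones whose slack covers a full batch cycle. The `CallSite` record, the `find_fake_urgency` helper, the 24-hour `batch_interval` default, and the example call sites are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class CallSite:
    name: str
    runs_real_time: bool
    # Time between the triggering event and the first downstream read.
    slack: timedelta


def find_fake_urgency(
    call_sites: List[CallSite],
    batch_interval: timedelta = timedelta(hours=24),
) -> List[str]:
    """Return names of real-time call sites whose results sit unread for at
    least one full batch interval, making them candidates to move to batch."""
    return [
        site.name
        for site in call_sites
        if site.runs_real_time and site.slack >= batch_interval
    ]


# Hypothetical inventory of call sites.
inventory = [
    CallSite("chat_reply", runs_real_time=True, slack=timedelta(seconds=2)),
    CallSite("ticket_summary", runs_real_time=True, slack=timedelta(hours=30)),
    CallSite("weekly_digest", runs_real_time=False, slack=timedelta(days=7)),
]

candidates = find_fake_urgency(inventory)
```

In this made-up inventory only `ticket_summary` is flagged: it runs in real time, yet nothing reads its output for more than a day. The hard part of a real audit is measuring slack honestly, which the code cannot do for you.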


I'm Jesse Myers — Marine veteran, 32 years in enterprise IT, now building production AI systems. This site is where I write about what I've actually built, and occasionally about ideas I haven't built yet but think are worth taking seriously.