Ask HN: What's your biggest LLM cost multiplier?
"Tokens per request" has been a misleading cost model for us in production. The real drivers seem to be multipliers: retries/429s, tool fanout, P95 context growth, and safety passes.
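To make the point concrete, here's a back-of-envelope sketch of how those multipliers compound on a naive per-request estimate. All the numbers (retry rate, fanout, context growth factor) are made up for illustration; plug in your own.

```python
def effective_cost(base_tokens: float,
                   price_per_1k: float,
                   retry_rate: float = 0.15,    # hypothetical: fraction of calls retried (429s, timeouts)
                   tool_fanout: float = 3.0,    # hypothetical: avg LLM calls per user request via tools
                   context_growth: float = 1.8, # hypothetical: P95 context vs. median, as a multiplier
                   safety_passes: int = 1) -> float:
    """Expected $ per user request once the multipliers stack."""
    calls = tool_fanout * (1 + retry_rate) + safety_passes
    return base_tokens * context_growth * calls * price_per_1k / 1000

naive = 2000 * 0.01 / 1000           # what "tokens per request" alone predicts
real = effective_cost(2000, 0.01)
print(f"naive ${naive:.4f} vs effective ${real:.4f} ({real / naive:.1f}x)")
# → naive $0.0200 vs effective $0.1602 (8.0x)
```

Even with modest-looking inputs, the compounding lands close to an order of magnitude over the naive estimate, which matches what we saw.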
What’s been the biggest cost multiplier in your prod LLM systems, and what policies worked (caps, degraded mode, fallback, hard fail)?