
Ask HN: What's your biggest LLM cost multiplier?

6 points by teilom 2 days ago | 5 comments
"Tokens per request" has been a misleading cost model for us in production. The real drivers seem to be multipliers: retries/429s, tool fanout, P95 context growth, and safety passes.
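To make the multiplier framing concrete, here is a minimal sketch of how those four drivers might compound. Everything in it is an assumption for illustration: the names `CostMultipliers` and `effective_tokens` are hypothetical, the factors are treated as independent, each safety pass is assumed to re-read the same tokens, and every retry is assumed to be billed in full (which may overstate 429s that are rejected before any tokens are processed).

```python
from dataclasses import dataclass


@dataclass
class CostMultipliers:
    """Hypothetical per-request multipliers; all defaults mean 'no overhead'."""
    retry_rate: float = 0.0      # fraction of LLM calls that are retried
    tool_fanout: float = 1.0     # average LLM calls per user request
    context_growth: float = 1.0  # actual / nominal prompt size (e.g. at P95)
    safety_passes: float = 0.0   # extra guard calls per LLM call


def effective_tokens(base_tokens: float, m: CostMultipliers) -> float:
    """Estimate tokens billed per user request under the assumptions above."""
    tokens_per_call = base_tokens * m.context_growth
    calls = m.tool_fanout * (1 + m.retry_rate)
    # Each safety pass is assumed to re-process the same tokens once.
    return tokens_per_call * calls * (1 + m.safety_passes)


if __name__ == "__main__":
    m = CostMultipliers(retry_rate=0.1, tool_fanout=3,
                        context_growth=1.5, safety_passes=1)
    nominal = 2000
    total = effective_tokens(nominal, m)
    print(f"nominal={nominal} effective={total:.0f} ({total / nominal:.1f}x)")
```

With those made-up inputs, a nominal 2,000-token request comes out at roughly 19,800 billed tokens, about 9.9x the "tokens per request" estimate, even though no single factor looks alarming on its own.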

What’s been the biggest cost multiplier in your prod LLM systems, and what policies worked (caps, degraded mode, fallback, hard fail)?


