Infrastructure on Guru Dude

https://shane.greaves.casa/categories/infrastructure/
Recent content in Infrastructure on Guru Dude
Shane Greaves (shane@greaves.casa)
Sun, 31 May 2026 18:30:00 -0500

Fixing Empty Responses from a Local LLM
https://shane.greaves.casa/posts/2026-05-31-fixing-empty-responses-from-a-local-llm/
Sun, 31 May 2026 18:30:00 -0500

The Symptom

I spent some time chasing a frustrating failure mode in a self-hosted agent stack: the model was clearly alive, but some requests came back empty, or with enough hidden reasoning overhead that the whole system felt sluggish.

The confusing part was that the usual “is the service up?” checks all looked fine. The API responded. The model was loaded on the GPU. Short prompts worked. Health checks passed.

But once the prompts got larger, the system started to misbehave in ways that were hard to separate: