How WikiWayne builds local-AI guides
Guides focus on open-weight models and self-hosted inference: Ollama, llama.cpp, vLLM, ComfyUI workflows, and the hardware that actually runs them. We prioritize setups readers can reproduce on consumer or prosumer GPUs rather than cloud-only demos.
VRAM tables and fit charts reflect the model size, quantization level (Q4, Q5, Q8, and similar), and context length stated in each article. When we quote tokens per second or load times, the post names the GPU, driver generation, and stack version used for that run. We do not publish synthetic benchmark scores or third-party “lab results” that we cannot tie to a documented test.
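As a rough illustration of how model size, quantization, and context length combine into a fit estimate, here is a minimal Python sketch. It is not the calculation behind any particular table on the site. The function names, the flat 1 GiB overhead, and the example figures (an 8B-parameter model at an assumed 4.5 effective bits per weight, 32 layers, 8 KV heads, head dimension 128) are all hypothetical. Real quantized files mix precisions across tensors, and runtimes add their own buffers, so treat the result as a first approximation only.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory: parameter count times bits per weight."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache: keys and values (factor of 2) for every layer,
    KV head, head dimension, and token position. Defaults to 16-bit entries."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem / GIB


def estimate_vram_gib(params_billion: float, bits_per_weight: float,
                      n_layers: int, n_kv_heads: int, head_dim: int,
                      context_len: int, overhead_gib: float = 1.0) -> float:
    """Weights + KV cache + a flat overhead (an assumed placeholder value)."""
    return (weights_gib(params_billion, bits_per_weight)
            + kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len)
            + overhead_gib)


def fits(estimate_gib: float, gpu_vram_gib: float) -> bool:
    """True if the estimate is within the GPU's stated VRAM."""
    return estimate_gib <= gpu_vram_gib


if __name__ == "__main__":
    # Hypothetical 8B model, ~4.5 bits per weight, 8192-token context.
    total = estimate_vram_gib(8, 4.5, 32, 8, 128, 8192)
    print(f"estimated VRAM: {total:.2f} GiB, fits in 8 GiB: {fits(total, 8)}")
```

Doubling the context length doubles the KV cache term while the weight term stays fixed, which is why a fit claim only means something alongside the context length it assumes.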
Release notes, upstream documentation, and issue trackers inform news and how-to posts. When a vendor changes licensing or deprecates a model, we update the guide and bump updatedAt where the change affects recommendations. Corrections are noted in the article when material facts change.
Affiliate links never determine which tools we cover or how we rank them. See editorial standards and disclosures for policy detail.