Everything looks fine at 4-bit
Summary
The discussion explores the trade-offs of running quantized local LLMs, contrasting them with unquantized BF16 models. A factual-recall test about moonwalkers is used to evaluate accuracy. It shows that even base models can make errors, which underscores the need to verify model output.
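A fact-based check of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the test used in the discussion. It assumes that "moonwalkers" refers to the twelve Apollo astronauts who walked on the Moon; the `score_answer` helper and the sample answers are hypothetical and made up for illustration, not real model output.

```python
# Reference set: the twelve Apollo astronauts who walked on the Moon.
# Assumption: this is the fact the "moonwalkers" test refers to.
MOONWALKERS = {
    "Neil Armstrong", "Buzz Aldrin", "Pete Conrad", "Alan Bean",
    "Alan Shepard", "Edgar Mitchell", "David Scott", "James Irwin",
    "John Young", "Charles Duke", "Eugene Cernan", "Harrison Schmitt",
}


def score_answer(names):
    """Compare a model's list of names against the reference set."""
    given = {n.strip() for n in names if n.strip()}
    return {
        "correct": sorted(given & MOONWALKERS),    # names that are right
        "missing": sorted(MOONWALKERS - given),    # names the model left out
        "invented": sorted(given - MOONWALKERS),   # names that do not belong
    }


# Hypothetical answers, invented for illustration only (not measured results).
answers = {
    "bf16": ["Neil Armstrong", "Buzz Aldrin", "Michael Collins"],
    "4-bit": ["Neil Armstrong", "Buzz Aldrin"],
}

for label, names in answers.items():
    result = score_answer(names)
    print(label, "correct:", len(result["correct"]),
          "invented:", result["invented"])
```

Scoring against a fixed reference set makes errors visible in both directions: names the model omits and names it adds that do not belong. Running the same prompt through the quantized and the BF16 model and comparing the two score dictionaries is one way to check whether an error comes from quantization or was already present in the unquantized model.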