How does Gemma4 12B run in 16 GB of VRAM?

Google's blog post makes a point of stressing "Laptop ready: Small enough to run locally with just 16GB of VRAM or unified memory."

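How does it actually manage to run in 16 GB of VRAM?
Or is it that the BF16 weights can't run, and you need an FP8-quantized version? But plenty of models already fit on a 16 GB card once quantized like that, including many with larger parameter counts.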
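
For reference, a rough back-of-the-envelope sketch of the weight memory at each precision. It assumes exactly 12e9 parameters (the "12B" in the name; the real count may differ) and counts weights only, ignoring the KV cache, activations, and runtime overhead, so real usage would be higher. The helper names are made up for illustration.

```python
# Rough estimate of weight-only memory for a 12B-parameter model.
# Assumption: exactly 12e9 parameters; KV cache, activations and
# framework overhead are NOT included, so actual usage is higher.

GIB = 2**30


def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB


def fits(num_params: float, bits_per_param: float, budget_gib: float) -> bool:
    """True if the weights alone fit within the given memory budget."""
    return weight_memory_gib(num_params, bits_per_param) <= budget_gib


if __name__ == "__main__":
    params = 12e9   # assumed parameter count
    budget = 16.0   # 16 GiB of VRAM or unified memory
    for name, bits in [("BF16", 16), ("FP8", 8), ("4-bit", 4)]:
        size = weight_memory_gib(params, bits)
        verdict = "fits" if fits(params, bits, budget) else "does not fit"
        print(f"{name}: {size:.1f} GiB of weights, {verdict} in {budget:.0f} GiB")
```

By this estimate the BF16 weights alone (2 bytes per parameter, about 24 GB) exceed 16 GB, while 8-bit and 4-bit weights would leave some headroom for everything else.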

v2ex

Created on February 27, 2026

u/alive_fighter6701