• 6 Posts
  • 18 Comments
Joined 5 years ago
Cake day: June 30th, 2020

  • It really depends on how you quantize the model and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you’re going to need a much larger context window to feed in large documents etc., then you’d need to go smaller on the model size (14b, 27b, etc.), get a multi-GPU setup, or something with unified memory and a lot of RAM (like the Mac Minis others are mentioning). There’s a rough sketch of the math below.
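
    For a ballpark feel of where the 24 GB goes, here’s a minimal back-of-the-envelope sketch (not the linked calculator’s exact method); the layer count, K/V head count, and head dimension are assumptions for a typical 32b dense model and vary between architectures:

    ```python
    # Rough VRAM estimate: quantized weights + K/V cache + a fudge factor.
    # All parameters here are illustrative assumptions, not exact figures.

    def estimate_vram_gb(
        n_params_b: float,        # model size in billions of parameters (e.g. 32)
        weight_bits: float,       # effective bits/weight, e.g. ~4.5 for Q4_K_M
        n_layers: int,            # transformer layers (assumed 64 for a 32b model)
        n_kv_heads: int,          # K/V attention heads (assumed 8, model-dependent)
        head_dim: int,            # dimension per head (commonly 128)
        context: int,             # context window in tokens
        kv_bits: float = 16,      # K/V cache precision (16 = fp16, 8 if quantized)
        overhead_gb: float = 1.0, # CUDA context, activations, etc. (rough guess)
    ) -> float:
        # Quantized weights: params * bits per weight, converted to GB.
        weights_gb = n_params_b * 1e9 * weight_bits / 8 / 1e9
        # K/V cache: K and V each store layers * context * kv_heads * head_dim values.
        kv_gb = 2 * n_layers * context * n_kv_heads * head_dim * kv_bits / 8 / 1e9
        return weights_gb + kv_gb + overhead_gb

    # Example: a 32b model at ~4.5 bits with an 8k context -> roughly 21 GB,
    # which is why it squeezes onto a 24 GB 3090 but a much bigger context won't.
    print(round(estimate_vram_gb(32, 4.5, 64, 8, 128, 8192), 1), "GB")
    ```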

  • I think the point about the 3-year upgrade cycle being optimal unless you’re a real enthusiast is right on the money. I’d say I’m moderately interested in tech improvements in this space, and it seems ideal for me since I’ll be considering an upgrade from an iPhone 13 Pro (likely informed by how gimmicky or actually useful the Apple Intelligence stuff turns out to be, and whether I can offload my ChatGPT Pro subscription). If you’re really not a huge phone tech person or chasing the best camera quality, then every 4 years is probably completely reasonable as well. At that point the battery life is probably starting to suffer anyway…