Oh, and I typically get 16-20 tok/s running a 32b model on Ollama using Open WebUI. Also, I have experienced issues with 4-bit quantization for the K/V cache on some models myself, so just FYI.
It really depends on how you quantize the model and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you’re going to need a much larger context window to input large documents etc., then you’d need to go smaller with the model size (14b, 27b etc.), or get a multi-GPU setup, or get something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
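If you want a feel for where the numbers come from, here's a rough back-of-the-envelope sketch in Python. It is not how the linked calculator works internally (I haven't checked), just the usual approximation: weights take roughly params × bits-per-weight / 8 bytes, and the K/V cache takes 2 (K and V) × layers × KV heads × head dim × context length × bytes per element. The model shape below (64 layers, 8 KV heads, head dim 128) and the ~4.5 bits per weight for a 4-bit quant are illustrative assumptions, not figures for any specific model, and real usage adds runtime overhead on top.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float) -> float:
    """Approximate K/V cache size: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_gib(n_params: float, bits_per_weight: float, n_layers: int,
                 n_kv_heads: int, head_dim: int, context_len: int,
                 kv_bytes_per_elem: float) -> float:
    """Weights plus K/V cache in GiB (ignores activations and overhead)."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len, kv_bytes_per_elem)
    return total / GIB


if __name__ == "__main__":
    # Hypothetical 32b model shape; all values here are assumptions.
    shape = dict(n_params=32e9, bits_per_weight=4.5, n_layers=64,
                 n_kv_heads=8, head_dim=128)
    for ctx in (8192, 32768):
        for label, kv_bytes in (("f16", 2.0), ("q8", 1.0)):
            gib = estimate_gib(context_len=ctx, kv_bytes_per_elem=kv_bytes, **shape)
            print(f"ctx={ctx:>6} kv={label:>3}: ~{gib:.1f} GiB")
```

The main takeaway is that the weights are a fixed cost while the K/V cache grows linearly with context, which is why a big context window pushes you toward a smaller model or a more aggressively quantized cache.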
Is it possible to use StreetComplete on iOS?
I think we can all agree that modifications to these models which remove censorship and propaganda on behalf of one particular country or party are valuable for the sake of accuracy and impartiality. But reading some of the example responses for the new model, I honestly find myself wondering if they haven’t gone a bit further than that by replacing some of the old non-responses and positive portrayals of China and the CPC with a highly critical perspective typified by western governments which are hostile to China (in particular the US). Even the name of the model doesn’t make it sound like neutrality and accuracy are their primary aims here.
I used to daily drive Ubuntu some years ago for work/personal use but have been back on Win 10 primarily for the last 4-5 years. I was considering trying to go back due to how much Windows sucks (despite some proprietary software only being available on it), but remembering the trouble I had with some networking/printer drivers and troubleshooting those issues, and then seeing this article, is definitely making me reconsider…
Yeah, I use Voyager pretty much exclusively on my iPhone, so maybe I should request a feature like that there? Seems like it would be something that many people would appreciate. Not sure why I end up seeing posts with -10, -15 votes… Those are generally trash haha
It would be cool if they would provide some useful statistics about the aggregated data as well. Maybe something like showing the percentile for pay to the ED/CEO or for the total compensation compared to other organizations in the sector.
I didn’t scour the site so maybe this does exist.
I think the point about the 3-year upgrade cycle being optimal unless you are a real enthusiast is right on the money. I’d say I’m moderately interested in tech improvements in this space and it seems ideal for me, as I will be considering an upgrade from an iPhone 13 Pro (likely informed by how gimmicky or actually useful the Apple Intelligence stuff becomes and if I can offload my ChatGPT Pro subscription). If you’re really not a huge phone tech person or looking for the best camera quality, then every 4 years is probably completely reasonable as well. At that point, the battery life is probably starting to suffer too…
Not sure. I’ve never heard of that one. I’ll check it out though.
That would also be great
Yeah I’d love one for how restaurants and companies treat their workers. There was a group that made something like that for the NYC area years back I think but I don’t believe it ever took off. I believe it was released by ROC.
Is there a changelog available?
I had a similar issue on watchOS 9, and Apple, after many attempts, would not help me revert the update which seems to have caused the issue. I ended up having to pay for AppleCare and get a replacement watch, which I refuse to let update. Pretty annoying. I definitely won’t be allowing my current one to update to version 10 now. It’s sad that the bar is this low, but at least Apple doesn’t force you to update in this case (at least so far).
AMD only and not Nvidia? That’s what I was seeing based on a quick search. Unfortunately, I don’t have an AMD GPU.
This is impressive and interesting, but what about hardware ray tracing support? Proton has been very impressive but I thought that RT on DX12 was basically non-existent on Linux.
So my grandchildren will more than likely be belters. Got it.
Here’s a non Google AMP link to the article: https://www.theregister.com/2023/07/01/chiplet_market/
Looks like it now has Docling Content Extraction Support for RAG. Has anyone used Docling much?