General
Samsung Research shrinks 30B-parameter AI models to 3 GB
I'm researching Samsung at the moment and they have some pretty amazing things in their development pipeline that will hit the market. The lab has working tech to run ~30B-parameter models fully on-device by compressing them from ≳16 GB down to <3 GB! They get there with aggressive quantization plus a smart runtime that juggles work across the CPU, GPU, and NPU. Basically, they showed you can run a model that size on a smartphone without blowing RAM/VRAM limits, grilling the battery, or needing constant cloud access. With that possible, the real competition shifts from "who has the biggest model?" to "who can extract the most usable intelligence per joule and per GB on-device?"
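To get a feel for how aggressive that compression is, here's a back-of-envelope sketch (my own illustration, not Samsung's actual scheme) of raw weight storage for a 30B-parameter model at different quantization bit-widths. Note that even plain 4-bit weights land around ~14 GB, so squeezing under 3 GB implies well under 1 bit per parameter on average once you factor in whatever mix of quantization, pruning, and sharing they use:

```python
def model_size_gb(params: float, bits_per_param: float) -> float:
    """Raw weight storage in GB (1 GB = 2**30 bytes), ignoring runtime overhead."""
    return params * bits_per_param / 8 / 2**30

PARAMS = 30e9  # ~30B parameters

# Footprint at common quantization levels
for bits in (16, 8, 4, 2, 1):
    print(f"{bits:>2}-bit: {model_size_gb(PARAMS, bits):6.1f} GB")
```

Running this shows 16-bit weights alone need ~56 GB, which is why on-device inference at this scale is impossible without serious compression.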
If there's interest in a more detailed explanation, let me know in the comments!
https://www.techbuzz.ai/articles/samsung-reveals-how-it-s-shrinking-30b-parameter-ai-models-to-3gb