Will 2026 be the year of local AI use in coding, without subscription costs? And can we somehow use it to get a handle on environmental damage – or will it become even more inefficient and energy-intensive if everyone runs it on their own laptops instead of in well-cooled data centres?
These questions were brought to mind again today by this video from c’t 3003:
‘Local AI is now REALLY useful (and it runs on this hardware)’ (German)
Over the last year, I often read that local AI is fine for prompting/chatting, but not for agentic usage and the like. That seems to have changed a bit.
The models Qwen3 Coder 30B and Mistral Small 3.2 are mentioned, among others. The video suggests using the LM Studio tool instead of Ollama for downloading and running the models, but this is purely a matter of preference.
The local models can then be integrated into an IDE such as Visual Studio Code using plugins like Roo Code (Roo Code - Visual Studio Marketplace). I needed to increase the context size in my first test since it’s set to a low number by default.
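One way to raise the context window, assuming the model is served via Ollama, is a custom Modelfile. This is a sketch, not from the video: the model tag `qwen3-coder:30b` and the context value are placeholders you should adjust to whatever you actually pulled.

```shell
# Untested sketch: derive a model variant with a larger context window.
# FROM and num_ctx are standard Ollama Modelfile directives; the tag is a placeholder.
cat > Modelfile <<'EOF'
FROM qwen3-coder:30b
PARAMETER num_ctx 32768
EOF

# Register the variant under a new name, then select it in your IDE plugin.
ollama create qwen3-coder-32k -f Modelfile
```

Note that a larger context window also means noticeably higher memory use, so it is worth checking what your machine can actually hold.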
I was also told about the Zed IDE (with OpenCode) and the Continue tooling (https://www.continue.dev/).
Does anyone happen to have any current, maybe scientifically based assessments of the sustainability of local AI?
Nice to see you also like the c’t 3003 videos! A fan myself, but I seem to have missed this particular one.
What kind of scientific study are you looking for regarding sustainability? Since you will only be running the models locally, I guess only the embodied footprint of your hardware and the direct power consumption matter. Water usage / cooling should not really be a factor, and figures for development / training time could be hard to come by …? What do you say?
Since you already got the model running: what setup did you use? The 19 GB of VRAM the model needs seems really high for consumer hardware. Did you already have a special card at home? What did you buy? Or are you running the model on the CPU?
But unfortunately I haven’t had more time to test different scenarios yet.
Regarding studies:
My main question is whether we could achieve something like “sustainable AI prompting / agentic coding” on our own laptops / own hardware - or whether this is a dumb idea that does not scale, and using a company-internal AI model in a (green-energy-powered) data centre is much more efficient.
Because my heart bleeds when I currently do my daily coding with AI models in the cloud - without knowing whether I’m using coal-powered data centres.
Training models is another topic, of course, but “daily use” would be a start, I guess.
But I’m no expert at all in any of this, and I haven’t dug into the AI talks of ecocompute (yet).
For me the main question, before the sustainability discussion can even be opened, is whether we can get to a comparable setup. macOS is pretty unique here with its unified memory (RAM shared between CPU and GPU), in two ways:
You will likely not be able to create the same setup on a Linux / Windows box, as hardly anyone can afford the GPU / AI inference cards with the VRAM needed for these more capable, bigger models.
Mac unified memory is not dedicated, which also alters the whole embodied-carbon discussion: your hardware is not “sitting idle”, since it is not reserved for AI jobs only.
Having said that, I believe the discussion can only be done comparably if you try to answer the question: can a macOS user with a beefy MacBook Pro be more sustainable by ditching ChatGPT and running models locally instead of using a cloud-based model?
And for that I believe we can generate actual numbers:
You can just turn on powermetrics to measure a job on your system.
Just type `sudo powermetrics` in your shell and it will give you:
```
CPU Power: XX mW
GPU Power: XX mW
ANE Power: XX mW
Combined Power (CPU + GPU + ANE): XXX mW
```
Count the number of tokens generated, multiply the sustained power by the run time, and divide by the token count - that gives you an energy-per-token value (mW × s = mJ per token).
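As a rough sketch of that calculation (all numbers, names, and the function itself are made up for illustration, not part of powermetrics): multiply the sustained combined power reading by the wall-clock duration of the generation and divide by the token count.

```python
# Sketch: turn a powermetrics reading into an energy-per-token figure.
# Assumes you noted the sustained "Combined Power" value while the model was
# generating, plus the wall-clock duration and the token count your tool reported.

def energy_per_token_mj(combined_power_mw: float,
                        duration_s: float,
                        tokens: int) -> float:
    """Energy per token in millijoules: power (mW) x time (s) / tokens."""
    if tokens <= 0:
        raise ValueError("token count must be positive")
    return combined_power_mw * duration_s / tokens

# Example: 8500 mW sustained for 60 s while generating 900 tokens.
print(round(energy_per_token_mj(8500, 60, 900), 1))  # -> 566.7 mJ per token
```

This still ignores the machine's baseline draw while idle; subtracting an idle powermetrics reading first would give a fairer marginal-energy figure for comparison with cloud inference.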