Yeah, I was thinking diesel-powered trains
This article is comparing apples to oranges here. The DeepSeek R1 model is a 600-billion-parameter mixture-of-experts reasoning model, while the Meta model is a dense 70-billion-parameter model without reasoning, which performs much worse.
They should be comparing DeepSeek to reasoning models such as OpenAI's o1. The results are comparable, but o1 costs significantly more to run. It's impossible to know how much energy it uses, because it's a closed-source model and OpenAI doesn't publish that information, but they charge a lot for it on their API.
TL;DR: It's a bad-faith comparison. Like comparing a train to a car and complaining about how much more diesel the train used on a 3-mile trip between stations.
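To put a rough number on the price gap: here's a back-of-envelope Python sketch. The per-million-token rates are the publicly listed API prices from around early 2025 and the workload size is made up for illustration, so treat all of it as assumptions rather than measurements.

```python
# Rough API-cost comparison for the same hypothetical workload.
# ASSUMPTION: per-million-token rates as publicly listed circa early 2025;
# check current pricing before relying on these.

PRICES_PER_MTOK = {
    "openai o1":   {"input": 15.00, "output": 60.00},
    "deepseek R1": {"input": 0.55,  "output": 2.19},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one job at the listed per-million-token rates."""
    p = PRICES_PER_MTOK[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1e6

# Hypothetical job: 1M input tokens, 2M output tokens (reasoning models
# emit long chains of thought, so output usually dominates the bill).
for model in PRICES_PER_MTOK:
    print(f"{model}: ${job_cost(model, 1_000_000, 2_000_000):,.2f}")
```

At those rates the same job comes out to roughly $135 on o1 versus about $5 on R1's API, which is the "they charge a lot for it" point in concrete terms. API price isn't energy use, but it's the only public signal we have for a closed model.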
I was thinking of it as a cost-cutting measure. As long as performance is comparable to a moderate CPU/GPU combination, it's less silicon, fewer interconnects, less RAM, fewer coolers, and less likely to break during shipping/assembly. Like a gaming console.
Such a PC could still use sockets with upgradable APUs or CPUs, as well as PCIe slots for dedicated GPUs, retaining basic upgradability. A lot depends on the upcoming AMD APUs.
IMO, 4060 Ti performance in a $600 to $800 box running an AMD APU with 16 to 32 GB of shared RAM. That's all they need.
I actually don't think this is shocking or something that needs to be "investigated." Other than the sketchy website that doesn't secure users' data, that is.
Actual child abuse / grooming happens on social media, chat services, and in local churches, not in a one-on-one between a user and an LLM.
Yes, sorry, where I live it's pretty normal for cars to be diesel-powered. What I meant by my comparison was that a train, when measured uncritically, uses more energy to run than a car due to its size and behavior, but that when compared fairly, the train has obvious gains and tradeoffs.
DeepSeek as a 600B model is more efficient than the 400B Llama model (a fairer size comparison), because it's a mixture-of-experts model with fewer active parameters, and when run in the R1 reasoning configuration it is probably still more efficient than a dense model of comparable intelligence.
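To make the active-parameter point concrete, here's a minimal Python sketch using the common rule of thumb that a transformer's forward pass costs roughly 2 FLOPs per active parameter per token. The parameter counts come from the models' public tech reports; the whole thing is an estimate, not a benchmark.

```python
# Back-of-envelope compute per generated token. Approximation: ~2 FLOPs per
# *active* parameter per token, which is why total parameter count alone is
# misleading for mixture-of-experts models.

MODELS = {
    # name: (total_params, active_params_per_token)
    "DeepSeek-R1 (MoE)":      (671e9, 37e9),   # ~37B params activated per token
    "Llama 3.1 405B (dense)": (405e9, 405e9),  # dense: every parameter is active
    "Llama 70B (dense)":      (70e9, 70e9),
}

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token: ~2 * active parameters."""
    return 2.0 * active_params

for name, (total, active) in MODELS.items():
    print(f"{name}: {total / 1e9:.0f}B total, {active / 1e9:.0f}B active, "
          f"~{flops_per_token(active) / 1e12:.0f} TFLOPs/token")
```

Per token, the 671B MoE does less work than even the dense 70B model. The catch is that a reasoning configuration emits far more tokens per answer, so total energy per answer also depends on how long the chain of thought runs.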