r/MachineLearning

r/MachineLearning - Reddit

Related Highlights

The M3 Max is objectively worse than the M2 for inference.

The M2 Ultra has a higher maximum RAM size of 192 GB.

The M1 Ultra has a maximum of 128 GB of RAM.

Of these RAM numbers, something like 2/3 is available for inference.

So I see no reason not to make a general recommendation for the M1 Ultra, unless you have some reason to run q5_K_M 1...
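To make the RAM arithmetic above concrete, here is a rough sketch in Python. The 2/3 fraction is the poster's rule of thumb, not a measured or documented value. The function names, the 70B and 180B model sizes, and the 5.5 bits-per-weight figure are assumptions for illustration only, and the estimate counts model weights alone, ignoring the KV cache and runtime overhead.

```python
# Rough estimate of how much unified memory is usable for inference,
# based on the "something like 2/3" rule of thumb from the post.

def usable_gb(total_ram_gb: float) -> float:
    """Approximate memory available for inference, in GB (2/3 rule of thumb)."""
    return total_ram_gb * 2 / 3


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB.

    Ignores KV cache, context length, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


def fits(total_ram_gb: float, params_billion: float, bits_per_weight: float) -> bool:
    """True if the weights alone fit within the usable memory estimate."""
    return weights_gb(params_billion, bits_per_weight) <= usable_gb(total_ram_gb)


if __name__ == "__main__":
    for name, ram in [("M1 Ultra", 128), ("M2 Ultra", 192)]:
        print(f"{name}: {ram} GB total, about {usable_gb(ram):.0f} GB usable")

    # Hypothetical models at an assumed ~5.5 bits per weight.
    for params in (70, 180):
        size = weights_gb(params, 5.5)
        print(f"{params}B weights: about {size:.1f} GB, "
              f"fits on 128 GB: {fits(128, params, 5.5)}, "
              f"fits on 192 GB: {fits(192, params, 5.5)}")
```

By this estimate the 128 GB machine leaves roughly 85 GB for inference and the 192 GB machine roughly 128 GB, so the extra RAM only matters for models whose weights fall between those two figures.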

r/LocalLLaMA - Reddit

The Mac Studio is an absolute monster for inferencing, but there are a couple of caveats.

It's slower, pound for pound, than a 4090 when dealing with models the 4090 can fit in its VRAM. So a 13b model on the 4090 is almost twice as fast as it is running on the M2.

The M1 Ultra Mac Studio with 128GB costs far less ($3700 or so) and the inference speed is

r/LocalLLaMA - Reddit

Deep-ML

deep-ml.com

Training great LLMs entirely from ground zero in the wilderness as a startup — Yi Tay

Yi Tay yitay.net