Or just use the LM Studio front end; it's better than anything else I've used on the desktop.
I get 35 t/s with gemma 15b Q8, but that's because I have a 3090. You'll need a smaller quant, probably gemma 3 15b Q4_K_L.
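
If you want to sanity-check which quant fits your card, here's a rough back-of-the-envelope sketch. It assumes weight size is roughly parameters times bits per weight divided by 8. The bits-per-weight numbers are my own ballpark assumptions (the Q4 figure especially), the function name is just made up for illustration, and it ignores KV cache and runtime overhead, so leave yourself some headroom.

```python
# Rough estimate of quantized weight size: parameters * bits-per-weight / 8.
# The bits-per-weight figures below are approximations, not exact values.
APPROX_BITS_PER_WEIGHT = {
    "Q8_0": 8.5,  # 8-bit weights plus a per-block scale
    "Q4_K": 4.8,  # assumed ballpark for 4-bit K-quants
}


def weight_size_gb(params_billions: float, quant: str) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return params_billions * bits / 8


for quant in APPROX_BITS_PER_WEIGHT:
    size = weight_size_gb(15, quant)
    print(f"15B at {quant}: ~{size:.1f} GB before KV cache and overhead")
```

Compare the result against your VRAM. A 3090 has 24 GB, which is why the Q8 file fits on mine with room left for context.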