A friend asked me whether I had used a GPU for the tests described in my blog post “Quickpost: The Electric Energy Consumption Of LLMs”, because he had tried running an LLM on a machine without a GPU and found it too slow.
I did a quick test: I repeated the previous test, but without the GPU (by setting the environment variable CUDA_VISIBLE_DEVICES=-1).
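For readers who want to try the same thing from a script, here is a minimal sketch of hiding the GPU from a child process. The helper names cpu_only_env and run_cpu_only are made up for this illustration, and the command you pass in is whatever starts your own LLM runtime; the only thing taken from my test is the variable CUDA_VISIBLE_DEVICES=-1.

```python
import os
import subprocess


def cpu_only_env(base=None):
    """Return a copy of the environment with CUDA devices hidden.

    Hypothetical helper for illustration: it only sets
    CUDA_VISIBLE_DEVICES=-1 and leaves everything else untouched.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_only(cmd):
    """Run cmd (a list of arguments) with the GPU hidden from CUDA.

    cmd is a placeholder for whatever launches your LLM runtime.
    """
    return subprocess.run(cmd, env=cpu_only_env(), check=True)
```

The variable must be set before the program initializes CUDA, which is why this sketch sets it in the environment of a new process rather than changing it halfway through a running one.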
Answering my query took 17 seconds, and required 1.13 Wh (again, for the whole PC).
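To put these two numbers in perspective, they can be converted into an average power draw for the whole PC during the query. This is just arithmetic on the figures above (1 Wh = 3600 J); the function name is my own choice for this sketch.

```python
def average_power_watts(energy_wh, duration_s):
    """Average power in watts from energy in Wh and duration in seconds."""
    joules = energy_wh * 3600.0  # 1 Wh = 3600 J
    return joules / duration_s


# CPU-only run from this post: 1.13 Wh in 17 seconds
print(round(average_power_watts(1.13, 17), 1))  # 239.3
```

So the whole PC drew roughly 239 W on average while answering the query without the GPU.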