Most HF model cards include a code snippet you can use to run inference on the model. The transformers library takes care of downloading the model weights when you run the code. A Python 3.10-3.11 environment is typically sufficient. Example: https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct#t...
If you have an MBP (Apple Silicon), you need to change the device name in the examples from "cuda" to "mps".
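A minimal sketch of what those model-card snippets look like, with the device picked automatically so the same code works on NVIDIA GPUs ("cuda"), Apple Silicon ("mps"), or CPU. The exact snippet on the SmolLM2 card may differ slightly; this is an illustration, not a copy of it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick the best available device: NVIDIA GPU, Apple Silicon GPU, or CPU fallback.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"

# First run downloads the weights into the local HF cache (~/.cache/huggingface).
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note the single `.to(device)` calls: because the device string is computed once at the top, nothing else in the snippet needs to change between a CUDA box and an MBP.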