Local install
- CUDA (local build)
- This forces a reinstall and rebuilds llama-cpp-python with CUDA support so the GPU can be used:
CMAKE_ARGS="-DGGML_CUDA=on -DGGML_CUDA_FORCE_CUBLAS=on -DLLAVA_BUILD=off -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
- The only thing that worked outright was creating a fresh conda environment and then installing llama-cpp-python inside it. Every other option either failed to install or failed to use the GPU after installing; see the sanity check below to confirm which case you hit.
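A minimal way to confirm the wheel really offloads to the GPU (a sketch, assuming the GGUF file from the CLI example below sits in the working directory): load with n_gpu_layers=-1 and watch the verbose load log for CUDA offload lines.

```python
# Minimal sketch: confirm the CUDA build of llama-cpp-python uses the GPU.
# Model filename matches the CLI example below; adjust the path as needed.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.1.Q2_K.gguf",
    n_gpu_layers=-1,  # offload all layers; the load log should mention CUDA buffers
    verbose=True,
)
print(llm("Q: What is the capital of France? A:", max_tokens=16)["choices"][0]["text"])
```

If the load log assigns every layer to the CPU, the wheel was built without CUDA and the forced reinstall above needs to be rerun inside the active environment.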
llama.cpp CLI
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release
./build/bin/llama-cli -m "mistral-7b-instruct-v0.1.Q2_K.gguf" -cnv
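The build above is CPU-only. Since the rest of these notes target CUDA, the same steps with the GPU backend enabled (a sketch assuming the CUDA toolkit is installed; -ngl sets how many layers to offload, so lower it if VRAM runs out):
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
./build/bin/llama-cli -m "mistral-7b-instruct-v0.1.Q2_K.gguf" -ngl 99 -cnv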