Using KoboldCpp

You can find the full KoboldCpp documentation here.

Step 1 - Clone the repo

git clone https://github.com/LostRuins/koboldcpp
cd koboldcpp

Step 2 - Download the model

For example, we will use OpenChat 3.5 model, which is what is used on the demo instance. There are many models to choose from.

Navigate to TheBloke/openchat_3.5-GGUF and download one of the models, such as openchat_3.5.Q5_K_M.gguf. Place this file inside the ./models directory.

Step 3 - Build KoboldCpp

make

Step 4 - Run the server

./koboldcpp.py ./models/openchat_3.5.Q5_K_M.gguf

Step 5 - Enable the server in the client

First select KoboldCpp as the backend in the client:

settings -> ChatBot -> ChatBot Backend -> KoboldCpp

Then configure KoboldCpp:

settings -> ChatBot -> KoboldCpp

Inside of "Use KoboldCpp" ensure that "Use Extra" is enabled. This will allow you to use the extra features of KoboldCpp, such as streaming.

PreviousUsing Ollama NextUsing OpenAI

Last updated 1 year ago