Why I'm starting a devlog
There are many good reasons to document projects. For me personally, the most important ones are:
- Looking back: how have I developed and what have I learned?
- Sharing inspiration: others can benefit from my journey
- Self-reflection: understanding my work better and questioning it critically
Based on these considerations, I am starting this devlog. I will record my development steps here - as a reminder for myself, and perhaps also as motivation or inspiration for others.
Ollama as a backend
I rely on local AI because I am convinced that AI systems should run on private hardware. There are various solutions for running Large Language Models (LLMs) directly on your own computer, but I am particularly fond of Ollama, so I will use it as the LLM backend for my AI applications.
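Before writing any application code, it helps to confirm that the local server is reachable and that a model is available. Here is a minimal sketch, assuming the official `ollama` Python package is installed (`pip install ollama`) and the Ollama server is running on its default port:

```python
import ollama

# Quick sanity check: ask the local Ollama server for its installed models.
# This raises a connection error if the server is not running.
print(ollama.list())

# Pull the model used in the examples below, if it is not installed yet.
ollama.pull("gemma2:2b")
```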
QV Ollama SDK
Ollama impresses not only with how easy it makes running LLMs, but also with its efficiency, and its Python API is user-friendly and straightforward. Still, when I started building an AI chat application, I quickly realized that I needed a small solution for conversations with a full conversation history: the LLM should receive not just the latest message, but the entire context of previous requests and responses. To simplify this, I developed a small Python SDK that stores messages in a chat history. The LLM's response can either be streamed or returned as a whole. Here is an example of a simple, non-streamed response.
```python
from qv_ollama_sdk import OllamaChatClient
# Create a client with a system message
client = OllamaChatClient(
    model_name="gemma2:2b",
    system_message="You are a helpful assistant."
)
# Simple chat - uses Ollama's default parameters
response = client.chat("What is the capital of France?")
print(response)
# Continue the conversation
response = client.chat("And what is its population?")
print(response)
# Set specific parameters only when you need them
client.temperature = 1.0 # Using property setter
client.max_tokens = 500 # Using property setter
client.set_parameters(num_ctx=2048) # For multiple parameters
# Get conversation history
history = client.get_history()
```
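Under the hood, keeping a full conversation history with Ollama simply means resending the growing list of messages on every call; the SDK hides this bookkeeping. As a rough illustration of the idea (a sketch of the concept, not the SDK's actual implementation), here is how it looks with the plain `ollama` package:

```python
import ollama

# The full conversation: the system message plus every user/assistant turn.
messages = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message: str) -> str:
    # Append the new user message, then send the WHOLE history to the model.
    messages.append({"role": "user", "content": user_message})
    response = ollama.chat(model="gemma2:2b", messages=messages)
    answer = response["message"]["content"]
    # Store the model's answer so the next call has the full context.
    messages.append({"role": "assistant", "content": answer})
    return answer

print(chat("What is the capital of France?"))
print(chat("And what is its population?"))  # "its" is resolved via the history
```

This is exactly why the follow-up question "And what is its population?" works in the SDK example above: the model sees the earlier exchange about France in every request.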
Or, if you want to stream the LLM's answer:
```python
from qv_ollama_sdk import OllamaChatClient
client = OllamaChatClient(model_name="gemma2:2b")
# Stream the response
for chunk in client.stream_chat("Explain quantum computing."):
    print(chunk, end="", flush=True)
```
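For completeness: streaming works the same way with the plain `ollama` package by passing `stream=True`, which turns the call into an iterator of partial responses. Again, this is a sketch of the underlying mechanism, not the SDK's code:

```python
import ollama

# With stream=True, ollama.chat yields chunks as the model generates tokens.
stream = ollama.chat(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
```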
If you want to try it out, here is the link to the GitHub repo:
Maybe this little helper is useful for you too. Next, I will build a super simple user interface for interacting with an LLM.
Until the next devlog, best regards
Thomas from the Quantyverse
P.S.: Visit my website Quantyverse.ai for products, bonus content, blog posts, and more.