A series of C# functions wrapping the ollama APIs, mainly for UnityEngine
The user's system needs to have a working ollama setup already:
- Download and install ollama
- Pull a model of choice from the Library
  - Recommend `llama3.1` for general conversation: `ollama pull llama3.1`
  - Recommend `gemma2:2b` for devices with limited memory: `ollama pull gemma2:2b`
  - Recommend `llava` for image captioning: `ollama pull llava`
In Unity, you need the Newtonsoft.Json package:
- Unity Editor → Window → Package Manager
- Add package by name
- Name: `com.unity.nuget.newtonsoft-json`
- Add
The following functions are available under the `Ollama` class.
All functions are asynchronous.
- `List()`
  - Returns an array of `Model`, representing all locally available models
  - The `Model` class follows the official specs
  - **Tip:** you can use the `families` attribute to determine if a model is multimodal (see #2608)
- `Generate()`
  - The most basic function: returns a response when given a model and a prompt (see the first sketch after this list)
- `GenerateStream()`
  - The streaming variant that returns each word as soon as it's ready
  - Requires a callback to handle the chunks
- `GenerateJson()`
  - Returns the response in the specified class/struct format
  - **Important:** you need to manually tell the model to use a JSON format in the prompt
- `Chat()`
  - Same as `Generate()`, but with memory of prior chat history, so you can ask follow-up questions about earlier turns (see the chat sketch after this list)
  - Requires either `InitChat()` or `LoadChatHistory()` to be called first
  - Example:
    - `>> Tell me a joke` "..."
    - `>> Explain the joke` "..."
- `ChatStream()`
  - Same as above, but streamed
- `InitChat()`
  - Initialize / reset the chat history
  - `historyLimit`: the number of messages to keep in memory
- `SaveChatHistory()`
  - Save the current chat history to the specified path
- `LoadChatHistory()`
  - Load the chat history from the specified path
  - Calls `InitChat()` automatically instead if the file does not exist
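As a rough illustration of how the generation calls might be wired into a Unity script, here is a minimal sketch. The exact signatures (parameter order, the callback type of `GenerateStream()`, the generic form of `GenerateJson<T>()`, and the fields on `Model`) are assumptions made for the example, not the verified API; check the `Ollama` class itself for the real ones.

```csharp
using System;
using UnityEngine;

public class GenerateDemoSketch : MonoBehaviour
{
    // Hypothetical response shape for GenerateJson(); the field names are illustrative only.
    [Serializable]
    public struct JokeReply
    {
        public string setup;
        public string punchline;
    }

    private async void Start()
    {
        // List all locally available models (assumed to return a Model[] with a name field).
        var models = await Ollama.List();
        foreach (var model in models)
            Debug.Log(model.name);

        // Basic one-shot generation: model name + prompt.
        string reply = await Ollama.Generate("llama3.1", "Why is the sky blue?");
        Debug.Log(reply);

        // Streaming variant: the callback receives each chunk as soon as it is ready.
        await Ollama.GenerateStream("llama3.1", "Tell me a short story.",
            chunk => Debug.Log(chunk));

        // JSON variant: the prompt itself must ask for JSON output.
        JokeReply joke = await Ollama.GenerateJson<JokeReply>(
            "llama3.1",
            "Tell me a joke. Reply in JSON with the fields 'setup' and 'punchline'.");
        Debug.Log(joke.punchline);
    }
}
```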
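Similarly, a hedged sketch of the chat flow. Whether `Chat()` takes the model name on every call, whether the history-management calls return awaitable tasks, and the `historyLimit` argument name are all assumptions for illustration.

```csharp
using UnityEngine;

public class ChatDemoSketch : MonoBehaviour
{
    private async void Start()
    {
        // Start a fresh conversation; historyLimit is the number of messages kept in memory.
        Ollama.InitChat(historyLimit: 8);

        // Chat() remembers earlier turns, so the second prompt can refer back to the first.
        Debug.Log(await Ollama.Chat("llama3.1", "Tell me a joke"));
        Debug.Log(await Ollama.Chat("llama3.1", "Explain the joke"));

        // Streaming variant of the same call.
        await Ollama.ChatStream("llama3.1", "Tell me another one", chunk => Debug.Log(chunk));

        // Persist the conversation for later sessions; LoadChatHistory() falls back
        // to InitChat() automatically if the file does not exist.
        string path = Application.persistentDataPath + "/chat_history.json";
        Ollama.SaveChatHistory(path);
        // Ollama.LoadChatHistory(path);
    }
}
```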
Retrieval Augmented Generation
- `Ask()`
  - Ask a question based on the given context
  - Requires both `InitRAG()` and `AppendData()` to be called first (see the sketch at the end of this section)
- `InitRAG()`
  - Initialize the database
  - Requires a model to generate embeddings
    - Can use a different model from the one used in `Ask()`
    - Can use a regular LLM or a dedicated embedding model, such as `nomic-embed-text`
- `AppendData()`
  - Add a context (e.g. a document) to retrieve from

**Note:** How well the RAG performs is dependent on several factors...
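As with the earlier sketches, a rough illustration of the RAG flow: the parameter lists of `InitRAG()`, `AppendData()`, and `Ask()` below are guesses based on the descriptions above, not verified signatures, and the document strings are made up for the example.

```csharp
using UnityEngine;

public class RagDemoSketch : MonoBehaviour
{
    private async void Start()
    {
        // Build the embedding database; a dedicated embedding model such as
        // nomic-embed-text can be used here even if Ask() answers with another model.
        await Ollama.InitRAG("nomic-embed-text");

        // Add the context (documents) that Ask() will retrieve from.
        await Ollama.AppendData("The warehouse opens at 9am and closes at 6pm on weekdays.");
        await Ollama.AppendData("The warehouse is closed on weekends.");

        // Ask a question grounded in the appended context.
        string answer = await Ollama.Ask("llama3.1", "Can I visit the warehouse on Saturday?");
        Debug.Log(answer);
    }
}
```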
A demo scene containing 3 demo scripts showcasing various features is included:
- Generate Demo: `List()`, `Generate()`, `GenerateJson()`, `KeepAlive.unload_immediately`
- Chat Demo: `InitChat()`, `ChatStream()`
- RAG Demo: `InitRAG()`, `AppendData()`, `Ask()`
**Note:** It is recommended not to enable multiple demos at the same time...