llamaj.cpp (a contraction of llama.cpp and Java/jextract) is a port of llama.cpp to the JVM using jextract.
- Java 21
- mvn
- MacOS M-series / Linux x86_64 (CPU) (see the last section if your platform is not listed here)
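As a quick sanity check, the standard JDK and Maven commands below should report Java 21 as the runtime:
$ java -version   # should print a 21.x version
$ mvn -version    # should show Maven running on JDK 21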
Include the dependency in your pom.xml:
<dependencies>
    ...
    <dependency>
        <groupId>io.gravitee.llama.cpp</groupId>
        <artifactId>llamaj-cpp</artifactId>
        <version>x.x.x</version>
    </dependency>
</dependencies>
- Get `jextract`
Make sure the `jextract` folder is at the same path level as your repository.
On Linux:
Since we are using JDK 21, you can download a prebuilt version of jextract:
$ wget https://download.java.net/java/early_access/jextract/21/1/openjdk-21-jextract+1-2_linux-x64_bin.tar.gz
$ tar -xzf openjdk-21-jextract+1-2_linux-x64_bin.tar.gz
$ rm openjdk-21-jextract+1-2_linux-x64_bin.tar.gz
$ echo 'export PATH="$(pwd)/jextract/bin:$PATH"' >> ~/.bashrc
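To make the new PATH entry effective in the current shell and check that the binary is found (assuming this jextract build supports --version):
$ source ~/.bashrc
$ jextract --version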
On MacOS: for JDK 21, there is no prebuilt version of jextract for MacOS aarch64, only for x86_64, so we have to build it ourselves:
$ git clone https://github.com/openjdk/jextract
$ cd jextract
$ git checkout jdk21
Make sure your `$JAVA_HOME` points to your JDK 21.
Since the jdk21 branch of jextract ships a Gradle wrapper that only supports up to JDK 17, we need to upgrade the Gradle version:
$ sed -i '' 's#gradle-7\.3\.3-bin\.zip#gradle-8.5-bin.zip#g' gradle/wrapper/gradle-wrapper.properties
Install llvm:
$ brew install llvm
Then execute the gradle command:
$ sh ./gradlew -Pjdk21_home=$JAVA_HOME -Pllvm_home=$(brew --prefix llvm) clean verify
Add the `jextract` binaries to your path:
$ ln -sf $(pwd)/build/jextract/bin $(pwd)/bin
$ echo "PATH=$PATH:$(pwd)/bin" >> ~/.zshrc
$ source ~/.zshrc
- Clone llama.cpp
Make sure the `llama.cpp` folder is at the same path level as your repository:
$ git clone https://github.com/ggml-org/llama.cpp
- Download binaries and generate the sources
$ mkdir $HOME/.llama.cpp
$ cd llamaj.cpp/
$ mvn clean generate-sources -Pmacosx-aarch64,linux-x86_64
$ export LLAMA_CPP_LIB_PATH="$HOME_DIR/llamaj.cpp/target/generated-sources/<<macosx|linux>>/<<x86_64|aarch64>>"
$ mvn install
$ mvn exec:java -Dexec.mainClass=io.gravitee.llama.cpp.Main \
-Dexec.args="--model /path/to/model/model.gguf --system 'You are a helpful assistant. Answer question to the best of your ability'"
or
$ java --enable-preview -jar llamaj.cpp-<version>.jar \
--model models/model.gguf \
--system 'You are a helpful assistant. Answer question to the best of your ability'
On Linux, don't forget to make the shared libraries discoverable with the environment variable below:
$ export LD_LIBRARY_PATH="$HOME/.llama.cpp:$LD_LIBRARY_PATH"
There are plenty of models on HuggingFace; we suggest the one linked here.
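As an illustration, a GGUF model can be pulled with the Hugging Face CLI; the repository and file names below are placeholders, not a specific recommendation:
$ pip install -U "huggingface_hub[cli]"
$ huggingface-cli download <org>/<repo> <model>.gguf --local-dir models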
Usage: java -jar llamaj.cpp-<version>.jar --model <path_to_gguf_model> [options...]
Options:
--system <message> : System message (default: "You are a helpful AI assistant.")
--n_gpu_layers <int> : Number of GPU layers (default: 999)
--use_mlock <boolean> : Use mlock (default: true)
--use_mmap <boolean> : Use mmap (default: true)
--temperature <float> : Sampler temperature (default: 0.4)
--min_p <float> : Sampler min_p (default: 0.1)
--min_p_window <int> : Sampler min_p_window (default: 40)
--top_k <int> : Sampler top_k (default: 10)
--top_p <float> : Sampler top_p (default: 0.2)
--top_p_window <int> : Sampler top_p_window (default: 10)
--seed <long> : Sampler seed (default: random)
--n_ctx <int> : Context size (default: 512)
--n_batch <int> : Batch size (default: 512)
--n_seq_max <int> : Max sequence length (default: 512)
--quota <int> : Iterator quota (default: 512)
--n_keep <int> : Tokens to keep when exceeding ctx size (default: 256)
--log_level <level> : Logging level (ERROR, WARN, INFO, DEBUG, default: ERROR)
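For example, combining a few of the options above (paths and values are illustrative):
$ java --enable-preview -jar llamaj.cpp-<version>.jar \
    --model models/model.gguf \
    --system 'You are a helpful assistant.' \
    --temperature 0.7 \
    --n_ctx 2048 \
    --seed 42 \
    --log_level INFO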
- Clone the llama.cpp repository
Make sure the `llama.cpp` folder is at the same path level as your repository:
$ git clone https://github.com/ggml-org/llama.cpp
$ cd llama.cpp
- Compile sources
Make sure you have the gcc / g++ compilers installed:
$ gcc --help
$ g++ --help
On Linux:
$ cmake -B build
$ cmake --build build --config Release -j $(nproc)
On MacOS:
$ cmake -B build
$ cmake --build build --config Release -j $(sysctl -n hw.ncpu)
If you wish to build llama.cpp with a particular configuration (CUDA, OpenBLAS, AVX2, AVX512, ...), please refer to the llama.cpp documentation.
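For instance, a CUDA-enabled build is typically a matter of passing the corresponding flag to CMake (flag names vary across llama.cpp versions, so double-check its documentation):
$ cmake -B build -DGGML_CUDA=ON
$ cmake --build build --config Release -j $(nproc)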
- Link sources
You can use the environment variable LLAMA_CPP_LIB_PATH=/path/to/llama.cpp/build/bin/
This will directly load the dynamic shared library files (`.so` for Linux, `.dylib` for MacOS).
You can also decide to copy these files into a temporary folder by setting the environment variable LLAMA_CPP_USE_TMP_LIB_PATH=true
The temporary path will then be used to load the shared libraries.
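For example (the path is wherever you built llama.cpp above):
$ export LLAMA_CPP_LIB_PATH=/path/to/llama.cpp/build/bin/   # load the shared libraries directly from this folder
$ export LLAMA_CPP_USE_TMP_LIB_PATH=true                    # or copy them to a temporary folder and load from there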
While we don't support other platform/architecture pairs out-of-the-box for many reasons, you can still manage to use gravitee-io/llamaj.cpp:
- Build llama.cpp on your infrastructure
- Add the *.so or *.dylib files to ~/.llama.cpp/ or use `LLAMA_CPP_LIB_PATH` and `LD_LIBRARY_PATH`
- Build the corresponding Java bindings using `jextract` (without the `--source` option) and bundle them into a jar:
$ jextract -t io.gravitee.llama.cpp.<os>.<platform> \
--include-dir ggml/include \
--output /path/to/your/output include/llama.h
$ jar cf <name-of-your-file>.jar -C . .
- Put the `jextract` sources in io.gravitee.llama.cpp.<os>.<arch>:
  - io.gravitee.llama.cpp.macosx.x86_64
  - io.gravitee.llama.cpp.linux.aarch64
  - io.gravitee.llama.cpp.windows.x86_64
  - io.gravitee.llama.cpp.windows.aarch64
- Add it to your classpath:
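A minimal sketch of what that could look like, reusing the Main class from above (jar names are illustrative; on Windows use ';' as the classpath separator):
$ java --enable-preview -cp llamaj.cpp-<version>.jar:<name-of-your-file>.jar \
    io.gravitee.llama.cpp.Main --model /path/to/model/model.gguf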
gravitee-io/llamaj.cpp will pick up the OS and architecture at runtime and call the corresponding bindings using reflection.