Repositories
- tokenkit
Fast, Rust-backed word-level tokenization for Ruby. Unlike subword tokenizers (BPE, WordPiece) designed for LLMs, TokenKit provides linguistic tokenization for search engines, text mining, and NLP pipelines—preserving domain-specific patterns like gene names, measurements, and technical terms while handling Unicode correctly.
- parsekit
Ruby document parsing toolkit with zero runtime dependencies. Parse PDFs, DOCX, XLSX, and images (with OCR) using a single, lightweight gem. Statically links MuPDF and Tesseract at compile time for hassle-free installation - no system libraries or external tools required.
- spellkit
Fast, safe typo correction for Ruby. SymSpell-based spell checker with Rust performance, term protection via regex patterns, and hot-reloadable dictionaries. Sub-millisecond latency, zero dependencies.
- phrasekit
Weak supervision for NER: mine domain-specific phrases from unlabeled corpora, score by salience, and auto-generate training labels. Ruby gem with high-performance Rust engines.
- ragnar-cli
Ragnar is a pure Ruby command-line RAG (Retrieval-Augmented Generation) tool with zero external dependencies. It provides local document indexing, semantic search, and LLM-powered query processing. Built to be hackable, it lets Ruby developers experiment with agentic workflows and RAG pipelines natively in Ruby.
- red-candle
Ruby gem for running state-of-the-art language models locally. Access LLMs, embeddings, rerankers, and NER models directly from Ruby using Rust-powered Candle with Metal/CUDA acceleration.
- thor-interactive
Turn any Thor CLI into an interactive REPL with persistent state, auto-completion, and configurable default handlers for unrecognized input.
- lancelot
Ruby bindings for the Lance columnar data format. Built on the Lance Rust crate, Lancelot brings high-performance vector search, full-text search, and hybrid retrieval to Ruby applications with a native, idiomatic API.