Data transformation framework for AI. Ultra performant, with incremental processing.
Updated Aug 6, 2025 - Rust
Big Data and Machine Intelligence Course in Autumn 2019.
Patient intake form extraction using an LLM
🌲 Improved Interval B+ tree implementation, in TS 🌲
This repository contains an application designed to recommend scientific papers that are most similar to a given input paragraph. The application uses the llama and weaviate libraries to achieve this.
A zero-dependency library of classes that make filtering, sorting and observing changes to arrays easier and more efficient.
Designed to store and retrieve high-dimensional data, such as embeddings, efficiently. It enables fast similarity searches by leveraging techniques.
System for managing the data generated by the SEAGrid Science Gateway
BORDS is an open-access reaction search engine that leverages Google's Open Reaction Database to provide ultra-fast, comprehensive access to millions of chemical reactions. Built with a modern cloud stack, it streamlines reaction data extraction, transformation, and indexing for researchers in chemistry and related fields.
Datafast Runtime is a high-performance subgraph processing runtime, written from scratch and designed to process subgraphs with high speed and storage efficiency.
Time series analysis covering trend, seasonality, and periodicity decomposition, plus forecasting with Facebook Prophet. The analysis makes extensive use of data indexing tools and of the Pandas and Datetime libraries.
A comprehensive guide to building a modern data warehouse with the medallion architecture on SQL Server, including ETL processes, data modeling, and analytics.
Python implementation of a search engine based on TF-IDF and cosine similarity
Solana DEX swap data indexer built on substream. It supports various DEXes such as pumpfun, pumpswap, bonkfun, meteora, raydium and orca.
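As a rough illustration of the technique named in the TF-IDF/cosine search engine entry above, here is a minimal sketch in plain Python. It is not taken from that repository: the function names (`tokenize`, `build_index`, `vectorize`, `cosine`, `search`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def vectorize(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unknown to the index are dropped.
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption): terms in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [vectorize(toks, idf) for toks in tokenized]
    return idf, vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, idf, vectors, k=3):
    # Rank documents by cosine similarity to the query; skip zero-score matches.
    q = vectorize(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda x: (-x[0], x[1]))
    return [(docs[i], s) for s, i in scored[:k] if s > 0]
```

A call such as `search("python search", docs, *build_index(docs))` returns `(document, score)` pairs, best match first. A real engine would likely add stemming, stop-word removal, and an inverted index so that only documents sharing a term with the query are scored.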