shoupzwu/GRASP

GRASP Project

This is an implementation of the paper: Data-Agnostic Cardinality Learning from Imperfect Workloads.

This repo contains:

  • 🪐 A simplified PyTorch implementation of GRASP, containing core functionalities of the GRASP system.
  • ⚡️ A PyTorch implementation of ArCDF, improving on prior work NeuroCDF.
  • 🛸 A self-contained Python file for reproducing the main experiments on CEB-IMDb-full.
  • 🛸 A self-contained Python file for reproducing the main experiments on DSB.
  • 🎉 A Python script for running the end-to-end query experiments.

Preparation

Dataset/Workloads

  1. Download the CEB-IMDb-full (i.e., CEB-IMDb-13k) benchmark and place the entire directory at the path assigned to IMDB_DIRECTORY in train_grasp_ceb.py.
  2. The DSB workload is contained in this file.
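
Before launching training, it can help to confirm that the benchmark directory is where the script expects it. The following is a minimal sketch, not part of the repository: `check_workload_dir` is a hypothetical helper, and the path passed to it stands in for whatever you assigned to IMDB_DIRECTORY in train_grasp_ceb.py.

```python
from pathlib import Path


def check_workload_dir(directory: str) -> Path:
    """Return the resolved path if it is a non-empty directory, else raise."""
    path = Path(directory).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Workload directory not found: {path}")
    # An empty directory usually means the download or extraction failed.
    if not any(path.iterdir()):
        raise ValueError(f"Workload directory is empty: {path}")
    return path
```

Calling `check_workload_dir(...)` with your own path before training surfaces a misplaced download early, rather than partway through data loading.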

Query Optimization

  1. Please download and install the modified PostgreSQL from here.
  2. Download the IMDb dataset from here, and download the populated DSB dataset used in the paper from here.
  3. Please load the data into PostgreSQL.

Usage

Training GRASP over CEB-IMDb-full

To train the GRASP model over CEB-IMDb-full, run the following command:

python train_grasp_ceb.py

Training GRASP over DSB

To train the GRASP model over DSB, run the following command:

python train_grasp_dsb.py
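
To run both training scripts back to back, one option is a small driver. This is a sketch under the assumption that both scripts sit in the current working directory and take no command-line arguments, as the commands above suggest; `build_command` and `run_all` are hypothetical names and do not exist in the repository.

```python
import subprocess
import sys

# Script names as given in this README.
TRAIN_SCRIPTS = ["train_grasp_ceb.py", "train_grasp_dsb.py"]


def build_command(script: str) -> list[str]:
    """Build the argv list that runs `script` with the current interpreter."""
    return [sys.executable, script]


def run_all(scripts=TRAIN_SCRIPTS) -> None:
    """Run each training script in order, stopping at the first failure."""
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(script), check=True)
```

Calling `run_all()` trains on CEB-IMDb-full first and then on DSB. Using `sys.executable` keeps both runs in the same Python environment as the driver.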

Configuration

The training scripts can be configured by modifying the parameters in the respective train_grasp_*.py files. Key parameters include:

  • epoch: Number of training epochs
  • feature_dim: Dimension of the CardEst models
  • lcs_dim: Dimension of the Learned Count Sketch models
  • bs: Batch size
  • lr: Learning rate
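
As an illustration of how these parameters fit together, the sketch below groups them in a dataclass with basic validation. It is an assumption-laden example rather than the repository's code: `TrainConfig` is a hypothetical name, and every default value is a placeholder, not the setting used in the train_grasp_*.py files.

```python
from dataclasses import dataclass, asdict


@dataclass
class TrainConfig:
    # All values below are placeholders for illustration, not the
    # defaults used in the repository's training scripts.
    epoch: int = 10          # number of training epochs
    feature_dim: int = 64    # dimension of the CardEst models
    lcs_dim: int = 64        # dimension of the Learned Count Sketch models
    bs: int = 128            # batch size
    lr: float = 1e-3         # learning rate

    def validate(self) -> None:
        """Reject values that cannot describe a valid training run."""
        for name in ("epoch", "feature_dim", "lcs_dim", "bs"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be a positive integer")
        if self.lr <= 0:
            raise ValueError("lr must be positive")
```

For example, `TrainConfig(bs=256).validate()` passes, while `TrainConfig(lr=0).validate()` raises a `ValueError`; `asdict(cfg)` gives a plain dictionary that is convenient for logging a run's settings.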

Utilities

The project includes utility functions and classes in the CEB_utlities and dsb_utlities directories. They are used for data and workload processing.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

If you have any questions, feel free to contact me through email ([email protected]).

About

Code for GRASP (VLDB 2025)
