aaivu/distributed_imm

Distributed IMM

This repository contains the development code for the distributed IMM algorithm. The final implementation is in d_imm_scala.

Scalable Iterative Mistake Minimization (IMM) for Clustering Explanations

Distributed IMM is a scalable PySpark implementation of the IMM algorithm for clustering explanations. It includes Cython-optimized histogram-based splitting and K-Means initialization for efficiency.

Features

  • Distributed IMM computation for large datasets
  • Optimized histogram-based splitting
  • Optimized mistake calculation with histograms
  • K-Means initialization for clustering
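
To make the histogram-based mistake calculation concrete, here is a minimal single-machine sketch in NumPy. It is not this repository's PySpark/Cython code. It assumes the usual IMM definition, where a point counts as a mistake when a threshold cut sends it to the opposite side from its assigned cluster center. The function name `histogram_mistakes`, the `n_bins` parameter, and the choice of evenly spaced candidate thresholds between the smallest and largest center coordinate are all assumptions made for illustration.

```python
import numpy as np

def histogram_mistakes(x, labels, centers_f, n_bins=32):
    """Pick a threshold on ONE feature that minimizes IMM-style mistakes.

    x         : feature value of every point, shape (n,)
    labels    : index of the cluster center assigned to each point, shape (n,)
    centers_f : the same feature's value for every center, shape (k,)
    Returns (threshold, mistakes), or None if the centers cannot be separated.
    Hypothetical helper, not the repository's API.
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels, dtype=np.int64)
    centers_f = np.asarray(centers_f, dtype=float)
    k = centers_f.shape[0]

    lo, hi = centers_f.min(), centers_f.max()
    if lo == hi:
        return None  # every center has the same value on this feature

    # Candidate thresholds lie in [lo, hi), so with the rule "value <= t goes
    # left" at least one center lands on each side of every candidate.
    thresholds = np.linspace(lo, hi, n_bins, endpoint=False)
    n_thr = thresholds.shape[0]

    # bin_idx[i] is the first threshold index j with x[i] <= thresholds[j];
    # a value above every threshold gets index n_thr (never on the left).
    bin_idx = np.searchsorted(thresholds, x, side="left")

    # One histogram row per cluster: a single pass over the points.
    hist = np.zeros((k, n_thr + 1), dtype=np.int64)
    np.add.at(hist, (labels, bin_idx), 1)

    # left_counts[c, j] = points of cluster c with x <= thresholds[j].
    left_counts = np.cumsum(hist, axis=1)[:, :n_thr]
    totals = hist.sum(axis=1, keepdims=True)

    # A point is a mistake when it falls on the other side from its center.
    center_left = centers_f[:, None] <= thresholds[None, :]
    mistakes = np.where(center_left, totals - left_counts, left_counts).sum(axis=0)

    best = int(np.argmin(mistakes))
    return float(thresholds[best]), int(mistakes[best])


if __name__ == "__main__":
    x = [-1.0, 0.5, 1.5, 1.8, 6.5, 7.5, 9.0, 1.2]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    print(histogram_mistakes(x, labels, centers_f=[0.0, 8.0], n_bins=8))
```

The point of the histogram is that every candidate threshold is scored from per-cluster bin counts and one cumulative sum, rather than by rescanning the points per threshold. Per-cluster count arrays like these can be summed across data partitions, which is what makes the approach a natural fit for a distributed setting.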

License

MIT License
