dslice and dsperse file type integration #37


Open

0pendansor wants to merge 10 commits into main from dan/dslice

Collaborator

0pendansor commented Oct 21, 2025

Add dsperse and dslice files.

0pendansor added 8 commits

September 25, 2025 13:45


          Enhance compiler with layer selection and extend slicer model mappings

958035c


          produce .dslice files on slice.

71b64a5


          Restructure and get working conversions with dslices

f587784


          CLI commands fix

2d31e65


          Update .gitignore to exclude additional model-related files and formats

d90e00e


          Allow for compilation with dsperse and dslice files.

b827b86


          update metadata for dslice compilation

1b4fa73


          Dslice and Dsperse run prove and verify

fb510bc

0pendansor requested review from HudsonGraeme and shirin-shahabi

October 21, 2025 04:06

0pendansor self-assigned this


          Enhance slicer, compiler, and CLI to support single-slice mode and input via .dsperse/.dslice files.

f8eb480

shirin-shahabi force-pushed the dan/dslice branch from 04c3c84 to f8eb480 Compare

October 30, 2025 14:58


          Enhanced prove/verify with --from flag for auto-unpacking and run detection; Added exclude_patterns to Converter for pk exclusion during packing

0898cdf

HudsonGraeme requested changes


Member

HudsonGraeme left a comment

Looks awesome! Left some small comments, questions and thoughts here 🙏

The main points:

In the subnet we can't share DSperse / DSlice files back and forth at runtime: it would be very inefficient and incur a huge amount of data transfer. Ideally we'd transmit only minimal witness inputs and proof data.
In the future, we could improve the pathing logic by having users indicate which kinds of files they've provided and where those are located, then failing early if any are missing. That would save us a lot of effort on file searching and path verification.
Since we're supporting JSTprove, we should be cognizant of areas where we're hardcoding ezkl-related paths (this is probably out of scope for this PR).

dsperse/src/analyzers/runner_analyzer.py

    
                                  logger.warning(f"Failed to read slice metadata at {slice_meta_path}: {e}")

                          # Extract IO shapes and deps

                          io_meta = slice_level_meta.get('io', {}) if isinstance(slice_level_meta, dict) else {}

Member

HudsonGraeme Nov 5, 2025

I'd recommend a single helper function to handle defensive path resolution for both slices and main files, to DRY things out.
Also, can we assume the paths are correct and the files exist if the base file exists? The only program creating these files is DSperse itself, and thus it should be guaranteed that these paths exist.
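
As an illustration of the single-helper idea, here is a minimal sketch. The function name `resolve_artifact_path` and its behaviour are assumptions for discussion, not code from this PR; it only shows how slice-level and model-level call sites could share one resolution routine.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_artifact_path(
    base_dir: Union[str, Path], candidate: Optional[str]
) -> Optional[Path]:
    """Resolve a path recorded in metadata against base_dir.

    Hypothetical helper: returns None for missing/empty entries, keeps
    absolute paths as they are, and anchors relative paths at base_dir.
    """
    if not candidate:
        return None
    path = Path(candidate).expanduser()
    if not path.is_absolute():
        path = Path(base_dir) / path
    # strict=False normalizes the path without requiring the file to exist,
    # so existence checks can stay in one place (or be dropped entirely if
    # DSperse-produced bundles are trusted).
    return path.resolve(strict=False)
```

Call sites that currently repeat `isinstance` and `os.path` checks could then reduce to one call per metadata entry.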

dsperse/src/analyzers/runner_analyzer.py

    
                      execution_chain = {

                          "head": "segment_0" if segments else None,

                          "head": ordered_keys[0] if ordered_keys else None,

Member

HudsonGraeme Nov 5, 2025

I'd recommend validating that one or more slices exist when the command is called, to avoid defensive checks further down the codebase (such as here). If we don't have a slice (i.e. no ordered keys), this would be a fatal error.
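
A rough sketch of what up-front validation could look like. `NoSlicesError`, `require_slices`, and `build_chain_head` are hypothetical names, and the sketch assumes the slices arrive as a mapping that is already in execution order.

```python
from typing import Dict, List, Mapping


class NoSlicesError(ValueError):
    """Raised when a command is invoked without any slices (hypothetical)."""


def require_slices(slices: Mapping[str, dict]) -> List[str]:
    """Return the slice keys in order, or fail immediately if there are none."""
    ordered_keys = list(slices)  # assumption: mapping is already ordered
    if not ordered_keys:
        raise NoSlicesError("no slices found; at least one slice is required")
    return ordered_keys


def build_chain_head(slices: Mapping[str, dict]) -> Dict[str, object]:
    """Downstream code can index ordered_keys[0] without a guard."""
    ordered_keys = require_slices(slices)
    return {"head": ordered_keys[0], "nodes": {}}
```

With the check at the command entry point, the `if ordered_keys else None` fallback in the quoted line would no longer be needed.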

dsperse/src/analyzers/runner_analyzer.py

    
                          use_circuit = bool(meta.get('ezkl')) and has_circuit and has_keys

                          next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None

                          execution_chain["nodes"][slice_key] = {

Member

HudsonGraeme Nov 5, 2025

Curious about this. If I understand correctly, this seems to leave a hole in the array at index zero and stop early before the last element, because the last i + 1 will exceed len(ordered_keys) - 1.
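
To make the question concrete, here is a small sketch of what the quoted expression evaluates to. It assumes the surrounding loop is `for i, slice_key in enumerate(ordered_keys)`; the loop header is not visible in the diff excerpt, so that is an assumption, and the keys are made up.

```python
ordered_keys = ["slice_0", "slice_1", "slice_2"]  # hypothetical keys

next_by_key = {}
for i, slice_key in enumerate(ordered_keys):  # assumed loop header
    # Same expression as in the diff: the guard is evaluated first, so
    # ordered_keys[i + 1] is only indexed when i + 1 is still in range.
    next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None
    next_by_key[slice_key] = next_slice

print(next_by_key)
# {'slice_0': 'slice_1', 'slice_1': 'slice_2', 'slice_2': None}

# An equivalent formulation without index arithmetic:
paired = dict(zip(ordered_keys, ordered_keys[1:] + [None]))
```

Under that assumption every key, including the first and the last, gets an entry, and the last one points to `None`. If the real loop iterates differently (for example over `range(1, ...)`), the concern about a gap at index zero would still apply.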

dsperse/src/analyzers/runner_analyzer.py

    
                              execution_chain["fallback_map"][compiled_circuit_path] = onnx_slice_path

                          elif onnx_slice_path:

                              execution_chain["fallback_map"][segment_key] = onnx_slice_path

                          if has_circuit and onnx_path:

Member

HudsonGraeme Nov 5, 2025

I'm not sure this is necessary. The ONNX path is already included in the execution chain node, and application logic should assume that anything passed as circuit_path is a circuit rather than raw ONNX.

dsperse/src/analyzers/runner_analyzer.py

    
                          execution_chain["nodes"][slice_key] = {

                              "slice_id": slice_key,

                              "primary": circuit_path if use_circuit else onnx_path,

                              "fallback": onnx_path,

Member

HudsonGraeme Nov 5, 2025

This is a duplicate of onnx_path

dsperse/src/cli/prove.py

    
                  # Ensure candidate is normalized in case prompt returned a path-like

                  candidate = normalize_path(candidate)

                  # Check if candidate is a .dsperse or .dslice file - unpack if needed

Member

HudsonGraeme Nov 5, 2025

Is there a way to check once, directly, that the path exists in the provided run directory, so we can avoid searching?

dsperse/src/cli/prove.py

    
                      print(f"{Fore.GREEN}✓ Proof generation completed in {elapsed_time:.2f} seconds!{Style.RESET_ALL}")

                      # If we unpacked a dsperse file, repack it with proofs (exclude pk for verification)

Member

HudsonGraeme Nov 5, 2025

At runtime I think this will be difficult: the verifying party does not need any of the information inside the dsperse file except the proof and the verification key (which it should already have), so I would lean towards not repacking the file. Sharing models back and forth between parties would incur a significant amount of data transfer overhead.
It also becomes unintuitive code-wise, because we'd be checking for the presence of _proved in the filename to inform our unpacking process, and the filename provides no hard guarantee of the contents.
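
One way to picture the alternative is a small, self-describing proof message instead of a repacked bundle. The sketch below is purely illustrative: `ProofPayload`, its fields, and the hex encoding are assumptions, and proofs are treated as opaque bytes rather than any specific ezkl or JSTprove format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class ProofPayload:
    """Hypothetical minimal message from prover to verifier."""

    run_id: str
    proofs: Dict[str, str]  # slice id -> hex-encoded proof bytes
    has_proofs: bool  # explicit flag instead of a "_proved" filename suffix


def encode_payload(run_id: str, proofs_by_slice: Dict[str, bytes]) -> str:
    """Serialize only the proofs; models and proving keys stay local."""
    payload = ProofPayload(
        run_id=run_id,
        proofs={key: blob.hex() for key, blob in proofs_by_slice.items()},
        has_proofs=bool(proofs_by_slice),
    )
    return json.dumps(asdict(payload), sort_keys=True)


def decode_payload(text: str) -> Dict[str, bytes]:
    """Reject payloads without proofs based on content, not on a filename."""
    data = json.loads(text)
    if not data.get("has_proofs"):
        raise ValueError("payload carries no proofs")
    return {key: bytes.fromhex(blob) for key, blob in data["proofs"].items()}
```

The point of the explicit `has_proofs` field is that the receiver decides how to proceed from the payload's contents, which the filename cannot guarantee.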

dsperse/src/cli/slice.py

    
                                            help='Select output format: dsperse (single bundle), dslice (one .dslice per slice), or dirs (unpacked directories). Default: dirs')

                  # Sub-commands under slice

                  sub = slice_parser.add_subparsers(dest='slice_subcommand', help='Slice sub-commands')

Member

HudsonGraeme Nov 5, 2025

This may be a duplicate of dsperse convert

dsperse/src/cli/verify.py

    
                      return []

                  # Normalize the run root directory to ensure absolute paths

                  run_root_dir = normalize_path(run_root_dir)

Member

HudsonGraeme Nov 5, 2025

This file shares a significant amount of code with run; ideally we'd DRY the common code out.

dsperse/src/run/runner.py

    
                              slices_dir = self._converted_dir

                              model_dir = slices_dir.parent

                          # Detect single-slice mode: a directory that itself is a slice (has metadata.json + payload) and no slice_* children

Member

HudsonGraeme Nov 5, 2025

If we can have the user specify what they're inputting or structure out subcommands in such a way that inherently specifies the incoming file, we could remove all logic that attempts to detect the format or search paths
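
A minimal sketch of letting the user declare the input kind, so no detection or searching is needed. The `--input-kind` flag, the `example-run` program name, and `check_input` are hypothetical; the three kinds reuse the format names from the `slice` help text quoted earlier in this review (dsperse, dslice, dirs).

```python
import argparse
from pathlib import Path

INPUT_KINDS = ("dsperse", "dslice", "dirs")


def build_parser() -> argparse.ArgumentParser:
    """Parser where the caller states what they are passing in."""
    parser = argparse.ArgumentParser(prog="example-run")  # hypothetical CLI
    parser.add_argument("path", type=Path, help="input file or directory")
    parser.add_argument(
        "--input-kind",  # hypothetical flag, not part of the PR
        choices=INPUT_KINDS,
        required=True,
        help="what 'path' points to",
    )
    return parser


def check_input(path: Path, kind: str) -> None:
    """Single up-front check; fail early instead of searching later."""
    if not path.exists():
        raise FileNotFoundError(f"{kind} input not found: {path}")
    if kind == "dirs" and not path.is_dir():
        raise NotADirectoryError(f"expected a directory for 'dirs': {path}")
    if kind in ("dsperse", "dslice") and not path.is_file():
        raise ValueError(f"expected a .{kind} file, got a directory: {path}")
```

A typical entry point would call `args = build_parser().parse_args()` followed by `check_input(args.path, args.input_kind)` before any unpacking or running starts.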


Labels

None yet