@0pendansor
Collaborator

Add dsperse and dslice files.

…ection; Added exclude_patterns to Converter for pk exclusion during packing
Member

@HudsonGraeme HudsonGraeme left a comment

Looks awesome! Left some small comments, questions and thoughts here 🙏

The main points:

  • In the subnet we can't share DSperse / DSlice files back and forth at runtime: it would be very inefficient and incur a huge amount of data transfer. Ideally we'd transmit only minimal witness inputs and proof data.
  • If we could improve the pathing logic in the future so that users indicate what kinds of files they've provided and where those are located, and then fail early if any are missing, it would save us a lot of effort searching for files and verifying paths.
  • Since we're supporting JSTprove (this is probably out of scope for this PR), we should be cognizant of areas where we're hardcoding ezkl-related paths.
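
The fail-early idea in the second point could look roughly like the sketch below. It assumes a hypothetical `validate_declared_inputs` helper and hypothetical kind names such as `model` and `slices_dir`; none of these are existing DSperse APIs.

```python
from pathlib import Path


class MissingInputError(FileNotFoundError):
    """Raised when one or more user-declared inputs are absent."""


def validate_declared_inputs(declared: dict) -> dict:
    """Resolve every declared input once and fail early if any is missing.

    `declared` maps a hypothetical kind name (e.g. "model") to the path the
    user supplied for it. All missing entries are reported together.
    """
    resolved = {kind: Path(p).expanduser().resolve() for kind, p in declared.items()}
    missing = [f"{kind}: {path}" for kind, path in resolved.items() if not path.exists()]
    if missing:
        raise MissingInputError("missing inputs: " + "; ".join(missing))
    return resolved
```

Called once at the top of a command, this would let the rest of the code use the returned paths without re-checking them.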

logger.warning(f"Failed to read slice metadata at {slice_meta_path}: {e}")

# Extract IO shapes and deps
io_meta = slice_level_meta.get('io', {}) if isinstance(slice_level_meta, dict) else {}

Would recommend a single helper function to handle defensive path resolution for both slices and main files, to DRY things out.
Also, can we assume the paths are correct and the files exist if the base file exists? The only program creating these files is DSperse itself, and thus it should be guaranteed that these paths exist.
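
A single resolution helper might look something like this. It is a sketch only: `resolve_artifact` and its `must_exist` flag are hypothetical names, not functions from the PR.

```python
from pathlib import Path
from typing import Optional


def resolve_artifact(base_dir: Path, raw: Optional[str], *, must_exist: bool = True) -> Optional[Path]:
    """Resolve a metadata path entry against base_dir in one place.

    Relative entries are joined onto base_dir; empty entries return None.
    """
    if not raw:
        return None
    path = Path(raw)
    if not path.is_absolute():
        path = base_dir / path
    path = path.resolve()
    if must_exist and not path.exists():
        raise FileNotFoundError(f"artifact not found: {path}")
    return path
```

If the files are guaranteed to exist whenever the base file does, callers could pass `must_exist=False` and skip the filesystem check entirely.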


  execution_chain = {
- "head": "segment_0" if segments else None,
+ "head": ordered_keys[0] if ordered_keys else None,

I'd recommend validating that one or more slices exist when the command is called, to avoid defensive checks further down the codebase (such as here). If we don't have a slice (i.e. no ordered keys), this would be a fatal error.
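
As a rough illustration, the check could happen once at command entry so the chain builder no longer needs the `if ordered_keys else None` guard. `require_slices` and `build_chain_head` are hypothetical names used only for this sketch.

```python
def require_slices(ordered_keys: list) -> list:
    """Fail fast when no slices were discovered; otherwise pass the keys through."""
    if not ordered_keys:
        raise ValueError("no slices found; at least one slice is required")
    return ordered_keys


def build_chain_head(ordered_keys: list) -> dict:
    """Build the start of the execution chain, assuming slices were validated."""
    keys = require_slices(ordered_keys)
    # No defensive fallback to None is needed after validation.
    return {"head": keys[0], "nodes": {}}
```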

use_circuit = bool(meta.get('ezkl')) and has_circuit and has_keys

next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None
execution_chain["nodes"][slice_key] = {

Curious about this. If I understand correctly, this seems to leave a hole in the array at index zero and stop early before the last element, because the last i + 1 will exceed len(ordered_keys) - 1.
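
One way to check this concern is to run the expression in isolation. The sketch below assumes the surrounding loop is `for i, slice_key in enumerate(ordered_keys)`; the loop header is not shown in the diff, so that shape is an assumption, and `next_links` is a hypothetical wrapper.

```python
from typing import Dict, List, Optional


def next_links(ordered_keys: List[str]) -> Dict[str, Optional[str]]:
    """Map each slice key to its successor using the expression from the diff."""
    links: Dict[str, Optional[str]] = {}
    for i, slice_key in enumerate(ordered_keys):  # assumed loop shape
        # i + 1 is only indexed when the guard confirms it is in range.
        next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None
        links[slice_key] = next_slice
    return links


print(next_links(["slice_0", "slice_1", "slice_2"]))
# {'slice_0': 'slice_1', 'slice_1': 'slice_2', 'slice_2': None}
```

Under that assumption every key gets an entry, including the first, and only the last maps to `None`. If the real loop iterates differently, the result may differ.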

execution_chain["fallback_map"][compiled_circuit_path] = onnx_slice_path
elif onnx_slice_path:
execution_chain["fallback_map"][segment_key] = onnx_slice_path
if has_circuit and onnx_path:

I'm not sure this is necessary: this onnx path is already included in the execution chain node, and application logic should assume that anything passed as circuit_path is a circuit rather than raw onnx.

execution_chain["nodes"][slice_key] = {
"slice_id": slice_key,
"primary": circuit_path if use_circuit else onnx_path,
"fallback": onnx_path,

This is a duplicate of onnx_path

# Ensure candidate is normalized in case prompt returned a path-like
candidate = normalize_path(candidate)

# Check if candidate is a .dsperse or .dslice file - unpack if needed

Is there a way to check once, up front, that the path exists in the provided run directory, so we can avoid searching?


print(f"{Fore.GREEN}✓ Proof generation completed in {elapsed_time:.2f} seconds!{Style.RESET_ALL}")

# If we unpacked a dsperse file, repack it with proofs (exclude pk for verification)

At runtime I think this will be difficult: the verifying party does not need any of the information inside the dsperse file except the proof and the verification key (which it should already have), so I would lean towards avoiding repacking the file. Sharing models back and forth between parties would incur a significant amount of data transfer overhead.
It also becomes unintuitive code-wise, because we'd be checking for the presence of _proved in the filename to inform our unpacking process (the filename provides no hard guarantee of contents).

help='Select output format: dsperse (single bundle), dslice (one .dslice per slice), or dirs (unpacked directories). Default: dirs')

# Sub-commands under slice
sub = slice_parser.add_subparsers(dest='slice_subcommand', help='Slice sub-commands')

This may be a duplicate of dsperse convert

return []

# Normalize the run root directory to ensure absolute paths
run_root_dir = normalize_path(run_root_dir)

This file shares a significant amount of code with run; ideally we'd DRY the common code out.

slices_dir = self._converted_dir
model_dir = slices_dir.parent

# Detect single-slice mode: a directory that itself is a slice (has metadata.json + payload) and no slice_* children

If we can have the user specify what they're inputting, or structure the subcommands in such a way that they inherently specify the incoming file, we could remove all the logic that attempts to detect the format or search paths.
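
A minimal argparse sketch of the explicit-format idea follows. The `--input-format` flag, the `dsperse-sketch` program name, and reusing `dsperse` / `dslice` / `dirs` as input choices are all assumptions made for illustration; the PR only shows those names as output formats.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where the user states the input format explicitly."""
    parser = argparse.ArgumentParser(prog="dsperse-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run inference on a declared input")
    # Hypothetical flag: the format is declared, never detected.
    run.add_argument("--input-format", choices=["dsperse", "dslice", "dirs"], required=True)
    run.add_argument("path", help="location of the declared input")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"{args.command}: {args.input_format} at {args.path}")
```

With the format declared up front, argparse rejects unknown values before any path searching or single-slice detection would need to run.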
