dslice and dsperse file type integration #37
base: main
…put via .dsperse/.dslice files.
04c3c84 to f8eb480
…ection; Added exclude_patterns to Converter for pk exclusion during packing
HudsonGraeme left a comment
Looks awesome! Left some small comments, questions and thoughts here 🙏
The main points:
- In the subnet we can't share DSperse / DSlice files back and forth at runtime, as that would be very inefficient and incur a huge amount of data transfer. Ideally we'd transmit only minimal witness inputs and proof data between parties.
- If we could improve the pathing logic in the future so that users indicate what kinds of files they've provided and where those are located, and then fail early if any are missing, it would save us a lot of effort searching for files and verifying paths.
- Since we're supporting JSTprove, we should be mindful of areas where we're hardcoding ezkl-related paths (this is probably out of scope for this PR).
logger.warning(f"Failed to read slice metadata at {slice_meta_path}: {e}")

# Extract IO shapes and deps
io_meta = slice_level_meta.get('io', {}) if isinstance(slice_level_meta, dict) else {}
I'd recommend a single helper function to handle defensive path resolution in slices and main files, to DRY things out.
Also, can we assume the paths are correct and the files exist if the base file exists? The only program creating these files is DSperse itself, so it should be guaranteed that these paths exist.
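To make the helper suggestion concrete, here is a minimal sketch of what a shared resolver could look like. The name `resolve_path` and its signature are hypothetical and not taken from the DSperse codebase; it only uses `pathlib` from the standard library.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_path(base_dir: PathLike, candidate: Optional[PathLike]) -> Optional[Path]:
    """Resolve `candidate` against `base_dir`; return None if it is empty or missing.

    Hypothetical helper: the name and behaviour are assumptions for illustration.
    """
    if not candidate:
        return None
    path = Path(candidate).expanduser()
    if not path.is_absolute():
        # Relative entries are interpreted relative to the slice or model directory.
        path = Path(base_dir) / path
    path = path.resolve()
    return path if path.exists() else None
```

Slice-level and model-level code could then both call this one function instead of repeating their own existence checks.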
execution_chain = {
    "head": "segment_0" if segments else None,
    "head": ordered_keys[0] if ordered_keys else None,
I'd recommend validating that at least one slice exists when the command is called, to avoid defensive checks further down the codebase (such as here). If we don't have a slice (i.e. no ordered keys), that should be a fatal error.
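As a rough sketch of that idea: validate once at the entry point, then let downstream code index into the list without guards. `NoSlicesError`, `require_slices` and `build_chain_head` are hypothetical names used only for illustration.

```python
class NoSlicesError(ValueError):
    """Raised when a command is invoked without any slices (hypothetical type)."""


def require_slices(ordered_keys):
    """Return the keys as a list, or fail early if there are none."""
    keys = list(ordered_keys)
    if not keys:
        raise NoSlicesError("no slices found; at least one slice is required")
    return keys


def build_chain_head(ordered_keys):
    """Downstream code can index keys[0] directly once validation has run."""
    keys = require_slices(ordered_keys)
    return {"head": keys[0], "nodes": {}}
```

With this in place, the `if ordered_keys else None` fallback would no longer be needed when building the chain.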
use_circuit = bool(meta.get('ezkl')) and has_circuit and has_keys

next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None
execution_chain["nodes"][slice_key] = {
Curious about this. If I understand correctly, this seems to leave a hole in the array at index zero and stop early before the last element, because the last i + 1 will exceed len(ordered_keys) - 1.
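One quick way to check this is to run the expression from the diff over a small list. The sketch below assumes the surrounding loop is `for i, slice_key in enumerate(ordered_keys)`, which the excerpt does not show; `link_slices` is a hypothetical wrapper written only for this check.

```python
def link_slices(ordered_keys):
    """Map each slice key to its successor using the expression from the diff."""
    links = {}
    # Assumption: the real loop enumerates ordered_keys in the same way.
    for i, slice_key in enumerate(ordered_keys):
        next_slice = ordered_keys[i + 1] if i < len(ordered_keys) - 1 else None
        links[slice_key] = next_slice
    return links


print(link_slices(["slice_0", "slice_1", "slice_2"]))
# {'slice_0': 'slice_1', 'slice_1': 'slice_2', 'slice_2': None}
```

Under that assumption every key gets an entry and the last one maps to None. If the loop is written differently, the result may differ, so a unit test along these lines would settle the question.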
    execution_chain["fallback_map"][compiled_circuit_path] = onnx_slice_path
elif onnx_slice_path:
    execution_chain["fallback_map"][segment_key] = onnx_slice_path
if has_circuit and onnx_path:
I'm not sure this is necessary. This ONNX path is already included in the execution chain node, and application logic should assume that anything passed as circuit_path is a circuit rather than raw ONNX.
execution_chain["nodes"][slice_key] = {
    "slice_id": slice_key,
    "primary": circuit_path if use_circuit else onnx_path,
    "fallback": onnx_path,
This is a duplicate of onnx_path
# Ensure candidate is normalized in case prompt returned a path-like
candidate = normalize_path(candidate)

# Check if candidate is a .dsperse or .dslice file - unpack if needed
Is there a way we can check once, directly, that the path exists in the provided run directory, to avoid searching?
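For illustration, a one-time check could look something like the sketch below: the caller declares which files it expects under the run directory, and everything is verified in one pass. `missing_files` and `check_run_dir` are hypothetical names, and the list of required files is an assumption supplied by the caller.

```python
from pathlib import Path


def missing_files(run_dir, required):
    """Return the names in `required` that do not exist under `run_dir`."""
    root = Path(run_dir)
    return [name for name in required if not (root / name).exists()]


def check_run_dir(run_dir, required):
    """Fail early with one error listing every missing file; otherwise map names to paths."""
    missing = missing_files(run_dir, required)
    if missing:
        raise FileNotFoundError(f"{run_dir} is missing: {', '.join(missing)}")
    return {name: Path(run_dir) / name for name in required}
```

Callers would then use the returned paths directly rather than searching for candidates.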
print(f"{Fore.GREEN}✓ Proof generation completed in {elapsed_time:.2f} seconds!{Style.RESET_ALL}")

# If we unpacked a dsperse file, repack it with proofs (exclude pk for verification)
At runtime I think this will be difficult: the verifying party does not need anything inside the dsperse file except the proof and the verification key (which it should already have), so I would lean towards not repacking the file. Sharing models back and forth between parties would incur significant data transfer overhead.
It also becomes unintuitive code-wise, because we'd be checking for the presence of _proved in the filename to inform our unpacking process (the filename provides no hard guarantee of the contents).
help='Select output format: dsperse (single bundle), dslice (one .dslice per slice), or dirs (unpacked directories). Default: dirs')

# Sub-commands under slice
sub = slice_parser.add_subparsers(dest='slice_subcommand', help='Slice sub-commands')
This may be a duplicate of dsperse convert
return []

# Normalize the run root directory to ensure absolute paths
run_root_dir = normalize_path(run_root_dir)
This file shares a significant amount of code with run; ideally we'd DRY the common code out.
slices_dir = self._converted_dir
model_dir = slices_dir.parent

# Detect single-slice mode: a directory that itself is a slice (has metadata.json + payload) and no slice_* children
If we can have the user specify what they're inputting, or structure the subcommands in such a way that they inherently specify the incoming file type, we could remove all the logic that attempts to detect the format or search paths.
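A minimal sketch of the subcommand approach, using `argparse` from the standard library. The program name `example-run`, the `input_kind` destination and the `path` argument are hypothetical; the three kinds mirror the formats named in the help text above (dsperse, dslice, dirs).

```python
import argparse


def build_parser():
    """Build a parser where the chosen subcommand states the input format."""
    parser = argparse.ArgumentParser(prog="example-run")  # hypothetical program name
    sub = parser.add_subparsers(dest="input_kind", required=True)
    for kind, description in [
        ("dsperse", "a single .dsperse bundle"),
        ("dslice", "a single .dslice file"),
        ("dirs", "an unpacked slices directory"),
    ]:
        child = sub.add_parser(kind, help=f"input is {description}")
        child.add_argument("path", help=f"path to {description}")
    return parser


args = build_parser().parse_args(["dslice", "slice_0.dslice"])
print(args.input_kind, args.path)
```

Because `input_kind` is always set, the handler can dispatch on it directly and skip format detection.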
Add dsperse and dslice files.