astro-friedel

Description

This PR introduces a new wrapper in apps called bash_watch, which makes DynamicFileLists usable with bash_apps. While a bash_app executes, the wrapper monitors specified directories for newly created files and registers those files with the file provenance framework.
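
For readers who want a feel for the idea, here is a minimal sketch of watching a directory while a shell command runs. It is not the code in this PR: it uses a plain polling thread from the standard library, and `watch_during` and its `interval` parameter are hypothetical names chosen for illustration. Registration with the provenance framework is left out.

```python
import os
import shlex
import subprocess
import tempfile
import threading


def watch_during(command, directory, interval=0.05):
    """Run a shell command and return names that appeared in `directory` meanwhile.

    Hypothetical helper: polls the directory instead of using OS file events.
    """
    baseline = set(os.listdir(directory))
    seen = set()
    stop = threading.Event()

    def poll():
        # Scan every `interval` seconds until the main thread signals a stop.
        while not stop.wait(interval):
            seen.update(set(os.listdir(directory)) - baseline)
        # One last scan so files written just before exit are not missed.
        seen.update(set(os.listdir(directory)) - baseline)

    watcher = threading.Thread(target=poll, daemon=True)
    watcher.start()
    try:
        subprocess.run(command, shell=True, check=True)
    finally:
        stop.set()
        watcher.join()
    return sorted(seen)


with tempfile.TemporaryDirectory() as workdir:
    targets = " ".join(
        shlex.quote(os.path.join(workdir, name)) for name in ("a.txt", "b.txt")
    )
    print(watch_during(f"touch {targets}", workdir))  # ['a.txt', 'b.txt']
```

In this sketch the returned names would be the ones handed to whatever registers files for provenance.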

Changed Behaviour

This work expands the functionality of Parsl to enable full file provenance tracking for bash_app.

Fixes

None

Type of change

  • New feature

initial code to handle file related monitoring messages
Member

@yadudoc left a comment

Hi @astro-friedel, I've got some open questions about the interface by which a user specifies these dynamic lists as well as the mechanism used to track files:

  • I'm not clear on the benefits of tracking files as they are created rather than figuring out a delta of file listings in specific directories. The latter is lightweight and should yield the same information.

  • The second concern is the mechanism by which a user indicates what paths will be tracked. We want to be able to specify paths to track and we want to be able to tell the app what these paths are. I'd guess that we'd want to say something like: DynamicFileList(/path/*regex*) or similar to match only relevant files. We might want to keep this to directories because passing the target directory to the app is harder when the target is a regex string.
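
To make the delta idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: `list_matching` and `run_and_diff` are hypothetical names, not existing Parsl functions, and it filters with a glob pattern via `fnmatch` rather than a true regex. The directory is passed to the command on its own (as the working directory), and the pattern is only used to filter the listing.

```python
import fnmatch
import os
import subprocess
import tempfile


def list_matching(directory, pattern="*"):
    """Return the set of file names in `directory` that match a glob pattern."""
    return {
        entry.name
        for entry in os.scandir(directory)
        if entry.is_file() and fnmatch.fnmatch(entry.name, pattern)
    }


def run_and_diff(command, directory, pattern="*"):
    """Run `command` in `directory`; return matching files that exist only afterwards."""
    before = list_matching(directory, pattern)
    subprocess.run(command, shell=True, check=True, cwd=directory)
    after = list_matching(directory, pattern)
    return sorted(after - before)


with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "old.csv"), "w").close()  # pre-existing file
    added = run_and_diff("touch new.csv notes.log", workdir, pattern="*.csv")
    print(added)  # ['new.csv']
```

Only two directory listings are taken per app invocation, which is why I'd expect this to be lighter than continuous tracking. Files that are created and deleted again while the command runs would not show up in the delta.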

self.added_files = list(updated - set(self.files))


class BashObserver(Observer):
Member

@astro-friedel Based on our discussion at the parsl dev calls, I was under the impression that the plan was to simplify the mechanism for detecting app-specific files. We'd discussed capturing the delta of the files in a directory path before and after executing the command, which would be a much lighter solution than relying on continuous tracking. Let me know what you think.

@astro-friedel marked this pull request as draft on August 18, 2025, 15:00