Adding bash_watch for File Provenance Tracking #3905
base: master
…le_inputs_and_outputs
initial code to handle file related monitoring messages
…t the one that used them last
…ng the future as "done"
Hi @astro-friedel, I've got some open questions about the interface by which a user specifies these dynamic lists as well as the mechanism used to track files:
- I'm not clear on the benefits of tracking files as they are created rather than figuring out a delta of file listings in specific directories. The latter is lightweight and should yield the same information.
- The second concern is the mechanism by which a user indicates what paths will be tracked. We want to be able to specify paths to track and we want to be able to tell the app what these paths are. I'd guess that we'd want to say something like:
      DynamicFileList(/path/*regex*)
  or similar to match only relevant files. We might want to keep this to directories because passing the target directory to the app is harder when the target is a regex string.
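One way to picture the path-specification concern is the sketch below. It is illustrative only and not part of this PR or of Parsl: the helper names `split_watch_pattern` and `matching_files` are hypothetical, and it uses shell-style globs from the standard-library `fnmatch` module rather than true regular expressions. Splitting the pattern into a directory part and a filename part keeps a plain directory available to hand to the app.

```python
import fnmatch
import os


def split_watch_pattern(pattern):
    """Split '/path/*glob*' into (directory, filename pattern).

    Hypothetical helper: a pattern ending in a path separator
    is treated as "every file in that directory".
    """
    directory, name_pattern = os.path.split(pattern)
    return directory, name_pattern or "*"


def matching_files(pattern):
    """Return the sorted regular files in the directory that match the glob."""
    directory, name_pattern = split_watch_pattern(pattern)
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if fnmatch.fnmatch(name, name_pattern)
        and os.path.isfile(os.path.join(directory, name))
    )
```

With this split, the directory half can be passed to the app as an ordinary path while the filename half is only used to filter what gets tracked.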
self.added_files = list(updated - set(self.files))

class BashObserver(Observer):
@astro-friedel Based on our discussion at the parsl dev calls, I was under the impression that the plan was to simplify the mechanism for detecting app-specific files. We'd discussed capturing the delta of the files in a directory path before and after executing the command, which would be a much lighter solution than relying on continuous tracking. Let me know what you think.
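A minimal sketch of the before/after delta idea, for reference. It assumes a blocking shell command and a single watched directory; `snapshot` and `run_and_find_new_files` are hypothetical names, not code from this PR or from Parsl. Note that a listing delta only detects newly created files, not files modified in place.

```python
import subprocess
from pathlib import Path


def snapshot(directory):
    """Return the set of regular files currently under `directory` (recursive)."""
    return {p for p in Path(directory).rglob("*") if p.is_file()}


def run_and_find_new_files(command, directory):
    """Run a shell command, then report files that appeared under `directory`.

    Returns (exit code, sorted list of newly created paths).
    """
    before = snapshot(directory)                   # listing before the command
    result = subprocess.run(command, shell=True)   # blocks until the command exits
    after = snapshot(directory)                    # listing after the command
    return result.returncode, sorted(after - before)
```

The set difference `after - before` plays the same role as the `updated - set(self.files)` expression in the diff above, but it is computed once per command instead of continuously.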
Description
This PR introduces a new wrapper in apps called bash_watch. This wrapper was created as a way to implement using DynamicFileLists with bash_apps. It will monitor specific directories for newly created files during a bash_app execution. These files are then registered with the file provenance framework.
Changed Behaviour
This work expands the functionality of Parsl to enable full file provenance tracking for bash_app.
Fixes
None
Type of change