astro-friedel commented Jan 21, 2025

Description

Parsl does not track files that an app creates at run time if they were not listed in the arguments when the app was called. For example:

from parsl import python_app
from parsl.data_provider.files import File

@python_app
def process(inputs=[], outputs=[]):
    with open(inputs[0], 'r') as fh:
        lines = fh.readlines()
    for i in [1, 2, 3]:
        with open(f"dat.{i}.log", 'w') as wh:
            wh.write("xyz\n")
        # These files are created at run time, so Parsl does not know about them at call time
        outputs.append(File(f"dat.{i}.log"))

@python_app
def compact(inputs=[], outputs=[]):
    with open(outputs[0], 'w') as fh:
        for f in inputs:
            # Copy the contents of each input file into the compacted log
            with open(f) as rh:
                fh.writelines(rh.readlines())

outs = []
p = process(inputs=[File("input.dat")], outputs=outs)
c = compact(inputs=p.outputs, outputs=[File("compact.log")])

While process will properly write the log files and append them to the list, compact is unlikely to see them. This is because Parsl sees the outputs from process as an empty list at call time and does not know that any files are actually created. As a result, the constructed DAG contains no edge between process and compact, so they can run in parallel instead of serially as expected.
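The missing edge can be illustrated with a toy dependency scan. This is a minimal sketch using only the standard library, not Parsl's actual scheduler code; find_dependencies is a hypothetical helper that collects the futures found among an app's arguments.

```python
from concurrent.futures import Future


def find_dependencies(args):
    """Toy stand-in for a dataflow dependency scan (hypothetical, not Parsl code).

    Collects every Future found among the arguments, looking one level into lists.
    """
    deps = []
    for arg in args:
        items = arg if isinstance(arg, list) else [arg]
        deps.extend(item for item in items if isinstance(item, Future))
    return deps


# The producer was called with an empty outputs list, so the consumer
# has nothing to wait on when it is submitted.
producer_outputs = []
empty_deps = find_dependencies([producer_outputs])

# Had the outputs been declared up front, each one would be a future.
declared_outputs = [Future(), Future()]
declared_deps = find_dependencies([declared_outputs])

print(len(empty_deps), len(declared_deps))
```

With no futures in the argument list, the scan returns nothing, so the consumer is free to start immediately.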

To fix this I have created the DynamicFileList class. This class behaves just like a list, but is also a future. If outs in the above example is an instance of this class, then Parsl will know that files are being created and will add a dependency in the DAG for compact, so that it does not execute until process is complete.
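The "list that is also a future" idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the DynamicFileList implementation from this PR: the FutureList class and its mark_complete method are hypothetical names, and plain strings stand in for File objects.

```python
from concurrent.futures import Future


class FutureList(Future):
    """Hypothetical sketch: a Future that also offers basic list behaviour."""

    def __init__(self):
        super().__init__()
        self._items = []

    def append(self, item):
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)

    def mark_complete(self):
        # Resolve the future once the producer has finished adding items.
        self.set_result(list(self._items))


outs = FutureList()
pending_before = not outs.done()  # a consumer would have to wait at this point

# The producer adds its outputs at run time, then signals completion.
for i in [1, 2, 3]:
    outs.append(f"dat.{i}.log")
outs.mark_complete()

print(pending_before, outs.done(), outs.result())
```

Because the object is a future even while empty, a scheduler that scans arguments for futures can register it as a dependency before any files exist.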

Changed Behaviour

Using the DynamicFileList class allows the user to write apps that produce an unknown number of output files and still have those files properly tracked and linked in the DAG, so that dependent apps run in the expected order.

Type of change

Choose which options apply, and delete the ones which do not apply.

  • New feature

initial code to handle file related monitoring messages