To the contrary, allowing subworkflows to publish channels would hurt modularity. If a subworkflow is to be re-used across different pipelines, it should not impose publishing behavior on the calling workflow. For example, I may call a subworkflow and only want to publish some of its channels. If the subworkflow inherently publishes all of its output channels instead of emitting them, that choice is taken from me as the caller. More generally, it is a software engineering best practice to keep I/O at the "boundaries" (beginning and end) of your code, which makes components easier to test and re-use in different contexts.
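As a rough illustration of that split, here is a minimal sketch in which the subworkflow only emits and the entry workflow decides what to publish. The names `SPLIT_BY_SIZE` and `params.input` and the 1024-byte threshold are made up for this example, and the `publish:` / `output` syntax assumes a recent Nextflow release (older releases may additionally need `nextflow.preview.output = true`).

```nextflow
// Hypothetical subworkflow: it emits its results and makes no publishing decisions.
workflow SPLIT_BY_SIZE {
    take:
    ch_files

    main:
    ch_small = ch_files.filter { f -> f.size() < 1024 }
    ch_large = ch_files.filter { f -> f.size() >= 1024 }

    emit:
    small = ch_small
    large = ch_large
}

workflow {
    main:
    // params.input is an assumed parameter (e.g. a glob of input files).
    SPLIT_BY_SIZE(channel.fromPath(params.input))

    publish:
    // The caller publishes only one of the two emitted channels.
    small = SPLIT_BY_SIZE.out.small
}

output {
    small {
        path 'small'
    }
}
```

Another pipeline could call the same subworkflow and publish `large`, both channels, or neither, without touching the subworkflow itself.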
-
You would need to propagate the output channels for each subworkflow or leave them empty:

    workflow {
        main:
        println "$workflow.profile pipeline"
        if( workflow.profile == "analyze" )
            ch_analyze = Analyze()
        else
            ch_analyze = channel.empty()
        if( workflow.profile == "reanalyze" )
            ch_reanalyze = Reanalyze(params.datatype, params.resultFolderToReanalyze)
        else
            ch_reanalyze = channel.empty()
        if( workflow.profile == "test" )
            ch_test = Test()
        else
            ch_test = channel.empty()

        publish:
        analyze = ch_analyze
        reanalyze = ch_reanalyze
        test = ch_test
    }

Of course this example assumes one output channel per subworkflow. I understand that this becomes cumbersome when each subworkflow has many output channels.

I have noticed from studying nf-core pipelines that subworkflows often have many related outputs (e.g. BAM_MARKDUPLICATES_PICARD). I'm guessing that you follow a similar pattern? I think this is a huge source of verbosity in general, but especially in your case of propagating outputs through multiple levels of subworkflows.

Ideally I think this subworkflow would output a single channel of records with the following structure:

    [
        id: meta.id,
        // other meta properties...
        bam: /* ... */,
        cram: /* ... */,
        metrics: /* ... */,
        bai: /* ... */,
        crai: /* ... */,
        csi: /* ... */,
        stats: /* ... */,
        flagstat: /* ... */,
        idxstats: /* ... */,
    ]

This is much easier to emit back up to the entry workflow and publish. Nextflow can even write out the channel into a CSV or JSON file now that you have the map keys. And I know that many people have asked for this kind of pattern just because it's better in general.

This is currently difficult to do because of how process inputs/outputs work. We are working on some possible improvements to the process syntax to make it possible. In the meantime, you could try to create a single output channel by joining the individual channels:

    result = bam
        .join(cram)
        .join(metrics)
        // ...

But until we improve the process input/output syntax, I think this pattern will be ugly either way 🙁
-
Hi, I was wondering if there’s a specific reason the new publish directive is only available in the entry workflow?
https://www.nextflow.io/docs/latest/workflow.html#workflow-outputs
It feels like a missed opportunity that this can't be used within named subworkflows. This limitation means I have to bubble up all my channels to the entry workflow just to publish files, which adds unnecessary overhead and reduces modularity.
Otherwise I would need to use publish-only processes: