Skip to content

new tool: galaxy wrappers for the pcdl physicell data lader command line commands #7034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

elmbeech
Copy link

@elmbeech elmbeech commented Jun 8, 2025

FOR CONTRIBUTOR:

  • [ yes] I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • [ yes] License permits unrestricted use (educational + commercial)
  • [ yes] This PR adds a new tool or tool collection
  • [ no] This PR updates an existing tool or tool collection
  • [ no] This PR does something else (explain below)

Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elmbeech thanks! I added a few comments that apply for all tools. I think a macro file and a proper formatting will help a lot for further reivew.

Please have a look at the Galaxy Language Server, it has some build-in tools for auto-formatting.

<requirement type="package" version="3.3.7">pcdl</requirement>
</requirements>

<stdio> <exit_code range=":" level="log" description="hello world!"/> </stdio>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hello world is not a good description :)
But I guess this entire line can be removed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @bgruening ,
thank you for reviewing!

unfortunately, this line cannot be removed.
if you write a python package with command line hocks, the python output is piped to sterr.
if i remove this line, galaxy thinks it always caught an error when the script runs successfully.
this is why i have to catch all exit_codes as log.
i agree, hello world! is not a good description.
i will change this to catch python script output.

would this be acceptable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you are looking for is https://docs.galaxyproject.org/en/latest/dev/schema.html#error-detection e.g. detect_errors="exit_code" on the command tag. See other Python based tools here in IUC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, exactly not.
As you can see, if I do so an error is detected, even if there is none.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that mean that your tool is reporting an error code of 1, is that the case?

For any other tool, text on stderr is not a problem, you can instruct Galaxy to only react on error codes, not text on stderr.

Copy link
Author

@elmbeech elmbeech Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it is not reporting an error code of 1.
as you can see, in this particular command (which in the pcdl command line version just output to the screen) , the bash line is catching the standard error with 2> and is writing this to version.txt.
and galaxy interpret this as error code 1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i followed this instruction in the galaxy documentation:

type="text" value="PhysiCell_settings.xml" optional="true"
help="the settings.xml that is loaded, from which the cell type ID label mapping, is extracted, if this information is not found in the output xml file. set to None or False if the xml file is missing! default is PhysiCell_settings.xml."
/>
<param
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verbose parameter is a bit useless in the Galaxy context. You can set it to true without exposing this parameter to the user.

Copy link
Author

@elmbeech elmbeech Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible, i would like to have it there for the user.
here are my arguments:

  1. my aim is, that the galaxy wrapper not differ dor the regular pcdl command line command.
  2. if you in the history, click the "info" (dataset details) you can see the difference between verbose and non-verbose in "Tool Standard Output". this might very well be interesting information for the power user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do capture the verbose output, that why I think just enable it by default if it contains useful information. This way all user have all informations. And then the need for this param in the UI is not needed anymore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but this might be a lot of output and uses up space that is not needed.
additionally, no-verbose processing is faster.
I would prefer to have it as an option with the default set to false.

]]></command>

<inputs>
<section name="positional_arguments" title="positional arguments:" expanded="true">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section does not make much sense to a user, does it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it does, if you take a look at the command line help above.
in my opinion, the galaxy wrapper should really just be a wrapper for the regular command line command.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IUC is really trying hard to make the UI even more useable than a CLI. Improving help text, arranging params in sections that cluster nicely together, avoiding params that we can deduce elsewhere etc ...
I'm not sure which user-group you are targeting, but I'm petty sure many Galaxy users have no idea what "positional arguments" mean.

Copy link
Author

@elmbeech elmbeech Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. will do so. will change.

Copy link
Author

@elmbeech elmbeech Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look, numerous parameters are there to reduce RAM consumption and offer processing speed up, if possible.
I think it would be bad to kick out all these options from the GUI because we assume the users are not clever enough to use them.
I can rearrange the parameters and make an "essential" and a collapsed "advanced" section, if this is ok, but I really don't wanna kick out these parameters!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good example. RAM consumption, or memory allocation, is done by the admin or the server and should not be handled by the Galaxy tool logic. A user, even an experienced user, does not know at which server the tool will run, or how much memory the node will have. Nor does the user know how much memory is available. So Galaxy has an entire subsystem to do this for each tool, configurable by the server admin, the compute provider etc ...

Lets maybe consider an extreme case like number of CPUs. We don't want user to specify the number of CPUs for the given arguments above - at least not on public Galaxy servers. If the user provides a number that is too high, its might be inefficient, or the job might not start at all. If the number is too low the tool might crash (there are tools like that ...). Since Galaxy knows about available "destinations" we adjust the CPU count to fit optimally into the infrastructure where it is supposed to be running.

I hope that makes sense.

Copy link
Author

@elmbeech elmbeech Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this makes sense. What I am trying to say is something totally different.

When we do agent-based model simulation, we record agents, the substrate concentrations at each location in the domain, and possible internal state of the cells. Now when you do a sudden plot or downstream analysis where you e.g. would like to analyze the agent, but you will for that not need the information you have about the substrate concentration at every place in the domain, and the internal cell states you should be able to tell that you not like to load this data because it will cause unnecessary use of resources and time. And only you, who is analyzing the data, knows whether you have to load this data or not. Your subsystem will not know or be able to help you there.
When you do a one-off plot or analysis, this does not matter. But when you analyze a parameter sweep with hundreds or even thousands of runs in a workflow, then it does, and an advanced user should be able to finetune this.

@@ -0,0 +1,79 @@
<tool id="pcdl_make_gif" name="pcdl_make_gif" version="3.0.0+galaxy0" python_template_version="3.5" profile="21.05">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a very generic tool that takes images and creates a gif? Or are there some PCDL specific bits in it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is generic, but is one of the python and command line command that pcdl offers. so, i wanna offer it as well in galaxy.

]]></command>

<inputs>
<section name="positional_arguments" title="positional arguments:" expanded="true">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove those sections, if they don't enhance UX for users

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same argument as above.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok! will do so.

@elmbeech
Copy link
Author

elmbeech commented Jun 9, 2025

I think this is what you mean with "langauage server".
Looks like i have to check my implementation with vscode .
ok, i can do that.

Could you please point me to what you mean with "macro"?
Thank you!

@bgruening
Copy link
Member

I think this is what you mean with "langauage server". Looks like i have to check my implementation with vscode . ok, i can do that.

* https://github.com/galaxyproject/galaxy-language-server

Could you please point me to what you mean with "macro"? Thank you!

Sure. https://planemo.readthedocs.io/en/latest/writing_advanced.html#macros-reusable-elements
e.g. here: https://github.com/galaxyproject/tools-iuc/blob/main/tools/bedtools/macros.xml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants