avoid bam2msa to create BAM index in inputdir #3986

Merged 3 commits on Sep 25, 2021

Conversation

pavanvidem
Member

FOR CONTRIBUTOR:

  • - I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • - License permits unrestricted use (educational + commercial)
  • - This PR adds a new tool or tool collection
  • - This PR updates an existing tool or tool collection
  • - This PR does something else (explain below)

@bernt-matthias
Contributor

Hey @pavanvidem .. how do you identify tools writing to the input dir?

@pavanvidem
Member Author

First, we extracted the names of the datasets in the input dir that do not end with .dat, using something like find . -not -name "*.dat" -type f | grep -v 'dataset_.*_files'. Then we used gxadmin query q to query the job table for each dataset name and extracted the tools from the results. Finally, we manually check and run each tool and fix it.
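The first step above can be wrapped in a small shell function. This is only a sketch: the function name `list_stray_files` and the directory argument are hypothetical, and it assumes that every regular file which neither ends in .dat nor lives under a dataset_*_files directory was written by a tool rather than by Galaxy.

```shell
#!/bin/sh
# list_stray_files DIR (hypothetical helper name)
# Print regular files under DIR that do not end in .dat and are not
# inside a dataset_*_files directory, one path per line, sorted.
list_stray_files() {
    # "$1" is assumed to be the path to Galaxy's dataset directory.
    find "$1" -type f ! -name '*.dat' \
        | grep -v 'dataset_.*_files' \
        | sort
}
```

Each path it prints is a candidate for the job-table lookup described in the comment; the lookup itself is not sketched here.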

@bernt-matthias
Contributor

@pavanvidem I started some discussion here: galaxyproject/planemo#1189

@bernt-matthias
Contributor

Still the manual approach will be needed (since I guess we won't have perfect test coverage)...

@@ -7,11 +7,13 @@
</macros>
<expand macro="requirements"/>
<command detect_errors="exit_code"><![CDATA[
## avoid bam2msa to create .bai in inputdir
ln -s '$input' input_bam &&
Contributor

Galaxy already stores the index files (only for BAM) and you can access them with $input.metadata.bam_index. So ln -s '$input.metadata.bam_index' input_bam.bai might save bam2msa from having to recreate it?

But I guess then we need a version bump.
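Why the symlink in the diff keeps the index out of the input dir can be sketched as follows. This assumes the tool writes its index next to the path it was given (that is, to that path plus .bai); `fake_index` is a hypothetical stand-in for that behaviour and not bam2msa itself, and `stage_bam` is likewise an illustrative name.

```shell
#!/bin/sh
# fake_index PATH (hypothetical stand-in for an indexing tool)
# Writes a placeholder index to PATH.bai, next to the path it was given.
fake_index() {
    printf 'index\n' > "$1.bai"
}

# stage_bam BAM WORKDIR (hypothetical helper)
# Symlink BAM into WORKDIR as input_bam and print the path of the link.
stage_bam() {
    ln -s "$1" "$2/input_bam" && printf '%s\n' "$2/input_bam"
}
```

When the tool is pointed at the link, the sibling .bai is created in the working directory, so nothing new appears beside the original dataset.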

@bernt-matthias
Contributor

First, we extracted the names of the datasets in the input dir that do not end with .dat, using something like find . -not -name "*.dat" -type f | grep -v 'dataset_.*_files'. Then we used gxadmin query q to query the job table for each dataset name and extracted the tools from the results. Finally, we manually check and run each tool and fix it.

Hey @pavanvidem just thought a bit more about this. The problem is that old versions of the tool will still write to Galaxy's file dir. How about configuring your Galaxy instance to run jobs as a separate user that does not have write permissions for Galaxy's file dir:

https://github.com/galaxyproject/galaxy/blob/40ddc72f485ae233f3a4aed63847ccb041003320/lib/galaxy/config/sample/galaxy.yml.sample#L2121

Still, the problem needs to be fixed: I'm currently running IUC's weekly CI with an extended version of planemo (galaxyproject/planemo#1190): https://github.com/bernt-matthias/tools-iuc/actions/runs/1277704614

@pavanvidem
Member Author

Hey @pavanvidem just thought a bit more about this. The problem is that old versions of the tool will still write to Galaxy's file dir. How about configuring your Galaxy instance to run jobs as a separate user that does not have write permissions for Galaxy's file dir:

This is a good idea. I should have mentioned that I queried the EU Galaxy database, not my local instance. This might have covered most of the old tool versions and the tools outside IUC.

@natefoo
Member

natefoo commented Oct 29, 2021

Your excellent work was foiled by the tool again @pavanvidem. It looks like this one generates the index regardless of whether it exists. If it's a symlink to a read-only file, then that fails.

I can generate a PR upstream, but for current versions of the tool I think we will have to remove the bai symlink and let the tool generate the bai itself, unfortunately.
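The failure mode described here comes from how writes through a symlink behave. The sketch below assumes the tool regenerates the index by opening the existing path for writing (rather than unlinking it and creating a new file); `write_via_link` and the file names are hypothetical.

```shell
#!/bin/sh
# write_via_link TARGET LINK (hypothetical helper)
# Create LINK pointing at TARGET, then write through LINK the way a tool
# regenerating an index in place would.
write_via_link() {
    # The redirection follows the symlink, so the write lands on TARGET.
    ln -s "$1" "$2" && printf 'regenerated\n' > "$2"
}
```

Because the write reaches the link's target, a tool that rewrites input_bam.bai would either overwrite Galaxy's stored index or, when that file is read-only for the job user, fail to open it. Leaving out the .bai symlink lets the tool create a fresh index in the working directory instead.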

@natefoo
Member

natefoo commented Oct 29, 2021

Upstream PR: veg/BioExt#45

@bernt-matthias
Contributor

@pavanvidem can you check if a bump of the tool version to 0.20.4 fixes this?
