CONVERTER_uncompressed_to_gz writes to input dir #13916

Closed
@natefoo

Describe the bug
CONVERTER_uncompressed_to_gz uses bgzip to create a compressed version of the input and writes it to the correct output path. However, because the command also passes the -i flag, bgzip generates an index and writes it next to the input. This fails when the input dir is not writable (such as when running in a container). Per the bgzip documentation:

-c, --stdout Write to standard output, keep original files unchanged.

-i, --index Create a BGZF index while compressing. Unless the -I option is used, this will have the name of the compressed file with .gzi appended to it.

-I, --index-name FILE Index file name.

We are using the -c option to write to stdout, so bgzip has no knowledge of what the compressed filename will be. It falls back to the name the file would have if it were compressed in place (without -c), i.e. the input path with .gz appended, and writes the index next to it. If we actually want the index file, we need to add the -I option to set its filename. But do we? If we do, it should probably be a MetadataFile.
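
As a rough sketch of the -I variant, the helper below only prints the command line that would be run, so it can be inspected without touching any files. The function name bgzip_cmd, the placeholder paths, and the choice to name the index after the output file with .gzi appended are assumptions for illustration, not what the converter currently does.

```shell
# Hypothetical helper: prints a bgzip command line that keeps the index
# next to the output file instead of next to the input.
bgzip_cmd() {
    # $1 = input path, $2 = output path, $3 = thread count (defaults to 1)
    # -c writes to stdout, -i requests a BGZF index, -I names the index file.
    printf "bgzip -@ %s -c -i -I '%s.gzi' '%s' > '%s'\n" "${3:-1}" "$2" "$1" "$2"
}

# Example with placeholder paths; GALAXY_SLOTS falls back to 1 when unset.
bgzip_cmd input.vcf output.vcf.gz "${GALAXY_SLOTS:-1}"
```

If the index turns out not to be needed, the simpler alternative would be to drop -i and leave the rest of the command unchanged. Note that the quoting here is naive and would break on paths that contain single quotes.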

Galaxy Version and/or server at which you observed the bug
22.01.1.dev0 (da5894f)

To Reproduce
Steps to reproduce the behavior:

  1. Go to usegalaxy.org
  2. Run MiModD VCF Filter on an uncompressed vcf
  3. See error

Expected behavior
Dataset is correctly converted to compressed vcf

Screenshots
Command line:

cp '/corral4/main/jobs/042/665/42665018/configs/tmpy0nm2mle' 'galaxy.json' && bgzip -@ ${GALAXY_SLOTS:-1} -ci '/corral4/main/files/043/259/dataset_43259150.dat' > '/corral4/main/jobs/042/665/42665018/outputs/galaxy_dataset_1465c575-7505-4eb5-9bda-fdbb371c80dd.dat'

Error:

[E::bgzf_index_dump] Error opening /corral4/main/files/043/259/dataset_43259150.dat.gz.gzi : Read-only file system
Could not write index to '/corral4/main/files/043/259/dataset_43259150.dat.gz.gzi'
[E::bgzf_index_dump] Error opening /corral4/main/files/043/259/dataset_43259150.dat.gz.gzi : Read-only file system
Could not write index to '/corral4/main/files/043/259/dataset_43259150.dat.gz.gzi'

Additional context
N/A
