@kh4rdur kh4rdur commented Jul 11, 2025

changes to arguments and implementation before the indexing of the files.

Your checklist for this pull request

  • I've read the contributing guideline.
  • [x] I've tested my changes by building and running mquery, and testing changed functionality (if applicable)
  • I've added automated tests for my change (if applicable, optional)
  • I've updated documentation to reflect my change (if applicable) -- no documentation available

Describe the problem

While trying to index samples from an S3 bucket, I noticed that samples pulled into my working directory were not indexed; indexing failed with an error that the file could not be found.

In my setup, I've mounted the /opt/samples directory on my Linux box to the /mnt/samples directory in mquery.
The workdir option specifies where files are temporarily downloaded during indexing.
Using /tmp for that results in ursadb never being able to find the samples.

So, to make sure the downloaded files are visible to ursadb, we set the workdir to /opt/samples/workdir.

Conclusion: this still won't work. When we try to index the files, ursadb looks up /opt/samples/workdir, and this folder does not exist inside the container, because the directory lives under /opt/ only on the parent (host) device.
The samples can, however, be found in /mnt/samples inside the ursadb/mquery Docker container.

Solution

I've changed the script on my own system to accept an extra argument, --docker-mountpoint (default: none). This string value can be set to override the workdir option during indexing.
This means that you still pass the working directory as always, but you can also describe the path where ursadb can find the files.

So if the working directory is set to /opt/samples/workdir, and ursadb/mquery mounts /opt/samples at /mnt/samples, ursadb will be able to find the files in /mnt/samples/workdir.

Setting --docker-mountpoint to /mnt/samples/workdir ensures that, while the files are still written to disk in the working directory, the import uses the path that matches the file location as seen by ursadb.
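
To illustrate the idea, here is a minimal sketch of the path remapping, not the actual code from this PR. The function names `build_parser` and `index_path` are hypothetical, and the spelling of the `--workdir` flag is an assumption; only `--docker-mountpoint` and the example paths come from the description above.

```python
import argparse
from pathlib import PurePosixPath
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only --docker-mountpoint is named in this PR;
    # the exact --workdir flag spelling is assumed for illustration.
    parser = argparse.ArgumentParser(description="Path remapping sketch")
    parser.add_argument(
        "--workdir",
        required=True,
        help="Directory on the host where samples are temporarily downloaded",
    )
    parser.add_argument(
        "--docker-mountpoint",
        default=None,
        help="Path under which ursadb sees the workdir (optional)",
    )
    return parser


def index_path(
    local_path: str, workdir: str, docker_mountpoint: Optional[str]
) -> str:
    """Return the path that should be handed to ursadb for indexing.

    Without a mountpoint the local path is used unchanged. With one, the
    workdir prefix is swapped for the mountpoint and the rest is kept.
    """
    if docker_mountpoint is None:
        return local_path
    # relative_to() raises ValueError if local_path is not inside workdir.
    relative = PurePosixPath(local_path).relative_to(workdir)
    return str(PurePosixPath(docker_mountpoint) / relative)


if __name__ == "__main__":
    args = build_parser().parse_args(
        [
            "--workdir", "/opt/samples/workdir",
            "--docker-mountpoint", "/mnt/samples/workdir",
        ]
    )
    # The file is written to the host path but indexed under the container path.
    print(
        index_path(
            "/opt/samples/workdir/sample.bin",
            args.workdir,
            args.docker_mountpoint,
        )
    )
    # prints /mnt/samples/workdir/sample.bin
```

Keeping the download path and the indexing path separate means cleanup can keep operating on the host-side workdir, while only the path sent to ursadb is rewritten.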

Closing issues
fixes #449

kh4rdur added 3 commits July 11, 2025 13:58
changes to arguments and implementation before the indexing of the files.
Added a little something to fix removing the files.
Little change to the workdir_batch being emptied after cleanup.
@msm-cert
Member

Hi!

Sorry that this review is taking so long. I'm in the process of improving/reworking S3 indexing (plugins will keep working in the same way, the script should still work, but there will be some support in core). I want to finish this part first. I'll merge this as soon as I'm done.

@kh4rdur
Author

kh4rdur commented Sep 21, 2025

No worries there!

Development

Successfully merging this pull request may close these issues.

No mountpoint specification for S3Index utility