cue/interpreter/embed: globs in hidden directories are ignored #3889

Open
jpluscplusm opened this issue Apr 11, 2025 · 4 comments

Comments

@jpluscplusm
Collaborator

What version of CUE are you using (cue version)?

$ cue version
cue version v0.13.0-alpha.3

go version go1.24.0
      -buildmode exe
       -compiler gc
  DefaultGODEBUG gotestjsonbuildtext=1,multipathtcp=0,randseednop=0,rsa1024min=0,tlsmlkem=0,x509rsacrt=0,x509usepolicies=0
     CGO_ENABLED 1
          GOARCH amd64
            GOOS linux
         GOAMD64 v1
cue.lang.version v0.13.0

Does this issue reproduce with the latest stable release?

Yes, v0.12.1 fails identically.

What did you do?

exec cue mod init
exec cue export --out yaml
cmp stdout out
-- file.cue --
@extern(embed)
package p

x: _ @embed(glob=dir1/*.yml)
x: _ @embed(glob=.dir2/*.yml)

o: _ @embed(file=.dir2/data.yml)
-- dir1/data.yml --
A: 1
-- .dir2/data.yml --
B: 2
-- out --
x:
  dir1/data.yml:
    A: 1
  .dir2/data.yml:
    B: 2
o:
  B: 2

What did you expect to see?

A passing test.

What did you see instead?

> exec cue mod init
> exec cue export --out yaml
[stdout]
x:
  dir1/data.yml:
    A: 1
o:
  B: 2
> cmp stdout out
diff stdout out
--- stdout
+++ out
@@ -1,5 +1,7 @@
 x:
   dir1/data.yml:
     A: 1
+  .dir2/data.yml:
+    B: 2
 o:
   B: 2

FAIL: file.embedding.hidden.directory.txtar:3: stdout and out differ
@jpluscplusm added the embed, NeedsInvestigation, and Triage labels Apr 11, 2025
cueckoo pushed a commit to cue-lang/cuelang.org-trybot that referenced this issue Apr 11, 2025
DO NOT REVIEW
DO NOT SUBMIT

WIP because of cue-lang/cue#3889.

Preview-Path: /docs/draft/cldd/validating-several-github-actions-files/
Preview-Path: /docs/draft/cldd/checking-existing-github-actions-files/
Signed-off-by: Jonathan Matthews <[email protected]>
Change-Id: I90511bcb8cf4d1f3f8253f07a8a2a39b7fff752f
Dispatch-Trailer: {"type":"trybot","CL":1213359,"patchset":2,"ref":"refs/changes/59/1213359/2","targetBranch":"master"}
cueckoo pushed a commit to cue-lang/cuelang.org-trybot that referenced this issue Apr 11, 2025
DO NOT REVIEW
DO NOT SUBMIT

WIP because of cue-lang/cue#3889.

Preview-Path: /docs/draft/cldd/validating-several-github-actions-files/
Preview-Path: /docs/draft/cldd/checking-existing-github-actions-files/
Signed-off-by: Jonathan Matthews <[email protected]>
Change-Id: I90511bcb8cf4d1f3f8253f07a8a2a39b7fff752f
Dispatch-Trailer: {"type":"trybot","CL":1213359,"patchset":3,"ref":"refs/changes/59/1213359/3","targetBranch":"master"}
@jpluscplusm
Collaborator Author

jpluscplusm commented Apr 14, 2025

I note that this behaviour is intended, as per https://cuelang.org/cl/1196775.

Without relying on the option that CL mentions for varying embed's behaviour so that hidden directories can be permitted, perhaps a reasonable compromise that doesn't weaken security would be for a glob to include hidden directories (or files) only when none of its wildcard elements match a hidden path component.

In other words, to permit hidden directories (and files?) so long as each dot-prefixed path-/file-name element is explicitly mentioned in the @embed(glob=...) CUE, and isn't part of the glob expansion.
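
One possible (strict) reading of this suggestion, sketched in Go. The helper names `hasWildcard` and `explicitDotMatch` are hypothetical and are not part of CUE or its embed interpreter. The sketch assumes slash-separated patterns and paths, matches them one element at a time with `path.Match`, and rejects any dot-prefixed path element that was reached through a wildcard rather than named literally.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// hasWildcard reports whether a single pattern element contains one of the
// metacharacters *, ? or [ understood by path.Match.
func hasWildcard(elem string) bool {
	return strings.ContainsAny(elem, "*?[")
}

// explicitDotMatch is a hypothetical helper, not a CUE API.
// It reports whether name matches pattern, with one extra rule: a
// dot-prefixed element of name is accepted only when the corresponding
// pattern element is a literal, wildcard-free string.
func explicitDotMatch(pattern, name string) (bool, error) {
	pelems := strings.Split(pattern, "/")
	nelems := strings.Split(name, "/")
	if len(pelems) != len(nelems) {
		return false, nil
	}
	for i, pe := range pelems {
		ok, err := path.Match(pe, nelems[i])
		if err != nil {
			return false, err
		}
		if !ok {
			return false, nil
		}
		// The element matched, but a hidden element must not come from
		// wildcard expansion.
		if strings.HasPrefix(nelems[i], ".") && hasWildcard(pe) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	for _, pattern := range []string{"dir1/*.yml", ".dir2/*.yml", "*/*.yml"} {
		for _, name := range []string{"dir1/data.yml", ".dir2/data.yml"} {
			ok, err := explicitDotMatch(pattern, name)
			fmt.Printf("%-12s %-15s %v %v\n", pattern, name, ok, err)
		}
	}
}
```

Under this reading, `.dir2/*.yml` picks up `.dir2/data.yml` because `.dir2` is spelled out, while `*/*.yml` does not, since the hidden directory would only be reached through expansion of `*`.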

@myitcv
Member

myitcv commented Apr 14, 2025

I note that this behaviour is intended, as per https://cuelang.org/cl/1196775.

It is indeed behaving as documented there, but I believe it is overly restrictive.

I propose relaxing the logic as follows:

  • We still only ever embed files from within the module boundary
  • * (or in the future **) at the start of a path element only expands to non-hidden files/directories

Thus the following will result in embedded files (assuming the corresponding files exist on disk) where they currently do not:

# current
.foo
.foo/bar
.foo/*.yaml    # all non-hidden .yaml files in .foo

# future - with **
**/.*          # all dot-prefixed files in all non-hidden subdirectories
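
A minimal Go sketch of the second bullet, assuming patterns and paths are slash-separated and compared element by element. `dotAwareMatch` is a hypothetical name and is not the function used in cue/interpreter/embed. `**` is not handled, because `path.Match` has no such operator.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// dotAwareMatch is a hypothetical helper, not a CUE API.
// It reports whether name matches pattern, where a dot-prefixed element of
// name is eligible only if the corresponding pattern element itself begins
// with a literal dot.
func dotAwareMatch(pattern, name string) (bool, error) {
	pelems := strings.Split(pattern, "/")
	nelems := strings.Split(name, "/")
	if len(pelems) != len(nelems) {
		return false, nil
	}
	for i, pe := range pelems {
		ne := nelems[i]
		// A leading wildcard never expands to a dot-prefixed element.
		if strings.HasPrefix(ne, ".") && !strings.HasPrefix(pe, ".") {
			return false, nil
		}
		ok, err := path.Match(pe, ne)
		if err != nil {
			return false, err
		}
		if !ok {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	cases := []struct{ pattern, name string }{
		{".foo/*.yaml", ".foo/a.yaml"},  // explicit .foo, non-hidden file
		{".foo/*.yaml", ".foo/.b.yaml"}, // hidden file behind a leading *
		{"*/a.yaml", ".foo/a.yaml"},     // hidden directory behind *
		{".foo/.*", ".foo/.b.yaml"},     // explicit opt-in via .*
	}
	for _, c := range cases {
		ok, err := dotAwareMatch(c.pattern, c.name)
		fmt.Printf("%-12s %-14s %v %v\n", c.pattern, c.name, ok, err)
	}
}
```

The difference from a plain `path.Match` call is only the dot check at the top of the loop; everything else is the standard matcher.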

@myitcv removed the Triage label Apr 14, 2025
@myitcv
Member

myitcv commented Apr 14, 2025

@mvdan points out the following wording from the bash manpage as an alternative to my wording in the second bullet above:

When matching filenames, the dotglob shell option determines the set of filenames that are tested: when dotglob is enabled, the set of filenames includes all files beginning with ".", but "." and ".." must be matched by a pattern or sub-pattern that begins with a dot; when it is disabled, the set does not include any filenames beginning with "." unless the pattern or sub-pattern begins with a ".". As above, "." only has a special meaning when matching filenames.

cueckoo pushed a commit to cue-lang/cuelang.org-trybot that referenced this issue Apr 14, 2025
This adds a draft guide demonstrating the validation of multiple
configuration files against the same GitHub Actions schema using file
embedding; and links to it from the "preceding" guide in the narrative
flow. This new guide will be used as a template for other curated
modules whose technologies support the existence of multiple config
files.

For technologies whose config files' canonical filesystem locations
includes a "." prefix (such as GitHub Actions) cue-lang/cue#3889 means
that file embedding has to use the "file" parameter rather than "glob".
This leads to some less optimal CUE, which is acceptable whilst a
relaxation of the rules causing that issue is considered.

DO NOT SUBMIT
until this change contains more than just GitHub Actions guides.

Preview-Path: /docs/draft/cldd/validating-several-github-actions-files/
Preview-Path: /docs/draft/cldd/checking-existing-github-actions-files/
Signed-off-by: Jonathan Matthews <[email protected]>
Change-Id: I90511bcb8cf4d1f3f8253f07a8a2a39b7fff752f
Dispatch-Trailer: {"type":"trybot","CL":1213359,"patchset":4,"ref":"refs/changes/59/1213359/4","targetBranch":"master"}
@myitcv
Member

myitcv commented Apr 16, 2025

After some good offline exchanges with @rogpeppe and @mvdan here is a hopefully complete summary:

  • The language in https://cuelang.org/cl/1196775 around 'hidden files' is potentially misleading, because strictly speaking hidden files are a platform-specific concept. The definitions of "hidden" on Unix and Windows differ, and we cannot have such a distinction when it comes to embedding, not least because we might not be embedding files from an actual file system.
  • Instead, the term 'hidden' in that CL is more likely a reference to the much looser, principally Unix convention that "dotfiles" (those with a . prefix) are those that generally relate to configuration and are generally more private than public. The critical point is that there are no hard and fast rules: not all .-prefixed files and directories are config related, nor are they all exclusively non-public. So this is more a "general sense" than a specific rule.
  • Building on that "general sense", in the configuration space in which CUE operates, the intent behind the change in https://cuelang.org/cl/1196775 is thus interpreted as something along the lines of "it is therefore more likely to be the case that * not matching .-prefixed files/directories is more useful than not, i.e. it will match the user intent more often than not"
  • Furthermore, there is the "break glass" that .* allows for their explicit inclusion. Granted such a "fix" does not extend to working with ** if/when we implement that.
  • On that basis, this issue does indeed correctly point out a bug with the current implementation. .hidden/* under this new definition should include files that don't begin with . in the .hidden directory.

Looking at the benefits of moving forward on this basis:

  • it aligns with the bash concept of *, and so is not out on a limb in that respect
  • it aligns with the concept of config files, which are generally speaking "more private", being . prefixed

Looking at the costs of moving forward on this basis:

  • it differs from Go's file embedding, and filepath.Glob's behaviour
  • it imposes something of a Unix bias on the situation, granted that WSL and the . prefix are fairly commonplace
  • there is no clear way of "breaking glass" with ** if we support that
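
To illustrate the first cost: Go's standard matchers give a leading dot no special meaning. The sketch below uses `fs.Glob` over an in-memory `fstest.MapFS` as a stand-in for `filepath.Glob` on disk. That substitution is an assumption made so the example needs no real files; both accept the same pattern syntax. `globExample` is a hypothetical helper, and the file layout mirrors the reproduction at the top of this issue.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// globExample is a hypothetical helper. It runs fs.Glob against an
// in-memory copy of the directory layout from this issue's reproduction.
func globExample(pattern string) ([]string, error) {
	fsys := fstest.MapFS{
		"dir1/data.yml":  &fstest.MapFile{Data: []byte("A: 1\n")},
		".dir2/data.yml": &fstest.MapFile{Data: []byte("B: 2\n")},
	}
	return fs.Glob(fsys, pattern)
}

func main() {
	// path.Match treats a leading dot like any other character.
	ok, err := path.Match("*", ".dir2")
	fmt.Println(ok, err) // true <nil>

	// As a result, a wildcard directory element also reaches .dir2.
	matches, err := globExample("*/*.yml")
	fmt.Println(matches, err)
}
```

Here the list printed for `*/*.yml` includes `.dir2/data.yml` alongside `dir1/data.yml`, which is the behaviour the proposed embed rules would deliberately depart from.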

Assuming there is no disagreement with the above, we should move forward with fixing the bug presented in this issue, and carefully documenting that the rules regarding * expansion in file embedding should not be conflated with the file system concept of hidden files.
