selectors: add matchParentBinaries selector #4254
4f73f6d to
e31cabe
Compare
|
Probably I should add docs for this selector |
e31cabe to
70ac377
Compare
kkourt
left a comment
Thanks!
One thing that would help reviewing this PR would be to split it into multiple commits.
See https://tetragon.io/docs/contribution-guide/submitting-a-pull-request/.
Before the implementation, though, it would be interesting to specify the feature so that it is clear what the semantics are. This can be done in the PR itself (e.g., in a commit message), in an issue, or in a CFP (see https://github.com/cilium/design-cfps). Whatever works best for you!
For example, it would be very useful to have an example policy that uses the newly introduced operator and discuss how it works
|
Another thing to note is that the functionality is similar to https://tetragon.io/docs/concepts/tracing-policy/selectors/#follow-children. So I wonder if something like: Would achieve a similar result. The first thing to check would be if |
|
@kkourt So listing multiple operators in match binary selector will fail. |
|
One more thing to note is that this looks related to
Thanks! I don't remember why this limitation exists, but my guess is that this is something that can be fixed. So, then, the question in my mind is whether the use case you are describing is better served by matching on the immediate parent or on all ancestors (for which some support already exists). PS: Might be worth updating the docs to reflect the above.
I wonder if supporting 2 binary selectors would be easier than supporting N, but would potentially achieve the result without supporting parent selectors altogether? We could maybe have a limit to how many "generations" of children we follow. In this case, a limit of 1 generation would prevent reporting grandchildren. Haven't looked to see if this is easy or not however. |
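The generation limit floated above can be sketched in user space. This is only an illustration under stated assumptions: the `proc` type, the `matchAncestor` helper, and the pid table are hypothetical and do not reflect how Tetragon tracks processes in BPF.

```go
package main

import "fmt"

// proc is a hypothetical, simplified process record (not a Tetragon type).
type proc struct {
	binary string
	ppid   int
}

// matchAncestor reports whether any ancestor of pid, at most maxGen
// generations away, runs one of the listed binaries.
// maxGen = 1 checks only the immediate parent.
func matchAncestor(procs map[int]proc, pid int, binaries map[string]bool, maxGen int) bool {
	cur, ok := procs[pid]
	if !ok {
		return false
	}
	for gen := 0; gen < maxGen; gen++ {
		parent, ok := procs[cur.ppid]
		if !ok {
			return false // ran out of known ancestors
		}
		if binaries[parent.binary] {
			return true
		}
		cur = parent
	}
	return false
}

func main() {
	procs := map[int]proc{
		1: {binary: "/usr/bin/bash", ppid: 0},
		2: {binary: "/usr/bin/python3", ppid: 1}, // child of bash
		3: {binary: "/usr/bin/tail", ppid: 2},    // grandchild of bash
	}
	shells := map[string]bool{"/usr/bin/bash": true}

	fmt.Println(matchAncestor(procs, 2, shells, 1)) // true: direct child
	fmt.Println(matchAncestor(procs, 3, shells, 1)) // false: grandchild, limit is 1
	fmt.Println(matchAncestor(procs, 3, shells, 2)) // true: limit raised to 2
}
```

With a limit of 1, the grandchild is not reported, which is the behaviour the comment describes.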
Currently exactly 1 Moreover, maybe explicit parents filtering, working in the same way as current process binary filtering, is easier to understand and to read. |
With current existing support we cannot exclude the binary itself because of the selector number limitation. |
70ac377 to
d503dbd
Compare
If we want to match concrete child process, seems like this wouldn't achieve the same result. |
d503dbd to
12bf008
Compare
|
@kkourt |
Thanks! Can you please add some context in the commit messages (see: https://tetragon.io/docs/contribution-guide/submitting-a-pull-request/)? Before merging the PR, we would need tests and documentation updates. If you are looking for early feedback, can you add an example (maybe in the commit message) on how to use the new selector? |
|
I'm also seeing some CI failures: |
Is it an issue of this PR? |
eeab3e7 to
5d54ba8
Compare
mtardy
left a comment
Thanks, overall looks good to me, just a few nits
Probably I should add docs for this selector
Please update the documentation with this new selector (see https://tetragon.io/docs/concepts/tracing-policy/selectors/).
|
I see checkpatch is complaining and other CI checks, you can run |
LGTM (barring the red test results).
would it be possible to handle this as special case of current BinarySelector, having some parent bool like:
diff --git a/pkg/k8s/apis/cilium.io/v1alpha1/types.go b/pkg/k8s/apis/cilium.io/v1alpha1/types.go
index 6c3210b0acf5..b5133b1e5fe2 100644
--- a/pkg/k8s/apis/cilium.io/v1alpha1/types.go
+++ b/pkg/k8s/apis/cilium.io/v1alpha1/types.go
@@ -124,6 +124,10 @@ type BinarySelector struct {
// +kubebuilder:validation:Optional
// +kubebuilder:default=false
FollowChildren bool `json:"followChildren"`
+ // match parent binary
+ // +kubebuilder:validation:Optional
+ // +kubebuilder:default=false
+ Parent bool `json:"parent"`
}
then all the code dups with binary selectors would go away
we could perhaps think a bit about some maps max_entries tuning
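To make the suggestion concrete, here is a minimal user-space sketch of how a single selector type with a `Parent` flag could pick the binary it compares against. Only the `Parent` field comes from the diff above; the `proc` type, the `matches` helper, and the reduced operator set are assumptions made for illustration, not Tetragon code.

```go
package main

import "fmt"

// BinarySelector is a simplified stand-in for the CRD type in the diff above.
// Only the Parent field is taken from the suggestion.
type BinarySelector struct {
	Operator string // this sketch handles only "In" and "NotIn"
	Values   []string
	Parent   bool // when true, compare against the parent's binary
}

// proc is a hypothetical, minimal view of a process and its parent.
type proc struct {
	Binary       string
	ParentBinary string
}

// matches evaluates one selector against a process.
func matches(sel BinarySelector, p proc) bool {
	target := p.Binary
	if sel.Parent {
		target = p.ParentBinary
	}
	found := false
	for _, v := range sel.Values {
		if v == target {
			found = true
			break
		}
	}
	switch sel.Operator {
	case "In":
		return found
	case "NotIn":
		return !found
	default:
		return false // unsupported operators never match in this sketch
	}
}

func main() {
	p := proc{Binary: "/usr/bin/tail", ParentBinary: "/usr/bin/bash"}
	onParent := BinarySelector{Operator: "In", Values: []string{"/usr/bin/bash"}, Parent: true}
	onSelf := BinarySelector{Operator: "In", Values: []string{"/usr/bin/bash"}}

	fmt.Println(matches(onParent, p)) // true
	fmt.Println(matches(onSelf, p))   // false
}
```

The matching logic is shared; the flag only changes which binary is looked up, which is why the duplicated code paths could go away.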
5d54ba8 to
600672b
Compare
|
@olsajiri Hi!
Moreover, |
|
@olsajiri |
pkg/k8s/apis/cilium.io/client/crds/v1alpha1/cilium.io_tracingpoliciesnamespaced.yaml
Show resolved
Hide resolved
41d8652 to
afdfd36
Compare
This adds genericMatchBinariesSelector enumeration, renames match binaries maps and changes match binaries parsing to allow parsing other matchBinaries-like selectors in future with the same code. Signed-off-by: Kobrin Ilay <[email protected]>
This commit introduces a new matchParentBinaries selector,
which is used for filtering on parent process binaries. The new
selector works similarly to the matchBinaries selector, including
the followChildren option, which allows matching not only
direct parent binaries but transitive parent binaries
as well.
The matchParentBinaries selector is needed in cases when we
want granular filters on parent binaries for some
events. For example, we may want to intercept calls to a
specific system call from a specific binary only when
it was executed from an interactive shell. For that we can
add the following selector, which matches only processes whose
parent is bash, sh, or zsh:
```
- matchParentBinaries:
  - operator: "In"
    values:
    - "/usr/bin/bash"
    - "/usr/bin/sh"
    - "/usr/bin/zsh"
```
Signed-off-by: Kobrin Ilay <[email protected]>
This commit adds crds generated for matchParentBinaries selector Signed-off-by: Kobrin Ilay <[email protected]>
afdfd36 to
5d3179d
Compare
mtardy
left a comment
I didn't check the code of this, but for the docs change LGTM! Thanks for taking the feedback :)
olsajiri
left a comment
looks good, thanks
kkourt
left a comment
Thanks!
pkg/sensors/tracing/kprobe_test.go
Outdated
observertesthelper.LoopEvents(ctx, t, &doneWG, &readyWG, obs)
readyWG.Wait()

if err := exec.Command("/usr/bin/bash", "-c", "echo '/usr/bin/tail /etc/passwd' | /usr/bin/bash").Run(); err != nil {
why do we need both /usr/bin/bash -c and cmd | bash... Shouldn't one be enough?
For some reason, when I test locally with filters on both the parent and the current process binaries, I don't catch the event when I execute /usr/bin/bash -c '/usr/bin/tail /etc/passwd', but I do catch it when I execute /usr/bin/bash -c "echo '/usr/bin/tail /etc/passwd' | /usr/bin/bash".
But when I match only the current process binary and execute /usr/bin/bash -c '/usr/bin/tail /etc/passwd', I catch the event and see that the parent process binary is /usr/bin/bash.
So we need both /usr/bin/bash -c and | /usr/bin/bash for the test to be correct: the first one starts a shell, in which we can use a pipe to feed the echo output to a second shell.
Not sure if I understand. If I do bash -c "tail -1 /etc/hostname", I'm seeing the following event:
{
  "process_exec": {
    "process": {
      "binary": "/usr/bin/tail",
      "arguments": "-1 /etc/hostname",
      "flags": "execve",
    },
    "parent": {
      "binary": "/usr/bin/bash",
      "arguments": "-c \"tail -1 /etc/hostname\"",
    }
  },
}
So I would expect the tail process to be matched by the matchParentBinaries selector.
Yes, there is such an event, but it is not matched by the selector.
If you check the exec ids of the process and its parent, they will be the same (as in the screenshot I attached above). Running bash -c ... under strace showed that both execve("/usr/bin/bash") and execve("/usr/bin/tail") are called in the same process. Because both execve calls happen within the same process, when we look up the parent process we get the parent of bash, not of tail (so we actually get the grandparent).
Unfortunately, I don't see a way to fix it: when execve is invoked in the same process, we overwrite the execve map value and lose the info about the bash process. We could pass the parent path, which is known, to match_binaries, but we wouldn't have info like mbset, etc., so I suggest leaving it as it is.
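The effect described above can be reproduced with a toy model. The sketch below assumes a table keyed by pid whose value is overwritten on every exec; the `entry` and `execveMap` types are hypothetical simplifications, not the layout of Tetragon's execve map.

```go
package main

import "fmt"

// entry is a simplified stand-in for an execve map value.
type entry struct {
	binary string
	ppid   int
}

// execveMap is a toy pid-keyed table (an assumption for illustration).
type execveMap map[int]entry

// fork registers a child that starts out running the parent's binary.
func (m execveMap) fork(parent, child int) {
	m[child] = entry{binary: m[parent].binary, ppid: parent}
}

// exec overwrites the binary stored for pid and keeps its parent pid.
func (m execveMap) exec(pid int, binary string) {
	e := m[pid]
	e.binary = binary
	m[pid] = e
}

// parentBinary looks up the binary of the real parent pid.
func (m execveMap) parentBinary(pid int) string {
	return m[m[pid].ppid].binary
}

func main() {
	m := execveMap{100: {binary: "/usr/bin/zsh", ppid: 1}}

	// Case 1: zsh forks pid 200, which execs bash and then tail with no fork in between.
	m.fork(100, 200)
	m.exec(200, "/usr/bin/bash")
	m.exec(200, "/usr/bin/tail") // the bash entry is overwritten here
	fmt.Println(m.parentBinary(200)) // /usr/bin/zsh

	// Case 2: bash forks before tail is executed.
	m.fork(100, 300)
	m.exec(300, "/usr/bin/bash")
	m.fork(300, 301)
	m.exec(301, "/usr/bin/tail")
	fmt.Println(m.parentBinary(301)) // /usr/bin/bash
}
```

In the first case the lookup lands on zsh, the process that launched bash, so a parent filter on /usr/bin/bash does not match. In the second case the fork preserves the bash entry and the filter matches.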
Checked, event looks like this:
{
  "process_tracepoint": {
    "process": {
      "exec_id": "a2luZC1jb250cm9sLXBsYW5lOjcyMTQ2MzcwMTIzMTc4ODoyMTQ0MzM0",
      "pid": 2144334,
      "uid": 1001,
      "cwd": "/home/kobrineli",
      "binary": "/usr/bin/tail",
      "arguments": "/etc/passwd",
      "flags": "execve",
      "start_time": "2025-11-18T02:00:56.515413635Z",
      "auid": 1001,
      "parent_exec_id": "a2luZC1jb250cm9sLXBsYW5lOjcyMTQ2MzY4NzA2Njc2NzoyMTQ0MzM0",
      "refcnt": 1,
      "cap": {},
      "tid": 2144334
    },
    "parent": {
      "exec_id": "a2luZC1jb250cm9sLXBsYW5lOjcyMTQ2MzY4NzA2Njc2NzoyMTQ0MzM0",
      "pid": 2144334,
      "uid": 1001,
      "cwd": "/home/kobrineli",
      "binary": "/usr/bin/bash",
      "arguments": "-c \"/usr/bin/tail /etc/passwd\"",
      "flags": "execve clone",
      "start_time": "2025-11-18T02:00:56.501249715Z",
      "parent_exec_id": "a2luZC1jb250cm9sLXBsYW5lOjcyMTQ1OTE4NzgwNjY3MToyMTQ0MjU4",
      "cap": {},
      "tid": 2144334
    },
    "subsys": "syscalls",
    "event": "sys_exit_execve",
    "action": "KPROBE_ACTION_POST"
  }
}
The process and parent pids and tids are the same.
Hm... I don't see how we could merge this if we cannot catch the above (simple) use case. One solution would be to keep the parent binary in the execve map, but that's (a lot of) additional memory overhead.
I think that cases where execve is called in the parent process, instead of creating a new process with fork, are more of an exception than the normal situation.
Normally processes are created by a combination of fork + execve, and we will be able to catch such cases.
I agree that saving parent process info in the execve map would result in a big memory overhead.
Maybe it makes sense to add a note about such cases to the docs and merge in the current state?
I think that cases where execve is called in the parent process, instead of creating a new process with fork, are more of an exception than the normal situation.
Even if it is just an exception (which I'm not sure is the case), we still need to do proper filtering for it. And given how matchBinaries works and what users would expect from matchParentBinaries, I don't see how documentation could fix this.
is that perhaps just an issue with the test? when I run the test by hand it seems to work properly
is that perhaps just an issue with the test? when I run the test by hand it seems to work properly
The test works correctly.
The problem is that the matchParentBinaries selector cannot catch cases in which a process and its parent have the same pid because execve is called twice without a fork, as in bash -c <cmd>, so the test with bash -c without | bash wouldn't work.
5d3179d to
c85dc58
Compare
This commit adds logic for parsing the matchParentBinaries selector. The new selector is stored in the same maps as the matchBinaries selector, but with a key offset equal to MaxSelectors. Signed-off-by: Kobrin Ilay <[email protected]>
This commit adds bpf code for parent binaries filtering for new matchParentBinaries selector. Match binaries bpf maps sizes are increased to MAX_SELECTORS * 2 to store both selectors options. Parent filtering is processed with the same code, but key for matchParentBinaries has offset equal to MAX_SELECTORS. Signed-off-by: Kobrin Ilay <[email protected]>
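The key-offset scheme from these two commit messages can be sketched as follows. The value of `maxSelectors` and the `mapKey` helper are assumptions for illustration; the real constant and the map access code are not quoted in this thread.

```go
package main

import "fmt"

// maxSelectors is an illustrative value, not the real MAX_SELECTORS.
const maxSelectors = 5

type selectorKind int

const (
	matchBinaries selectorKind = iota
	matchParentBinaries
)

// mapKey returns the slot for selector index idx in a map sized
// maxSelectors * 2: matchBinaries uses [0, maxSelectors) and
// matchParentBinaries uses [maxSelectors, 2*maxSelectors).
func mapKey(kind selectorKind, idx int) (int, error) {
	if idx < 0 || idx >= maxSelectors {
		return 0, fmt.Errorf("selector index %d out of range [0,%d)", idx, maxSelectors)
	}
	if kind == matchParentBinaries {
		return idx + maxSelectors, nil
	}
	return idx, nil
}

func main() {
	for _, kind := range []selectorKind{matchBinaries, matchParentBinaries} {
		key, err := mapKey(kind, 2)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("kind=%d idx=2 -> key=%d\n", kind, key)
	}
}
```

Because the two ranges never overlap, both selectors can share one map and one lookup path, at the cost of doubling the map size.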
This commit adds test for new matchParentBinaries selector. In the test we verify that all operations (In, NotIn, Prefix, NotPrefix, Postfix, NotPostfix) work correctly. Signed-off-by: Kobrin Ilay <[email protected]>
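As a rough guide to what those six operators are expected to do, here is a user-space sketch. The operator names come from the commit message; the `matchOperator` helper and the exact semantics (a positive operator matches if any value matches, a `Not` operator matches if none does) are assumptions, not the BPF implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// matchOperator evaluates op for binary against values.
// Assumed semantics: In/Prefix/Postfix match if any value matches,
// and the Not* variants are their negation.
func matchOperator(op, binary string, values []string) (bool, error) {
	negate := strings.HasPrefix(op, "Not")

	var pred func(v string) bool
	switch strings.TrimPrefix(op, "Not") {
	case "In":
		pred = func(v string) bool { return binary == v }
	case "Prefix":
		pred = func(v string) bool { return strings.HasPrefix(binary, v) }
	case "Postfix":
		pred = func(v string) bool { return strings.HasSuffix(binary, v) }
	default:
		return false, fmt.Errorf("unknown operator %q", op)
	}

	hit := false
	for _, v := range values {
		if pred(v) {
			hit = true
			break
		}
	}
	return hit != negate, nil
}

func main() {
	parent := "/usr/bin/bash"
	checks := []struct {
		op     string
		values []string
	}{
		{"In", []string{"/usr/bin/bash", "/usr/bin/zsh"}},
		{"NotIn", []string{"/usr/bin/bash"}},
		{"Prefix", []string{"/usr/bin/"}},
		{"NotPrefix", []string{"/opt/"}},
		{"Postfix", []string{"sh"}},
		{"NotPostfix", []string{"sh"}},
	}
	for _, c := range checks {
		ok, err := matchOperator(c.op, parent, c.values)
		fmt.Println(c.op, ok, err)
	}
}
```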
Signed-off-by: Kobrin Ilay <[email protected]>
c85dc58 to
b810fa7
Compare
|
I don't know why golangci-lint failed on windows only with issues unrelated to the PR. |
|
CI failed on the same job here: https://github.com/cilium/tetragon/actions/runs/19268199834/job/55091515756?pr=4316 |
kkourt
left a comment
Marking this as "Request changes" to avoid accidentally being merged.
See #4254 (comment)
|
@olsajiri @kkourt The bug is that when we match for parent binaries, we search for parent by current process real parent pid. But if There are 2 possible solutions:
And we will work with them in And then, when we match parent binaries, we check for the entry in Separate map could give us not so big memory overhead as parent binary field in execve map value, because we could set its size to be N times less than execve map (because case when parent pid and current process pid are same is not common) What do you think about it? |
we were discussing this with @kkourt and IIUC we lean towards documenting that both binaries selectors match only existing processes, so we disregard cases where a process does multiple execs. That said, if you have the change for option 2), it'd be great to check it and see if that might be a way out.
Signed-off-by: Kobrin Ilay <[email protected]>
Adds tests for cases when bash runs binary in same process. Also adds check not only for true positive, but also for true negative. Signed-off-by: Kobrin Ilay <[email protected]>
Signed-off-by: Kobrin Ilay <[email protected]>
Signed-off-by: Kobrin Ilay <[email protected]>
Signed-off-by: Kobrin Ilay <[email protected]>
|
@olsajiri @kkourt What changed:
Also I've added more tests, which include running I've checked the changes locally; it now works fine in all cases, including using All tests passed; I'll fix checkpatch at the end if everything else is fine. If you don't like the idea, I'm okay with reverting these changes and documenting that the current selectors work only with real processes and don't work for processes created with multiple execs.
Description
This change adds the matchParentBinaries selector, which enables granular filtering on parent binaries, as needed in some specific cases.
For instance, suppose a Python script invokes some system calls that we want to intercept and report. If the script is executed by a system process, we want to filter it out; otherwise, we report it. This filtering needs a selector on the parent binary, because we cannot filter events by the current binary alone, which for a Python script is always python.
For a more realistic example, suppose we want to hook all calls to the chmod system call to prevent new binaries from being created on the system manually. When the apt-key binary installs packages (cases we don't want to report), it doesn't call chmod directly but runs the /usr/bin/chmod binary, which makes the chmod system call itself. The matchParentBinaries selector helps create an accurate exclusion for this case.
Example of a policy with the matchParentBinaries selector:
Suppose we want events for all files made executable with the chmod syscall, but not for apt-key making files executable. Unfortunately, apt-key doesn't make files executable with the syscall directly but uses the /usr/bin/chmod binary, which calls the chmod function, so to filter such events we need selectors on both the parent and the current binary. The resulting policy will look like this:
If the current binary is not /usr/bin/chmod, we don't care about the parent binary, but if the current binary is /usr/bin/chmod, we don't want the parent binary to be /usr/bin/apt-key.
Changelog