Add call frame information (CFI) annotations #214

jargh · 2025-03-26T18:31:45Z

Issue #, if available: #197

Description of changes:

This includes call frame information annotations in all the code, roughly following the explanation here: https://www.imperialviolet.org/2017/01/18/cfi.html

The change is made for both ARM and x86, largely by wrapping certain relevant instructions in macros that augment the core instruction with appropriate CFI directives, e.g. for ARM

#define CFI_PUSH2(lo,hi) stp lo, hi, [sp, #-16]! ; .cfi_adjust_cfa_offset 16 ; .cfi_rel_offset lo, 0 ; .cfi_rel_offset hi, 8

and for x86

#define CFI_PUSH(reg) push reg ; .cfi_adjust_cfa_offset 8 ; .cfi_rel_offset reg, 0

As part of this, two other systematic changes are made:

All local labels are prefixed with L to make their local-ness explicit
The internal header is split into separate ones for arm, x86 and x86_att

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

This is the start of a transformation that will annotate the code properly with call frame information (CFI) directives, a lack in the current s2n-bignum codebase noted in awslabs#197 So far, this picks the principal CFI-relevant instructions such as calls and returns and those that modify the stack pointer, and simply wraps the existing instructions inside macros that can then later be augmented with CFI directives. The definitions of the macros in "include/_internal_s2n_bignum.h" show those instructions that are wrapped, for ARM and/or x86: both CFI_START <marker for entry point> both CFI_RET ret ARM CFI_BL(target) bl target ARM CFI_PUSH2(lo,hi) stp lo, hi, [sp, #-16]! ARM CFI_POP2(lo,hi) ldp lo, hi, [sp], awslabs#16 ARM CFI_INC_SP(offset) add sp, sp, #(offset) ARM CFI_DEC_SP(offset) sub sp, sp, #(offset) x86 CFI_CALL(target) call target x86 CFI_PUSH(reg) push reg x86 CFI_POP(reg) pop reg x86 CFI_INC_RSP(offset) add rsp, offset x86 CFI_DEC_RSP(offset) sub rsp, offset

The original script was a bit too stupid and changed these too.

For a relatively small number of functions, the Windows ABI form literally calls the standard ABI form as a subroutine. For these we add CFI_START to the start of the standard subroutine as well, within the scope of the Windows ABI "#if ... #endif" block.

This adds directives for where the callee-saved registers are directly stored onto the stack at an offset without modifying the stack pointer. Not very many wrappers are written in this style, but a few are.

Also fix up functions with multiple exits so we only wrap the final ret as a CFI_RET.

This mostly seems to work as expected, but the combination of saving and restoring registers in the Windows ABI form and having multiple exit points seemed to make x86/generic/bignum_add.S need a manual reset of the CFA offset after the first return.

This fixes an issue assembling the CFI-ful files with recent Mac OS assemblers, which objected to labels inside the CFI blocks unless explicitly stated to be local. The error messages are not very clear, but this seems to be the issue that is here attributed to an LLVM17 change: https://gitlab.winehq.org/wine/wine/-/merge_requests/4547

This involves trifurcating the "_internal_s2n_bignum.h" header into three different versions with _arm, _x86 and _x86_att suffixes. This is useful in the _x86 and _x86_att cases to distinguish the definitions of the CFI adjustment macros, and for _arm to eliminate irrelevant CET tests.

This got missed earlier because of the space before it and then not caught because it was only relevant for the Windows ABI.

These files were just forgotten earlier. Although this code is not part of the final build, it is used as a stepping-stone in the proofs.

This is done via a macro S2N_BN_FUNCTION_TYPE_DIRECTIVE so that it can be disabled on Apple, since the native Mach-O format does not seem to use any analogous directive. Otherwise this unpacks to the usual directive of the form (example for "bignum_mul") .type bignum_mul, %function

This seems to matter for some versions of objdump, which can affect the ctCheck test.

briansmith · 2025-03-27T19:15:35Z

Have you tried using objtool to verify the annotations? See https://github.com/torvalds/linux/blob/3a90a72aca0a98125f0c7350ffb7cc63665f8047/tools/objtool/Documentation/objtool.txt for documentation on it. My understanding from reading the LKML is that it was expressly designed to support non-Kernel use cases too.

jargh · 2025-03-29T07:18:10Z

Thanks, objtool is very interesting, and I didn't know about it at all. I've built it and started to play around with it on this CFI-enabled s2n-bignum code. If I try running objtool -s x86/generic/bignum_mul.o I get:

x86/generic/bignum_mul.o: warning: objtool: bignum_mul() is missing an ELF size annotation

That happens on many (all?) functions. A further 28 have errors like this (presumably they are those with embedded function calls).

x86/curve25519/bignum_sqrt_p25519.o: warning: objtool: .text+0x89: unannotated intra-function call

I'm still not quite sure about whether I'm using the appropriate options nor how to interpret the output, but one encouraging difference is that the s2n-bignum mainline functions (not this CFI-enabled branch) generate zillions of errors with pretty much all instructions considered unreachable. So the CFI annotations have made a difference of some sort :-)

torben-hansen · 2025-03-31T13:52:01Z

Btw, this change is currently not workable in AWS-LC's FIPS build mode. The delocator is choking.

As a quick sniff test. I took the following files, copied them into AWS-LC:

x86_att/curve25519/edwards25519_scalarmulbase.S, and
include/_internal_s2n_bignum_x86_att.h.

I then ran a typical FIPS build:

$ cmake3 -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$(pwd)/../awslc-install -DBUILD_SHARED_LIBS=OFF -DFIPS=1 ..
$ ninja-build
[...]
[2/242] Generating bcm-delocated.S                                                                                           
FAILED: crypto/fipsmodule/bcm-delocated.S /home/ec2-user/humble/aws-lc/awslc-build/crypto/fipsmodule/bcm-delocated.S         
[...]                                                              
error while parsing "curve25519/edwards25519_scalarmulbase.S.S": 
parse error near WS (line 47 symbol 24 - line 47 symbol 25):
"  .cfi_adjust_cfa_offset ((15*32 +8))\n\n\n\n    "

I don't think it will be terribly hard to support, but it needs to be added. The PEG parser already contains some directive argument support:

Directive <- '.' DirectiveName (WS Args)?
DirectiveName <- [[A-Z0-9_]]+
[...[
Args <- Arg ((WS? ',' WS?) Arg)*
Arg <- QuotedArg / [[0-9a-z%+\-*_@.]]*

aqjune-aws · 2025-03-31T15:13:32Z

I wonder whether 'canonicalizing' an .S file by (1) compiling it into .o using an assembler and (2) disassembling it through objdump and (3) using the disassembled file is supposed to pass the delocator.
I suppose delocator is failing when one is using a kind of exotic syntax; if delocator is supposed to accept syntax printed from objdump, we can reduce the complexity of this problem.

briansmith · 2025-03-31T15:20:34Z

Btw, this change is currently not workable in AWS-LC's FIPS build mode.

See also https://issues.chromium.org/issues/42290158.

briansmith · 2025-03-31T15:34:18Z

x86/generic/bignum_mul.o: warning: objtool: bignum_mul() is missing an ELF size annotation

Look at this fragment of BoringSSL assembly code:

.globl	Whatever
.type	Whatever,\@abi-omnipotent
.align	16
Whatever:
.cfi_startproc
	_CET_ENDBR
	ret
.cfi_endproc
.size	Whatever, .-Whatever

It looks like your assembly is missing the .size line. . means "the current address" so ".-Whatever" means "the difference between the current address and the address of the Whatever label," i.e. the size of the function. OpenSSL had these .size directives even before it added CFI.

Probably you want to create a macro similar to the Linux kernel's SYM_FUNC_END(name) macro that is used at the end of every function, which would do this .size directive and other boilerplate.

briansmith · 2025-03-31T15:35:51Z

x86/curve25519/bignum_sqrt_p25519.o: warning: objtool: .text+0x89: unannotated intra-function call

You can see how somebody fixed it here, in the kernel: https://lore.kernel.org/linuxppc-dev/[email protected]/. Basically they used the SYM_FUNC_START_LOCAL(name) macro at the start of the function and SYM_FUNC_END(name) at the end. Those two macros expand to some CFI-related metadata.

torben-hansen · 2025-03-31T16:55:49Z

I wonder whether 'canonicalizing' an .S file by (1) compiling it into .o using an assembler and (2) disassembling it through objdump and (3) using the disassembled file is supposed to pass the delocator. I suppose delocator is failing when one is using a kind of exotic syntax; if delocator is supposed to accept syntax printed from objdump, we can reduce the complexity of this problem.

Maybe(?) But then you assume objdump is available. I don't think this is too hard to fix. Probably just have to extend PEG to understand the argument.

jargh · 2025-04-01T00:20:03Z

Thanks for the interpretation of the objtool errors and the advice on fixing them; I'll give that a try.

I've not been successful in finding an objtool that works for aarch64 - any ideas there? I did find this but it seems to be work in progress:https://lwn.net/Articles/893115/. if nothing else I can use the errors that pop up on x86 as guidance for analogous things on ARM.

briansmith · 2025-04-01T17:49:49Z

arm/curve25519/bignum_cmul_p25519.S

@@ -44,6 +47,7 @@

 S2N_BN_SYMBOL(bignum_cmul_p25519):


I am a bit confused here. Is the idea that bignum_cmul_p25519 and bignum_cmul_p25519_alt are the same function, i.e. one is an alias for the other? I wonder if this pattern is allowed by all ABIs and/or if it might be the source of some of the issues here. In OpenSSL when they have such aliases, the alias is implemented as a whole separate function that jmps to the function it is aliasing.

We were initially trying to support a uniform naming scheme for both ARM and x86. The bifurcation into "alt" and "non-alt" arises when for at least one architecture there is value in different variants with different microarchitectural targets. Cases like this where the same function has both names reflect those where on the other architecture (only) there is value in two versions, but on this architecture it didn't seem worth it. Currently, consumers like AWS-LC handle the dispatch according to uarch. Now, we no longer quite maintain the uniformity of this naming scheme since there are a few extra functions for ARM, so perhaps we should revisit this.

briansmith · 2025-04-01T18:05:08Z

include/_internal_s2n_bignum_x86_att.h

+// Variants of instructions including CFI (call frame information) annotations
+
+#define CFI_START .cfi_startproc
+#define CFI_RET retq ; .cfi_endproc


Why CFI_RET is used instead of just keeping normal ret and then defining a CFI_END or similar?

Why combine ret and .cfi_endproc with a macro but not .cfi_startproc and _CET_ENDBR?

I was attracted to wrapping things up into a unified single pseudo-instruction on the grounds that it perhaps makes it less likely that code and annotations will get out of step. I agree the current approach is not quite consistent in how it does it, though; as for _CET_ENDBR, that arose in a different PR and only applies to x86, so I'd not been thinking of it in combination, but indeed it could make sense to unify it as an "all the initial boilerplate for a function" pseudo-instruction. One slight difference is that I'm also using .cfi_startproc at internal subroutines, but those don't have the ENDBR decorations since they are only the target of internal direct calls. But I may have to think again about what CFI annotations are appropriate for internal subroutines anyway since that was one of the things that objtool was complaining about.

briansmith · 2025-04-01T18:53:23Z

x86/sm2/bignum_montsqr_sm2.S

-        push    r12
-        push    r13
-        push    r14
-        push    r15


I noticed much of the SM2 code uses rbp as a GP register whereas almost none of the other code seems to? I didn't look at the SM2 code but I suspect that wherever RBP is changed, more work needs to be done. [Edit: nevermind.]

briansmith · 2025-04-01T18:54:41Z

x86_att/p256/p256_montjmixadd.S


-        subq    $NSPACE, %rsp
+        CFI_DEC_RSP(NSPACE)

        movq    %rdx, %rbp


Here is one of the few places outside of SM2-specific code where RBP is changed, AFAICT.

Actually I think there are many more RBP changes, since very often it's hidden under macros. If you look at the objdump output and MAYCHANGE fields in the proof, you can see them all. Try for example:
grep define x86/*/*.S | grep rbp$
In particular, it's often used as a zero for ADCX/ADOX which do not allow immediate arguments.

Indeed, I think I overlooked the other uses because it is referenced as ebp elsewhere, presumably as an optimization (e.g. xor ebp, ebp).

In any case, I seems like I was misremembering, since CFI will use rsp as the default base instead of rbp like I thought, so AFAICT we don't have to pay any special attention to rbp.

Ah yes indeed, there is also the ebp optimization. This codebase started with NASM, which makes a few optimizations such as xor ebp, ebp for you automatically if you write rbp everywhere. But we made it explicit when switching to the GNU assembler (or its clang counterpart).

briansmith · 2025-04-01T19:33:51Z

include/_internal_s2n_bignum_arm.h

+#define CFI_START .cfi_startproc
+#define CFI_RET ret ; .cfi_endproc
+
+#define CFI_BL(target) bl target


Do we need these function call wrapper macros? it seems like they don't do anything interesting?

No, that's true - I must have thought there would be some annotation to add later. I am still wondering if I might need something to pacify objtool, as suggested in its documentation, but perhaps the real issue is that the subroutine itself isn't properly annotated or is not considered as a distinct function from a CFI point of view (as was, provisionally at least, intended). I'll have to look more at the pointers you sent above.

file.o: warning: unannotated intra-function call

This warning means that a direct call is done to a destination which
is not at the beginning of a function. If this is a legit call, you
can remove this warning by putting the ANNOTATE_INTRA_FUNCTION_CALL
directive right before the call.

fixing a few automerge failures for the combination of CFI updates and the __LF change to instruction separators.

This adopts the change awslabs#230 but for the newly bifurcated structure of the include file(s).

jargh and others added 12 commits March 21, 2025 22:41

Undo CFI macro transformations inside comments

0a66845

The original script was a bit too stupid and changed these too.

Merge branch 'awslabs:main' into cfi

89726d6

Add CFI start directives at internal/nested procedure entry points

b0126a7

Add CFI directives for non-push/pop save and restore

07dce91

This adds directives for where the callee-saved registers are directly stored onto the stack at an offset without modifying the stack pointer. Not very many wrappers are written in this style, but a few are.

Add actual cfi_startproc and cfi_endproc directives

e0deebf

Also fix up functions with multiple exits so we only wrap the final ret as a CFI_RET.

Add saved register offsets to the CFI macro definitions

4bc18f5

Add local prefix to a rogue label that was missing it

11416fa

This got missed earlier because of the space before it and then not caught because it was only relevant for the Windows ABI.

jargh mentioned this pull request Mar 26, 2025

Call Frame Information (CFI) and frame pointer considerations #197

Open

jargh added 3 commits March 26, 2025 15:39

Update local header file name in the ARM unopt code as well

85bc16c

These files were just forgotten earlier. Although this code is not part of the final build, it is used as a stepping-stone in the proofs.

Add a function type directive for internally called functions

08f8ce5

This seems to matter for some versions of objdump, which can affect the ctCheck test.

briansmith mentioned this pull request Mar 31, 2025

Document support Intel CET IBT (Indirect Branch Tracking) #216

Open

Merge branch 'awslabs:main' into cfi

87df1a1

briansmith reviewed Apr 1, 2025

View reviewed changes

jargh and others added 10 commits April 2, 2025 14:35

Merge branch 'awslabs:main' into cfi

3d48f62

Merge branch 'awslabs:main' into cfi

418d53b

Merge branch 'awslabs:main' into cfi

df81139

Merge remote-tracking branch 'awslabs/main' into cfi

8abbd61

fixing a few automerge failures for the combination of CFI updates and the __LF change to instruction separators.

Merge branch 'awslabs:main' into cfi

707c5d4

Merge branch 'awslabs:main' into cfi

2c038d9

Merge branch 'awslabs:main' into cfi

c357785

Merge remote-tracking branch 'awslabs/main' into cfi

956e288

Adopt refinement to _CET_ENDBR definition in bifurcated include files

5715ba9

This adopts the change awslabs#230 but for the newly bifurcated structure of the include file(s).

Merge branch 'awslabs:main' into cfi

f5ddda4

Add call frame information (CFI) annotations #214

Are you sure you want to change the base?

Add call frame information (CFI) annotations #214

Uh oh!

Conversation

jargh commented Mar 26, 2025

Uh oh!

briansmith commented Mar 27, 2025

Uh oh!

jargh commented Mar 29, 2025

Uh oh!

torben-hansen commented Mar 31, 2025

Uh oh!

aqjune-aws commented Mar 31, 2025

Uh oh!

briansmith commented Mar 31, 2025

Uh oh!

briansmith commented Mar 31, 2025

Uh oh!

briansmith commented Mar 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

torben-hansen commented Mar 31, 2025

Uh oh!

jargh commented Apr 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

briansmith Apr 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

briansmith commented Mar 31, 2025 •

edited

Loading

briansmith Apr 1, 2025 •

edited

Loading