Commit ca38a5d

committed
Add draft of inline-skating
1 parent 3c792a4 commit ca38a5d

1 file changed

+240
-0
lines changed

@@ -0,0 +1,240 @@
1+
---
2+
layout: post
3+
title: C4 - Inline skating
4+
date: 2025-09-12 00:00:00 -0500
5+
categories: general
6+
---
7+
8+
Inlining allows the compiler to replace some function calls with direct code insertion, improving performance. There are two mechanisms of inlining in Clojure: user-defined inlining and intrinsic operations.
9+
10+
After the heavy lifting of the last few posts, this post is just a little palate cleanser.
11+
12+
In [C4 - Primitive urges][TBD], we looked at how unnecessary boxing of primitive types can be avoided. That is accomplished by significant compiler magic spread over a large number of classes.
13+
14+
In contrast, inlining and intrinsic operations are very localized and easy to understand.
15+
16+
## Inlining for function definitions
17+
18+
Clojure lets you specify inlining instructions on function definitions by adding metadata to the definition. This is done extensively in the Clojure core and math libraries, and it can also be used by regular folks in application code.
19+
20+
We will take our examples from the Clojure core library. We start with a simple example: `compare`.
21+
22+
```Clojure
23+
(defn compare
24+
"Comparator. Returns a negative number, zero, or a positive number
25+
when x is logically 'less than', 'equal to', or 'greater than'
26+
y. Same as Java x.compareTo(y) except it also works for nil, and
27+
compares numbers and collections in a type-independent manner. x
28+
must implement Comparable"
29+
{
30+
:inline (fn [x y] `(. clojure.lang.Util compare ~x ~y))
31+
:added "1.0"}
32+
[x y] (. clojure.lang.Util (compare x y)))
33+
```
34+
35+
Ignoring the metadata, we have a simple function definition that does host interop, calling a static method in the `clojure.lang.Util` class.
36+
37+
Elsewhere in the core source you will also see `:static` metadata; you can ignore it, as it is no longer used. Of relevance here is the `:inline` metadata. Its value is a function that takes the same number of arguments as the call being inlined (here, two) and returns a replacement form. It is reminiscent of a macro definition, and in fact its use is similar.
38+
39+
Suppose we are compiling and reach the expression:
40+
41+
```Clojure
42+
(compare x "abc")
43+
```
44+
45+
This code will make its way down to `Compiler.AnalyzeSeq`. That method will first try to macroexpand it; `compare` is not a macro, so there is no change. The next step is to check `compare` for inlining metadata.
46+
47+
```C#
48+
// in this example, op = `compare`
49+
// the count is the number of arguments, here 2
50+
IFn inline = IsInline(op, RT.count(RT.next(form)));
51+
52+
// If there is inline data, it will be a function.
53+
// Apply the function to get a replacement form to be analyzed.
54+
// Transfer metadata from the source to the replacement code and analyze that instead.
55+
if (inline != null)
56+
return Analyze(pcon, MaybeTransferSourceInfo(PreserveTag(form, inline.applyTo(RT.next(form))), form));
57+
```
58+
59+
For our example, `(compare x "abc")` will be transformed to:
60+
61+
```Clojure
62+
(. clojure.lang.Util compare x "abc")
63+
```
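The lookup-and-replace step can be mimicked at the REPL. The following is a rough sketch, not the compiler's code: `inline-expand` is a hypothetical helper that reads the `:inline` and `:inline-arities` metadata from the operator's var and, if the arity is accepted, applies the inline function to the unevaluated argument forms.

```clojure
(defn inline-expand
  "Hypothetical helper: returns the replacement form if the operator of
  `form` has :inline metadata that accepts this arity, else `form` itself."
  [form]
  (let [op        (when (seq? form) (first form))
        v         (when (symbol? op) (resolve op))
        m         (when (var? v) (meta v))
        inline    (:inline m)
        arity-ok? (:inline-arities m)
        args      (rest form)]
    (if (and inline
             (or (nil? arity-ok?) (arity-ok? (count args))))
      (apply inline args)   ; called on forms, much like a macro
      form)))

;; compare has :inline metadata, so we get an interop form starting with `.`
(inline-expand '(clojure.core/compare x "abc"))

;; str has no :inline metadata, so the form comes back unchanged
(inline-expand '(clojure.core/str x "abc"))
```

Because the inline function is written with syntax-quote, the method symbol in the result may print namespace-qualified; the shape of the form is what matters here.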
64+
65+
Rather than compiling to a call to the compare function, which would itself call `clojure.lang.Util.compare`, we directly compile the interop call, bypassing the function call overhead.
66+
67+
One could achieve the same effect by defining `compare` as a macro. However, that would preclude using `compare` as a first-class function; for example, you would not be able to pass it as an argument to higher-order functions. Inlining allows the use as a first-class function while still allowing the compiler to optimize away the function call overhead.
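Here is a minimal sketch of the same idea in application code. The function `square` is a made-up example, not something from the core library. Its `:inline` function binds the argument form to a local first, so that an argument expression is not evaluated twice in the expansion.

```clojure
(defn square
  "Returns x squared. Hypothetical example, not from the core library."
  {:inline (fn [x] `(let [x# ~x] (* x# x#)))}
  [x]
  (* x x))

(square 7)             ; direct call: eligible for inlining
(map square [1 2 3])   ; passed as a value: the ordinary function is used
```

Both uses give the same results; the only difference is whether a function call is compiled at the call site.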
68+
69+
70+
## Inline arity
71+
72+
You will note that the method `Compiler.IsInline` is passed the operator (`compare`) and the argument count (2). We can specify which arities should be inlined. For example,
73+
74+
```Clojure
75+
(defn =
76+
"Equality. Returns true if x equals y, false if not. Same as
77+
Java x.equals(y) except it also works for nil, and compares
78+
numbers and collections in a type-independent manner. Clojure's immutable data
79+
structures define equals() (and thus =) as a value, not an identity,
80+
comparison."
81+
{:inline (fn [x y] `(. clojure.lang.Util equiv ~x ~y))
82+
:inline-arities #{2}
83+
:added "1.0"}
84+
([x] true)
85+
([x y] (clojure.lang.Util/equiv x y))
86+
([x y & more]
87+
(if (clojure.lang.Util/equiv x y)
88+
(if (next more)
89+
(recur y (first more) (next more))
90+
(clojure.lang.Util/equiv y (first more)))
91+
false)))
92+
```
93+
94+
If `:inline-arities` is specified, its value should be a function that takes an integer and returns a truthy value if a call of that arity should be inlined. In this case, the function is a set; `IPersistentSet` implements `IFn` on one argument and checks for membership. The set `#{2}` will return a truthy value only for the argument 2. One can find examples like `#{2 3}`, allowing several designated arities to be inlined. You can also supply an arbitrary function as the predicate. For example, there are multiple places in the core library where one finds `:inline-arities >1?`, which is a function that returns true for any argument greater than 1.
95+
96+
```Clojure
97+
(defn ^:private >1? [n] (clojure.lang.Numbers/gt n 1))
98+
```
99+
100+
Note that this is private, so you'll have to define your own version if you need this test.
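A short sketch of both styles of arity predicate follows. The names `more-than-one?` and `add-all` are hypothetical, written for this illustration; `more-than-one?` merely stands in for the private `>1?`.

```clojure
;; A set works as the arity predicate: it returns the member or nil.
(#{2 3} 2)   ; truthy
(#{2 3} 4)   ; nil

;; Hypothetical stand-in for the private >1? in clojure.core.
(defn- more-than-one? [n] (> n 1))

(defn add-all
  "Hypothetical example: sums its arguments."
  {:inline (fn [x y & more]
             ;; builds nested two-argument additions from the argument forms
             (reduce (fn [acc z] `(+ ~acc ~z)) `(+ ~x ~y) more))
   :inline-arities more-than-one?}
  ([] 0)
  ([x] x)
  ([x y] (+ x y))
  ([x y & more] (reduce + (+ x y) more)))

(add-all 1 2 3 4)   ; arity 4 passes the predicate: eligible for inlining
(add-all 5)         ; arity 1 fails it: compiled as a normal call
```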
101+
102+
103+
__TMI warning!__: One last note on the core code. You will see one mysterious function, `nary-inline`, used in several places. For example:
104+
105+
```Clojure
106+
(defn +
107+
"Returns the sum of nums. (+) returns 0. Does not auto-promote
108+
longs, will throw on overflow. See also: +'"
109+
{:inline (nary-inline 'add 'unchecked_add)
110+
:inline-arities >1?
111+
:added "1.2"}
112+
([] 0)
113+
([x] (. clojure.lang.RT (NumberCast x))) ;;; (cast Number x))
114+
([x y] (. clojure.lang.Numbers (add x y)))
115+
([x y & more]
116+
(reduce1 + (+ x y) more)))
117+
```
118+
119+
`nary-inline` is also a private function. It picks one of the two method names passed to it based on whether `*unchecked-math*` is true or not, and constructs a call to the corresponding static method in `clojure.lang.Numbers`. It is used in several places in the core library to handle math operations.
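To make that description concrete, here is a simplified sketch of such a helper. It is an approximation written for this post, not a copy of the core source; the name `my-nary-inline` is hypothetical and it handles only two or more arguments.

```clojure
(defn my-nary-inline
  "Hypothetical, simplified stand-in for the private nary-inline."
  [op unchecked-op]
  (fn [x y & more]
    ;; pick the method name when the inline function is applied
    (let [op (if *unchecked-math* unchecked-op op)]
      (reduce (fn [a b] `(. clojure.lang.Numbers (~op ~a ~b)))
              `(. clojure.lang.Numbers (~op ~x ~y))
              more))))

;; With checked math: nested calls to the `add` method.
((my-nary-inline 'add 'unchecked_add) 'a 'b 'c)

;; With unchecked math: the `unchecked_add` method is chosen instead.
(binding [*unchecked-math* true]
  ((my-nary-inline 'add 'unchecked_add) 'a 'b))
```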
120+
121+
122+
## Intrinsic operations
123+
124+
Now that we have covered user-defined inlining, we can look at intrinsic operations. These are used to replace calls to certain static methods with direct MSIL/bytecode instructions. This form is not user-definable; it is hard-coded in the compiler and applies to basic arithmetic, logical, and bitwise operations on primitive types.
125+
126+
Consider:
127+
128+
```clojure
129+
(defn f [x y] (+ x y))
130+
```
131+
132+
Inlining will help us. The body of `f` will compile directly to a call to `clojure.lang.Numbers.add`, avoiding the overhead of invoking the `+` function:
133+
134+
```C#
135+
public static object invokeStatic(object P_0, object P_1)
136+
{
137+
return Numbers.add(P_0, P_1);
138+
}
139+
```
140+
141+
There are a lot of overloads of `Numbers.add` to handle different types of arguments. In this case, because we do not have more specific type information about `x` and `y`, we will end up calling the version that takes two `object` arguments. This will result in boxing if `x` and `y` are primitive type values, and a bunch of type checking and dispatching inside `Numbers.add`.
142+
143+
We learned in [C4 - Primitive urges][TBD] how to avoid boxing. If we restrict the types of `x` and `y` to, say, `double`, we can avoid boxing:
144+
145+
146+
```clojure
147+
(defn f ^double [^double x ^double y] (+ x y))
148+
```
149+
150+
One would expect this to compile to:
151+
152+
```C#
153+
public static double invokeStatic(double P_0, double P_1)
154+
{
155+
return Numbers.add(P_0, P_1);
156+
}
157+
```
158+
159+
where now we will be calling the `double Numbers.add(double, double)` overload of `Numbers.add`.
160+
And that would be the case if not for intrinsic operation inlining. In the method that does code generation for host interop calls where we know the actual method to invoke (i.e., not a reflection situation), we find:
161+
162+
```C#
163+
// ... Code to put the arguments on the stack ...
164+
if (IsStaticCall)
165+
{
166+
if (Intrinsics.HasOp(_method))
167+
// We have an intrinsic operation definition for this method.
168+
// Generate equivalent IL instructions directly.
169+
Intrinsics.EmitOp(_method, ilg);
170+
else
171+
// No intrinsic operation defined.
172+
// Generate regular static method call
173+
ilg.Emit(OpCodes.Call, _method);
174+
}
175+
else
176+
{
177+
// ...
178+
}
179+
```
180+
181+
There are internal tables that map certain static methods to sequences of MSIL/bytecode instructions. In our example, the call is to `clojure.lang.Numbers.add(double, double)`. There is an entry in the table mapping that method to a single instruction, `OpCodes.Add`. No method call is generated; the `add` opcode is inserted directly into the instruction stream.
182+
183+
```C#
184+
// return P_0 + P_1;
185+
IL_0000: ldarg.0
186+
IL_0001: ldarg.1
187+
IL_0002: add
188+
IL_0003: ret
189+
```
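The table lookup described above can be modeled in a few lines of Clojure. This is a toy model only: the map keys, the opcode keywords, and the function `emit-static-call` are illustrative inventions and do not reflect the compiler's actual data structures.

```clojure
;; Toy model: one illustrative entry, mirroring the example in the text.
(def intrinsic-ops
  {["clojure.lang.Numbers" "add" [:double :double]] [:add]})

(defn emit-static-call
  "Returns the instructions a code generator might emit for a static call
  (hypothetical representation)."
  [class-name method-name arg-types]
  (let [k [class-name method-name arg-types]]
    (or (intrinsic-ops k)     ; intrinsic found: emit its opcodes directly
        [[:call k]])))        ; otherwise: an ordinary method call

(emit-static-call "clojure.lang.Numbers" "add" [:double :double])
(emit-static-call "clojure.lang.Numbers" "add" [:object :object])
```

The first call finds an entry and yields the opcode sequence; the second falls through to an ordinary call, as in the boxed case discussed earlier.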
190+
191+
There are intrinsic definitions for many of the static methods in `clojure.lang.Numbers`, typically similar to the one given here, where the arguments are one or two primitive types. There are also intrinsic definitions for other operations, such as array access and numeric conversions.
192+
193+
A separate category covers predicates: tests that produce a boolean result and branch on it.
194+
These are called with a label for the false branch. This substitution is done only in code generation for the test expression in an `IfExpr`. For example, consider
195+
196+
```clojure
197+
(defn g ^double [^double x ^double y] (if (< x y) x y))
198+
```
199+
200+
`clojure.core/<` is defined as:
201+
202+
```Clojure
203+
(defn <
204+
"Returns non-nil if nums are in monotonically increasing order,
205+
otherwise false."
206+
{:inline (fn [x y] `(. clojure.lang.Numbers (lt ~x ~y)))
207+
:inline-arities #{2}
208+
:added "1.0"}
209+
([x] true)
210+
([x y] (. clojure.lang.Numbers (lt x y)))
211+
([x y & more]
212+
(if (< x y)
213+
(if (next more)
214+
(recur y (first more) (next more))
215+
(< y (first more)))
216+
false)))
217+
```
218+
219+
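Regular inlining reduces `(< x y)` to `(. clojure.lang.Numbers (lt x y))`. There is an intrinsic definition for `Numbers.lt(double,double)`, so when the call is the test of an `if`, no method call is emitted. Instead we get a conditional branch (`bge`) to the false label. Calls to other predicates with intrinsic definitions, such as `Numbers.equiv(double,double)`, are inlined in the same manner. We end up generating the code:
220+
221+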
222+
223+
```C#
224+
// return (P_0 >= P_1) ? P_1 : P_0;
225+
IL_0000: ldarg.0
226+
IL_0001: ldarg.1
227+
IL_0002: bge IL_000d
228+
229+
IL_0007: ldarg.0
230+
// (no C# code)
231+
IL_0008: br IL_000e
232+
233+
IL_000d: ldarg.1
234+
235+
IL_000e: ret
236+
```
237+
238+
Intrinsic operations are a powerful optimization technique. However, they are a closed set of operations defined in the compiler, not available for user extension. Perhaps that's a good thing?
239+
240+