Commit 8d56968

(TK-487) Allow friendly init/start fail fast via ::exit throw
Allow init and start methods to throw a request-shutdown style ex-info map to short circuit the startup process and exit with a specified message and status, rather than a backtrace. This just provides a short circuiting (immediate) counterpart to the existing, deferred shutdown requests provided by request-shutdown.
1 parent 383f621 commit 8d56968
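
A minimal sketch of what the commit message describes, assuming a service wants to abort startup with a status and a message instead of a backtrace. The helper `exit-exception` and the stand-in `init-sketch` are hypothetical names for illustration and are not part of Trapperkeeper; only the `:puppetlabs.trapperkeeper.core/exit` key and the `:status`/`:messages` shape come from the diff below.

```clojure
;; Hypothetical helper (not a Trapperkeeper function): builds an ex-info
;; whose ex-data carries an exit request in the shape used by this commit.
(defn exit-exception
  [status message]
  (ex-info "Immediate shutdown requested"
           {:puppetlabs.trapperkeeper.core/exit
            {:status status
             :messages [[message *err*]]}}))

;; Stand-in for the body of a service's `init`; a real service would be
;; defined with Trapperkeeper's service macro instead of a plain function.
(defn init-sketch
  [context config]
  (when-not (:data-dir config)
    ;; Thrown on the calling thread, so it propagates out of init.
    (throw (exit-exception 3 "Missing required setting: data-dir\n")))
  context)
```

Calling `(init-sketch {} {})` throws, and `(ex-data e)` on the caught exception returns the exit request map; with a `:data-dir` present the context is returned unchanged.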

5 files changed: 97 additions, 24 deletions

documentation/Built-in-Shutdown-Service.md

Lines changed: 8 additions & 3 deletions
@@ -68,9 +68,14 @@ specifying a process exit status and final messages like this:
    :messages [["Unexpected filesystem error ..." *err*]]}}
 ```
 
-which will finally be thrown from `run` as an `ex-info` of `:kind`
-`:puppetlabs.trapperkepper.core/exit` like this:
-
+This map is exactly the same map that can be thrown from an `init` or
+`start` method via `ex-info` to initiate an immediate shutdown.
+(Calls to `request-shutdown` only trigger a shutdown later, currently
+after all of the services have been initialized and started.)
+
+Whether via `request-shutdown` or a throw from `init` or `start`, a
+shutdown request will eventually cause `run` to throw an `ex-info` of
+`:kind` `:puppetlabs.trapperkepper.core/exit` like this:
 
 ```clj
 {:kind :puppetlabs.trapperkepper.core/exit`

documentation/Defining-Services.md

Lines changed: 3 additions & 0 deletions
@@ -31,6 +31,9 @@ The default implementation of the lifecycle functions is to simply return the se
 
 Trapperkeeper will call the lifecycle functions in order based on the dependency list of the services; in other words, if your service has a dependency on service `Foo`, you are guaranteed that `Foo`'s `init` function will be called prior to yours, and that your `stop` function will be called prior to `Foo`'s.
 
+If an exception is thrown by `init` or `start`, Trapperkeeper will
+[initiate an immediate shutdown](Error-Handling.md).
+
 ### Example Service
 
 Let's look at a concrete example:

documentation/Error-Handling.md

Lines changed: 13 additions & 0 deletions
@@ -6,6 +6,19 @@ If the `init` or `start` function of any service throws a `Throwable`, it will c
 
 If the `init` or `start` function of your service launches a background thread to perform some costly initialization computations (like, say, populating a pool of objects which are expensive to create), it is advisable to wrap that computation inside a call to `shutdown-on-error`; however, you should note that `shutdown-on-error` does *not* short-circuit Trapperkeeper's start-up sequence - the app will continue booting. The `init` and `start` functions of all services will still be run, and once that has completed, all `stop` functions will be called, and the process will terminate.
 
+If the exception thrown by `init` or `start` is an `ex-info` exception
+containing the same kind of map that
+[`request-shutdown`](Built-in-Shutdown-Service.md#request-shutdown)
+accepts, then Trapperkeeper will print the specified messages and exit
+with the specified status as described there. For example:
+
+    (ex-info ""
+             {:kind :puppetlabs.trapperkepper.core/exit`
+              {:status 3
+               :messages [["Unexpected filesystem error ..." *err*]]}})
+
+The `ex-info` message string is currently ignored.
+
 ## Services Should Fail Fast
 
 Trapperkeeper embraces fail-fast behavior. With that in mind, we advise writing services that also fail-fast. In particular, if your service needs to spin-off a background thread to perform some expensive initialization logic, it is a best practice to push as much code as possible outside of the background thread (for example, validating configuration data), because `Throwables` on the main thread will propagate out of `init` or `start` and cause the application to shut down - i.e., it will *fail fast*. There are different operational semantics for errors thrown on a background thread (see previous section).
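
The fail-fast advice in this hunk can be sketched as follows, assuming a service that validates its configuration on the calling thread and only then hands expensive work to a background thread. `validate-config!`, `start-sketch`, and the `:pool-size` setting are hypothetical names for illustration; a real service would also wrap the background work in `shutdown-on-error` as the documentation above advises, which is omitted here to keep the sketch self-contained.

```clojure
;; Hypothetical validation step: runs synchronously, so a throw here
;; propagates out of the lifecycle function that calls it.
(defn validate-config!
  [config]
  (let [n (:pool-size config)]
    (when-not (and (integer? n) (pos? n))
      (throw (ex-info "Invalid configuration"
                      {:puppetlabs.trapperkeeper.core/exit
                       {:status 2
                        :messages [["pool-size must be a positive integer\n" *err*]]}})))))

;; Stand-in for the body of a service's `start`.
(defn start-sketch
  [context config]
  ;; Cheap checks first, on the calling thread.
  (validate-config! config)
  ;; Only the expensive part runs in the background (daemon thread so the
  ;; sketch never keeps a JVM alive on its own).
  (let [pool   (promise)
        worker (Thread. ^Runnable
                        (fn []
                          (deliver pool (vec (repeat (:pool-size config)
                                                     :expensive-object)))))]
    (.setDaemon worker true)
    (.start worker)
    (assoc context :pool pool)))
```

With a bad `:pool-size` the exception is raised before any thread is started, which is the point of pushing validation out of the background work.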

src/puppetlabs/trapperkeeper/internal.clj

Lines changed: 35 additions & 20 deletions
@@ -176,6 +176,29 @@
                required []]
     (first (ks/cli! cli-args specs required))))
 
+(def exit-request-schema
+  "A process exit request like
+  {:status 7
+   :messages [[\"something for stderr\n\" *err*]]
+              [\"something for stdout\n\" *out*]]
+              [\"something else for stderr\n\" *err*]]"
+  {:status schema/Int
+   :messages [[(schema/one schema/Str "message")
+               (schema/one java.io.Writer "stream")]]})
+
+(defn exit-exception? [ex]
+  (and (instance? ExceptionInfo ex)
+       (not (schema/check {(schema/optional-key :puppetlabs.trapperkeeper.core/exit)
+                           exit-request-schema}
+                          (ex-data ex)))))
+
+(defn shutdown-reason-for-ex
+  [exception]
+  (if (exit-exception? exception)
+    (merge {:cause :requested}
+           (select-keys (ex-data exception) [:puppetlabs.trapperkeeper.core/exit]))
+    {:cause :service-error :error exception}))
+
 (schema/defn ^:always-validate run-lifecycle-fn!
   "Run a lifecycle function for a service. Required arguments:
 
@@ -234,9 +257,15 @@
       (log/debug (i18n/trs "Finished running lifecycle function ''{0}'' for service ''{1}''"
                            lifecycle-fn-name
                            service-id)))
-    (catch Throwable t
-      (log/error t (i18n/trs "Error during service {0}!!!" lifecycle-fn-name))
-      (throw t))))
+    (catch ExceptionInfo ex
+      (if (exit-exception? ex)
+        (log/info (i18n/trs "Immediate shutdown requested during service {0}"
+                            lifecycle-fn-name))
+        (log/error ex (i18n/trs "Error during service {0}!!!" lifecycle-fn-name)))
+      (throw ex))
+    (catch Throwable ex
+      (log/error ex (i18n/trs "Error during service {0}!!!" lifecycle-fn-name))
+      (throw ex))))
 
 (schema/defn ^:always-validate initialize-lifecycle-worker :- (schema/protocol async-prot/Channel)
   "Initializes a 'worker' which will listen for lifecycle-related tasks and perform
@@ -286,9 +315,7 @@
               (log/debug (i18n/trs "Lifecycle worker completed {0} lifecycle task; awaiting next task." type))
               (catch Exception e
                 (log/debug e (i18n/trs "Exception caught in lifecycle worker loop"))
-                (deliver shutdown-reason-promise
-                         {:cause :service-error
-                          :error e})))
+                (deliver shutdown-reason-promise (shutdown-reason-for-ex e))))
             (recur))
 
           (do
@@ -345,16 +372,6 @@
 ;;;; regarding the cause of the shutdown, and is intended to be passed back
 ;;;; in to the top-level functions that perform various shutdown steps.
 
-(def exit-request-schema
-  "A process exit request like
-  {:status 7
-   :messages [[\"something for stderr\n\" *err*]]
-              [\"something for stdout\n\" *out*]]
-              [\"something else for stderr\n\" *err*]]"
-  {:status schema/Int
-   :messages [[(schema/one schema/Str "message")
-               (schema/one java.io.Writer "stream")]]})
-
 (def ^{:private true
        :doc "The possible causes for shutdown to be initiated."}
   shutdown-causes #{:requested :service-error :jvm-shutdown-hook})
@@ -635,8 +652,7 @@
               (inc-restart-counter! this)
               this
               (catch Throwable t
-                (deliver shutdown-reason-promise {:cause :service-error
-                                                  :error t})))))))
+                (deliver shutdown-reason-promise (shutdown-reason-for-ex t))))))))
 
 (schema/defn ^:always-validate boot-services-for-app**
   "Boots services for a TK app. WARNING: This should only ever be called
@@ -648,8 +664,7 @@
     (a/init app)
     (a/start app)
     (catch Throwable t
-      (deliver shutdown-reason-promise {:cause :service-error
-                                        :error t})))
+      (deliver shutdown-reason-promise (shutdown-reason-for-ex t))))
   (deliver result-promise app)))
 
 (schema/defn ^:always-validate boot-services-for-app* :- (schema/protocol a/TrapperkeeperApp)
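
The classification that `exit-exception?` and `shutdown-reason-for-ex` perform in this file can be approximated without the schema library, as a rough sketch. The names ending in `-sketch` and `message-pair?` are hypothetical. One deliberate difference: the schema in the diff marks the exit key with `schema/optional-key`, while this sketch requires the key to be present. Note also that the check looks for `:puppetlabs.trapperkeeper.core/exit` in the `ex-data`, which is the shape the test in this commit uses, whereas the example in `Error-Handling.md` is written with a `:kind` entry.

```clojure
(def exit-key :puppetlabs.trapperkeeper.core/exit)

;; Hypothetical helper: one [message stream] entry of :messages.
(defn message-pair?
  [m]
  (and (vector? m)
       (= 2 (count m))
       (string? (first m))
       (instance? java.io.Writer (second m))))

;; Plain-Clojure approximation of exit-exception?; stricter than the
;; diff's schema because the exit key must be present.
(defn exit-exception-sketch?
  [ex]
  (and (instance? clojure.lang.ExceptionInfo ex)
       (let [{:keys [status messages]} (get (ex-data ex) exit-key)]
         (and (integer? status)
              (sequential? messages)
              (every? message-pair? messages)))))

;; Mirrors shutdown-reason-for-ex: exit requests become :requested,
;; anything else is reported as a :service-error carrying the exception.
(defn shutdown-reason-sketch
  [ex]
  (if (exit-exception-sketch? ex)
    (merge {:cause :requested}
           (select-keys (ex-data ex) [exit-key]))
    {:cause :service-error :error ex}))
```

An exception built like the one in `test-immediate-shutdown-exceptions` yields `:cause :requested` along with the exit map, while an ordinary `RuntimeException` yields `:cause :service-error`.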

test/puppetlabs/trapperkeeper/internal_test.clj

Lines changed: 38 additions & 1 deletion
@@ -108,4 +108,41 @@
         (tk-app/stop app)
         ;; and make sure that we got one last :stop
         (is (= (conj expected-lifecycle-events :stop)
-               @lifecycle-events)))))
+               @lifecycle-events)))))
+
+(deftest test-immediate-shutdown-exceptions
+  (let [shutdown #(throw
+                   (ex-info "Shutting down"
+                            {:puppetlabs.trapperkeeper.core/exit
+                             {:status 42
+                              :messages [["Something on stdderr" *err*]]}}))
+        service-that-shuts-down-from
+        (fn [stage events]
+          (tk/service
+           []
+           (init [this context]
+                 (swap! events conj :init)
+                 (when (= :init stage) (shutdown))
+                 context)
+           (start [this context]
+                  (swap! events conj :start)
+                  (when (= :start stage) (shutdown))
+                  context)
+           (stop [this context]
+                 (swap! events conj :stop)
+                 context)))
+        config-fn (constantly {})
+        test-stage
+        (fn test-stage [stage expected-events]
+          (let [events (atom [])
+                svc (service-that-shuts-down-from stage events)
+                app (internal/build-app* [svc] config-fn)
+                main-thread (future (internal/boot-services-for-app* app))]
+            @main-thread
+            (is (= expected-events @events))
+            (let [{:keys [cause :puppetlabs.trapperkeeper.core/exit]}
+                  (internal/get-app-shutdown-reason app)]
+              (is (= cause :requested))
+              (is (= 42 (:status exit))))))]
+    (test-stage :init [:init])
+    (test-stage :start [:init :start])))
