README.md
Lines changed: 26 additions & 5 deletions
@@ -139,15 +139,16 @@ For most of these services you use a third party to run your stuff for you. That
These are 'functions': things that run when you want them to run, usually in response to a request, but they can also be triggered by other things, such as a cron-type message.
-
It runs on cloud providers like Amazon (AWS lambda) , Google (GCP Cloud Functions ) or Microsoft (Azure Functions) or anywhere with an open source framework like openfaas.
+
It runs on cloud providers like Amazon (AWS Lambda), Google (GCP Cloud Functions) or Microsoft (Azure Functions), or anywhere with an open source framework like OpenFaaS.
**Where are the costs?**: With FaaS you pay per execution: you get a small amount for free and pay for use beyond that. The advantages are that very fast scripts cost next to nothing and that you don't pay when you are not using it (unlike a server, where you pay whether you use it or not). It is also massively parallel: because the executions are all independent, you can run thousands of copies of the 'function' at once (you would pay for all of the executions, though). Long-running processes, however, are expensive.
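To make the trade-off above concrete, here is a minimal back-of-the-envelope sketch in R. All prices are hypothetical placeholders chosen for illustration (they are not taken from any provider's price list), the function `faas_monthly_cost()` is made up for this example, and free tiers are ignored.

```r
# Hypothetical prices, for illustration only; check your provider's price list.
price_per_gb_second <- 0.0000167  # assumed FaaS price per GB-second of execution
server_per_month    <- 5          # assumed price of a small always-on server

# Monthly FaaS cost: number of runs * seconds per run * memory (GB) * unit price.
# (Hypothetical helper; ignores free tiers and per-request fees.)
faas_monthly_cost <- function(runs_per_month, seconds_per_run, memory_gb) {
  runs_per_month * seconds_per_run * memory_gb * price_per_gb_second
}

# A quick daily script: 30 runs of 2 seconds with 0.5 GB of memory.
quick_script <- faas_monthly_cost(30, 2, 0.5)

# A heavy hourly job: 720 runs of 15 minutes with 2 GB of memory.
long_job <- faas_monthly_cost(720, 15 * 60, 2)

cat("Quick script per month:", quick_script, "\n")
cat("Heavy job per month:   ", long_job, "\n")
cat("Server per month:      ", server_per_month, "\n")
```

Under these assumed numbers the quick script costs a fraction of a cent per month, while the heavy job costs several times more than the always-on server.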
**How easy is it to set up and use. and how easy can you transfer your work to your coworker**: Most cloud providers can be configured through files and APIs, but you would probably do your first setup by hand. Once you have something running, it pays to turn it all into 'configuration in code'; that way you can make variants and hand them to your coworkers.
**Can you manage your entire configuration in code?**: Yes
-
**Is there logging, how easy is it see what exactly went wrong?**: Unknown #TODO
+
**Is there logging, how easy is it see what exactly went wrong?**:
+
Yes, there is logging. On GCP, for example, all logs are centralised in [Cloud Logging](https://cloud.google.com/logging), where you can see the logs for both the R script and the service running it. What's more, the logs can be used to trigger events, which can be passed on to start trigger-based workflows.
**How precise is it and will it auto recover on failure?**: FaaS platforms often distinguish between a cold start and a hot start. They are superfast on a hot start (sometimes milliseconds), but if you haven't used the function for a while it goes into storage, and triggering it then gives it a cold start, which might take 5-10 times longer. Triggering a cold function for your batch job at 0900h may lose you some time, but realistically it starts within a minute, so how much you care about this depends on you and your application.
@@ -159,7 +160,7 @@ It runs on cloud providers like Amazon (AWS lambda) , Google (GCP Cloud Function
* [openfaas (Function as a Service) (I don't have an R tutorial yet #TODO )](https://www.openfaas.com/)
* R on AWS Lambda ([mediumpost](https://medium.com/bakdata/running-r-on-aws-lambda-9d40643551a6) & [AWS lambda R runtime](https://github.com/bakdata/aws-lambda-r-runtime))
-
* R on GCP Cloud functions ([package 'googleCloudRunner'](https://code.markedmondson.me/googleCloudRunner/))
+
* R on GCP Cloud Run (HTTP containers as a service), Cloud Build (batch jobs within Docker containers) and Cloud Scheduler (cron in the cloud): [package 'googleCloudRunner'](https://code.markedmondson.me/googleCloudRunner/)
* [Azure functions with R](https://github.com/ktaranov/azure-function-r)
@@ -219,8 +220,6 @@ Github actions is not really meant for scheduling scripts, but it does support i
* [examples of R-specific github actions are collected here](https://github.com/r-lib/actions)
* there is a [book](https://ropenscilabs.github.io/actions_sandbox/) of github actions for R online.
-
-
### Heroku
*[back to top](#scheduling_r_scripts)*
@@ -245,7 +244,29 @@ Heroku is not really a version control system. But they did create something tha
* I wrote a blogpost about [running R on heroku](https://blog.rmhogervorst.nl/blog/2018/12/06/running-an-r-script-on-heroku/), and [an update from 2020](https://blog.rmhogervorst.nl/blog/2020/09/21/running-an-r-script-on-a-schedule-heroku/).
+
### Google Cloud Build
+
+
*[back to top](#scheduling_r_scripts)*
+
+
Other cloud providers have similar offerings in the CI/CD field. All of them expose APIs that can be triggered via cron to batch-schedule scripts, with varying degrees of integration with the other services of the respective cloud. This section gives a bit of detail about the Google offering, Cloud Build, which is facilitated by [googleCloudRunner](https://code.markedmondson.me/googleCloudRunner/). It uses a YAML format to coordinate Docker containers that run when triggered by git events from GitHub, Bitbucket or Google's own git platform, Source Repositories.
+
+
**Where are the costs?**: You pay for CPU running time. There is a monthly free tier, which usually means casual use is free.
+
+
**How easy is it to set up and use. and how easy can you transfer your work to your coworker**: Once you are past the authentication steps it's simple to set up, either in the Web UI or in R code via the R package googleCloudRunner. The package includes an RStudio gadget which you can point at your R script, at in-line R code, or at a script hosted on Google Cloud Storage. As it uses a neutral YAML format that runs Docker images, you can give that YAML to co-workers and they can run the exact same job without knowing R, or you can give them the R script that generated the YAML.
+
+
**Can you manage your entire configuration in code?**: Yes, R code or yaml.
+
+
**Is there logging, how easy is it see what exactly went wrong?**: Yes, in Cloud Logging.
+
+
**How precise is it and will it auto recover on failure?**: The scheduler runs as specified via cron syntax. You can set up auto-retries via extra scripting.
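As an illustration of what 'extra scripting' for retries could look like, here is a minimal base R sketch. `with_retries()` and `flaky_job()` are hypothetical helpers written for this example; they are not part of googleCloudRunner or Cloud Build, and a real job would likely want longer waits between attempts.

```r
# Hypothetical helper: call f() up to max_attempts times, pausing between tries.
with_retries <- function(f, max_attempts = 3, wait_seconds = 1) {
  for (attempt in seq_len(max_attempts)) {
    # Capture an error as a value instead of aborting the whole script.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_attempts) Sys.sleep(wait_seconds)
  }
  stop("All ", max_attempts, " attempts failed: ", conditionMessage(result))
}

# Example: a made-up job that fails twice before it succeeds.
calls <- 0
flaky_job <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "done"
}

outcome <- with_retries(flaky_job, max_attempts = 5, wait_seconds = 0)
```

Wrapping the body of the scheduled R script in a helper like this keeps the retry logic inside the job itself, so it behaves the same however the job is triggered.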
+
+
**how do you have to deal with secrets? can they leak?**: Google Secret Manager handles secrets, and googleCloudRunner has a templated build step for reading them, `cr_buildstep_secret()`.
+
+
**in what country does it run**: Global cloud regions in US, EU and elsewhere.
+
+
**links**:
+
* ['Run R code on a schedule' use case](https://code.markedmondson.me/googleCloudRunner/articles/usecases.html#run-r-code-on-a-schedule-1) on `googleCloudRunner` website.