Commit 4922b2b

juanuribe28 authored and Tensorflow Cloud maintainers committed
Add demo for run_models experimental method.
PiperOrigin-RevId: 388982644
1 parent cb4108f commit 4922b2b

2 files changed

+343
-0
lines changed

g3doc/_book.yaml

+2
@@ -22,6 +22,8 @@ upper_tabs:
2222
path: /cloud/tutorials/distributed_training_nasnet_with_tensorflow_cloud
2323
- title: Hyperparameter tuning on Google Cloud
2424
path: /cloud/tutorials/hp_tuning_cifar10_using_google_cloud
25+
- title: Running vision models from TF Model Garden on GCP with TF Cloud
26+
path: /cloud/tutorials/experimental/running_vision_models_from_tf_model_garden_on_gcp_with_tf_cloud
2527
- heading: "Guides"
2628
- title: Cloud `run` guide
2729
path: /cloud/guides/run_guide
@@ -0,0 +1,341 @@
1+
{
2+
"nbformat": 4,
3+
"nbformat_minor": 0,
4+
"metadata": {
5+
"colab": {
6+
"name": "Running vision models from TF Model Garden on GCP with TF Cloud",
7+
"provenance": [],
8+
"collapsed_sections": [],
9+
"toc_visible": true
10+
},
11+
"kernelspec": {
12+
"name": "python3",
13+
"display_name": "Python 3"
14+
},
15+
"language_info": {
16+
"name": "python"
17+
}
18+
},
19+
"cells": [
20+
{
21+
"cell_type": "markdown",
22+
"metadata": {
23+
"id": "ApxORpbFShVP"
24+
},
25+
"source": [
26+
"##### Copyright 2021 The TensorFlow Cloud Authors."
27+
]
28+
},
29+
{
30+
"cell_type": "code",
31+
"metadata": {
32+
"id": "eR70XKMMmC8I",
33+
"cellView": "form"
34+
},
35+
"source": [
36+
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
37+
"# you may not use this file except in compliance with the License.\n",
38+
"# You may obtain a copy of the License at\n",
39+
"#\n",
40+
"# https://www.apache.org/licenses/LICENSE-2.0\n",
41+
"#\n",
42+
"# Unless required by applicable law or agreed to in writing, software\n",
43+
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
44+
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
45+
"# See the License for the specific language governing permissions and\n",
46+
"# limitations under the License."
47+
],
48+
"execution_count": null,
49+
"outputs": []
50+
},
51+
{
52+
"cell_type": "markdown",
53+
"metadata": {
54+
"id": "wKcTRRxsAmDl"
55+
},
56+
"source": [
57+
"# Running vision models from TF Model Garden on GCP with TF Cloud\n",
58+
"\n",
59+
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
60+
"  <td>\n",
61+
"    <!-- <a target=\"_blank\" href=\"https://www.tensorflow.org/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a> MISSING HREF -->\n",
62+
"  </td>\n",
63+
"  <td>\n",
64+
"    <!-- <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a> MISSING HREF -->\n",
65+
"  </td>\n",
66+
"  <td>\n",
67+
"    <!-- <a target=\"_blank\" href=\"https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a> MISSING HREF -->\n",
68+
"  </td>\n",
69+
"  <td>\n",
70+
"    <!-- <a href=\"https://storage.googleapis.com/tensorflow_docs/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a> MISSING HREF -->\n",
71+
"  </td>\n",
72+
"  <td>\n",
73+
"    <!-- <a href=\"https://kaggle.com/kernels/welcome?src=https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\" target=\"_blank\"> <img width=\"90\" src=\"https://www.kaggle.com/static/images/site-logo.png\" alt=\"Kaggle logo\" />Run in Kaggle</a> MISSING HREF -->\n",
74+
"  </td>\n",
75+
"</table>"
76+
]
77+
},
78+
{
79+
"cell_type": "markdown",
80+
"metadata": {
81+
"id": "FAUbwFuJB3bw"
82+
},
83+
"source": [
84+
"In this example we will use [run_models](https://github.com/tensorflow/cloud/blob/690c3eee65dadee8af260a19341ff23f42f1f070/src/python/tensorflow_cloud/core/experimental/models.py#L30) from the experimental module of TF Cloud to train a ResNet model from [TF Model Garden](https://github.com/tensorflow/models/tree/master/official) on [imagenette from TFDS](https://www.tensorflow.org/datasets/catalog/imagenette)."
85+
]
86+
},
87+
{
88+
"cell_type": "markdown",
89+
"metadata": {
90+
"id": "EFCSAVDbC8-W"
91+
},
92+
"source": [
93+
"## Install Packages\n",
94+
"\n",
95+
"We need a development build of tensorflow-cloud installed from GitHub (the command below pins pull request 352), the official release of tf-models-official, and keras 2.6.0rc0 for compatibility."
96+
]
97+
},
98+
{
99+
"cell_type": "code",
100+
"metadata": {
101+
"id": "r4sSs1azu-Ti"
102+
},
103+
"source": [
104+
"!pip install -q 'git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python' tf-models-official keras==2.6.0rc0"
105+
],
106+
"execution_count": null,
107+
"outputs": []
108+
},
109+
{
110+
"cell_type": "markdown",
111+
"metadata": {
112+
"id": "N3NC5vrDslsf"
113+
},
114+
"source": [
115+
"## Import required modules"
116+
]
117+
},
118+
{
119+
"cell_type": "code",
120+
"metadata": {
121+
"id": "sdkgm_6PvHkk",
122+
"colab": {
123+
"base_uri": "https://localhost:8080/"
124+
},
125+
"outputId": "c17384b4-07f1-493c-edf8-5eadde79524f"
126+
},
127+
"source": [
128+
"import os\n",
129+
"import sys\n",
130+
"\n",
131+
"import tensorflow_cloud as tfc\n",
132+
"from tensorflow_cloud.core.experimental.models import run_models\n",
133+
"\n",
134+
"print(tfc.__version__)"
135+
],
136+
"execution_count": 2,
137+
"outputs": [
138+
{
139+
"output_type": "stream",
140+
"text": [
141+
"0.1.17.dev\n"
142+
],
143+
"name": "stdout"
144+
}
145+
]
146+
},
147+
{
148+
"cell_type": "markdown",
149+
"metadata": {
150+
"id": "Ka6MHtF-tTU1"
151+
},
152+
"source": [
153+
"## Project Configurations\n",
154+
"Set the project parameters. For more details on the Google Cloud specific parameters, refer to [Google Cloud Project Setup Instructions](https://www.kaggle.com/nitric/google-cloud-project-setup-instructions/)."
155+
]
156+
},
157+
{
158+
"cell_type": "code",
159+
"metadata": {
160+
"id": "OFPPSLF9vx4H"
161+
},
162+
"source": [
163+
"# Set Google Cloud Specific parameters\n",
164+
"\n",
165+
"# TODO: Please set GCP_PROJECT_ID to your own Google Cloud project ID.\n",
166+
"GCP_PROJECT_ID = 'YOUR_PROJECT_ID' #@param {type:\"string\"}\n",
167+
"\n",
168+
"# TODO: set GCS_BUCKET to your own Google Cloud Storage (GCS) bucket.\n",
169+
"GCS_BUCKET = 'YOUR_GCS_BUCKET_NAME' #@param {type:\"string\"}\n",
170+
"\n",
171+
"# DO NOT CHANGE: Currently only the 'us-central1' region is supported.\n",
172+
"REGION = 'us-central1'\n",
173+
"\n",
174+
"# OPTIONAL: You can change the job name to any string.\n",
175+
"JOB_NAME = 'run_models_demo' #@param {type:\"string\"}"
176+
],
177+
"execution_count": null,
178+
"outputs": []
179+
},
180+
{
181+
"cell_type": "markdown",
182+
"metadata": {
183+
"id": "F1_shlH4tUM5"
184+
},
185+
"source": [
186+
"## Authenticating the notebook to use your Google Cloud Project\n",
187+
"\n",
188+
"This code authenticates the notebook by checking for valid Google Cloud credentials and identity. It sits inside the `if not tfc.remote()` block so that it runs only in the notebook and is skipped when the notebook code is sent to Google Cloud.\n",
189+
"\n",
190+
"Note: For Kaggle Notebooks, click \"Add-ons\"->\"Google Cloud SDK\" before running the cell below."
191+
]
192+
},
193+
{
194+
"cell_type": "code",
195+
"metadata": {
196+
"id": "EeW7IHBgtPJD"
197+
},
198+
"source": [
199+
"if not tfc.remote():\n",
200+
"\n",
201+
"    # Authentication for Kaggle Notebooks\n",
202+
"    if \"kaggle_secrets\" in sys.modules:\n",
203+
"        from kaggle_secrets import UserSecretsClient\n",
204+
"        UserSecretsClient().set_gcloud_credentials(project=GCP_PROJECT_ID)\n",
205+
"\n",
206+
"    # Authentication for Colab Notebooks\n",
207+
"    if \"google.colab\" in sys.modules:\n",
208+
"        from google.colab import auth\n",
209+
"        auth.authenticate_user()\n",
210+
"        os.environ[\"GOOGLE_CLOUD_PROJECT\"] = GCP_PROJECT_ID"
211+
],
212+
"execution_count": null,
213+
"outputs": []
214+
},
215+
{
216+
"cell_type": "markdown",
217+
"metadata": {
218+
"id": "EQrVntO2twh1"
219+
},
220+
"source": [
221+
"## Set up TensorFlow Cloud run\n",
222+
"\n",
223+
"Set up parameters for `tfc.run()`. The `chief_config`, `worker_count` and `worker_config` are set up individually for each distribution strategy. For more details, refer to the [TensorFlow Cloud overview tutorial](https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb)."
224+
]
225+
},
226+
{
227+
"cell_type": "code",
228+
"metadata": {
229+
"id": "o539iLTKv9a3"
230+
},
231+
"source": [
232+
"with open('requirements.txt', 'w') as f:\n",
233+
"    f.write('git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python\\n'+\n",
234+
"            'tf-models-official\\n'+\n",
235+
"            'keras==2.6.0rc0')\n",
236+
"\n",
237+
"run_kwargs = dict(\n",
238+
"    requirements_txt='requirements.txt',\n",
239+
"    docker_config=tfc.DockerConfig(\n",
240+
"        parent_image=\"gcr.io/deeplearning-platform-release/tf2-gpu.2-5\",\n",
241+
"        image_build_bucket=GCS_BUCKET\n",
242+
"    ),\n",
243+
"    chief_config=tfc.COMMON_MACHINE_CONFIGS[\"P100_4X\"],\n",
244+
"    job_labels={'job': JOB_NAME}\n",
245+
")"
246+
],
247+
"execution_count": null,
248+
"outputs": []
249+
},
250+
{
251+
"cell_type": "markdown",
252+
"metadata": {
253+
"id": "hd4luG7nt3_0"
254+
},
255+
"source": [
256+
"## Run the training using run_models"
257+
]
258+
},
259+
{
260+
"cell_type": "code",
261+
"metadata": {
262+
"id": "_aVt71qpxHUe"
263+
},
264+
"source": [
265+
"values = run_models(\n",
266+
"    'imagenette',\n",
267+
"    'resnet',\n",
268+
"    GCS_BUCKET,\n",
269+
"    'train',\n",
270+
"    'validation',\n",
271+
"    **run_kwargs,\n",
272+
")"
273+
],
274+
"execution_count": null,
275+
"outputs": []
276+
},
277+
{
278+
"cell_type": "markdown",
279+
"metadata": {
280+
"id": "Ku7oBH8iuc2X"
281+
},
282+
"source": [
283+
"## Training Results\n",
284+
"\n",
285+
"### Reconnect your Colab instance\n",
286+
"\n",
287+
"Most remote training jobs are long-running. If you are using Colab, it may time out before the training results are available.\n",
288+
"\n",
289+
"In that case, **rerun the following sections in order** to reconnect and configure your Colab instance to access the training results.\n",
290+
"\n",
291+
"1. Import required modules\n",
292+
"2. Project Configurations\n",
293+
"3. Authenticating the notebook to use your Google Cloud Project\n",
294+
"\n",
295+
"**DO NOT** rerun the rest of the code.\n",
296+
"\n",
297+
"### Load TensorBoard\n",
298+
"While training is in progress, you can use TensorBoard to view the results. Results appear only after training has started, which may take a few minutes."
299+
]
300+
},
301+
{
302+
"cell_type": "code",
303+
"metadata": {
304+
"id": "rhVCh8x9upRY"
305+
},
306+
"source": [
307+
"if not tfc.remote():\n",
308+
"    %load_ext tensorboard\n",
309+
"    tensorboard_logs_dir = values['tensorboard_logs']\n",
310+
"    %tensorboard --logdir $tensorboard_logs_dir"
311+
],
312+
"execution_count": null,
313+
"outputs": []
314+
},
315+
{
316+
"cell_type": "markdown",
317+
"metadata": {
318+
"id": "kOU5Gu4Ku1Qc"
319+
},
320+
"source": [
321+
"### Load your trained model\n",
322+
"\n",
323+
"Once training is complete, you can retrieve your model from the GCS Bucket you specified above."
324+
]
325+
},
326+
{
327+
"cell_type": "code",
328+
"metadata": {
329+
"id": "rHoQnqKhu2Y8"
330+
},
331+
"source": [
332+
"import tensorflow as tf\n",
333+
"if not tfc.remote():\n",
334+
"    trained_model = tf.keras.models.load_model(values['saved_model'])\n",
335+
"    trained_model.summary()"
336+
],
337+
"execution_count": null,
338+
"outputs": []
339+
}
340+
]
341+
}
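
Note on the authentication cell: the notebook decides which credentials flow to use by checking whether `kaggle_secrets` or `google.colab` appears in `sys.modules`. The sketch below isolates that check so it can be exercised without either environment. The function name `detect_notebook_env` and its `loaded_modules` parameter are hypothetical and are not part of tensorflow-cloud. It also returns a single label, whereas the notebook evaluates the two checks independently, so treat it as an illustration of the pattern and not as the notebook's code.

```python
import sys


def detect_notebook_env(loaded_modules=None):
    """Return 'kaggle', 'colab', or 'local' based on which modules are loaded.

    loaded_modules is any mapping or collection of module names. It defaults
    to sys.modules, which is what the notebook cell inspects.
    """
    modules = sys.modules if loaded_modules is None else loaded_modules
    # Kaggle kernels load kaggle_secrets; checked first, as in the notebook.
    if "kaggle_secrets" in modules:
        return "kaggle"
    # Colab runtimes load the google.colab package.
    if "google.colab" in modules:
        return "colab"
    # Neither marker module is present.
    return "local"


if __name__ == "__main__":
    print(detect_notebook_env())
```

Passing a plain dict or set of module names in place of `sys.modules` makes the branch selection easy to check in isolation before wiring in the real `auth.authenticate_user()` or `UserSecretsClient` calls.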
