Description
Our company recently noticed errors in some of our deployed cloud functions when loading GCP modules (such as the AlloyDB connector). Although these modules are installed with our dependencies, and the dependencies directory is added to the Python path, we get a ModuleNotFoundError when trying to import them. We have been working around this by adding the following code before the import, which seems to resolve the issue but is admittedly very hacky:
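import logging
# Note: pip._internal is not a public API; this is only a workaround.
from pip._internal.operations import freeze

logger = logging.getLogger(__name__)
# Listing the installed packages before the failing import seems to make it succeed.
pkgs = freeze.freeze()
logger.info('PIP FREEZE', extra={'packages': [str(pkg) for pkg in pkgs]})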
When we build the cloud functions, we install our dependencies using pip install ... -t .dependencies and then deploy the cloud function using a command of the following form:
gcloud functions deploy amp_vendor_cf \
--source /path/to/amp_vendor_cf \
--runtime python312 \
--project [PROJECT_ID_REDACTED] \
--service-account [SERVICE_ACCOUNT_EMAIL_REDACTED] \
--trigger-bucket [TRIGGER_BUCKET_REDACTED] \
--max-instances 50 \
--entry-point cf_main \
--vpc-connector [VPC_CONNECTOR_REDACTED] \
--region us-east1 \
--timeout 540s \
--egress-settings private-ranges-only \
--ingress-settings internal-only \
--memory 2048MB \
--set-build-env-vars=VERSION=[VERSION_REDACTED] \
--set-env-vars ...PYTHONPATH=.dependencies... \
--set-secrets DD_API_KEY=[SECRET_NAME_REDACTED] \
--retry
We then get the following errors:
.................................failed.
12:18:32 ERROR: (gcloud.functions.deploy) OperationError: code=3, message=Gen1 operation for function projects/z0r0-dev-service/locations/us-east1/functions/amp_vendor_cf failed: Function failed on loading user code. This is likely due to a bug in the user code. Error message: Traceback (most recent call last):
12:18:32 File "/layers/google.python.pip/pip/bin/functions-framework", line 8, in <module>
12:18:32 sys.exit(_cli())
12:18:32 ^^^^^^
12:18:32 File "/workspace/.dependencies/click/core.py", line 1442, in __call__
12:18:32 return self.main(*args, **kwargs)
12:18:32 ^^^^^^^^^^^^^^^^^^^^^^^^^^
12:18:32 File "/workspace/.dependencies/click/core.py", line 1363, in main
12:18:32 rv = self.invoke(ctx)
12:18:32 ^^^^^^^^^^^^^^^^
12:18:32 File "/workspace/.dependencies/click/core.py", line 1226, in invoke
12:18:32 return ctx.invoke(self.callback, **ctx.params)
12:18:32 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12:18:32 File "/workspace/.dependencies/click/core.py", line 794, in invoke
12:18:32 return callback(*args, **kwargs)
12:18:32 ^^^^^^^^^^^^^^^^^^^^^^^^^
12:18:32 File "/workspace/.dependencies/functions_framework/_cli.py", line 36, in _cli
12:18:32 app = create_app(target, source, signature_type)
12:18:32 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12:18:32 File "/workspace/.dependencies/functions_framework/__init__.py", line 395, in create_app
12:18:32 raise e from None
12:18:32 File "/workspace/.dependencies/functions_framework/__init__.py", line 376, in create_app
12:18:32 spec.loader.exec_module(source_module)
12:18:32 File "<frozen importlib._bootstrap_external>", line 999, in exec_module
12:18:32 File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
12:18:32 File "/workspace/main.py", line 14, in <module>
12:18:32 from alloydb import AlloyDBConnection, batch_upsert_vendor
12:18:32 File "/workspace/alloydb.py", line 7, in <module>
12:18:32 from google.cloud.alloydb.connector import Connector
12:18:32 ModuleNotFoundError: No module named 'google.cloud.alloydb'
12:18:32 failed to submit span stats to the Datadog agent at http://localhost:8126/v0.6/stats
12:18:32 . Please visit https://cloud.google.com/functions/docs/troubleshooting for in-depth troubleshooting documentation..
12:18:32 [Pipeline] }
12:18:32 Executing sh script inside container cloud-functions-builder-dev of pod cloud-functions-builder-dev-rlzwk-jzrvt
12:18:33 Executing command: "ssh-agent" "-k"
12:18:33 exit
12:18:33 unset SSH_AUTH_SOCK;
12:18:33 unset SSH_AGENT_PID;
12:18:33 echo Agent pid 14162 killed;
12:18:33 [ssh-agent] Stopped.
After some digging, I found that we had not experienced this error until recently, and that there was a 3.8.3 release of the framework in mid-May. Pinning the version to 3.8.2 resolved the problem. My guess is that this is a problem with how dependencies are managed between the framework runner and the application code: somehow the framework runner doesn't know about the dependencies that the application code has, and the inline pip freeze helps resolve this.
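To help narrow down the guess above, here is a minimal diagnostic sketch that could be placed before the failing import. It reports where each level of the google.cloud.alloydb package path resolves from, using only the standard library. The helper name describe_package is hypothetical, and the assumption that the problem lies in how the google / google.cloud package paths are resolved is unconfirmed.

```python
import importlib.util
import sys


def describe_package(name):
    """Report where `name` resolves from, without raising if it is missing.

    Hypothetical helper for debugging only; it is not part of any library.
    """
    missing = {"name": name, "found": False, "origin": None, "locations": []}
    try:
        # For dotted names, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # A parent package (e.g. `google`) could not be imported at all.
        return missing
    if spec is None:
        return missing
    return {
        "name": name,
        "found": True,
        # origin is None for namespace packages.
        "origin": spec.origin,
        # Directories searched for submodules of this package.
        "locations": list(spec.submodule_search_locations or []),
    }


if __name__ == "__main__":
    for pkg in ("google", "google.cloud", "google.cloud.alloydb"):
        print(describe_package(pkg))
    # Assumption: dependencies were installed with `pip install ... -t .dependencies`.
    print([p for p in sys.path if ".dependencies" in p])
```

If the locations reported for google.cloud list only the framework's own install directory and not .dependencies, that would be consistent with the runner and the application code seeing different sets of packages. Comparing the output with and without the pip freeze workaround might show what the workaround changes.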
For what it's worth, I encountered this problem only after deploying to the cloud and not during local testing. I am also unsure whether this problem arises because we deploy our actual code rather than pointing the function to a Docker image.