
Commit d92623e

Authored by bmenasha and agold-rh
Address issue #1373 and issue #1374 and other performance/bug fixes (#1394)
Issue #1374: Use the latest Dataflow SDK version.

Issue #1373: Unable to handle new cloudbuild.googleapis.com/Build assets. The core issue was that the discovery_name of this new asset type is incorrectly reported as cloudbuild.googleapis.com/Build rather than 'Build'. Work around this by correcting any discovery_name containing a '/'. Other fixes were also necessary to speed up processing.

Other performance/bug fixes:
- Prefer the discovery-document-generated schema, when we have one, over any resource-generated one. This is a big performance improvement, since determining the schema from the resource is time consuming; it is also unproductive, because an API resource schema should always match the resource JSON anyway.
- Add ancestors, update_time, location, json_data to the discovery-generated schema. This prevents those properties from being dropped when we rely on it exclusively.
- Sanitize discovery-document-generated schemas. If we are to rely on them exclusively, they could be invalid, so enforce the BigQuery rules on them as well.
- Use copy.deepcopy less: only when we copy a source into a destination field.
- Prevent BigQuery columns with BQ_FORBIDDEN_PREFIXES from being created. Some Bigtable resources can include these prefixes.
- Some BigQuery model resources had NaN and Infinity values for numeric fields. Handle those during sanitization.
- When merging schemas, stop after we have BQ_MAX_COLUMNS fields. This ends the merge process earlier. (It can take forever when there are many unique fields and many elements.)
- When enforcing a schema on a resource, recognize when we are handling additional properties and add the additional-property fields to the value of the additional-property key/value list in push_down_additional_properties. This produces more regular schemas.
- Add ignore_unknown_values to the load job so that it does not fail if a resource contains fields not present in the schema.
- Accept and pass --add-load-date-suffix via main.py.
- Better naming of some local variables for readability.
- Some format changes suggested by IntelliJ.

Co-authored-by: Andrew Gold <[email protected]>
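The NaN/Infinity and forbidden-prefix fixes described above can be sketched roughly as follows. This is a minimal illustration, not the actual code: the constant value and the function names (`sanitize_property_value`, `sanitize_resource`) are assumptions, and the real logic in bigquery_schema.py recurses into nested records.

```python
import math

# Illustrative placeholder; the real prefix list lives in
# asset_inventory/bigquery_schema.py.
BQ_FORBIDDEN_PREFIXES = ('_table_', '_file_', '_partition')


def sanitize_property_value(value):
    """Coerce values BigQuery cannot load (NaN/Infinity) to None."""
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return None
    return value


def sanitize_resource(resource):
    """Drop columns with forbidden prefixes and scrub numeric values."""
    return {
        key: sanitize_property_value(val)
        for key, val in resource.items()
        if not key.startswith(BQ_FORBIDDEN_PREFIXES)
    }
```

For example, `sanitize_resource({'_table_x': 1, 'score': float('nan')})` keeps only `score`, with its value replaced by None.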
1 parent f67dd45 commit d92623e

13 files changed (+812, -315 lines)

tools/asset-inventory/asset_inventory/api_schema.py

Lines changed: 156 additions & 58 deletions
Large diffs are not rendered by default.
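The diff for this file is not rendered, but the discovery_name correction described in the commit message (issue #1373) might look roughly like this sketch. The function name is illustrative, not taken from the source:

```python
def fix_discovery_name(discovery_name):
    """Work around asset types whose discovery_name is reported as
    'service/Name' (e.g. cloudbuild.googleapis.com/Build) instead of
    just the bare name ('Build'): keep only the part after the '/'.
    """
    if '/' in discovery_name:
        return discovery_name.split('/')[-1]
    return discovery_name
```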

tools/asset-inventory/asset_inventory/bigquery_schema.py

Lines changed: 248 additions & 117 deletions
Large diffs are not rendered by default.
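The early-stop during schema merging described in the commit message can be sketched as below. This is illustrative only: the real merge in bigquery_schema.py is recursive and also merges record sub-fields, while this sketch just dedupes top-level fields by name and stops at the column cap.

```python
BQ_MAX_COLUMNS = 10000  # BigQuery's documented per-table column limit


def merge_schemas(schemas):
    """Merge lists of field dicts, stopping once BQ_MAX_COLUMNS unique
    field names have been collected so large merges terminate early."""
    merged = {}
    for schema in schemas:
        for field in schema:
            if field['name'] not in merged:
                merged[field['name']] = field
            if len(merged) >= BQ_MAX_COLUMNS:
                return list(merged.values())
    return list(merged.values())
```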

tools/asset-inventory/asset_inventory/export.py

Lines changed: 13 additions & 11 deletions
@@ -51,7 +51,7 @@ def export_to_gcs(parent, gcs_destination, content_type, asset_types):
 
     Invoke either the cloudasset.organizations.exportAssets or
     cloudasset.projects.exportAssets method depending on if parent is a project
-    or orgniaztion.
+    or organization.
     Args:
         parent: Either `project/<project-id>` or `organization/<organization#>`.
         gcs_destination: GCS uri to export to.
@@ -65,10 +65,10 @@ def export_to_gcs(parent, gcs_destination, content_type, asset_types):
     output_config = asset_v1.types.OutputConfig()
     output_config.gcs_destination.uri = gcs_destination
     operation = Clients.cloudasset().export_assets(
-        parent,
-        output_config,
-        content_type=content_type,
-        asset_types=asset_types)
+        {'parent': parent,
+         'output_config': output_config,
+         'content_type': content_type,
+         'asset_types': asset_types})
     return operation.result()
 
 
@@ -128,17 +128,19 @@ def add_argparse_args(ap, required=False):
         'This MUST be run with a service account owned by a project with the '
         'Cloud Asset API enabled. The gcloud generated user credentials'
         ' do not work. This requires:\n\n'
-        ' 1. Enable the Cloud Asset Inventory API on a project (https://console.cloud.google.com/apis/api/cloudasset.googleapis.com/overview)\n'
-        ' 2. Create a service acocunt owned by this project\n'
+        '1. Enable the Cloud Asset Inventory API on a project ('
+        'https://console.cloud.google.com/apis/api/cloudasset.googleapis.com/overview)\n'
+        ' 2. Create a service account owned by this project\n'
         ' 3. Give the service account roles/cloudasset.viewer at the organization layer\n'
         ' 4. Run on a GCE instance started with this service account,\n'
-        ' or downloadthe private key and set GOOGLE_APPLICATION_CREDENTIALS to the file name\n'
+        ' or download the private key and set GOOGLE_APPLICATION_CREDENTIALS to the file name\n'
         ' 5. Run this command.\n\n'
         'If the GCS bucket being written to is owned by a different project then'
         ' the project that you enabled the API on, then you must also grant the'
         ' "service-<project-id>@gcp-sa-cloudasset.iam.gserviceaccount.com" account'
-        ' objectAdmin privleges to the bucket:\n'
-        ' gsutil iam ch serviceAccount:service-<project-id>@gcp-sa-cloudasset.iam.gserviceaccount.com:objectAdmin gs://<bucket>\n'
+        ' objectAdmin privileges to the bucket:\n'
+        'gsutil iam ch serviceAccount:service-<project-id>@gcp-sa-cloudasset.iam.gserviceaccount.com:objectAdmin '
+        'gs://<bucket>\n'
         '\n\n')
     ap.add_argument(
         '--parent',
@@ -172,7 +174,7 @@ def content_types_argument(string):
 
     ap.add_argument(
         '--asset-types',
-        help=('Comma seprated list of asset types to export such as '
+        help=('Comma separated list of asset types to export such as '
               '"google.compute.Firewall,google.compute.HealthCheck"'
               ' default is `*` for everything'),
         type=lambda x: [y.strip() for y in x.split(',')],
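The export_assets change in this file moves the positional arguments into a single request mapping. Assembling that mapping can be sketched as below; this is illustrative only, with output_config shown as a plain dict (the real code builds an asset_v1.types.OutputConfig object), and the `build_export_request` name and `['*']` default are assumptions.

```python
def build_export_request(parent, gcs_uri, content_type=None, asset_types=None):
    """Build the single request dict passed to export_assets().

    output_config is shown as a plain dict for illustration; the real
    code uses asset_v1.types.OutputConfig.
    """
    return {
        'parent': parent,
        'output_config': {'gcs_destination': {'uri': gcs_uri}},
        'content_type': content_type,
        'asset_types': asset_types or ['*'],
    }
```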

0 commit comments