Issue 256 - move camsl to geo_point table #263

jo-asplin-met-no · 2025-06-11T15:32:15Z

…ests fail

lukas-phaf

I did a high-level review, focusing mostly on the protobuf and Postgres migrations. I have added some comments on these, and will try to get to the Go code later.

A general question is how we test this, as there is no ingestion and/or API support to see if inserting and querying actually works (unless I am missing something).

Ideally, we would have the ingestion and API in the same PR, so that we can add some camsl values to the test dataset and confirm that everything works by checking that the data shows up in the API, preferably using the camsl range query.

lukas-phaf · 2025-06-19T11:17:50Z

protobuf/datastore.proto

-  string processing_level = 9 [json_name = "processing_level"];
-  int64 quality_code = 10 [json_name = "quality_code"];
-  int64 camsl = 11; // centimeters above mean sea level
+  optional int32 camsl = 3; // centimeters above mean sea level


Are we okay with re-ordering all the indices (also the fields below)?
It makes the messages incompatible (the docs say you should never do this):
https://protobuf.dev/programming-guides/proto3/#consequences
We do tend to change all the components at once, but in the EWC setup they are on separate machines, so there might still be some issues.
I think it might actually lead to storing garbage.

Why not just leave camsl at 11, and the rest untouched? You can leave the order in the file as it is now; I don't think it is used for anything.
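To make the incompatibility concrete, here is a minimal sketch that uses only the Go standard library, not the real protobuf runtime. It hand-encodes one varint field the way the protobuf wire format does (a tag of `field_number << 3 | wire_type`, then the value) and shows that a reader only ever sees the field number, never the field name. The numbers 11 and 3 come from the diff above; the helper names and the value 1250 are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireVarint is the protobuf wire type used for int32/int64 fields.
const wireVarint = 0

// encodeVarintField writes a single varint field: tag first, then the value.
func encodeVarintField(fieldNumber int, value uint64) []byte {
	buf := binary.AppendUvarint(nil, uint64(fieldNumber)<<3|wireVarint)
	return binary.AppendUvarint(buf, value)
}

// decodeVarintField reads the field back and returns its number and value.
func decodeVarintField(b []byte) (fieldNumber int, value uint64) {
	tag, n := binary.Uvarint(b)
	value, _ = binary.Uvarint(b[n:])
	return int(tag >> 3), value
}

func main() {
	// A component still built against the old schema sends camsl as field 11.
	wire := encodeVarintField(11, 1250)

	num, val := decodeVarintField(wire)
	fmt.Printf("on the wire: field %d = %d\n", num, val)

	// A component built against the renumbered schema expects camsl in field 3,
	// so it does not find it; whichever field now owns number 11 gets the value.
	const newCamslField = 3
	fmt.Println("reader finds camsl:", num == newCamslField)
}
```

Keeping camsl at 11 and leaving the other numbers alone means old and new binaries agree on every tag, which is what matters when the components are deployed on separate machines.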

lukas-phaf · 2025-06-19T11:25:44Z

protobuf/datastore.proto

  //           '10', '1*', '*0', '*'.
  //
-  map<string, Strings> filter = 5;
+  map<string, Strings> filter = 6;


Avoid change in field number.

lukas-phaf · 2025-06-19T11:25:57Z

protobuf/datastore.proto

+  repeated string included_response_fields = 7;

-  // repeated string excluded_response_fields = 7; // TODO
+  // repeated string excluded_response_fields = 8; // TODO


Avoid change in field number.

lukas-phaf · 2025-06-19T11:26:48Z

protobuf/datastore.proto

+  string camsl_range = 4;
+  map<string, Strings> filter = 5;


Avoid change in field number.

lukas-phaf · 2025-06-19T11:46:57Z

datastore/migrate/data/migrations/1747651104_camsl.up.sql

@@ -1 +1,9 @@
-ALTER TABLE observation ADD COLUMN camsl BIGINT;
+ALTER TABLE geo_point ADD COLUMN camsl INTEGER;


If I start on master branch, load data, switch to your branch, and run the migrations (just up), nothing happens. This is because you reused an existing migration file, which according to the DB table schema_migrations was already applied.

Now we could just assume that production systems never ran the original version of the migration... but it is much safer to make a new one in which you drop the column in observation and add a new one to geo_point.

The only potential issue is then that any camsl data that was already there is ignored... but I don't think the ingest was changed to put anything in, so this is fine.
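The skip described above can be reproduced with a toy model of a version-tracking migrator. This is a sketch, not the code of the migration tool used in this repository: it assumes only that each file is identified by its numeric prefix and that versions already recorded in schema_migrations are not run again. The version 1747651104 is taken from the file name above; the version 1750000000 is a placeholder, and the SQL in the second migration is one possible shape of the new file, not something from the PR.

```go
package main

import "fmt"

// migration is a simplified stand-in for one *.up.sql file.
type migration struct {
	version int64
	sql     string
}

// pending returns the migrations whose version is not yet recorded as applied.
// The SQL text plays no role here: only the version number is compared.
func pending(applied map[int64]bool, files []migration) []migration {
	var out []migration
	for _, m := range files {
		if !applied[m.version] {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	// State of a database that was set up from master.
	applied := map[int64]bool{1747651104: true}

	// Editing the existing file in place: same version, new content.
	edited := []migration{
		{1747651104, "ALTER TABLE geo_point ADD COLUMN camsl INTEGER;"},
	}
	fmt.Println("edited in place, to run:", len(pending(applied, edited)))

	// Adding a new file instead (hypothetical version number).
	added := []migration{
		{1747651104, "ALTER TABLE observation ADD COLUMN camsl BIGINT;"},
		{1750000000, "ALTER TABLE observation DROP COLUMN camsl; " +
			"ALTER TABLE geo_point ADD COLUMN camsl INTEGER;"},
	}
	fmt.Println("new file added, to run:", len(pending(applied, added)))
}
```

With the in-place edit nothing is pending, which matches the "nothing happens" observation; with a separate file exactly one migration runs, on both fresh and already-migrated databases.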

lukas-phaf · 2025-06-19T12:14:36Z

datastore/migrate/data/migrations/1747651104_camsl.up.sql

+ALTER TABLE geo_point ADD COLUMN camsl INTEGER;
+
+-- drop UNIQUE constraint of 'point' column
+-- WARNING: we assume that the constraint name is the correct one (it was never explicitly set)


The original table definition was this:

CREATE TABLE geo_point (
    id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    point GEOGRAPHY(Point, 4326) NOT NULL UNIQUE
);
CREATE INDEX geo_point_idx ON geo_point USING GIST(point);

This gave the following constraint and indices:

After forcing the migration to your branch, I have the following constraint and indices:

This corresponds to the indices that you would get with the following table definition (without any migrations):

CREATE TABLE geo_point (
    id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    point GEOGRAPHY(Point, 4326),
    camsl INTEGER,
    CONSTRAINT geo_point_point_camsl_key UNIQUE NULLS NOT DISTINCT (point, camsl)
);
CREATE INDEX geo_point_idx ON geo_point USING GIST(point);

So I guess the question is, do we want to have a GIST index that includes camsl? This depends on the query, which I haven't looked at.

lukas-phaf · 2025-06-19T12:17:38Z

datastore/datastore/storagebackend/postgresql/putobservations.examples.sql

The upserting 2 points example should also change, correct?

lukas-phaf · 2025-06-19T12:21:15Z

datastore/datastore/storagebackend/postgresql/putobservations.go

-	lat float64
+	lon   float64
+	lat   float64
+	camsl *int32


Why a pointer? So we can use nil for not set/NULL?
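If that is the intent, the distinction a pointer buys can be sketched as follows. This is an illustration, not code from the PR: `toNullInt32` is a hypothetical helper, while `sql.NullInt32` is the standard library type whose `Valid: false` state is sent to the database as NULL. A plain `int32` could not tell "no altitude given" apart from "0 cm above mean sea level".

```go
package main

import (
	"database/sql"
	"fmt"
)

// toNullInt32 converts an optional value into the form database/sql
// passes to the driver: a nil pointer becomes SQL NULL (Valid == false).
func toNullInt32(v *int32) sql.NullInt32 {
	if v == nil {
		return sql.NullInt32{}
	}
	return sql.NullInt32{Int32: *v, Valid: true}
}

func main() {
	var unset *int32 // camsl not provided
	zero := int32(0) // camsl provided, exactly at mean sea level

	fmt.Println("unset is NULL:", !toNullInt32(unset).Valid)
	fmt.Println("zero is NULL: ", !toNullInt32(&zero).Valid)
}
```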

lukas-phaf

Note: just load also needs fixing. It gives concurrency errors...

lukas-phaf · 2025-06-19T15:21:15Z

datastore/datastore/storagebackend/postgresql/putobservations.go

+	SELECT c.id, point, camsl FROM input_rows
+	JOIN geo_point c USING (point, camsl)


Suggested change

-	SELECT c.id, point, camsl FROM input_rows
-	JOIN geo_point c USING (point, camsl)
+	SELECT c.id, c.point, c.camsl FROM input_rows i
+	JOIN geo_point c ON i.camsl IS NOT DISTINCT FROM c.camsl AND i.point=c.point; -- Workaround for JOIN on NULL's

Looks like this fixes the "concurrency" issue on the point table during data load, and the integration test.
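The reason the two joins differ is that USING (point, camsl) compares each column with `=`, and in SQL `NULL = NULL` does not evaluate to true, so rows without a camsl never find their partner in geo_point. A minimal Go model of the two comparisons, with nil standing in for NULL (the function names are made up for this sketch):

```go
package main

import "fmt"

// sqlEquals mimics "a = b" used as a join condition: if either side is
// NULL (nil) the comparison is not true, so the join produces no match.
func sqlEquals(a, b *int32) bool {
	if a == nil || b == nil {
		return false
	}
	return *a == *b
}

// notDistinct mimics "a IS NOT DISTINCT FROM b": two NULLs count as equal.
func notDistinct(a, b *int32) bool {
	if a == nil || b == nil {
		return a == nil && b == nil
	}
	return *a == *b
}

func main() {
	h := int32(1250)

	fmt.Println("NULL vs NULL:", sqlEquals(nil, nil), notDistinct(nil, nil))
	fmt.Println("1250 vs 1250:", sqlEquals(&h, &h), notDistinct(&h, &h))
	fmt.Println("1250 vs NULL:", sqlEquals(&h, nil), notDistinct(&h, nil))
}
```

Only the first case differs between the two functions, and it is exactly the case of a point stored without camsl.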

Initial attempt at moving camsl to geo_point table, but integration t…

08a7d57

…ests fail

jo-asplin-met-no self-assigned this Jun 11, 2025

jo-asplin-met-no added 2 commits June 11, 2025 20:49

Removed unused function

46d45cc

Fixed bug and typo

1056284

jo-asplin-met-no requested review from lukas-phaf and removed request for lukas-phaf June 11, 2025 20:02

lukas-phaf reviewed Jun 19, 2025
