Conversation

@erlendvollset
Collaborator

  • Add support for streams
  • Add basic support for records
  • Raise FeaturePreviewWarning on all invocations of stream/record methods
  • Use new warn_on_all_method_invocations decorator across the board (see the sketch below)
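
A minimal sketch of what such a decorator could look like, purely for illustration: the SDK's real warn_on_all_method_invocations and FeaturePreviewWarning may differ, so a plain UserWarning subclass stands in here.

import functools
import warnings


class FeaturePreviewWarning(UserWarning):
    # Stand-in for the SDK's warning class, which carries maturity metadata.
    pass


def warn_on_all_method_invocations(warning):
    # Class decorator: wrap every public method so each call emits `warning`.
    def decorate(cls):
        for name, attr in list(vars(cls).items()):
            if callable(attr) and not name.startswith("_"):
                setattr(cls, name, _warn_first(attr, warning))
        return cls
    return decorate


def _warn_first(fn, warning):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        warnings.warn(warning, stacklevel=2)
        return fn(*args, **kwargs)
    return wrapper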

Description

Please describe the change you have made.

Checklist:

  • Tests added/updated.
  • Documentation updated. Documentation is generated from docstrings - these must be updated according to your change.
    If a new method has been added it should be referenced in cognite.rst in order to generate docs based on its docstring.
  • Changelog updated in CHANGELOG.md.
  • Version bumped. If triggering a new release is desired, bump the version number in _version.py and pyproject.toml per semantic versioning.


@erlendvollset force-pushed the streams-and-records branch 2 times, most recently from 344b827 to 6cfb164 on August 12, 2025 14:58
@codecov

codecov bot commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 80.93842% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.67%. Comparing base (67b9b76) to head (bcd743f).

Files with missing lines | Patch % | Lines
cognite/client/_api/data_modeling/records.py | 41.30% | 27 Missing ⚠️
...gnite/client/data_classes/data_modeling/records.py | 85.36% | 12 Missing ⚠️
...ite/client/data_classes/data_modeling/instances.py | 76.31% | 9 Missing ⚠️
cognite/client/_api/data_modeling/streams.py | 80.55% | 7 Missing ⚠️
cognite/client/data_classes/data_modeling/ids.py | 54.54% | 5 Missing ⚠️
...gnite/client/data_classes/data_modeling/streams.py | 93.15% | 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2246      +/-   ##
==========================================
- Coverage   90.82%   90.67%   -0.16%     
==========================================
  Files         170      174       +4     
  Lines       25666    25898     +232     
==========================================
+ Hits        23312    23483     +171     
- Misses       2354     2415      +61     
Files with missing lines | Coverage Δ
cognite/client/_api/agents/agents.py | 100.00% <100.00%> (ø)
cognite/client/_api/data_modeling/__init__.py | 100.00% <100.00%> (ø)
...nite/client/_api/hosted_extractors/destinations.py | 94.44% <100.00%> (+1.11%) ⬆️
cognite/client/_api/hosted_extractors/jobs.py | 88.57% <100.00%> (+0.10%) ⬆️
cognite/client/_api/hosted_extractors/mappings.py | 96.00% <100.00%> (+1.35%) ⬆️
cognite/client/_api/hosted_extractors/sources.py | 95.08% <100.00%> (+1.05%) ⬆️
cognite/client/_api/simulators/__init__.py | 100.00% <100.00%> (ø)
cognite/client/_api/simulators/integrations.py | 96.66% <100.00%> (-0.11%) ⬇️
cognite/client/_api/simulators/logs.py | 100.00% <100.00%> (ø)
cognite/client/_api/simulators/models.py | 98.27% <100.00%> (-0.06%) ⬇️
... and 14 more

... and 4 files with indirect coverage changes


@erlendvollset force-pushed the streams-and-records branch 2 times, most recently from de116c6 to 4eb65aa on August 14, 2025 10:32
@erlendvollset marked this pull request as ready for review August 14, 2025 10:32
@erlendvollset requested review from a team as code owners August 14, 2025 10:32
@evertoncolling changed the title from "Alpha support for streams and records" to "feat(records): alpha support for streams and records" on Oct 2, 2025
resource_cls=Stream,
method="GET",
chunk_size=chunk_size,
limit=limit,


Note: we do not have a limit parameter on listStream at the moment.

https://api-docs.cogheim.net/redoc/#tag/Streams/operation/listStream

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.data_modeling.streams.delete(streams=["myStream", "myOtherStream"])
"""


Note: the Streams API deviates from the Cognite API standard for delete (a bug has been opened on this). The standard is a POST with the items to delete in a list; here we instead have to issue a DELETE request.
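
To make the deviation concrete, a hedged sketch of the two request shapes (the paths, payload, and auth below are assumptions based on the convention, not taken from this PR):

import requests

BASE = "https://api.cognitedata.com/api/v1/projects/my-project"
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder credentials

# Cognite API convention: POST the items to delete to a .../delete route.
requests.post(
    f"{BASE}/models/spaces/delete",
    headers=HEADERS,
    json={"items": [{"space": "mySpace"}]},
)

# Streams API today, per this thread: a plain DELETE on the resource path.
requests.delete(f"{BASE}/streams/myStream", headers=HEADERS)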


Also note: we only accept one item in the list of streams to be deleted or created.

self,
stream: str,
*,
last_updated_time: LastUpdatedRange | None = None,

@asosnovski commented Oct 6, 2025


Food for thought: this parameter is required for immutable streams. Should we have a default value for it, or should we require an explicit None for mutable streams, so that SDK users have to think about which kind of stream they are querying?
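
One way to force that explicit choice, sketched with a sentinel default (names here are hypothetical, not the PR's actual API):

from typing import Any

_UNSET: Any = object()  # distinguishes "argument not passed" from "passed None"


def retrieve_records(stream: str, *, last_updated_time: Any = _UNSET):
    if last_updated_time is _UNSET:
        raise TypeError(
            "last_updated_time is required: pass a range for immutable streams, "
            "or an explicit None for mutable streams"
        )
    ...  # perform the query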

@warn_on_all_method_invocations(
    FeaturePreviewWarning(api_maturity="alpha", sdk_maturity="alpha", feature_name="Records API")
)
class RecordsAPI(APIClient):


What about the aggregate endpoint?

Fetches streams as they are iterated over, so you keep a limited number of streams in memory.

Args:
chunk_size (int | None): Number of streams to return in each chunk. Defaults to yielding one stream at a time.

@asosnovski commented Oct 6, 2025


I don't fully understand what this thing does, so maybe my worries are groundless. But the streams endpoint will have pretty strict rate/concurrency limits. Why use a chunk size of 1 when, by default, a project can have no more than 10 streams? Maybe set something like 20 to avoid unnecessary API calls?
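
For context, a usage sketch of the iterator pattern the docstring describes, assuming the SDK's usual __call__/__iter__ surface for API classes:

from cognite.client import CogniteClient

client = CogniteClient()

# Default: one Stream object per iteration (chunk_size=None).
for stream in client.data_modeling.streams:
    print(stream.external_id)

# With chunk_size set, each iteration yields a list of up to 20 streams,
# which is the reviewer's suggestion for reducing the number of API calls.
for chunk in client.data_modeling.streams(chunk_size=20):
    print(len(chunk))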


Get multiple streams by id:

>>> res = client.data_modeling.streams.retrieve(streams=["MyStream", "MyAwesomeStream", "MyOtherStream"])


This is concerning. How many IDs can be passed this way? Our endpoint allows retrieving only 1 stream by ID, so this results in multiple API invocations, right? And stream endpoints have very strict limits: https://cognitedata.atlassian.net/wiki/spaces/RPILA/pages/4797431945/Design+ILA+rate+and+concurrency+limits#Stream-endpoint-limits. Invoking this method with >5 IDs basically guarantees throttling.
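
If the multi-ID form is kept, a client-side pacing sketch illustrates the fan-out cost (max_rps is an assumed budget, and the single-ID call follows the retrieve signature quoted later in this review):

import time


def retrieve_streams_paced(client, external_ids, max_rps=1.0):
    # One underlying request per ID, spaced out to stay under the rate limit.
    results = []
    for xid in external_ids:
        results.append(client.data_modeling.streams.retrieve(external_id=xid))
        time.sleep(1.0 / max_rps)
    return results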

"""
return self()

def retrieve(self, external_id: str) -> Stream | None:

@asosnovski commented Oct 6, 2025


What about the includeStatistics parameter? NB: statistics calculation is potentially expensive, which is why we have it in the get-stream endpoint but not in list streams. Allowing retrieval of multiple streams by ID with one method invocation kind of circumvents this.
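
A hypothetical shape for that suggestion, exposing the flag only on single-stream retrieve (the include_statistics parameter name is guessed from the API's includeStatistics, not taken from this PR):

from typing import Optional


class StreamsAPI:
    def retrieve(self, external_id: str, include_statistics: bool = False) -> Optional["Stream"]:
        # Offered only here, not on list(), because computing statistics
        # is potentially expensive, as noted above.
        ...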


>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.data_modeling.streams.delete(streams=["myStream", "myOtherStream"])


Food for thought. Streams are intended to be long-lived, and customers should think twice before creating or deleting them. A deleted stream will stay for up to 6 weeks in a soft-deleted state, all this time consuming capacity, incurring costs and preventing another stream with the same name from being created. Should we really allow customers to delete multiple streams with 1 request?

"""`List streams <https://developer.cognite.com/api#tag/Streams/operation/listStreamsV3>`_

Args:
limit (int | None): Maximum number of streams to return. Defaults to 10. Set to -1, float("inf") or None to return all items.


As mentioned by Andreas, there is no limit. The endpoint returns all the streams there are, as a project isn't supposed to have many.

@overload
def create(self, streams: StreamWrite) -> Stream: ...

def create(self, streams: StreamWrite | Sequence[StreamWrite]) -> Stream | StreamList:


Food for thought. Streams are intended to be long-lived, and customers should think twice before creating or deleting them. A deleted stream will stay for up to 6 weeks in a soft-deleted state, all this time consuming capacity, incurring costs, and preventing another stream with the same name from being created. Should we really allow customers to create multiple streams with 1 request? Especially considering that the endpoint will have only a 1 rps limit, so calling this method with multiple stream names would automatically mean throttling.
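
A sketch of the single-item alternative implied above (hypothetical; the PR itself keeps the bulk overloads quoted):

class StreamsAPI:
    def create(self, stream: "StreamWrite") -> "Stream":
        # Exactly one stream per call: with a 1 rps limit on the endpoint,
        # a bulk form would only serialize into throttled requests anyway.
        ...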

