feat(records): alpha support for streams and records #2246
base: master
Conversation
Codecov Report

❌ Patch coverage is

@@           Coverage Diff            @@
##           master   #2246    +/-   ##
=========================================
- Coverage   90.82%   90.67%   -0.16%
=========================================
  Files         170      174       +4
  Lines       25666    25898     +232
=========================================
+ Hits        23312    23483     +171
- Misses       2354     2415      +61
We should consider making this a blacklist instead of a whitelist; it would be much easier to maintain.
Only run them locally for now against the erlend-test project, where records & streams are available.
Just reuse SourceData (previously NodeOrEdgeData) instead.
resource_cls=Stream,
method="GET",
chunk_size=chunk_size,
limit=limit,
Note: we do not have a limit parameter on listStream at the moment.
https://api-docs.cogheim.net/redoc/#tag/Streams/operation/listStream
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.data_modeling.streams.delete(streams=["myStream", "myOtherStream"])
"""
Note: the Streams API deviates from the Cognite API standard on delete (a bug has been opened on this). The standard is to POST with an items list; instead, we issue a DELETE request.
By the way, we only accept one item in the list of streams to be deleted or created.
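If the endpoint only accepts a single item per request, a thin client-side wrapper can still expose a list-based interface by issuing one request per stream. A minimal sketch; the `delete_one` callable stands in for the real API call, and all names are illustrative, not the SDK's actual internals:

```python
from collections.abc import Callable, Sequence


def delete_streams_one_by_one(
    external_ids: Sequence[str],
    delete_one: Callable[[str], None],
) -> list[str]:
    """Delete each stream individually; return the ids that were deleted."""
    deleted: list[str] = []
    for external_id in external_ids:
        delete_one(external_id)  # one DELETE request per stream
        deleted.append(external_id)
    return deleted
```

This keeps the user-facing signature list-shaped while respecting the one-item-per-request constraint on the server side.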
self,
stream: str,
*,
last_updated_time: LastUpdatedRange | None = None,
Food for thought. This parameter is required for immutable streams. Should we have a default value for it, or should we require an explicit None for mutable streams, so that SDK users have to think about which kind of stream they are querying?
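One way to force callers to make that decision is a module-level sentinel default that rejects calls where the argument was omitted. A sketch of the idea, with illustrative names only (not the actual SDK signature):

```python
from typing import Any

# Sentinel distinguishing "argument omitted" from an explicit None.
_NOT_SET: Any = object()


def filter_records(stream: str, *, last_updated_time: Any = _NOT_SET) -> dict:
    """Build a hypothetical request body, requiring an explicit last_updated_time."""
    if last_updated_time is _NOT_SET:
        raise TypeError(
            "last_updated_time must be given explicitly; pass None for mutable streams"
        )
    body: dict = {"stream": stream}
    if last_updated_time is not None:
        body["lastUpdatedTime"] = last_updated_time
    return body
```

With this pattern, `filter_records("s")` fails loudly, while `filter_records("s", last_updated_time=None)` documents the caller's intent for mutable streams.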
@warn_on_all_method_invocations(
    FeaturePreviewWarning(api_maturity="alpha", sdk_maturity="alpha", feature_name="Records API")
)
class RecordsAPI(APIClient):
What about the aggregate endpoint?
Fetches streams as they are iterated over, so you keep a limited number of streams in memory.

Args:
    chunk_size (int | None): Number of streams to return in each chunk. Defaults to yielding one stream at a time.
I don't fully understand what this thing does, so maybe my worries are groundless. But the streams endpoint will have pretty strict rate/concurrency limits. Why have a chunk size of 1 when, by default, a project can have no more than 10 streams? Maybe set something like 20 to avoid unnecessary API calls?
Get multiple streams by id:

>>> res = client.data_modeling.streams.retrieve(streams=["MyStream", "MyAwesomeStream", "MyOtherStream"])
This is concerning. How many IDs can be passed this way? Our endpoint allows retrieving only 1 stream by ID, so this results in multiple API invocations, right? And stream endpoints have very strict limits: https://cognitedata.atlassian.net/wiki/spaces/RPILA/pages/4797431945/Design+ILA+rate+and+concurrency+limits#Stream-endpoint-limits Invoking this method with >5 IDs basically guarantees throttling.
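If multiple IDs do fan out into one request per stream, the client could at least back off when the server throttles. A self-contained sketch with exponential backoff, where `Throttled` stands in for an HTTP 429 response and `fetch` for the real retrieve call (all names hypothetical; `sleep` is injected so the logic is testable):

```python
import time
from collections.abc import Callable


class Throttled(Exception):
    """Stand-in for an HTTP 429 (rate-limited) response."""


def retrieve_with_backoff(
    ids: list[str],
    fetch: Callable[[str], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_retries: int = 5,
) -> list[dict]:
    """Fetch each id sequentially, doubling the wait after every throttle."""
    results: list[dict] = []
    for external_id in ids:
        delay = 0.5
        for attempt in range(max_retries + 1):
            try:
                results.append(fetch(external_id))
                break
            except Throttled:
                if attempt == max_retries:
                    raise
                sleep(delay)
                delay *= 2  # exponential backoff before retrying
    return results
```

Sequential fetching with backoff does not remove the throttling risk, but it degrades gracefully instead of failing a batch of >5 IDs outright.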
| """ | ||
| return self() | ||
|
|
||
| def retrieve(self, external_id: str) -> Stream | None: |
What about the includeStatistics parameter? NB! Statistics calculation is potentially expensive; that's why we have it in the get-stream endpoint but not in list streams. Allowing retrieval of multiple streams by ID with one method invocation kind of circumvents this.
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.data_modeling.streams.delete(streams=["myStream", "myOtherStream"])
Food for thought. Streams are intended to be long-lived, and customers should think twice before creating or deleting them. A deleted stream will stay for up to 6 weeks in a soft-deleted state, all this time consuming capacity, incurring costs and preventing another stream with the same name from being created. Should we really allow customers to delete multiple streams with 1 request?
| """`List streams <https://developer.cognite.com/api#tag/Streams/operation/listStreamsV3>`_ | ||
|
|
||
| Args: | ||
| limit (int | None): Maximum number of streams to return. Defaults to 10. Set to -1, float("inf") or None to return all items. |
As mentioned by Andreas, there is no limit. The endpoint returns all the streams there are, as a project isn't supposed to have many.
@overload
def create(self, streams: StreamWrite) -> Stream: ...

def create(self, streams: StreamWrite | Sequence[StreamWrite]) -> Stream | StreamList:
Food for thought. Streams are intended to be long-lived, and customers should think twice before creating or deleting them. A deleted stream will stay for up to 6 weeks in a soft-deleted state, all this time consuming capacity, incurring costs, and preventing another stream with the same name from being created. Should we really allow customers to create multiple streams with one request? Especially considering that the endpoint will have only a 1 rps limit, so calling this method with multiple stream names would automatically mean throttling.
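Given a hypothetical 1 request/second limit, a caller creating several streams could pace the calls with a minimum interval between requests. An illustrative sketch with an injected clock and sleep so it is testable; none of this is the SDK's actual behaviour:

```python
from collections.abc import Callable


def create_paced(
    streams: list[str],
    create_one: Callable[[str], None],
    now: Callable[[], float],
    sleep: Callable[[float], None],
    min_interval: float = 1.0,
) -> None:
    """Call create_one for each stream, at most once per min_interval seconds."""
    last: float | None = None
    for stream in streams:
        if last is not None:
            elapsed = now() - last
            if elapsed < min_interval:
                sleep(min_interval - elapsed)  # wait out the remaining interval
        create_one(stream)
        last = now()
```

This does not answer whether batch creation should be allowed at all, but it shows one way the client could stay under a strict per-endpoint rate limit if it is.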
Description
Please describe the change you have made.
Checklist:
If a new method has been added it should be referenced in cognite.rst in order to generate docs based on its docstring.