Add usage analytics utils #16

Open: wants to merge 14 commits into base: main

@liam-sbhoo (Collaborator) commented Apr 12, 2025

Change Description

Brief (a few bullet points describing your changes, use full sentences and try to link lines in the code whenever needed)

Implement minimal usage tracking feature:

  • AnalyticsHttpClient retrieves some basic usage info from the caller and sends it in the HTTP headers.
  • In tabpfn_client, we replace the standard httpx.Client with AnalyticsHttpClient.
  • In tabpfn_server, we extract this info based on ANALYTICS_TO_TRACK.
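
The client-side step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `with_analytics_headers` is a hypothetical helper, and the two header names are taken from the assertions in this PR's tests. A client wrapper would call something like this before delegating to the underlying HTTP client.

```python
import platform
import uuid
from typing import Callable, Dict, Mapping, Optional, Tuple

# Assumed shape: (header name, getter) pairs. The real ANALYTICS_TO_TRACK may differ.
ANALYTICS_TO_TRACK: Tuple[Tuple[str, Callable[[], str]], ...] = (
    ("X-Unique-Call-Id", lambda: str(uuid.uuid4())),
    ("X-Python-Version", platform.python_version),
)


def with_analytics_headers(
    headers: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of `headers` with one analytics header per tracked item."""
    merged = dict(headers or {})  # copy so the caller's mapping is not mutated
    for name, getter in ANALYTICS_TO_TRACK:
        merged[name] = getter()
    return merged


if __name__ == "__main__":
    print(with_analytics_headers({"Existing": "Header"}))
```

Existing headers pass through unchanged, so callers that already set their own headers are not affected.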

Details (add details if your pull request is more complicated and harder to understand from the code alone)

Standard Qs (leave questions that do not apply blank)

If you broke behavior: Please describe what behavior you broke and how you inform people to not get stuck trying to use the old behavior.

If you used new dependencies: Did you add them to requirements.txt?
No new dependencies.

Who did you ping on Mattermost to review your PR? Please ping that person again whenever you are ready for another review.
@Jabb0


Please do not mark comments/conversations as resolved unless you are the assigned reviewer. This helps maintain clarity during the review process.

@liam-sbhoo requested review from Jabb0, LeoGrin and noahho on April 12, 2025 13:53
)


ANALYTICS_TO_TRACK = [
Contributor

Please use another prefix here, something like PL- for Prior Labs.

From Gemini:

Deprecation (RFC 6648): In June 2012, RFC 6648 was published, which deprecated the use of the X- prefix for new, non-standard parameters (including HTTP headers).

Reasoning: The practice caused problems [1]. When headers starting with X- became widely adopted and effectively standard (like X-Forwarded-For or X-Frame-Options), removing the X- prefix later would cause compatibility issues. The prefix didn't reliably prevent name collisions and added confusion.

[1] Why we need to deprecate x prefix for HTTP headers? - Tony Xu Blog (tonyxu.io)

Recommendation: RFC 6648 advises against using the X- prefix for new non-standard or experimental headers. Instead, developers creating new headers should try to register them officially if appropriate, or choose names carefully, perhaps using vendor-specific identifiers or choosing names that are unlikely to clash with future standards, without relying on the X- prefix.
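
One way to apply this is to build the header names from a single vendor prefix, so the prefix can be changed in one place. The sketch below assumes the suggested `PL-` prefix; `vendor_header` and `rename_legacy_header` are hypothetical helpers, not part of the PR.

```python
VENDOR_PREFIX = "PL-"  # prefix suggested in the review; the final choice is up to the authors


def vendor_header(name: str) -> str:
    """Build a vendor-prefixed header name, e.g. 'Module-Name' -> 'PL-Module-Name'."""
    return f"{VENDOR_PREFIX}{name}"


def rename_legacy_header(name: str) -> str:
    """Map a legacy 'X-' header name to its vendor-prefixed form; leave others untouched."""
    if name.upper().startswith("X-"):
        return vendor_header(name[2:])
    return name


if __name__ == "__main__":
    for legacy in ("X-Module-Name", "X-Calling-Class", "Content-Type"):
        print(legacy, "->", rename_legacy_header(legacy))
```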

)


ANALYTICS_TO_TRACK = [
Contributor

This is a list and thus globally mutable. Please use a tuple.
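
To illustrate the difference, here is a small sketch with an assumed `(header name, getter)` layout for the entries. With a tuple, an accidental in-place modification from another module fails loudly instead of silently changing what gets tracked.

```python
import platform
from typing import Callable, Tuple

# Assumed entry layout; only the container type matters for this example.
ANALYTICS_TO_TRACK: Tuple[Tuple[str, Callable[[], str]], ...] = (
    ("X-Python-Version", platform.python_version),
)


def try_to_extend() -> bool:
    """Attempt an in-place append; return True only if it succeeded."""
    try:
        ANALYTICS_TO_TRACK.append(("X-Extra", lambda: "oops"))
    except AttributeError:
        # Tuples have no append(), so the module-level constant stays intact.
        return False
    return True


if __name__ == "__main__":
    print("mutated:", try_to_extend())
```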

If no such frame is found, returns 'StandaloneFunction'.
"""

import inspect
Contributor

Importing here likely adds runtime overhead and potential for errors. If this method is used, move the inspect import to the top of the module.

if not recursive:
break

# If no class context was found, assume it's a standalone function
Contributor

Remove that comment. It is already apparent from the docstring and the default value of outmost_caller.

@@ -0,0 +1,72 @@
def get_calling_class(recursive=True):
Contributor

The return type annotation is missing.
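
Putting the comments on this function together (module-level import, no redundant comment, annotated return type), it could look roughly like the sketch below. Only the signature, the `outmost_caller` default and the `recursive` early exit are visible in the diff; the frame-walking logic here is an assumption for illustration.

```python
import inspect


def get_calling_class(recursive: bool = True) -> str:
    """Return the name of the class whose method led to this call.

    With recursive=True the outermost class found on the call stack wins,
    otherwise the nearest one. If no such frame is found, returns
    'StandaloneFunction'.
    """
    outmost_caller = "StandaloneFunction"
    frame = inspect.currentframe()
    try:
        caller = frame.f_back if frame is not None else None
        while caller is not None:
            # Assumption: a frame belongs to a method if it has a local `self`.
            instance = caller.f_locals.get("self")
            if instance is not None:
                outmost_caller = type(instance).__name__
                if not recursive:
                    break
            caller = caller.f_back
    finally:
        del frame  # drop the frame reference to avoid reference cycles
    return outmost_caller


class Inner:
    def run(self) -> str:
        return get_calling_class(recursive=False)


class Outer:
    def run(self) -> str:
        return Inner().run()


if __name__ == "__main__":
    print(Outer().run())
```

With `recursive=False` the nearest method frame decides, so the call above reports `Inner`.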


# Call request method
self.client.request(
"GET", "https://example.com", headers={"Existing": "Header"}
Contributor

You can use this enum to avoid typing out the request methods, which also enables autocompletion. And, although it is very unlikely, you would still get the correct GET verb if the global community ever decided to change it to GET2.

https://docs.python.org/3/library/http.html#http.HTTPMethod

Collaborator Author

Unfortunately, this is only available from Python 3.11 onward.
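
If older interpreters need to be supported, one possible workaround is a small fallback enum. This is only a sketch: the fallback class is a hypothetical stand-in that covers a handful of verbs, not a full replacement for `http.HTTPMethod`.

```python
import sys
from enum import Enum

if sys.version_info >= (3, 11):
    from http import HTTPMethod
else:

    class HTTPMethod(str, Enum):
        """Minimal stand-in for http.HTTPMethod on Python < 3.11."""

        GET = "GET"
        POST = "POST"
        PUT = "PUT"
        PATCH = "PATCH"
        DELETE = "DELETE"


if __name__ == "__main__":
    # Using .value yields a plain string on every supported Python version.
    print(HTTPMethod.GET.value)
```

Passing `HTTPMethod.GET.value` keeps the call sites independent of how `str()` formats enum members on a given Python version.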

self.assertEqual(headers.get("X-Module-Name"), self.module_name)
self.assertIn("X-Unique-Call-Id", headers)
self.assertIn("X-Python-Version", headers)
self.assertIn("X-Calling-Class", headers)
Contributor

This test does not check for all options in ANALYTICS_TO_TRACK.

The stream one does.

Ideally, you would share code between the two tests.
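
A sketch of what sharing could look like: a mixin with one assertion helper that both tests call. The class and helper names are hypothetical, the header names mirror the ones asserted in this PR, and the `headers` dictionaries stand in for whatever the mocked client captures.

```python
import unittest

# Stand-in for the header names in ANALYTICS_TO_TRACK.
EXPECTED_HEADERS = (
    "X-Module-Name",
    "X-Unique-Call-Id",
    "X-Python-Version",
    "X-Calling-Class",
)


class AnalyticsHeaderAssertionsMixin:
    """Shared assertion so request and stream tests check the same headers."""

    def assert_all_analytics_headers(self, headers):
        for name in EXPECTED_HEADERS:
            with self.subTest(header=name):
                self.assertIn(name, headers)


class RequestHeadersTest(AnalyticsHeaderAssertionsMixin, unittest.TestCase):
    def test_request_headers(self):
        headers = {name: "value" for name in EXPECTED_HEADERS}  # placeholder capture
        self.assert_all_analytics_headers(headers)

    def test_stream_headers(self):
        headers = {name: "value" for name in EXPECTED_HEADERS}  # placeholder capture
        self.assert_all_analytics_headers(headers)
```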


# Verify all analytics headers from ANALYTICS_TO_TRACK were added
for header_name, _ in ANALYTICS_TO_TRACK:
self.assertIn(header_name, headers)
Contributor

You do not test the functions that populate these headers, although they are defined here.

Add tests showing that they work, and check here that the values are correct.
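
For example, value-level tests could look like the sketch below. The two getters are hypothetical stand-ins written for this example, since the PR's actual getter functions are not shown in this thread.

```python
import platform
import unittest
import uuid


# Hypothetical getters, standing in for the functions that populate the headers.
def get_python_version() -> str:
    return platform.python_version()


def get_unique_call_id() -> str:
    return str(uuid.uuid4())


class AnalyticsGettersTest(unittest.TestCase):
    def test_python_version_matches_interpreter(self):
        self.assertEqual(get_python_version(), platform.python_version())

    def test_unique_call_id_is_a_fresh_uuid(self):
        first, second = get_unique_call_id(), get_unique_call_id()
        self.assertNotEqual(first, second)
        # Round-tripping through uuid.UUID checks that the value is well formed.
        self.assertEqual(str(uuid.UUID(first)), first)
```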

self.assertIsNotNone(headers.get("X-Calling-Class"))


if __name__ == "__main__":
Contributor

This works; however, I'd prefer running the tests from the CLI or using the IDE's built-in features for better debugging.

There should be no harm in it, so keeping it is fine too.

requirements.txt Outdated
typing_extensions>=4.12.2
httpx>=0.28.1
Contributor

These requirements are too broad! You cannot guarantee that your code will work with every upcoming version.

Please use SemVer-style constraints to make sure this does not break.

Example: 0.28.1 is compatible with 0.28.x but not necessarily with 0.29.0. However, the current specifier allows 0.29.0 to be installed.

httpx~=0.28.1

Gemini:

~= (Compatible Release):

Example: numpy~=1.21.0 means >=1.21.0, <1.22.0 (allows PATCH updates).
Example: numpy~=1.21 means >=1.21.0, <2.0.0 (allows MINOR and PATCH updates).
This is specifically designed with SemVer in mind. It allows updates that should be backward-compatible (PATCH or MINOR+PATCH fixes/features) but prevents updates that might break things (MAJOR). This is often a good balance for libraries.
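
The bounds quoted above can be checked with a small sketch. This is a deliberately simplified model of the compatible-release operator that only handles purely numeric versions; the real rules in PEP 440 also cover pre-releases, post-releases and more. The helper names are made up for this example.

```python
from typing import Tuple

Version = Tuple[int, ...]


def _parse(version: str) -> Version:
    """Parse a purely numeric dotted version such as '0.28.1'."""
    return tuple(int(part) for part in version.split("."))


def compatible_release_bounds(spec_version: str) -> Tuple[Version, Version]:
    """Return (inclusive lower, exclusive upper) bounds for `~=spec_version`."""
    lower = _parse(spec_version)
    if len(lower) < 2:
        raise ValueError("~= needs at least two version components")
    prefix = lower[:-1]  # drop the last component ...
    upper = prefix[:-1] + (prefix[-1] + 1,)  # ... and bump the new last one
    return lower, upper


def is_compatible(spec_version: str, candidate: str) -> bool:
    lower, upper = compatible_release_bounds(spec_version)
    return lower <= _parse(candidate) < upper


if __name__ == "__main__":
    print(is_compatible("0.28.1", "0.28.5"))
    print(is_compatible("0.28.1", "0.29.0"))
```

Under this model, `httpx~=0.28.1` accepts 0.28.5 but rejects 0.29.0, which is the behavior requested in the review.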

@liam-sbhoo requested a review from Jabb0 on April 17, 2025 13:20
)


class ANALYTICS_KEYS(str, Enum):
Contributor

nice!

2 participants