Commit babb439

Introduction of tracking tool calls through mixpanel (#199)
1 parent 5549d50 commit babb439

20 files changed, +659 -56 lines changed

.env.example

Lines changed: 5 additions & 0 deletions
@@ -36,3 +36,8 @@ BLOCKSCOUT_RPC_POOL_PER_HOST=50
 # The server version is appended automatically.
 BLOCKSCOUT_MCP_USER_AGENT="Blockscout MCP"

+# Optional Mixpanel analytics (HTTP mode only). Set token to enable; leave empty to disable.
+# Use API host for regional endpoints (e.g., EU). No tracking occurs in stdio mode.
+BLOCKSCOUT_MIXPANEL_TOKEN=""
+BLOCKSCOUT_MIXPANEL_API_HOST=""
+
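The comments added to `.env.example` describe a two-part gate: a token must be set, and the server must run in HTTP mode. A minimal sketch of that gate follows. It is not the project's `config.py`; `MixpanelSettings`, `load_mixpanel_settings`, and `analytics_enabled` are hypothetical names, and stripping whitespace from the values is an assumption. Only the two environment variable names come from the diff.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MixpanelSettings:
    """Hypothetical container; the real project keeps these on its config object."""

    token: str
    api_host: str


def load_mixpanel_settings(env: Mapping[str, str] = os.environ) -> MixpanelSettings:
    # Assumption: missing or whitespace-only values mean "not configured".
    return MixpanelSettings(
        token=env.get("BLOCKSCOUT_MIXPANEL_TOKEN", "").strip(),
        api_host=env.get("BLOCKSCOUT_MIXPANEL_API_HOST", "").strip(),
    )


def analytics_enabled(settings: MixpanelSettings, http_mode: bool) -> bool:
    # Both conditions must hold: a token is set and the server runs over HTTP.
    return http_mode and bool(settings.token)
```

With an empty token the gate stays closed regardless of mode, which matches the "leave empty to disable" wording above.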

AGENTS.md

Lines changed: 13 additions & 0 deletions
@@ -19,6 +19,8 @@ mcp-server/
 │   ├── config.py # Configuration management (e.g., API keys, timeouts, cache settings)
 │   ├── constants.py # Centralized constants used throughout the application, including data truncation limits
 │   ├── logging_utils.py # Logging utilities for production-ready log formatting
+│   ├── analytics.py # Centralized Mixpanel analytics for tool invocations (HTTP mode only)
+│   ├── client_meta.py # Shared client metadata extraction helpers and defaults
 │   ├── cache.py # Simple in-memory cache for chain data
 │   ├── web3_pool.py # Async Web3 connection pool manager
 │   ├── models.py # Defines standardized Pydantic models for all tool responses
@@ -206,6 +208,17 @@ mcp-server/
 * **`logging_utils.py`**:
     * Provides utilities for configuring production-ready logging.
     * Contains the `replace_rich_handlers_with_standard()` function that eliminates multi-line Rich formatting from MCP SDK logs.
+* **`analytics.py`**:
+    * Centralized Mixpanel analytics for MCP tool invocations.
+    * Enabled only in HTTP mode when `BLOCKSCOUT_MIXPANEL_TOKEN` is set.
+    * Generates deterministic `distinct_id` based on client IP, name, and version fingerprint.
+    * Tracks tool invocations with client metadata, protocol version, and call source (MCP vs REST).
+    * Includes IP geolocation metadata for Mixpanel and graceful error handling to avoid breaking tool execution.
+* **`client_meta.py`**:
+    * Shared utilities for extracting client metadata (name, version, protocol, user_agent) from MCP Context.
+    * Provides `ClientMeta` dataclass and `extract_client_meta_from_ctx()` function.
+    * Falls back to User-Agent header when MCP client name is unavailable.
+    * Ensures consistent sentinel defaults ("N/A", "Unknown") across logging and analytics modules.
 * **`cache.py`**:
     * Encapsulates in-memory caching of chain data with TTL management.
 * **`web3_pool.py`**:

Dockerfile

Lines changed: 2 additions & 0 deletions
@@ -32,5 +32,7 @@ ENV BLOCKSCOUT_ADVANCED_FILTERS_PAGE_SIZE="10"
 ENV BLOCKSCOUT_RPC_REQUEST_TIMEOUT="60.0"
 ENV BLOCKSCOUT_RPC_POOL_PER_HOST="50"
 ENV BLOCKSCOUT_MCP_USER_AGENT="Blockscout MCP"
+# ENV BLOCKSCOUT_MIXPANEL_TOKEN="" # Intentionally commented out: pass at runtime to avoid embedding secrets in image
+ENV BLOCKSCOUT_MIXPANEL_API_HOST=""

 CMD ["python", "-m", "blockscout_mcp_server"]

README.md

Lines changed: 17 additions & 10 deletions
@@ -144,13 +144,7 @@ Refer to [TESTING.md](TESTING.md) for comprehensive instructions on running both
 ## Example Prompts for AI Agents

 ```plaintext
-On which popular networks is `ens.eth` deployed as a contract?
-```
-
-```plaintext
-What are the usual activities performed by `ens.eth` on the Ethereum Mainnet?
-Since it is a contract, what is the most used functionality of this contract?
-Which address interacts with the contract the most?
+Is any approval set for OP token on Optimism chain by `zeaver.eth`?
 ```

 ```plaintext
@@ -163,9 +157,22 @@ before `Nov 08 2024 04:21:35 AM (-06:00 UTC)`?
 ```

 ```plaintext
-What is the most recent transaction made to queue a proposal on `0x323A76393544d5ecca80cd6ef2A560C6a395b7E3`
-in the Ethereum mainnet? What is the proposal ID? What are the current vote
-statistics for this proposal?
+Tell me more about the transaction `0xf8a55721f7e2dcf85690aaf81519f7bc820bc58a878fa5f81b12aef5ccda0efb`
+on Redstone rollup.
+```
+
+```plaintext
+Is there any blacklisting functionality of USDT token on Arbitrum One?
+```
+
+```plaintext
+What is the latest block on Gnosis Chain and who is the block minter?
+Were any funds moved from this minter recently?
+```
+
+```plaintext
+When the most recent reward distribution of Kinto token was made to the wallet
+`0x7D467D99028199D99B1c91850C4dea0c82aDDF52` in Kinto chain?
 ```

 ## Development & Deployment

SPEC.md

Lines changed: 30 additions & 0 deletions
@@ -493,8 +493,38 @@ Implemented via the `@log_tool_invocation` decorator, these logs capture:
 - The arguments provided to the tool.
 - The identity of the MCP client that initiated the call, including its **name**, **version**, and the **MCP protocol version** it is using.

+If the client name cannot be determined from the MCP session parameters, the server falls back to the HTTP `User-Agent` header as the client identifier.
+
 This provides a clear audit trail, helping to diagnose issues that may be specific to certain client versions or protocol implementations. For stateless calls, such as those from the REST API where no client is present, this information is gracefully omitted.

+#### 3. Mixpanel Analytics for Tool Invocation
+
+To gain insight into tool usage patterns, the server can optionally report tool invocations to Mixpanel.
+
+- Activation (opt-in only):
+  - Enabled exclusively in HTTP modes (MCP-over-HTTP and REST).
+  - Requires `BLOCKSCOUT_MIXPANEL_TOKEN` to be set; otherwise analytics are disabled.
+
+- Integration point:
+  - Tracking is centralized in `blockscout_mcp_server/analytics.py` and invoked from the shared `@log_tool_invocation` decorator so every tool is tracked consistently without altering tool implementations.
+
+- Tracked properties (per event):
+  - Client IP address derived from the HTTP request, preferring proxy headers when present: `X-Forwarded-For` (first value), then `X-Real-IP`, otherwise connection `client.host`.
+  - MCP client name (or the HTTP `User-Agent` when the client name is unavailable).
+  - MCP client version.
+  - MCP protocol version.
+  - Tool arguments (currently sent as-is, without truncation).
+  - Call source: whether the tool was invoked by MCP or via the REST API.
+
+- Anonymous identity (distinct_id) (as per Mixpanel's [documentation](https://docs.mixpanel.com/docs/tracking-methods/id-management/identifying-users-simplified#server-side-identity-management)):
+  - A stable `distinct_id` is generated to anonymously identify unique users.
+  - The fingerprint is the concatenation of: namespace URL (`"https://blockscout.com/mcp/"`), client IP, client name, and client version.
+  - This provides stable identification even when multiple clients share the same name/version (e.g., Claude Desktop), because their IPs differ.
+
+- REST API support and source attribution:
+  - The REST context mock is extended with a request context wrapper so analytics can extract IP and headers consistently (see `blockscout_mcp_server/api/dependencies.py`).
+  - A `call_source` field is introduced on the REST mock context and set to `"rest"`, allowing analytics to reliably distinguish REST API calls from MCP tool calls without coupling to specific URL paths.
+
 ### Smart Contract Interaction Tools

 This server exposes a tool for on-chain smart contract read-only state access. It uses the JSON-RPC `eth_call` semantics under the hood and aligns with the standardized `ToolResponse` model.
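The spec text added in this hunk describes two mechanics: the header precedence used to pick a client IP, and the deterministic `distinct_id` derived from it. A small self-contained sketch of both follows. The function names `resolve_client_ip` and `build_distinct_id` and the constant `NAMESPACE_PREFIX` are illustrative, not the project's API; the `|` separator follows the `analytics.py` file in this same commit.

```python
import uuid
from typing import Mapping, Optional

NAMESPACE_PREFIX = "https://blockscout.com/mcp/"  # namespace URL quoted in the spec


def resolve_client_ip(headers: Mapping[str, str], peer_host: Optional[str]) -> str:
    """Prefer X-Forwarded-For (first value), then X-Real-IP, then the socket peer."""
    lowered = {key.lower(): value for key, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")
    if forwarded:
        # The left-most entry is the originating client in a proxy chain.
        return forwarded.split(",")[0].strip()
    return lowered.get("x-real-ip", "") or (peer_host or "")


def build_distinct_id(ip: str, client_name: str, client_version: str) -> str:
    """Deterministic UUIDv5 over the namespace URL plus the fingerprint fields."""
    composite = "|".join([ip, client_name, client_version])
    return str(uuid.uuid5(uuid.NAMESPACE_URL, NAMESPACE_PREFIX + composite))
```

Because UUIDv5 is a pure function of its inputs, the same IP, client name, and version always map to the same identifier, while two clients that share a name and version but connect from different IPs get different ones.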

blockscout_mcp_server/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 """Blockscout MCP Server package."""

-__version__ = "0.7.0"
+__version__ = "0.8.0-dev"

blockscout_mcp_server/analytics.py

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
+"""Centralized Mixpanel analytics for MCP tool invocations.
+
+Tracking is enabled only when:
+- BLOCKSCOUT_MIXPANEL_TOKEN is set, and
+- server runs in HTTP mode (set via set_http_mode(True)).
+
+Events are emitted via Mixpanel with a deterministic distinct_id based on a
+connection fingerprint composed of client IP, client name, and client version.
+"""
+
+from __future__ import annotations
+
+import logging
+import uuid
+from typing import Any
+
+try:
+    # Import lazily; tests will mock this
+    from mixpanel import Consumer, Mixpanel
+except ImportError:  # pragma: no cover
+
+    class _MissingMixpanel:  # noqa: D401 - simple placeholder
+        """Placeholder that raises if Mixpanel is actually used."""
+
+        def __init__(self, *args: Any, **kwargs: Any) -> None:  # noqa: D401 - simple placeholder
+            raise ImportError("Mixpanel library is not installed. Please install 'mixpanel' to use analytics features.")
+
+    Consumer = _MissingMixpanel  # type: ignore[assignment]
+    Mixpanel = _MissingMixpanel  # type: ignore[assignment]
+
+from blockscout_mcp_server.client_meta import (
+    ClientMeta,
+    extract_client_meta_from_ctx,
+    get_header_case_insensitive,
+)
+from blockscout_mcp_server.config import config
+
+logger = logging.getLogger(__name__)
+
+
+_is_http_mode_enabled: bool = False
+_mp_client: Any | None = None
+
+
+def set_http_mode(is_http: bool) -> None:
+    """Enable or disable HTTP mode for analytics gating."""
+    global _is_http_mode_enabled
+    _is_http_mode_enabled = bool(is_http)
+    # Log enablement status once at startup (HTTP path only)
+    if _is_http_mode_enabled:
+        token = getattr(config, "mixpanel_token", "")
+        if token:
+            # Best-effort initialize client to validate configuration
+            _ = _get_mixpanel_client()
+            api_host = getattr(config, "mixpanel_api_host", "") or "default"
+            logger.info("Mixpanel analytics enabled (api_host=%s)", api_host)
+        else:
+            logger.debug("Mixpanel analytics not enabled: BLOCKSCOUT_MIXPANEL_TOKEN is not set")
+
+
+def _get_mixpanel_client() -> Any | None:
+    """Return a singleton Mixpanel client if token is configured."""
+    global _mp_client
+    if _mp_client is not None:
+        return _mp_client
+    token = getattr(config, "mixpanel_token", "")
+    if not token:
+        return None
+    try:
+        api_host = getattr(config, "mixpanel_api_host", "")
+        if api_host:
+            consumer = Consumer(api_host=api_host)
+            _mp_client = Mixpanel(token, consumer=consumer)
+        else:
+            _mp_client = Mixpanel(token)
+        return _mp_client
+    except Exception as exc:  # pragma: no cover - defensive
+        logger.debug("Failed to initialize Mixpanel client: %s", exc)
+        return None
+
+
+def _extract_request_ip(ctx: Any) -> str:
+    """Extract client IP address from context if possible."""
+    ip = ""
+    try:
+        request = getattr(getattr(ctx, "request_context", None), "request", None)
+        if request is not None:
+            headers = request.headers or {}
+            # Prefer proxy-forwarded headers
+            xff = get_header_case_insensitive(headers, "x-forwarded-for", "") or ""
+            if xff:
+                # left-most IP per standard
+                ip = xff.split(",")[0].strip()
+            else:
+                x_real_ip = get_header_case_insensitive(headers, "x-real-ip", "") or ""
+                if x_real_ip:
+                    ip = x_real_ip
+                else:
+                    client = getattr(request, "client", None)
+                    if client and getattr(client, "host", None):
+                        ip = client.host
+    except Exception:  # pragma: no cover - tolerate all shapes
+        pass
+    return ip
+
+
+def _build_distinct_id(ip: str, client_name: str, client_version: str) -> str:
+    # User-Agent is merged into client_name in extract_client_meta_from_ctx when name is unavailable.
+    # Therefore composite requires only ip, client_name and client_version for a stable fingerprint.
+    composite = "|".join([ip or "", client_name or "", client_version or ""])
+    return str(uuid.uuid5(uuid.NAMESPACE_URL, "https://blockscout.com/mcp/" + composite))
+
+
+def _determine_call_source(ctx: Any) -> str:
+    """Return 'mcp' for MCP calls, 'rest' for REST API, else 'unknown'.
+
+    Priority:
+    1) Explicit marker set by caller (e.g., REST mock context) via `call_source`.
+    2) Default to 'mcp' when no explicit marker is present (applies to MCP-over-HTTP).
+    """
+    try:
+        explicit = getattr(ctx, "call_source", None)
+        if isinstance(explicit, str) and explicit:
+            return explicit
+        # No explicit marker: treat as MCP (covers MCP-over-HTTP)
+        return "mcp"
+    except Exception:  # pragma: no cover
+        pass
+    return "unknown"
+
+
+def track_tool_invocation(
+    ctx: Any,
+    tool_name: str,
+    tool_args: dict[str, Any],
+    client_meta: ClientMeta | None = None,
+) -> None:
+    """Track a tool invocation in Mixpanel, if enabled and in HTTP mode."""
+    if not _is_http_mode_enabled:
+        return
+    mp = _get_mixpanel_client()
+    if mp is None:
+        return
+
+    try:
+        ip = _extract_request_ip(ctx)
+
+        # Prefer provided client metadata from the decorator; otherwise, fall back to context
+        if client_meta is not None:
+            client_name = client_meta.name
+            client_version = client_meta.version
+            protocol_version = client_meta.protocol
+            user_agent = client_meta.user_agent
+        else:
+            meta = extract_client_meta_from_ctx(ctx)
+            client_name = meta.name
+            client_version = meta.version
+            protocol_version = meta.protocol
+            user_agent = meta.user_agent
+
+        distinct_id = _build_distinct_id(ip, client_name, client_version)
+
+        properties: dict[str, Any] = {
+            "ip": ip,
+            "client_name": client_name,
+            "client_version": client_version,
+            "user_agent": user_agent,
+            "tool_args": tool_args,
+            "protocol_version": protocol_version,
+            "source": _determine_call_source(ctx),
+        }
+
+        # TODO: Remove this log after validating Mixpanel analytics end-to-end
+        logger.info(
+            "Mixpanel event prepared: distinct_id=%s tool=%s properties=%s",
+            distinct_id,
+            tool_name,
+            properties,
+        )
+
+        meta = {"ip": ip} if ip else None
+        # Mixpanel Python SDK allows meta for IP geolocation mapping
+        if meta is not None:
+            mp.track(distinct_id, tool_name, properties, meta=meta)  # type: ignore[call-arg]
+        else:
+            mp.track(distinct_id, tool_name, properties)
+    except Exception as exc:  # pragma: no cover - do not break tool flow
+        logger.debug("Mixpanel tracking failed for %s: %s", tool_name, exc)
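The file above treats analytics as best-effort: a tracking failure is logged and swallowed so the tool call itself is unaffected. The sketch below isolates that pattern so it can be exercised without the Mixpanel package installed. `RecordingClient`, `FailingClient`, and `safe_track` are hypothetical stand-ins, and the `track(distinct_id, event_name, properties, meta)` shape is assumed to match the call made in the diff.

```python
import logging
from typing import Any

logger = logging.getLogger("analytics_sketch")


class RecordingClient:
    """Hypothetical stand-in for the Mixpanel client; it only records calls."""

    def __init__(self) -> None:
        self.calls: list = []

    def track(self, distinct_id, event_name, properties=None, meta=None) -> None:
        self.calls.append((distinct_id, event_name, properties or {}, meta))


class FailingClient:
    """Hypothetical client whose transport always fails."""

    def track(self, distinct_id, event_name, properties=None, meta=None) -> None:
        raise ConnectionError("simulated network failure")


def safe_track(client: Any, distinct_id: str, tool_name: str, properties: dict) -> bool:
    """Send one event; report failure via the return value instead of raising."""
    ip = properties.get("ip") or ""
    try:
        # Pass the IP as top-level metadata only when one is known.
        client.track(distinct_id, tool_name, properties, meta={"ip": ip} if ip else None)
        return True
    except Exception as exc:  # analytics must never break the tool call
        logger.debug("tracking failed for %s: %s", tool_name, exc)
        return False
```

Returning a boolean is a choice made for this sketch so the behavior is easy to assert on; the function in the diff returns `None` in both cases.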
Lines changed: 23 additions & 2 deletions
@@ -1,14 +1,35 @@
 """Dependencies for the REST API, such as mock context providers."""

+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:  # pragma: no cover - typing-only import
+    from starlette.requests import Request
+
+
+class _RequestContextWrapper:
+    """Lightweight wrapper to mimic MCP's request_context shape for analytics."""
+
+    def __init__(self, request: Request) -> None:
+        self.request: Request = request
+

 class MockCtx:
     """A mock context for stateless REST calls.

     Tool functions require a ``ctx`` object to report progress. Since REST
     endpoints are stateless and have no MCP session, this mock provides the
     required ``info`` and ``report_progress`` methods as no-op async functions.
+    It also exposes a ``request_context`` with the current Starlette request so
+    analytics can extract connection fingerprint data.
     """

+    def __init__(self, request: Request | None = None) -> None:
+        self.request_context = _RequestContextWrapper(request) if request is not None else None
+        # Mark source explicitly so analytics can distinguish REST from MCP without path coupling
+        self.call_source = "rest"
+
     async def info(self, message: str) -> None:
         """Simulate the ``info`` method of an MCP ``Context``."""
         pass
@@ -18,6 +39,6 @@ async def report_progress(self, *args, **kwargs) -> None:
         pass


-def get_mock_context() -> MockCtx:
+def get_mock_context(request: Request | None = None) -> MockCtx:
     """Dependency provider to get a mock context for stateless REST calls."""
-    return MockCtx()
+    return MockCtx(request=request)
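To show how the explicit `call_source` marker and the `request_context` wrapper are meant to be consumed, here is a small sketch that runs without Starlette or the MCP SDK. `RestCtx`, `call_source_of`, and `request_of` are hypothetical names, and the fake request built with `SimpleNamespace` carries only the attributes the analytics code reads; it is not a real Starlette `Request`.

```python
from types import SimpleNamespace
from typing import Any, Optional


class RestCtx:
    """Trimmed-down, hypothetical analogue of the MockCtx in the diff above."""

    def __init__(self, request: Optional[Any] = None) -> None:
        # Mirror MCP's ctx.request_context.request shape when a request is given.
        self.request_context = SimpleNamespace(request=request) if request is not None else None
        self.call_source = "rest"


def call_source_of(ctx: Any) -> str:
    """Explicit marker wins; contexts without one are assumed to be MCP."""
    marker = getattr(ctx, "call_source", None)
    return marker if isinstance(marker, str) and marker else "mcp"


def request_of(ctx: Any) -> Optional[Any]:
    """Walk ctx.request_context.request defensively; None if any link is missing."""
    return getattr(getattr(ctx, "request_context", None), "request", None)


# A fake request carrying only the attributes analytics would read.
fake_request = SimpleNamespace(
    headers={"user-agent": "curl/8.0"},
    client=SimpleNamespace(host="127.0.0.1"),
)
rest_ctx = RestCtx(fake_request)
mcp_like_ctx = SimpleNamespace(request_context=SimpleNamespace(request=fake_request))
```

Marking the source on the context object, rather than inspecting the request path, means the same lookup works for both call routes, and a context built without a request simply yields no IP or headers.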
