139 changes: 127 additions & 12 deletions docs/python/how-to-guides/use-uris.md
@@ -1,15 +1,30 @@
---
id: use-uris
title: Use URIs to share tables
sidebar_label: URI
---

This guide will show you how to use Deephaven's [URIs](/core/pydoc/code/deephaven.uri.html#module-deephaven.uri) to share tables across instances and networks.
This guide will show you how to use Deephaven's [URIs](https://deephaven.io/core/javadoc/io/deephaven/uri/package-summary.html) to share tables across server instances and networks.

A URI, short for [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier), is a sequence of characters that identifies a resource on the web. Think of a URI as a generalization of a URL. A Deephaven URI identifies a table. By linking to a URI, you share your results with others without them needing to replicate your setup or know the details of your queries.
A URI, short for [Uniform Resource Identifier](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier), is a sequence of characters that identifies a resource on the web. Think of a URI as a generalization of a URL. A Deephaven URI identifies a table on a server instance. By linking to a URI, you can access and work with tables from other Deephaven server instances without needing to replicate the data or queries that created them.

> [!NOTE]
> URIs can be used to share tables across Groovy and Python instances interchangeably. To learn how to use URIs in Groovy, see [the equivalent guide](/core/groovy/docs/how-to-guides/use-uris).

## Why use URIs?

Deephaven URIs provide several key benefits:

- **Canonicalized resource identification**: Access resources through a standardized string format that works across server instances.
- **Simplified data sharing**: Share tables between different Deephaven instances without duplicating data or queries.
- **Distributed computing**: Build systems where processing is distributed across multiple Deephaven nodes.
- **Real-time access**: Access live, updating tables from remote sources that reflect the latest data.
- **Resource abstraction**: Reference remote tables and application fields using a consistent pattern regardless of location.
- **Cross-language compatibility**: Access the same data from both Python and Groovy scripts.
- **Environmental isolation**: Access data across different containers, servers, or networks.

By using URIs, you enable others to directly access your tables without needing to replicate your data pipeline, understand your query logic, or maintain duplicate datasets. This is particularly valuable in collaborative environments and distributed systems.

> [!NOTE]
> URIs and Shared Tickets are two different ways to pull tables. Both work on static or dynamic tables. A URI pulls a table that already exists on the server via a URL-like string. Shared Tickets let you pull tables you create or access via the Python Client. Learn more about using Shared Tickets with Deephaven in the [Shared Tickets guide](../how-to-guides/capture-tables.md).

@@ -28,15 +43,46 @@ The above URL can be broken down as follows:
- Path
- The path, in this case, is `/core/docs`. It is a path on the authority.

### Deephaven URI structure

Deephaven URIs use a similar syntax:

`dh+plain://<authority>/<path>`
```
dh://<authority>[:<port>]/<scope>/<resource_name>
dh+plain://<authority>[:<port>]/<scope>/<resource_name>
```

The components are:

- **`dh+plain`** or **`dh`** is the scheme.
  - `dh://` indicates a secure connection (TLS/SSL).
  - `dh+plain://` indicates an insecure connection (no encryption).
  - The scheme identifies the protocol for accessing Deephaven resources.
  - All Deephaven URIs use one of these schemes, regardless of the application type (script, static, dynamic, qst) configured in [Application Mode](./application-mode.md).
- **`<authority>`** is the authority, which will be either:
  - A Docker container name (for local container-to-container communication).
  - A hostname/IP address (for network communication).
- **`<port>`** is optional and only needed when:
  - The Deephaven instance is running on a non-default port (something other than 10000).
  - You're connecting across a network to a specific port.
- **`<scope>`** identifies the namespace where the resource exists. This is typically `scope` for variables created in interactive console sessions, or `app/<app_name>/field` for resources exported from Application Mode applications.
- **`<resource_name>`** is the exact name of the resource, typically a table, that you want to access.
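
To make the component breakdown concrete, the sketch below splits a Deephaven URI string using only Python's standard `urllib.parse`. The `DhUri` tuple and `parse_dh_uri` helper are hypothetical names introduced for illustration; they are not part of the Deephaven API, and the sketch assumes the last path segment is the resource name and everything before it is the scope.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class DhUri(NamedTuple):
    """Hypothetical container for the parts of a Deephaven URI."""
    secure: bool          # True for dh://, False for dh+plain://
    host: str             # container name, hostname, or IP (lower-cased by urlsplit)
    port: Optional[int]   # None when the URI omits the port
    scope: str            # e.g. "scope" or "app/<app_name>/field"
    resource_name: str    # e.g. the table name


def parse_dh_uri(uri: str) -> DhUri:
    """Split a Deephaven URI into scheme, authority, scope, and resource name."""
    parts = urlsplit(uri)
    if parts.scheme not in ("dh", "dh+plain"):
        raise ValueError(f"not a Deephaven URI scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URI is missing an authority")
    segments = parts.path.strip("/").split("/")
    if len(segments) < 2 or not all(segments):
        raise ValueError("expected a path of the form /<scope>/<resource_name>")
    # Assumption: the final segment names the resource; the rest is the scope.
    scope = "/".join(segments[:-1])
    return DhUri(parts.scheme == "dh", parts.hostname, parts.port, scope, segments[-1])


local = parse_dh_uri("dh+plain://table-producer/scope/my_table")
remote = parse_dh_uri("dh://hostname:9876/app/trading_app/field/market_data")
```

Here `local.port` is `None` because the URI relies on the default port, while `remote.scope` is `app/trading_app/field`.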

### Resolving URIs in your code

- `dh+plain` is the scheme.
- `<authority>` is the authority, which will be either a Docker container name or hostname/IP.
- `<path>` is the path to a table, which is generally `scope/<table_name>`.
To access a table via its URI, use the [`resolve`](../reference/data-import-export/uri.md#parameters) function from the `deephaven.uri` module:

Let's explore this with a couple of examples.
```python skip-test
from deephaven.uri import resolve

# Basic usage
table = resolve("dh+plain://hostname/scope/table_name")

# With explicit port
table = resolve("dh+plain://hostname:9876/scope/table_name")
```

The `resolve` function connects to the specified Deephaven instance, retrieves the table, and returns it as a local reference that you can use in your code.

## Share tables locally

@@ -86,12 +132,64 @@ resolved_table = resolve("dh+plain://table-producer/scope/my_table")

By resolving the URI, we acquire `my_table` from the `table-producer` container using the syntax given above.

## Resource scopes and paths

A **scope** in a Deephaven URI is a namespace that identifies where a resource exists within a Deephaven server instance. Think of scopes as organizational containers that prevent naming conflicts and provide context for how resources were created.

Deephaven uses scopes to separate resources based on their origin and purpose:

### Query scope (`scope`)

The query scope contains variables created in interactive console sessions: the tables, variables, and other objects you create directly in the Deephaven IDE console or through client connections.

```python
from deephaven import empty_table

# This creates a table in the query scope
my_table = empty_table(100).update(["X = i", "Y = i * 2"])
# Accessible via: dh://hostname/scope/my_table
```

### Application scope (`app/<app_name>/field`)

The application scope contains fields exported from [Application Mode](./application-mode.md) applications. These are pre-configured resources that are available when the server starts, defined by application scripts.

```python
# In an Application Mode script, this exports a field
# Accessible via: dh://hostname/app/trading_app/field/market_data
```

Scopes ensure that:

- **No naming conflicts**: A table named `trades` in the query scope is completely separate from a field named `trades` in an application scope
- **Clear resource organization**: You know immediately whether a resource comes from interactive work or a pre-built application
- **Proper access control**: Different scopes can have different permission models

### URI format by scope type

```
# Query scope variable (most common)
dh+plain://hostname/scope/table_name
dh://hostname/scope/table_name

# Application field
dh+plain://hostname/app/my_application/field/my_field
dh://hostname/app/my_application/field/my_field
```
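
The two URI shapes above can be assembled with plain string formatting. The following is a minimal sketch; `query_scope_uri` and `app_field_uri` are hypothetical helper names for illustration, not Deephaven functions, and the host, application, and field names are placeholders.

```python
def _scheme(secure: bool) -> str:
    # dh:// for TLS connections, dh+plain:// for unencrypted ones
    return "dh" if secure else "dh+plain"


def query_scope_uri(host: str, table_name: str, secure: bool = False) -> str:
    """Build a URI for a variable in the query scope."""
    return f"{_scheme(secure)}://{host}/scope/{table_name}"


def app_field_uri(host: str, app_name: str, field_name: str, secure: bool = False) -> str:
    """Build a URI for a field exported by an Application Mode application."""
    return f"{_scheme(secure)}://{host}/app/{app_name}/field/{field_name}"


table_uri = query_scope_uri("table-producer", "my_table")
field_uri = app_field_uri("hostname", "trading_app", "market_data", secure=True)
```

Either string could then be passed to `resolve` from `deephaven.uri` inside a running Deephaven session.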

> [!NOTE]
> When using URIs to access resources, you must have appropriate permissions to access the resources in those scopes.

## Share tables across a network

Tables can also be shared across networks, public or private. This works the same way as the previous example of sharing across containers on one machine, except that instead of the container name, you need the hostname/IP and port of the instance producing the table.

> [!NOTE]
> When sharing tables across a network, whether public or private, you do _not_ need the port if Deephaven is being run on the default Deephaven port `10000`. In all other cases, you _must_ provide the port on which the table can be found.
>
> - When sharing tables across a network, you do **not** need to specify the port if Deephaven is running on the default port `10000`.
> - You **must** specify the port in the URI when:
>   - The remote Deephaven instance runs on a non-default port (not 10000).
>   - You're connecting to a custom port forwarding configuration.
>
> Example format with port: `dh+plain://hostname:9876/scope/table_name`

### Create a table

@@ -119,16 +217,33 @@ If we have the hostname of our colleague's machine, that can be used in place of

If the machine hosting a table is publicly reachable, you consume the table the same way as on a private network. All that's needed is the hostname/IP and table name.
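
As a small sketch of the port rule described in this section, the helper below includes the port in the authority only when it differs from the default `10000`. The `network_uri` name and the example hostname are assumptions made for illustration; only the URI format itself comes from this guide.

```python
DEFAULT_PORT = 10000  # default Deephaven port, as stated in this guide


def network_uri(host: str, table_name: str, port: int = DEFAULT_PORT) -> str:
    """Build a dh+plain URI for a query-scope table, omitting the default port."""
    authority = host if port == DEFAULT_PORT else f"{host}:{port}"
    return f"dh+plain://{authority}/scope/{table_name}"


default_port_uri = network_uri("hostname", "table_name")
custom_port_uri = network_uri("hostname", "table_name", port=9876)
```

The first call yields a URI without a port, and the second matches the `hostname:9876` form shown earlier.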

<!-- TODO:
## Performance considerations

When using URIs to share tables across instances, particularly over networks, there are several performance factors to consider:

### Network impact

- **Latency**: Table access over a network introduces latency that varies based on network conditions. For operations requiring low latency, consider co-locating instances when possible.
- **Bandwidth**: The initial table snapshot and subsequent incremental updates consume bandwidth. Deephaven's Barrage protocol optimizes this by transmitting only changes rather than full table refreshes.
- **Connection reliability**: Unstable network connections can affect the reliability of table access. Implement appropriate error handling for network disruptions.
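
One possible way to handle transient network disruptions is to retry resolution with exponential backoff. The sketch below is not a Deephaven feature: `resolve_with_retry` is a hypothetical helper that wraps any resolver callable, so that in a Deephaven session you could pass in `resolve` from `deephaven.uri`. Which exception types are worth retrying is an assumption left to the caller through `retry_on`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def resolve_with_retry(
    resolver: Callable[[str], T],
    uri: str,
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call resolver(uri), retrying failed attempts with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return resolver(uri)
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Inside a Deephaven session, a call might look like `resolve_with_retry(resolve, "dh+plain://table-producer/scope/my_table")`. Injecting `sleep` keeps the helper easy to exercise without real delays.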

### Table characteristics

## Paths
- **Initial snapshot**: When first resolving a URI, Deephaven sends a snapshot of the current table state. Larger tables require more resources for this initial transfer.
- **Update frequency**: Tables with high update frequencies generate more incremental updates over the network. Deephaven's Barrage protocol efficiently transmits only the changes (additions, removals, modifications).
- **Column types**: Tables with complex data types like large strings or nested structures may have higher overhead during the initial snapshot and subsequent updates.

Tables can exist in different scopes, such as in app mode and others. When this is the case, the scope changes.
### Optimization strategies

Update this section. I need to learn more about different scopes. -->
- **Only share what's needed**: Filter, aggregate, and limit the amount of data you're sharing to only what a downstream consumer actually needs. This includes applying filters at the source, projecting only necessary columns, and pre-aggregating large datasets to reduce the volume of transferred data.
- **Avoid repeated URI resolution**: Store resolved table references in variables rather than calling `resolve` multiple times for the same URI. Each call to `resolve` creates a new connection, so reuse the table reference when possible within your application.
- **Use appropriate data consistency models**: For analysis requiring consistent data across multiple operations, use table snapshots instead of live updating tables. Point-in-time consistency ensures all your data represents the same moment in time, preventing issues where some data updates mid-analysis while other data remains static. Snapshots freeze the table state at a specific moment, guaranteeing consistent results and reducing network overhead from continuous updates.
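
To illustrate the advice about avoiding repeated resolution, here is a minimal caching sketch. `ResolvedTableCache` is a hypothetical class, not part of Deephaven; it wraps any resolver callable (in practice, `resolve` from `deephaven.uri`) and hands back the stored reference on later requests for the same URI.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class ResolvedTableCache(Generic[T]):
    """Resolve each URI at most once and reuse the resulting reference."""

    def __init__(self, resolver: Callable[[str], T]) -> None:
        self._resolver = resolver
        self._tables: Dict[str, T] = {}

    def get(self, uri: str) -> T:
        # Only call the resolver the first time a URI is requested.
        if uri not in self._tables:
            self._tables[uri] = self._resolver(uri)
        return self._tables[uri]

    def forget(self, uri: str) -> None:
        # Drop a cached reference, for example after the remote instance restarts.
        self._tables.pop(uri, None)
```

In a Deephaven session this might be used as `cache = ResolvedTableCache(resolve)` followed by `cache.get("dh+plain://table-producer/scope/my_table")` wherever the table is needed.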

## Related documentation

- [`empty_table`](../reference/table-operations/create/emptyTable.md)
- [`time_table`](../reference/table-operations/create/timeTable.md)
- [`update`](../reference/table-operations/select/update.md)
- [Capture Python client tables](./capture-tables.md)
- [Application Mode](./application-mode.md)
- [Pydoc](https://deephaven.io/core/pydoc/code/deephaven.uri.html)
@@ -0,0 +1 @@
{"file":"how-to-guides/use-uris.md","objects":{}}
@@ -0,0 +1 @@
{"file":"how-to-guides/use-uris.md","objects":{"my_table":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"2"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"6"}],[{"value":"4"},{"value":"8"}],[{"value":"5"},{"value":"10"}],[{"value":"6"},{"value":"12"}],[{"value":"7"},{"value":"14"}],[{"value":"8"},{"value":"16"}],[{"value":"9"},{"value":"18"}],[{"value":"10"},{"value":"20"}],[{"value":"11"},{"value":"22"}],[{"value":"12"},{"value":"24"}],[{"value":"13"},{"value":"26"}],[{"value":"14"},{"value":"28"}],[{"value":"15"},{"value":"30"}],[{"value":"16"},{"value":"32"}],[{"value":"17"},{"value":"34"}],[{"value":"18"},{"value":"36"}],[{"value":"19"},{"value":"38"}],[{"value":"20"},{"value":"40"}],[{"value":"21"},{"value":"42"}],[{"value":"22"},{"value":"44"}],[{"value":"23"},{"value":"46"}],[{"value":"24"},{"value":"48"}],[{"value":"25"},{"value":"50"}],[{"value":"26"},{"value":"52"}],[{"value":"27"},{"value":"54"}],[{"value":"28"},{"value":"56"}],[{"value":"29"},{"value":"58"}],[{"value":"30"},{"value":"60"}],[{"value":"31"},{"value":"62"}],[{"value":"32"},{"value":"64"}],[{"value":"33"},{"value":"66"}],[{"value":"34"},{"value":"68"}],[{"value":"35"},{"value":"70"}],[{"value":"36"},{"value":"72"}],[{"value":"37"},{"value":"74"}],[{"value":"38"},{"value":"76"}],[{"value":"39"},{"value":"78"}],[{"value":"40"},{"value":"80"}],[{"value":"41"},{"value":"82"}],[{"value":"42"},{"value":"84"}],[{"value":"43"},{"value":"86"}],[{"value":"44"},{"value":"88"}],[{"value":"45"},{"value":"90"}],[{"value":"46"},{"value":"92"}],[{"value":"47"},{"value":"94"}],[{"value":"48"},{"value":"96"}],[{"value":"49"},{"value":"98"}],[{"value":"50"},{"value":"100"}],[{"value":"51"},{"value":"102"}],[{"value":"52"},{"value":"104"}],[{"value":"53"},{"value":"106"}],[{"value":"54"},{"value":"108"}],[{"value":"55"},{"value":"110"}],[{"value":"56"},{"value":"112"}],[{"value":"57"},{"value":"114"
}],[{"value":"58"},{"value":"116"}],[{"value":"59"},{"value":"118"}],[{"value":"60"},{"value":"120"}],[{"value":"61"},{"value":"122"}],[{"value":"62"},{"value":"124"}],[{"value":"63"},{"value":"126"}],[{"value":"64"},{"value":"128"}],[{"value":"65"},{"value":"130"}],[{"value":"66"},{"value":"132"}],[{"value":"67"},{"value":"134"}],[{"value":"68"},{"value":"136"}],[{"value":"69"},{"value":"138"}],[{"value":"70"},{"value":"140"}],[{"value":"71"},{"value":"142"}],[{"value":"72"},{"value":"144"}],[{"value":"73"},{"value":"146"}],[{"value":"74"},{"value":"148"}],[{"value":"75"},{"value":"150"}],[{"value":"76"},{"value":"152"}],[{"value":"77"},{"value":"154"}],[{"value":"78"},{"value":"156"}],[{"value":"79"},{"value":"158"}],[{"value":"80"},{"value":"160"}],[{"value":"81"},{"value":"162"}],[{"value":"82"},{"value":"164"}],[{"value":"83"},{"value":"166"}],[{"value":"84"},{"value":"168"}],[{"value":"85"},{"value":"170"}],[{"value":"86"},{"value":"172"}],[{"value":"87"},{"value":"174"}],[{"value":"88"},{"value":"176"}],[{"value":"89"},{"value":"178"}],[{"value":"90"},{"value":"180"}],[{"value":"91"},{"value":"182"}],[{"value":"92"},{"value":"184"}],[{"value":"93"},{"value":"186"}],[{"value":"94"},{"value":"188"}],[{"value":"95"},{"value":"190"}],[{"value":"96"},{"value":"192"}],[{"value":"97"},{"value":"194"}],[{"value":"98"},{"value":"196"}],[{"value":"99"},{"value":"198"}]]}}}}