Description
UMF should offer a set of observability functions for retrieving the properties of memory allocated through UMF. Since these properties are closely tied to the provider used, the API should essentially return the provider's properties for a given pointer.
Requirements:
- allow querying the memory type (CPU or GPU), or at least whether the ptr is CPU-accessible
- allow access to provider-specific properties, such as NUMA node, USM type (Host, Device, Shared), GPU device ID, context, and others
Currently, for a given ptr, a user can obtain a provider by calling:
pool = umfPoolByPtr(ptr)
umfPoolGetMemoryProvider(pool, &provider)
After obtaining the provider, there are several options for retrieving the memory provider properties:
Proposal 1 - per-provider get/set functions
In this proposal, the user needs to be aware of the provider's type. Each provider's property can then be retrieved in a manner similar to how it was set during creation. Additionally, there could be extra functions that do not have a corresponding "set" function for provider properties, such as a function that retrieves the general type of memory (e.g., CPU or GPU). If a specific provider doesn't know how to populate a given property, we could return a new error code, UMF_RESULT_INVALID_PARAM.
name = umfMemoryProviderGetName(provider)
umf_memory_type_t memory_type // NEW enum: CPU or GPU
umf_usm_memory_type_t usm_memory_type // host, device, shared
switch(name)
case "level_zero":
// NEW per-property umf*MemoryProviderParamsGet*() call
ret1 = umfLevelZeroMemoryProviderParamsGetMemoryType(params, &memory_type)
// NEW return code
if (ret1 == UMF_RESULT_INVALID_PARAM) {
...
}
ret2 = umfLevelZeroMemoryProviderParamsGetUSMMemoryType(params, &usm_memory_type)
case "os":
unsigned* numa_list = NULL
unsigned numa_list_len = 0
ret1 = umfOsMemoryProviderParamsGetNumaList(params, &numa_list, &numa_list_len)
ret2 = umfOsMemoryProviderParamsGetMemoryType(params, &memory_type)
...
Proposal 2 - common properties structure
In this proposal, we could define a common structure for provider properties, along with a function that returns it based on the provider handle. In this structure, we could keep some properties, such as is_cpu_accessible, in a common scope, while storing provider-specific properties in unions (which could be nested). Additionally, we could introduce the type of provider as one of the properties.
// NOTE: this NEW enum is defined in the public API
enum umf_memory_provider_type {
UMF_MEMORY_PROVIDER_OS,
UMF_MEMORY_PROVIDER_CUDA,
UMF_MEMORY_PROVIDER_LEVEL_ZERO,
UMF_MEMORY_PROVIDER_FILE,
}
// NOTE: this NEW struct is defined in the public API
struct umf_memory_provider_params_t {
DWORD version // NOTE: this struct has to be versioned
bool is_cpu_accessible
umf_memory_provider_type provider_type
union {
struct {
int numa_node
DWORD flags
} os
struct {
umf_usm_memory_type_t usm_type
void* gpu_context
union {
struct {
void* device_ctx
} level_zero
struct {
int device_id
} cuda
}
} gpu
}
}
umf_memory_provider_params_t params = umfMemoryProviderGetParams(provider)
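A consumer of such a struct would have to check both the version and the provider type before touching any union member. The sketch below illustrates that pattern with a simplified copy of the proposed layout. All `ex_`/`EX_` names are hypothetical stand-ins, not UMF API, and the inner union is given a name (`dev`) purely to keep the example portable C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the struct proposed above; not real UMF API. */
#define EX_PARAMS_VERSION 1u

typedef enum { EX_PROVIDER_OS, EX_PROVIDER_CUDA, EX_PROVIDER_LEVEL_ZERO } ex_provider_type_t;
typedef enum { EX_USM_HOST, EX_USM_DEVICE, EX_USM_SHARED } ex_usm_type_t;

typedef struct {
    uint32_t version;
    bool is_cpu_accessible;           /* common property, always valid */
    ex_provider_type_t provider_type; /* tag selecting the union member */
    union {
        struct { int numa_node; uint32_t flags; } os;
        struct {
            ex_usm_type_t usm_type;
            void *gpu_context;
            union {
                struct { void *device_ctx; } level_zero;
                struct { int device_id; } cuda;
            } dev;
        } gpu;
    } u;
} ex_provider_params_t;

/* Returns the NUMA node, or -1 when the struct version is unexpected
 * or the provider is not the OS provider. */
int ex_numa_node(const ex_provider_params_t *p) {
    if (p == NULL || p->version != EX_PARAMS_VERSION) return -1;
    if (p->provider_type != EX_PROVIDER_OS) return -1;
    return p->u.os.numa_node;
}

/* Returns the CUDA device id, or -1 for any non-CUDA provider. */
int ex_cuda_device_id(const ex_provider_params_t *p) {
    if (p == NULL || p->version != EX_PARAMS_VERSION) return -1;
    if (p->provider_type != EX_PROVIDER_CUDA) return -1;
    return p->u.gpu.dev.cuda.device_id;
}
```

Reading a union member without checking the tag would silently return another provider's data, which is part of why this proposal scores "no" on encapsulation in the table below.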
Proposal 1 + 2 - generic per-property set of functions
This is a mix of proposals 1 and 2. The difference is that umf_memory_provider_params_t is hidden from the user, and there is a public list of generic (not provider-specific) per-property functions:
umfMemoryProviderGetParams(provider, &params)
umfMemoryProviderParamsGetType(params, &provider_type)
umfMemoryProviderParamsGetCPUAccessible(params, &is_cpu_accessible)
umf_result_t res = umfMemoryProviderParamsGetGPUContext(params, &gpu_context)
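To make the "hidden struct plus generic getters" idea concrete, here is a minimal sketch. It assumes a dedicated "not supported" result code for properties that do not apply to a given provider. Every name in it (`ex_params_t`, `ex_result_t`, `EX_ERR_NOT_SUPPORTED`, and so on) is hypothetical and only illustrates the shape of such an API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the UMF API. */
typedef enum { EX_OK = 0, EX_ERR_INVALID_ARG, EX_ERR_NOT_SUPPORTED } ex_result_t;
typedef enum { EX_KIND_OS, EX_KIND_GPU } ex_kind_t;

/* In a real library only this forward declaration would be public. */
typedef struct ex_params ex_params_t;

/* Private definition; the layout can change without breaking callers. */
struct ex_params {
    ex_kind_t kind;
    bool cpu_accessible;
    void *gpu_context; /* meaningful only when kind == EX_KIND_GPU */
};

/* Common property: valid for every provider kind. */
ex_result_t ex_params_get_cpu_accessible(const ex_params_t *p, bool *out) {
    if (p == NULL || out == NULL) return EX_ERR_INVALID_ARG;
    *out = p->cpu_accessible;
    return EX_OK;
}

/* GPU-only property: other providers report "not supported" and leave *out untouched. */
ex_result_t ex_params_get_gpu_context(const ex_params_t *p, void **out) {
    if (p == NULL || out == NULL) return EX_ERR_INVALID_ARG;
    if (p->kind != EX_KIND_GPU) return EX_ERR_NOT_SUPPORTED;
    *out = p->gpu_context;
    return EX_OK;
}
```

Because callers never see the struct layout, this variant would not need the explicit versioning that Proposal 2 requires.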
Proposal 3A - per-provider key-value set based on strings
Each provider could keep a key-value store that could hold its properties. Then, the user could get the specific property from a provider using its name.
// NOTE: this structure is non-public
struct umf_property_t {
char* key
void* value
}
// NOTE: this structure is non-public
struct umf_property_set_t {
umf_property_t* properties
size_t count
}
const umf_property_set_t* props = umfMemoryProviderGetProps(provider)
// NEW API call - return value in last param (void*)
// NOTE: do we always return void* or unsigned int? If not, the additional "size" param could be needed
umf_result_t res = umfGetPropValueByName(props, "memory_type", &memory_type)
// get name of the prop
const char* prop_name = NULL
umfGetPropName(props, id, &prop_name)
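The lookup by name could be sketched as a linear scan over the key-value set. The version below also stores a size next to each value, which is one possible answer to the open question about the extra "size" parameter. The types and the function `ex_get_prop_by_name` are hypothetical stand-ins for the non-public structures above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the non-public structures sketched above. */
typedef struct {
    const char *key;
    const void *value;
    size_t size; /* size of the stored value in bytes */
} ex_property_t;

typedef struct {
    const ex_property_t *properties;
    size_t count;
} ex_property_set_t;

/* Copies the value stored under `name` into `out`.
 * Returns 0 on success, -1 if an argument is NULL or the key is absent,
 * and -2 if `out_size` does not match the stored size. */
int ex_get_prop_by_name(const ex_property_set_t *set, const char *name,
                        void *out, size_t out_size) {
    if (set == NULL || name == NULL || out == NULL) return -1;
    for (size_t i = 0; i < set->count; i++) {
        /* One string comparison per entry: this is the cost the table calls "slow". */
        if (strcmp(set->properties[i].key, name) == 0) {
            if (set->properties[i].size != out_size) return -2;
            memcpy(out, set->properties[i].value, out_size);
            return 0;
        }
    }
    return -1;
}
```

Checking the size on every call lets the library reject a caller that passes, say, a `char` buffer for an `int` property instead of overwriting adjacent memory.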
Proposal 3B - per-provider key-value set based on IDs
Similar to the string-based Get functions from Proposal 3A, we could look properties up by a property ID. The IDs could be predefined in public headers.
enum umf_property_id {
UMF_MEMORY_TYPE,
// OS provider
UMF_NUMA_NODE_ID,
// any GPU provider
UMF_MEMORY_USM_TYPE,
UMF_GPU_CONTEXT,
// CUDA-specific
UMF_DEVICE_ID,
// Level Zero-specific
UMF_DEVICE_CTX,
// File-provider-specific
....
UMF_MAX_PROPERTY_ID
}
const umf_property_set_t* props = umfMemoryProviderGetProps(provider)
void* value = umfGetPropValueById(props, UMF_NUMA_NODE_ID)
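With IDs, the lookup can become an array index instead of a string scan. The sketch below assumes a per-provider table with a "present" flag per ID. The names (`ex_property_id_t`, `ex_property_table_t`, `ex_get_prop_by_id`) are hypothetical and the ID list is shortened for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs modelled on the enum above; not real UMF API. */
typedef enum {
    EX_PROP_MEMORY_TYPE,
    EX_PROP_NUMA_NODE_ID,
    EX_PROP_DEVICE_ID,
    EX_PROP_MAX_ID /* number of IDs; must stay last */
} ex_property_id_t;

typedef struct {
    bool present[EX_PROP_MAX_ID];   /* which IDs this provider populated */
    intptr_t value[EX_PROP_MAX_ID]; /* integer- or pointer-sized payloads */
} ex_property_table_t;

/* Constant-time lookup: returns true and writes *out only if the ID is set. */
bool ex_get_prop_by_id(const ex_property_table_t *t, ex_property_id_t id,
                       intptr_t *out) {
    if (t == NULL || out == NULL) return false;
    if ((int)id < 0 || (int)id >= (int)EX_PROP_MAX_ID) return false;
    if (!t->present[id]) return false;
    *out = t->value[id];
    return true;
}
```

Returning a separate success flag, rather than the bare `void*` used in the call above, avoids confusing "property not available" with a legitimate zero value such as NUMA node 0.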
Proposal 4 - CTL
Similar to proposal 3A but based on CTL.
// get value by name
umfCtlGet("umf.provider.by_handle.props.memory_type", provider, &memory_type)
// get a list of props
int count = 0;
umfCtlGet("umf.provider.by_handle.props.count", provider, &count)
// get name/val by id
const char* name = NULL;
umfCtlGet("umf.provider.by_handle.props.2.name", provider, &name)
void* val;
umfCtlGet("umf.provider.by_handle.props.2.val", provider, &val)
Additional Considerations - per allocation properties
Please note that in the proposals above, we assumed that all pointer properties could be derived from the provider properties. However, this is not the case for certain attributes, such as the unique ID of the allocation (see CU_POINTER_ATTRIBUTE_BUFFER_ID for CUDA and ze_memory_allocation_properties_t.id for Level Zero) or the page size. To query the page size of an allocation, the user could use the generic umfMemoryProviderGetMinPageSize(provider, ptr, &page_size) function. However, we still need to define a new umfMemoryProviderGetAllocationID(provider, ptr, &id) function for retrieving the allocation ID.
Additional per-pointer properties to consider are the base pointer and the size of the full allocation (see zeMemGetAddressRange).
Hybrid proposal
It is also worth noting that we could achieve both flexibility (as in Proposal 4 - CTL) and performance by caching per-provider properties on the user side:
// this struct is defined by the user and caches only required provider properties
struct props_struct {
int type; // set accordingly
union {
struct cuda_props cuda;
struct l0_props l0;
} data;
}
...
get_properties(ptr, props_cache) {
// get memory provider
umf_pool_handle_t pool = umfPoolByPtr(ptr)
umfPoolGetMemoryProvider(pool, &provider)
// props_struct is defined and filled by the user
props_struct* props = props_cache[provider]
if (props == NULL) {
// slow path - cache properties once
props = calloc(1, sizeof(props_struct)) // empty
const char* adapter_name = umf_get_prop(provider, "adapter_name") // only example - could be CTL
if (strcmp(adapter_name, "CUDA") == 0) {
props->type = PROPS_STRUCT_CUDA
props->data.cuda.prop2 = umf_get_prop(provider, "prop2")
props->data.cuda.prop3 = umf_get_prop(provider, "prop3")
} else if (strcmp(adapter_name, "L0") == 0) {
props->type = PROPS_STRUCT_L0
props->data.l0.prop2 = umf_get_prop(provider, "prop2")
props->data.l0.prop3 = umf_get_prop(provider, "prop3")
}
props_cache[provider] = props
}
// fast path - read a cached value (is_host stands for any field the user chose to cache)
bool is_host = props->is_host;
}
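The caching idea above can also be written as a small self-contained program. In this sketch a mock provider and two mock query functions stand in for the slow, string-based (or CTL-based) path, and a counter shows that the slow path runs only once per provider. All names are hypothetical, the cache is a fixed-size linear table, and it is not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: ex_provider_t plays the role of a provider handle,
 * and the ex_query_* functions play the role of slow property queries. */
typedef struct { const char *adapter_name; int device_id; } ex_provider_t;

int ex_slow_queries = 0; /* counts slow-path calls, for illustration only */

const char *ex_query_adapter(const ex_provider_t *p) { ex_slow_queries++; return p->adapter_name; }
int ex_query_device_id(const ex_provider_t *p) { ex_slow_queries++; return p->device_id; }

/* User-defined cache entry holding only the properties the user needs. */
typedef struct { const ex_provider_t *key; bool is_host; int device_id; } ex_cached_props_t;

#define EX_CACHE_SLOTS 8
ex_cached_props_t ex_cache[EX_CACHE_SLOTS];
size_t ex_cache_used = 0;

/* Returns cached properties for `p`, filling the cache on first use.
 * Returns NULL when `p` is NULL or the cache is full. */
const ex_cached_props_t *ex_get_props(const ex_provider_t *p) {
    if (p == NULL) return NULL;
    for (size_t i = 0; i < ex_cache_used; i++) {
        if (ex_cache[i].key == p) return &ex_cache[i]; /* fast path: pointer compare */
    }
    if (ex_cache_used == EX_CACHE_SLOTS) return NULL;
    ex_cached_props_t *e = &ex_cache[ex_cache_used++]; /* slow path, once per provider */
    e->key = p;
    e->is_host = strcmp(ex_query_adapter(p), "OS") == 0; /* assumption: "OS" means host memory */
    e->device_id = e->is_host ? -1 : ex_query_device_id(p);
    return e;
}
```

A production version would more likely use a hash map keyed by the provider handle and would need to drop entries when a provider is destroyed.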
Pros / Cons
Criterion | Proposal 1: per-provider get/set funcs | Proposal 2: common props struct | Proposal 1 + 2: generic per-property funcs | Proposal 3A: per-provider key-value strings set | Proposal 3B: per-provider key-value ID set | Proposal 4: CTL | Proposal 5: Hybrid CTL |
---|---|---|---|---|---|---|---|
easy to implement | + yes | - complex structure, new ops method | - complex structure, new ops method | + easy | + easy | - complex | - complex + code at user side |
consistent with how we set props of providers | + yes | - no | +/- partially | - no | - no | - no | - no |
API defined in per-provider vs common headers | per-provider | common | common + per-provider | common, kv structures non public | common | hidden | hidden |
encapsulation | + yes | - no (types) | + yes if we keep some API in e.g. GPU headers | + yes | + yes | + yes | + yes |
needs to be versioned | + no | - yes | + no | + no | - yes? | + no | + no |
number of new API functions | - large | + small | +/- moderate | + small | + small | + none - uses existing | + none - uses existing |
performance | + fast | + fast | + fast | - slow (string compare) | + fast | - slow | + fast |
supports "common" properties | - no | + yes | + yes | + yes | + yes | + yes | + yes |
supports user-defined providers | + yes | +/- only common props | +/- only common props | + yes | +/- yes with some potential problems | + yes | + yes |
supports user-defined properties | + yes | - no | + yes (tricky) | + yes | +/- yes with some potential problems | + yes | + yes |
user needs to know the type of provider for common props | - yes | + no | + no | + no | + no | + no | + no |