Depth anything v2 workflow block #875
Conversation
Until the two above are clarified I will not approve the PR.
For the first bullet point, can `transformers` become a core dependency, or is it too large to include for just one block? I can foresee more Hugging Face wrapper workflow blocks being built.

For the second bullet, we could set up a model cache within the block. This could keep the model in memory between block instances, though I'm not sure how well that would behave on a multi-process server.

I was also thinking we could use the Hugging Face Inference API to post an image and get a prediction returned, which would be much more lightweight. However, on the model card, https://huggingface.co/depth-anything/Depth-Anything-V2-Base-hf, I read: "This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead." That's a bummer; it has 34,918 downloads, so I'd expect it to be hosted on the serverless API soon.

I'm also just now discovering ModelManager. How do we host CLIP and Florence-2? I would imagine that's a similar situation where loading these models would be extremely slow.
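For what it's worth, a minimal sketch of that in-block cache idea, assuming the `transformers` depth-estimation pipeline. The function name, cache size, and checkpoint are illustrative, not existing inference code:

```python
from functools import lru_cache


@lru_cache(maxsize=2)
def _get_depth_pipeline(model_id: str = "depth-anything/Depth-Anything-V2-Small-hf"):
    # Import lazily so `transformers` stays an optional dependency.
    from transformers import pipeline

    # The pipeline is created once per process and reused across block runs;
    # on a multi-process server each worker would still keep its own copy.
    return pipeline(task="depth-estimation", model=model_id)
```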
Is this still active? I would like to close it if it's stale.
Yes, it's still active. I'm going to get back to this soon. I want to reimplement it without Hugging Face for better performance. Depth Anything has come up in a bunch of use cases, so it's still really important for us to have.
@PawelPeczek-Roboflow, I'd like to revisit this. We've seen the need for a depth model in workflows pop up a few times. Do you think the points you raised above (1. the pinned huggingface dependency, 2. slow loading of the model) could be addressed by hosting the model ourselves and building it from source? https://github.com/DepthAnything/Depth-Anything-V2?tab=readme-ov-file The requirements are PyTorch and OpenCV, which I believe inference already supports.
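For context, a rough sketch of what using the model built from that source repo looks like, adapted from the upstream README. The checkpoint path and image path are placeholders, and it assumes the repo's `depth_anything_v2` package is importable and the ViT-L weights have been downloaded:

```python
import cv2
import torch
from depth_anything_v2.dpt import DepthAnythingV2

# Encoder configuration for the ViT-L variant, per the upstream README.
model = DepthAnythingV2(encoder="vitl", features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(
    torch.load("checkpoints/depth_anything_v2_vitl.pth", map_location="cpu")
)
model.eval()

raw_img = cv2.imread("example.jpg")  # BGR HxWx3, as OpenCV loads it
depth = model.infer_image(raw_img)   # HxW raw depth map
```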
I'm not sure we will make this specific model work nicely (but we have plenty of options: https://huggingface.co/models?pipeline_tag=depth-estimation&sort=trending). I believe we have now sorted out a way for HF models to work - take a look at @capjamesg's PR: #1106
Achieving all four means we can host it very effectively, and people would have a no-brainer option to use the model.
Description
This PR implements the Depth Anything V2 model integration as a workflow block, enabling depth map prediction from 2D images.
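As a rough illustration of the prediction step such a block performs (a sketch, not the block code in this PR; it assumes the `transformers` depth-estimation pipeline, the Small checkpoint, and an illustrative normalisation step):

```python
import numpy as np
from PIL import Image
from transformers import pipeline

depth_pipe = pipeline(
    task="depth-estimation",
    model="depth-anything/Depth-Anything-V2-Small-hf",
)

image = Image.open("example.jpg")  # placeholder input image
result = depth_pipe(image)

# The pipeline returns a dict with a PIL depth image under "depth";
# normalise it to [0, 1] so downstream blocks can visualise or threshold it.
depth = np.array(result["depth"], dtype=np.float32)
depth_normalized = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
```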
Dependencies:
Type of change
Please delete options that are not relevant.
How has this change been tested, please provide a testcase or example of how you tested the change?
The implementation has been tested with:
Any specific deployment considerations
For example, documentation changes, usability, usage/costs, secrets, etc.
Docs
Documentation for the block is included in its LONG_DESCRIPTION.