This repository was archived by the owner on Aug 22, 2024. It is now read-only.

Body tracking is too slow and inaccurate #514

Closed
rfilkov opened this issue Jul 17, 2019 · 101 comments
Assignees
Labels
Body Tracking Issue related to the Body Tracking SDK Enhancement New feature or request

Comments

@rfilkov

rfilkov commented Jul 17, 2019

Describe the bug

By all means Azure Kinect is the best Kinect so far, and will probably be the best depth sensor on the market. The sensor SDK is pretty stable and good, providing almost everything an average user would want. But the body tracking subsystem is ruining this positive user experience. In terms of API this SDK is great too, but the DNN model performance is much worse than the body tracking of Kinect-v2. The joint positions are inaccurate during fast movements. The body index map is not very accurate either. It does not fully match the user's silhouette on the depth frame. On my GTX 1060 it takes 2-3 depth frame cycles to process a body frame. Hence, it works at about 10 fps.
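
As a quick sanity check of the frame-rate figure above, here is a minimal sketch of the arithmetic. It assumes the depth camera delivers 30 fps (the rate implied by the expectations below); the function name is illustrative and not part of any SDK.

```python
def effective_body_fps(depth_fps: float, depth_frames_per_body_frame: float) -> float:
    """Body-frame rate when one body frame takes several depth-frame cycles."""
    if depth_frames_per_body_frame <= 0:
        raise ValueError("depth_frames_per_body_frame must be positive")
    return depth_fps / depth_frames_per_body_frame


# Assumed 30 fps depth stream; 2-3 cycles per body frame as reported above.
for cycles in (2, 3):
    print(f"{cycles} cycles per body frame -> {effective_body_fps(30.0, cycles):.1f} fps")
```

At 3 cycles per body frame this gives the roughly 10 fps reported here; at 2 cycles it would be 15 fps.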

To Reproduce

  1. Run Azure Kinect Body Tracking Viewer.
  2. Stand in front of the sensor.
  3. Make fast arm movements.
  4. Look at the arm joint positions with regard to the real arms.
  5. Look at the colorized body-index map with regard to the real body.

Expected behavior

  1. I expect the body tracking to work at 30 fps or more, i.e. at least as fast as the depth frames arrive.
  2. I expect the body joint positions to match as precisely as possible the user's joints on the depth frame.
  3. I expect the body index map to match as precisely as possible the user's silhouette on the depth frame.
  4. I expect the body tracking to be less demanding in terms of hardware and 3rd party software requirements. GTX 1070 + CUDA + cuDNN + manually setting paths would be too much for the average user.

Please consider at least providing some option for users who don't have high-end graphics cards and would like to get Kinect body tracking out of the box, without (or with minimal) extra installations. As far as I remember, Kinect-v2 used random forest model(s) for its body tracking. The performance was great and no extra installations were needed, back then in 2013/14.

Logs

Screenshots

Desktop (please complete the following information):

  • OS with Version: Windows 1809
  • Kinect SDK Version: 1.1.0
  • Body tracking SDK Version: 0.9.0

Additional context

I believe most Kinect users would expect a better, more accurate and more performant body tracking experience, not a worse one. And now, with Apple adding people segmentation and body tracking to their AR-Kit 3.0, I would expect Kinect (with all these years of experience) to provide a better user experience in all aspects than anybody else.

@rfilkov rfilkov added Body Tracking Issue related to the Body Tracking SDK Bug Something isn't working Triage Needed The Issue still needs to be reviewed by Azure Kinect team members. labels Jul 17, 2019
@billpottle

I understand that the body tracking is only a preview release, and will likely be improved over time. As a stopgap, if it's possible, perhaps the body tracking can expose a parameter that can be tuned to achieve higher accuracy for offline or remote processing on powerful systems for non-real time applications.

There are many use cases that might only need a 3-4 second clip and could afford to spend minutes/hours processing, but can't afford meter+ gaps between actual and reported joint positions.

@Brekel

Brekel commented Jul 17, 2019

Fully agree!

I have a feeling that the output data is also overly filtered/smoothed somewhere in the pipeline, possibly at the model fitting stage after the DNN?
Maybe for certain use cases it could be useful to favor plausible poses over per-frame accuracy, but a user setting to tune this behavior could definitely improve usability in many more scenarios.

Another addition could be to output the 2D joint positions straight from the DNN when we need accurate points on the IR frame prior to any model fitting and post filtering stages.

I've had these 2 suggestions on the feedback page already in case you want to upvote:
https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38027029-user-adjustable-skeleton-smoothing
https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38031535-2d-joint-positions

@cdedmonds cdedmonds added Enhancement New feature or request and removed Bug Something isn't working Triage Needed The Issue still needs to be reviewed by Azure Kinect team members. labels Jul 17, 2019
@cdedmonds
Contributor

cdedmonds commented Jul 17, 2019

Thank you for the feedback. Please note that the Body Tracking SDK is currently a preview release and we are actively working on many aspects of quality. The installation process and dependencies are a pain point that we are aware of and are working to fix.

As of the preview release the minimum recommended card is a GTX1070:
https://docs.microsoft.com/en-us/azure/kinect-dk/system-requirements

Please keep sending us your feedback as new releases come out.

@cdedmonds cdedmonds self-assigned this Jul 17, 2019
@rfilkov
Author

rfilkov commented Jul 17, 2019

@Brekel I upvoted your first suggestion some days ago and now the 2nd one as well. Anyway, I don't think tuning a single hyperparameter would solve the lag issue. DNNs are notoriously slow, even on high-end graphics processors.

@Brekel

Brekel commented Jul 17, 2019

Thanks for the upvote.
Maybe you're right but I think this is not lag/slowness but filtering I'm seeing on my GTX1080ti, could be wrong.

In any case the DNN is still amazingly robust so not a critique just trying to help the team in making it more awesome and useful for more scenarios. The more controls we have the easier that will be.

@rfilkov
Author

rfilkov commented Jul 17, 2019

@cdedmonds Thank you for the swift response! I know the system requirements and that the BT SDK is still in preview. But I think it would be better (for both you and us) to give such feedback now, while it is in preview, rather than when the official release is out.

I suppose the Azure Kinect team is much different from the team that developed the Kinect-v2 sensor and SDK, but anyway, you should have access to all the documentation and internal info from back then. It may be worth comparing the K4A body tracking to the body tracking of Kinect-v2 in terms of performance, accuracy, features, extras, etc. It would be good to end up with a well-performing, more accurate and more versatile body tracking than the previous one.

Please keep this issue open, so we could comment when the new BT SDK releases come out. When do you plan to release the next preview version, by the way?

@cdedmonds
Contributor

We appreciate the feedback. Keep it coming.

@Brekel

Brekel commented Jul 17, 2019

@cdedmonds I would say the key part is exposing as much control as possible to make things adaptable to a wide range of scenarios.

For example:

  • for people identification/tracking/counting it may be enough to get the 2D output of the DNN and no 3D model fitting
  • for interaction we do want a skeleton emphasizing smoothness and plausible poses (like we have now)
  • for pose estimation (especially in multi-sensor/computer scenarios) we want a skeleton that may contain noise, missing limbs (with a confidence score) and less plausible poses at times but with no inter-frame smoothing at all
  • for machine learning we probably want something in between, plausible poses but no inter-frame lag
  • for offline tracking we may want to have a non-realtime version that favors accuracy over computation time

Again, I don't know what is possible with the underlying algorithms and am just guessing, but the more that can be exposed, the wider the range of use cases. We're clever programmers using this you know :)

@billpottle

I definitely agree that if you can expose more control parameters, more possible use cases open up in a non-linear fashion, because then developers can work around the limitations by adapting their workflow, or by combining outputs from processing a stream more than once. I think this is particularly true for a device like the Kinect which feels like a revolutionary "general purpose" technology, with a lot of applications that haven't really been developed before. For instance, we would love to have low-latency, low computation positional accuracy. However, if that's not possible, we can always send the mkv files for remote processing on cloud GPUs and then send the data back to the end user.

@Brekel

Brekel commented Jul 17, 2019

Absolutely!
Although cloud is not always an option since body tracking deals with people and possible privacy issues that may require local compute. :)

@rfilkov
Author

rfilkov commented Jul 18, 2019

@cdedmonds Please also look at what the competition is doing: https://developer.apple.com/videos/play/wwdc2019/607

@Chris45215

Chris45215 commented Aug 29, 2019

For my use cases, the tracking latency and the ability to recognize a person are the key concerns - positional accuracy within 6 inches (roughly 15cm) is sufficient. Multiple sensors can be used to increase accuracy, but there is no way to get around the latency and tracking delay. 60fps depth tracking would probably help alleviate the problem, but if that's not possible with the hardware then reduced latency is a priority. We can always add more cameras, but we can't subtract time. Though I note it would be easier to add more cameras if they didn't each require a PC with a Nvidia card; Intel NUCs would take far less space, and in fact the Azure Kinect fits atop them perfectly.

@rfilkov
Author

rfilkov commented Sep 1, 2019

I must admit, v0.9.2 has significant improvements in terms of performance and accuracy. In this regard, I have a question: did you change the model as well, or only the SDK internally?

@trekze

trekze commented Sep 2, 2019

@rfilkov Is there an MSI for 0.9.2? I can't seem to get my hands on it.

@rfilkov
Author

rfilkov commented Sep 2, 2019

@hmexx here it is.

@qm13
Contributor

qm13 commented Sep 3, 2019

@rfilkov v0.9.2 includes an updated model.

@trekze

trekze commented Sep 6, 2019

The new model does seem slightly better.

The main issue for us is that the model does not output confidence scores for the joints. Most 2D pose estimation models have this feature, which makes it easy to deal with situations where the model is bound to fail (e.g. an occluded joint). Is this something that can be added? I'll start a new issue with a feature request.
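
To illustrate why per-joint confidence helps, here is a minimal sketch of how a consumer could discard unreliable joints. The `Joint` type, its `confidence` field, the joint names, and the 0.5 threshold are all hypothetical; per the comment above, the Body Tracking SDK does not output such scores.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Joint:
    """Hypothetical joint record; not an SDK type."""
    position: Tuple[float, float, float]  # x, y, z in metres (assumed units)
    confidence: float  # 0.0 = occluded/unknown, 1.0 = certain (assumed scale)


def reliable_joints(joints: Dict[str, Joint], min_confidence: float = 0.5) -> Dict[str, Joint]:
    """Keep only the joints whose confidence meets the threshold."""
    return {name: j for name, j in joints.items() if j.confidence >= min_confidence}


# Made-up example data: the elbow is treated as occluded.
skeleton = {
    "wrist_left": Joint((0.30, 0.10, 1.50), 0.92),
    "elbow_left": Joint((0.25, 0.30, 1.55), 0.15),
}
print(sorted(reliable_joints(skeleton)))
```

With a score like this, an application can skip or interpolate occluded joints instead of trusting a guessed position.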

@PierrePlantard

I agree about the confidence score for each joint.
This improvement seems to be under evaluation (https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38166871-joint-tracking-state).
Another big limitation is the hardware requirement.
We want to work on the move with a laptop, but such hardware (e.g. GTX 1070 or RTX 2070) needs too much power for mobile use (on battery).
We think it is very important to add more configuration possibilities for the body tracking, for example a lighter model compatible with lower-grade graphics cards (e.g. GTX 1650).

@rfilkov
Author

rfilkov commented Sep 6, 2019

@hmexx Here is a feature request in this regard: https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38166871-joint-tracking-state Please upvote or comment there, too.

@Chris45215

@qm13, I may be the only person here satisfied with that answer. Has anyone at Microsoft tried to reproduce the test I posted at https://youtu.be/7Jc7KhoPWdc yet? Considering that I found that Microsoft's example code for realtime, lowest-latency-possible output has an error that causes it to buffer an extra frame and thus always be 1 frame further behind than it could be - and the example code has yet to be updated with a correction - I question the testing (if any) of bodytracking latency.

I am still of the opinion that either the DNN is designed to withhold the result until it gets later frames (thus returning a 'stale' result), or some part of the firmware or SDK is delaying frames when it shouldn't be. I could partially test this by creating 'artificial' frames (a saved video+depth feed that is edited and re-arranged by hand) to feed into the DNN, which would let me more easily catch it delaying frames if the delay is on the DNN side. This would take far more than a reasonable amount of my time to do, and I don't think the editing toolset for that even exists so I would need to create it.
If my suspicions are correct, then it provides a much easier route to greatly improve the latency issues.

Beyond that, I think we all know what the real, long-term solution must be: integrate the CUDA cores and about 1GB RAM into the sensor.

@gradientLord

this is fine

@Chris45215

I created issue #1253 to suggest integrating the body tracker into a V2 of the sensor. That would solve all of the problems. It may be solving it by increasing the price of the sensor, but I'd pay $4000 per sensor if it had integrated body tracking and 50ms body tracking latency.

#1253

Beyond that, I stand by my statement that the current Nvidia-GPU dependent approach is fine for me IF the latency issue can be fixed. My GPU processing time is 22ms per frame, but that does not jibe with the real-world measured (and demonstrated) latency of the body tracking.

@PierrePlantard

@qm13, thank you for the update. For us, IF the new DNN version allows us to run on a consumer-grade laptop (without an Nvidia GTX GPU) and maintains the accuracy level of the current DNN, it's fine!
Do you have an expected release date for this new version?

@knat

knat commented Jun 16, 2020

I'm using an RTX 2060 Super with a single Kinect; the frame rate is almost 30 FPS and the latency is acceptable.

@bastiankayser

@qm13 Thank you for the update and the transparency. Nevertheless I must say that I am deeply disappointed by this decision. You have a dedicated community that is begging on all possible channels for a functionality that was already implemented years ago, because otherwise a lot of interesting and serious applications, especially in the health sector, are not feasible. And your (the development team's) answer is: "Sorry we want to play with the current hot stuff", namely DNNs. I get it, it is a fascinating technology, but you are ignoring all the people that actually want to use your sensor for something useful that can have a positive impact on many people's lives.
Please rise to your responsibility as the technology leader in this space that you still are and give the people what they want and what they had before: Fast, reliable body tracking on low to mid level hardware. Thank you.

@fractalfantasy

fractalfantasy commented Jun 23, 2020

@qm13 @wes-b Please mark the feature request as ‘unplanned’, to show the dev community how much you value their input.

This is the most discussed issue and second most requested feature - porting the option over would be a breeze for any decent developer.

@rfilkov
Author

rfilkov commented Jun 27, 2020

If there is nothing else to add here, I'd like to close this very long and fruitless thread. A year later, it's also a bit too late to start implementing the most wanted (or any) feature requests.

@rfilkov rfilkov closed this as completed Jun 27, 2020
zchLovv added a commit to zchLovv/K4A_OpenVR that referenced this issue Jul 26, 2020
Added prediction methods based on velocity calculations to update feet and hip positions when kinect prediction has not processed yet. This decreased the latency of updating the trackers in space (made the trackers track quicker movements better). Right now only positions are updated. Velocity stays constant for performance until the impact of adding this can be benchmarked.

Reduced estimated time differences in velocity calculation to accommodate reduced update latency

Added velocity calculation for hip tracker

Changed the way the vector position was updated. Instead of using just the raw data, the raw data is merged with the predicted position (same as the prediction method above) at a ratio of 6 to 4. This decreased the variability of the data (made the trackers less jittery) for the feet to a level I found usable.
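
A rough sketch of the velocity-based prediction and blending described in this commit message; it is not the actual K4A_OpenVR code. All function names are illustrative, and which side of the 6-to-4 ratio applies to the raw data is an assumption (raw is weighted 0.6 here).

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def estimate_velocity(prev_pos: Vec3, curr_pos: Vec3, dt: float) -> Vec3:
    """Finite-difference velocity between two tracked positions."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return tuple((c - p) / dt for p, c in zip(prev_pos, curr_pos))


def predict_position(last_pos: Vec3, velocity: Vec3, dt: float) -> Vec3:
    """Extrapolate assuming constant velocity over dt seconds."""
    return tuple(p + v * dt for p, v in zip(last_pos, velocity))


def blend(raw: Vec3, predicted: Vec3, raw_weight: float = 0.6) -> Vec3:
    """Weighted merge of raw and predicted positions (assumed 6:4 in favour of raw)."""
    return tuple(raw_weight * r + (1.0 - raw_weight) * q for r, q in zip(raw, predicted))


# Made-up numbers: a foot moving 0.1 m along x over an assumed 0.1 s frame gap.
velocity = estimate_velocity((0.0, 0.0, 2.0), (0.1, 0.0, 2.0), dt=0.1)
predicted = predict_position((0.1, 0.0, 2.0), velocity, dt=0.05)
blended = blend((0.2, 0.0, 2.0), predicted)
print(velocity, predicted, blended)
```

The prediction fills the gap while no new body frame is available; the blend trades some responsiveness for less jitter once a raw frame arrives.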

Removed temporal smoothing tweaks in the calibrator as any level of smoothing set above 0.05 resulted in trackers that were too slow to use.

In the future (maybe if more people want this functionality), timing data in the function needs to be computed dynamically instead of using hardcoded estimates (estimates based on benchmarks found here microsoft/Azure-Kinect-Sensor-SDK#514). This needs to be done to accommodate different machine setups and to also support newer software/hardware that can process frames at a higher fps.
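
One possible way to compute the timing dynamically, sketched under assumptions: frame timestamps are available in seconds, and an exponential moving average is an acceptable smoother. The class name and the `alpha` parameter are hypothetical and not taken from the commit.

```python
from typing import Optional


class FrameIntervalEstimator:
    """Tracks a smoothed interval between body frames from their timestamps."""

    def __init__(self, initial_dt: float, alpha: float = 0.2) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.dt = initial_dt  # starting guess, e.g. a previously hardcoded estimate
        self.alpha = alpha    # weight given to the newest measurement
        self._last_timestamp: Optional[float] = None

    def update(self, timestamp_s: float) -> float:
        """Feed the timestamp of a new frame; returns the smoothed interval."""
        if self._last_timestamp is not None:
            measured = timestamp_s - self._last_timestamp
            if measured > 0:  # ignore duplicate or out-of-order timestamps
                self.dt = self.alpha * measured + (1.0 - self.alpha) * self.dt
        self._last_timestamp = timestamp_s
        return self.dt


# Made-up timestamps: frames arriving every 50 ms, starting guess of 100 ms.
estimator = FrameIntervalEstimator(initial_dt=0.1, alpha=0.5)
for t in (0.0, 0.05, 0.10, 0.15):
    print(f"t={t:.2f}s -> estimated dt={estimator.update(t):.4f}s")
```

In a real loop the timestamps could come from `time.monotonic()` or from the capture itself; the estimate then adapts to whatever frame rate the machine actually achieves.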

TODOs :
1: figure out a way to preserve hip positional data when rotating the body (when the body is rotated, the hip's predicted position becomes whatever side is facing the camera). This breaks some of the rotational tracking capabilities of the device. Potentially look into setting the hip tracker on one or some combination of the hip bones instead of the pelvis bone?

2: Measure the performance impact of predicting data vs using only raw data

3: Implement the method for more trackers

4: Add accurate timing data into the prediction method so the predicted values can be more accurate for every machine (not just my testing rig).

Testing Rig Specs:
i7-8700k
MSI 370 A-Pro Motherboard
32gb ddr4 ram
rx 580 8gb
gtx 960 2gb
Rift S
Kinect Azure
Windows 10
All drivers and software up to date
7/26/2020
@Chang-Che-Kuei

My GPU is Nvidia RTX 2060. It runs at about 30 FPS.

@vpenades

@Chang-Che-Kuei Developers switching from Kinect2 to Kinect4A expect similar performance on similar hardware, which is not the case with Kinect4A.

@fractalfantasy

fractalfantasy commented Jan 13, 2021

They promised a lighter, more performant DNN model, that works across different GPU manufacturers and then completely abandoned development.

@RoseFlunder

It is also a bit worrying that there is still no release of the body tracking SDK compatible with the new Nvidia 3000 series, and it's been a known issue since October:
#1125

Or that they never released Ubuntu binaries for a body tracking sdk that is compatible with sensor sdk 1.4.

@vpenades

@fractalfantasy @RoseFlunder as far as I know, they were still on it with the hopes of a late 2020 release, but it seems it's been delayed. But given Microsoft's track record of announcing project cancellations, it could very well be the case that it's been silently canceled.

Anyway, on our side, we moved on to look for alternate solutions that don't require this BodyTracking nor the K4A camera. We simply could not afford waiting a whole year for a solution.

@fractalfantasy

@vpenades may I ask if you found any alternate solutions? I was hoping that maybe Nuitrack would support the Azure Kinect, but doesn't seem to be the case.

I looked into other sensors but couldn't find any competitor that has good resolution... Intel sensors aren't even comparable to Kinect v2.

@vpenades

vpenades commented Jan 13, 2021

@fractalfantasy I can't comment, sorry. But our solution is tailored to our use case.

It's still not as good as what the Kinect2 delivered, though, so if the kinect team finally delivers something that really improves (a lot) on what's currently available, then we might switch back to it.

@rfilkov
Author

rfilkov commented Jan 13, 2021

@fractalfantasy There are not many options - Intel RealSense, Orbbec (their new sensor looks promising), Apple's iPad-Pro/iPhone-Pro (with some limitations). Kinect-v2 is still the best one out there. Unfortunately, MS are famous for ruining or canceling their best products, as a result of not listening. Deja vu.

@L4Z3RC47

I would be surprised if the Azure Kinect project was cancelled. MS has been active on responding to many issues as recently as a day or two ago. Though the silence on issue #1125 is frustrating...

@rfilkov
Author

rfilkov commented Jan 13, 2021

@L4Z3RC47 Yes, they provide some minimum of customer support. But look at the progress of development here: https://feedback.azure.com/forums/920053-azure-kinect-dk It hasn't moved an inch since one year ago.

@vpenades

@rfilkov It's not only the feedback page that hasn't moved. There hasn't been any commit to the main repository since 1 July 2020. And I'd be surprised if they consider the driver side "completed".

@gradientLord

@L4Z3RC47 They answer these forum posts so that senior management thinks they are being active, and then go off and play video games.

@vpenades

I don't think that's the case... nevertheless, if there have been delays in delivery, at least they could be mitigated with better communication.

@fractalfantasy

fractalfantasy commented Jan 13, 2021

@rfilkov wow this new orbbec sensor looks like an Azure Kinect, wonder how it will compare resolution-wise


@andrey-tsb

> @vpenades may I ask if you found any alternate solutions? I was hoping that maybe Nuitrack would support the Azure Kinect, but doesn't seem to be the case.

If you are interested, Nuitrack currently supports Kinect Azure and provides fast CPU-only skeletal tracking with it.

@vpenades

vpenades commented Aug 3, 2021

> If you are interested, Nuitrack currently supports Kinect Azure and provides fast CPU-only skeletal tracking with it.

Thanks for the tip. I'm aware of nuitrack and I've been following their progress for a while now.... unfortunately, our use case prevents us from using NuiTrack for a number of reasons.

Long story short: our customers demand solutions that are low end, easy to use, easy to install, headache-free, not power hungry, usable without internet, and easy to purchase. The Kinect2 met almost all these requirements. Everything that came afterwards, for one reason or another, falls short of these requirements.

Fortunately for us, we're progressing on solutions that only require a webcam, it's not as good as the Kinect2, but we've managed to fit our requirements with it.
