The image quality (depth and RGB) is staggering compared to everything else I've used (ZED, RealSense, Kinect v2, and a couple of others).
They can be chained together for near microsecond timing accuracy.
The mic array is awesome.
The IMU is awesome.
But right now I've set them all back in their boxes.
Why? Because without being a vision specialist, there's nothing I can do with these devices.
The SDK and sample code are so incredibly bare bones, it's almost laughable.
There's no way to make use of those mics for anything. It's literally not in the SDK.
There's no way to make use of multiple devices in any practical manner. No point cloud merging, no calibration or shared space alignment.
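For anyone wondering what "shared space alignment" would even look like, here's a rough sketch of the missing piece (my own hack, nothing from the SDK), assuming you've already calibrated the 4x4 rigid transform between the two devices yourself:

    # Sketch only: merge point clouds from two devices into one shared frame.
    # Assumes T_a_from_b (4x4), mapping device B's frame into device A's frame,
    # has been calibrated by hand (e.g. with a checkerboard); the SDK gives you
    # no help with that part.
    import numpy as np

    def merge_clouds(points_a, points_b, T_a_from_b):
        """points_a, points_b: Nx3 arrays in each device's own frame (metres)."""
        homo_b = np.hstack([points_b, np.ones((points_b.shape[0], 1))])  # Nx4
        b_in_a = (T_a_from_b @ homo_b.T).T[:, :3]   # express B's points in A's frame
        return np.vstack([points_a, b_in_a])        # one merged cloud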
Then there's the problem that buried deep in the SDK is a binary blob: the depth engine. No source, no docs, just a black box.
Also, these cameras require a BIG GPU; seemingly nothing is happening onboard. And you're limited to at best 2 Kinects per USB 3 controller.
All that said, I'm still a very happy early adopter and will continue checking in every month or two to see if they've filled in enough critical gaps for me to build on top of.
If any devs in Seattle want to collaborate (or know computer vision well enough to fill in some of these gaps for the OSS community) let me know :)
> The SDK and sample code are so incredibly bare bones, it's almost laughable.
This is why I held off ordering a few. I took an hour looking through the docs and concluded the overall offering isn't fully baked yet. The hardware looks incredible, but the software looks anemic.
One of the key features for me is the hardware chaining to make realtime point cloud merging easier to resolve. All of the scenarios I care about are realtime rather than post-processing reconstruction.
It's just a matter of following the money. A bigger and bigger slice of Microsoft's income is coming from Azure, and an increasing proportion of Azure users are running Linux. This gives them strong incentive to be the business leaders in open source software.
The other increasing slice is services, where they get you to buy stuff from their storefront, subscribe to Game Pass or Office 365, or interact personally with Bing and Windows (so they can sell targeted ads). This naturally gives Microsoft the incentive to know everything about you.
Following the money is exactly why I am skeptical. As Linux becomes a revenue stream for Microsoft it will also become a Microsoft product. From the company that was built on Embrace, Extend, Extinguish I think that is cause for concern.
This is really cool! Back in college I used the Kinect dev kit to build proof-of-concept special effects for live theatre as an independent study project. I pointed the Kinect at the face of an actor off stage, ran the resulting 3D data points through some cool algorithms, and then projected the result onto a screen on stage. The idea was that, for example, in Hamlet instead of having an actor in makeup play the ghost of King Hamlet, you could have this larger-than-life projection on stage.
This camera is way better quality, so it'll be neat to see the sorts of projects that can be done now.
How does it compare to the iPhone sensors used for Face ID? I'm wondering if, mounted to a workstation, it could be used to implement a Face ID system under Linux.
I was just checking the specs on the two devices and it seems the new Kinect's depth sensor has even higher resolution, so it looks like a go in terms of raw input data.
So my next question would be: why would it be significantly harder than regular facial recognition approaches as found in, say, OpenCV? Naively one would think more data makes it easier, not harder (neglecting hardware requirements/performance; just from an accuracy perspective, with a trivial refactor of current facial identification algorithms).
I'm not talking about identifying people who are moving or far away, just someone looking straight at the camera from a fairly close distance.
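To make the question concrete, here's the sort of thing I'd naively try (purely a sketch, not any official sample; it assumes the RGB frame and a pixel-aligned depth frame in millimetres have already been captured by whatever driver/wrapper you use): run a stock OpenCV face detector on the RGB image and use the depth patch as a cheap range/flatness gate so a printed photo doesn't pass.

    # Hypothetical sketch: RGB face detection gated by depth.
    # `rgb` is HxWx3 uint8 (BGR), `depth_mm` is HxW uint16 in millimetres,
    # assumed already aligned to the colour camera.
    import cv2
    import numpy as np

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def find_live_faces(rgb, depth_mm, max_range_mm=1200, min_relief_mm=15):
        gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        live = []
        for (x, y, w, h) in faces:
            patch = depth_mm[y:y + h, x:x + w].astype(np.float32)
            patch = patch[patch > 0]                      # drop invalid depth pixels
            if patch.size == 0:
                continue
            close_enough = np.median(patch) < max_range_mm
            # a photo held up to the camera is nearly flat; a real face has relief
            has_relief = (np.percentile(patch, 90) - np.percentile(patch, 10)) > min_relief_mm
            if close_enough and has_relief:
                live.append((x, y, w, h))
        return live

The depth gate is only a crude liveness check; actually deciding "this is the enrolled user" would still need a recognition model on top.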
From what I understand, this is nice if you are trying to use multiple devices together. The Kinect's API makes that very easy with Azure (read: multiple cameras scanning a single location in real time, via a robot).
If you just need the same sensors for depth, but significantly cheaper, then look at Occipital.
Thinking about using vision technology for small-scale aquaculture stock management: sorting the fish by weight while eliminating the scale. A conventional camera would probably suffice with a little ML training. Length does not correlate well with weight in the fish species we are using, but a 2D picture of the fish body might do it. A 3D picture with the new Kinect would do the job for sure.
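For the 2D route, something like this back-of-the-envelope sketch is what I have in mind (purely illustrative; it assumes a fixed camera height over a light-coloured tray so pixel area maps consistently to real area, plus a small sample of fish weighed once on a real scale to fit against):

    # Illustrative only: regress weight against silhouette area from a top-down photo.
    import cv2
    import numpy as np

    def silhouette_area_px(bgr_image):
        """Pixel area of the largest dark blob (the fish) on a light tray."""
        gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return max((cv2.contourArea(c) for c in contours), default=0.0)

    def fit_area_to_weight(areas_px, weights_g):
        """Least-squares weight ~ a*area + b from the weighed calibration sample."""
        a, b = np.polyfit(areas_px, weights_g, deg=1)
        return a, b

    def estimate_weight_g(bgr_image, a, b):
        return a * silhouette_area_px(bgr_image) + b

If area alone isn't enough, the same idea extends to a few more shape features (length, width, convexity) and a slightly fancier regressor.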
Classic Kinect sensors are used extensively for gesture detection in the niche world of digital interactive exhibitions. You see them a lot in things like science museums or popup events or digital art gallery kinda places.
>It’s meant to give developers a platform to experiment with AI tools and plug into Azure’s ecosystem of machine learning services (though using Azure is not mandatory).
Agreed. Doubt the SDK will be OS/Linux friendly either. Previous Linux Kinect drivers (libfreenect) were all community driven; they worked, but weren't as polished as the official stuff.