So many likes, was not expecting that! I will be presenting this work tomorrow at MICCAI and then I will post my presentation link in the README of the repository!
Thank you for your work! Looking forward to the presentation.
I had to look up MICCAI. To others: 23rd INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING & COMPUTER ASSISTED INTERVENTION
(4-8 OCTOBER 2020)
https://www.miccai2020.org/en
Love it!
Should maybe add a link in addition to the video "image". It was not intuitive whether it is an image from a video or a link to a video (I am not used to video "previews" on GitHub).
Very, very cool work, I fully expect to use this exact project in the future!
To those of you who always thought this stuff looked neat but never tried it out, and to those who may have used Blender in the past and gave up, I would HIGHLY encourage you to try again with the latest version of the software. Although there is still a bit of a learning curve, massive improvements have been made to the software suite.
The ability to code audio/visual work in Blender is just incredible. I describe it like this: imagine yourself as someone trying to code an image that looks like a tree in machine code. Then imagine your partner comes over, sees what you are working on, and hands you Python and a fully set up IDE. It's like being given literal magic.
I downloaded the latest version earlier this year to write some basic AI simulations (cube wars!) and to create models for my 3D printer. As when I first learned to code, the fun of the machine totally sucked me in and I got completely off-task from my original goal. Lately I've been working on two things with the same lines of code: music videos and simulated walks through forests (think the movie Avatar). With only a few hundred lines of code I am able to generate infinite forest trails that you can walk (or fly a drone-style camera) through, synced to music that is generated by the AI mushrooms WITHIN the scene itself! I was literally able to go from zero to highly visually engaging, trippy music videos in the last year with minimal music production experience and no music-video production background. The ease with which you are able to generate things via code is stunning, and the limits feel completely boundless.
Very cool! I've done something similar for improving an OCR system on crinkled paper[0]. Blender is a powerful and totally underutilized tool for this kind of work.
Wow, this is awesome! And it's very nicely presented on the website. I'm wondering how you mapped from the UV to the 3D model. I would like to add that feature to the addon.
TLDR: using a KD-tree, I find the face containing the UV coordinate. Then I transform the UV coordinate to barycentric coordinates within that containing face, then put that barycentric coordinate through the local -> world -> view -> perspective transform matrices
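For anyone curious, here is a minimal sketch of the first steps (the KD-tree lookup and the barycentric conversion) using Blender's Python API. The function names and the number of candidate faces checked are my own choices, and it assumes a triangulated mesh with a single active UV layer; the actual addon code may differ:

    import mathutils

    def build_uv_kdtree(mesh, uv_layer):
        # One entry per triangle: the centroid of its UV coordinates.
        kd = mathutils.kdtree.KDTree(len(mesh.polygons))
        for poly in mesh.polygons:
            center = mathutils.Vector((0.0, 0.0))
            for li in poly.loop_indices:
                center += uv_layer[li].uv
            center /= poly.loop_total
            kd.insert(center.to_3d(), poly.index)  # KDTree stores 3D points
        kd.balance()
        return kd

    def barycentric_2d(p, a, b, c):
        # Barycentric coordinates of 2D point p in triangle (a, b, c).
        v0, v1, v2 = b - a, c - a, p - a
        d00, d01, d11 = v0.dot(v0), v0.dot(v1), v1.dot(v1)
        d20, d21 = v2.dot(v0), v2.dot(v1)
        denom = d00 * d11 - d01 * d01
        v = (d11 * d20 - d01 * d21) / denom
        w = (d00 * d21 - d01 * d20) / denom
        return (1.0 - v - w, v, w)

    def uv_to_world(obj, uv):
        # World-space point on the mesh surface at UV coordinate `uv`.
        mesh = obj.data
        uv_layer = mesh.uv_layers.active.data
        kd = build_uv_kdtree(mesh, uv_layer)
        p = mathutils.Vector(uv)
        # Test the nearest candidate faces for actual containment.
        for _co, poly_index, _dist in kd.find_n(p.to_3d(), 8):
            poly = mesh.polygons[poly_index]
            a, b, c = (uv_layer[li].uv for li in poly.loop_indices)
            u, v, w = barycentric_2d(p, a, b, c)
            if min(u, v, w) >= 0.0:  # uv lies inside this triangle
                va, vb, vc = (mesh.vertices[mesh.loops[li].vertex_index].co
                              for li in poly.loop_indices)
                return obj.matrix_world @ (u * va + v * vb + w * vc)
        return None

Projecting the resulting world point into the camera is then a separate step (e.g. bpy_extras.object_utils.world_to_camera_view).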
A common approach in rendering engines to convert screen space coordinates to objects is to render a second image with light and shadow disabled where the color uniquely maps to an id. You then can uniquely identify 24 bits worth of objects without needing to maintain a KD tree.
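To make the id-buffer idea concrete, here is a tiny sketch of the packing I mean; the function names are mine, and the decode assumes an 8-bit-per-channel render with anti-aliasing and color management disabled on the id pass:

    def id_to_color(obj_id):
        # Pack a 24-bit object id into an (r, g, b) triple in [0, 1].
        return (((obj_id >> 16) & 0xFF) / 255.0,
                ((obj_id >> 8) & 0xFF) / 255.0,
                (obj_id & 0xFF) / 255.0)

    def color_to_id(r, g, b):
        # Invert id_to_color for a pixel sampled from the id render.
        return (round(r * 255) << 16) | (round(g * 255) << 8) | round(b * 255)

Each object gets a flat, unlit material with its id color; sampling the second render at a screen coordinate then gives you the object directly, with no spatial data structure to maintain.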
Ok, this turned out to be far more interesting than the title here suggests. The little abstract at the top is far more informative:
> A Blender user-interface to generate synthetic ground truth data (benchmarks) for Computer Vision applications.
And it lets you make stereo images, depth maps, segmentation masks, surface normals, and optical flow data from the rendered animation, and export it all in NumPy's .npz format. Quite an interesting project.
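For anyone who hasn't used the format: .npz is just a zip archive of NumPy arrays, so consuming the export is a few lines. A quick sketch, with hypothetical key names like "depth" and "segmentation" (the actual keys depend on the addon's export settings):

    import numpy as np

    data = np.load("ground_truth.npz")
    print(data.files)             # list the arrays stored in the archive
    depth = data["depth"]         # e.g. a per-pixel depth map
    masks = data["segmentation"]  # e.g. per-pixel instance/class ids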
This is awesome! I've been working with Blender scripts a lot lately for my side project, where I generate jewelry for 3D printing (https://lulimjewelry.com).
It's an incredibly powerful tool, IMO one of the best large open source applications. I've picked up some good ideas from reading the plugin code here, thank you!
Very cool! Just the other day I was trying to set Blender's camera from a standard 3x4 computer vision KRT matrix, and it is surprisingly a pain in the ass. I wish more of these graphics/CAD packages (Blender, Houdini, Maya) made it easier to deal with vision data.
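For reference, this is roughly the conversion I mean, sketched with NumPy and Blender's Python API. It assumes the CV convention x_cam = R @ x_world + t with square pixels; the sign conventions are exactly the fiddly part, so verify against a known render before trusting it:

    import numpy as np
    from mathutils import Matrix

    def set_camera_from_krt(cam_obj, K, R, t, width, height):
        # CV cameras look down +Z (y down); Blender cameras look down -Z (y up).
        flip = np.diag([1.0, -1.0, -1.0])
        R_b = flip @ R                  # world -> Blender-camera rotation
        t_b = flip @ t
        m = np.eye(4)
        m[:3, :3] = R_b.T               # invert [R|t] to get camera -> world
        m[:3, 3] = -R_b.T @ t_b
        cam_obj.matrix_world = Matrix(m.tolist())

        # Intrinsics: focal length and principal point, horizontal sensor fit.
        cam = cam_obj.data
        fx, cx, cy = K[0, 0], K[0, 2], K[1, 2]
        cam.sensor_fit = 'HORIZONTAL'
        cam.lens = fx * cam.sensor_width / width   # pixels -> millimeters
        cam.shift_x = (width / 2.0 - cx) / width
        cam.shift_y = (cy - height / 2.0) / width  # shift_y shares the x scale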
Excellent! For a while my job entailed this very thing: creating synthetic data for computer vision, and I used Blender as well! You've done a great job.
Ground truth in this case means images with labels of some sort on parts of the image or objects in the image, not that they're real images themselves. So a cat in a photo with a label or mask on the cat would be "ground truth" on what part of the image a cat is in, or that there is a cat.
Haha, just pointing out how the term is used in the industry. If you want to really play with semantics then I would argue that even a photo of a real cat isn’t a real cat and thus not ground truth. But while the consequences of that might have the awesome effect of clearing the shelters, it’s not practical, so we just call any labeled pixels representing a cat to a fidelity good enough for our purposes ground truth. ;)
Newb here. I can't say what this does, other than build a frame of reference for AI comparison? Blender doesn't need eyes anyway? Buuuut it doesn't have them either in this sense?