
I wonder how text rendering performs. If it's good, a Hololens 2 IDE would be an improvement over a laptop for working while traveling.



Text readability in specific contexts was passable on the original, so I think this should be pretty good! The problems are going to be more that it’s not quite a desktop window manager yet; it’s not designed to be a screen replacement. That is coming in the next couple of years, but the bumps you’ll hit in making it your environment involve implementing and establishing paradigms that don’t exist yet, or that are maybe available only as a collection of weird, bit-rotting tech demos.

That said, you could probably wire it up to a terminal and split out movable windows with a tmux wrapper. Impress your friends and strangers while you hack the Gibson, then take it off and use your laptop screen because you’re frustrated.


It always surprises me how much these technologies try to run before they can walk. Giving people a great desktop experience equivalent to multiple high-end monitors just seems like the obvious first baby step. Everything else seems like fluff if they can't get that to be desirable.


You would think that, but as someone who has been in the Natural User Interface R&D field for a decade, I’ll tell you that this is what walking looks like. Our tools as of this year are just on the edge of being able to deliver what a consumer expects. Adding a 3rd dimension makes everything harder, and none of the existing assets, let alone the functional principles, from the 2D world translate.

Getting to what you just called a baby step is an act of barn raising by thousands of dedicated professionals, tinkerers, artists, scientists, and large corporations. If we could have willed the “baby step” into being before now, we would have. Actually, many people have, but then they quickly notice what’s lacking and get back to raising the barn.

Here are a couple of projects I pulled with a quick Google search. I have a Bitbucket wiki with dozens of similar projects, some in AR, VR, projection mapped, single-user parallax, with gesture tracking, with custom controllers, etc. This is definitely one of those problems where it's easy to imagine, so people imagine it's easy. Maybe we need you to help! Get a spatial display of some sort and let's get to work!

https://github.com/SimulaVR/Simula

https://github.com/letoram/safespaces


Yes, but I'm saying _don't_ add a 3rd dimension in any but the most basic sense. Just give me what is effectively a very large 2D workspace, as if I were staring at one or more nice big monitors. No fancy metaphors, no gestures - with Hololens I can still touch type and use my existing mouse.

I am a curmudgeonly luddite, obviously, but I would pay thousands of dollars just to have a multi monitor setup that required zero effort and worked on the move (to the extent that I'd be willing to go out with the equivalent of a Segway attached to my head).

If _that's_ still impossible (i.e. it's impossible to accurately and quickly orient the device in space, even in the controlled environment of my desk, or it's just not possible to display text that's nice to read on these devices) then I don't see the point of even attempting the more esoteric stuff as anything but pure research. Clearly several multi-billion dollar companies disagree, so I'm happy if all this comes to pass either way.


Making a display with that high a resolution, one that stays in the same place with normal head movements and has a 60 fps refresh rate and something like ClearType, is far, far more difficult than what is being done now. This IS “obvious first baby steps”.


A great desktop experience is hard. VR lens blur. AR display dimness and fov. Cost of resolution. Fixed focus depth. Heavy thing on your head. Tethered.

A nice VR gaming experience is hard. Immersive means you can't see the real world, so your balance rides on rendering, requiring high fps, low latency, and thus lots of GPU. That means a high bar for avoiding "immersion breaking" hardware and software visual oddities.

The perceived near-term market for VR gaming is larger than that of VR or AR desktops, so that set of hard has gotten investment, and the other, not so much.

Market structure, especially patents, discourages commercial exploration of smaller markets. Say you want to build and sell your great system, and that it happens to need eye tracking. One eye tracking provider sells very expensive systems to industry and government, and your volumes are far too small to interest them in making anything cheaper. Its competitors have been bought by bigcos and no longer offer you product. Patents, and your small market, block new competitors. So eye tracking is not available to you or your envisioned market.

Community communication infrastructure is poor. Suppose all the pieces exist somewhere: people interested in desktop experiences, willing to throw a few thousand dollars at panels, tethered GPUs, eye tracking, and so on, willing to be uncomfortable, to look weird, to tolerate a regression in display quality; a 2K HMD with nice lenses; a similar 4K panel; the electronics to swap the panels; some tracking solution; and so on. Even if all those pieces exist, the communication infrastructure doesn't exist to pull them all together, neither as forums nor as markets.

The poor communication degrades professional understanding of the problem space. People are unclear on the dependency chains behind conventional wisdom. So high resolution is said to require eye tracking or high-end GPUs. VR is said to require high frame rates. And the implicit assumption of immersive gaming is forgotten. I've found it ironic to read such claims while in VR, running on an old laptop's integrated graphics, at 30 fps, wishing someone offered a 4K panel conversion kit so a few more pixels escaped the blurry lenses - a panel I could still run on integrated graphics, albeit newer. So part of the failure to pull together opportunities is failing to even see them.

Perhaps in an alternate universe, all patents are FRAND, all markets have an innovative churn of small businesses, and online forums implement the many things we know how to do better but don't. And there's been screen-comparable VR for years. But that's not where we're at.


That alternate universe really did almost exist. You can still see the remnants over at http://www.nuigroup.com/go/lite.

10 years ago there was a booming scene of open source natural user interface projects. People were building huge multitouch interfaces, experimenting, releasing open SLAM tools, and exploring what you could do with DIY AR/VR/projection mapping/natural feature tracking/gesture interfaces. After the success of the first Oculus dev release, all of the forums went quiet, the git repos started going unmaintained or getting outright scrubbed, the main contributors to the community got scooped up by the motherships, and the supporting technologies got locked down, all in what felt like about six months to a year. Leap Motion was a standout company from that time. They had been selling the exact product they built then, with almost no improvement, until recently. Somehow they weathered the storm, didn't sell, and are doing some really neat stuff now. Structure IO took up the stewardship of OpenNI, and if you look hard enough you can still find cross-platform installers that include the banned original Kinect tech that Apple bought and is extremely litigious about keeping off the internet.


Sigh. Yeah. Last year I was kludging together some optical hand-and-stylus tracking above a laptop, the keyboard as a multitouch surface, and a head-tracked screen-and-HMD 3D desktop, a sort of next-gen laptop for my own use... but didn't find a community left to motivate sharing or demos. :/ Thanks for your comment.


A possible baby step is a household system of stationary cameras plus a mobile Hololens. The stationary cameras observe and attempt to classify objects you hold and put away; you can also teach it objects you name. Kind of like a visual Mycroft/Alexa/Google Assistant/Siri. The Hololens guides you to the objects when you can't recall where you left them.

The elderly, the memory-impaired (dementia, Alzheimer's, etc.), and households with children would find immediate use for the assistance, even with not-great accuracy: any kind of reminder that scores greater than zero success at finding misplaced objects would be more welcome than the alternative of zero effective recall.

If only the battery life were better, small-business warehouses would find immediate use for these as merchandise locators, as a lot of them have very haphazard methods of looking up merchandise locations.


That is doable right now. I am only sort of being glib here. I just want to outline what's involved in what people think is a simple task in a spatial system.

TLDR: As of right now, your wildest dreams are pretty possible. In the next 2 years, how we compute is going to get strange. Nothing is nearly as simple in an XR environment as in a traditional computing one, and currently most people don't really want what they'd ask for. Building a simple (useful) XR app makes launching a web product look like assembling a Lego set next to landing on the moon.

---------

ImageNet[0] works just fine for the object identification. You would probably want to use RGB-D sensors like the Kinect[1] or Intel RealSense[2] instead of regular cameras, but tracking like what the Vive[3] uses could also work. The thing you just proposed would involve a network of server processes handling the spatial data and feeding extracted, relevant contextual information to a wireless headset at a pretty crazy rate. Just to give you an idea, a SLAM[4] point cloud from a stereo pair of cameras, or a cloud from a Kinect 2 or RealSense, produces a stream of data that is about 200 MB a second. Google Draco[5] can compress that and help you stream the data at 1/8 the size without any tuning.
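To make that bandwidth claim concrete, here is a rough back-of-envelope sketch in Python. The frame size, frame rate, and per-point byte count are illustrative assumptions rather than measurements, but they land in the same ballpark as the ~200 MB/s figure above.

    # Rough, illustrative estimate of the raw point-cloud data rate from a
    # Kinect-2-class RGB-D sensor. All numbers are assumptions made for the
    # sake of the back-of-envelope math, not measured values.
    DEPTH_W, DEPTH_H = 512, 424          # Kinect 2 depth resolution
    FPS = 30                             # typical depth stream rate
    BYTES_PER_POINT = 3 * 4 + 3 * 4      # xyz as float32 + rgb as float32

    points_per_frame = DEPTH_W * DEPTH_H                       # ~217k points
    bytes_per_second = points_per_frame * BYTES_PER_POINT * FPS

    print(f"raw:              {bytes_per_second / 1e6:.0f} MB/s")      # ~156 MB/s
    print(f"draco-compressed: {bytes_per_second / 8 / 1e6:.0f} MB/s")  # ~20 MB/s at ~8x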

Extracting skeletal information from that is really something that only Microsoft has reliably managed to deliver, and it's at the core of the Kinect/Hololens products. NuiTrack[6] is the next best thing, but registering a human involves a T-pose and gets tricky. You could definitely roll something specific to the application; maybe just put a fiducial marker[7] or two on a person and extrapolate a skeleton from knowing where the marker sits on their shirt. You will also want to stream the RGB-D, IMU, hand, and skeletal tracking from the headset back to the server. This could help inform and reduce the tracking requirements of the surrounding sensors.
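For the fiducial-marker route, here is a minimal sketch of detecting markers in a camera frame with OpenCV's ArUco module. The camera index and dictionary choice are arbitrary assumptions, and the aruco API has shifted a bit across OpenCV versions, so treat this as illustrative rather than exact.

    # Minimal fiducial-marker detection sketch (requires opencv-contrib-python).
    # The aruco API differs slightly across OpenCV versions; this follows the
    # classic module-level functions.
    import cv2

    aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    cap = cv2.VideoCapture(0)   # any RGB camera; an RGB-D sensor's color stream works too

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        corners, ids, _ = cv2.aruco.detectMarkers(gray, aruco_dict)
        if ids is not None:
            # With a known marker size and camera intrinsics, the corners can be
            # turned into a 3D pose, which a rough skeleton guess can hang off of.
            cv2.aruco.drawDetectedMarkers(frame, corners, ids)
        cv2.imshow("markers", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()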

Out of the box, you'd probably need a base i7, 64 GB+ of RAM, and a couple of GTX 1080s to power 4 sensors in one room. The task of syncing the cameras and orienting[8] them would be something you'd have to solve independently. After having all of that, you would have an amazing lab to reduce the problem further and maybe require less bandwidth, but very probably, to get where you're going, you'd need to scale that up by 2x for dev headroom and maybe run some sort of cluster operations[9] for management of your GPU processes and pipeline. Keeping everything applicable in memory for transport and processing would be desirable, so you'd want to look at something like Apache Arrow[10]. At this point you are on the edge of what is possible at even the best labs at Google, Microsoft, or Apple. The Arrow people will gladly welcome you as a contributor! Hope you like data science and vector math, because that's where you live now.
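To give a feel for the Arrow piece, here is a minimal sketch of packing one frame's worth of points into Arrow's IPC stream format so another process can read it with minimal copying. The column layout and sizes are made-up assumptions for illustration; a real pipeline would match the actual sensor output.

    # Minimal sketch of shuttling point-cloud data between processes with
    # Apache Arrow's IPC stream format.
    import numpy as np
    import pyarrow as pa

    n = 217_088    # roughly one Kinect-2 depth frame of points (assumed)
    batch = pa.record_batch(
        [
            pa.array(np.random.rand(n).astype(np.float32)),          # x
            pa.array(np.random.rand(n).astype(np.float32)),          # y
            pa.array(np.random.rand(n).astype(np.float32)),          # z
            pa.array(np.random.randint(0, 256, n, dtype=np.uint8)),  # r
            pa.array(np.random.randint(0, 256, n, dtype=np.uint8)),  # g
            pa.array(np.random.randint(0, 256, n, dtype=np.uint8)),  # b
        ],
        names=["x", "y", "z", "r", "g", "b"],
    )

    # Serialize to the IPC stream format, i.e. what you'd push over a socket
    # or drop into shared memory for the consumer process.
    sink = pa.BufferOutputStream()
    with pa.ipc.new_stream(sink, batch.schema) as writer:
        writer.write_batch(batch)
    buf = sink.getvalue()

    # The consumer side reads it back without reparsing the columns.
    reader = pa.ipc.open_stream(buf)
    for received in reader:
        print(received.num_rows, "points received")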

After getting all of this orchestrated, you now have to stream an appropriate networked "game" environment[11] to your application client on the Hololens, but congrats! You made a baby step! Battery life is still an issue, but Disney Research has demonstrated full-room wireless power[12].

Now all you have to do is figure out all the UI/UX, gesture control, text entry/speech recognition, art assets, textures, models, particle effects, post processing pipelines, spatial audio systems[13], internal APIs, cloud service APIs, and application build pipeline. The Unity asset store has a ton of that stuff, so you don't have to get in the weeds making it yourself but you will probably have to do a big lift on getting your XR/Holodeck cluster processing pipeline to produce the things you want as instantiated game objects.

Once that's done, you literally have a reality warping metaverse overlay platform to help people find their car keys.

What's crazy is that you can probably have all of it for under $15,000 in gear. Getting it to work right is where the prizes are currently living and they are huge prizes.

[0] https://en.wikipedia.org/wiki/ImageNet

[1] https://azure.microsoft.com/en-us/services/kinect-dk/

[2] https://github.com/IntelRealSense/librealsense

[3] https://www.vive.com/us/vive-tracker/

[4] http://webdiis.unizar.es/~jcivera/papers/pire_etal_iros15.pd...

[5] https://github.com/google/draco

[6] https://nuitrack.com/

[7] https://www.youtube.com/watch?v=JzlsvFN_5HI (markers are on the boxes not the robot)

[8] https://realsense.intel.com/wp-content/uploads/sites/63/Mult...

[9] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus...

[10] https://arrow.apache.org/

[11] https://docs.unity3d.com/Manual/UNetClientServer.html

[12] https://www.youtube.com/watch?v=gn7T599QaN8

[13] https://en.wikipedia.org/wiki/Ambisonics


This sounds like something a warehouse could afford to buy, but for most people, organizing their stuff properly is going to be the winner (even at a high salary like $300k a year, $15k is still 100+ hours of your time).


Yeah, it's not a consumer product yet. Did you miss that it's a lab to build it, and that making the most "basic" solution requires a functional holodeck costing the price of a used sedan? That's pretty bonkers.


Yes, the same goes for using the walls in the Mixed Reality house as windows. It's readable and workable, but three real-world HD monitors are 'better' for getting work done.


Speaking from my personal experience with the V1 and other devices using waveguide displays, I felt it is not the best type of display for "heavy work" such as reading or coding. It is quite annoying to have everything look ghostly, even if the tracking is stable and the resolution is good.


I've never tried any of these devices, but the one thing that comes to mind is that I wonder if you might get eyestrain trying to use it for 8 hours a day. When working on my laptop, I'm constantly switching focus: looking out the window, then back to the laptop, then at the wall, etc. I think that kind of thing might end up being more difficult.


You can still switch your focus with these because you still see the real world. However, it is true that the objects are rendered ("OpenGL-like") the same way no matter where they are located in the scene, so while it gives you the feeling that you can focus your eyes on an object (because of its 3D position and the tracking), the objects always appear "clear," which can be annoying. That said, using them for 8 hours must be hard: the virtual objects emit unreal lighting (different from, and stronger than, the light reflected by real surfaces).



