I was thinking about the implications of using this technique to analyze, e.g., political speeches and try to catch people lying in the act. Your application (winning at card games) seems very interesting too.
Now, with Google Glass getting closer to being a real product on the market, the possibilities are endless (for good and for bad). Unfortunately, my mind is kind of twisted and I think about the bad first. Must be a side effect of all the security issues I'm researching.
For instance:
# Google Glass + Eulerian magnification + facial expression recognition = instant "Lie to Me"-like[1] microexpression expert.
# Google Glass + Eulerian magnification + TSA agent: "picking" suspects by the way their pulse reacts as they get closer to the agent using the "apparatus". Of course, the real criminals would just take some kind of drug to avoid being detected...
[1] http://en.wikipedia.org/wiki/Lie_to_Me
Or have a camera with software like this and a low-intensity video projector aimed at their face.
The projector would beam the anti-phase of the signal that the camera detects.
Better yet: project a signal on the guy next to you in line that makes him a prime 'I am a terrorist' suspect.
Another bad one: when walking down the street, if your heart rate goes up when you see a mean-looking individual approach you, that individual could pick you out for a 'give me your wallet' treatment. I think that would decrease the criminal's risk of getting caught.
> could pick you out for a 'give me your wallet' treatment
Well, this is small-time stuff and I think it's unlikely that such thieves would start employing technology like this. The payoff for this type of robbery is just too small to draw in a lot of innovation.
Plus, I'm not sure heart rate is such a useful data point in deciding who to rob. My heart rate could go up because I'm scared, or because I'm amping myself up for a fight, or because I'm getting ready to draw my weapon in self-defense...
"The payoff for this type of robbery is just too small to draw in a lot of innovation"
Sure, nobody would do a startup with this market in mind, but if it were cheaply and reliably available, I bet people would start using it, just as people started using bolt cutters to steal bicycles. Also, if one had reliable tools to filter for low-risk victims, it would statistically become more profitable to rob people. If you can expect to take an average of $10 from ten people before being caught, fewer people would take the chance than if it were $10 from about 100 people (humans do not always make judgments that seem rational; for an example, see http://www.scientificamerican.com/article.cfm?id=stats-show-...).
If we're going to talk about using this to detect emotional reactions in individuals on the street, you could walk by a pretty lady (or gent, depending on who's reading this) and determine whether they show any signs of arousal/disgust when they see you.
The science of human emotion when paired with this technology would be particularly interesting on its own.
I agree with you; the results (not the theory) are similar to Horprasert's background subtraction paper. Both techniques have the same limitations: they work pixel by pixel and do not take the entire scene into account.
I was thinking about doing this. It's very doable; there's nothing really complicated going on here. I wanted to do it for mobile devices, but it's going to be pretty resource heavy and I couldn't work out how to do the necessary processing in OpenGL ES...
I bet the iPhone app will be great. A few months ago I had some weekend fun implementing this algorithm in a throw-away iPhone application, and indeed I was able to observe the color change in my skin, and when watching the veins of my arm the movement was greatly amplified.
I was missing a lot of the filtering required to amplify further, and performance (frame rate) was a bit poor since it was just a hack. Something like that done properly will be cool.
How are they able to change the video frames to show the motion amplification? In the last example, they amplify the baby's breathing and show it through changes in the video frame itself. How are they able to stretch the baby's clothes and body without knowing its 3D geometry? Even the wrinkles on the baby's clothes seem to shift. I am puzzled.
Human perception of 3D geometry comes from motion, and this is an algorithm that amplifies motion; that's really all that is needed. See "Structure From Motion" and "Optical Flow" for more information. Note that this technology is used in some video codecs. You may have noticed that in some corrupted videos the corruption artefacts can sometimes move as if they were mapped onto a detailed 3D model of the moving objects, but really all they are is 2D motion vectors moving blocks of color around along a predetermined optical flow path.
In this algorithm, exaggerating that same 2D optical flow path has the perceptual effect of somehow "knowing" something about the 3D geometry. You could more accurately say the original motion is constrained by the 3D geometry.
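For anyone who wants to poke at the "2D motion vectors" idea concretely, here is a rough sketch (not the authors' code, just OpenCV's off-the-shelf dense optical flow; the frame file names are placeholders):

    # Estimate per-pixel 2D motion vectors between two consecutive frames.
    import cv2

    prev = cv2.cvtColor(cv2.imread("frame0.png"), cv2.COLOR_BGR2GRAY)
    curr = cv2.cvtColor(cv2.imread("frame1.png"), cv2.COLOR_BGR2GRAY)

    # flow[y, x] = (dx, dy): how far each pixel appears to have moved.
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    print(flow.shape)  # (height, width, 2)

Scaling those vectors up and warping the frame along them is one crude way to exaggerate motion, which is roughly the perceptual effect described above.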
(more) Here is a Kanye West video that uses the optical flow / video corruption effect intentionally for artistic purposes.
Thanks for the detailed explanation. The Kanye video just looked like a corrupted low-res video to me; I didn't understand that one. But I get the idea: it's about extrapolation of moving pixels in 2D.
What I find neatest about this (if I recall the paper correctly) is that it's essentially the same concept as an unsharp mask filter, only taken over time instead of over space.
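A minimal sketch of that analogy, assuming the frames are stacked into a NumPy array of shape (num_frames, height, width):

    import numpy as np
    from scipy.ndimage import uniform_filter1d

    def temporal_unsharp(frames, alpha=10.0, window=15):
        # Low-pass each pixel's time series (moving average along the time
        # axis) - the temporal counterpart of the blur in a spatial unsharp mask.
        lowpass = uniform_filter1d(frames, size=window, axis=0)
        # Boost the fast temporal residual and add it back: small changes over
        # time get exaggerated, just as fine spatial detail does with an
        # ordinary unsharp mask.
        return frames + alpha * (frames - lowpass)

This ignores the spatial pyramid and the careful band selection the paper uses, but it captures the "sharpen over time" intuition.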
Computer science that can improve people's wellbeing -- in this case, medical diagnostics -- is particularly amazing. I can see this being used by emergency workers to take a pulse quickly without having to fumble with electrical leads. The algorithm is also very simple and elegant.
I too think this is a very exciting sub-field. EMTs and the like already have something like this, only better: It's called a pulse oximeter, and it determines both heart rate and blood oxygen saturation by measuring the absorption ratio of two wavelengths of light. Commercial devices clip on your finger and cost about $50 (the whole device is about the size of two fingers).
I've seen those. But having even a mediocre version of the same thing available as software on every smartphone will only improve what the casual user or even a first-aider is capable of.
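As a sense of how little is needed for the "pulse from a phone camera" version, here is a rough sketch (not the paper's method, and a real app would need skin detection and much better filtering): average the green channel of a skin patch in each frame, then look for the dominant frequency in a plausible heart-rate band.

    import numpy as np

    def estimate_bpm(green_means, fps):
        """green_means: 1D array, mean green value of a skin region per frame."""
        signal = green_means - np.mean(green_means)
        spectrum = np.abs(np.fft.rfft(signal))
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
        # Only consider 0.7-4 Hz, i.e. roughly 42-240 beats per minute.
        band = (freqs >= 0.7) & (freqs <= 4.0)
        peak = freqs[band][np.argmax(spectrum[band])]
        return peak * 60.0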
I wonder whether it would make sense to give people first aid instructions via an app? I guess when actually panicking, people would do better talking to an experienced person over the phone. At least for the time being.
My first thought when watching the segment where a clip from Batman is shown is, if this can be applied to movies, it may ruin some of the magic when the video picks up the microscopic motions of supposedly 'dead' characters (as the actors are still breathing and pulsing).
I think this is groundbreaking technology though -- I've read that there are subconscious responses to seeing things we like, such as delicious food or an attractive individual of the preferred gender: a widening of the pupils or an increase in heart rate and body temperature. Devices that capture these changes could have applications in everything from marketing to security to courting.
"My first thought when watching the segment where a clip from Batman is shown is, if this can be applied to movies, it may ruin some of the magic when the video picks up the microscopic motions of supposedly 'dead' characters (as the actors are still breathing and pulsing)."
Because you'd need a fancy algorithm to believe that actors in movies aren't really dead?
I was pointing out that, as cool as this is, you wouldn't realistically want to watch a movie you hadn't seen before with this kind of augmentation. Part of accepting a fictional story is suspending disbelief, and movies are designed to make this as easy for the viewer as possible. Seeing a dead character breathing would disrupt that suspension of disbelief. It was the first thought that crossed my mind, but I found it interesting enough to share.
Can someone explain how they are amplifying the video of the eye moving and the baby in the crib breathing? If they are only analyzing the changes in color, how are they amplifying movement of shapes with that information?
They are amplifying a vector of choice. That vector could be change in color or change in relative pixel positions (as in physical motion).
In other words, it's not color based.
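If I remember the paper's first-order argument correctly, the reason boosting per-pixel changes over time also magnifies motion is roughly this (1D case, small displacement d(t)):

    I(x, t) = f(x + d(t))                           (a pattern f translated by a tiny d(t))
    I(x, t) ≈ f(x) + d(t) * f'(x)                   (first-order Taylor expansion)
    amplify the temporal variation by alpha and add it back:
    f(x) + (1 + alpha) * d(t) * f'(x) ≈ f(x + (1 + alpha) * d(t))

So, to first order, exaggerating how each pixel's brightness changes over time is the same as exaggerating the displacement itself - no 3D model or explicit tracking required.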
I’ve been using it for a few months and it’s surprisingly accurate. It’s fascinating to see a video demonstration of what Cardio does with all its math and image processing, since the only feedback in the app is the silly phrases above the stethoscope. I’d love a switch to show the processed video instead of the heart rate!
Wow, I think this project is fascinating. Does anyone want to work with me for something like 2-3 days to release a web app that transposes their code in a more "webby" way?
(They have their own solution, http://videoscope.qrclab.com/ but, with all the respect these guys deserve, it's unfortunately far from perfect.)
I am based in Austin, my email is hartator_AT_gmail.com
Someone posted this same link yesterday, but I think I'm under some kind of ban that makes my upvotes not count, since the arrow goes away but the vote count doesn't change. Glad that someone tried again and made this reach the front page. I think it is very amazing and awesome; maybe animators will be able to use this tech to learn how to avoid the uncanny valley.
I find this fascinating, especially since the code is open source. Makes me wonder what kinds of applications can be built -- maybe even some sort of Leap Motion application in conjunction with the amplification algorithm.
I imagine that the only real problem is that it'll never be able to be 100% real time, as you need two colour samples to compare. I guess you could get pretty close though.
It is intrinsically delayed because it's based on the deltas from the previous state, and also they show a feedback loop in the filtering phase.
In one of the examples of the color amplification you can see the head moving with a little bit of lag between the two sides, but I can't tell whether they were intentionally synced or whether this was done truly in real time.
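For what it's worth, a streaming version can get close to real time with only a small, fixed lag. A toy per-pixel sketch of the idea (my guess at the structure, not their code): two recursive low-pass filters track each pixel at different speeds, their difference is a temporally band-passed signal, and that band gets amplified and added back. The running filter state is the "feedback loop" mentioned above, and it is also why the output always trails the input slightly.

    import numpy as np

    class StreamingMagnifier:
        def __init__(self, r_fast=0.4, r_slow=0.05, alpha=30.0):
            self.r_fast, self.r_slow, self.alpha = r_fast, r_slow, alpha
            self.fast = None  # quick-moving running average (per pixel)
            self.slow = None  # sluggish running average (per pixel)

        def process(self, frame):
            frame = frame.astype(np.float32)
            if self.fast is None:
                self.fast = frame.copy()
                self.slow = frame.copy()
            self.fast += self.r_fast * (frame - self.fast)
            self.slow += self.r_slow * (frame - self.slow)
            band = self.fast - self.slow          # temporal band-pass
            out = frame + self.alpha * band       # amplify and add back
            return np.clip(out, 0, 255).astype(np.uint8)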
Pulse, yes, as is demonstrated in the video. Blood pressure, in a calibrated sense, is a lot harder to measure without directly tapping into a vessel or putting a cuff on a patient, both of which are invasive and/or annoying over a long period of time or for chronic / long-term patients.
But a few derivative measures of circulation should be determinable. Both skin color and temperature relate to blood oxygen levels and circulatory efficacy. It should be possible to notice when a patient's circulation is no longer effective, such as in a low blood-pressure situation or with a heartbeat irregularity.
"The system works by homing in on specific pixels in a video over the course of time. Frame-by-frame, the program identifies minute changes in color and then amplifies them up to 100 times, turning, say, a subtle shift toward pink to a bright crimson"
So really the title should be "Scientists amplify motion in video" - because that is what is occurring. There is nothing "invisible" being discovered - it still needs a visual change.
Invisible here obviously means invisible to the human eye. This kind of deliberate point-missing is toxic to conversation, and it just makes you come across as childish.
You can argue that this was "imperceptible" rather than "invisible". But you could say the same thing about things we only see with the aid of telescopes and microscopes. "The amoebas weren't invisible; that light was already bouncing up to your eyes."
But really, invisible means "unable to be seen." Unless you can demonstrate that it's possible for a person to see these things unaided, they are invisible.