I love whenever it's possible to upsample/restore media due to known constraints in the original -- in this case, how screens work.
Something analogous I've been waiting for is regenerating old scratchy piano recordings. Piano is unexpectedly simple compared to other instruments -- the only inputs are really note down + speed, note up, sustain pedal pressure, and (less frequently) soft pedal pressure.
Seems like you should be able to turn any solo piano recording, no matter how degraded, into a relatively lossless MIDI representation, then re-record it replayed physically (via motors, which already exist) on a modern piano, or even just synthesized, staying as true to the original piano's characteristics as possible. Losing literally none of the artistry.
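To make that concrete, here's roughly what that event vocabulary looks like written out -- a minimal sketch using Python's mido library, where the notes, velocities, timings, and filename are all invented for illustration:

    import mido

    mid = mido.MidiFile()
    track = mido.MidiTrack()
    mid.tracks.append(track)

    # The whole performance reduces to a stream of events like these
    # (time is the delta in ticks since the previous event):
    track.append(mido.Message('note_on', note=60, velocity=80, time=0))            # note down + speed
    track.append(mido.Message('control_change', control=64, value=127, time=120))  # sustain pedal down
    track.append(mido.Message('note_off', note=60, velocity=0, time=240))          # note up
    track.append(mido.Message('control_change', control=64, value=0, time=480))    # sustain pedal up

    mid.save('restored_performance.mid')  # hypothetical output file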
It seems like this should be "easy" for piano in a way that it isn't, for example, with violin which has so many more complicated characteristics of pitch, timbre, bowing, vibrato, etc.
This has been done in various forms! One interesting one is shown in https://www.youtube.com/watch?v=6hv2zh_Z0Io
Sergei Rachmaninoff recorded a 78 album of several pieces simultaneously with a piano roll -- thus capturing both the 'keystrokes' and the audio of the master's intonation. The piano roll was converted for an automatic reproducing piano (a super-high-end player piano, a Bosendorfer 290SE) and massaged by an expert to sound almost exactly like the 78. Then it was re-recorded in modern fidelity, playing back from the 290SE.
It is exciting to see and hear a 290SE (re)play in person, but a little weird in a concert setting with no pianist to watch.
When you translate to MIDI, you are going to lose a lot of the subtle pitch and tonality of the piano. One string vibrating produces overtones, and a chord will resonate in a very deep way on a grand piano that I feel you cannot replicate in MIDI, so that detail would be lost.
For instance, how long you are touching the string: while you're touching the string there is a sound, but after you let off you get the "reverb", and the reverb differs depending on how you hit the key, whether you bounce, or whether you stay a split second longer for staccato. I don't feel these subtleties translate to MIDI.
It is certainly easier than violin, that I will grant.
edit: IMO the best way to do what you are describing is to get a really good pianist to sit down and do the work. I don't think that (current?) machine learning can really "understand" the nuance of phrasing, especially coming from older recordings.
> I don't feel like these subtleties translate to MIDI
But wouldn't they be reproduced when replaying the MIDI data physically on a piano?
Ultimately, isn't how you hit the key, and whether you bounce or stay, just initial velocity and then the timing of letting go? Perhaps the velocity of letting go would have to be added as well, but I'm not actually sure if that's really acoustically meaningful.
I guess I don't see why all the reverb and ultimate sound complexity wouldn't be recreated in playback? Of course, this requires actual physical playback on a similar enough model of piano, or else a synthesizer that is sufficiently accurate.
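For what it's worth, MIDI does reserve a byte for the "velocity of letting go": note_off messages carry a release velocity, though many keyboards just transmit 0 or 64. A tiny illustration with Python's mido (the values are invented):

    import mido

    quick_release = mido.Message('note_off', note=60, velocity=90, time=60)   # staccato-ish let-off
    slow_release  = mido.Message('note_off', note=60, velocity=20, time=400)  # gentler let-off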
Well, for one thing, the pedal is not binary but works in degrees, and there are three pedals, one of which, if depressed, will silence only some of the strings. I don't think MIDI itself is capable of that; some other format might be. There are a lot of factors, and pianos sound different from each other; I think that would be lost.
MIDI is definitely capable -- controller 64 is used for the sustain pedal, and controller 67 for the soft pedal, each carrying a byte value for how far the pedal is depressed.
The third (middle) pedal in pianos is nonstandard -- i.e. used for different effects on different pianos, whether sostenuto or bass damper or practice mute.
In actual performance the only time it's ever really used (and rarely at that) is as sostenuto, since that's what it does on grand pianos like Steinways. Its effect is indistinguishable from simply holding notes for longer durations, so MIDI can represent it that way (and MIDI in fact reserves controller 66 for sostenuto directly). Unlike the sustain pedal, which increases resonances in a big way and needs to be represented independently, or the soft pedal, which changes timbre as well as volume.
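A quick sketch of what that pedal data looks like on the wire (Python's mido again; the point is just that CC 64 and CC 67 take a full 0-127 range, so half-pedaling is representable, even if cheap synths treat anything above 63 as simply "on"):

    import mido

    half_pedal = mido.Message('control_change', control=64, value=64)   # sustain half-depressed
    full_pedal = mido.Message('control_change', control=64, value=127)  # sustain fully down
    una_corda  = mido.Message('control_change', control=67, value=127)  # soft pedal engaged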
A piano can be intentionally and/or unintentionally out of tune. Every piano has its own unique sound (due to manufacturer and also form factor), and it is played in a place where temperature can have another effect on the sound.
I kinda wonder how this would compare to an upscale and sharpen - a good amount of these screentone patterns are solid blacks on a consistent white or gray, which seems like it should work fairly well. Or maybe that'd round too much off - this is doing a pretty good job of keeping line-quality intact.
That said, this is an interesting technique, and looks pretty good in the end... but the minor misalignments / pattern-jitter in some areas would probably bug me more than the blurry image, tbh. Seems like that could be improved somehow though, maybe by modifying the pattern it decides on with something similar but not original-pixel-aligned?
---
edit: after writing the above and looking back at it a third or fourth time: I've changed my mind, the patterns this is producing will very likely look better than a sharpen when they're closer together or more heavily aliased. They're "plausible" and still look like patterns, sharpens have some terrible edge cases on stuff like the remote(?)'s frame. Maybe they just need some more examples / side-by-sides? I imagine more will be in the final paper, whenever that's linked.
While the aliased sample is sharper, I experience an unpleasant artifact in that version caused by the halftone dots lining up with the pixel grid, with an effect like the grid illusion: https://en.wikipedia.org/wiki/Grid_illusion?wprov=sfti1
The blurry character of the original is also unpleasant, but the aliased version is hard for me to look at. I'm interested to know if anyone else experiences this.
It's not even an illusion; it looks bad because they have a terrible tile for the dithering. Should be easy to fix in a postprocessing step after the AI.
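One plausible postprocessing pass along those lines (not from the article, just a sketch): detect flat gray regions and re-render them with a clean ordered-dither tile, e.g. a classic 4x4 Bayer matrix:

    import numpy as np

    # Classic 4x4 Bayer threshold matrix, normalized to [0, 1).
    BAYER_4 = (1.0 / 16.0) * np.array(
        [[ 0,  8,  2, 10],
         [12,  4, 14,  6],
         [ 3, 11,  1,  9],
         [15,  7, 13,  5]], dtype=np.float32)

    def ordered_dither(gray: np.ndarray) -> np.ndarray:
        """Map a float image in [0, 1] to a binary halftone via a tiled Bayer matrix."""
        h, w = gray.shape
        tiled = np.tile(BAYER_4, (h // 4 + 1, w // 4 + 1))[:h, :w]
        return (gray > tiled).astype(np.uint8) * 255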
Interesting. Over the past month I trained a simple 2X ESRGAN upscaling model for manga myself, so I did a quick comparison to see how it performs[1].
For ESRGAN I would say there is still a way to go before it's usable in realtime.
The shown image, at an input resolution of 309x237 pixels, takes about 7 seconds to process with my own model on a GTX 1080.
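In case anyone wants to reproduce that kind of timing test, the inference loop is roughly this. A hedged sketch: the checkpoint name and file paths are hypothetical, and it assumes the trained 2x generator was exported as a TorchScript module.

    import numpy as np
    import torch
    from PIL import Image

    model = torch.jit.load('esrgan_2x_manga.pt').eval().cuda()  # hypothetical checkpoint

    img = Image.open('input_309x237.png').convert('L')          # grayscale manga crop
    x = torch.from_numpy(np.asarray(img, dtype=np.float32) / 255.0)
    x = x.unsqueeze(0).unsqueeze(0).cuda()                      # shape (1, 1, H, W)

    with torch.no_grad():
        y = model(x).clamp(0.0, 1.0)                            # shape (1, 1, 2H, 2W)

    out = (y.squeeze().cpu().numpy() * 255.0).astype(np.uint8)
    Image.fromarray(out).save('output_618x474.png')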