Outputs are sampled from a probability distribution, and increasing the legibility effectively concentrates probability density around more likely outcomes. So you're correct that it's just altering variation. The general technique is referred to as adjusting the 'temperature' of the sampling distribution: higher legibility corresponds to a lower temperature.
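In code, the idea looks roughly like this. This is an illustrative sketch of temperature sampling, not the demo's actual implementation:

    // Scale logits by 1/temperature before softmax-sampling.
    // Lower temperature -> sharper distribution -> more predictable strokes.
    function sampleWithTemperature(logits, temperature) {
      const scaled = logits.map(l => l / temperature);
      const maxVal = Math.max(...scaled);               // for numerical stability
      const exps = scaled.map(s => Math.exp(s - maxVal));
      const total = exps.reduce((a, b) => a + b, 0);
      let r = Math.random() * total;
      for (let i = 0; i < exps.length; i++) {
        r -= exps[i];
        if (r <= 0) return i;                           // index of the sampled outcome
      }
      return exps.length - 1;
    }

As temperature approaches 0 this collapses onto the single most likely outcome; as it grows, the samples spread out and the handwriting gets messier.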
Hey HN — This is a simple web app that allows you to enter text and watch as an in-browser neural network generates the corresponding handwriting in real time. I’m happy to hear feedback or answer questions about how it works!
Very neat! Would you be willing to share more about how you're implementing the in-browser neural network? It doesn't look like WebAssembly, and I don't see you loading any external libraries, just downloading a binary with content-type "application/macbinary".
I implemented the network from scratch in vanilla JavaScript. The file is just a custom binary file containing the names, shapes, and weights of the network parameters; a loader for that kind of format looks roughly like the sketch below. If you have more specific questions, feel free to ask them.
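To be clear, the record layout here (length-prefixed name, rank, dims, then float32 weights) is a made-up example to illustrate the idea, not the actual file format:

    // Hypothetical record layout, repeated until end of buffer:
    // [nameLen: u32][name bytes][rank: u32][dims: u32 x rank][f32 x prod(dims)]
    async function loadWeights(url) {
      const buf = await (await fetch(url)).arrayBuffer();
      const view = new DataView(buf);
      const params = {};
      let off = 0;
      while (off < buf.byteLength) {
        const nameLen = view.getUint32(off, true); off += 4;
        const name = new TextDecoder().decode(new Uint8Array(buf, off, nameLen)); off += nameLen;
        const rank = view.getUint32(off, true); off += 4;
        const shape = [];
        for (let i = 0; i < rank; i++) { shape.push(view.getUint32(off, true)); off += 4; }
        const count = shape.reduce((a, b) => a * b, 1);
        // slice() copies into an aligned buffer so the Float32Array view is valid
        params[name] = { shape, data: new Float32Array(buf.slice(off, off + count * 4)) };
        off += count * 4;
      }
      return params;
    }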
The value is almost entirely as a technical demo of current machine learning techniques. Handwritten fonts will always be a more reliable way to create synthetic handwriting, so I don't see this as something that will be used as anything other than a toy.
Non-statistical methods of imitating handwriting are always fairly easy to identify as synthetic due to the uniformity of strokes and unnatural kerning (among other things). Even methods which incorporate random variations don't hide these artifacts sufficiently well. In my opinion, the only way to generate convincingly real synthetic handwriting is to use statistical methods (i.e. machine learning) to model all the variation present in real handwriting. I've implemented a neural network in JavaScript which does exactly this: https://seanvasquez.com/handwriting-generation/. You can play around with it and find a few weaknesses, but in general I find that its output is indistinguishable from real handwriting.
That's super-cool. Honestly that's what I was expecting the submission to be.
When I looked at the example image in the original submission, the same letters all look basically identical to me -- I can't figure out why it's JavaScript rather than just a font.
Yours actually looks like real handwriting. It's quite slow, though -- like the actual speed it takes to write. Was just wondering if you slowed it down intentionally for effect, or if that's just how long the NN takes?
Also, I'm not familiar with the space, but this feels like something you could commercialize as a plugin for Illustrator or something like that -- particularly if you had a range of sliders to really fine-tune the desired text properties (width, weight, weight variation, etc.).
It's not intentionally slowed down. The sampling process is computationally heavy and you'll notice improved performance when running it on better hardware. There are many algorithmic ways to speed up neural network computation and I'm fairly sure I could speed things up by an order of magnitude or two if I invested some time into it.
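For a taste of the kind of low-level win available (illustrative only, not necessarily what the demo does today): keeping weights in flat, preallocated Float32Arrays lets the JS engine emit tight numeric loops, versus chasing pointers through nested Arrays of Arrays.

    // Matrix-vector product over a flat, row-major Float32Array.
    function matVec(weights, x, out, rows, cols) {
      for (let r = 0; r < rows; r++) {
        const base = r * cols;
        let sum = 0;
        for (let c = 0; c < cols; c++) sum += weights[base + c] * x[c];
        out[r] = sum;   // write into a preallocated buffer, no per-call GC
      }
      return out;
    }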
The model was trained on a relatively small dataset which has few occurrences of the letter x, almost none of which are in cursive. As you've found, asking the model to write out an x in a cursive style can totally derail the generation process.
Any ELI5 for people who know close to nothing about NN, AI, machine learning etc?
Like, how does one letter it doesn't know how to write well make it skip/mess up the following ones too? I mean, "derail" seems a fitting description to what I'm seeing, but how?
The "handwriting" data for this model is basically the coordinates of a pen. The length of the string representation of the text is very different from the length of the coordinate representation of the text, therefore the model "learns" a window corresponding to when it is drawing the current letter, and when to start the next letter. For these letters, as the model doesn't learn how long this window should be, nor how to transition from it to the next letter, it gets stuck and outputs nonsense.
I got that too after playing around a bit and then trying that phrase. It gave me a really good laugh, as if the AI just went "ah shit, why does everyone enter this stupid phrase eventually?"
Wonderful! And as it generates single stroke SVG exports, I can output to a laser cutter and have it write for me :)
I can't think of a practical use for this, but that's not stopped me wasting the last 15 minutes getting my machine to write out expletives in neater cursive than I could manage...
- ed
Oh, and this is by far the best digital 'handwriting' I've ever seen, despite the occasional glitches, and believe me, I've been interested in this sort of thing for decades.
Having played with this a bit: if you can flesh it out, I am sure you can monetize this.
I've been dicking around with it for an hour or two and I'm getting results that, if I mingle outputs, are to my eye completely human in nature.
Honestly, as a sort-of ex-designer who still engages with that sort of community, I see a tool like this that makes neat scripty vector paths (and I think that is particularly important) as being super-useful. It outputs unique forms that are individual in nature. I'm thinking of all the graphic designers who make stuff for weddings and invites and whatnot.
Needs work, and I have absolutely no concept of how complex it is to refine things, but what you have here is a helluva start to something I think could be immensely useful to some people.
- minor eds
- ed ed
Can you manage the stroke widths through the cursive forms, thinner when drawn quicker and straighter, fatter when slower around corners, for example?
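Even a crude post-processing pass over the output coordinates might get partway there. A purely illustrative sketch, mapping local pen speed to width (slower pen, wider stroke):

    // points: the (x, y) pen positions the model outputs, in drawing order.
    function strokeWidths(points, minW = 0.8, maxW = 2.5) {
      const widths = [minW];
      for (let i = 1; i < points.length; i++) {
        const dx = points[i].x - points[i - 1].x;
        const dy = points[i].y - points[i - 1].y;
        const speed = Math.hypot(dx, dy);   // distance per step ~ pen speed
        widths.push(maxW - (maxW - minW) * Math.min(speed / 5, 1)); // 5 is a tuning constant
      }
      return widths;
    }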
Very interesting. I did notice that with my input of "Hello, this is some hand written text!", the output of the word "text" consistently looked more like "tent" than "text", regardless of the style or legibility settings.
If I write, for example, five letter "k"s in a row (space-separated), the technique used to write them varies in a way that it wouldn't for a real human IRL. I guess you could argue: why would someone write five k's in a row?
(I won't mention that I observed the same variation when repeating the process using a name of a certain notorious organization that uses 3 k letters)
How would you go about creating a minimal-to-medium-sized input dataset to train this?
Could you create a 'written test' type thing, a couple of pages of selected forms with repeated instances of glyphs and their interactions, that would be used as input to learn a passable subset of one's own handwriting?
A random FYI: the demo seems to consistently fail on the letter x when using style #9 and typing out:
The quick brown fox jumps over the lazy dog
I tried a few different spots on the legibility slider with similar results.
It draws the first "top left to bottom right" line, but then instead of a "top right to bottom left" stroke it draws another "top left to bottom right" next to it.
Seeing this had a strong uncanny-valley-like effect on me. Like my brain can't help but imagine a sentient agent writing the text and suddenly becoming psychotic.
This looks very interesting. I wonder how much data would be required to imitate a particular handwriting style? I.e., if I know someone with lovely penmanship and I can ask that person to create such a reference dataset, how much should they write in order for the network to reproduce the style reliably?
Yours is amazing! By the way it has a big problem with "/" in the default settings. I tried doing a date like "08/20/2020" since that's something I need for signing forms but the slashes mangled it. Dashes worked ok.
This is amazing. It's cool how you can drag the legibility slider up and down while it's writing and it responds live. It looks the most realistic to me when I alternate between high and low legibility, especially with certain styles like #3.
Is there a version that instead imitates pressure-sensitive input and then follows the same pattern? With a sufficiently large brush and downsizing of the image it could look very good. Maybe the lack of brush artifacts is the reason for the unnatural look.
How much data is needed to train the model to replicate a specific person's handwriting? Could this recreate my handwriting with less than a page of notes in my own handwriting?
Works for me on iOS. Did you click the “write” button in the bottom right corner? It seems you have to tap it after any change in text or settings for it to update.
Were you using actual Safari or an app’s stripped down UIWebView? If Safari, then I’m guessing you have a content blocker that’s stopping access to WebGL.
Yes, it's mostly based on that. The primary difference is that this model adds an inference network (VAE-style), which allows you to (a) sample style vectors from a latent distribution and (b) do so more efficiently than with the priming mechanism described in that paper.
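Roughly, the inference side looks like a standard VAE encoder: a network maps a reference sample to (mu, logVar), and a style vector z is drawn with the reparameterization trick. The names below are shorthand for illustration, not the demo's actual API:

    // Standard reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
    function sampleStyleVector(mu, logVar) {
      return mu.map((m, i) => {
        const std = Math.exp(0.5 * logVar[i]);
        const u1 = 1 - Math.random(), u2 = Math.random();   // u1 in (0, 1]
        const eps = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2); // Box-Muller
        return m + std * eps;
      });
    }

Priming, by contrast, means running the network over a real handwriting sample to set its state every time you want that style, which is slower.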
This is awesome. The fact that it writes the same word very differently each time is impressive. Would be interesting to know the underlying mechanics.