I want to share a small project closely related to the paper. You can try it out if you have 5-10 free minutes and an Android device:
https://edge-ml.dmz.teco.edu/
You log in; create a so-called workspace; add some labels like walking, jumping, tapping on the device, etc. (simple actions for testing purposes); collect data for your labels with your mobile device; and create a machine learning model with a single click. Then scan the QR code of the model and your actions are predicted by it in real time. It's really interesting how accurate the predictions are, even with a small sample size.
Possibly-pathological usage report: training with default parameters against three samples each of tapping the left and right edges of my device ~rapidly (2.5Hz?) doesn't produce a model that can accurately infer which side of my device I'm tapping.
I wouldn't be surprised to learn that adjusting the model parameters or adding more samples would remedy this. (By all means go dig out the samples and model created under "i336_" if you're interested.)
The UI is also notable. I initially visited on desktop because that's what I was already using, then realized the site is actually designed to be used in a desktop/device split scenario. Nice.
Some pedantic UI nits, in case you want them :) -
- The backend very occasionally 503s, which creates a toast/popup full of HTML in the UI. I only noticed this on desktop, but it can probably occur on device too.
- The current approach of tap-label-then-begin-typing-name-to-rename is cool, but it doesn't catch the backspace key, and I also had to guess that I could type. Create-textbox-on-mousedown would be more intuitive ("ah yes, clicking this does let me rename it") and would also eliminate the gotta-catch-all-the-keycodes problem.
- There's no way to preset the countdown and duration, or to have them persist on mobile. (I wanted to set the countdown to ~1-2 seconds.)
- If you submit an empty countdown in the mobile UI, the site gets very confused :) and a full page reload was needed.
- Once I'd hit Train I was like, "...okay, now what? Oh, the header parts ('Workspaces / blah / blah') are clickable, I click in there to go back. Right then." Back and/or obvious navigation buttons everywhere would probably be helpful for orientation.
- After leaving the tab for some time then coming back to it I was met with a bouncing "Unauthorized" at the top of the page. The requests for .../samples, .../models and .../workspaces are all 401ing.
That's actually everything. The UI is overall refreshingly straightforward/to-the-point (ie, not full of unnecessary layers of indirection!), and fast.
For example, I didn't realize I needed to create a layer to begin with, then when I created it on desktop it immediately showed up on my phone. Nice. (Next step, long-polling :P)
I was hesitant to share because it's quite new and still in development, but I'm glad someone from outside actually tried it! It has some issues, as you've pointed out, and I really appreciate the detailed report; I've noted them.
There is currently no explanation of the ML parameters, since I still use the website mostly for manual testing. Sadly, some of them are crucial for model accuracy and they are not easy for the user to configure manually, so the plan is to integrate meta-learning that finds the best possible parameters on its own. I will take a look at the model now :)
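(To sketch what "finding the parameters on its own" might look like: even a plain random search over the training parameters would already be better than asking the user to guess them. The parameter names and ranges below are invented, and train_and_score stands in for whatever train-and-validate call the server actually exposes; this is just random search, not real meta-learning.)

    import random

    # Invented parameter names/ranges for illustration; train_and_score is a
    # stand-in for "train on the workspace data, return validation accuracy".
    SEARCH_SPACE = {
        "window_seconds": [0.2, 0.3, 0.5, 1.0],
        "features": [("mean", "std"), ("mean", "std", "fft_peak")],
        "classifier": ["random_forest", "knn"],
    }

    def random_search(train_and_score, trials=20, seed=0):
        rng = random.Random(seed)
        best_score, best_params = float("-inf"), None
        for _ in range(trials):
            params = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
            score = train_and_score(**params)  # e.g. cross-validated accuracy
            if score > best_score:
                best_score, best_params = score, params
        return best_score, best_params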
Edit: After a quick inspection I see that the problem is actually sample size. Your samples are about 3 seconds long, and for a consistent model I need at least 30 seconds of data for each label.
Maybe some background: you can simply set the recording duration to 30 seconds and keep tapping for the whole 30 seconds; you don't need 10 samples each 3 seconds long. The server splits the recordings into 0.3-second windows for training, and the model tries to interpolate the function between these windows and their labels.
In your case, only 13 windows could be extracted from your recordings, so the model has only 13 data points for learning the mapping and tries to guess the function from them. The beginning and the end of a recording are also cut, because the sensors are not very reliable during those periods, so a 3-second recording becomes about 2 seconds long after that. This is why tiny recordings are generally not very useful for model training (and I should definitely change the default recording duration, which is 3 seconds!). From a 30-second recording we can extract ~100 data points, and that is usually enough for a somewhat reliable model.
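(Roughly what the windowing does, as a Python sketch; the 50 Hz sampling rate, the trim amount and the non-overlapping windows here are simplifications rather than the exact server behaviour, but the numbers line up with the description above.)

    import numpy as np

    # Rough illustration only: trim the unreliable start/end, then cut the
    # remainder into fixed-length, non-overlapping windows.
    def extract_windows(samples, rate_hz, window_s=0.3, trim_s=0.5):
        trim = int(trim_s * rate_hz)
        usable = samples[trim:len(samples) - trim]
        win = int(window_s * rate_hz)
        return [usable[i * win:(i + 1) * win] for i in range(len(usable) // win)]

    # A ~3 s recording yields only a handful of training windows,
    # while a ~30 s recording yields closer to a hundred:
    for seconds in (3, 30):
        recording = np.zeros(seconds * 50)  # placeholder sensor stream at 50 Hz
        print(seconds, "s ->", len(extract_windows(recording, 50)), "windows")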
But of course none of this is obvious (not even close...) and I should do a much better job of explaining the process to the user. I learnt a lot from your experience, so thanks again :)
Adding this several days later, so I realize you may never see this, but if (when? :D) you do happen to notice, feel free to email me at the address in my profile (click my username), regardless of how far in the future. :)
Really glad I decided to have a look through my comment history. Not sure when you added the edit, just noticed it now.
I forgot to ask - what are you actually using this for? What's the actual use case? I'm guessing some kind of specific research situation, but I'm curious exactly what.
Thanks for the explanation. So much more goes into this kind of thing than intuition would suggest...
Thanks for having a look at the data too. I recorded some new 35-second samples (at 100 Hz, since why not) into a new workspace, but then decided to make things interesting by recording both fast and slow taps :) I'm not sure if that was a bad idea lol.
(By all means poke at these too)
If I'm honest (and I think that would probably be most useful here), it's difficult for me to say the model is appreciably more correct now. It still:
- oscillates between tap side (and speed)
- doesn't correctly pick out which side a single tap was made on, or makes the correct prediction for 8.4 milliseconds and then rapidly follows it by all other possible predictions :)
- continues to flip through predictions after I stop tapping (possibly because the prediction system examines/averages the entire 5-second classification window?)
- placing my device on a flat surface apparently means "fast right", "left"
- holding my device mostly (say 95%) steady (ie, as little deliberate movement as possible) for several seconds apparently means "right", "left", "fast right", "right", "fast left", "right", "left", "right"... etc
There seem to be occasional spots of consistency, but I can't deny that's probably my brain wanting it to work :P
I'm still learning about sensor analysis and ML, and the above is what my gut instinct says might be most useful to report.
While playing with the system again I had a thought: if you displayed toggle buttons (that showed as buttons but functioned as radio selectors) for all sample labels during classification, and let the user select which "actual" input they were providing in real time, you could record the raw sensor data against the user's label selection during classification, store that recording, and then repeatedly replay it into the model during development to, e.g., show prediction vs. user-specified actual input.
Given that model training only seems to take a few seconds (and can perhaps be run on higher-end equipment during development to make the process even faster), such a highly iterative workflow could viably be rerun on every source file save. (On Linux I personally use `inotifywait` for this; for a wall of text on how much I like inotifywait, see also https://news.ycombinator.com/item?id=26555366.)
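(To make the replay idea concrete, here's roughly what I'm imagining; the file format, field names and predict_window hook below are all invented, since I don't know what the real model interface looks like.)

    import json
    import numpy as np

    # Hypothetical replay harness: load (window, actual-label) pairs captured
    # during a classification session and compare them against fresh predictions.
    def replay(log_path, predict_window):
        with open(log_path) as f:
            events = json.load(f)  # e.g. [{"window": [...], "actual": "left"}, ...]
        hits = 0
        for ev in events:
            predicted = predict_window(np.asarray(ev["window"]))
            ok = predicted == ev["actual"]
            hits += ok
            print(f"actual={ev['actual']:<12} predicted={predicted:<12} {'ok' if ok else 'MISS'}")
        print(f"{hits}/{len(events)} windows correct")

    # Rerun this after every retrain (or hook it up to a file watcher like
    # inotifywait) to see prediction-vs-actual accuracy change as you tweak things:
    # replay("tap_session.json", model.predict)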
--
On a completely unrelated note, I thought I'd ask: for a while I've wanted to measure my real-time speed on trips on the (...underground...) fastish metro/subway system that was installed a little while ago in NSW, Australia. All the conventional wisdom I can find suggests that using accelerometers is utterly useless for this type of task, but GPS generally needs to see the sky :). I was curious if there are any hacks/workarounds/cute tricks I might be able to use that might not be widely known.