Hacker News
A tutorial on active learning (2009) [pdf] (hunch.net)
85 points by apsec112 on Oct 28, 2017 | 6 comments



Great tutorial. I see active learning as a halfway house between supervised learning and reinforcement learning, because requesting labels is an action (as in RL), but of a very limited, predefined type.

A lot of problems which we initially model as supervised learning are in reality, in a live situation, more like active learning. Going the whole hog to RL may be unnecessary. But still, bandit-type models may be the right fully general setting.

These topics are very much in the air at the moment with a lot of new interest in bandit models and algorithms.


Are there decent, recent Python implementations of these? Or tutorials of some sort?


The Vowpal Wabbit library provides an active learning setup: https://github.com/JohnLangford/vowpal_wabbit/wiki/active_le...

I think you launch an active learning server, and the Python application interacts with it.

You may be able to run everything in Python https://github.com/JohnLangford/vowpal_wabbit/tree/master/py....

Here's further description https://github.com/JohnLangford/vowpal_wabbit/wiki/Command-l...


Does anyone know the pros and cons of this versus “online learning”?


They are difficult to compare because they are intended to solve different problems. Typical learning algorithms have access to the entire, pre-labeled training set, which they can repeatedly iterate over.

Online learners are still trained on a completely labeled data set, but they cannot access the entire data set at once. Instead, examples arrive one at a time and cannot be saved or replayed.

In active learning, the labeled data can be processed in batches, but the learner either has access to additional unlabeled data or can generate new examples itself. During training, it can request that some of these examples be labeled. Imagine asking a teacher for help.

Online learning makes sense when you have so much data that you cannot possibly store it all; active learning makes sense when you have less labeled data and the labeling step is expensive.
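
To make the labeling-cost point concrete, here is a minimal toy sketch (not taken from the linked tutorial's code or from any library). It assumes a 1-D problem where the true concept is a single threshold, labels are noise-free, and `oracle` is a hypothetical stand-in for an expensive human labeler. The passive learner asks for every label; the active learner only queries the point it is most uncertain about, which here reduces to binary search over the sorted unlabeled pool.

```python
import random


def oracle(x, true_threshold=0.62):
    """Hypothetical labeler (assumption): 1 if x is at or above a hidden threshold."""
    return 1 if x >= true_threshold else 0


def passive_learn(pool, label_fn):
    """Request a label for every point, then use the smallest positive as the threshold."""
    queries = 0
    positives = []
    for x in pool:
        queries += 1
        if label_fn(x) == 1:
            positives.append(x)
    threshold = min(positives) if positives else float("inf")
    return threshold, queries


def active_learn(pool, label_fn):
    """Query only the midpoint of the still-uncertain region of the sorted pool."""
    xs = sorted(pool)
    # Invariant: everything in xs[:lo] is known (or implied) to be 0,
    # everything in xs[hi:] is known (or implied) to be 1.
    lo, hi = 0, len(xs)
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label_fn(xs[mid]) == 1:
            hi = mid          # boundary is at mid or to its left
        else:
            lo = mid + 1      # boundary is strictly to the right of mid
    threshold = xs[lo] if lo < len(xs) else float("inf")
    return threshold, queries


random.seed(0)
pool = [random.random() for _ in range(1000)]  # unlabeled pool

passive_threshold, passive_queries = passive_learn(pool, oracle)
active_threshold, active_queries = active_learn(pool, oracle)

print("same threshold:", passive_threshold == active_threshold)
print("labels requested (passive):", passive_queries)
print("labels requested (active):", active_queries)
```

Both learners end up with the same threshold, but the active one needs a number of labels that grows only logarithmically with the pool size. With noisy labels or richer hypothesis classes the savings are less clear-cut, so treat this as an illustration of the idea rather than a general guarantee.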


Sorry, I have only just seen this. What a fantastic answer, thank you so much. I completely get it now.



