Here's a business case for how this could be useful and why I'm going to try it.
I operate a string of ~60 local businesses that have physical storefronts.
On average each store has 5 local competitors.
Our customers are VERY price sensitive - they will almost always call the "lowest price" guy in the area first.
As a result we currently have a duct-tape system to monitor competitor pricing and tweak ours as fast as possible.
MonitorBook would be a potentially much simpler mechanism to orchestrate this.
If the MB team is reading this, drop me a line; we'd potentially make a great case study / early adopter.
Things that would have to work to make this usable long-term:
a) API so we could pull this data into our pricing algo.
b) Clear error-checking confidence - if a site we're monitoring changes their code / display and we miss it, we'd have to go back to duct tape, which is sloppy and labor-intensive but reliable.
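Something like this rough sketch is what I mean by error-checking (just my own illustration in Python, not their code; the URL, selector, and alert helper are made up):

    # Rough illustration of the guard I'd want: if the tracked selector
    # stops matching, fail loudly instead of silently reporting "no change".
    import requests
    from bs4 import BeautifulSoup

    URL = "https://competitor.example.com/widget"   # made-up competitor page
    SELECTOR = "#price .amount"                     # made-up CSS selector

    def alert_team(message):
        # Placeholder: in our duct-tape world this is an email to the pricing guy.
        print("ALERT:", message)

    def fetch_price():
        html = requests.get(URL, timeout=30).text
        node = BeautifulSoup(html, "html.parser").select_one(SELECTOR)
        if node is None:
            # Page layout probably changed; fall back to the manual process.
            alert_team("Selector no longer matches on " + URL)
            return None
        return node.get_text(strip=True)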
You change your strategy :-) It's like Cryptonomicon: "The most advanced technology usually wins the war".
If they can do that, you'll have to figure it out quickly and adapt... But in my personal experience, there are way more what-ifs flying around than things that actually happen.
Thanks for letting us know (I'm one of the members of the Monitorbook team). HN was working until a few days ago; I will look into it. We have done our best to support as many websites as possible, but it is no surprise that our scraper is not perfect yet.
We will keep working on it :)
Why do I have to create a new account for this site? I have Google, Facebook, Twitter, etc.; consider using one of those sites and OAuth to provide a more frictionless experience for first-time users, especially considering your site requires me to come back X days later to really see it work. My $0.02...
I much prefer an email-and-password style login, especially for new sites. I typically try a service with a burner email account; then I can easily walk away from it all without continuing to get email. Plus, if you end up selling my address for spam, I know who you are...
I don't mind creating an account, but I'd let users try the product without one. I don't want to go through signing up – whether via OAuth or a dedicated account – just to find out the product doesn't suit me because of some detail I couldn't learn from the page or video. Unless it's absolutely unreasonable, I think you should always allow users to test your product anonymously.
Wow, seems like everyone has built a version of this. I've been working on something similar as well, and really went for nailing down the element selection. Seems to be a basic duplicate of what you have done. Best of luck.
Sorry, my work pc doesn't have sound so I can't hear the actual video. Does this track sites other than Amazon, or is that just the demo? Also, is there an estimate on the pricing for the premium service?
Overall, I think it looks pretty clean. But I would probably add just a bit more information on the main page. I see it's mostly in the video, but it would be nice to get pricing details, a description of exactly what things you can track, etc.
Very cool.. but what's the implication on the legal side? Many sites forbid "automated" scraping (sure, that can all be circumvented). On that note, what about sites like kimonolabs (http://www.kimonolabs.com)? They also let you create an API on top of an existing website. I imagine it's done by scraping. Would love to hear some thoughts.
Scraping Amazon might be illegal, or at least not specifically allowed; otherwise there would be an API. However, reading 'robots.txt', I guess their policy is 'open to scraping' by search engines, because what Google's spiders do to virtually everyone can be considered web scraping.
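You can at least see what a site claims to allow by reading robots.txt programmatically; a minimal Python sketch with the standard library (the URLs are just examples, and robots.txt is a crawling convention, not a legal ruling either way):

    # Check what a site's robots.txt says an arbitrary bot may fetch.
    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://www.amazon.com/robots.txt")
    rp.read()

    # Example product URL; prints True/False per the published rules.
    print(rp.can_fetch("*", "https://www.amazon.com/dp/B00EXAMPLE"))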
On the topic of legality/forbidding scraping, Kimono offers webmasters analytics and control of the scraping users do through its service. I'm not sure if this service is actually live, or forthcoming.
Has there been work after RSS to improve pub/sub protocols for websites to notify customers of changes/deltas, without the need for wholesale scraping?
I'm not aware of anything, but if someone else has some input I'd love to hear about it. FWIW, I believe Amazon does have an affiliate api that lets you search through the catalog. Same with Ebay, Alibaba, and most of the major companies. The only one I can think of that doesn't is Craigslist.
For what it's worth, I'm pretty sure that it's intentional on Craigslist's part. Due to the first come first serve nature of a lot of the free listings, it means that 'honest'/non-tool assisted users would never get a look in.
Nice website. Do you have plans to release an extension or an app for this? I have been using the Distill Chrome extension[1] to monitor pages with dynamic content and pages that require authentication.
Yes, the plan is to release a browser extension pretty soon. With the extension you will be able to track content in pages which require authentication.
We also have an iOS app in the works so that you can check your trackings on the go, and receive push notifications.
I had something like this on my back burner for years called Sitebeagle. It's command-line only. Glad these guys took the initiative to build something way more robust. It'll be great for tracking 1 day deals, seeing if somebody's logged in somewhere, and getting alerts for when band tickets go on sale. The possibilities are endless.
Very cool, I like it. Only comment is that you should try and get the video zooming in on where the mouse is or do some close-ups of the screen. It was hard to see exactly what was going on and it felt closer to a techie screencast than to a customer product demo.
Is it possible to track the price of an ipad mini on craigslist? ;p
I use websec as a cron command to do that for pages I care about, and at some point I implemented a Python app called NoticeThat that fell into oblivion because I had no idea how to market it.
It's nice to see someone catering to that use case who can actually execute on it.
How does this handle HTML changes over time? I believe this will store a URL + selector, then periodically scrape the page. Just curious what happens if the page changes later.
I found the 57s video on the front page to be quite instructive:
it's a bookmarklet that identifies some portion of the screen to scrape (a numerical value in the example), and tracks it over time, allowing you to set alerts at thresholds (presumably so you can buy at a favourable price)
The front page has an explanatory video. It looks like you can select a part of a webpage and it will produce alerts whenever that section changes by repeatedly polling the website (e.g., the price of a product on Amazon, like the video shows).
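To make that concrete, here is a toy sketch of the loop the video seems to describe: store a URL plus a CSS selector, re-scrape on a schedule, keep the history, and alert when the value crosses a threshold. Everything here (URL, selector, threshold, interval) is an invented example, not how Monitorbook actually works.

    # Toy version of "track a selected value and alert at a threshold".
    import time
    import requests
    from bs4 import BeautifulSoup

    URL = "https://www.example.com/product/123"   # hypothetical product page
    SELECTOR = "span.price"                       # hypothetical CSS selector
    THRESHOLD = 299.00                            # alert when price drops below this
    INTERVAL = 60 * 60                            # poll hourly

    history = []                                  # (timestamp, raw text) pairs

    def current_value():
        html = requests.get(URL, timeout=30).text
        node = BeautifulSoup(html, "html.parser").select_one(SELECTOR)
        return None if node is None else node.get_text(strip=True)

    while True:
        value = current_value()
        if value is not None:
            history.append((time.time(), value))
            try:
                if float(value.lstrip("$")) < THRESHOLD:
                    print("ALERT: price dropped to", value)
            except ValueError:
                pass  # the selected text wasn't a plain number
        time.sleep(INTERVAL)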