As for "The problem I ran into was that it takes you off the website you were searching on and takes you to a Google results page (complete with slightly irrelevant ads)."
Look, this is cool and all from a technical perspective, but why are people strapping themselves into static site generator straitjackets like this? At some point, doesn't it become much simpler to just use a server-side framework?
My personal site is simple enough that there aren't any database-requiring features I actually want on it, and benefits of Jekyll include:
- Just feels nice. I actually enjoy playing with it.
- Cheap: I host it on AWS CloudFront (a CDN), and it costs $0.10/month for a few thousand pageviews.
- Fast: with so few pageviews even WordPress would be fast enough for me, but I like that it's REALLY fast.
The price benefit isn't really a consideration for me: I still have the server I used to run it on (in fact I don't have a Unix machine at home, so I edit the site on my server). But it is nice that it can be hosted on a really good CDN, which a database-driven website couldn't be.
I use Armin Ronacher's rstblog site generator. I'd say the inverse of what you said is true, at least in my experience. At some point it is much simpler to just use a static site generator. That point is when you nail down your workflow. I wrote about this very topic this morning:
Do you have any plans to monetise this ever, and/or might you ever release the source so people can run it locally themselves?
I have a small enough amount of content on my Jekyll site that I don't want a search function right now, but if I ever were to want one, the missing feature I would like is the ability to add pages not from an RSS feed, as in standard non-blog pages such as /index.html and /about.html (or whatever). Obviously I could work around by adding them into an RSS feed... but that's a tiny bit messy.
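For what it's worth, the RSS workaround mentioned above could be scripted rather than maintained by hand. This is only a rough sketch in Python: the function name `build_feed`, the channel title, and the example URL are made up for illustration, and whether Tapir would index such a feed the way you'd want is an assumption, not something confirmed here.

```python
import xml.etree.ElementTree as ET


def build_feed(site_url, pages):
    """Return an RSS 2.0 document (as a string) listing static pages.

    pages is a list of (path, title, summary) tuples,
    e.g. ("/about.html", "About", "Who I am").
    All names here are hypothetical; adapt them to your own site.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The three channel elements RSS 2.0 requires.
    ET.SubElement(channel, "title").text = "Static pages"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Non-blog pages exposed for indexing"

    for path, title, summary in pages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = site_url.rstrip("/") + path
        ET.SubElement(item, "description").text = summary

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_feed("http://example.com/", [
        ("/index.html", "Home", "Front page"),
        ("/about.html", "About", "Who I am"),
    ]))
```

You'd write the output to something like a separate feed file at build time, which keeps the non-blog pages out of the feed your readers subscribe to.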
edit: Assuming I'm right in thinking that Jekyll is one of the systems best suited to Tapir, if not the best suited, you could perhaps mention it on the actual Tapir site so that search engines can pick it up. (Not that I'm in any way an SEO expert, but in my experience with niche areas like this, very little or no attention to SEO can still get you ranking highly for something like "jekyll search").
You can actually just push any content with a link through the API (see http://tapirgo.com/#docs > Push API). So with a deploy script that takes the content and pushes it, you can already do this right now!
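A deploy script along those lines might start by turning each built HTML page into a small record with a link, a title, and the page text. The sketch below (Python, standard library only) shows just that extraction step. The field names `link`, `title`, and `content` and the helper `build_payload` are assumptions for illustration, not the documented Push API format, so check the docs linked above for the real request shape before sending anything.

```python
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collect the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip and data.strip():
            self.chunks.append(data.strip())


def build_payload(site_url, path, html):
    """Build one record to push. Field names are hypothetical."""
    parser = PageText()
    parser.feed(html)
    parser.close()
    return {
        "link": site_url.rstrip("/") + path,
        "title": parser.title.strip(),
        "content": " ".join(parser.chunks),
    }


if __name__ == "__main__":
    page = "<html><head><title>About</title></head><body><p>Hello</p></body></html>"
    print(build_payload("http://example.com", "/about.html", page))
```

The deploy script would then loop over the generated files, call `build_payload` for each, and send the result to the API in whatever format it actually expects.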
Agreed. Right now this is technically "Search for static BLOG sites". If you had the ability to scan simple HTML files as well, it could work for any HTML site, even if it weren't a blog site (or another site type that has a time-ordered list of posts in an RSS feed).
We're definitely looking into this and hope to find a clean solution soon. We started by indexing RSS since it was what we needed and since it was _way_ simpler to implement for a first version. Stay tuned! :)
Never used Elastic Search, but it's not too hard to use Nutch + Solr for static sites. Nutch to spider your (truly static) site, and Solr to store and serve search requests.
That would be the simpler approach, but Google wants you to put their logo on your results page and limits you to 100 requests per day, unless you get a paid plan. (http://www.google.com/cse/docs/tos.html)
If you're fine with that, Google Custom Search is a good option. If you're not, maybe Tapir can help. :)
You can set it up to load the ads on the same page. For example, running on my Jekyll blog: http://paulstamatiou.com/search