Hey HN! A few months ago in August I posted about Archivy [0] here, and got tons of useful feedback and really enjoyed the discussion. Since then, I've improved the whole project quite a lot and wanted to share and talk with you again.
I'm really excited about our improvements and happy with our progress in this v1 release. It feels good to be back here after a lot of work :)
Hmm... It seems that I should finally stop using the (good but too generic) DokuWiki engine in favour of this.
What interests me the most is the advertised ability to bridge a self-hosted wiki (did I get it right?) for taking any kind of notes with web bookmarking, plus the ability to run a good-quality search over the _content_ of bookmarked pages. This is where I'm having real pain. There is so much great, useful stuff on the Internet, but it's so hard to find some very specific thing that you know exists and that you even bookmarked long ago... So if it solves that pain and also features a notes engine comparable to DokuWiki's... I think it'd be an absolutely great thing! I'll give it a try.
BTW: does it support importing existing bookmarks from Firefox?
Is there a privacy statement? Given open source PIMS that variously phone home (when started, or when visiting assorted websites, etc), it's now something I look for, right after self hosting.
I think you should put in a simple privacy statement that comes down to "Archivy won't make any external requests" if it's going to be as simple as that, and then stick to it like glue. If it's not that simple, be accurate to whatever room you'd like to leave yourself, and then stick to that.
Not having a statement like that leaves the door open for a future git pull to change the landscape. Technically the statement won't stop that, but it'll keep the project accountable.
FWIW it's way more important for this sort of thing because it's a stack you have to invest lots of time and notes into and buy into long term. Were I going to run a personal wiki, I'd be very interested in knowing I could likely use it for a long time.
It may not be Archivy directly. But, for example, if the web pages use static assets like Google Fonts or jQuery, they will send external requests, thus leaking the presence of the self-hosted server. Or maybe there's some JavaScript plugin that phones home.
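One way to check for this kind of leak is to scan the served HTML for assets hosted elsewhere. Below is a minimal sketch using only the Python standard library. The names `find_external_assets` and `FETCH_ATTRS`, and the `localhost:5000` host in the usage note, are made up for illustration and are not part of Archivy. It only looks at a handful of tag/attribute pairs, so it will miss requests made from CSS or from JavaScript at runtime, and it flags any `<link href>` even when the `rel` type would not trigger a fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tag -> attribute that usually makes a browser fetch a resource.
# Hypothetical, deliberately incomplete list for illustration.
FETCH_ATTRS = {"script": "src", "link": "href", "img": "src", "iframe": "src"}


class ExternalAssetFinder(HTMLParser):
    """Collect asset URLs whose host differs from our own host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.external = []

    def handle_starttag(self, tag, attrs):
        wanted = FETCH_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name != wanted or not value:
                continue
            # Relative URLs have an empty netloc and stay on our server.
            host = urlparse(value).netloc
            if host and host != self.own_host:
                self.external.append(value)


def find_external_assets(html, own_host):
    """Return the URLs in `html` that point at a host other than `own_host`."""
    finder = ExternalAssetFinder(own_host)
    finder.feed(html)
    finder.close()
    return finder.external
```

Feeding it a rendered page, e.g. `find_external_assets(page_html, "localhost:5000")`, returns the list of third-party URLs a browser would likely contact. An empty list is a necessary but not sufficient sign that the page stays local.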
Is there a specific technical reason why you don't use an XML database with, let's say, an XQuery frontend for storing and retrieving document data?
> Extensible search with Elasticsearch and its Query DSL
Omg thanks. I'm so sick of Confluence's inability to search through some documents. I really wish I had all of my Confluence spaces on disk so that I could `grep` it all.
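For what it's worth, once notes do live on disk as plain text, a grep-style search is only a few lines. Here is a rough sketch, assuming the notes are Markdown files under some root directory. The function name `grep_notes` and the `*.md` default are my own choices for illustration, not anything Archivy or Confluence provides.

```python
import re
from pathlib import Path


def grep_notes(root, pattern, glob="*.md"):
    """Return (path, line_number, line) for every line matching `pattern`.

    Searches recursively under `root`, case-insensitively.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        # Replace undecodable bytes instead of failing on odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

This is a linear scan, so it has none of the ranking or tokenizing a real search index gives you, but it never goes stale either.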
Is it not searching/indexing pages or attachments? Ignoring the frustrating interface and query syntax it (in theory) should share some characteristics with Elasticsearch since it uses Lucene on the backend. In my experience it does an ok job of indexing pages and some file types. It would be nice if it just dumped the strings in binary files for indexing but I can see how that wouldn’t be very user friendly.
But it can only do ok when the indexing is working as it should. It can be easy to miss a stale index for a while because search continues to work. Then you’re waiting for it to catch up or even for a full reindex. The problems seem to get considerably more frequent with their HA products since you’re now distributing the search/index.
My impression from running a few different pieces of software and maintaining some smaller Elasticsearch clusters is that search is pretty difficult to do well, or even decently for that matter. Development, UX and operational support have to come together just right and when it’s just a feature of a product or a service instead of a core competency you get milquetoast and frustration.
More basic, but open source. Obsidian has very interesting features for showing and creating connections between pages, while Archivy seems to take a more standard structural approach.
Links between pages are a planned feature, though.
One of the big features of archivy is how bookmarks work: If you bookmark a webpage, it gets converted to markdown and stored in your knowledge-base to prevent link-rot, which is an awesome feature.
I'm not sure Obsidian lets you archive pages this easily.
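To give a feel for what "converted to markdown" means in practice, here is a toy converter built on the standard library's `html.parser`. This is not how Archivy does it (the thread doesn't say which converter it uses). `MarkdownSketch` and `html_to_markdown` are hypothetical names, and only headings, paragraphs and links are handled. A real tool would use a dedicated HTML-to-Markdown library and would also strip scripts, styles and navigation.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Convert a tiny subset of HTML (h1-h3, p, a) to Markdown."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.parts = []   # output fragments, joined at the end
        self.href = None  # target of the link currently being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append(self.HEADINGS[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.HEADINGS or tag == "p":
            # Blank line after each block-level element.
            self.parts.append("\n\n")
        elif tag == "a":
            self.parts.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html):
    """Return a Markdown rendering of the supported subset of `html`."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

The point for link-rot is that the output is plain text you own: once the Markdown is saved next to your notes, the original page can disappear without taking the content with it.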
[0]: https://news.ycombinator.com/item?id=24199419