Show HN: We built an end-to-end encrypted alternative to Google Photos

mrobins · on Aug 29, 2021

I’ve been watching this project for a long time and personally am very excited. The fact that it’s #1 on HN today (congrats!) makes me think I’m not the only one.

There are also a lot of valid concerns in these comments about privacy and use of algorithms. A lot of it depends on what you’re looking to gain by adopting a new service/switching away from something else and individual concern.

Personally, I’m looking for a place to store personal photos: friends, family, travel etc. Critical needs - easy sharing ideally not locked into Apple’s ecosystem - not to have my photos mined for advertising and social graph data (most important) - ideally around for the long haul but in my mind this is for sharing, not backup

I’m not particularly concerned about warrants, government surveillance etc. Again for me this is about sharing so the expectation of true privacy is low. Any photos I considered sensitive I would store elsewhere.

For me, the biggest point of confidence I have in this project is that they charge money from day 1 and don’t have a forever free plan. I’m excited about projects that offer the benefits of “social” but where the software, not my data, is the product.

istingray · on Aug 29, 2021

re: "the expectation of true privacy" you might enjoy reading the Cypherpunk's manifesto [0]

"Privacy is necessary for an open society in the electronic age. Privacy is not secrecy. A private matter is something one doesn't want the whole world to know, but a secret matter is something one doesn't want anybody to know. Privacy is the power to selectively reveal oneself to the world."

[0] https://www.activism.net/cypherpunk/manifesto.html

Roritharr · on Aug 29, 2021

I'm in the same boat, have been watching, love that they have a businessmodel and am waiting for the time when they are covering my needs (face recognition, object / scene detection...). I'd even pay a 2$/month "lurker" subscription which has like 100mb of storage so I can check the features from time to time and support the team.

noduerme · on Aug 30, 2021

As someone who's never used cloud-based photo browsers... I always assumed the facial recognition aspect was primarily for social media apps that try to tag known faces from a user's friends group, to put it in those people's news feeds or something. It's one reason I avoid being photographed and ask people not to tag my name to my face if they do post a photo I'm in. I'm wondering, what's the utility of facial recognition if you're storing/sharing photos on a service that has no database of known faces? Or is this just for image editing or red eye removal or something?

[edit] as I'm rethinking it, would this just be for searching your own images for a particular person...?

skinkestek · on Aug 30, 2021

> as I'm rethinking it, would this just be for searching your own images for a particular person...?

My Synology NAS has face recognition and it is wonderful even if (actually: especially since) it has no pre-existing database and doesn't (to the best of my knowledge) share its database.

For someone like me who manages family photos for the entire family but isn't to good at recognizing faces it is just brilliant.

kingsloi · on Aug 30, 2021

I agree, Moments isn't a bad piece of software, especially being able to group/combine the same person, that is tagged as a different person. My newborn was like 50 different people when I first uploaded our pics, merging them together was as easy as a few clicks.

I wonder if it's a good idea to use Synology as onsite, and ente as 123 backup solution?

jetofff · on Aug 31, 2021

It's so incredibly useful to be able to bring up pictures but you don't remember the exact time or date that you took it.

Google photos has come in so clutch when you're searching through 50k photos.

thinkloop · on Aug 30, 2021

To be able to categorize by person, ex: "list all photos of Jim".

hulahoof · on Aug 30, 2021

This would be a useful feature for myself, I am also loathe to tag faces on social media with all that entails; but I find myself approaching a friends birthday or other events wishing I could search my images for everything that included them from the past year

bigiain · on Aug 30, 2021

So this is a project specifically marketed as E2E encrypted, and you are "waiting for the time when they are covering my needs (face recognition, object / scene detection...)"

You will be waiting a long long time for that.

The only way they can do that is client side, and if they go there we are back to the last few weeks discussion of Apple's new client side image scanning shit.

You do not want this service, it seems.

You want a non Google service who can do face recognition, and object/scene detection, but who'll pinky promise you they won't sell you out to advertisers or law enforcement or governments, even though they obviously could.

vishnumohandas · on Aug 30, 2021

> we are back to the last few weeks discussion of Apple's new client side image scanning

Apple has always been indexing images on the client side. What changed is that they're now reporting the presence of a predetermined set of hashes to authorities.

If governments were to mandate that such reporting is necessary, it is likely that the enforcement will be on a device/OS level, extending the example set by Apple. Demanding compliance from every single cloud storage provider out there (E2EE or not) would be a sub optimal route for them to take.

My point being, "client side indexing" is not the evil here, and it is unlikely that storage providers will be the ones forced to share data. Your concerns should probably be directed at your operating system.

diebeforei485 · on Aug 30, 2021

I don't think this is fair.

What iCloud Photos is doing for their client-side scanning is: (1) Not to your benefit. There is no positive outcome for you from your photos being scanned. (2) Mandatory if you want to use iCloud Photos.

In contrast, I presume this would be- (1) Only to your benefit, because all of this derived metadata around scenes and faces would also be encrypted end-to-end as part of the photo library. (2) Entirely optional.

illgenr · on Aug 30, 2021

What do you mean a long, long time?

Increasingly powerful GPU compute being released and constantly improving image recognition models out in the wild. I'd bet there's a nicely packaged, open source solution released in under 3 years.

mncharity · on Aug 30, 2021

I wonder how sales psychology might differ between a "lurker" subscription and an inexpensive limited plan? Lurker might have a more explicit "I think you're interesting and want to support/encourage you - thanks, we appreciate it" exchange. Or maybe defuse "but is it usable?" or "do I want to bother attempting to use it?" or yet-another-thing commitment concerns. Not "am I really going to use this?" but "does this look worth encouraging?". And maybe has a funnel story of "ok, now it's looking good, and I'll start using it for real... and not the mere limited plan". Sort of a patreon vibe, but blended with plans?

Chilko · on Aug 29, 2021

Looking at their pricing for €0.99 / month you can get 10gb storage, so go at it!

dexterdog · on Aug 30, 2021

Storage is cheaper on S3

bigiain · on Aug 30, 2021

If you're gonna dick them around over the difference between €0.09/GB/Month and $US0.025/GB/Month, they're probably ecstatic to not have you as a customer.

Either you're whining about their entire ecosystem of encryption, key sharing, mobile apps, desktop app, web app, etc - not being worth a cup of coffee a month "Cause I can do it all myself using S3!!!", or you're planning on storing many times more than ~200GB on their platform.

Youden · on Aug 29, 2021

The monthly storage costs are too high. For the price of 1TB from you (15€), I can buy more than 2 TB just about anywhere else.

Commercially, Apple and Google are both 2TB for 10 CHF and Amazon gives you unlimited as part of a Prime membership. Storage providers like Backblaze and Wasabi both charge around $5/TB and that's really the table-stakes price. For the more DIY-inclined, Hetzner sells a 2TB OwnCloud instance for 9.90€/month.

I'd prefer to buy software from you than storage. It's out of the question for me to pay you per TB but I'd consider paying a flat rate for software I then host myself.

imagine99 · on Aug 29, 2021

I fully agree. It's a hard sell getting people to switch from an evil but known cloud provider to an unknown cloud provider that claims to not be evil.

What we do not need is more cloud offerings that can change, vanish or lock us out at the blink of an algorithm's eye.

What we need, rather, are reliable and easy-to-use solutions that allow us to retain full control of our data (i.e. self-hosted and offline) while having feature parity with the big cloud-only solutions.

I for one am convinced that there is plenty of money to be made that way. Perhaps not as much on autopilot as with the quasi-scam that is cloud computing, but people willingly paid hundreds or thousands for software before clouds and subscriptions. People will do so again, if you bring a convincing, unique or competitive product to market.

That being said, I like, appreciate and support this project for its impetus, even though I think its distribution strategy is misguided and fad-driven (re-selling cloud space instead of selling software). It's not too late to change that...

vishnumohandas · on Aug 29, 2021

Hey, so the project had initially started off as a self-hostable software (with an option to buy a pre-configured device). We realized soon that it's hard to monetize such a product in the consumer space to the point where it can become self-sustaining.

We don't have a problem with offering a self-hosted variant. But given our limited engineering bandwidth we had to take a call on who our target market should be, and we felt that it was more important to make privacy accessible to people like my mom and dad. Hence this direction.

Radim · on Aug 29, 2021

> We realized soon that it's hard to monetize such [self-hosted product]

Spot on. We iterated on a similar product in this space: "privacy preserving", "self-hosted", "open source" etc. But focused on local AI indexing & search of personal videos and photos [0], rather than backups.

We ultimately shelved VideoNinja because we weren't able to find a sustainable business angle:

* Non-technical people simply don't care (happy locked into Apple / Google).

* Technical people understood the proposition, but are super stingy. Case in point, see the responses in this very thread: "$10 per year max; I can buy a HDD for less!". That's one (cheap) restaurant meal per year.

So I fully understand your decision to go "cloud". Although that immediately takes your product off the table for me personally. I want nothing of mine (of value) in the cloud.

I feel there must be a way to square that circle, the market exists.

[0] https://video-ninja.com/

pkursawe · on Aug 29, 2021

Just put a price on it, ffs! Make it extensible with plugins. To gain 100% trust make it open source. I am happy to pay good money of a local, non-leaking AI based tagging software for video and photos.

tw04 · on Aug 30, 2021

> To gain 100% trust make it open source.

I think until they've got a customer base and a proven model a happy median is to put the code in escrow and agree to give the source to paid licensees should the project be abandoned/more than x months without updates/whatever.

Z_I_F_F · on Aug 29, 2021

Very surprised no one has mentioned Synology yet. This has been done. And it's awesome!

I currently have a self-hosted google photos clone and I only paid for the hardware. Highly recommend.

naasking · on Aug 30, 2021

Synology's Moments is ok, but it has issues. Not mobile friendly at all, can only create one shareable link per album, and others can't contribute their pictures to your album. Those are the biggest issues in my experience.

iforgotpassword · on Aug 30, 2021

I'm still not satisfied, but photoprism seems to move into the right direction here. Digikam os great of you want everything on a single machine. Shot well has other advantages. None of them have a good solution to immediately and automatically import any photo taken on your phone.

mekster · on Aug 31, 2021

Use apps like PhotoSync and it will upload automatically when photos are taken.

true_religion · on Aug 31, 2021

While it unfortunately didn't work in the consumer market, there's a space for video recognition in the business space:

- Scene finding for directors/news channels. AP and other sources have a lot of material but you pretty much literally have to watch the entire video in order to find a good scene.

- Scene finding for the XXX crowd. Very underserved market.

- Scene finding for police/lawyers. While it may seem like the opposite of 'privacy preserving', defense attorneys are literally just swamped with video evidence in an attempt to make them give up. Similarly if you're suing a big company for something as simple as an on the job injury or harassment, and need to prove there's a pattern of harm... they'll give you everything and let you do the work of finding out that there was a pattern of bad behavior.

It's the kind of thing that'd be useful as an open source solution... or failing that having a company which is 100% neutral in operation is also good.

I'm currently using Microsoft for something like this because they're absolution massive and apart from their OpenAI division, they only care that what you process is legal.

thinkloop · on Aug 30, 2021

> I want nothing of mine (of value) in the cloud.

What's the issue with the cloud if you encrypt client-side? It's off-site backup. Isn't it too risky to have your life's work on a few drives in the same location?

iforgotpassword · on Aug 30, 2021

And then after a year of usage it hits the news that they botched the encryption, or that they helpfully back up the encryption key in the cloud too.

edude03 · on Aug 29, 2021

I’d pay for this if it could run locally. Not sure what it would take to be sustainable but solving this problem is worth at least $20/month to me.

pratnala · on Aug 29, 2021

I think too many technical people have too much of a distrust of the cloud. I, for one, am happy to offload as much as possible to the cloud (except latency-sensitive things like games) and not carry around drives and drives at home.

Youden · on Aug 29, 2021

I get the decision but I think it misses part of the problem: how do you convince people like your mum and dad to start paying for backups and how do you convince them to pay extra for privacy?

I suspect the way it usually happens is that somebody your parents trust (like you) tells them to sign up for a privacy-preserving backup service.

But who's going to tell them to do that? Do you have the money to pay for advertising?

Normally, I'd suspect it's the tech-savvy younger folks who'd tell them to buy something like this but with your pricing and lack of self-hosted options, I suspect you've alienated a large portion of the tech-savvy audience you need to advocate for your product.

pininja · on Aug 29, 2021

If their service works well and is convenient to use, I’ll be recommending it by word of mouth. In the case of my parents, if I can finally consolidate and de-duplicate the photos from our 3+ Apple Photos collections by pointing the service at “library” folders from a few computers and devices, I’ll be a big fan.

vishnumohandas · on Aug 29, 2021

> how do you convince them to pay extra for privacy?

We are hopeful that we will be able to reduce the pricing as we scale up and hit a critical mass.

> who's going to tell them to do that?

We plan to implement a referral program, similar to what Dropbox did, to incentivize existing customers to spread the word.

That said, you do bring up interesting points. To repeat, we aren't averse to the idea of maintaining a self-hosted variant. Just that due to our limited bandwidth we had to choose one direction over another. Having advocates is important and I suppose with time we will have clarity on how to best do this without stretching ourselves too thin.

ignoramous · on Aug 30, 2021

For our (nascent) product went the other way and prioritised self-hosting at the expense of stretching ourselves too thin, as that's always been #1 ask from folks looking for "consumer-first" alternatives.

Time will tell if it was the right way forward, but I just went with "you can't fight gravity" and built it the way folks expect it to be (ex: supabase / posthog / gitlab).

Z_I_F_F · on Aug 29, 2021

I really hope the self-hosted option becomes a thing, but unfortunately "we are not averse to the idea" means especially little in the tech world these days.

That being said, really really hoping for your success! It finally fills a MUCH needed gap in 2021 consumer image viewing software.

There are many many gaps in it right now. Synology is basically the only self-hosted photo solution that grandma could use. Honestly surprised that more people aren't taking advantage of the opportunity.

robertlagrant · on Aug 29, 2021

I think that's a bit apocalyptic. Plenty of time to observe and adjust.

nicoburns · on Aug 29, 2021

Can I suggest adding pricing tier(s) between 100GB and 1000GB? I have between 100gb and 200gb of photos, and £14.99/month seems like a lot considering I only pay £2.49/month for google storage. I'd definitely consider paying a premium for this service, but not 6x.

vishnumohandas · on Aug 29, 2021

Drawing a direct parallel with Google will make this difficult, since they own their storage and network infrastructure and have ways to monetize your data. But here's an explanation on why there are large gaps between plans:

- Our 1TB plan costs only 3x the 100GB plan. This model works under the assumption that the average utilization of a 1TB plan (across all customers) will be ~30%.

- If we were to bring in an intermediary plan (say 500GB), we would have to increase the pricing of the 1TB plan (since at least 50% will now be utilized), and also set the price of the 500GB plan to at least 2x of the 100GB plan. Both plans now appear unattractive.

- Since Apple and Google don't support per GB billing yet (which IMO would have been the fairest way to go), we had to pick buckets, and the current ones seemed like the fairest possible.

I hope this makes sense.

unclebucknasty · on Aug 29, 2021

>If we were to bring in an intermediary plan (say 500GB), we would have to increase the pricing of the 1TB plan (since at least 50% will now be utilized), and also set the price of the 500GB plan to at least 2x of the 100GB plan

What happens if you start by pricing all tiers "honestly" (i.e. reasonably profitable even at 100% utilization)? Have you determined that the market won't bear that pricing? If so, is there any way to meet in the middle?

In general, you may be erring a little too much on the side of asking some customers to grossly overpay for their actual utilization and, in practical terms, 100GB to 1TB is just an extremely wide gap, as evidenced by your parent's comment.

So, it seems that most who tip over into the 100GB - 1TB plan will be there, overpaying, for a long time. And, obviously, most people who make it to 1TB will pass through that range. So, if you do see a higher concentration of users in that range than at 1TB (as intuituon would suggest), then you're essentially "punishing" a plurality of your customers by asking them to subsidize a smaller group's pricing.

Failing other options, it may be better to do the inverse: raise the pricing of 1TB to accomodate a "friendlier" 500GB plan.

nicoburns · on Aug 30, 2021

I definitely empathise with the difficult in competing with the big cloud providers on price. Your service is inevitably going to end up more expensive. Having said that, I'd be interested to know how you're hosting the content.

When I was looking at setting up a similar service, it seemed like you Backblaze B2+Cloudflare might well be the best combination. B2 will sell you storage at $5/TB, and you can get free bandwidth out to Cloudflare's network. It's against Cloudflare's terms to use free plan for image hosting that isn't just images as part of webpages. However, one of their staff members commented on a thread that they'd likely to be willing to set up a custom plan for a business who wanted to do this. And I'd bet that Cloudflare's bandwidth would be a lot cheaper than B2's.

vishnumohandas · on Aug 30, 2021

Pre-signed URLs generated with B2's S3 APIs are incompatible with Cloudflare at the moment. We are working around this by using a Cloudflare Worker to proxy data from B2 to the client. This is currently free if you're on the Bundled plan and Cloudflare's support has promised that when they decide to start charging, they will alert us in advance.

Interestingly, Workers Unbound charges 0.045/GB which is more than B2's 0.01/GB.

A viable long term alternative could be Wasabi that offers free egress in return for a $6/TB plan. But we're waiting to see how things pan out before executing an expensive migration.

bjyule · on Aug 30, 2021

When you say incompatible, are you talking about the cache not working or something else? How are you working around this using workers?

vishnumohandas · on Aug 30, 2021

B2 documentation suggests that after adding a CNAME (eg. cdn.ente.io) for their bucket endpoint (eg. bucket.s3.eu-central-003.backblazeb2.com), you will be able to replace the latter with the former. This breaks with the native B2 APIs with the following error:

```

{

  "code": "not_found",

  "message": "/api/top_level_url_mapping",

  "status": 404

}

```

The last I checked was a few months ago, not sure if things have been fixed now.

With Workers, we simply fetch the remote resource from B2 and return it back to the client, acting as a thin proxy.

istingray · on Aug 29, 2021

Curious about alternatives. GB to GB, other services will always be cheaper. How do you help frame pricing What about charging per picture? Likely a non-starter, but you get where I'm going with this. iPod = 1,000 songs in your pocket.

If not you, someone will figure this out. Charging by the GB seems hard. What if instead your levels were: 1,000 photos 10,000 photos 100,000 photos

You might get people who store super high res files, but work that into the pricing.

vishnumohandas · on Aug 30, 2021

I had thought about this a year ago when I was pitching the product to my parents who had no idea what a GB was. But I was put off by the possibility of abuse once I extended the framework to videos.

istingray · on Aug 30, 2021

Appreciate your reply! It gets to the core of your value proposition though. Surely you could add in some limitations if needed. If it worked, maybe the biz would grow so fast you don't care about a little abuse.

Do you have any marketers to help you? Will be hard to navigate the messaging alone.

thinkloop · on Aug 30, 2021

My phone photos are 2.2 MB each. 1,000 GB's is 1M MB's which equates to approximately 450,000 photos. At $18.99/TB/year, 1,000 photos would cost ~$0.42 a year.

Photos can easily be 30 MB each or more, especially from dedicated cameras. If all photos were 30 MB it would cost $5.69 per year for 1,000 photos.

Not making any point, just calculated it for myself and thought to share.

istingray · on Aug 30, 2021

I like this line of thinking.

You know it really gets me thinking about packages rather than GB for this service. Maybe there's a "family plan" opportunity here. Do families value anti-surveillance in general, or is it simply lone actors?

Just the idea of archetypes flashed through my mind. An opportunity to sell to difference audiences. What kind of algos do individuals need, pro photographers, families?

missedthecue · on Aug 29, 2021

What about Google photos is evil? I don't get it.

SXX · on Aug 29, 2021

Okay it's easy to downvote, but I'll elaborate instead. First of all Google is trainihg AI models on your data and also able to create shadow profiles for people including those who decide against using Google services.

They also used dark pattern on Android for years by enabling cloud sync by default for everything. So a lot of people got all their photos uploaded while they had no idea about feature.

So it's not any different from Facebook that constantly tried to collect as much data on you as possible. Do you know what is evil about facebook?

missedthecue · on Aug 29, 2021

I don't really get what's evil about AI models and cloud sync.

And I don't think anything is wrong about Facebook's business model. I think most people are uninformed about it and believe that they sell personal data, but if you understand the way they make money, it's very difficult to say that there is any particular issue with it.

istingray · on Aug 29, 2021

Ah, what you really meant was "what's evil about selling my data?" which is a much larger question. And it sounds like you already have your answer.

fomine3 · on Aug 30, 2021

They actually don't "sell".

krageon · on Aug 31, 2021

They take your data and turn it into something that has value to them. With actual selling, that something is money. In this case it could be something else, but saying it that way will not help the general discourse of this problem at all. Much like being pedantic over terminology.

istingray · on Sept 1, 2021

Ouch. This post reminds me of that one about GoogleSpeak: how Google limits thought about antitrust https://zyppy.com/googlespeak/

the_biot · on Aug 29, 2021

The other day I sent out a link made with Google Photos' "create link" function. That's not a share to another user, just a link that anyone can open, no Google account required. But one person showed me that hitting that link on her phone, Google wanted to authenticate her before showing the picture.

That is utterly unacceptable.

Z_I_F_F · on Aug 29, 2021

Genuinely curious - could you elaborate on why that is so unacceptable? What does requiring authentication imply, or lead to in the future?

Tempest1981 · on Aug 30, 2021

Prevents sharing with friends who don't have a Google account. It breaks what could be a general purpose sharing mechanism.

missedthecue · on Aug 29, 2021

This sounds like mild inconvenience.

What's evil about that?

naasking · on Aug 30, 2021

Mild inconveniences can become problematic at scale. One person taking a crap in a lake is typically not a big deal. 1,000,000 people doing the same is a serious health risk. Scale matters.

sam0x17 · on Aug 29, 2021

Yeah, if you are client-side encrypted, where you choose to host doesn't really matter because even with a warrant there is nothing you could do to recover data, so why not go for something like Wasabi?

woofie11 · on Aug 29, 2021

I can pay for a terabyte of Amazon Glacier for $50/year. Amazon Deep Glacier is $12 per month.

$300/year for 2TB isn't happening. I can buy a 12TB HDD for less, if I shop around.

I'd like a service like this to keep small, well-compressed 1080p or 4k photos available for instant access, and original files in archival storage of some kind.

I'm totally glad to pay the $10/year for the baseline service, and another $12 for deep glacier costs. I'm not glad to pay thousands of dollars for a service like this over the lifetime of my photos. I'm not quite sure where the line between that is.

I'll also mention: open-source, data export, and the option of self-hosting is helpful. I don't want to spin up an EC2 instance for this when I can buy $12, but if you go out-of-business, I'd like to have the option. Could also be an option you only guarantee if the service is discontinued or has substantially different costs/terms.

Strom · on Aug 30, 2021

> I can pay for a terabyte of Amazon Glacier for $50/year. Amazon Deep Glacier is $12 per month.

You can pay even less to store that data in /dev/null. To make a more realistic comparison you should also include data retrieval & data transfer costs. Reading a terabyte from those services costs around $100.

woofie11 · on Aug 30, 2021

I can think of close to zero times when I would need my full photo collection, in full resolution, all at once. In most cases, for showing photos, even 1080p highly-compressed is fine. In rare cases, I want to edit an old photo, and I want the original RAW file in full color depth and resolution.

mrobins · on Aug 29, 2021

With Amazon and Google you’re paying half in monthly fees and half with your mineable data. This service seems geared towards people who don’t want that.

Rolling your own on top of a cloud storage provider is great too but for an incremental $100-$200/year some people would pay for something that “just works”.

headmelted · on Aug 29, 2021

I’d love for something like this to exist (a fast, clean, well-designed mobile and desktop app for backing up my photos with E2E), but I’d only switch from one of the big providers if it were FOSS and I can bring my own backend target (e.g. S3, SMB, FTP).

In a perfect scenario I could generate my own private key to plug into my client devices and just have everything push to private S3 (and then from there archive to the cheapest, coldest glacier tier after it’s been synced to my home storage).

This to me would not be that complicated to build, but would essentially provide E2E Photostream and a backup of last resort in the cloud.

Obviously (as is the problem with all FOSS) you have the dilemma of how do the developers get paid, which I’m sure is why you went down this yet-another-paid-cloud-provider route instead of what I’ve suggested above.

All that said - I like what you’re trying to build, I could see it being useful to some, but providing E2E photo storage as a direct-to-consumer service is IMHO just asking to be held liable later for what your users store there should you gain any considerable traction.

hiimshort · on Aug 29, 2021

I'm sure this isn't a popular opinion due to the technical know-how involved, but these days I much prefer to selfhost my own services. Far too many times businesses have gone under, changed their practices, had pricing wildly fluctuate, or remove features I wanted. Having setup a handful of useful services on a cluster, I have much more peace of mind involving my data, feature access, etc.

I would love to see a FOSS version of ente available for me to host. My family is currently split amongst multiple photo library services and it'd be nice to say "Here's ours."

siscia · on Aug 29, 2021

Well you can, I wrote how here:

https://redbeardlab.com/2021/08/03/my-syncthing-setup-cheap-...

The nice thing is that S3QL allows setting a secret key, so your files just get encrypted before to be pushed on the cloud.

barbazoo · on Aug 29, 2021

+1 for custom storage target

iknowstuff · on Aug 29, 2021

I tried it, but unfortunately the complete lack of auto-categorization in all of those e2ee photo storage apps renders them unusable for anyone with a large library. Ente is not the first one to do this, there are many others with similarly lacking UX, like MEGA.

Both Apple Photos and Google Photos:

1. have easy search by location on a map of the world.

2. allow browsing to any date in an instant.

3. index photos by objects/faces and allow for instant searching - Apple even does it on-device.

Also, frankly, I don't trust you to stay around for long, so I would appreciate the option to store encrypted photos on a cloud of my choosing that I already pay for, with a separate subscription for using your app. Not sure what the Venn diagram of <cares about privacy>, <willing to pay for your storage>, <needs excellent browsing experience> looks like.

Looking forward to an app which works for people with large libraries. :)

vages · on Aug 29, 2021

All the features you mention are already addressed in the original post as planned future developments. Knowing that they are planned makes me put my trust more in Ente than in Mega (which I use as an alternative to Dropbox and am very satisfied with). Not that there’s anything wrong with confirming interest in their planned features; I’m just pointing Ente’s plans out for anyone who scrolls right to the comments.

As for possible bankruptcy, you can never be too certain, but it’s easier to stay in business with Indian costs of living than US. (The company is located in India.)

dotBen · on Aug 30, 2021

Have you tried the Synology Photos app (https://www.synology.com/en-global/DSM70/SynologyPhotos [1])

While it does have some kinks it's surprisingly good and has the features you are looking for in a locally hosted/publicly available option. You do have to buy one of their NAS's however.

I have moved over to this partly for privacy and partly due to cost (I produce way too many photos per year to store them economically at Google)

[1] fyi this is the reasonably 'new' instantiation as linked to here, they EOL'd the very old, different app of the same name from their v old NAS's. Adding that here in case anyone has or buys an old NAS, you may get the old version of the app - I think you need a NAS with a decent processor to perform the face detection etc.

UseStrict · on Aug 30, 2021

The biggest problem I've faced with their app suite is it seems to make my disks spin 24/7, constantly seeking even if there is zero external activity. It wouldn't be such a big problem if I didn't live in a small apartment and have to listen to them seek all night. Other people have reported the issue, but it doesn't seem like they plan on addressing it.

naasking · on Aug 30, 2021

I think it only does this until the catalogue has been indexed, depending on what options are enabled. For instance, if face matching is enabled then it has to process all of the pictures for faces and group them.

dotBen · on Aug 30, 2021

I'll be honest I keep my NAS in a purpose built server closet in my house which is shielded.

Maybe costs of SSDs are coming down enough you could use those instead?

novok · on Aug 30, 2021

I personally would want both options. I use mylio with many similar features and it has e2ee but you manage your own space / cloud and your still paying a monthly fee.

For the non nerd friends, managing your own cloud space is mostly a non starter although. The best choice is cloud storage managed by the provider as an option, along with the self hosted option.

istingray · on Aug 29, 2021

Thank you! As someone who as of a week ago is hardcore switching from Apple to Linux, I applaud you. I've purchased a 16" MBP, both Airpod models, iPhone, and iPad in the last 24 months. Now on to System76!

Whatever the past is, I believe there's a new market in 2021 for Apple-switchers that will unleash new funds for companies like yours. De-Google movement will pale in comparison to this in terms of economics. Looking into signing up just on principle. Non-E2EE encrypted, closed source, without ability to self host is a dead end, why put a penny more towards it. Open source options may suck today but it's the only path forward. Thank you for what you do - whether your company succeeds or simply inspires 1,000 new companies in its place.

What are your plans for Linux support? Your site only mentions Android and iOS, I see electron mentioned, but again I'm one of these Apple switchers, I have no idea what I'm doing really but I'm willing to pay for solutions.

Take my money!!!

Spooky23 · on Aug 30, 2021

I think that you are overestimating the size of the audience of people who have nerd rage over whatever we’re pissed off at Apple about or have meaningful concerns about government surveillance.

They’d be better off focusing on making a better user experience instead of E2EE drama.

istingray · on Aug 30, 2021

The size of the audience may be small indeed. But Apple users on average have deep pockets and are willing to spend. A quick search suggests Apple users spend >2x what Android users spend. No data yet on what Apple users spend compared to Linux users. It's part of the reason "but 5% marketshare!" was never a good argument against the rise of Mac/iPhone.

ludjer · on Aug 30, 2021

The moms and kids dropping 1000$ on candy crush every month are not going to switch off apple. The big spenders in apples echo system are not the tech literate. The tech literate normally cost way more to maintained from a client perspective. Also they are allot more concerned over costs with regards to space as seen by allot of comments on this page talking about how expensive storage is.

saladuh · on Aug 30, 2021

> without the ability to self host is a dead end

Indeed, this is why you would be foolish to use Ente as you cannot self host it. At any point they can choose to lock things down, make their clients closed source, etc etc, and you'd once again need to spend time jumping ship because you'd need to find a new ecosystem.

Ente is just convient and is coming at the right time (hence the massive amount of upvotes) but does not give you total control nor your freedom back. Using them instead of something you can self host is just running in circles.

> open source may suck...

What? This was extremely random and out of place with the rest of your comment.

What you really want, if you care about self hosting and all the other stuff you mentioned, is Nextcloud[0]. And if you don't want to self host yet, you'd be better off hosting Nextcloud in a VPS, even on Linode you can just 1 click deploy a Nextcloud instance in their app store[1]. That way you don't become dependent on a service you cannot control/deploy yourself.

[0] https://github.com/nextcloud [1] https://www.linode.com/marketplace/apps/linode/nextcloud/

raman162 · on Aug 30, 2021

I think ente does fill a niche, people don't mind paying dollars for companies because it is supposed to guarantee a level of service/polish. And in the case of photos, if the service were to shut down, there's very likely a path one can take to perform a migration.

I'm a big user of open source solutions, I use Linux on my machine and use syncthing to sync files across all my devices. I'm aware that my solution is not doable for everyone and that's the problem with most open source solutions, the lack of polish/ease of use. There are tons of systems that aren't open source that we are forced to rely upon for day to day, airplane software, traffic lights, telecommunications) and we've just accepted it because of convenience and trust.

What I'm trying to say is that we don't have to worry about self-hosting everything and force ourselves to only use open source tools. I do think that if we do use private tools, we should understand how our data can be exported to a new system if necessary so we're not "locked in".

istingray · on Aug 30, 2021

Standard Notes seems like a good example of this balance to me. While you can self host, I assume 99% of people don't. It must be an option, otherwise I wouldn't use it.

istingray · on Aug 30, 2021

Yes, Ente needs to have self-hosting on their roadmap or I won't support it.

> What? This was extremely random and out of place with the rest of your comment.

Edited to say "open source options may suck today"

Thanks for giving me the chance to explain. My comment here may give more context: https://news.ycombinator.com/item?id=28321460

I've tried NextCloud, even 1-click hosted by a third party. For all the power, it's not built with me in mind, it seems to treat my photos like files/data, not like photos. I want to pay money for that extra oomph, for algorithms, searchability (about $10/month for my photos seems about right), and I want to pay money so I don't have to pay with my time. Is there something I can buy that's on top of NextCloud?

raman162 · on Aug 30, 2021

There are many people like you willing to spend money on a good solution that gets the job done who have no interest in self hosting and reviewing the source.

After experiencing the ease of Google photos, any basic file management system to store photos is a downgrade after that.

If ente can figure out how to do the extras (search, face matching) without invading privacy (not even sure how possible this is) I can see this being valuable to the people who want to de-google and maybe even de-apple.

saladuh · on Aug 30, 2021

> not built with me in mind

Fairo

> on top of NextCloud

Not that I know of. Could have a look through the nextcloud marketplace for something or another. Tbh, I don't see any open platforms having Google/Apple photos kind of functionality for at least a little bit. Google and Apple trained their algorithms on the people using their free tiers for years. Google especially had access to so much information on the user using Google Photos that it was able to build the algorithms it has today. For an open platform to have this functionality, it would need to wait for an open source model/algorithm to exist, else it would need to build it itself by using user data (no E2EE then).

Unless Google open sourced whatever models it uses in Google Photos today, don't expect this level of searchability yet. Actually even if Google did, it would probably be so tied to Google user information and be incompatible with E2EE.

istingray · on Aug 30, 2021

How to create an open source training set without surrendering my data? Like Numerai but instead of a hedge fund it's photo data: https://numer.ai/

vishnumohandas · on Sept 6, 2021

Sorry for the delayed response, I missed this comment.

If you're on Linux you can either use our web app[1] or our desktop app[2]. The latter is just the former wrapped in electron, but with the ability to sync uploaded files to your local disk drive.

[1]: https://web.ente.io

[2]: https://github.com/ente-io/bhari-frame/releases/latest

meibo · on Aug 29, 2021

I don't think I'm ready to invest in a photo hosting solution again, be it with my time, my money, or my data, without it being open source/self-hostable or at least open core with a community behind it.

Been duped too many times.

jwr · on Aug 30, 2021

Similar sentiment here. I wish this project well, but photo storage is a long-term thing, and I've been bitten too many times (most recently by Apple shutting down Aperture, which left me with big libraries which are very difficult to migrate).

I considered writing my own software and making it open source, but then realized that photo hosting/sharing software with password-protected sharing features will be used by criminals to store/share CSAM. So, if I end up writing my own solution, it will sadly not be shared with anyone.

Incidentally, I think this service will run into a similar problem: end-to-end encryption is great, but if it gets to a certain size, governments will intervene.

mixmastamyk · on Aug 29, 2021

Curious about the details of how you were duped.

ncann · on Aug 29, 2021

Not OP but I have had many cloud photo accounts in the past: myphotoalbum, Kodak Gallery, photobucket, Flickr and more. Eventually all of them either shut down, or got sold and became unmaintained. Google Photos and Apple's are the only ones that I can trust will still be around in 10 years' time.

meibo · on Aug 30, 2021

Picasa as well, though Google nowadays does a pretty great job at getting all pictures you have on your account together with https://get.google.com/albumarchive/

marwis · on Aug 30, 2021

Pity that once uploaded there's no way to get your data back from Google. API scrambles EXIF location metatada while Takeout, besides being pain to use on an ongoing basis, fails if you store too many files.

vishnumohandas · on Aug 30, 2021

FWIW, ente processes all of the location metadata generated by Takeout during an import via web.ente.io.

marwis · on Aug 30, 2021

That's probably the best once can do other than reverse engineering the protocol used by Google Photos Android app - as that app seems to be able to download files with full exif, unlike official API.

Unfortunately, as mentioned, multiple users report that Takeout does not work once you get past certain size (I have 350GB and it fails every time). It's been failing for years, probably always. Of course Google doesn't care.

I guess if someone was in EU they could try to ask Google for their data under GDPR data portability, face inevitable non-answer and then go to court if they are determined enough.

vishnumohandas · on Aug 30, 2021

I sincerely hope that someone sues.

Google has blocked access to their APIs for migration[1] which IMO contradicts with their stance on data portability[2]. It is hard to assume good intent here.

[1]: https://developers.google.com/photos/library/guides/acceptab...

[2]: https://datatransferproject.dev

marwis · on Aug 30, 2021

As the famous saying goes, never ascribe to malice that which can be explained by incompetence. This is Google we are talking about, they are infamous for their lack of strategic focus and disorganization.

ridaj · on Aug 29, 2021

What are your plans for when your app is found to host content such as terrorist executions, child porn, etc.? (This isn't trolling, it's something that eventually happens with every product, and I've been wanting a non-Google version myself but wondering how that kind of abuse would be dealt with.)

cf_ · on Aug 29, 2021

Since it‘s a paid service with user accounts. You would be able to ban users that have been reported to use this service for illegal means. The same question can be asked to WhatsApp / iMessage / Signal / etc.

olah_1 · on Aug 29, 2021

the answer is right here https://ente.io/transparency

sobriquet9 · on Aug 29, 2021

It does not say how often it is updated. Wouldn't it be better to say "as of 8/29/2021, we have received no such requests and we are updating this page monthly".

sam0x17 · on Aug 29, 2021

Yes, this is a good first step towards a true warrant canary, but you need to date it and provide a cryptographic hash of the content.

mynameismon · on Aug 29, 2021

I don't think they would be able to do anything about it, since (from what I could infer from reading) it is zero-knowledge, so no one from the company can access the pictures. I might be wrong, though

cf_ · on Aug 29, 2021

Well, depending on legislation, they could be ordered to change the code to send the user password to them on next login for that account and then decrypt everything…

commoner · on Aug 29, 2021

The architecture of Ente (https://ente.io/architecture) prevents your unencrypted master key from being exposed to the server. The password authentication appears to be client-side, which means that the data could not be compromised solely by a malicious server-side change.

Now, Ente could still change its web application to somehow leak the master key and not disclose the changes in the source repo. One solution for this vulnerability is to package the entire web client as a browser extension, which is what Mega is doing:

https://github.com/meganz/web-extension

dane-pgp · on Aug 29, 2021

There are a couple of other ways to mitigate the problem for web applications. If you're willing to install a browser extension, then it might make more sense to use the Signed Pages extension[0] which applies PGP signature checking to web pages. The other solution is to use Secure Bookmarks[1], which combine SRI integrity hashes with Data URIs to ensure that a fixed bundle of JavaScript is running in the page.

[0] https://github.com/tasn/webext-signed-pages

[1] https://coins.github.io/secure-bookmark/

user-the-name · on Aug 29, 2021

Yes, and that is a problem.

mynameismon · on Aug 29, 2021

What is the problem/why is there a problem?

shuckles · on Aug 29, 2021

When push comes to shove, technology is subservient to society: https://en.m.wikipedia.org/wiki/Lavabit

user-the-name · on Aug 29, 2021

Well, first and foremost, if I ran a service, I would not want to help either terrorists or pedophiles. I would be very unhappy if I was doing that.

Secondly, if you do provide service to terrorists or pedophiles, and take no steps to stop doing so, law enforcement and society in general is not going to be very happy with you.

koheripbal · on Aug 29, 2021

The answer to this question is why the only solution in the long run is local storage.

imhoguy · on Aug 29, 2021

Just imagined a distopian future where storing data locally would be illegal, for the society good of course /s

fortran77 · on Aug 30, 2021

Not when you have government-mandated software checking your local files against hashes. Not today, but someday.

claudiojulio · on Aug 29, 2021

It is not possible to prove this, because the photos are encrypted.

thesuperbigfrog · on Aug 29, 2021

Encrypted content can be decrypted.

Links and data tranfers can be traced.

Warrants and suponeas can make such traces / actions legal.

lol1lol · on Aug 29, 2021

something that only showed up in mainstream media 10 years after smart phones got launched. gawd.

barbazoo · on Aug 29, 2021

Please please support custom storage back ends, I'd love to use my Dropbox or S3 or whatever to still fully own my pictures. And I'd love to pay extra to opt out of and analysis, tagging, etc of my photos. Basically I'd like the interface to be similar to Google Photos but with a privacy focused storage engine and clients.

DenisM · on Aug 29, 2021

I concur. However storage is how they plan to make money, so there will need to be a different monetization strategy for BYO storage. As yet I can't imagine any.

EDIT:

I think have an idea! Add the S3/OneDrive/Etc support but comment it out. To make use of it one would have to download the source, XCode, compile it, and deploy it. This puts a cap on the number of people who can do that, so you won't end up with everyone getting a free copy. Those people who are able to do it are likely to be asked for advice by their less techy friend, so this is basically free software to key influencers.... Ok, so this does not sound as exciting as it did before I started typing, but maybe this will lead to something...

fy20 · on Aug 29, 2021

The problem with that is that some kind fellow on GitHub will clone the project, uncomment the code to enable the premium features for free, and change its name. If it's released under a FOSS license, the original authors have little recourse.

This is what happened with Emby (a media server like Plex). The backend was open source and there was a license to activate premium features. Somebody cloned it, and then released the premium features to everyone for free.

vishnumohandas · on Aug 29, 2021

So it's a little more complicated than that.

Our API server runs the following

- authentication

- replication

- differential sync

- and a few more errands that are necessary for the apps to function

The solution to this would be to offer a self-hosted variant where you can plug in your S3 credentials. But like I mentioned else where in this thread, maintaining such a project comes with an overhead we cannot afford right now. Hopefully sometime in the future we will be able to afford the necessary engineering bandwidth.

barbazoo · on Aug 29, 2021

I like how Joplin does it for notes. You authorize them as an application in Dropbox or give them credentials to a S3 bucket. Don't get me wrong. I want to pay for your service. I just have to be able to access and decrypt my files if you had to shut down your service all of a sudden.

vishnumohandas · on Aug 29, 2021

Our pricing model is such that the product can self sustain itself. Also, we have a desktop app[1] that syncs your uploaded data to a local drive, so you don't have to worry about a lock-in.

But even if we do have to sunset the service due to unforeseeable reasons, our cold storage is relatively inexpensive and we will give our customers ample time to migrate out.

Also, in such a scenario we would want to publish our entire system in an easily deployable way so that all our efforts would not be in vain.

[1]: https://github.com/ente-io/bhari-frame/releases/latest

barbazoo · on Aug 29, 2021

I see where you're coming from and I really appreciate that you're taking the time to respond. I know it's unlikely for a service like this to shut down from one day to the next but it's not impossible, plus the whole thing about a service having the ability to shut me out of my own data, that's just scary. And many of us are already paying for storage on Dropbox and have secondary backups set up for instance. I'm just saying that this would probably convince more people to switch, leveraging a service they're already paying for plus whatever you're charging to facilitate - less than the full service with storage would cost but enough to make you some money as well. Again, offering privacy in a field that was previously devoid of it is a great step in the right direction.

toredash · on Aug 30, 2021

I would pay for a self hosted solution, or for a solution where I can plug into a backend you support.

I would also pay upfront, e.g. kickstarter

papito · on Aug 29, 2021

Heh. Yeah. Been building something like this, where you can have your choice of metadata storage and file storage. Out of the box, it would be Sqlite and the local FS, and then you can become adventurous. Postgres and S3? Elastic and S3? Sure.

Needless to say, years later, I am still building it. For one guy doing this on my own time, it's a lift. Maybe after I quit my job soon :)

barbazoo · on Aug 29, 2021

Is there something to share and possibly collaborate with others? Just now on the drive home I contemplated doing a POC with S3 storage but I acknowledge hoe much work that probably would be.

papito · on Aug 29, 2021

My journey with this started back in Java and Play 1. Now it's a Scalatra project. I am rewriting the front-end because the original was written with JS5 and Knockout, becoming essentially dead on arrival and pretty unmaintainable.

The idea is that the "engine" is going to be open-source, but the UI would be free and proprietary (you would be able to bolt on your own UI).

Once the UI is presentable to a point where I can actually test the engine against it, it would be ready for collaboration. But again, it's been a rough stop and go. No wonder something like this does not exist.

To be accurate, this is not a photo management project, it's a full on DAM. But I am doing photos first. Could end up being less ambitious at first, however. Even the baseline is a massive project.

mnming · on Sept 4, 2021

you may want to take at look at: https://www.boxcryptor.com/en/

pilingual · on Aug 29, 2021

Re: Shared Albums

>the receiver just needs a free ente account.

I feel like there should be an even more frictionless option to make it easy for family to access photos. For example, if there were a way to just trigger a mailing list when an album is added to, that would be perfect. “Here is an update on our trip: [link]” I love that you mention you are security and privacy focused, and I see how this could conflict with that mission. Perhaps a tradeoff here could be allowing one viewing via link and future viewings require account?

vishnumohandas · on Aug 29, 2021

> if there were a way to just trigger a mailing list when an album is added to, that would be perfect

We can do this if all of the participants are already on ente.

> allowing one viewing via link and future viewings require account

We are hoping to come up with an implementation similar to this where in a link to an album can be shared with N devices. We will persist an accessToken on the viewer's localstorage so that they can re-view the album multiple times without having to sign up.

sam0x17 · on Aug 29, 2021

It's funny, I see this being the first feature they kill off unfortunately when it becomes the new super easy way of sharing CSAM on shady forums.

explodingcamera · on Aug 29, 2021

This is looks super cool, however not something I'd be interested in using myself if I can't selfhost it (at least it looks like thats not possible from the website).

YPPH · on Aug 29, 2021

Self-hosting a zero knowledge service is probably unnecessary.

If you're hosting the service, there's no need for data to be encrypted client-side. Unless, of course, you were intending on running the service on a public cloud which you didn't control, but that's something I don't think many privacy conscious folk would do.

There's plenty of open source, self-hosted alternatives to Google Photos.

sam0x17 · on Aug 29, 2021

Yeah, having attempted to operate a service very similar to this (only more focused on general encrypted cloud storage) I will say there are no good economics in usage-based billing. You're much better off selling a license to use the software and give users the ability to use common cloud storage providers (minimally the s3-compatible ones but also things like Google Drive) as the backing for this. Even safer from a legal perspective would be not having accounts at all and allowing users to purchase a 1-year license based on license keys that are cryptographically validated but not stored anywhere. Then it's impossible to do anything user specific whether you are compelled to or not.

cdata · on Aug 29, 2021

To me it is a canary signal that I have the option to self-host.

Most likely, QoS would be better from ente's hosting and I would be inclined to take advantage of that. An open source server can be audited and offer an off-ramp should their service no longer suit me.

Then again, the economics of enabling self-hosted infrastructure are probably less exciting compared to locking users in to marked-up, white-labeled infrastructure.

user-the-name · on Aug 29, 2021

How do you know it's zero knowledge?

YPPH · on Aug 29, 2021

The source code of the client-side apps appears to be available on GitHub. So if they're bluffing, it won't be too long until someone calls them out on it.

TheRealPomax · on Aug 29, 2021

Without a fully described mechanism to confirm that the client you download is not compiled with additional code (i.e. without specifying exactly how the client is compiled, using which version of which compiler, and which compile flags, dependency versions, etc) any kind of "the code seems to be on github" is kind of meaningless.

dane-pgp · on Aug 29, 2021

Ideally they should support reproducible builds so that anyone can confirm that the hash of the app corresponds to a specific tag on the source repository. Unfortunately app stores are making it harder to know what the hash of the app you are installing is, but for side-loading this should still be possible.

For web apps, the situation is even more difficult, but there is a technique called Secure Bookmarks which allows you to confirm that a specific bundle of JavaScript is running (at the expense of some usability):

https://coins.github.io/secure-bookmark/

ignoramous · on Aug 29, 2021

F-Droid supports reproducible builds. Any serious FOSS app, I think, must priortise publishing to F-Droid.

user-the-name · on Aug 29, 2021

Unless they only send compromised code to you personally and nobody else.

dane-pgp · on Aug 29, 2021

One way to mitigate that is through Binary Transparency, which would allow people to detect if a release is made for which there is no source code available (assuming the project already has reproducible builds). There is already a project attempting this for Arch Linux packages[0].

Of course it's still possible that an update could be sent to everyone which contains some code that only runs when a certain username is entered, so users would need to avoid updating the app until an audit by a trusted third party had approved it.

[0] https://github.com/kpcyrd/pacman-bintrans

mynameismon · on Aug 29, 2021

https://ente.io/transparency/

user-the-name · on Aug 29, 2021

That's just a non-binding promise. If that's enough for you, you don't need encryption at all.

commoner · on Aug 29, 2021

I think the correct link is: https://ente.io/architecture

user-the-name · on Aug 29, 2021

Again, just a promise.

lol1lol · on Aug 29, 2021

self hosting is not worth the time and effort.

andrewmunsell · on Aug 29, 2021

That is not categorically true.

On the business side, there's plenty of companies that have offered and succeeded with self-hosted software. On the client side, there's many individuals like myself willing to dedicate time, money, and effort to self-host services. I spent quite a bit of time setting up my NAS with self-hosted services, not only because the number of photos and media I store would be prohibitively expensive to host elsewhere (I do photography and videography as a hobby, 120 fps 10 bit footage adds up), but because I enjoy the hobby.

lol1lol · on Aug 29, 2021

we have so many consumer facing apps. you'd want to maintain all those and actually have a life to use those? good luck!

necovek · on Aug 31, 2021

Not everybody has to use "many" apps. You can only self-host those you care about.

novok · on Aug 30, 2021

Another thing to keep in mind with this kind of software is tracking data loss, corruption and deletion. I've used photo management services before, and have had data loss that I can't explain from this year or that year. Did I delete it? Did I do a migration wrong? Did the software silently delete it? I'm not quite sure. What is even worse is you cannot get 'another copy' of these photos from elsewhere, because they're all unique.

Having a 'recycle bin' and an ability to see the history of photo deletion, modifications and imports can be useful in tracking down what causes data loss. Also having masters accessible in a simple plain directory is essential in being able to audit that the software is working correctly, can be backed up in a simple manner and if your service goes belly up, is easy to migrate from.

Another issue is bitrot. Your desktop can bitrot modify a photo, and then your photo management software detects this as the 'new version' and destroys the original good version. You have to make sure you mitigate this by storing a hash on import and restoring to the original hashed version.

vishnumohandas · on Aug 30, 2021

Sharing some of the steps we've taken at ente to reduce the probability of such events:

- All files uploaded to ente are versioned and older versions are available for 60 days from the day you updated them.

- File deletions are performed only as a function of user action. Deleted files are again recoverable for 60 days.

- Two copies of each file are maintained with separate storage providers. Both of these providers offer 11x9 durability.

- For each uploaded file, we compare the number of bytes uploaded from the client to that received on the server and request a reupload in case there is a mismatch (to be replaced with a hash check).

We understand your concerns and will continue to invest in steps that improve data integrity and durability.

joshmn · on Aug 29, 2021

Super cool. Did you roll your own storage solution or are you using one of the many cloud providers? If the latter, which one? I ask because I've done a ton of work in optimizing costs in this area (at large scales), and as the top comment mentioned, $15 is kind of steep for 1TB.

vishnumohandas · on Aug 29, 2021

Hey, we're currently using two S3 compliant storage providers (Backblaze and Scaleway). I would love to talk more about how we could reduce our pricing. Please let me know if I can reach out to you over the email mentioned on your HN profile. Thanks!

joshmn · on Aug 29, 2021

More than welcome to!

dddw · on Aug 30, 2021

Oh please do share some nice tips in this regard

chmod775 · on Aug 29, 2021

Very reasonable pricing, though you could advertise the free 'trial' tier a bit more prominently. I thought the service was paid only until I re-checked the pricing page and read the tiny gray on black text before writing this comment.

You also didn't set a single tracking cookie. Nice.

vishnumohandas · on Aug 29, 2021

I'll increase the opacity of that line, thanks for the feedback!

benbristow · on Aug 29, 2021

Your homepage says "protect your photos/faces etc. from algorithms"

The algorithms are what makes Google Photos; Google Photos. If I wanted to just store my photos I'd throw them in a S3 bucket or Dropbox or something.

Google Photos lets me automatically categorise my photos by person, lets me search my library using text search for anything (e.g. I can search 'museum' and see pictures I've taken in museums). That is where the real value of Google Photos comes into play.

> But we are far from where we want to be in terms of features (object and face detection, location clustering, image filters, ...) and user experience. We are hoping to use this post as an opportunity to collect feedback from fellow hackers.

So you're going to implement algorithms then?

vishnumohandas · on Aug 29, 2021

> So you're going to implement algorithms then?

Yes, we will implement the algorithms, purely on the client side, such that we don't hold indexes to your personal data.

But I understand how that piece of text could have thrown you off, I'll think of ways to rephrase it. Thanks for pointing it out.

godelski · on Aug 29, 2021

Actually I'm really curious how you do this. If the photos aren't stored client side, then how do you search? Do you have a thumbnail of every photo client side? Is that enough? I mean ImageNet scores are still pretty low for small/fast neural nets. And ImageNet isn't even representative of real world photos. So obviously to be successful you're going to have to continue training. So how do you do this in a privacy preserving way? Even federated learning can have some issues because images can be reconstructed from gradients.

vishnumohandas · on Aug 29, 2021

> Do you have a thumbnail of every photo client side

In the happy path the files/thumbnails are indexed before they are uploaded. But we are designing a framework that will pull files/thumbnails for indexing if they are unindexed or indexed by older models.

> how do you do this in a privacy preserving way

Our accuracy will not match that offered by services who index your data on their servers. But there's a trade off between user experience and privacy here, and we are hopeful that ente will be a viable option for an audience who is willing to sacrifice a bit of one for a lot of the other.

rock_hard · on Aug 29, 2021

As someone who has worked on systems like these let me translate:

“You stuff will be private but in return accuracy will be so bad that the UX is gonna suck!”

That’s the key piece people miss when they wanna do anything with ML…that’s it’s a different problem compared to writing code because it’s not about the code anymore, it’s about having great training data!

vishnumohandas · on Aug 29, 2021

Apple Photos seems to be using just Core ML[1] for on-device recognition and it does a pretty good job. As for Android, we plan to use tflite, but the accuracy is yet to be measured. And if customers do install our desktop app, we will be able to improve the indexes by re-indexing data with the extra bit of compute available.

We don't feel that the entire UX of a photo storage app will "suck" because of a reduced accuracy in search results, and we think that for some of us the reduced accuracy might not be a deal breaker.

[1]: https://developer.apple.com/documentation/coreml

divbzero · on Aug 29, 2021

Up until recently I’ve used Apple Photos happily since it provided a good combination of convenience plus the privacy of on-device recognition. You have a compelling product if you can convince customers you are as reliable and more trustworthy than Apple. You do face the disadvantage of not being the default option for iOS/macOS but that should be balanced by being available cross-platform in Android, Linux, Windows.

creato · on Aug 29, 2021

Core ML and TFlite are just tools for running ML models. Generating the models is the hard part, and that is what encryption will make more difficult.

vishnumohandas · on Aug 30, 2021

We will resort to models that are available in the public domain.

rock_hard · on Aug 30, 2021

Bingo!

godelski · on Aug 29, 2021

To be honest, that wasn't a concern with my question. I think most people on HN understand this aspect. My question was more about how you improve your models when you don't have the same feedback mechanisms as non-privacy preserving apps. Google can look at your photos and see what photos fail and collect the biased statistics. In a privacy preserving version you won't be able to do this. Sure, you can on an internal dataset, but then there are lots of questions about that dataset's bias and if it is representative of the real world. I mean how many people think ImageNet is representative of real world images? A surprising number.

Nalta · on Aug 29, 2021

As someone else who works on systems like these, I agree training data is the whole problem. However you can use some techniques like homomorphic encryption and gradient pooling to collect training data from client code while remaining end-to-end encryption. It's hard, but it's not impossible.

finnh · on Aug 29, 2021

Really? Have we had a revolution in homomorphic encryption such that it can be used for anything other than 1-million-times-slower proofs-of-concept?

I know IBM has released something lately, but given the source..

Does anyone use HE for the type of ML application you are describing?

godelski · on Aug 29, 2021

So I guess there is more to the question that I'm asking.

> Our accuracy will not match that offered by services who index your data on their servers. But there's a trade off between user experience and privacy here,

I think most people here understand that[0]. We are on Hacker News after all and not Reddit or a more general public place. The concern isn't that you are worse. The concern is that your product has to advance and get better over time. That mechanism is unclear and potentially concerning. The answer to this is the answer to how you ensure continued privacy.

You talk about the "push files/thumbnails for indexing" and this is what is most concerning to me and at the heart of my original question. How are you collecting those photos for _your_ training set? Obviously this isn't just ImageNet (dear god I hope not). Are you creating your own JFT-300M? Where are those photos being sourced from? What's the bias in that dataset? Obviously there are questions about the model too (CNNs and Transformers have different types of biases and see images differently). But that's a bigger question of training methods and that gets complicated and nuanced fast. Obviously we know there is going to be some distillation going on.

There's a lot of concerns here and questions that won't really get asked of people that aren't pushing privacy based apps. But the biggest question is how you get feedback into your model and improve it. Non-privacy preserving apps are easier in this respect because you know what (real world) examples you're failing on. But privacy preserving methods don't have this feedback mechanism. We know homomorphic encryption isn't there yet and we know there are concerns with federated learning (images can be recreated from gradients). So the question is: how are you going to improve your model in a privacy preserving method?

[0] I think people also understand that on device NNs are going to be worse than server side NNs since there's a huge difference in the number of parameters and throughput between these and phone hardware can only do so much.

vishnumohandas · on Aug 29, 2021

> how are you going to improve your model in a privacy preserving method

We will not improve our models with the help of user-data and will resort to only pre-trained models that are available in the public domain.

istingray · on Aug 30, 2021

This is one of your best replies in the whole thread.

Yes to this. Prove it as well.

godelski · on Aug 30, 2021

Why is it such a great reply? They didn't really answer my question.

istingray · on Aug 30, 2021

I liked the clarity of response. Public models, not user data seems a clear answer to your question?

godelski · on Aug 30, 2021

Not really. In fact it might suggest something I'm specifically more worried about. Datasets that we use in research aren't really appropriate in production. They have a lot of biases that we don't exactly care about in research but you do in production that can also get you into a lot of political and cultural trouble. So really if they are going to just use public datasets and not create their own then I expect a substantially low performance, potential trouble ahead, and I'm concerned about who is running their machine learning operations.

istingray · on Aug 30, 2021

Appreciate the detail here. Given your relevant experience sounds like something that the devs need to address.

godelski · on Aug 30, 2021

Being in the ML community I have a lot of criticisms of it. There are far too many people, especially in production, that think "just throw a deep neural net at it and it'll work." There is far more to it than that. We see a lot of it[0]

[0] https://news.ycombinator.com/item?id=28252634

istingray · on Aug 30, 2021

Wow fascinating. What do you ideally want to see in terms of datasets enabled by user data?

Having vendors vacuum up my data is sub-optimal from a privacy/ownership standpoint. I'm curious how to enable models without giving away my data. Open source models owned by society? Numerai style training (that I don't understand) https://numer.ai/ ?

godelski · on Aug 30, 2021

Datasets are actually pretty hard to create. You can see several papers specifically studying ImageNet[0] including some on fairness and how labels matter. There's also Google's famous private JFT-300M dataset[1]. JFT was specifically made with heavy tails in the distribution to better help study these areas, which is specifically the problem we're interested with here and one that is not solved in ML. Even with more uniform datasets like CIFAR there are still many features that are noisy in the latent space. This is often one of the issues with doing facial recognition and why there's issues with people with darker skin. Even if you have the same number of dark skinned people as light skinned you may be ignoring the fact that cameras often do not have high dynamic ranges and so albedo and that dynamic range play a bigger role that simply "1M white people and 1M black people". There's tons of effects like this that add up quickly (this is just an easy to understand example and one that's more near the public discourse). You can think back to how Google's image search at one point showed black people if you searched gorilla. On one hand you can think "oh got a dark color humanoid" or you can think "oh no... dear god...". That's not a mistake you want to make, even if we understand why the model made it. It is also hard to find these mistakes, especially because the specifics of them aren't shared universally across cultures because this mistake has to do with historical context.

This is still an unsolved problem in ML. Not only do we have dataset biases (as discussed above) but models can also exaggerate these biases. So even if you get a perfectly distributed dataset your model can still introduce problems.

But in either case, we don't have the same concerns in research as we have in production. While there are people researching these topics most of us are still trying to just get good at dealing with large data (and tails) in the first place. Right now the popular paradigm is "throw more data at the model." There are nuances and opinions to this why this may not be the best strategy and why we should be focusing on other aspects (opinions being key here).

Either way, "using publicly available datasets" is an answer that suggests 1) they might not understand these issues and 2) the model is going to have a ton of bias because they're just using off the shelf models. I want some confidence that these people actually understand ML instead of throwing a neural net at the problem and hitting go.

> I'm curious how to enable models without giving away my data.

Our best guess right now is homomorphic encryption. But right now this is really slow and not as accurate. There's federated learning but this has issues too. Remember, we can often reconstruct images from the dataset if we have the trained model[2]. You'll see in this reference that while the reconstructions aren't perfect, they are more than satisfactory. So right now we should probably rule out federated learning.

> Open source models owned by society?

Actually models aren't the big issue. Google and Facebook have no problem sharing their models because that isn't their secret sauce. The secret sauce is the data (like Google's proprietary JFT-300M) and the training methods (though most of the training methods are public as well as few are able to actually reproduce due to not having millions of dollars in compute).

I hope this accurately answers your questions and further expands on the reasoning behind my concerns (and specifically why I don't think the responses to me are sufficient).

[0] https://image-net.org/about.php

[1] https://arxiv.org/abs/1707.02968 (personally it bugs me that this dataset is proprietary and used in their research. Considering how datasets can allow for gaming the system I think this is harmful to the research space. We shouldn't have to just trust them. I don't think Google is being nefarious, but that's 300M images and mistakes are pretty easy to make).

[2] https://arxiv.org/abs/2003.14053

istingray · on Aug 31, 2021

godelski, I really appreciate such a thoughtful response to my curiosity.

Looking at this while better understanding the problem, I wonder what features I really want for my own photo library. Thinking of iOS photos. Matching people together seems hard. But grouping photos by GPS location or date is trivial. So we have to get clear on what features are important for home photo libraries.

I can now see how the idea of "use public libraries = solution" falls short. It neither presents a viable solution or demonstrates rigorous understanding.

godelski · on Aug 31, 2021

Hey, that's what HN is about. You got experts in very specific niches and we should be able to talk to each other in detail, right? That's the advantage of this place as opposed to somewhere like Reddit. Though expanding size we face similar issues.

These are good points about GPS and other metadata. I didn't really think about that when thinking about this problem, but every album I create is pretty much a combination of GPS and temporally based (though I create this with friends). But I think you're right in suggesting that there are likely _simple_ ways to group some things that aren't currently being done.

> I can now see how the idea of "use public libraries = solution" falls short. It neither presents a viable solution or demonstrates rigorous understanding.

ML is hard. But everyone sells it as easy. But then again, if it was easy why would Google and Facebook pay such a high rate for researchers? There's a lot of people in this space and so it is noisy. But I think if you have a pretty strong math background you start to be able to pick out the signal from the noise better and see that there is a lot more to the research than getting SOTA results on benchmark datasets.

natch · on Aug 29, 2021

You can run algorithms locally and still violate privacy by uploading private facts derived from the data with algorithms. Saying you won’t hold “indexes” doesn’t begin to cover it.

Moodles · on Aug 29, 2021

Well, it does begin to cover it. Do you have to be so strident?

natch · on Aug 30, 2021

What do you think is meant by indexes?

cientifico · on Aug 30, 2021

But that will mean that for every version of the algorithms, it have to read all the photos since 15 years ago... my phone battery will die soon.

And if I need to have other kind of client... like a nas to do that... Why I need the cloud?

vishnumohandas · on Aug 30, 2021

> phone battery will die soon

Indexing will be opt-in. You will be able to run the indexing only on your desktop client for instance.

> Why I need the cloud?

So that you don't have to manage your own storage infrastructure? But if you would like to do that, then there are self-hosted alternatives that will better serve your use case.

istingray · on Aug 30, 2021

Agree with the above poster. I don't care about algorithms. I want algorithms. But I want algorithms that only work for me. Screw off everyone else.

Apple used to sell this. Then they stopped.

Aspos · on Aug 29, 2021

Those "algorithms" can run locally, on a NAS or a desktop, generate the metadata and make it available to you only on your mobile.

I can see myself paying for such software if it was mature enough.

izacus · on Aug 29, 2021

Synology Photos is one such solution already for example.

Aspos · on Aug 29, 2021

I have Synology, actually. Is Synology Photos trustable?

chias · on Aug 29, 2021

The software with these features is called Synology Moments. I use it and I mostly love it, at the very least as a backup for my Google Photos.

My experience is that it works great, provided that you're on your local network. When away from home or traveling, less so. Maybe I could configure things better to alleviate that, I don't know, but I haven't managed to yet.

Sharing is less convenient. Trying to share a photo on-platform is a terrible experience for the receiver with multiple slow redirects, so much so that generally if you're on mobile it's easier/better to just download the photo to your device and share the photo directly. The Moments android app has a flow for doing this, which is nice. It also makes a certain amount of sense: the alternative would be others connecting to your NAS online, which is always going to be less nice than just connecting to Google photos.

The search capabilities are pretty decent. It can recognize people and tag them appropriately. It can recognize some things. In some ways, I prefer searching it over searching Google Photos. But again, only if you're on your local network with your NAS.

--

Edit: see aborsy's response to me below. Looks like I'm a version behind. Maybe on-platform photo sharing is better now, I'll update the software and check it out

chias · on Aug 30, 2021

Yeah, new version is about the same.

If you want to check it out, here's a couple photos from when I picked some peppers the other day:

https://ojensen5115.quickconnect.to/mo/sharing/pgdYsVEqu

aborsy · on Aug 29, 2021

In DSM 7, it’s called Synology Photos!

chias · on Aug 29, 2021

Thanks for the heads up! Looks like I have an update to install :)

pininja · on Aug 29, 2021

For at-home NAS, is Synology the best for recreating Google services?