The unattributable “db8151dd” data breach

throwaway9993 · on May 15, 2020

Dataset for sale: [redacted]

Similar data structure: https://stackblitz.com/edit/angular-soswe4?file=src%2Fapp%2F...

Covve: This simple yet state-of-the-art app will revolutionise your business relations like you've never seen.

Edit: Response: https://twitter.com/covve/status/1261287954967941120

amatecha · on May 16, 2020

haha, I found exactly the same! https://twitter.com/amatecha/status/1261231178423517184

A user who replied to me also shared some anecdotes that indicate further evidence towards that being the source (a private email address only used for GSuite admin purposes, on her iOS device, upon which she had Covve installed) -- thread here https://twitter.com/angelalgibson/status/1261314415829237761

amatecha · on May 17, 2020

Covve has actually made a post and confirmed it was indeed their server that was breached: https://covve.com/opinion/security-incident/

Nextgrid · on May 15, 2020

The metadata in the breached records like "Imported from EverContacts" or similar supports the theory that it comes from a contacts app.

dmix · on May 15, 2020

Curious why it has people's GitHub and Pinterest accounts when it's contact data.

Looks like it was mined from somewhere and combined with other data...

Unless people are putting their github urls in the contact apps?

stedaniels · on May 16, 2020

A lot of CRMs are enriched with social media accounts and their web of connections.

iamacyborg · on May 16, 2020

Stuff like Clearbit can "enrich" a profile with social accounts like Facebook, Twitter, LinkedIn and Github.

https://clearbit.com/attributes

That entire market needs to be killed off with hefty fines.

Redoubts · on May 15, 2020

Oh man, what is even going on with that raid forum.

Nextgrid · on May 15, 2020

A quick glance suggests there's barely any skill in there and it's all bottom-feeders so you'd expect this to be an easy bust for law enforcement worldwide and yet they seem to be happily operating with total impunity for quite some time.

blaser-waffle · on May 15, 2020

Noobs and relatively little skill makes me think H O N E Y P O T

tialaramex · on May 15, 2020

No, more likely it's like street corner drug dealing, or say, the industrial area near me that has street walkers (well I presume it doesn't now because neither street walkers nor their johns want to die of COVID-19)

This stuff happens, at a low level, and prosecuting it is expensive and makes little real difference so why bother?

It's not even like busting shop lifters and petty burglars where at least you make the victim feel better by arresting somebody even if it likely isn't cost effective overall.

alexpotato · on May 15, 2020

I remember one of the online British banks writing up a whole detailed post on how they knew exactly who has been stealing money from them.

They wrote up all of their information and sent it to the police who came back with: "Yeah, thanks. Here's the thing: this is non-violent crime and the total amount stolen is less than GBP 100,000" (don't remember the exact number but something thereabouts).

People like to think that the police are salivating for every crime that could come through the door. More realistically, they are an overworked group with fewer resources than they need to tackle and/or solve many of the cases they are presented with.

Plus, just like you, they have to prioritize their work based on various dimensions of incentives such as what looks good to their boss, what is hard vs easy etc. For example, do you tackle the case that is small and easy to close with not a lot of publicity or the big case that may be harder to close but will net big wins in the PR budget?

Scoundreller · on May 15, 2020

Usually in that situation you could sue the person in court.

Easier to prove too.

ta17711771 · on May 16, 2020

People are still going to Starbucks, mate; you think quite highly of the intelligence of those picking up streetwalkers.

bryant · on May 15, 2020

The responses to the comment just below you (https://news.ycombinator.com/item?id=23190102) (and the nature of some of the corporate hits I've seen) seem to be consistent with a contacts database of sorts.

Not sure I'd go so far as to accuse a specific company on a public forum. But in this regard, the idea that a contact management app could be behind this DB is plausible.

bryant · on May 15, 2020

Adding: this dump appears to be from a source with data at least as recent as April 2019 based on a dataset I'm working with.

mattlondon · on May 15, 2020

Forked the stackblitz for posterity https://stackblitz.com/edit/angular-3nxvlm?file=src/app/app....

viro · on May 15, 2020

thank you

alexproto · on May 15, 2020

Hi all, Alex here, CTO at Covve. Just got alerted to incident db8151dd. We’re investigating as top priority with our security experts what relation this may have with Covve. We are monitoring the feedback in this blog and would really appreciate any additional information you may have on this as we investigate (alex@covve.com).

service_bus · on May 15, 2020

It appears your organization left an elasticsearch database exposed to the internet. This happens frequently due to poor configuration.

You're either going to have logs pointing to an IP that the individual used to siphon your data, or nothing.

With an exposed elasticsearch database, you possibly had the data being siphoned by many parties, and are only aware now because of this particular incident.

If you have any operations regarding customers in Europe, you need to notify your relevant Data Protection Authority

https://edpb.europa.eu/about-edpb/board/members_en

You should also sign your engineers up for this course:

https://www.elastic.co/training/specializations/elastic-stac...

outworlder · on May 15, 2020

> It appears your organization left an elasticsearch database exposed to the internet. This happens frequently due to poor configuration.

sigh

Why is everything being deployed publicly accessible? If one is relying on their database configuration as their only protection, they are one fuckup away from disaster.

Layers, people, layers. If this is on a cloud provider, put it on a private VPC/subnet. Add a load balancer or similar serving traffic only to the instances you need traffic routed to (which are unlikely to be databases themselves, more likely web servers). Configure firewalls accordingly. And of course, configure the servers properly.

the_mitsuhiko · on May 15, 2020

> If you have any operations regarding customers in Europe, you need to notify your relevant Data Protection Authority

The entire company is in the EU. They need to reach out to their DPA ASAP.

michaelcampbell · on May 15, 2020

As of this writing, I don't think it's been determined yet whose organization this data came from, has it? All we have so far is a similarity in data format/structure.

polote · on May 15, 2020

Almost all their employees have their emails in the breach :

https://covve.com/about

email format is <first_character_firstname>.<lastname>@covve.com

mdip · on May 15, 2020

Interesting; based on what I'm seeing, it certainly looks like a matching structure, and it's got enough uncommon fields in it to suggest that it's likely to be related to Covve software. There is a link in the comments to the source in question, but I don't know enough about Covve's product: can someone run this on-prem or in their own defined infrastructure (is all of it on GitHub?), or is this a case where the data/server is proprietary to Covve, making it unlikely that someone created a compatible server with a similar structure?

Kudos for reaching out to the greater HN community as a channel for information. A lot of companies are concerned that such a public request gives the impression that they don't have a handle on things. Let's be honest: there isn't a company on the planet that, immediately following a breach, has a handle on things. Honesty is a pre-requisite to re-establishing trust in the (seemingly likely) event that this is a breach of your customers' data.

I don't envy the position you're in. By now, you've hopefully downloaded the link to the data dump[0] and have compared it against your own data to confirm that it is or is not a breach of your own operations. Please put out a communication as soon as possible if you confirm it's your data. Do it immediately after closing off access to the data (and I'd consider taking the whole thing offline[1]), before you take the additional steps to protect your environment from breach.

The next step is to lock it all down, everywhere. Rank the risks associated with your data; bubble that up to the components that touch it. Encrypt data and protect your private keys (HSM/virtual HSM), to the extent possible, segregate your data by risk, assign separate accounts to different risk categories and ensure lower risk accounts lack permissions to the data and cannot acquire the key to decrypt. Your "Staging", "Development" and "Test" databases ... any chance they have a snapshot of production from some point in the past[2]? Reduce the public exposure of your infrastructure -- create multiple private networks; ensure data can be accessed only by the thing or things that need to access it on the permissions and network layer. Depending on how you're set up, isolate management interfaces to a private network requiring separate authentication in addition to device authentication. Grant permissions to staff on a "minimum required to work" policy. For staff that require day-to-day permissions to high-risk assets, minimally get them a separate (individually assigned) administrative account to avoid accidental changes. But generally stick with "this person, and this alternate (bus factor), only, can alter permissions related to accounts used in production infrastructure"; ideally, requiring both for permissions changes would be awesome, but I'm not aware of broad adoption anywhere.

Audit roles assigned to everything. If this is AWS, you're going to be spending some time in IAM-related tools. Look at every account, every permission and everything it's assigned to and challenge it: does it need this much access? Can I make the access more specific (device narrowed)? Can I assign less access and achieve the same result? Can I separate out these two services with different risk profiles so permissions can be assigned more carefully?
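
One way to start the audit described above is to scan policy documents for wildcard grants. This is only a rough sketch: it assumes policies in the usual AWS IAM JSON shape (`Statement`, `Effect`, `Action`, `Resource`), and the sample policy and the function name `find_wildcard_statements` are made up for illustration.

```python
import json


def _as_list(value):
    """IAM allows a single string or a list for Action/Resource; normalize to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def find_wildcard_statements(policy_json: str) -> list:
    """Return Allow statements that grant '*' / 'service:*' actions or a bare '*' resource."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as one object
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        broad_resource = "*" in resources
        if broad_action or broad_resource:
            flagged.append(stmt)
    return flagged


# Hypothetical policy, for illustration only.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "es:*", "Resource": "*"},
        {"Sid": "Narrow", "Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-bucket/reports/*"},
    ],
})

for stmt in find_wildcard_statements(SAMPLE_POLICY):
    print("review:", stmt["Sid"])
```

A flagged statement is not automatically wrong; it is just a place to ask the "does it need this much access?" question first.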

All the best -- not a fun situation to be in.

[0] Someone posted one in the comments; might be gone, dig around the usual places and find a link from a "direct download" site if it's been taken down. (aim for mega.co.nz links; less costly, or google awesome-piracy for workarounds).

[1] I co-authored large parts of the internal security policy at Global Crossing (carried forward to Level 3) about a decade ago - we had a "Critical" category -- when triggered, a situation call started and didn't end until the issue was stable and root causes/solutions were identified. It also meant "if a device was categorized as being able to be infected (we were often dealing with aggressive malware), it was allowed to be taken offline regardless of impacts to the business" - i.e. the cost of failing to contain this is higher than the cost of turning off customers' service. We threw the switch a handful of times. It was hell.

[2] I used to lose it when I saw people doing this with live customer data... except that I've encountered it on 80% of projects I've worked, so I'm numb to it. You can roll fake data pretty easily with various different tools (online and CLI); nobody protects staging/test/dev like they protect prod and since you've determined you must protect this data in production, you don't want to have to protect dev/staging/test that same way.
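
To illustrate the "roll fake data" point in [2], here is a minimal sketch using only the Python standard library. The field names and name lists are assumptions chosen to look like a contacts record; they are not taken from any real schema.

```python
import random

# Small made-up name pools; a real generator would use much larger ones.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Smith", "Garcia", "Chen", "Okafor", "Novak"]


def fake_contact(rng: random.Random) -> dict:
    """Build one synthetic contact; nothing here is derived from customer data."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        # example.com is reserved for documentation, so these addresses reach nobody.
        "email": f"{first[0].lower()}.{last.lower()}@example.com",
        # 555-01xx numbers are set aside for fictional use in North America.
        "phone": f"+1-202-555-01{rng.randint(0, 99):02d}",
    }


def fake_contacts(count: int, seed: int = 0) -> list:
    """Return `count` synthetic contacts; the same seed gives the same data."""
    rng = random.Random(seed)
    return [fake_contact(rng) for _ in range(count)]
```

Seeding the generator keeps test fixtures reproducible, so staging and dev never need a production snapshot just to get stable data.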

bob33212 · on May 15, 2020

Did you have GuardDuty or VPC flow logs turned on?

xenophonf · on May 15, 2020

Troy's fighting the good fight, but it's so freaking depressing. If he has hundreds of millions of records worth of personal data from just the breaches that have been shared with him, what _else_ is out there in the hands of criminals and corporations, neither of which have the public interest at heart—only naked self interest in exploiting members of the public for as much money as they can get?

tialaramex · on May 15, 2020

Millions per day. This used to be part of one of my old jobs. A feed of stolen PII would drop into our SFTP server every morning and we'd process it.

There's no honour among thieves so there were a bunch of duplicates pretending to be "new" data, but yes there is a cottage industry of stealing smaller quantities of PII, focused particularly on email addresses and passwords (because those get re-used elsewhere) and credit card data (because you may be able to either buy something with it or at least fool your way past an immediate check on the card)

Do not re-use passwords. Like, that's the really easy "Wash your fucking hands" level lesson here. As someone who isn't employed to work with this data any more I'd say that 99% of the value isn't with like stolen passports (though we did see some passport data) or even credit cards, but the passwords.

If you hate that this is even a problem adopt and (if you write code or specify software) implement WebAuthn. Nobody would steal passwords if they didn't work. Not only does stealing WebAuthn credentials from a site's database not work (they're public, the secret that's valuable never leaves the user's FIDO dongle) crooks also wouldn't bother doing it, just like crooks don't steal farm machinery to pull candy vending machines off the wall and steal candy, whereas they do attack ATMs in exactly this way.

heavenlyblue · on May 15, 2020

One of the cool things of having a password manager is that a password manager can’t auto-complete the form for websites not sharing the domain with the old one.

If you don’t know the password yourself, then phishing is less effective as it’s quite rare that your password manager forgets that it needs to fill out the form for you.
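
A rough sketch of the check described above, assuming the simplest possible rule (exact scheme and hostname match). Real password managers typically have more nuanced matching, and the function name `should_autofill` is hypothetical.

```python
from urllib.parse import urlsplit


def should_autofill(saved_url: str, current_url: str) -> bool:
    """Offer a saved credential only on HTTPS pages whose hostname matches exactly."""
    saved = urlsplit(saved_url)
    current = urlsplit(current_url)
    return (
        current.scheme == "https"
        and saved.scheme == current.scheme
        and saved.hostname is not None
        and saved.hostname == current.hostname
    )


# A look-alike host does not match, so nothing is filled in.
print(should_autofill("https://accounts.example.com/login",
                      "https://accounts.example.com.evil.test/login"))
```

The human eye may accept the look-alike hostname; the string comparison does not.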

tialaramex · on May 16, 2020

> ... then phishing is less effective as it’s quite rare

In practice users who're successfully being phished curse the password manager and override it. Not always but often enough.

WebAuthn bakes the site-specificity into the protocol thus preventing you from shooting yourself in the foot, even if you're convinced that's what you need to do.
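
As a sketch of what "bakes the site-specificity into the protocol" looks like on the server side: the browser, not the user, writes the page's origin into the client data that gets signed, and the site rejects any assertion whose origin is not its own. The code below covers only that one check and deliberately leaves out signature verification and everything else a real relying party must do; the function names are hypothetical.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding used for the challenge field."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def check_client_data(client_data_json: bytes, expected_challenge: bytes,
                      expected_origin: str) -> bool:
    """Accept a login assertion only if type, challenge and origin all match."""
    data = json.loads(client_data_json)
    return (
        data.get("type") == "webauthn.get"
        and data.get("challenge") == b64url(expected_challenge)
        and data.get("origin") == expected_origin
    )


challenge = b"server-generated-random-bytes"
# What a browser on a phishing page would produce: same challenge, wrong origin.
phished = json.dumps({
    "type": "webauthn.get",
    "challenge": b64url(challenge),
    "origin": "https://examp1e.test",
}).encode()
print(check_client_data(phished, challenge, "https://example.com"))
```

The user never gets a chance to "override" this, because the origin is filled in by the browser rather than typed or pasted.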

cantrevealname · on May 15, 2020

> what _else_ is out there in the hands of criminals and corporations

Don't forget governments. Whatever criminals and corporations have that they shouldn't have, governments probably have an order of magnitude more.

Nextgrid · on May 15, 2020

For the people that use unique per-merchant e-mail addresses (like someone+amazon@...), could you try some of those aliases on HaveIBeenPwned and see which ones come up in this breach? That might shed some light onto its origin.

deng · on May 15, 2020

BTW, since many people don't seem to be aware of this: If you have your own domain, you can get informed by haveibeenpwned automatically if any mail address from that domain is in a breach. All that is required is that you're reachable on that domain through an address like 'postmaster'. This feature can be found under 'domain search'. Since I use a new address for pretty much anything this is very handy.

mysterypie · on May 15, 2020

I have a large list of unique emails to test, but they are not from a domain I control. It seems that I can test these through the API, but is there any simpler way? I tried obvious things like putting a list of comma-separated email addresses in the search form, but it doesn't work.
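
For the API route, a minimal sketch might look like the following. It assumes the v3 `breachedaccount` endpoint with an `hibp-api-key` header and a `user-agent` header, which is my understanding of the HIBP API; the delay value and the user-agent string are placeholders, so check the current API documentation for the real rate limits before running it against a long list.

```python
import time
import urllib.error
import urllib.parse
import urllib.request

API_ROOT = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_request(email: str, api_key: str) -> urllib.request.Request:
    """Build the lookup request; '+' and '@' must be percent-encoded in the path."""
    url = API_ROOT + urllib.parse.quote(email.strip(), safe="")
    return urllib.request.Request(url, headers={
        "hibp-api-key": api_key,
        "user-agent": "alias-checker-example",  # placeholder name
    })


def breached(email: str, api_key: str) -> bool:
    """True if the API returns breaches for this address, False on a 404."""
    try:
        with urllib.request.urlopen(build_request(email, api_key), timeout=30) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # "not found in any breach"
            return False
        raise  # 401, 429 and other errors are left to the caller


def check_all(emails, api_key: str, delay_seconds: float = 2.0) -> dict:
    """Check each address in turn, pausing between calls (assumed delay)."""
    results = {}
    for email in emails:
        results[email] = breached(email, api_key)
        time.sleep(delay_seconds)
    return results
```

Note the caveat raised elsewhere in this thread: once an alias has been submitted to a lookup service, it is no longer known only to the merchant.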

numpad0 · on May 15, 2020

Kinda lets adversaries figure which account used which password from which breach and until which point

koheripbal · on May 15, 2020

Unfortunately, it no longer seems to list which email addresses in those domains have been compromised, so it's not too useful.

Jestar342 · on May 15, 2020

I've found it does list them if you request the full report, but that the initial email doesn't. (note the last time I used this functionality was about 3 weeks ago, I accept it may have changed since then)

rpadovani · on May 15, 2020

Wow, thanks, I've never used the notification alert service since I use a custom email for every site I sign up.

That's cool, thanks!

ohlookabird · on May 15, 2020

Wow, thanks so much, that's really helpful!

pricechild · on May 15, 2020

That's a brilliant tip, thank you!

huhtenberg · on May 15, 2020

I am listed, but it's an address that was never used to register or subscribe to anything online. It's also under a year old.

It must've been vacuumed up from other people's contact or email data.

luckylion · on May 15, 2020

Or from the email provider, if it's not your own server.

I know that e.g. GMX has had a leak at some point (or sold data), as an email I created there ages ago was used in phishing. Okay, that's lame, but they've also used the fake name I had given to GMX, spelled perfectly. I've never used that name anywhere when signing up, so it must come from the database.

huhtenberg · on May 15, 2020

I use a private email server.

css · on May 15, 2020

For me, the HaveIBeenPwned domain search only lists one item in this breach: my LinkedIn@... email. Searching my inbox shows that the only emails sent to that address are from LinkedIn, so it probably came from a company I sent a job application (LinkedIn Easy Apply) to at some point.

devinegan · on May 15, 2020

This. My e-mail in the breach is a LinkedIn specific email. It has to be part of the clue to attribution. Social media scraping, possibly from multiple sources seems to be more likely than another LinkedIn breach.

edent · on May 15, 2020

I use unique emails. My record in this breach is just a generic "contact@" address.

Nextgrid · on May 15, 2020

Could it be from whois data? Seems like a reasonable place for which to submit such a generic address.

jerome-jh · on May 15, 2020

Or could be the spammers sanitized them.

Scoundreller · on May 15, 2020

How would they know which to sanitize???

stavros · on May 15, 2020

I use the format you mention for almost everything, but my email address in this breach is one I haven't used in something like ten years.

alberts00 · on May 15, 2020

HaveIBeenPwned now has a feature to find e-mail addresses which were breached under a domain, so there is normally no need to search for separate aliases if you own the e-mail domain.

https://haveibeenpwned.com/DomainSearch

koheripbal · on May 15, 2020

Unfortunately the email notifications don't tell you WHICH email addresses leaked.

funnybeam · on May 15, 2020

Yes they do, you just have to click the link in the email and request the full report

willvarfar · on May 15, 2020

Does hibp know enough about the regular providers such as gmail that support this, to be able to attribute someone+amazon@gmail.com with someone@gmail.com?
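
Whether HIBP does this is not something the thread answers, but the attribution itself would be simple. A sketch, assuming Gmail's behaviour of ignoring dots and anything after a `+` in the local part; the function name `canonical_gmail` is hypothetical.

```python
def canonical_gmail(address: str) -> str:
    """Map Gmail aliases to their base mailbox; leave other providers untouched."""
    address = address.strip()
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() not in {"gmail.com", "googlemail.com"}:
        return address
    # Drop the "+tag" suffix, then the dots, which Gmail ignores.
    base = local.split("+", 1)[0].replace(".", "").lower()
    return f"{base}@gmail.com"


print(canonical_gmail("Some.One+amazon@gmail.com"))
```

Other providers handle `+` and dots differently, which is why the sketch refuses to guess for any domain it does not know.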

alias_neo · on May 15, 2020

That was my first thought, I also use "company@mydomain" sometimes. Too many to go through... if only I could get hold of my record....

Nextgrid · on May 15, 2020

I believe HIBP offers domain admins a way to get all their pwned users after domain verification.

esnard · on May 15, 2020

Instructions are on this page: https://haveibeenpwned.com/DomainSearch

jraph · on May 15, 2020

For people who care, it uses reCAPTCHA. I stopped there.

smichel17 · on May 15, 2020

Thanks for saving me a click. No desire to play "guess how many minutes I'll have to spend clicking sidewalks" today.

noxford1 · on May 15, 2020

If it takes you minutes to solve a recaptcha your problem might not be the recaptcha...

eitland · on May 15, 2020

It might just be that you use Firefox.

Seems anybody who doesn't use Chrome is automatically flagged even if you are logged in with a >12 years old gmail account that is linked to paid storage.

NSAID · on May 15, 2020

I use Firefox with numerous tracker blockers and only had to hit the checkbox.

smichel17 · on May 15, 2020

Turning on privacy.resistFingerprinting in about:config is the big one. The temporary containers addon also makes a difference.

throwanem · on May 15, 2020

You tend to see a whole lot more challenges if you're on a VPN. It taking a couple of minutes isn't at all implausible.

UI_at_80x24 · on May 15, 2020

It must be his fault, things like being visually impaired cannot ever be accounted for or considered.

Recaptcha is hostile to end users and makes my life a fucking hell. But you're right, it's my fault.

ta17711771 · on May 16, 2020

No, the problem is Google trying to outsource creating their Waymo test data onto us.

smichel17 · on May 15, 2020

reCAPTCHA adjusts how many it makes you solve depending on how much info it can gather on you. If I disable my privacy settings & extensions I never need to solve more than 1-2. I'm not usually willing to do that.

alias_neo · on May 15, 2020

That's really useful info, thanks. I'll check it out this weekend.

bryant · on May 15, 2020

I follow this pattern exclusively, though I haven't actually received any recent HIBP notifications. I'll do a manual check.

Edit: three personal domains registered nothing. One corporate domain registered a double digit hit. If I discern any clues I'll get back to the thread.

m-p-3 · on May 15, 2020

I'm waiting for Firefox Relay to become available just to better control who has my email address and the flow of emails, but I'm worried it will make the task more difficult to follow breaches.

Maybe Mozilla could partner with HaveIBeenPwned to help dealing with that?

tinus_hn · on May 15, 2020

Remember that once you try an email on a service like that, it’s no longer unique to the merchant.

Qwuke · on May 15, 2020

If hibp started using something that guarantees k-anonymity when checking for an email, like their password service does[1], then I think it'd be possible to keep the email unique.

1: https://www.troyhunt.com/ive-just-launched-pwned-passwords-v...
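
For reference, the k-anonymity scheme used by the password service works roughly like this: the client hashes the password with SHA-1, sends only the first five hex characters, and matches the returned suffixes locally. The sketch below shows the client-side half without any network call; the response body is a made-up sample in the `SUFFIX:COUNT` line format, and applying the same idea to email lookups is hypothetical.

```python
import hashlib


def split_hash(password: str):
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest; only the prefix is sent."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, range_body: str) -> int:
    """Look up our suffix in a 'SUFFIX:COUNT' per-line response body."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_hash("password")
# Made-up response for the prefix; the counts are not real figures.
sample_body = (
    "0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n"
    "1E4C9B93F3F0682250B6CF8331B7EE68FD8:42\r\n"
)
print(prefix, count_in_range(suffix, sample_body))
```

The server only ever learns the five-character prefix, which is shared by many unrelated hashes.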

VectorLock · on May 15, 2020

So many things disallow + in email addresses I don't even bother any more.

multidim · on May 16, 2020

All services so far seem to accept dots, but the number of possible dot arrangements can be quite limited, and it is a pain to actually use (figure out next one to use, figure out associated service from dot arrangement, etc).
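
To put a number on "quite limited": a local part of n characters has n - 1 gaps, each of which either gets a dot or not, so there are 2^(n-1) arrangements. A small sketch that enumerates them (the function name is made up):

```python
from itertools import product


def dot_variants(local_part: str):
    """Yield every way to place single dots between the characters of local_part."""
    chars = local_part.replace(".", "")
    if not chars:
        return
    # Each gap between two characters is either empty or a dot.
    for gaps in product(("", "."), repeat=len(chars) - 1):
        yield chars[0] + "".join(gap + ch for gap, ch in zip(gaps, chars[1:]))


print(list(dot_variants("abc")))
print(len(list(dot_variants("whatever"))))
```

A three-letter name gives only 4 variants, while an eight-letter one gives 128, and none of them says which merchant it belongs to without a separate lookup table.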

VectorLock · on May 16, 2020

Gmail won't let you put anything arbitrary with dots. So if you're whatever@gmail.com you can use what.ever@gmail.com but not whatever+somemerchant@gmail.com. Other email systems obviously can work however they want.

Scoundreller · on May 15, 2020

Or accepts it at account creation, but not at login!

mattlondon · on May 15, 2020

My gmail is on it, but not my burner-domain. So either the data is old (year or two), or they got my gmail from somewhere else.

I'd be interested to see the whole dump to see my full record...

koheripbal · on May 15, 2020

a year is not "old"

cr3ative · on May 15, 2020

It's got my generic one (firstname@), and an older Facebook login email address (facebook@, changed now since Kickstarter leaked that one). Interesting.

PanMan · on May 15, 2020

I did, and I usually use site specific emails (eg amazon@username ) but it found my "generic" firstname@username email... So no insights there.

simias · on May 15, 2020

I suspect that Troy Hunt would have noticed if there were many emails with "+someservice" in the dump since he can easily dump them all.

blauditore · on May 15, 2020

Not sure of this, because I assume only a tiny fraction of people does this, and those who do probably aren't consistent. E.g. for Amazon Prime, some might use "+amazon-prime", some "+amazonprime", some "+amazon" etc., so there would be very few overall repetitions even in a large data set.

simias · on May 15, 2020

Right but grepping for "+" in emails is also high on the list of things I'd do to identify an unknown information dump. Given that he's used to dealing with those I'd be surprised if he hadn't thought of that, although it probably doesn't hurt asking him if he did try it.
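
A sketch of what that grep could look like, with a normalization step to soften the inconsistency problem raised in the parent comment ("+amazon-prime" vs "+amazonprime"). The sample addresses are invented, and this is not a description of how Troy Hunt actually processes dumps.

```python
import re
from collections import Counter


def normalized_tag(address: str):
    """Return the '+tag' of an address, lowercased with punctuation removed, or None."""
    local, sep, _domain = address.rpartition("@")
    if not sep or "+" not in local:
        return None
    tag = local.split("+", 1)[1]
    tag = re.sub(r"[^a-z0-9]", "", tag.lower())
    return tag or None


def tag_counts(addresses) -> Counter:
    """Count how often each normalized tag appears across a list of addresses."""
    return Counter(t for t in map(normalized_tag, addresses) if t)


sample = [
    "a+amazon-prime@example.com",
    "b+AmazonPrime@example.com",
    "c+amazon@example.com",
    "d@example.com",
]
print(tag_counts(sample).most_common())
```

If one service dominated the counts, that would be a strong hint about the source; a flat distribution would point towards aggregated or contact-book data.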

dgellow · on May 15, 2020

> Why load it at all? Because every single time I ask about whether I should add data from an unattributable source, the answer is an overwhelming "yes"

To be fair, you’re asking your followers on twitter. That’s as biased as you can have, I would be really surprised if the majority would say no.

SideburnsOfDoom · on May 15, 2020

I got notified that I'm in this breach, and I honestly don't know what (if anything) I can do with this information, which implies "If it's not actionable, why bother telling me at all?"

Unique passwords per site, with a password manager? Done a long time ago. Should I change some of them? OK, which ones? there are hundreds.

Details of what else about me is in this breach? Not clear where I can find that.

ric2b · on May 15, 2020

> Should I change some of them? OK, which ones? there are hundreds.

The ones that you know were pwned.

In theory you should change all passwords all the time, but this is a practical middle-ground between that and "never".

SideburnsOfDoom · on May 15, 2020

> The ones that you know were pwned.

Breaches like this one give no indication of which password is exposed, if any.

AFAIK, there is nothing actionable.

onefuncman · on May 15, 2020

This is a positive bias IMO, and any negative reactions that bubble up in the replies are going to be more useful.

numpad0 · on May 15, 2020

Could it be Google+? 3 of my 3 Gmail addresses associated with their profile in some way were on it. Two of them I might have used to register a domain, but the last one I used for G+ and one other website only, and none of my friends know it. Also, I'm not in the US and have no US background, so it can't be from American friends' phones or a retailer CRM.

onefuncman · on May 15, 2020

This seems like a winner to me. Iterating a graph along some association explains the ordering mentioned in the blog post, and explains the breadth of connectivity.

anoncareer0212 · on May 15, 2020

shocked to read this, you can immediately rule it out after reading the article or looking at the sample data

Jaxkr · on May 16, 2020

It’s covve, a free personal crm app

londons_explore · on May 15, 2020

> Recommended by Andie [redacted last name]. Arranged for carpenter apprentice Devon [redacted last name] to replace bathroom vanity top at [redacted street address], Vancouver, on 02 October 2007.

Given that, surely Troy can contact those people and ask "who knew this info?". Not many people would know who replaced my bathroom vanity top...

pfundstein · on May 15, 2020

Sure but perhaps Devon used a SAAS CRM system whose servers were breached... Or maybe Andie posted on Devon's public Facebook page to organise the job. Maybe it's just the LinkedIn leaks resurfacing, etc, etc.

typpo · on May 15, 2020

I use a unique email on my personal domain for everything I sign up for.

The email contained in this breach is the one I provided to Facebook. It was probably hacked or sold from one of the handful of apps I've connected with FB over the years.

secfirstmd · on May 15, 2020

One of my emails is currently on:

"Pwned on 19 breached sites and found 5 pastes."

If these are just the public breaches, I would guess that in reality it's on double or triple that number, counting sites that have been breached but whose data hasn't been posted online.

wincent · on May 15, 2020

I don't really get the utility of HIBP. The answer to the "have I been pwned?" question is, of course, yes, multiple times. I think about the only way to keep your email out of the hands of the bad guys is to not use it or give it to anyone ever, at which point you don't need an email address.

What am I supposed to do whenever I'm involved in a new breach? Burn all my accounts and start again?

koheripbal · on May 15, 2020

If you use a password manager to give you unique passwords per site, then these alerts allow you to only change the impacted site's passwords.

...though in a case like this it wouldn't help since we don't know the site.

Normal_gaussian · on May 15, 2020

The monitoring service is useful, when a leak is detected you can reset that password.

Knowing that you have been historically breached is less useful.. Until I need to convince somebody to start taking account security seriously.

It's quite sobering to discover that data breaches are commonplace.

scrollaway · on May 15, 2020

The biggest contribution HIBP makes is in teaching people not to reuse passwords (and use a password manager instead).

multidim · on May 16, 2020

>What am I supposed to do whenever I'm involved in a new breach? Burn all my accounts and start again?

If you reuse passwords, then change your passwords for all the accounts that use the breached password. Hopefully, it'll spur you to start using a password manager so you can easily have strong, unique passwords.

If you don't reuse passwords, then change your password for the breached account. Sometimes services don't tell you about breaches and it is HIBP that first informs you about the breach.

If there is some email address that you really, really don't want bad guys to know about (perhaps a dedicated email address for your important financial accounts), then it helps you know when to switch to another email address.

HIBP helps you know how often a service has been breached in the past, and that might help guide what services you want to use/not-use in the future.

numpad0 · on May 15, 2020

Check account recovery procedures, change the password for that website, check login history and active sessions, and see if anyone has done anything that could be done with those credentials, on top of using randomly generated passwords in the first place.

And I think you’re about to describe Sign In with Apple.

sbarre · on May 15, 2020

As the other comment also said, it's a public education service.

Remember that most of us on here have extremely advanced knowledge of the Internet and its workings. This is not the case for the vast majority of Internet users.

xondono · on May 15, 2020

It depends on how many emails you keep. If you get a hit it’s a good idea to ensure that you keep control of the services related to that address (change passwords, set any extra security measures).

I mostly use it through 1Password, because it also notifies you when a service has enabled new security features like 2FA.

EmilioMartinez · on May 17, 2020

For me it's a shortcut to explain why it's always a risk to divulge personal information to 3rd parties, however trustworthy they seem.

polote · on May 15, 2020

After how many breaches of ES clusters will Elastic decide to make their DB inaccessible from external IPs by default?

zaat · on May 15, 2020

That's the default for a long time already, but people actually want to use it from outside the server and so they configure the listener.

https://www.elastic.co/guide/en/elasticsearch/reference/6.3/...
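
A quick way to sanity-check a listener address before deploying is to classify it. The sketch below uses Python's `ipaddress` module and only handles IP literals plus `localhost`; the function name `exposure` and the category labels are assumptions, and a "private" result still depends on the surrounding network actually being private.

```python
import ipaddress


def exposure(bind_host: str) -> str:
    """Classify a listener address as 'loopback', 'private', 'all-interfaces' or 'public'."""
    if bind_host == "localhost":
        return "loopback"
    addr = ipaddress.ip_address(bind_host)
    # Order matters: 0.0.0.0 and 127.0.0.1 also count as "private" in the
    # ipaddress module, so test the more specific cases first.
    if addr.is_unspecified:
        return "all-interfaces"
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"


for host in ("127.0.0.1", "10.0.0.5", "0.0.0.0", "8.8.8.8"):
    print(host, exposure(host))
```

Anything reported as "all-interfaces" or "public" means the database itself is not the last line of defence; a firewall or private network has to be.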

outworlder · on May 15, 2020

Even then, that also means that their machine has a public routable IP and can answer incoming requests from the internet. My question is: why?

Sebb767 · on May 16, 2020

For many cloud VMs you spin up, it's the default. Having your servers behind a NAT not only requires a lot more infrastructure knowledge (you need to know you need it and manage access and routing), but also quite a bit more capital investment; i.e. you need to set up a full infrastructure compared to spinning up two+ VMs.

That's not to say it's a good thing, but I'm always surprised by the lack of deeper network knowledge among a lot of engineers (and that's not meant to be degrading - it's not something that you get for free when programming).

Lastly, you did probably start the project with a single VM - and at that point it's far harder to say when the point comes to move to a NAT, even more given that getting your second server is probably needed in a sudden spike and the switch is a lot of work with no immediate payoff.

r1ch · on May 15, 2020

Is this dump online anywhere? I got the notification from HIBP but it only tells me my email address appeared and I'm curious how accurate the rest of the data is.

esnard · on May 15, 2020

> Back in Feb, Dehashed reached out to me with a massive trove of data

I guess searching on https://www.dehashed.com/ should give you some additional data.

Nextgrid · on May 15, 2020

Surprisingly enough searching my pwned address in this breach doesn't bring it up on Dehashed.

Operyl · on May 15, 2020

The Dehashed indexer is extremely slow, according to their FAQ. Mine hasn’t shown up there yet either, but I was informed by HIBP. Could still be indexing I suppose.

celticninja · on May 15, 2020

Exactly what I want to check. It's almost expected that at some point my email address is going to end up in a breach, but there is a chance that by reviewing the data I can ascertain where it came from, at least in part.

guessmyname · on May 15, 2020

> Email addresses, Job titles, Names, Phone numbers, Physical addresses, Social media profiles

I just got the email notification from HIBP (Have I Been Pwned) a few minutes ago [1], but I am not worried about the compromised data because 1) my personal email address, job title and phone number are all visible in my resume, which is publicly available on my website (I actually encourage people, mostly tech recruiters, to download the PDF and contact me via email or phone all the time) and 2) my physical address is irrelevant because I have been moving house every year for the last seven (7) years (even across countries a couple of times). All the social media accounts I have are completely empty; I just keep them around to hold on to my nickname.

I recently found, in my website’s HTTP logs, several requests from a web crawler controlled by ZoomInfo [2], an American subscription-based software as a service (SaaS) company that sells access to its database of information about business people and companies to sales, marketing and recruiting professionals. I was going to configure my firewall to block these requests but then I remembered (hey! my website only has information I am comfortable sharing, so it doesn’t matter), but I’ve been thinking it is just a matter of time before someone hacks one of their systems and leaks their database.

In my previous-previous job I found a fairly simple (persistent) XSS vulnerability in BambooHR that allowed non-authorized users to access data from all employees registered in the website including Social Security Numbers (SSN). I told my boss and we immediately edited everything before migrating to a different system. We never knew if BambooHR fixed the vulnerabilities and I wouldn’t be surprised if the data was leaked before or after I found the security hole.

Software security is such a Whac-A-Mole game: even if you get the budget to conduct security audits on your code, there is always going to be a weak link somewhere in the chain and that will be your doom. This is one of the many reasons why I left that job as a Security Engineer. The other reasons were Meltdown [3] and Spectre [4]; they both made me realize I was fighting for a lost cause.

[1] https://haveibeenpwned.com/NotifyMe

[2] https://en.wikipedia.org/wiki/ZoomInfo

[3] https://en.wikipedia.org/wiki/Meltdown_%28security_vulnerabi...

[4] https://en.wikipedia.org/wiki/Spectre_%28security_vulnerabil...

cpv · on May 15, 2020

> Email addresses, Job titles, Names, Phone numbers, Physical addresses, Social media profiles

Probably these can have a different impact if your threat model is a bit different (money, status, living area, position held, etc).

Reminds me of the story about an investigative reporter known in these parts, who was swatted: https://krebsonsecurity.com/2013/03/the-world-has-no-room-fo...

or received a drug package from an investigated person, basically it was a trap: https://krebsonsecurity.com/2015/10/hacker-who-sent-me-heroi...

The journalist knew about this and informed the police beforehand. Happy end.

To add a little more, I have seen people posting on social media answers to posts like "your favorite car, your place of birth, name of mother, name of pet". Guess who uses those words for similar secret questions?

Some personally identifiable information can be used to fabricate fake IDs, for various purposes.

And if we have a linked graph with all the personal, job, address, interacted people, geo-places, etc, it can get creepy (sounds like Facebook, but much more open).

Not saying we all should get paranoid, but leaked data could be used in different ways.

sirius87 · on May 15, 2020

The BambooHR theory is interesting. I looked up email addresses of co-workers at a startup I worked for a few years ago (Jul'15-Jun'16). I was with them earlier in 2012-13. My work email isn't there. But the slice of people between Apr'13-Jul'15...all there. I guess we ran through a bunch of HR software during the period, BambooHR being one of them. So either it's a subset of BambooHR or it's some other product a bunch of people at my workplace signed up for.

nucleardog · on May 16, 2020

Our company's been on BambooHR for 3-4 years now I think (me personally for a little over two). Can't find any of our company's addresses in there. So either partial or old if that's where it came from.

Others are saying they've found data from as recently as mid-2019, so could be possible that the reason it's so hard to find a source is that this is multiple sources. Looking at this as a dump from some sort of contact manager, could see this being a dump from some sales guy's CRM or something where he'd imported multiple datasets as potential leads alongside his personal contacts.

Thoughtful · on May 15, 2020

On the BambooHR issue, can you elaborate a bit more?

guessmyname · on May 15, 2020

BambooHR is written in PHP and as it is widely known PHP allows incompetent programmers to create insecure websites. The majority of BambooHR pages are loaded by referencing a page ID, for example, you can access this URL [1] to render a form that allows you to send documents to arbitrary e-mail addresses, and this URL [2] allows you to edit your own profile.

So far so good, if you are a competent PHP programmer (or any other programming language) you make sure these IDs are not consecutive to avoid enumeration attacks, but even if they are part of a guessable sequence you can still secure them by restricting access to all pages except the ones associated to the user ID in the session.
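The two defenses described above (non-guessable IDs, plus an ownership check against the session user) can be sketched roughly as follows. This is an illustrative Python sketch, not BambooHR's code; `RECORDS`, `create_record` and `fetch_record` are hypothetical names, and the in-memory dict stands in for a real database.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Record:
    owner_id: str
    body: str


# Hypothetical in-memory store standing in for a database table.
RECORDS = {}


def create_record(owner_id: str, body: str) -> str:
    """Store a record under a random, non-sequential ID to resist enumeration."""
    record_id = secrets.token_urlsafe(16)
    RECORDS[record_id] = Record(owner_id, body)
    return record_id


def fetch_record(session_user_id: str, record_id: str) -> Record:
    """Return the record only if it belongs to the user in the session.

    The same error is raised for 'missing' and 'not yours', so a caller
    cannot learn which IDs exist.
    """
    record = RECORDS.get(record_id)
    if record is None or record.owner_id != session_user_id:
        raise PermissionError("record not found or access denied")
    return record
```

The ownership check is the part that matters: even with sequential IDs, a request for someone else's record is refused because the decision is based on the session, not on the ID in the URL.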

The vulnerabilities I found were a combination of Path Traversal [3], Forced Browsing [4] and Stored Cross Site Scripting [5] that allowed anyone to 1) force a specific PHP file to load arbitrary pages, 2) access data associated to other employee identifiers and 3) send all documents associated to these employee IDs to arbitrary emails by accessing the “Email File” page and crafting a simple HTTP request to bypass a rudimentary form validation.

When I told my boss he continued the investigation and found that we could access a certain amount of data associated with employees registered in other subdomains. People who are familiar with BambooHR will understand how stupid this specific problem is considering each subdomain is isolated from the others, so one would expect them to isolate the databases as well.

I don’t know anything about the architecture of their system so I cannot explain why these security holes allowed us to access data from other companies. I was very scared to continue digging into it and my boss was super pissed off. We didn’t know if they used soft deletes so instead of removing the company’s data we decided to edit it with garbage information, then we migrated to another system.

And that was the end of our story with them. We never reported the problems because I started my “research” without previous authorization from BambooHR so if we reported our findings they could sue my employer and we would be in bigger problems. Same thing happened when we found a vulnerability on HipChat [6] in 2014 or so, we reported it and they got super angry at us for conducting that penetration test without permission, the company made an agreement with them and we migrated to Slack.

Good luck to anyone whose employer is still using BambooHR to manage their employee database.

[1] /ajax/employees/files/email_file.php?id=19

[2] /employees/employee.php?id=EMPLOYEE_ID&page=PAGE_ID

[3] https://owasp.org/www-community/attacks/Path_Traversal

[4] https://owasp.org/www-community/attacks/Forced_browsing

[5] https://owasp.org/www-community/attacks/xss/

[6] https://en.wikipedia.org/wiki/HipChat

gtsteve · on May 15, 2020

> PHP allows incompetent programmers to create insecure websites.

The points you bring up are good but my first instinct was to distrust you as you opened with that. I don't believe any specific shortcoming of PHP makes these issues more or less likely. Anyone can make an insecure website in any language.

Secondly I don't think I quite agree with the ethics of dropping a security vulnerability in a public forum. I think you should edit this message to remove the details and go through the proper channels to get this resolved, if it is indeed still a problem.

Nextgrid · on May 15, 2020

Bare PHP (without any framework) and the tons of bad advice surrounding it make it easier to screw up than other languages where it's very hard to do web development without a framework so most beginners start off with a framework directly which provides structure and guard-rails against doing insecure things.

fragmede · on May 15, 2020

I mean, you're not wrong, but starting the post off by insulting PHP is childish and doesn't inspire trust that the rest of the report is worth reading.

fragmede · on May 15, 2020

Personally, the ethics of it are secondary to the fact that BambooHR could sue HN to recover the IP address guessmyname used to post, followed by suing their ISP to get an address, and then trawl through their records/backups to link it to an individual. Now, BambooHR may not be run by assholes (I've never encountered them before), and choose to fix the bug quietly rather than go after "guessmyname" with a lawsuit, but companies are not known for being especially insightful when computer security comes up. (Such as the HipChat example mentioned.)

Hopefully guessmyname always uses VPN/public hotspot to access this site, if it turns out that BambooHR is run by litigious jerks.

notwhereyouare · on May 15, 2020

I reviewed them when looking for an HR provider...thankfully they didn't offer everything I was looking for. But man, that's scary

throwaway834792 · on May 15, 2020

Based on a large (over 50 results) domain search for a company I work for, the data I found was very old, circa 2014.

I know this because almost everyone in the domain search stopped working for the company in or after 2014. Everyone else has worked at the company since 2013 or earlier.

bryant · on May 15, 2020

Heads up, found at least one match for 2019 from a dataset I'm working with.

lawnchair_larry · on May 15, 2020

That doesn’t set an upper bound on when the breach happened, it sets a lower bound. Old email addresses aren’t deleted by whoever had them. It just means it contains data from at least 2014, up to and including 2019.
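The bound reasoning can be shown with a tiny Python sketch. The dates are made-up examples and `earliest_possible_breach` is a hypothetical helper: the dump cannot have been taken before its newest record existed, while old records say nothing about how late it was taken.

```python
from datetime import date


def earliest_possible_breach(record_dates):
    """Lower bound on the breach date: the newest record seen in the dump.

    Older records do not push the bound earlier, because stale data
    stays in a database long after it was collected.
    """
    return max(record_dates)


# Illustrative dates only: mostly 2014-era records plus one 2019 match.
seen = [date(2014, 3, 1), date(2014, 11, 20), date(2019, 6, 1)]
print(earliest_possible_breach(seen))  # 2019-06-01
```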

koheripbal · on May 15, 2020

The email notification doesn't list the emails impacted. Do you need to rerun the full report to get the details?

Nextgrid · on May 15, 2020

If you run the domain report manually on the HIBP website you get the actual email addresses involved.

tru3_power · on May 15, 2020

I did some quick searching for the data format included in the snippets from the article. Lots of repos with stored secrets that match:

https://github.com/acalvoa/SRID_CHANGER/blob/da367e68433b3fd...

Stored secret:

https://github.com/acalvoa/SRID_CHANGER/blob/master/config.p...

Will look more into this later

amatecha · on May 16, 2020

Ehhh, to me those seem like pretty common fields for any kind of contact data. It doesn't have some of the more unusual or IMO implementation-specific fields like "ShowableNonVisibleToOthers" or "PopulatedCleanNumber", for example.

killswitched · on May 16, 2020

Some of my addresses that turned up were ones I gave to Dr. Dobbs and New Relic, although the leak came from parties to whom these sites had provided my data, including at least the unique email addresses.

forgotmypw23 · on May 15, 2020

The first thing that comes to mind is reCAPTCHA with some overlays. They would know almost every account you've registered for.

cm2187 · on May 15, 2020

Does elasticsearch have no authentication by default like mongodb or did someone deliberately make it public?

tyingq · on May 15, 2020

Fixed now, but this was a common sequence of events at one time: https://discuss.elastic.co/t/ransom-attack-on-elasticsearch-...

cm2187 · on May 15, 2020

My god, it looks even worse than no security by default. It gives you a false sense of security then unlocks behind your back when you are not watching.

leetbulb · on May 15, 2020

No authentication by default.

wnevets · on May 15, 2020

Am I the only one who dislikes some of those column names?

isNonIndividual, IsNonVisibleToOthers, ShowableNonVisibleToOthers

akersten · on May 15, 2020

I can smell the enterprise ball-of-mud spaghetti code from here :)

outworlder · on May 15, 2020

Negative flags suck.
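A small Python sketch of why: with a negated column, the everyday question ("can others see this?") needs a `not`, and opting out becomes a double negative. The column name is taken from the dump; its exact meaning is an assumption here, and `visible_to_others` is a hypothetical wrapper.

```python
def visible_to_others(row: dict) -> bool:
    """Positive-sense wrapper around the negated flag (assumed meaning)."""
    return not row["IsNonVisibleToOthers"]


row = {"IsNonVisibleToOthers": False}

# Reading the raw flag forces a double negative:
if not row["IsNonVisibleToOthers"]:
    print("visible (not non-visible)")

# The positive wrapper reads directly:
if visible_to_others(row):
    print("visible")
```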

wjnc · on May 15, 2020

Question: it was my understanding that a lawyer could sue the cloud provider to obtain details of the customer behind the cloud service? That would be relevant information in determining liability for leaking this PII.

voidmain0001 · on May 15, 2020

Firefox Monitor includes the db8151dd data: https://monitor.firefox.com/?breach=db8151dd

yahelc · on May 15, 2020

Probably because they include HIBP data https://www.troyhunt.com/were-baking-have-i-been-pwned-into-...

jonykakarov · on May 18, 2020

What I can't understand is that I had never heard of this Covve app, and neither had most of the affected users in the comment sections on Reddit, Troy's website, or even here, as no one thought of it. Yet my email does exist in the breach. Also, the data seems huge (103,150,616 rows/90GB) for an app that has about 100k installs. Need some explanations here.

bluesign · on May 15, 2020

It’s contact data from iOS and Android phones, probably scraped via some malware app/apps

akmarinov · on May 15, 2020

Contact data doesn’t contain CRM references

Nextgrid · on May 15, 2020

Could it be that CRMs had their own contacts integrations which synced CRM data into someone's contacts, where a different app then scraped it and got pwned?