First American Financial Corp. Leaked Hundreds of Millions of Insurance Records (krebsonsecurity.com)
418 points by PatrolX on May 24, 2019 | 164 comments



I did a penetration test for $NATIONALINSURER and they had an FTP site with weak credentials where all the remote offices uploaded claims. Millions of records and scans of SSNs, home addresses, bank information, etc. Their mitigating controls were: we put it behind a firewall.

Then again I didn't expect much, their MSSQL in prod had SA/SA credentials active.


I did EDI work for several major national and international companies you've definitely heard of. This is all too common; we're talking about millions of dollars of transactions per day flowing over insecure FTP sitting on the internet. VANs originally used dial-up modems to deliver EDI; now they often use insecure FTP.

A few brave companies have tried to put their FTP systems behind VPNs, but the momentum is hard to overcome. More popular are firewall rules that only allow large blocks of IPs owned by the other vendors they deal with. It is good in theory, until you see how large/diverse some of these blocks are (e.g. all of AWS's Eastern data center).

It was a very loud wake-up call seeing what inter-business traffic looked like. It is the wild west, or a flashback to the 1990s, security-wise.


I'm currently fighting against management dragging their feet on using 2FA.

On HIPAA PHI.

(I know HIPAA doesn't actually mandate 2FA, but it's recommended by many best practices and guides.)

Apparently some tech folks don't like the inconvenience of 2FA.


DoD contractors are now required to have 2FA.

https://duo.com/blog/federal-contractors-must-meet-cybersecu...


Maybe not a popular or shared opinion (so nobody take this as advice), but IMO 2FA (especially phone-based) is overrated while being a serious inconvenience to users and developers.

Many of the recent attacks I've seen simply bypass it altogether in favor of phishing or other traditional techniques.

Clients (Ansible?) simply don't work with it or don't do it well, which leads to hacks that undermine your 2FA deployment anyway: rogue admins opening reverse tunnels to allow file transfers, webshells, etc.


2FA freaks me out. It means I'll be locked out of all my key accounts and services if ever my phone breaks or gets lost. Probably right when I need these services most.


That’s why they usually have backup keys that you physically keep in a safe place.


That sounds decent, but I've not seen it much. It's often just "give us your phone number and we'll SMS you an access key when you log in".


Unfortunately I see this all too often on systems that have enabled 2FA but not TOTP.
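
For reference, TOTP (RFC 6238) is simple enough to sketch in a few lines of Python. This is a minimal illustration only; a real deployment should use a maintained library such as pyotp:

  import base64, hashlib, hmac, struct, time

  def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
      """RFC 6238 TOTP: HMAC-SHA1 over the current time-step counter."""
      key = base64.b32decode(secret_b32.upper())
      counter = struct.pack(">Q", int(time.time() // period))
      mac = hmac.new(key, counter, hashlib.sha1).digest()
      offset = mac[-1] & 0x0F                     # dynamic truncation (RFC 4226)
      code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
      return str(code % 10**digits).zfill(digits)

  print(totp("JBSWY3DPEHPK3PXP"))  # a common demo secret, not a real one

Unlike SMS, the shared secret never transits a phone network, which is the point being made above.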


I've seen retail stores with revenues in the tens of billions using Telnet for the POS clients in 2019. They also used FTP galore and were worried about the security of the cloud. :)


I worked for <business imaging company X> where every network-connected copier automatically set up its own web server where unauthenticated users could peruse all of the jobs printed recently. Sure, a firewall might prevent public access, but it also wasn't hard to use Google's inurl: operator to find the (at the time) 5% or so of companies using these things that had public IPs assigned. You could also upload documents like PDFs to be printed out. Many of the high-end fax machines had the same "feature". HP printers did something like this too, but that's not where I worked.

EDIT: Oh, and the network controllers that ran them were uniformly updated and managed via fully open "admin"-username, no-password telnet and FTP services. IoT insecurity began a looooong time before the term IoT even existed.


Ditto. I've seen the same thing going on at an investment bank processing $350 billion / day in transactions.


I wonder how many retailers there are that make tens of billions.


Around 20-25 I believe.


Depending on how weak those credentials were, this sounds like something you should report to Krebs as well.


I mean, the whole point of that company hiring someone to do a penetration test is to make sure this doesn't end up on Krebs. You pretty much lose your reputation if the company thinks you will go around and leak what you found through that private engagement.


Good point, I jumped the gun a bit there. I misread the comment and thought they meant the firm didn't do anything about a pentest result, not that they found it themselves.


I'll bet client4 is already breaking an NDA or two just by posting this.


According to the NDA, client4 doesn't even exist...


Security by NDA, cheapest infosec.


A lot of discussion on technical side, but not from organisational.

How could audit, both internal and external, not find this? 2003 to today is 16 years. Audit is a last line of defence and certainly not to be relied upon as a buddy to catch your errors. But... how? This is a major financial institution in the most developed country in the world (the clue's in the name). It should subscribe to the highest integrity and tightest scrutiny. This seems an opportunity for both internal and external auditors to tighten their game.

Outside of audit, surely an employee might have noticed? Was there no formal method to speak up without fear of recrimination? According to Wikipedia [1] there are eighteen thousand employees. Did no one ever notice?

This seems an organisational failing, not a technical one.

[1] https://en.wikipedia.org/wiki/First_American_Corporation


Is the tech not a part of the organization's way of doing business?

These things are highly related to what’s going down in a thread [1] from yesterday (about “shitty projects”).

I’m sure these guys spend many millions each year on security products, but either the people in the know on the tech side are ignored, or there are no competencies left.

In the thread I mention above I have actually posted about my general experience from a major insurance player.

A concrete example:

We were making changes to a piece of custom software, and as there were concerns about bandwidth requirements and latency, I took it upon myself to figure out what a specific process looked like from the business perspective.

In short, in the middle of the workflow, customers' journals were written to CD and mailed to physicians. Encryption? Eh, no... Any process in place to ensure safekeeping and return/destruction? Uh, forget about it...

This was at a time when a lot of these “lost USB devices” and hacked systems seemed to pop up daily.

I obviously raised this with the security team, the security officer and the business unit.

No one wanted to touch this finely tuned business process.

It felt like I was working at Fawlty Towers.

Again, the line that companies have drawn between business and tech, “’cause tech is not core bidniz”, will haunt a lot of big players for years to come.

[1] https://news.ycombinator.com/item?id=19998806


> In short, in the middle of the workflow, customers' journals were written to CD and mailed to physicians. Encryption? Eh, no... Any process in place to ensure safekeeping and return/destruction? Uh, forget about it...

That's a manually initiated transaction done internally and should be a red flag to anyone. Data outside of the organisation is data with no control. You could keep escalating this. That's an example of a missing 'speaking up' channel. If a channel to escalate is missing or poorly implemented, frauds will happen, by internal or external agents. The process doesn't sound finely tuned at all.


Of course I was being ironic about it being ”finely tuned”!

What I’m saying is that in spite of having, in a sense, all the resources at their disposal, this process was chosen by the business, for the business.

An encrypted online service could, and should, have been implemented. But being far from tech & dev, the business chose a process matching their competencies.

Messing with this several years in, and trying to digitize a process obviously in need of it, is met with much resistance.

Another gem of a process:

Many (like hundreds) employees needed personal printers. But why?!

Because:

- Printing a claim from the “modern” client/server system.

- Pinning an also-printed bar code to the pages from step one.

- Scanning these into software that reads the bar code and adds them to queues for mainframe processing.

D/A -> A/D? Huh?!

Holy cow! I almost fell off my chair...

And the inherent security risks in play here, not to mention the acres of forest consumed over the years. My mind is boggling...

Am I actually living my working life inside a Dilbert strip?! It’s not even funny, because it’s true.

What I’m saying is that many large corps are anything but in tune with tech.

It’s gonna’ cost em’ in the long run.


Right. I'm 'business', and the split 'business' vs 'tech' should not be there. I'm sure we've both seen terrible things; these are reinforced by organisational constructs. Escalate, escalate, escalate if you see something wrong. To quote a bigcorp slogan, from a company whose mission I admire: "Do the right thing" and "Not good enough."

I recently opened a new bank account in the UK and chose a 'challenger' bank. The process was secure and very smooth, and the customer support very nice. They have no branches. This is regulation-tech, not so much fintech, and challengers are coming from all sides, including in insurance. I wish these challengers well; being on the inside of incumbents, I'm just left scratching my head asking "Why?".


I for one am through escalating stuff in a hierarchical organization. Too much politics.

I’ve been out of that game for a few years and have no ambitions to make a career for myself at such a place.

In a very big, top-down org, ponder the following:

Granting, in a specific scenario, that I’m right (this “whatever” is a disaster waiting to happen, or possibly an already flaming disaster), heads have to roll.

Someone always has to take the blame, as this will most likely affect someone's budget or set goals.

It might have profound effects on the current “1.” or “.One” consolidation & synergy tech project that management is giving its misdirected focus at the moment.

The “1Whatever” projects usually have bizarre amounts of $$$ attached, and end up holy.

If you’ve worked at big enough companies, you know about the “whateverOne” projects I’m referring to!


I don't know what 1.whatever is. Yes, I've worked for supercorps, mainly financials, and I have a responsibility to ensure customer and employee data are managed responsibly.

It is important to escalate what doesn't seem right. Sometimes that means email after email after email (a written record), and, if it still doesn't smell right, to keep pushing. Ops was a strange place, but 500 emails per day is no longer a challenge.

I commented that this is an organisational failing rather than a technical one, as the debate about UUIDs seems to be missing the point: people could have been aware something was not right but did not do anything, weren't allowed to, or had their concerns drowned in the organisation.


Clarification on 1/one: in my experience, big corps naturally strive for synergies and often target “IT”, as it seems an obvious candidate.

These projects often bear a name such as “ProgramOne”, “Platform1” or 1SomethingAwesome, and are of a “bite off more than you can chew” character.

At least at three class-leading companies where I’ve worked, all with 90,000+ employees.

It’s just my disillusionment shining through! :)

I believe we agree — the future holds a merger of tech with business and what I’ve stated above are org failures. That’s true.

IT must not block business, it should expose opportunities and be inherently secure by convention.


> the future holds a merger of tech with business

Completely agree. And regulators are pushing for this. In retail and SME banking and financial services the future is already here in Europe, but it has yet to gain traction and public trust; that's coming quickly. That will take 5 years; change takes time. A long journey for VC money, but not too long; they will see returns.

Asia will be slow. NA perhaps slower still. AU might pick up the ball, but NZ will be faster if they choose. SG will lag HK because of technical debt in SG. I'm Asia-based, so not much idea about South America. South Asia is hand-tied by regulation, mainly currency restrictions; elsewhere, the regulators above will be providing markets with the best retail and SME financial products. A hot place to be, and Europe probably hotter because of PSD2.


My feeling is that at some places the effects are still somewhat underestimated.

There’s probably 90% lower-hanging fruit than blockchain, if you know what I mean.

Throwing inventive projects at failing orgs will most likely fall flat.

I understand many of the challenges though, and no true recipe for change exists.

The closest I can think about is “letting go”.

I mean, you have people hired for a reason! If you don’t trust them to do their thing, who hired them in the first place?


Yup, raise any flags and risk being treated as an outsider/treated poorly. That's my current situation after raising concerns ranging from being way overcharged on a government client project by a vendor who happens to be friends with a manager, to catching a now ex-manager pulling MITM attacks on a router (to snoop/play politics) which happens to be on the same network as servers housing client data. It's an awful feeling not being able to get glaring issues resolved, or being treated like shit after doing what seemed right and in the best interest of the company.

Needless to say, I'm making moves to get the hell out of there.


And this is from a huge organisation. There are many more medium-to-large organisations that still operate under the castle-and-moat model: perimeter defense, but once you are inside the VPN/intranet the security is a lot more relaxed (if there is any). That is the security culture, and it is very hard to change.

The market forces those orgs to start offering services online. They run those (relaxed-security) services inside their intranet, so they start poking holes in their firewall. The next decade is not going to be pretty in that regard.


I think you're correct that these will become more common. A couple of years ago I was chatting with friends who were doing APIs (because of market/regulatory pressure, the EU's PSD2), exporting JSON of transactions using COBOL for online use. Because that bank, almost an order of magnitude larger than First American in terms of employees, will do everything in order to not move off COBOL/its legacy system.

Interesting skill set: COBOL, DB2, JS, Angular.


You know what's a great incentive to actually care about this stuff? Legal consequences for not caring about it.

Anyone conceivably responsible for ignoring the developer's complaint should be on trial right now.


I personally think the board and CEO should be held personally criminally liable. I don't know exactly how, but if I can't use ignorance of the law as a defense for shouting in public (I yelled back at someone who yelled "fuck you" at me and got a $180 fine), then the CEO and board can't use it as a defense for leaking data for SIXTEEN years.

Didn't know this was happening in your organization? Fuck you, go to prison.


As an American resident in Europe, I have to wonder what liability they face under GDPR.


You can access documents all the way back to 2003.

That doesn’t necessarily mean that this hole has existed that long.


Very good point.


Whenever you are compelled to upload/send a photocopy of an ID document, it is sensible to write the date and purpose / file reference on it. If it appears in a document dump at some later date, you know the path and date of the leak.


This is excellent advice. You might not be able to write on, for example, a passport you're taking a photo of, but you can certainly put a Post-it-type sticker on it.


That is a good idea, but most of the time I need to hand over my actual ID, not just a scan of it.


I'm usually an uncooperative bastard at times like this. I ask them what their purpose is in retaining a copy of my identity document, and I ask for their privacy policy. When I'm travelling for work and a hotel asks I simply say no, and remind them I can lodge a complaint with our corporate travel provider that will have them delisted for future business; usually they come to their senses. For overseas travel (where they often have to fax your passport to the police) I ask for the photocopy back when I check out; since these documents are lost/misfiled regularly the desk clerk usually complies. Always try for point-in-time ID verification rather than retention.


I used to be uncooperative too, but in the end I usually still have to give it (or go somewhere else).


Where do they ask you for a copy of your ID? I've never had that happen to me at a hotel.


It's usually because of credit card fraud: they want it to prove to credit providers that the actual cardholder was present, and if not, exactly who was. And I've been asked in countries across the first world, though I haven't kept specific notes of when.


Yeah, most doctors' offices will take my license with my insurance card and scan them into their own systems to associate with my record. No opportunity to insert my own add-on content. Just hope whatever SaaS-based patient management system they have is iron-clad secure. (But it isn't.)


I guess you could stick a very small sticker on it


Yet another security vulnerability caused by:

1. Using sequentially incremented integer sequences as object IDs, and

2. Failing to protect sensitive data using some kind of authentication and authorization check.

This is becoming a trend with data breaches. Several of Krebs' other reports on behalf of security researchers were originally identified by (trivially) walking across object IDs on public URLs.

My cynical take is that Krebs couldn't go public before this afternoon because First American wanted it to hit the news at an opportune time, then get ahead of it with their own messaging. Krebs got in touch with First American on Monday May 19th. The story is only just breaking now on a Friday afternoon at 5 pm; markets are conveniently closed for the weekend.

I expect them to issue a hollow PR statement about valuing security despite being unable to act on security reports until an investigative journalist threatens to go public.


I once made an app that didn't use sequential integers as object IDs, as you suggest.

It was an absolute nightmare. Maintenance was a nightmare; you're constantly having to generate or replicate these things, which adds an extra layer of complexity to everything, almost always unnecessarily.

It's also extremely bad for DB performance: it causes massive page fragmentation, indexes become useless almost straight after rebuilding them, etc.

For almost everything, sequential int IDs are fine. It's the things you expose to users that you need to be careful with; don't use the primary key to access those. Add another unique key to them, but keep the int ID in there for the DB's use and your own.

My lesson was to go back to always using int ids, and on a few objects have a separate unique key column to expose to users for sensitive stuff.
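
A minimal sketch of that pattern (table and column names are illustrative): keep the sequential integer primary key for joins and indexing, and add an indexed random token for anything user-facing.

  import secrets, sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("""
      CREATE TABLE invoice (
          id        INTEGER PRIMARY KEY,   -- internal, sequential, never exposed
          public_id TEXT UNIQUE NOT NULL,  -- random token used in URLs/APIs
          amount    INTEGER NOT NULL
      )""")
  # ~128 bits of randomness, URL-safe, infeasible to enumerate
  conn.execute("INSERT INTO invoice (public_id, amount) VALUES (?, ?)",
               (secrets.token_urlsafe(16), 4999))

Lookups from the outside world go through public_id; everything internal stays on the int.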


I also don't think using UUIDs as a security (by obscurity) strategy is valid. But there are other reasons someone may choose to use UUIDs. For instance, it's convenient to generate identifiers in a decentralized manner. I want to counter your one bad experience with my (equally anecdotal) many good experiences. Databases do just fine with UUIDs. Though we may be working on different kinds of systems, and optimizing for different things. I don't frown upon using integers (well, longs) for identifiers, but I personally prefer UUIDs.


A securely generated 128 bit UUID isn't security-by-obscurity, but rather security-by-cryptography. It's still bad not to have authorization checks, because UUIDs can "leak" into logs, browser histories, emails, and things like that. But the security benefit of using crypto-random IDs is neither cosmetic nor superficial.

Most applications don't use UUIDs and many of them are fine and I definitely wouldn't ding an app for using monotonic IDs, but I'm increasingly thinking that it's worth praising UUIDs more.
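
To put a number on "neither cosmetic nor superficial": a version 4 UUID carries 122 random bits, which is key-sized entropy. A quick back-of-the-envelope in Python (CPython's uuid4 draws from os.urandom):

  import uuid

  token = uuid.uuid4()
  # 122 random bits -> 2**122 ≈ 5.3e36 possible values. Even at a billion
  # guesses per second, the expected time to hit one target is around
  # 2**121 / 1e9 seconds, i.e. on the order of 1e20 years.
  print(token, token.version)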


If you know the (integer) identifier, and because the bad application isn't secured with authentication, you get access to something you're not supposed to. If you make the identifier a lot harder to know, and you still have no security, that smells like the obscurity part. I can absolutely see your point that the UUID identifiers are not just a lot harder to guess, they may be impossible to guess. But the security is still bad, and I don't think that the impossible to guess-property of the UUIDs should be a substitute for security. I don't think we really disagree, though.


That's not what people usually mean by "security by obscurity" when they critique the concept. Unfortunately the term is overloaded so it's lost its way over time.

To illustrate this for you, let me turn it around a bit. Is it security by obscurity if the only thing stopping someone from logging into your account is knowing your password?

Security by obscurity is when you (for example) roll your own cryptosystem and rely (in whole or part) on the secrecy of your new-fangled algorithm to save you. That is unsafe. But if you're saying high-entropy strings shouldn't be the only barrier to authentication, you're throwing out half a century of complexity theoretic cryptography.


Yeah, I think the understandable confusion comes from the idea that a UUID "obscures" the sequential identity of the id in the same way a password mask obscures a password, but the obscurity in security through obscurity refers to reliance on an attacker's ignorance of implementation details to secure the system rather than on a mechanism that is provably secure.


Another pretty direct comparison would be to 128-bit secret bearer tokens, on which a huge portion of the industry relies.


I think context matters here. If someone wants to hand out tokens, for instance via e-mail verification, I'm fine with relying on that being a UUID. When you make it harder (impossibly hard) to guess a "record number" by using UUIDs, which is what we were probably talking about, that's great too. (I already yielded that point.) But let's not lead the general population into thinking that UUIDs make everything safer (probably not what you were saying), because if something is "just an identifier" it may not be handled as safely, which is what this seems to be relying on in the context of security. Same as how user names were traditionally not handled as something secret or confidential. Sometimes UUIDs appear as just identifiers and are not handled with any secrecy, so they just can't always double as a security feature.


> Sometimes UUIDs appear as just identifiers and are not handled with any secrecy, so they just can't always double as a security feature.

I can see your point. If UUIDs are handled in such a way that they are discoverable by anyone, they are not enough to make the references secure.

I think the point tptacek and others are making is that this is an instance of the defence-in-depth principle, though. In scenarios where UUIDs are not simply discoverable, using UUIDs is inherently more secure than using a monotonic ID, simply because the monotonic ID can be easily guessed. Yet they are still not enough in isolation, and you should additionally be using proper access control (due to eventual leakage of particular UUIDs in emails and such).


But UUIDs do, in fact, make things safer.


On average: yes. Always: no.


Always no? What's a situation in which you'd be better off with monotonic ids?


I never said they were less secure. I said there are situations where they're not really more secure.

If I can see in this HTML page that your reply is /reply?id=12345, then it doesn't matter if Hacker News uses integers or UUIDs, if there's a bug in /edit?id=12345 that just lets me edit it without the appropriate security. If we say that UUIDs always make everything inherently more secure, we're doing everyone a disservice.

Now, the original discussion was about (1) discovering for read, and not about (2) escalating a read to a write. But if anyone reading this mistakenly takes from it that UUIDs are the way to solve these problems then they will go on optimizing for (1) at the expense of (2).


"Security by obscurity is no form of security."

That's been bouncing around at least since the time I noticed it on /. Which was a couple of decades ago.


Note most databases use type 1 UUIDs by default, not randomly generated type 4 UUIDs. There are tons of security holes out there because people are using type 1 UUIDs thinking they can be used as secure tokens.
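
Easy to demonstrate: a version 1 UUID embeds a timestamp and node identifier (often the MAC address), both recoverable from the value itself, whereas version 4 is random. A small Python check:

  import uuid

  v1 = uuid.uuid1()
  print(v1.version, hex(v1.time), hex(v1.node))  # timestamp and node leak out
  v4 = uuid.uuid4()
  print(v4.version)   # 4: 122 random bits, nothing to recover

So a few observed v1 UUIDs narrow the search space enormously for an attacker.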


> For instance, it's convenient to generate identifiers in a decentralized manner.

For an elegant solution to this problem, check out Twitter's Snowflake[0].

[0] https://blog.twitter.com/engineering/en_us/a/2010/announcing...


I always wondered why databases have not implemented a scheme like Microsoft's Active Directory RID master FSMO role. One server is responsible for handing out chunks of IDs to each server. They request a new block whenever a threshold is reached (50% by default, IIRC).
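
That scheme is essentially the classic hi/lo allocator, and it is easy to sketch. This toy version leases a fixed-size block only on exhaustion rather than prefetching at a threshold, and the central allocator is a stand-in:

  class BlockIdAllocator:
      """Hands out IDs locally from blocks leased from a central allocator."""

      def __init__(self, lease_block, block_size=1000):
          self._lease = lease_block   # callable returning the start of a fresh block
          self._size = block_size
          self._next = self._end = 0  # empty: forces a lease on first use

      def next_id(self):
          if self._next >= self._end:            # block exhausted: lease another
              self._next = self._lease(self._size)
              self._end = self._next + self._size
          nid = self._next
          self._next += 1
          return nid

  blocks = iter(range(0, 10**9, 1000))           # stand-in for the central server
  alloc = BlockIdAllocator(lambda size: next(blocks))
  print(alloc.next_id(), alloc.next_id())        # 0 1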


Some coordination there courtesy of Zookeeper.


I don't think it's really fair to call it security by obscurity. The UUIDs have far more entropy than 99.9% of user passwords protecting them.


Not if they are type 1 UUIDs, which is the default on MySQL.


From https://littlemaninmyhead.wordpress.com/2015/11/22/cautionar...

> Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example.

HN discussion: https://news.ycombinator.com/item?id=10631806


Sure, security is about taking a layered approach. I don't think anyone would seriously advocate using knowledge of a UUID as enough authorisation on its own. Well, I hope not :)


I find UUIDs very useful for this reason: the IDs can be generated by different parts of a distributed system and be "guaranteed" to be unique.

In this kind of system you can also generate deterministic UUIDs, which are useful for idempotency (e.g. the same event can be recognised as a duplicate).
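
Name-based (version 5) UUIDs give exactly that determinism: the same namespace and name always hash to the same UUID. A sketch (the namespace URL is made up):

  import uuid

  # Hypothetical namespace for payment events; same input, same UUID,
  # so a replayed event is recognisable as a duplicate.
  NS_PAYMENTS = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/payments")
  event_id = uuid.uuid5(NS_PAYMENTS, "order-1234:capture")
  assert event_id == uuid.uuid5(NS_PAYMENTS, "order-1234:capture")

Note that deterministic UUIDs are guessable by construction, so per the rest of this thread they are identifiers, never secrets.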


Sure, that's fine. The context of my point about IDs is user-facing APIs. Note that user-facing really means "publicly accessible", even in the case of private APIs. As I mentioned elsewhere, market research groups will be happy to extrapolate as many metrics as they can from your API's integer object IDs.

That being said I'm a little surprised to hear about the complexity. Are you able to share which DB/stack you were using? This functionality should be natively supported at two distinct abstractions: your programming language and your database.


In that case, C#/EF/SQL Server is what that app was made in. This was like 6 years ago, admittedly, but it didn't feel as if it was really treated as a first-class citizen. Everything's in ints in example code, you have to fight the auto-code generators a bit, etc. So in my experience it's never anywhere near as seamless as the int support.

But it's not just the support that's the problem. You're testing, you need to switch category, and you can't just change a 1 to a 2; you have to go find what random UUID that category was assigned. You can't just go into the DB and add a new row; you have to open a UUID generator. You can't just quickly add a foreign key relationship; you have to look up the UUID. And a ton of other little annoyances.

Actually, categories are an excellent example of something that shouldn't be a UUID; they're actually supposed to be discoverable.

I think my present project has UUIDs on the user, company, invoice and payments tables, but still ints as the primary key. Everything else isn't worth it. There's a merchant table, but again, those are all supposed to be discoverable (and aren't editable by the merchants themselves).

I also generally implement controller level security that checks access to the root object being returned by default, so I can't really make a mistake exposing an unauthorised object. There's an occasional controller where I've made a conscious decision not to implement that level, generally actions that allow both authenticated and unauthenticated users (e.g. viewing merchants or categories).


You can generate uuids that play nicer with database storage / indexing. NEWSEQUENTIALID() in MSSQL, for example.

The keys will again be easier to guess, but if all you have to do is guess a primary key to get access to the underlying data, something else isn't right anyway.


I think this gets to the crux of the issue.

It's not about using hard-to-guess UUIDs[0], but restricting access to the underlying data[1].

[0] https://en.m.wikipedia.org/wiki/Security_through_obscurity

[1] https://en.m.wikipedia.org/wiki/Access_control


It's not really security through obscurity. In this case I understand the IDs were related to data that the company was making available to users through email links. A cryptographically secure 128-bit UUID is impossible to guess, no more guessable than a cryptographic access token. Now of course, you would probably rather have an authentication scheme on top of that, but that comes at a support cost in terms of customers losing their passwords, locking themselves out of their accounts, etc. And it is not clear you have increased security, as people re-use passwords.

Then of course there is the issue that email is for the most part un-encrypted (or encrypted without validating certificates).


It's still an access control issue in that case. The user should never be aware of the UUIDs. Only the backend should deal with them. If you have a _public_ API that deals with UUIDs, therein lies the issue.

And a side note: I wouldn't trust that the PRNG behind your UUIDs is cryptographically secure. That's not a part of the spec.


I know, but as they're easier to guess, what's the point? Might as well just go back to ints.


Is MSSQL's NEWSEQUENTIALID secure? I didn't think it was.


Is the point of non-monotonic ID schemes to make them *secure* secure?

I thought they were a bit of a hack to raise the bar a touch, in which case the cryptographic security properties of the generating function aren't interesting; the ergonomics are.


No, the cryptographic security of the identifier matters a lot. A GUID generated from an insecure PRNG can be used to predict other GUIDs. A UUID generated from 16 bytes of /dev/urandom can't be used to get anything but the object to which it refers.


Yeah, there's nothing wrong with using sequential integer IDs in the database. But objects should be assigned random unique IDs as well, which is how they are referenced by and presented to the outside world. The random ID is what is presented to the frontend/user. I'm not sure what issue you had with generating random integers for primary keys; it seems like that should work fine. Is it because the index has to be rebuilt when a value is inserted into the middle of the ordered sequence?


One path is to use sequential ints internally and encrypt them externally with something like idgen:

https://pypi.org/project/idgen/

That provides IDs that are both opaque and, if you want, user-friendly.

(disclaimer: I wrote it.)
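
I haven't verified idgen's actual API, but the underlying idea can be sketched with any keyed 128-bit block cipher: encrypt the internal integer to get an opaque external ID, decrypt to map it back. A rough illustration using the cryptography package (ECB is acceptable here only because every plaintext block is a distinct integer):

  import os
  from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

  KEY = os.urandom(16)  # in reality: a fixed secret from your key management

  def to_external(int_id: int) -> str:
      block = int_id.to_bytes(16, "big")        # one AES block, no padding needed
      enc = Cipher(algorithms.AES(KEY), modes.ECB()).encryptor()
      return (enc.update(block) + enc.finalize()).hex()

  def to_internal(external: str) -> int:
      dec = Cipher(algorithms.AES(KEY), modes.ECB()).decryptor()
      block = dec.update(bytes.fromhex(external)) + dec.finalize()
      return int.from_bytes(block, "big")

  assert to_internal(to_external(42)) == 42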


Something I realised looking at Google+ identifiers (21-digit numerics, 19 of those significant) was that they made brute-searching the user profile space infeasible. There were only 4 billion and change legitimate profiles, so there was roughly a 4-in-100-billion chance of hitting one on any given random probe of the space. And IDs appeared to be randomly distributed.

And yes, Google also posted a sitemaps file (or rather, 50,000 sitemap files) with all profile IDs. But that was last marked updated in March 2017, for some reason. Being able to validate that would have been nice.

But as a mitigation against blind bulk scrapes, it was a useful tool. I'd consider that one of G+'s good design elements.


If you want a URL that you can share only with your friends, then you have no choice. If you don't need that, then just use normal ACLs.


There is nothing wrong with using sequential ids in and of themselves.

The typical web app has the concept of a validated user session per request. How hard is it really to

  Select ... From Documents where documentid = ? and userid = ?

So even if the user does a

  GET /Document/{id+1}
No documents would be returned.

Every web framework that I am aware of lets you add one piece of middleware that validates a user session and won't even route the request if the user isn't validated.
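
A minimal sketch of that shape in Flask (the session store and schema are illustrative stand-ins, not any particular framework's built-ins):

  import sqlite3
  from flask import Flask, abort, g, request

  app = Flask(__name__)
  db = sqlite3.connect(":memory:", check_same_thread=False)
  db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)")
  SESSIONS = {"demo-cookie": 42}                # stand-in session store

  @app.before_request
  def require_session():
      user_id = SESSIONS.get(request.cookies.get("session"))
      if user_id is None:
          abort(401)                            # unauthenticated requests never reach a handler
      g.user_id = user_id

  @app.route("/document/<int:doc_id>")
  def document(doc_id):
      # Scoped query: the document id AND the session's user must both match.
      row = db.execute("SELECT body FROM documents WHERE id = ? AND user_id = ?",
                       (doc_id, g.user_id)).fetchone()
      if row is None:
          abort(404)                            # absent or not yours: same answer
      return {"document": row[0]}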


No, nothing wrong with it intrinsically. But if UUIDs were used instead, the lack of authentication or authorization checks wouldn't be as catastrophic. That would be somewhat comparable to having a reset password token which doesn't expire. Still bad, but not as bad.

The other commenter's point about leaking information is also correct. In the finance industry one of the basic tricks to obtaining alternative data is to scrape it from private APIs which expose sequential IDs corresponding to a source of revenue. For example, a publicly traded car company might have its revenue extrapolated from an open API which sequentially increments an ID every time a vehicle is sold. Research groups will reverse engineer mobile apps from companies with only one or two dimensions of revenue, find the private API endpoints (reversing request signing as needed), and then look for object IDs which can be thrown into a timeseries on a quarterly basis.

Generally speaking the risk and compliance department of a hedge fund disallows this kind of data if it's gathered from an actual security vulnerability (e.g. leaks PII). It needs to be "only" a neutral information side channel without sensitive data, so that doesn't really apply in this specific scenario. But it does apply for people considering using integer IDs for user-facing APIs.


Having done a few assessments in the last year where I was forced to downgrade sev:hi findings because nobody is realistically going to guess a 128 bit random number, I have to grudgingly acknowledge that UUID object keys are a meaningful security improvement. Which I hate to admit, because I'm generally of the opinion that "defense in depth" is a design cop-out, and here's a pretty potent counterexample.


I agree with you. Let me emphasize this explicitly: the real failure here is the utter lack of authn and authz. But it is meaningful that the integer IDs are being used.


One reason I <3 HN is that complex scenarios like this get described so clearly and succinctly.

I couldn't say it better myself when I'm speaking to management that makes these kinds of decisions. Now I can quote throwawaymath verbatim to drive the detailed point home.

Thanks!


> I agree with you. Let me emphasize this explicitly: the real failure here is the utter lack of authn and authz.

Bingo.


Nice. This reminds me of the German tank problem in WWII, where the Allies used samples of serial numbers from captured Nazi tanks to estimate their population. The tanks and their parts used sequential serial numbers. It could be used to estimate production rates too, I guess.

The idea pre-dates web APIs by many decades :-)

See https://en.m.wikipedia.org/wiki/German_tank_problem
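
For the curious, the standard frequentist estimator is N_hat = m * (1 + 1/k) - 1, where m is the largest serial observed and k is the sample size. A quick check in Python:

  def estimate_population(serials):
      # Minimum-variance unbiased estimator for the German tank problem.
      m, k = max(serials), len(serials)
      return m * (1 + 1 / k) - 1

  print(estimate_population([61, 19, 56, 24, 16]))  # 72.2 for this sample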


That's how you can get a self-referencing tweet as well. https://twitter.com/spoonhenge/status/2878871344


Maybe not "wrong", but there are some very obvious downsides to exposing sequential IDs vs a randomized token:

- It exposes the count you have of a particular item

- It exposes your growth rate of those items

- If a developer accidentally breaks your authentication (or somebody hacks it), it becomes trivially easy to download all your items very quickly

And it isn't like using a randomized token is hard. In the most common implementation, it is just one additional column that gets filled with a random string and an index on the column.


In that simple scenario, what are some ways that a hacker could break your front-end API to make it serve requests for multiple users without having access to multiple account logins? I understand that they could possibly get access to your database, but that's a different threat.

If they could somehow change your code, all hope is already lost.

But I do agree that it allows someone to determine rate of growth, which would be valuable more from a business-intelligence angle than as a privacy violation.

The larger issue is that a developer forgets to add the “and userid = ?”

I guess the workaround for that is to have a database that ties user authentication to records in the table/object store directly, like DynamoDB or S3.


In my experience, many tables don't have a userid column directly associating rows with the user. It would be a table join or two or three away.

So the developer may think it is safe to write:

  SELECT value FROM stock_position
  LEFT JOIN account ON account.id = stock_position.account_id
  LEFT JOIN user_accounts ON user_accounts.account_id = account.id
  LEFT JOIN users ON users.id = user_accounts.user_id
  WHERE users.id = :session_user_id

Safe, right? We checked the userid. But then, clicking on the position to drill in on the position data, they just

  SELECT * FROM stock_position WHERE stock_position.id = :stock_id

There's no "AND stock_position.userid" on that table, and the developer might be too lazy to spin up the entire join again, especially if they don't need account data for this view. Whoops: suddenly a vulnerable page query.

I imagine there are other ways to screw up. Like insecure cookies: just check cookie.userid, ah yes, you're the right user. Whoops, didn't realize cookies could be spoofed.


If the cookie is spoofed and someone has another client's authorization token, then they would get any documents that user was authorized to see anyway.

But you don’t do cookie.userid.

You send the username and password to an authentication service, which generates a token with a checksum. The token, along with the username and permissions, is cached in something like Redis.

On each request, middleware gets the user information back using the token.


I'm familiar with that process. I was trying to paint a picture of how a poor developer might stumble their way into this situation. It's technically possible to store the userid in the cookie rather than using JWTs, but obviously it's not secure in the slightest.


(It's apparent that my initial reply didn't resonate, so I've made substantial edits to my reply for clarity's sake. If you've read it once, give it another read; it's from the angle of an organization with much in the way of legacy impairment.)

> Yet another security vulnerability caused by...

I mean, yes, but these are also some of the easiest vulnerabilities to miss even with out-of-the-box static analysis (code scanning and data analysis), automated dynamic analysis (pentests [edit to clarify for tptacek: automated pentests]), and a basic code review process. They're usually identified in live environments during manual penetration tests or, in more security-mature environments, with custom static analysis checks and custom linting rules.

As for best-case prevention: accomplished generally architecturally, e.g. language/framework decisions that enforce secure coding practices by design, or implementing certain patterns in development which whisks away some of the more risky coding decisions from engineers who may not be qualified to be making them, such as mandating authn/z and limiting exceptions only to roles and change processes qualified to make them. Checks including linting for specific privacy defects (direct object referencing using sensitive data or iterative identifiers as opposed to hashes/guids/etc) can help with catching them during development, and as you might've guessed, such checks tend to be custom for a given environment rather than out of the box.

I distinctly recall a card issuer whose name starts with a C in the United States having an http endpoint which allowed for enumerating account details by iterating full PANs (16 digit card numbers)... around a decade ago. Here we are today, and you're seeing the same bugs continue to arise.

Mitigation options in organizations with immature security practices typically rule out remediation, simply because the defects' existence might not be known. Practices traditionally reserved for defense in depth may need to be relied upon instead (think monitoring web requests for anomalous behaviors and blocking traffic when detected), rather than trusting that one can fix all the defects; and even then you'll still lose a few records... but that might be the only solution available to you as a CTO, CIO, or CISO, simply because of resource constraints and bureaucracy in an entrenched org, e.g. in the financial or insurance space.

--

tl;dr: these defects are among the harder ones to catch in legacy applications, especially in environments with weaker security postures, and they're as old as time. What I'm saying is that as much as we can call companies out for making these mistakes in hindsight, their existence in larger legacy systems is to some extent inevitable and must be managed in other ways.


There are no effective static source code security analyzers. Static analyzers aren't a bad thing to add to a CI pipeline, because why not, but anyone depending on static analysis is playing to lose.

This is absolutely not the kind of vulnerability that pentests tend to miss; rather, they're the first thing pentesters check for. You can miss bugs like this when they're in obscure backend features and your client or team didn't document the project adequately --- though you still shouldn't, and that's part of the point of getting an assessment, to find stuff like that --- but you generally don't miss them in an assessment where the bug is literally "edit a number in a URL".

Web scanning tools will miss findings like this. But, regarding web scanners: see static source code security analyzers.

As for code review: a competently constructed application shouldn't be relying on developers to catch every possible instance where numeric ids are used individually. In modern web frameworks, it should be obvious when you're looking an ID up without doing an authorization check; for instance, in a Rails or Django app, you can simply regex for lookups coming off the ORM class rather than the appropriate association instance.
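
A crude version of that regex pass for a Django codebase (illustrative and noisy; it flags lookups made off a bare model manager, which is where unscoped access tends to hide):

  import pathlib, re, sys

  # Flags e.g. `Document.objects.get(pk=doc_id)`; a scoped lookup such as
  # `request.user.documents.get(pk=doc_id)` will not match.
  UNSCOPED = re.compile(r"\b[A-Z]\w*\.objects\.(?:get|filter)\(")

  for path in pathlib.Path(sys.argv[1]).rglob("*.py"):
      for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
          if UNSCOPED.search(line):
              print(f"{path}:{lineno}: unscoped ORM lookup? {line.strip()}")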

In sum: I dispute much of this analysis.

People do miss things, even when they're things they shouldn't miss. Put 3 different test teams on the same application and you will get 3 overlapping but distinctive sets of vulnerabilities back. But this is not an instance of the kind of vulnerability that is hard to catch.

see below


> This is absolutely not the kind of vulnerability that pentests tend to miss

You're right; they don't. Which is why I called out automated dynamic analysis. I.e. the web scanning tools which you subsequently mentioned:

> Web scanning tools will miss findings like this.

---

> As for code review: a competently constructed application shouldn't be relying on developers to catch every possible instance where numeric ids are used individually. In modern web frameworks, it should be obvious when you're looking an ID up without doing an authorization check; for instance, in a Rails or Django app, you can simply regex for lookups coming off the ORM class rather than the appropriate association instance.

Right, which I also stated:

> As for best-case prevention: accomplished generally architecturally, e.g. language/framework decisions that enforce secure coding practices by design, or implementing certain patterns in development which whisks away some of the more risky coding decisions from engineers who may not be qualified to be making them, such as mandating authn/z and limiting exceptions only to roles and change processes qualified to make them. Checks including linting for specific privacy defects (direct object referencing using sensitive data or iterative identifiers as opposed to hashes/guids/etc) can help with catching them during development, and as you might've guessed, such checks tend to be custom for a given environment rather than out of the box.


I'll amend my previous comment to say that I only dispute much of the analysis, not "the whole" analysis.

A sibling comment makes the obvious point that no pre-auth endpoint should be touching this kind of data to begin with, which is another layer of "stuff you can just regex for".


> I'll amend my previous comment to say that I only dispute much of the analysis, not "the whole" analysis.

That's fine, but I'd appreciate it if you just read the entire analysis next time. It shows that you respect the time people invest into constructing and presenting guidance, even if you don't necessarily respect the guidance itself.

---

Editing mine to match your edit... as if to make my point about reading the analysis in its entirety:

> A sibling comment makes the obvious point that no pre-auth endpoint should be touching this kind of data to begin with, which is another layer of "stuff you can just regex for".

Correct, something which I'd also stated:

> Checks including linting for specific privacy defects (direct object referencing using sensitive data or iterative identifiers as opposed to hashes/guids/etc) can help with catching them during development, and as you might've guessed, such checks tend to be custom for a given environment rather than out of the box.


Yeah, no, I think you got this wrong, but more than that I was motivated to comment by the implication you made that these were "easy to miss" vulnerabilities because bullshit security tools that don't work miss them. I don't so much care whether you're right or wrong, but I do want to take every opportunity I can get to disabuse people about the effectiveness of scanners.


> "easy to miss" vulnerabilities because bullshit security tools that don't work miss them

> I do want to take every opportunity I can get to disabuse people about the effectiveness of scanners.

This entire exchange is frustrating because it's exactly what I said in my root comment:

> these are also some of the easiest vulnerabilities to miss even with out-of-the-box static analysis (code scanning and data analysis), automated dynamic analysis (pentests [edit to clarify for tptacek: automated pentests]), and a basic code review process.

[...]

> Checks including linting for specific privacy defects (direct object referencing using sensitive data or iterative identifiers as opposed to hashes/guids/etc) can help with catching them during development, and as you might've guessed, such checks tend to be custom for a given environment rather than out of the box.

---

I'm going to step away from my keyboard a bit; please forgive me.


You "stepped away from the keyboard", and then edited your comment. I read what you wrote differently than you appear to have intended. It is fine if we simply disagree about this. If you think scanners suck too, we might just not have anything worth arguing about.


> I read what you wrote differently than you appear to have intended.

I really appreciate this, as it at least concludes that a miscommunication took place; thank you. I'll accept that there's likely a bit too much flourish in what I write for the sake of targeting nuanced clarity.

> If you think scanners suck too, we might just not have anything worth arguing about.

Largely yes, but I do think they have their place. I view them more as platforms to build upon or add to (e.g. custom data rules or enforcing the use of specific best practices) than as generalized security salves, but as you'd pointed out, many of those objectives can also be achieved through much simpler means, e.g. just grepping the code for things as a commit test.


By the sounds of it, another breach from a well-known, not-new web application security vulnerability: "Insecure Direct Object Reference".

That vuln has been an explicit part of the OWASP Top 10 since 2007...

Unlike other common web app vulns (e.g. XSS, SQLi), IDOR usually can't be fixed by a development framework (e.g. ASP.NET or Rails); it needs app-specific coding for proper authentication/authorization checks.
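
For example, a Django-flavoured sketch of the per-object check that IDOR-vulnerable handlers skip (Document and serve_file are hypothetical stand-ins for your app's model and response helper):

  from django.core.exceptions import PermissionDenied
  from django.shortcuts import get_object_or_404

  def download_document(request, doc_id):
      doc = get_object_or_404(Document, pk=doc_id)
      # The app-specific authorization check: mere knowledge of doc_id
      # must not be enough, whatever form the id takes.
      if doc.owner_id != request.user.id:
          raise PermissionDenied
      return serve_file(doc)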


> He said anyone who knew the URL for a valid document at the Web site could view other documents just by modifying a single digit in the link.

Good thing he didn't post this bug online after getting no response. I remember reading about someone who did that with an AT&T website a while back and was sent to jail for simply incrementing an ID number in the URL and talking about it on Twitter.


That was probably weev, and they were after him long before that case, so it's not likely that it would get some random person (that the FBI doesn't have a file on and an interest in picking up) in the same trouble.


That is an incredibly low friction interface to our documents. /s

What are the odds they have access logs going back to 2003?


Pretty good, everything was probably set up and configured with default settings by that unpaid intern they had running their infrastructure back in 2003.


Default settings would wipe/rotate logs after some time, no?


Rotate maybe, not wipe. As far as I'm aware, for most webservers you have to tell them to rotate the logs based on some criteria; otherwise they just keep appending.

If debugging is turned off, it's entirely possible that they have been appending lines to the same log file for the last 20 years and haven't run out of disk space (which is what would cause them to notice). Say 200 bytes in the log per request; even averaging 10,000 requests per day (probably more than they get), in 20 years that's only about 13 GB.
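
Spelling out that arithmetic with the same assumed numbers:

  bytes_per_request = 200
  requests_per_day = 10_000
  days = 365 * 20
  total_bytes = bytes_per_request * requests_per_day * days
  print(total_bytes / 2**30)  # ≈ 13.6 GiB: trivial by modern disk standards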

It's also entirely possible they turned logging off or redirected to /dev/null in order to "be more efficient".


The organisation should have a data retention policy.


I just closed on my first house this week, and First American was of course my title company. I'll be interested to see if my data is included in this breach settlement or not.

I did notice when I was reviewing my docs that they emailed links to unauthenticated copies of docs, but they were mostly public records so I didn't think twice about it.

So they have my Name, address, email, SSN, copy of ID, copy of check from my bank with account/routing on it and much more, all in the open apparently.

I just went through an SSO implementation with a small team for a large user base. It was a bigger project than we had anticipated, but nonetheless manageable. I can't fathom that a financial institution of that scale could be that lax with basic security. Wouldn't their systems be subject to some regulation and require some kind of audit on a regular basis? Is this a failure of auditing systems, as well as internal security or even basic IT?


The programmers' fault? The auditors' fault? Security's fault? The pentesters' fault? IT's fault?

Listen: until the C-level funds these programs properly and security is taken seriously by all, issues like this will forever be in the news.

I would be willing to bet their security team, like most, has a long list of security gaps they can't get fixed because of resource issues; just hope they documented it, or it could fall on them.

Most coding classes just teach how to make things work in Mister Rogers' world. Secure coding is an elective! Most shops run the DevOps model instead of SecDevOps and only involve security after a thing is ready to go into production, no matter what flaws security finds.

Why are black-box pentests still taking place? Because companies are required to have a pentest but really do not want testers to find things. Their goal is not to improve security, but rather to check that box: "we had a pentest."

C-level: this keep-the-lights-on budget you give Security/IT is costing you more than properly funding us would! Oh yeah, you put that $ into cyber insurance instead! Lol, let's see how well that works.


If the financial penalty were high enough, they would increase budgets. There is no accountability for losing customers' personal information. If you could make a strong business case around the average risk a company takes on, it would help this discussion. For each example of "company X had a major financial impact", you need to average it out against "company Y lost hundreds of millions of SSNs and had zero penalty".



I see a lot of comments on sequential IDs as the issue. Is that really the issue?

Not the fact that John Doe can get to John Doe 2's stuff without authenticating? WTF.

Sequential or not, if there's no auth I can run a scanner and get it all, so what does the ID scheme have to do with the price of tea in China?


I like how this news was posted on Friday afternoon before the Memorial Day weekend.


Taking out the trash, as it were


Where does one go to learn how to not cause this one day?


OWASP Top 10 list. OWASP's website is kind of a mess in my opinion, but there are numerous external write-ups about the top vulnerability types.

https://www.cloudflare.com/learning/security/threats/owasp-t...

Also this github repo maintained by OWASP seems pretty exhaustive. The cheatsheets directory has a lot of different vulnerability classes.

https://github.com/OWASP/CheatSheetSeries/blob/master/cheats...

This "Insecure Direct Object Reference" was recently combined into the "Broken Access Control" category with a few others.


Thanks!


Lol, everyone fights security, and it is way underfunded, so it can only get like 1 out of 100 risks fixed, but it must be security's fault.


"First American has learned of a design defect in an application that made possible unauthorized access to customer data. At First American, security, privacy and confidentiality are of the highest priority and we are committed to protecting our customers’ information..."

Who is coming up with these statements?

If you kept royally screwing up something for years that you claimed to be your "highest priority", then what can one expect from your normal lines of business?


>At First American, security, privacy and confidentiality are of the highest priority and we are committed to protecting our customers’ information.

is such a meme.

Things will continue this way until there are serious repercussions for entities carelessly handling data.


The next sentence too:

> We are currently evaluating what effect, if any, this had on the security of customer information.

It's downright dishonest to even say "if any": they were presented with concrete examples of leaking customer information; they don't get to wonder whether it had an effect on their security anymore.


Yeah, considering that they have sent these URLs to at least tens of thousands of people, it would be hard to believe that nobody ever inappropriately accessed a document that didn't belong to them.


There was a repercussion a couple of days ago actually for Equifax. But yeah, maybe not serious enough to matter that much, I'm not sure what the outcome will be.

https://www.msn.com/en-us/finance/markets/moodys-cuts-equifa...


At some point people will realise that holding large quantities of sensitive information is a liability, not an asset. Mindsets are slowly changing in this direction already.

The chickens will continue to come home to roost until people treat digital security as seriously as physical security.


While everyone here says "Oh, that's terrible!", the market says "Oh, that's terr-SQUIRREL" and then forgets it ever happened. Additionally, no appropriate fines have been levied nor jail time handed out for this sort of thing; right now the sane approach (money-wise) is just to occasionally have a breach and offer up an apology.


That's what everyone (including myself) said after Equifax but just this week their credit rating was downgraded by Moody's:

https://www.darkreading.com/attacks-breaches/moodys-downgrad...


Equifax should have been fined billions not just had a credit downgrade.


Big enough credit downgrades can hurt.

Still not enough.


The market shouldn’t be the only entity responsible for punishing this kind of massive damage.


It shouldn’t be, but it currently is, and I have seen zero signs of any change in direction.


> The chickens will continue to come home to roost until people treat digital security as seriously as physical security

Do people take physical security seriously? It doesn't seem like it.

Anyway, when I was an undergrad in the 1990s and took a computer security class our professor (Gene Spafford) talked about security being primarily an economic question. And that is generally how security, both physical and digital, has been treated since forever. And how it will always be.

The economic and physical damage caused by poor digital security is a rounding error compared to everything that happens in the real world.

As long as you understand that the following link is at least partly tongue-in-cheek, you may find this to be an entertaining read:

Cybersecurity is not very important http://www.dtc.umn.edu/~odlyzko/doc/cyberinsecurity.pdf


Very few people take workplace physical security seriously outside of a few industries like prisons. Complete strangers have physically unlocked and opened the door for me to enter "secured" areas because they assumed that since I was walking in that direction I must be authorized to enter.


Only if there are laws making it a liability, since investors don't seem to care much in the long term.


Laws won’t change this; they will only make it a compliance burden, and the same mistakes will continue to happen. Plus, standards and practices move way faster than laws.

What investors should be concerned about is the reputational risk and loss of business to competitors that are able to demonstrate more transparent and secure practices.

Maybe laws for monopolies, but not for competitive markets where consumers have the choice to shop around.

For a competing business these dumps are a powerful marketing tool. It’s a direct client list. They just have to be able to show that their security is better.

Laws would make things so much worse for everyone. The key is to keep hacking away at all systems. Break things apart and build them back together. And win customers by showing that you can!


Explain how your proposal would fix Equifax


I wonder if we should take mandatory breach reporting a step further too and require them to list all security vendor products and services that were in place at the time of the breach.

Should security solution vendors be held to account for failing to live up to the bold claims they make?


Depends on how they sold it. Did they sell tools, or tools plus configuration services and consulting?


It may not be workable, but when big businesses have invested millions in tools and services I can't help feeling there should be some vendor accountability.


That would be unfair, as the efficacy of most products depends on how they are configured, monitored and maintained.

For example, if I install an application whitelisting system, but whitelist too much, pay no attention to logs and alerts, or never patch it, then that's not really the vendor's fault.
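As a hedged illustration of that point (paths and rules entirely made up), a single overly broad rule quietly defeats the tool, and that's configuration, not the vendor:

    # Hypothetical allowlist check: the tool works; the rule is the problem.
    import fnmatch

    SANE_RULES = ["C:/Program Files/*"]  # allow only installed software
    TOO_BROAD = ["C:/*"]                 # "whitelist too much"

    def allowed(path, rules):
        return any(fnmatch.fnmatch(path, rule) for rule in rules)

    print(allowed("C:/Users/mallory/dropper.exe", SANE_RULES))  # False
    print(allowed("C:/Users/mallory/dropper.exe", TOO_BROAD))   # True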


> At some point people will realise that holding large quantities of sensitive information is a liability, not an asset.

That's my line :):

"It forces you to think about data as a liability, rather than an asset and that particular mindset is a good one to have when you are dealing with end user data."

https://jacquesmattheij.com/gdpr-hysteria-part-ii-nuts-and-b...

It stood the test of time rather well. Now we see a US push for a similar law and articles such as this one hopefully will cause that to arrive sooner rather than later.


This company is dealing in financial transaction data. Someone needs to hold it, and it can be deleted (especially when someone asks for it). I don't see how this particular situation advances your position.


For one, they could split it into 'hot' data and 'cold' data: the cold set needs to be retained for legal and compliance reasons but does not necessarily need to be part of the live set. That strategy alone would seriously limit the impact of a lot of these breaches.
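A minimal sketch of that split, assuming hypothetical table names and a 90-day 'live' window picked out of thin air: anything older moves out of the internet-facing store into a restricted archive.

    import sqlite3
    from datetime import datetime, timedelta

    HOT_WINDOW = timedelta(days=90)    # assumption: only recent deals are live

    hot = sqlite3.connect("hot.db")    # reachable by the web application
    cold = sqlite3.connect("cold.db")  # stand-in for a restricted archive
    for db in (hot, cold):
        db.execute("CREATE TABLE IF NOT EXISTS documents "
                   "(id INTEGER PRIMARY KEY, closed_at TEXT, body BLOB)")

    # Assumes closed_at is stored as an ISO-8601 string, so string
    # comparison matches chronological order.
    cutoff = (datetime.utcnow() - HOT_WINDOW).isoformat()
    stale = hot.execute("SELECT id, closed_at, body FROM documents "
                        "WHERE closed_at < ?", (cutoff,)).fetchall()
    cold.executemany("INSERT OR IGNORE INTO documents VALUES (?, ?, ?)", stale)
    cold.commit()
    hot.execute("DELETE FROM documents WHERE closed_at < ?", (cutoff,))
    hot.commit()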


I hadn't actually read your GDPR series, thanks!

Absolutely agree, and to further it I think this data liability goes beyond PII. Any data which could be used nefariously if publicly available is a potential liability if leaked - NDA'd documents, product roadmaps, source code of closed source software, private keys, pre-results earnings, the list is enormous.

With the shift in the economy from physical goods to IP I don't see why laws for physical goods storage, warehousing and safekeeping (eg. safety deposit boxes) won't be updated to include the digital equivalents in the not too distant future. And at that point I wouldn't want to be a Dropbox, EC2 or DigitalOcean unless I was very very sure of my security systems, never mind being a Facebook or Google.


Having a good definition of the data life-cycle is a very important step. A lot of companies do the C, R and U of CRUD but forget about the D because they feel that more data is more value. As you correctly infer, at some point the value of the data no longer outweighs the liability and it should be deleted; long before that it should probably be moved to a much harder-to-reach system that holds historical data.
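And the D itself can be as mundane as a scheduled purge once the retention period lapses. A sketch, where the seven-year period and schema are placeholders for whatever counsel actually requires:

    import sqlite3
    from datetime import datetime, timedelta

    RETENTION = timedelta(days=365 * 7)  # placeholder retention period

    archive = sqlite3.connect("cold.db")
    archive.execute("CREATE TABLE IF NOT EXISTS documents "
                    "(id INTEGER PRIMARY KEY, closed_at TEXT, body BLOB)")
    cutoff = (datetime.utcnow() - RETENTION).isoformat()
    cur = archive.execute("DELETE FROM documents WHERE closed_at < ?",
                          (cutoff,))
    archive.commit()
    print("purged", cur.rowcount, "expired records")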


Unless they get punished or lose all their customers how is it a liability?


It's not.


One must hold large quantities of sensitive data forever to operate in the title insurance and settlement services business.


Do you think digital security will ever be possible?


It seems that the stock price (under the ticker FAF) hasn't suffered very much. This was revealed on 5/19, and the response has been tepid. There isn't likely going to be much backlash on the stock, unfortunately.


I think that's May 2019, not May 19th. From the bottom of the article: "This entry was posted on Friday, May 24th, 2019".


I don't think there will be; we're well into "breach fatigue" territory now, and here there's not even currently any evidence of malicious use.

Unless/until this breach results in a large financial hit to the company (possibly via a class action suit) I doubt it'll have any impact and I'm not even sure a class action suit could show damages without evidence of misuse.


Equifax's costs as a result of their breach have exceeded $1 billion now, and Moody's downgraded them a couple of days ago.

I suspect this is going to hit First American pretty hard.


Doubt it. There's no evidence yet that FA's leak was exploited by malicious hackers. The breach was discovered by some random person who changed a parameter in a URL. Very likely that no one else would've known about it.

Whereas the Equifax situation was intentionally breached by attackers and it can be assumed that the breach was used to capture information for later sale.

I suspect that First American knew about this earlier this week and intentionally did a garbage dump on a Friday evening going into Memorial Day weekend. Maybe it'll trade down a few tenths of a percent on Tuesday, and their CISO will probably get axed. Nothing to see here, move along.


Markets are closed; it's down slightly after hours, but after-hours volumes are weird.


The breach was revealed 6 minutes before I posted this. First on Twitter, then here.


Mainstream media haven't picked up the story yet. It only just broke.



