Office 365 global authentication outage (office365.com)
342 points by Jedd on Jan 29, 2019 | 154 comments




I love the top comment: "Pretty sure It's Office 361 at this point"


If that, been an enterprise customer with a company that is a gold partner and been using Office365 for 7 years now - it's down so often we stopped monitoring it, easily the most flaky, unreliable, slow and poorly managed online service I've personally used. (Words are my own, not necessarily those of my employer(s) etc...)


It takes them forever to fix stuff as well. For about 6 months it had “Reply Al” instead of “Reply All” on OWA. Trivial but it shows how crap their QA is.


They fired their testers. Now users, both paid and free, are the testers. Then they go the extra mile to ignore user/tester feedback.

https://www.bloomberg.com/news/articles/2018-08-23/microsoft...


And also it shows they don’t use their own products (the Reply All typo should shock the guy who owns the UI). The one software the dev team does use is Visual Studio, and it shows. It keeps evolving, not frozen in time in the 90s, the evolutions make sense, they are user centric, pretty much every version is a visible improvement over the previous version (except vs 2013). It’s basically the exact opposite of Office.

That theory breaks on Windows though. Is Microsoft a MacOS shop?


About Microsoft not eating their own dog food.

Nadella used an iphone secretly for a long time and then openly and then finally killed Windows Phone.


And it's also the only major Microsoft software that doesn't use a ribbon!


> “Reply Al”

Maybe they're just fans of Paul Simon[1]:

  If you'll be my bodyguard
  I can be your long lost pal
  I can call you Betty
  And Betty when you write me
  You can call me Al
[1]https://www.youtube.com/watch?v=uq-gYOrU8bA


Great reference :)


You still can't actually change your billing country in Office 365 without starting over with a new account. Even support can't do it.

https://office365.uservoice.com/forums/273493-office-365-adm...

It's a complete shitshow.


It's the same on Azure, so I created a new account to circumvent that issue and then they randomly banned me for doing that.


The same goes for Azure. Set up the wrong country / VAT / thing? Better migrate all your cloud services to a new account. No dropdown, no changes, no nothing.


The same goes for Google Checkout / Pay / ... or whatever it is called this quarter.


My company has a subscription and the search functionality mysteriously and randomly doesn't work at times, or is really slow... I have to guess which one it is _this_ time when it starts acting up...


I find the same issue. On several occasions I've done some wireshark / tcpdumps on Outlook (desktop) traffic and it was, shall we say, 'messy'. Additionally, we noticed that emails sent via Office365 at times go missing or just never arrive (with no NDR). A couple of us started inspecting the headers of Office365-sent emails and found them bouncing around internal Microsoft servers with broken certificates and NTP so out of sync that Exchange Online (or whatever component does this for their system) moved on to another server until it got to one that was working.
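For anyone who wants to repeat that kind of header inspection, here is a minimal sketch using only Python's standard library. It lists the `Received` hops of a message in travel order and the time gap between consecutive hops; a negative gap hints at clock skew between servers. The sample message, hostnames and timestamps are invented for illustration and are not taken from any real Office365 mail.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message: hostnames and timestamps are invented.
# Each relay prepends its own Received header, so the newest hop is on top.
RAW = """\
Received: from c.example.net by mx.example.org; Tue, 29 Jan 2019 21:20:05 +0000
Received: from b.example.net by c.example.net; Tue, 29 Jan 2019 21:14:40 +0000
Received: from a.example.net by b.example.net; Tue, 29 Jan 2019 21:15:02 +0000
From: sender@example.net
To: recipient@example.org
Subject: test

body
"""


def received_hops(raw_message):
    """Return the Received headers oldest-first (the order the mail travelled)."""
    msg = message_from_string(raw_message)
    return list(reversed(msg.get_all("Received", [])))


def hop_gaps_seconds(hops):
    """Seconds between consecutive hops, using the timestamp after the last ';'."""
    times = [parsedate_to_datetime(h.rsplit(";", 1)[1].strip()) for h in hops]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


hops = received_hops(RAW)
gaps = hop_gaps_seconds(hops)
for number, hop in enumerate(hops, 1):
    print(number, hop)
print(gaps)
```

In this made-up sample the second hop is stamped earlier than the first, so the first gap comes out negative. Real headers are often folded over several lines and vary by vendor, so treat the timestamp parsing as an assumption that holds for simple cases only.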


That's weird because I and all my clients use o365 and it works fine


It leaves very little room for interpretation ("works for me") when most of these reports have a KB/issue tracking number in the Microsoft dashboard - check the comments for issue numbers.


I find most of the time when Office365 (especially the mail or calendaring component aka Exchange online) is down - their status pages all say everything is fine, if you contact support no matter what you tell or provide to them they want you to reinstall Outlook.


9 in 7 months makes a pro-rated 15.4 days per year, so really only Office 350.
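The arithmetic behind that figure, as a quick sketch. It assumes the "9" means nine outage days and that each one counts as a full lost day, which the comment does not actually state.

```python
def prorated_outage_days(outage_days, months_observed):
    """Scale an outage count observed over some months up to a 12-month year."""
    return outage_days * 12 / months_observed


# Assumption: 9 outage days over 7 months, each counted as a whole day.
per_year = prorated_outage_days(9, 7)
print(round(per_year, 1))      # 15.4
print(round(365 - per_year))   # 350
```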


It's maintaining at least nine fives. Not five nines.


I see Office 360 written a lot, and I'm convinced it's not 50/50 sarcasm and ignorance, but we'll never nail down the ratio, similar to nature vs nurture in behavioral science.


Apparently it is, at most, Office 356.


24/7 Pizza called. They want their misleading hours of operation back.


24 days a month, 7 hours a day, what's your problem?


This is exactly how it works in italy.


http://www.oecdbetterlifeindex.org/countries/italy/

Less (immediate?) satisfaction but people seem to live longer.


I guess unemployment preserves.


This is HN! No joking! Read the guidelines! /s


Out of curiosity, does anyone know of an actual comprehensive list that is being maintained by an objective third party?


Some of them seem to be limited to certain geographic regions. (UK/Europe)


    0 0 1 * * kubectl patch


Update: This apparently affected all of Microsoft Cloud products - not just Office365 as I stated in the title. Source: https://twitter.com/AzureSupport/status/1090366788972404737

*Edit: I believe this includes Azure-AD, which I would assume would affect people using it for auth for non-Microsoft cloud products (perhaps even on-prem (if people do that?)).


can confirm, our SSO uses Azure. I'm blocked from my remote environments until the service is back up.


Cloud always was a single point of failure. That's why it failed in the past, and that's why it will fail when the first virus ignites a forest fire.

The truth is that users' needs weigh in little compared to DRM and control of pricing via shakedown methods.


A well run service run by a world-class team will provide global availability performance and service that small teams are incapable of.

However, in this case, I don't think the service is very well run, so there's downtime. Not really an issue with "the cloud", this is just an issue because of bad management and/or design of the system.


Github is still up.


They’re probably still running on their previous systems not Azure


Also confirmed, I was having problems with it all day, working in the Azure portal.


I wonder if this will be a wake up call to Microsoft to maybe allow offline use of things like OneNote. It's really unacceptable to have something that important so reliant on an online service.

(I doubt they'll change their tune. Three years of complaints about this and they still refuse to allow local notebook storage on Mac, so it seems unlikely that they'd change their minds now.)


For most of its existence, every single Email Outlook sent was unacceptable. As per RFC. It was so bad we had to ban Outlook users from mailing lists.

None of this has stopped businesses from sinking deeper into being dependent on MS products.

People will complain, crack jokes, and then sign up for more of that Microsoft.


What's the alternative? Outlook is an offline app that most employees (technical and non-technical) know well how to use. The only other alternative (for enterprise) seems to be IBM Notes and that one isn't necessarily better (and often less liked by employees in my experience).


I don’t understand why people don’t like thunderbird.

It’s not as nice as mutt but I feel like outlook users would be comfortable with it.


Outlook's integration of mail and calendaring (and to-dos) is very, very, very useful for business users. Thunderbird is only one piece of that.


Thunderbird has a calendar and to-dos as well (though I never used them personally, so I can't speak to their quality). Does Outlook do any special integration between them (outside of just having them in the same app)?


Yes. It uses SMTP as a messaging channel for calendar invite sharing - sending a meeting invite to someone (through the calendar) sends them an email, which their outlook client picks up and treats as a calendar invite. It also integrates with Active Directory.
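To make the "invite is just an email" idea concrete, here is a rough sketch of a meeting request carried as a `text/calendar` message body, built with Python's standard `email` package. This illustrates the generic iCalendar-over-email approach (RFC 5545 / RFC 6047), not how Outlook or Exchange does it internally. All addresses, the UID and the event details are hypothetical.

```python
from email.message import EmailMessage

# Hypothetical event: every address, UID and time below is invented.
ICS = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//example//invite-sketch//EN",
    "METHOD:REQUEST",
    "BEGIN:VEVENT",
    "UID:20190130T100000Z-0001@example.org",
    "DTSTAMP:20190129T220000Z",
    "DTSTART:20190130T100000Z",
    "DTEND:20190130T103000Z",
    "SUMMARY:Outage post-mortem",
    "ORGANIZER:mailto:organizer@example.org",
    "ATTENDEE;RSVP=TRUE:mailto:attendee@example.org",
    "END:VEVENT",
    "END:VCALENDAR",
    "",
])


def build_invite(ics_text):
    """Wrap an iCalendar request in an email whose body is text/calendar."""
    msg = EmailMessage()
    msg["From"] = "organizer@example.org"
    msg["To"] = "attendee@example.org"
    msg["Subject"] = "Invitation: Outage post-mortem"
    # The method parameter tells the receiving client this is a request.
    msg.set_content(ics_text, subtype="calendar", params={"method": "REQUEST"})
    return msg


invite = build_invite(ICS)
print(invite.get_content_type())
print(invite.get_param("method"))
```

A calendar-aware client that sees this content type can offer accept/decline instead of showing plain text. Actually sending it would still need a mail server, which is left out here.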


Not quite SMTP, but yeah. It's whole normal transport system. Add on to that things like integration with conference call systems (eg: Skype) for meeting and dial-in setup, conference room scheduling (Room Finder), and it's honestly a really nice tool for business use. It does a lot of things that users really like.


I've used Thunderbird for years- ever since Evolution became unworkable. But evolution was very good too back in its day- more Outlook-y.


Piracy of Microsoft products that allow offline use is rampant, especially in poorer countries. A cloud service is a really good tool to prevent piracy.


The only issue is that Office is extremely easy to pirate, and it's pirate-able by the (you guessed it!) offline authentication service, KMS.

If it's so easy to pirate office then the cloud-based authentication service only serves to cause issues for those who legally purchased the software.


I don't mean the downloadable thing, I mean the online thing that runs in the browser.


Sorry how could you pirate "the online thing that runs in the browser"?

The problem is that recent OneNote versions (at least on Mac), require Microsoft signin and can't even open local note files.


Well, you can't, that's my point. That's why it's so great for Microsoft: Nobody to pirate it any more, means more revenue.


Office365 still has a desktop executable, parallel to its online program. It's not all browser-only. I think that's why you two are talking past each other.


They're barely desktop apps - they're essentially bloated javascript heavy web-frame like apps that perform poorly, have many bugs (IMO 7~ years of use) and require login to Office365 / Azure online.


Doesn't help that I've stopped using Office in favour of Open-/LibreOffice about 7 years ago or such :).


(While this is almost certainly the case for Office, a well-known leader in its category, it should be noted that this logic is not accurate for all products - Increased opportunity cost to the legal consumer and lack of piracy externalities such as network effects mean "always-online" requirements may be detrimental.)


OneNote for Mac is free.


It was just a SharePoint component. You can easily pirate it.


piracy is only an issue if it affects people paying for your product; if they were never going to pay anyway, why not give it away for free? keeping a digital product locked away out of spite is just... spiteful

especially when it comes to poor economies! the cost of a windows/office license is a good portion of (if not an entire) yearly salary in some countries! removing the tools that help them to compete with more affluent economies is pretty poor form


> the cost of a windows/office license is a good portion of (if not an entire) yearly salary

In poorer economies you can just adjust the price to be comparable to local yearly salaries. It's not that piracy means that end users get the licenses for free any way: there are many people who sell pirated versions of Windows and Office.

> removing the tools that help them to compete

You can be just as competitive with Libreoffice and GNU/Linux as you can be with MS Office and MS Windows. The crime of piracy steals a giant market of hundreds of millions if not billions of people from Libreoffice and GNU/Linux and turns these people into second-class citizens.


many software companies choose to upload "cracked" versions of their own software ...


> if they were never going to pay anyway, why not give it away for free? keeping a digital product locked away out of spite is just... spiteful

How do you draw the line between "if you weren't going to pay anyways" and "well we don't have to pay, so we ain't gonna"?


For me that is easy. If used for personal use, make it free (and thus, more popular plus free bug reporting for bleeding edge versions). If used for commercial purposes, not free. Then go after businesses that use unlicensed copies of the software.


So the companies in the poor economies have to pay for it? I understand you're not the previous poster, but these seem to be different motives.


I can't speak for Microsoft, but I could imagine some type of negotiation on pricing based on several criteria. When businesses purchase licensing for such things, there is always a negotiation phase that starts around 60% discount and goes up or down based on many factors.


Why not use libreoffice?


Hmmm that sounds like a perfect use case for...maybe those countries, rather than pirating Microsoft's products, use and contribute to open source projects, like libreoffice, and help make it a proper competition for expensive licensed products.


Some reason they can't use free software?


They will have a full desktop version of the Office suite for a very long time so that makes no difference.


The status page reads as below (I've copied the text here as it a) may not be accessible to everyone and b) often has old information removed / edited, c) this status page / information is usually pay-walled and only available once you've authenticated):

--

Microsoft 365 Service health status

Title: Unable to access Microsoft 365 services

User impact: Affected users are unable to authenticate to and access Microsoft 365 services.

Current status: We've received reports of an issue affecting access to Microsoft 365 services. We've identified a degraded portion of infrastructure that manages authentication requests and have restarted that infrastructure to mitigate impact.

Scope of impact: This issue may potentially affect any of your users attempting to access Microsoft 365 services.

Start time: Tuesday, January 29, 2019, at 9:15 PM UTC

Preliminary root cause: A portion of infrastructure that manages authentication requests is degraded, affecting access to one or more Microsoft 365 services.


I love how they are just "turning it off and on again".


That's literally the fix for almost anything Microsoft. If it goes any further than that it's "Restore from backup".


It's pretty common practice for complex systems. If it's already dead, and nobody knows what's wrong or how to fix it, you try restarting it. (normally you first redirect traffic to a different region, but apparently big orgs are still running critical infrastructure with changes that affect all regions and can't be backed out)


Kubernetes doesn't even have a "restart this deployment" ability...


"Thank you for calling 310-DELL. Have you tried restarting your computer?"

(Older Canadian tech types should recognize that.)


Update: "Preliminary root cause: A portion of third-party managed network infrastructure that facilitates authentication requests and access to Microsoft 365 services was degraded."

(Emphasis mine.)


That doesn’t seem right. For something so critical to many of their corporate mega-customers, there shouldn’t be any “third-party managed” infrastructure at all.


That’s not how the internet works. Every large provider has third-party connectivity. There’s no wire from Microsoft to your house.

You can buy connectivity from 5 diverse providers per data center, but if one of them continues to advertise your routes while not being able to actually pass traffic to your destination, you’re “down” to some people through no fault of your own.

I’ve had this happen with AT&T, XO/Verizon, and Cogent in recent years. It’s not uncommon. In all cases it was mis-configured ISP routers. Usually a support engineer admits a QoS or DDoS-protection configuration went wrong.


More so that there should be redundancy if it's a critical system.


"network infrastructure" - it'll be a fiber cut or failed router, together with a failover process that didn't work or was overlooked.


Not sure why I was down-voted for this, it's a direct copy-paste of the service status page which a) may not be accessible to everyone and b) often has old information removed.


> Not sure why I was down-voted for this, it's a direct copy-paste of the service status page

Being a direct copy and paste of the page linked to the thread without additional explanation is probably why it was downvoted.


... there's a pretty damn good explanation given in the comment actually

and it's pretty obvious regardless: status pages are highly transient. it's probably good to have a record of what's being discussed?


They edited their comment afterwards, which now makes it clearer why it was copy/pasted.


Please don't clutter the comments by copying and pasting what is already a fairly succinct page.


As I said below the "status page which a) may not be accessible to everyone and b) often has old information removed"


You made that comment after I had made mine, so I had not seen it.


I don't agree with this. Not everyone wants to click an extra link when browsing on a mobile device.


Also to note: status pages are, by their nature, transient. Having a record of what it previously said is helpful context for any discussion regarding those prior contents.


Looks like it's started with a DNS outage @ Level3: https://downdetector.com/status/level3/map/


My ISP (Telefonica) currently doesn't return addresses for login.microsoftonline.de (cloud Germany) while Google's DNS servers do. I wonder if that's related.
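A quick way to check this sort of thing from a script, as a sketch with the standard library only. It asks whatever resolver the machine is configured to use, so it cannot by itself compare an ISP resolver against Google's; that would need a DNS library or a tool like `dig`. The `lookup` parameter is a hypothetical hook added here so the function can be exercised without network access.

```python
import socket


def resolves(hostname, lookup=socket.getaddrinfo):
    """Return True if the lookup yields at least one address for hostname."""
    try:
        return len(lookup(hostname, None)) > 0
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return False
```

Calling `resolves("login.microsoftonline.de")` on a machine behind each resolver in turn would show whether the two disagree; the answer depends entirely on the resolver and the moment you ask.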


Interesting. I suspect their 'restart' of 'authentication infrastructure' may be a rolling restart and perhaps they have not got to Germany yet. Perhaps using Google's DNS means a different 'authentication infrastructure' is being provided?


Moving to another Provider might involve shifting to other ips. Probably sucks to be working there right now.


We thought that too - gwmigprda.aadg.windows.net.nsatc.net seems / seemed to be failing to resolve, but then it seemed like it was internal to their infrastructure - which they later added to their status page (and clarified it was their authentication infrastructure which needed to be 'rebooted').


Perhaps it's related to the DNS flag day [0] that's coming up on Friday?

[0]: https://dnsflagday.net/


Certainly not, because it hasn’t happened yet


You've never made a change in advance of an upcoming problem that accidentally broke something else? Lucky...


Not certainly not. Hypothetical scenario: realise something in your infrastructure will have problems in a few days’ time, and so make changes, which accidentally break things.


I don't understand, could you explain? Level3 may be the biggest backbone provider but I doubt that Microsoft is peering only with them are they? This is maybe something that a smaller company might do but Microsoft seems too large for that.


Wording of the communication to my organization indicated that Level 3 is handling some subset of Azure-internal networking. I don't think it's a peering issue.


Thanks!


Why would Microsoft be using a third party's DNS? It’s trivial to host it yourself.


Ouch, level3 is pretty big. How could they have an outage?


Level3 is not exactly known for being rock solid.


And, while this may not be connected, Microsoft also seem to have deleted all of our Sensitive Information Policies in the Office 365 Security and Compliance Center overnight too. Policies that were running yesterday all seem to have been turned off or deleted this morning. This is a HUGE security risk. As I say, it could be coincidence but I'm not so sure.


Christ I cannot stand Office 365.

For some reason, I cannot open files that are in my OneDrive folder without O365 authentication shitting itself, and refusing to save files. This means to edit files that are in OneDrive, I have to move them out of OneDrive, open them, edit them and save them, then move them back into OneDrive.

The situation with having a different fucking login/account for every single fucking Microsoft service (Skype, Office, etc.) even when you're on the business tier for them all is insane and endlessly frustrating.


There seems to be an attempt at a single-sign on solution, but when I used it about 2 years ago, it sent me on a trip to about 6 different domains (live.com, microsoft.com, office365.com, something with sharepoint in the name, and several subdomains of those).

In the past, I've given folks the tip to not open links if they send you to a different domain than the link suggests -- by that standard, the SSO looks like a complete scam. It takes some serious knowledge of the Microsoft products to know that those domains all belong to the same corporate entity.


Office 365 Message Encryption looks exactly like phishing. (and is also useless as it doesn't have a second channel)

Good news: You can pay more to get phishing warnings in Office 365.... spooky


Can you clarify what you mean by second channel?


You have to send the way to decrypt data via a different method than you send the encrypted data, otherwise the person who can intercept one can intercept the other.

If you email me a file which needs a password, and then the password, that's pointless, you have to phone me or post me the password.

In MS's case the way you see the document is to login to MS's servers using your email account (so an attacker could send a password reset), or an emailed one time code (so an attacker can intercept and use it, either first, or if they can change the intercepted channel, not pass it on)


It is also a pain for parental controls if your kids use Kahoot or micro:bit with MS auth.

I've had to whitelist a bunch of non-obvious subdomains as well.


it was a marketing stunt! "hey everybody! you know that big google outage today? we have a cloud too! it's big, and it goes down every once in a while just like google's and amazon's! we're a real internet company now too!"


Around the world millions of accountants were able to go home early as they couldn't twiddle some cells in their spreadsheets, middle managers were spared from death by PowerPoint and countless people weren't told about their use of passive voice by Word's grammar checker.


Funny enough, GMAIL had regional outages yesterday too.


Yeah, my theory is that it's just about 2 weeks after everyone is back from holidays, which means all the engineers are back and ready to ship features and changes after nothing's been touched during the holidays. All this "go fever" is leading to somewhat reckless modifications of production systems.


FaceTime (a little bit different case though) 2 days ago, GMail yesterday, and Office 365 today.


Came here to make the same observation. Good on teams at Microsoft to not allow themselves to make a "we're more reliable than G Suite" joke.


Or not funny. Suspicious coincidence. Hope it is just that and not outside actors testing things.


Maybe it is just me but... The interest we here at HN have in showcasing service outages seems like it's mostly because we want to point to some big/other company and say "see they went down!" so we can feel better about it when our own services go down.


It may be that to a degree, but there is also a desire to point out reliability deficiencies in a product or platform that inevitably gets pushed down everyone's throats by management with low resistance to kool-aid.


Microsoft’s auth infrastructure has been a hot mess over the last 12 months. These outages are getting old.


The Office365 status page now states "There are currently no known issues preventing you from signing in to your Office 365 service health dashboard." and when you click the link and are forced to login (now that auth appears to be working for me at least) you're presented with an error preventing you from seeing service status or historical events stating "You don’t have permission to access this page or perform this action."

Screenshots: https://imgur.com/a/pALWiIR


Could this have affected my ability to log into Skype for Business? I could log in with my home wifi, but not my cell phone hotspot. Didn't think much of it at the time!


Yes, Skype / Skype for business uses Microsoft’s centralised servers including for authentication.



> I would say is pretty fair and based on today’s events everyone affected could be eligible for 25% discount on their bill as the service credit for breaching 99.9% SLA is 25%
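For a sense of scale, a small sketch of what a 99.9% monthly uptime target allows. Only the 99.9% threshold comes from the quote above; the outage durations below are hypothetical, and the real SLA's way of measuring downtime may differ from this simple minutes-per-month model.

```python
def allowed_downtime_minutes(sla_percent, days_in_month):
    """Minutes of downtime a month can absorb before dropping below the SLA."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


def monthly_uptime_percent(downtime_minutes, days_in_month):
    """Uptime as a percentage of all minutes in the month."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes


def breaches_sla(downtime_minutes, sla_percent=99.9, days_in_month=31):
    """True if the month's uptime falls below the SLA threshold."""
    return monthly_uptime_percent(downtime_minutes, days_in_month) < sla_percent


# January has 31 days; the outage lengths here are made-up examples.
budget = allowed_downtime_minutes(99.9, 31)
print(round(budget, 2))     # 44.64
print(breaches_sla(60))     # True
print(breaches_sla(30))     # False
```

So under this model, anything much over three quarters of an hour of downtime in January would already put the month under 99.9%.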

https://www.reddit.com/r/sysadmin/comments/ajavl8/its_that_t...


I need 365 since it can do some things Google docs can’t do.

But a while ago there was a widespread bug making it so that some percentage of people who had used the Mac or iOS apps couldn’t sign in. It was never officially acknowledged, and only fixed due to random people getting journalists to write about it and emailing random people on other teams within Microsoft.

Their cloud stuff is good but the auth and account stuff is a total mess.


Not sure if it is related, but the safelink service (the thing that wraps a link with a jump) is down at my school's mail service (that uses MS suite).


Someone pointed me to this [0], and I wonder if it is related? Maybe Microsoft tried some changes in response?

[0] https://medium.com/@lukeberner/how-i-abused-2fa-to-maintain-...


Lovely, there's this link on the page (doesn't work on Chrome): Add this page to your favorites


It's using the old IE window.external


Ironic to read all the negativity here when on the day after the OP Microsoft will announce quarterly earnings which will undoubtedly be driven by continued "Enterprise buy in to Microsoft Cloud Solutions".


Glad I purchased a standalone license (although I do have Office365 through work).


Does that mean that you don't (ever) have to login to Microsoft / Office365 online services with the desktop software? I have been hearing that people are having issues with desktop apps asking them to log back in to use them - and they can't.


You don't have to log in anywhere. MS sells Office 365 and Office 2019, 365 being a subscription service and 2019 a one-time purchase. People you talk about probably had issues with desktop apps that were a part of O365.


and today Microsoft's "Xbox live" cloud service is down and it's causing Xboxes to fail to boot: https://techcrunch.com/2019/01/30/xbox-one-black-screen/


I'm still using my copy of Office2008. It still works just fine. It never has an "authentication outage". And, as a bonus, it costs me nothing to continue to use it. I would feel sorry for all of the poor shmoes who are locked out of their office suites, but I'm sorry, you brought this on yourselves when you accepted a subscription model. If people just said no to the subscription model, no one would have this problem.


I don't think the complaints are about office suite access, as that has a fallback in case of failed auth. It's more about IM/VOIP/email, OneDrive/SharePoint, Azure cloud, etc.


Your personal home use case of Office is not really relevant here. Good luck organizing a huge company on that.


Companies are even more relevant because they have more leverage as buyers.


I see these comment sections get somewhat rowdy and negative. Anyone care to comment on the other side of things?

Are there any "it may be down, but it is up quite often and saves us heaps of time and labor, totally worth the occasional outage" type experiences?

We're considering G Suite and O365 soon.


Yes, there are some great things about O365.

For example, I often hear that O365 licenses are bliss compared to the complexity of Microsoft enterprise licensing, where even MS reps don't agree on how many licenses you need.


To be fair, "the licensing isn't completely miserable like their other product" isn't exactly a selling point.

My two cents from what I've seen (note: I mainly see the G Suite side of things): you can get much more productivity & actual collaboration with G Suite, dependent on a few factors.

One, there has to be total buy-in @ the executive level because you most likely need to completely re-think how work gets done across every function. Often, we do what we've been doing and don't see the full picture because of that existing perspective.

So that requires a legit G Suite partner to help execute change management, as I imagine it would for O365. It's not so much a risk unless you do it for the wrong reasons; they've done it enough times to have a proven migration formula. Saving licensing costs, for example, should be at the bottom of your considerations because that more or less evens out and becomes irrelevant.


Yes, we're quite familiar with MS license audits and the auditors don't even know how everything should be licensed.

Good point, thanks.


Go check out /r/office365, this is a fairly irrelevant question to this discussion.


I know people make a big deal out of availability, but I personally don't usually mind that much when things go offline. I'm sure it could cause serious problems in specific situations though


Does anyone know how to get office 365 pro plus installed on a server to stop deactivating itself? Should we have just paid the insane amount for office enterprise or whatever the one time fee version?


How is it that everyone keeps screwing themselves with MSO when, even if you pay for commercial use of LibreOffice, you come out way ahead? If you want to collaborate on a document or spreadsheet, you don't need to prostitute yourself to Microsoft. It's a case of being so self-important and lazy that one more click, or shifting between a telepresence application and an office suite application, is too hard. All you people complaining about being down because MS Office products are unavailable need to get a grip on how to use a computer.


While I agree with your sentiments about the products, MS as a company and their software model, some of us are forced to use their products at work. Of course some of us have tried to improve things and move away to more standards-based and ethical software, but we don't always win (no pun intended). (Words are my own etc...)


Many people's work computers come preinstalled with MS products and have no choice. You might need to get a grip on how the corporate world works.


Incredibly this has taken down typingclub.com


Www.dayssincelastazuremajoroutage.com is constantly down because it is hosted on azure


Said domain is not registered. While I suspect your comment is just aimed at being humorous, it may be taken by some as a bit of a troll comment.



