A few years ago, we migrated our email service to Microsoft’s Office365 cloud service. Overall, it’s been very reliable and eliminated the challenges we had hosting Exchange ourselves. It let us get to our emails using Outlook installed on Windows, any web browser, and smartphones. Office365 also offered other Office products online (Access Web Apps, Excel, Word, etc.), SharePoint, and OneDrive for Business.
Unfortunately, on the morning of June 30th, we discovered:
- Delays sending and receiving emails
- Some emails were bouncing back from recipients whose mail servers couldn’t validate our Office365 Exchange SMTP host (protection.outlook.com) against our domain name. That meant the Exchange SMTP server was no longer considered a trusted sender of emails from the @fmsinc.com domain.
- Our Total Access Emailer product, which we use to send emails through the Office365 SMTP server, was also failing to authenticate against the server.
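One common way a receiving server validates a sending host against a domain is the domain's SPF record, a DNS TXT entry listing the hosts allowed to send its mail. We don't know that SPF was the check that failed here, so treat the following as a minimal sketch under that assumption. The function name and the sample record are hypothetical; the only real-world detail is the `include:spf.protection.outlook.com` mechanism that Office365 domains normally publish. The sketch parses a record string and does no DNS lookup.

```python
def spf_includes(spf_record: str, include_host: str) -> bool:
    """Return True if an SPF record passes mail via include:<include_host>.

    Illustrative helper only: it inspects the text of a record and does
    not query DNS or evaluate the full SPF algorithm.
    """
    terms = spf_record.strip().split()
    # A valid SPF record starts with the version tag.
    if not terms or terms[0].lower() != "v=spf1":
        return False

    target = include_host.lower().rstrip(".")
    for term in terms[1:]:
        # Qualifiers -, ~ and ? mean fail, softfail and neutral,
        # so only an unqualified or "+" include counts as authorizing.
        if term[0] in "-~?":
            continue
        mechanism = term.lstrip("+").lower()
        if mechanism.startswith("include:"):
            if mechanism[len("include:"):].rstrip(".") == target:
                return True
    return False


# Hypothetical record text, not the actual record for any real domain.
EXAMPLE_RECORD = "v=spf1 include:spf.protection.outlook.com -all"
OFFICE365_SPF_HOST = "spf.protection.outlook.com"

if __name__ == "__main__":
    print(spf_includes(EXAMPLE_RECORD, OFFICE365_SPF_HOST))
```

A record that passes this check only shows that the domain is configured to trust Office365. It says nothing about whether the Office365 servers themselves are healthy, which is a separate question from the domain's own configuration.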
The problems began the evening before. Needless to say, we aren’t happy about this experience, which impacted us and our clients using Office365. Reports indicate that it affected Office365 customers across North America.
When we contacted Microsoft, they confirmed problems with the health of their Office365 Exchange Server. Throughout the day, the problems lessened but persisted. We hope they are resolved soon and that we’ll understand what went wrong once we get past the immediate crisis.
These are the reports we’ve received from Microsoft. We’ll keep you updated as we learn more:
Exchange Online Service Degraded
This is what the Office365 Admin portal shows for Service Health:
EX71628 – E-Mail and calendar access – Restoring Service
Jun 29, 2016 12:11 PM
Current Status
Our investigation determined that an existing transport feature which is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users. We’re bypassing the affected feature to restore service.
User Impact
Users may be unable to send email messages through the Exchange Online service. Email messages may appear to be stuck in the Drafts or Outbox folders.
Scope of Impact
A few customers have reported this issue, and our analysis indicates that for most customers, it’s unlikely that many users would report impact related to this event.
- Start Time: Thursday, June 23, 2016, at 3:00 PM UTC
Preliminary Root Cause
An existing transport feature that is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users.
EX71628 – E-Mail and calendar access – Extended recovery
Jun 30, 2016 2:18 PM
Current Status
We’ve developed an additional fix to address the underlying cause of the issue. We’re preparing to deploy the fix to the affected environment to ensure that the issue does not reoccur.
User Impact
Users may be unable to send email messages through the Exchange Online service. Email messages may appear to be stuck in the Drafts or Outbox folders.
Scope of Impact
A few customers have reported this issue, and our analysis indicates that for most customers, it’s unlikely that many users would report impact related to this event.
- Start Time: Thursday, June 23, 2016, at 3:00 PM UTC
Preliminary Root Cause
An existing transport feature that is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users.
Next Update by: Saturday, July 2, 2016, at 7:00 PM UTC
EX71674 – E-Mail timely delivery – Service restored
Jun 30, 2016 7:35 PM
Final Status
We’ve confirmed that the remaining message queues have now drained after implementing a configuration change to optimize message filtering.
User Impact
Users were experiencing delays when sending and receiving email messages. Affected users may have received Non-Delivery Reports (NDR) when sending email messages.
Scope of Impact
Customer reports indicated that many users likely experienced impact related to this event. Our analysis indicates that this issue may potentially have affected any of your users attempting to send or receive mail.
- Start Time: Thursday, June 30, 2016, at 2:30 PM UTC
- End Time: Thursday, June 30, 2016, at 11:30 PM UTC
Preliminary Root Cause
The infrastructure responsible for processing Exchange Online Protection (EOP) message filtering became degraded.
Next Steps
- We’re analyzing performance data and trends on the affected systems to help prevent this problem from happening again.
- We’re reviewing our code for optimizations and automated recovery options.
- We’ll publish a post-incident report within five business days.
EX71674 – E-Mail timely delivery – Service restored
Jul 1, 2016 12:08 AM
Final Status
We’ve rolled out the fix and confirmed that service is restored. Any meeting requests created during the outage will need to have the conference room calendar removed and re-added to book the room.
User Impact
Scope of Impact
- Start Time: Monday, June 27, 2016, at 6:00 PM UTC
- End Time: Friday, July 1, 2016, at 2:54 AM UTC
Preliminary Root Cause
Next Steps
- We’re reviewing our deployment and provisioning procedures to help prevent this kind of problem in the future.
- We’ll publish a post-incident report within five business days.