Email “Virus” Outage Incident Report

Summary

On Thursday 2 March 2017, between approximately midnight and 7am, email to and from the Internet was incorrectly classified as containing a virus, and as a result some messages were permanently lost. Inbound email was described as having been quarantined, but this was not correct: the original messages were not preserved.

Between 7am and midday on 2 March, the email service was effectively shut down for investigation and repair. By midday, all services had been restored. All email sent from 7am onwards will eventually be delivered normally.

Although this has not yet been officially confirmed by the vendor, the cause of the problem appears to have been a corrupt or absent antivirus update on the edge email servers.

Timeline

Thursday 2 March 2017

  • Midnight to 1am : Inbound email is increasingly marked as [PMX:VIRUS], and notification versions of the originals are delivered to end-users.
  • 2:30am : Outbound email is now being marked as infected and rejected (i.e. senders are notified that their messages have not been sent out).
  • 6:30am : The Information Security Office becomes aware of the issue, and halts all of the inbound and outbound email services in order to investigate.
  • 7:15am : Vendor documentation describes the error that is being seen, but the recommended fix does not work.
  • 8:10am : Our external support partner proactively contacts ISO to inform them that there is a current issue affecting multiple customers globally.
  • 8:30am : First ITS Service Notice published – updated with current information at 10:30am, 11:30am, 12:30pm and 3:30pm.
  • 10:00am : Announcement “Email delivery issues” emailed to all-depts@ and CITSP@.
  • 10:20am : Outbound email services are restored, but only by disabling the normal antivirus checks. This is not a suitable option for inbound email, which remains shut down.
  • 11:50am : Vendor supplies a working update to the antivirus; testing confirms that this fixes the problem properly.
  • 12:15pm : Inbound email services restored. All email sent to us since 6:30am will eventually be delivered normally.
  • 3:20pm : Efforts to restore original copies of the incorrectly-marked inbound email are unsuccessful and are halted. A further announcement “Re: Email delivery issues” is sent to all-depts@.
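
As an illustration only, and not a description of our deployed tooling: the timeline shows several hours between the first incorrect [PMX:VIRUS] markings and the point at which the issue was noticed. A sudden jump in the fraction of messages marked as infected is the kind of signal a simple rate check could flag. The sketch below assumes per-interval message counts are available from the mail logs; the `IntervalCounts` type, the `find_spikes` function, the thresholds, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntervalCounts:
    """Message counts for one time interval (hypothetical structure)."""
    label: str      # e.g. the start time of the interval
    total: int      # messages scanned in the interval
    flagged: int    # messages marked as containing a virus


def flag_rate(counts: IntervalCounts) -> float:
    """Fraction of scanned messages that were marked as infected."""
    return counts.flagged / counts.total if counts.total else 0.0


def find_spikes(intervals: List[IntervalCounts],
                baseline_rate: float,
                factor: float = 10.0,
                min_messages: int = 50) -> List[str]:
    """Return labels of intervals whose flag rate exceeds factor * baseline.

    Intervals with fewer than min_messages scanned are skipped, because a
    handful of messages is too small a sample to judge a rate from.
    """
    threshold = baseline_rate * factor
    spikes = []
    for counts in intervals:
        if counts.total < min_messages:
            continue
        if flag_rate(counts) > threshold:
            spikes.append(counts.label)
    return spikes


# Made-up example figures, not measurements from this incident.
sample = [
    IntervalCounts("23:00", 1000, 2),
    IntervalCounts("00:00", 900, 300),
    IntervalCounts("01:00", 20, 20),
    IntervalCounts("02:00", 800, 790),
]
spiking = find_spikes(sample, baseline_rate=0.005)
```

A check like this only raises a flag; deciding whether a spike is a genuine outbreak or a faulty antivirus update would still need a person to look at it.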

Remediation

We will review the vendor’s incident report when it is published, in order to identify any improvements we need to make to our configuration.

We will investigate the failed quarantine action that caused the mis-categorised email to be lost.

We will discuss this incident within the context of our Disaster Recovery and Business Continuity Plans, to see whether any improvements need to be made to them.

This entry was posted in Incident Response, Viruses/Malware by Jim Cheetham.
