Recovering from a ransomware attack shouldn’t be a mysterious process. A sysadmin shares his best guidance for getting through it.
Ransomware attacks, despite dramatically increasing in frequency this summer, remain opaque for many potential victims. It isn’t anyone’s fault, necessarily, since news articles about ransomware attacks often focus on the attack, the suspected threat actors, the ransomware type, and, well, not much else. Sadly, there’s rarely discussion of the lengthy recovery, which, according to the Ransomware Task Force, lasts an average of 287 days, or of the complicated fact that backups, the most widely touted defense against ransomware attacks, often fail.
There also isn’t enough coverage of the human impact of ransomware. These cyberattacks do not just hit machines; they hit businesses, organizations, and the people who help those places run.
To better understand the nuts and bolts of a ransomware attack, we spoke to Ski Kacaroski, a systems administrator who, in 2019, helped pull his school district out of a ransomware nightmare that encrypted crucial data, locked up vital systems, and even threatened employee pay. Kacaroski spoke at length on our Lock and Code podcast, which can be heard in full below, offering several insights for those who may not know the severity of a ransomware attack.
Here are some of the most surprising and insightful lessons that he shared with us.
The first few hours are critical
At 11:37 pm on the night of September 20, 2019, cybercriminals launched a ransomware attack against the Northshore School District, which is north of Seattle in Washington State. The cybercriminals deployed the Ryuk ransomware against the school district, which relied on a datacenter of 300 Windows and Linux black box servers. The district also managed 4,000 staff members’ devices, including Windows, Mac, and Chromebook workstations, along with many iPad tablets.
The morning after the attack, Kacaroski got a phone call from one of the school district’s database administrators about problems with the database server. Shortly after logging into his employer’s VPN and poking around, Kacaroski learned that the server had been hit with ransomware. He saw one unencrypted file (a ransom note from the threat actors) and countless files with the .ryuk extension nearly everywhere else.
These first few hours after the attack, Kacaroski said, are when he made a crucial mistake.
“If I was to redo this again, the minute I saw the first one [hit], I would’ve just pulled the power on every single box, ASAP,” Kacaroski said. “I definitely cost us probably a few boxes by not doing that quickly enough. But you never think you’re going to be hit by ransomware, so that’s not usually the first thing you consider when somebody reports the system is not working right.”
Kacaroski said that his school district’s cyber insurance provider later told his team that ransomware operators often target only Windows machines in these attacks. That kind of knowledge could have helped Kacaroski and his colleague prioritize their immediate response, protecting the Windows machines without worrying about serious threats to the Linux and Mac machines.
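The .ryuk extension Kacaroski saw also offers a crude way to size up the damage. As a minimal sketch, assuming an affected disk has already been mounted on a known-clean machine, a short Python script could tally how many files carry that extension. The function name `tally_encrypted` and its default suffix are illustrative only, not something Northshore used.

```python
import os
from collections import Counter


def tally_encrypted(root: str, marker_suffix: str = ".ryuk") -> Counter:
    """Count files under root by whether their name ends with marker_suffix.

    Assumption: the ransomware renamed encrypted files with a fixed suffix,
    as described in the article. Other strains may behave differently.
    """
    suffix = marker_suffix.lower()
    counts = Counter(encrypted=0, other=0)
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            key = "encrypted" if name.lower().endswith(suffix) else "other"
            counts[key] += 1
    return counts
```

Calling `tally_encrypted("/mnt/evidence")` (a hypothetical mount point) returns a `Counter` with `encrypted` and `other` totals, which gives a rough first answer to "how bad is this box?" without opening any files.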
Your backups may not work
In the immediate aftermath of the attack, Kacaroski said he and his colleague, another sysadmin who works on Windows, were dealing with “an incredible amount of uncertainty.” They did not know what critical services had been hit, they were still trying to figure out which drives were operational by pinging them, and they were still working under the assumption that all of their devices—not just Windows machines—could be threatened.
But at least initially, Kacaroski said he and his colleague were feeling somewhat confident. After all, Kacaroski said, his school district had implemented proper backups. Or so he thought.
“We have a very good backup system, or at least what we thought was an extremely solid, rock-solid backup system,” Kacaroski said. “And then we find out, at about 4 or 5 hours after the attack, that our backup system is completely gone.”
Kacaroski’s situation is, believe it or not, somewhat common. Earlier this year, despite having a backup system in place, the meat supplier JBS still decided to pay $11 million to its attackers to obtain a decryption key after getting hit with ransomware. The biggest mistake that organizations make in setting up their backups, as we discussed in a separate episode of Lock and Code, is that those backups are not properly and regularly tested.
This moment of realization, Kacaroski said, hit him and his colleague hard.
“It started to really sink in that I’m going to have to rebuild 180 Windows servers, and more importantly, rebuild Active Directory from scratch, with all those accounts and groups, and everything in it,” Kacaroski said. “That part really, really hurt us.”
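The testing gap described above is easier to close when a restore test is scripted and run on a schedule. The following is a minimal Python sketch, not Northshore’s process: it assumes a sample of production files has been restored from backup into a scratch directory while the originals are still intact, and it compares the SHA-256 hash of each original against its restored copy. The function names and the directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> dict:
    """Compare every file under source_dir with its copy under restored_dir.

    Returns relative paths grouped as "ok", "missing", or "mismatch".
    """
    report = {"ok": [], "missing": [], "mismatch": []}
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            report["missing"].append(rel.as_posix())
        elif sha256_of(src) != sha256_of(restored):
            report["mismatch"].append(rel.as_posix())
        else:
            report["ok"].append(rel.as_posix())
    return report
```

Any entry under `missing` or `mismatch` means the backup would not have fully restored that file. A check like this only proves that files can be restored; it says nothing about whether the backup system itself would survive an attack, which is the failure Northshore ran into.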
A ransomware attack can be a months-long process
The attack against Northshore School District was not an overnight decision by a single group of hackers. In fact, it wasn’t even the work of one group of hackers.
According to Kacaroski, after both the FBI and the Department of Homeland Security helped investigate the attack on Northshore, employees learned about a months-long process that most likely led to the eventual ransomware infection. The initial breach into Northshore’s servers likely began in March 2019, six months before the final attack, and it involved a group of hackers simply installing Emotet to gain access to Northshore’s servers. Once access had been gained, that first group of hackers then sold its access to another group of hackers who, according to what Kacaroski learned from the FBI, then installed TrickBot to obtain domain credentials. Once those credentials were swiped, the group that deployed TrickBot sold that information to yet another group of hackers, which is believed to be the same group that pushed the Ryuk ransomware onto the school district’s machines.
Interestingly, Kacaroski said that the school district was told that the attack was likely uncoordinated between the three different groups, with the groups acting independently and simply leveraging the prior group’s access.
What also surprised Kacaroski was that the Ryuk ransomware gangs operate like a franchise.
“What we’ve been told is the Ryuk group is a franchise like McDonalds,” Kacaroski said. “There’s the Ryuk group that runs the West Coast, the one that does the East Coast, the one that does something in between, and they don’t actually pay for access to the Ryuk stuff unless they have a successful attack, so they basically pay a fee back to the people that wrote it every time that they have a successful attack.”
There are more ransomware attacks than you’ve heard about—far more
The week after Northshore School District was hit with ransomware, its cyber insurance providers said four additional payments were made to other ransomware victims. That’s just one week in late 2019. Given the number of attacks reported today, and the documented increase in the frequency of known attacks, it is reasonable to assume that the number of undisclosed ransomware attacks has risen sharply as well.
In immediate recovery, first prioritize and then look for “surprise” systems
In responding to the crisis of a ransomware attack, organizations need to prioritize what systems need to go back online first. That work is frequently made “easy” for an organization because ransomware tends to hit just days, or even hours, before crucial deadlines.
For Northshore School District, the ransomware attack happened just days before employees were scheduled to be paid. That’s a deadline that simply can’t be missed, Kacaroski said.
“Payroll has to run—it is a legal thing. You can not not pay people. You have to pay them, which means four days after the attack, we had to have payroll up and running,” Kacaroski said. “That was the most critical thing.”
The school district then prioritized getting Active Directory and the student record system back online, as those systems were used countless times each day to simply help the school run. The student record system, Kacaroski said, was used by teachers, parents, and students themselves, and it needed to go back online quickly.
Finally, Kacaroski warned about what he called “surprise” systems—systems that are in place that an organization may not know about or may not understand are crucial until they’re gone. For Northshore School District, that system was for the school’s cafeteria and payment records.
“We had no clue that [the food services system] did 10,000 meals a day and 30,000 dollars… a day. We had no clue if the students had paid for their meals or haven’t paid or they owed us money,” Kacaroski said. “That one took a long time to get up and working because it was a distributed system and it had no backups at all.”
Avoid chokepoints during a long, collaborative recovery
The Northshore School District sysadmins are a small team of two, and in responding to the ransomware attack, there was only so much they could do—literally. Employees need to go home to sleep, and they need time to eat—as simple and basic as that sounds. Further, when recovering from a ransomware attack, there will almost always be what Kacaroski called a “system admin chokepoint.”
Because system administrators know how the systems themselves work, they can often become the single points of contact for rebuilding the entire business, piece by piece. Those system administrators can then get overburdened by too many teams coming to them repeatedly for information, sign-offs, and verifications.
To help move the recovery process forward, Kacaroski said organizations should find ways to free up their sysadmins, either by finding ways to rebuild systems independently, or by adding more sysadmins temporarily.
For Northshore School District, both methods were used.
After the attack, Kacaroski said his school district called up a local hosting firm that had done good work on small jobs that the school district itself couldn’t—or didn’t have the time to—do. Right after getting off the phone, that firm sent three additional sysadmins to help clean up the problem, Kacaroski said.
“We called them up. They gave us… essentially full-time, experienced sysadmins,” Kacaroski said. “We went from two to five. A huge increase.”
Kacaroski said that the beefed-up sysadmin team also gained some valuable breathing room when the school district found a paper-based workaround for its food services system. The school pared down its offerings and began providing only three options for school lunches for children. Each day during this temporary fix, the school could easily mark down, on paper, how many lunch options of each type were purchased by the students, still keeping accurate records while giving the school extra time to rebuild any digital services. Further, the school district decided to move its student record system, which consisted of 27 Windows servers, to a SaaS solution, Kacaroski said.
“We had a vendor that we had a good relationship with, they dropped everything, and what is normally a six-month migration, they did in six days,” Kacaroski said. “But the most critical part is it didn’t have to go through the system admin chokepoint. That was a whole different group and they could just work on it on their own.”
All along the way, Kacaroski stressed the importance of strong relationships. Aided by local vendors, other school districts, parents, and other teams inside the school district itself, Northshore was able to recover about 80 to 85 percent of its systems and files in just two months, Kacaroski said.
“Like I say,” Kacaroski said, “relationships were the most critical thing.”
Listen to our full conversation on Lock and Code below