GridEx – Is enough progress being made protecting critical infrastructure?

GridEx enables utilities to exercise and demonstrate their response to simulated coordinated cyber and physical security threats and incidents.
Published: Tue 23 Aug 2016

GridEx is a sector-wide grid security exercise which takes place in the US every second year. First run in November 2011, GridEx is coordinated by the North American Electric Reliability Corporation (NERC) and designed for electric utilities to exercise and demonstrate their response to simulated coordinated cyber and physical security threats and incidents.

The exercise includes scenario training with players who believe the live operations and/or crisis are real, and includes “fake news broadcasts”. The majority of the exercise activity takes place within the exercise website, called the SimulationDeck. This includes scenario injects and scenario media. There is limited interaction via participant email, which must be marked with “EXERCISE EXERCISE EXERCISE” at the top and bottom of any email transmission in order to differentiate these from ‘real’ communications.

The majority of participants take part in the exercise from their place of work over a two-day period and receive sequenced email messages that detail simulated scenario conditions. Based on this information, participants respond with simulated internal and external information sharing activities. An exercise control cell, based in the Washington, D.C. area, manages scenario distribution, monitors the exercise and gathers lessons learned.

Only registered utilities can participate and only utilities can act as observers. No media are allowed to participate or observe. There are two ways for organizations to register for GridEx III: as an active organization or as an observing organization. Active organizations participate in planning conferences, adapt scenarios to meet their local objectives, engage in crisis response and communicate externally to other exercise participants for information sharing and coordination. Observing organizations have access to all planning materials including the scenarios; do not communicate externally during the exercise; and may choose to tabletop or discuss scenario events internally. Utilities have the flexibility to switch from observing to active (or vice versa) as they gain knowledge of how they might best participate and dedicate the appropriate resources. NERC provides planning and support to encourage first-time participating utilities to participate as an active organization.

The event has grown immensely, starting in 2011 with approximately 75 industry and government entities; GridEx III involved more than 300 organizations and close to 10,000 participants. Other participants included the US Department of Energy, Homeland Security and the Electricity Sub-sector Coordinating Council, which includes utility and co-op CEOs as co-chairs. The feigned events included shootings, explosions to destroy human and grid assets, as well as destruction of control systems. The participants reacted as if the attacks had succeeded, stressing control, communication and backup systems to see how they would respond.

GridEx III objectives

The main objectives for GridEx III were:

• Exercise crisis response and recovery

• Improve communications

• Identify lessons learned

• Engage senior leadership as regards information sharing and coordination

How did GridEx III fare relative to the previous exercise, and were there any major improvements and/or challenges identified?

GridEx III was the largest geographically distributed grid security exercise to date. Participation grew by 133 organizations and more than 800 individuals compared with GridEx II. This increased participation raised the level of communication and coordination between organizations, as one would expect during a real event of this size. According to the Grid Security Exercise GridEx III report released by NERC in March 2016, the extent to which entities exercised communications processes within their organization increased by 20%. They also recorded a 17% increase in communications processes with their Reliability Coordinator [1] and other entities; an 11% increase in communications with law enforcement, electricity vendors, electricity suppliers and other government entities; and a 14% increase in communications processes with NERC’s E-ISAC [2] and bulk power system awareness (BPSA) systems [3]. The one area of concern was communication with other critical infrastructures. Communication here remained unchanged from 2013 levels, which raised an alarm given the increased participation. It would be natural to think that the level of communication with such critical infrastructures would be much higher as a result of their importance within the grid. According to the report, this is something that is going to be addressed during forthcoming exercises. However, Martin Coyne, the NERC Communications Coordinator, indicated that “during the exercise, there was cross-sector communication involving the electricity, so this concern was addressed in the exercise scenario.”

Cyber attack scenarios

The exercise also introduced scenarios involving multiple physical and cyber-attacks affecting the bulk power system (BPS) [4]. Using the SimulationDeck website [5] and inject distribution tool (a tool that allows specific users and lead planners to input information as part of the simulation), utility players were led to experience the effects of unusual control system operation and received reports of substation break-ins and UAV surveillance. As the simulation progressed, players experienced increased malware intrusions and coordinated physical attacks. These caused simulated communications disruptions, and generation and transmission outages. During the second day of GridEx III, copycat attacks and inaccurate social media reports were fed into the exercise, which further stressed the system and players. According to the report, in order to maintain grid reliability, emergency control actions were implemented including load shedding and rotating outages.

When one looks at the figures released, the number of reports received by the BPSA as part of the simulation is quite staggering. They include 88 physical security events, 23 cybersecurity incidents, 10 suspicious activities, and 47 impacted electricity assets. On the second day, a further 18 physical security events were reported, along with eight cybersecurity incidents and six suspicious activities.

Why the quantity, you may ask. Is this really necessary? Well, these types of events are of particular importance and relevance to NERC given that in April 2013, a sophisticated physical attack took place on a Pacific Gas & Electric Co. power station in Metcalf, California. According to American Thinker magazine, “The attackers apparently first slipped into an underground vault and expertly severed six AT&T fiber-optic telecommunication lines in a way that would make repair difficult. ... Then, a half hour later, … snipers began firing at the power station, destroying 17 giant transformers and six circuit breakers.” The result of the attack? The Metcalf power station was down for 27 days and damage was estimated at $15.4 million.

Fortunately the power supply to Silicon Valley was not disrupted because other power sources were used to make up for the loss.

Then there is the worry about malware like Stuxnet, which was essentially designed to fool programmable logic controllers (PLCs) into believing that all is well, when in fact the reality is quite different. Stuxnet is a malicious computer worm whose design and architecture are not domain-specific, and it could be tailored as a platform for attacking modern supervisory control and data acquisition (SCADA) and PLC systems. It is worth noting that as the industry has taken advantage of the benefits of automation and remote monitoring and control in recent years, the power grid has become increasingly dependent on digital, communicating controls and systems to operate. The increased use of IP networks for SCADA and other operational control systems, in particular, creates potential vulnerabilities. When asked about malware of this nature and whether or not they test for such attacks, Coyne indicated that they “… do not divulge specifics of the grid security scenario”, but he was able to indicate that Stuxnet was not part of GridEx III.

The other concern is the smart grid and smart metering. Smart devices may assist with energy demand management and energy efficiency initiatives, and allow new distributed technologies such as local solar PV, local wind generation, and plug-in electric vehicles to be incorporated into the grid. But they also redefine the protection of the asset itself. How do you protect against sabotage of an entire network of distributed assets of this nature? If one of these devices were taken out of service the effect would be negligible, but if a critical mass were incapacitated or unable to operate reliably, the effect could be far worse, especially given that these devices are playing an increasingly important part in demand management. Furthermore, the greater the number of these assets in the system, the greater the number of connection points into the grid via communication channels.

Coyne confirmed that “smart meters are part of the distribution system managed by electric utilities and regulated by the states. As this is not in NERC’s bulk power system purview, it was not the focus of GridEx III. However, utilities had the flexibility within the exercise to adapt the scenario to better fit their needs, which could include the distribution systems they operate”. What does this mean? Well, it’s likely that aspects of the smart grid and smart metering were incorporated into the simulation only to a limited extent, if at all.

GridEx III lessons learned

So, what were some of the lessons learned? In truth, one cannot tell too much from the NERC report. The report indicates the number of “lessons learned” reports received – totaling 25 – but nothing specific about each of the reports. It identifies actionable lessons learned and acknowledges the value of the programme without going into detail about what each is. This is understandable given that the report is a public document and feedback of this nature tends to reveal vulnerabilities and problems within the system. The idea behind the feedback is primarily for NERC to improve the planning process for the next exercise and to inform industry of potential vulnerabilities. This way they can help identify physical and cyber threats, improve initiatives to combat them, and improve future exercises against them.

On the last day, the Executive Tabletop [6] met to discuss the exercise. They were tasked with identifying security and reliability challenges and opportunities to improve prevention, response and recovery strategies. Three key areas were identified: unity of messaging, unity of effort, and extraordinary measures.

Unity of Messaging addressed how industry and governments assess a crisis event, and how they share information with each other and with the public. Managing the challenges and opportunities related to social media was of particular importance. The report notes that there is work to be done in ensuring a common view of what is unfolding in order to respond and recover and to assess the impact on the delivery of power to consumers. Furthermore, it was recognized that the ability to communicate with the public, so that they can be made aware of the situation, is critical. Given that with any outage it is likely that most traditional forms of communication (television, radio and print) will be disrupted, social media will play an increasingly important part in communication with the public.

Unity of Effort addressed how to improve the coordination of resources available to respond to any crisis. This included coordination with local law enforcement agencies to assess the physical risks associated with repairing a site that has been damaged. Workers will not begin to repair any damaged plant or electricity distribution network until they are satisfied that the work environment is safe.

It was also recognized that identification of cyber risks is extremely important and can only be done by visiting the affected facilities. The industry’s ability to analyze and repair malware is limited and requires expertise from software vendors, control system vendors and/or government resources to combat. The ability to restart any power station would be severely delayed or may not begin at all until the nature of the cyber risk is understood and addressed.

Finally, the Executive Tabletop looked at the regulatory framework in which the industry operates. Aspects involving legislative needs and government support that could speed up recovery from any outage were addressed. These include the need to establish priorities for restoring electrical services, the simplification of the electrical system operation under emergency conditions, the need for mechanisms to prevent financial defaults, and the ability to manage personal and corporate liability risks.

The question now is – where to from here? Is this being over-played or under-played? It is clear that we cannot take our power for granted. With the advent of modern systems, smart grids, integrated distributed assets, modern computing and the like, we have created a complex network that is still vulnerable to exploitation and attack. GridEx has grown tremendously since inception, and goes a long way toward investigating and evaluating the grid as it stands. The numbers themselves bear testimony to this fact. But in order to properly address all the issues, NERC will have to broaden the scope even more. Aspects like the Stuxnet virus and other malware will have to be included. The smart grid, with all its complexities and innovative technologies, will have to be incorporated. And how do we address distributed assets? These are all questions that will have to be taken into account and investigated sooner rather than later.

The threat is no longer only conventional, but far more sophisticated and covert. No longer do we need the brute force of a large explosion to knock out a power plant or disrupt supply. A few clever keystrokes on a computer are quite sufficient to potentially do an alarming amount of damage.

As the old adage goes, “Rome wasn’t built in a day”, and GridEx is a young, burgeoning program. Exercises like this may not be perfect, but they are a good start in ensuring that when you wake up the next morning and flick a switch, the lights will actually turn on.

Notes

1 Reliability Coordinator – As defined by NERC’s functional model, Reliability Coordinators are the highest authority responsible for the day-to-day operation of the bulk power system. There are 16 Reliability Coordinators across North America.

2 The Electricity Information Sharing and Analysis Center (E-ISAC) establishes situational awareness, incident management, coordination, and communication capabilities within the electricity sector through timely, reliable, and secure information exchange.

3 BPSA – The team at NERC that monitors the daily, real-time operation of the bulk power system. NERC Bulk Power System Awareness collects and analyzes information on system disturbances and other incidents that have an impact on the North American bulk power system and disseminates this information to internal departments, registered entities, regional organizations, and governmental agencies as necessary. Bulk Power System Awareness also monitors ongoing storms, natural disasters, and geopolitical events that may potentially impact or are currently impacting the bulk power system.

4 BPS – Includes electrical generation resources, transmission lines, interconnections with neighboring systems, and associated equipment, generally operated at voltages of 100 kilovolts or higher. The bulk power system generally does not include distribution system facilities, which are regulated by state or local authorities.

5 SimulationDeck website – A web-based crisis simulation platform that supports exercise planning and execution. SimulationDeck included social media functions that simulated Facebook, Twitter, YouTube, blogs, and traditional media such as TV, newspapers, and radio.

6 Executive Tabletop – The executive tabletop was facilitated by a former utility senior executive who advises the industry on critical infrastructure issues and serves on a presidential advisory council. The electricity industry participants included chief executives from investor and publicly owned utilities, cooperatives, and independent system operators from the US and Canada. The US federal and state governments were represented by senior officials. In addition, approximately 70 individuals associated with the participants attended the tabletop as observers to provide feedback.

This article first appeared in Metering & Smart Energy International, Issue 2 2016.
