How to create a CSIRT: 10 best practices

Any security practitioner will tell you that incident response is important, arguably the most important thing security practitioners will ever have to do. Planning is essential. That means mapping out who does what (i.e., roles and responsibilities), deciding how and when critical communication should occur, and thinking through when or whether to bring in external people like legal counsel, law enforcement and forensic specialists.

These steps are all critical. But when it comes to how to do that planning — how as a practical matter to make sure all these steps are defined and accounted for ahead of time — it can be a challenge to get started. Why? First, we won’t know whether our presuppositions were correct until there’s an actual event. Second, it’s harder to find practical advice than you might think. This isn’t to say advice doesn’t exist — to the contrary, there’s so much guidance that distilling it into practical steps can prove difficult.

With this in mind, let’s look at some practical steps you can take today to make sure that your response team is appropriately staffed and that your response plan includes the right stakeholders, some of whom might be external to your organization. Of the many things that need attention, incident response team staffing is one of the first you’ll need to make decisions about: what people you need, who you have available and how best to empower them.

Why you need a team

The first thing to note is that any response effort requires a team with different skills. No single individual or functional area, no matter how well-intentioned, can do it all. There are two reasons for this. First, the team needs to be empowered to take action. You may need to decide rapidly whether and how to spend money (on external specialists, for example), bring in law enforcement, inform the media or implement specific technical measures. Being empowered means bringing in the stakeholders instrumental to those decisions and the decision-makers with authority over them. Consider the decision to bring in law enforcement. Spending two weeks tracking down someone in legal to decide whether, or when, to do that eats into time you won’t have during an incident, particularly when a disclosure clock might be ticking. Therefore, having those people involved from the beginning, or mapping a path to reach them rapidly, is an important consideration.

Second, there’s the question of diversity of skills. The only thing you can predict about an incident before it happens is that you won’t be able to predict it. Because each incident is different, you won’t know specifically what skills you’ll need until you need them. Consequently, having a diverse skill base to begin with, and being empowered to rapidly bring in people with different skills when needed, is prudent.

Creating a cross-functional incident response team

Keeping these principles in mind, it’s probably clear there’s a need for a variety of personnel and skills during an incident. But how do you organize them? How do you prepare the way to gain access to them ahead of time? Ultimately, you can do this a few different ways, but as a practical matter, it’s almost always a good idea to start with a small, nimble group of stakeholders as the core team. This core group represents the individuals in the organization with direct responsibility for managing the incident as it unfolds. Why a small team? A small, empowered team can be more agile and respond faster than a large, bulky committee; it can make decisions quickly and communicate fast-moving updates rapidly, while a larger group takes longer to get resources marshalled and everyone on the same page. So, while it’s by no means a universal requirement, it can be a good organizing principle to maintain a small, nimble team at the core and establish external connections to other groups for those times when additional skills, stakeholders and decision-makers are necessary.

There are no hard and fast rules about who to involve. Your organization’s line of work, business culture and internal structure can heavily influence personnel choice here. For example, a healthcare company might choose to include representation from both IT and clinical environments, since clinical systems (e.g., biomedical devices and imaging modalities) might be affected differently by attacks, have different security requirements and require different skills to remediate and investigate. By contrast, a broadcasting company might include specific groups that oversee the broadcast network, while an electric utility might include engineers responsible for the industrial control systems. The functional members of the team will vary based on the organization’s technology landscape, organizational model or hierarchy, business context, risk landscape, corporate culture and other relevant factors.

It’s also important to think through what relevant external groups to include beyond the core team. NIST SP 800-61 rev. 2 (“Computer Security Incident Handling Guide”) section 2.4.4 (“Dependencies within Organizations”) provides a good starting point for thinking through what those external groups might be. NIST recommends considering the following groups:

management and executives
IT support
information security
legal or counsel
HR
media relations or public relations
business continuity or disaster recovery teams
physical security and facilities

This is just a starting point, though. As mentioned above, depending on your organization, you may wish to include others.

You can use two approaches to bring others into the team. One is to include representation from other teams directly on the core team. The advantage is that the appropriate representation is present throughout the entirety of the incident — i.e., these stakeholders and groups are within arm’s reach should they be needed and directly looped into all the phases of the response activity. The downside is that the more people on the core team, the more unwieldy activities can become and the more organization required to ensure things run smoothly. Additionally, the more people are involved, the more difficult it can be (should it be required) to sequester or contain information about the incident.

An alternative is to define pathways of reporting and communication to allow quick consultations with those who need to stay informed and to speed up decision-making. This approach also helps unlock resources so that the right skills are available when needed.
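As a rough illustration of what a predefined pathway could look like when written down, the sketch below maps a few decision types to an ordered list of contacts, primary first. Everything in it is hypothetical: the decision names, the placeholder contacts, the channels and the `who_to_call` helper are assumptions made for the example, not part of any standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str     # placeholder label, not a real person
    group: str    # the function this contact represents
    channel: str  # the agreed way to reach them quickly


# Hypothetical decision types, each with contacts listed in escalation order.
ESCALATION_PATHS = {
    "engage_law_enforcement": [
        Contact("Primary counsel", "legal", "on-call phone"),
        Contact("Deputy counsel", "legal", "on-call phone"),
    ],
    "public_statement": [
        Contact("Media relations lead", "public relations", "mobile"),
        Contact("Chief of staff", "executives", "mobile"),
    ],
    "emergency_spend": [
        Contact("Security director", "information security", "pager"),
        Contact("Finance duty officer", "finance", "mobile"),
    ],
}


def who_to_call(decision, unavailable=frozenset()):
    """Return the first listed contact who is not marked unavailable.

    Returns None when the decision type is unknown or every listed
    contact is unavailable, which signals a gap in the plan.
    """
    for contact in ESCALATION_PATHS.get(decision, []):
        if contact.name not in unavailable:
            return contact
    return None
```

Calling `who_to_call("engage_law_enforcement", unavailable={"Primary counsel"})` falls through to the deputy. A `None` result for any decision the team expects to face is a sign that the pathway needs another entry before an incident, not during one.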

Incident response team roles and responsibilities

The next question you’ll need to address is the internal organization of the team itself. In other words, who among the core team members and cross-functional contacts is responsible for what? This is as important for the incident response team as it is for any other project, program or joint effort. Keep in mind, a lot of different things need to happen during an actual event. You’ll need decision-makers, technical staff (to gather additional data and to research issues) and people to communicate (to other teams, to management and even in some cases externally to law enforcement, the press, customers, business partners and others). You’ll need people to connect with outside parties and carry out numerous other key activities. It’s a good idea to define all these responsibilities ahead of time.

One effective strategy is to create an agreed-upon responsibility assignment matrix. Doing this formally, collectively, collaboratively and in writing can be a huge benefit. Keep in mind that some time may elapse between when you prepare the plan and when it’s actually used as the playbook for a real event. Having the formal artifact both reminds participants of their responsibilities and ensures that there’s no ambiguity about who’s doing what.
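One common form of responsibility assignment matrix is RACI (responsible, accountable, consulted, informed). The sketch below assumes that form and shows how such a matrix might be kept as data and checked for gaps. The activities, role names and helper functions are hypothetical examples chosen for illustration. Your own matrix will differ.

```python
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"  # does the work
    ACCOUNTABLE = "A"  # owns the outcome and makes the call
    CONSULTED = "C"    # gives input before action is taken
    INFORMED = "I"     # is kept up to date


# Hypothetical activities and roles; replace with your organization's own.
MATRIX = {
    "Declare an incident": {
        "IR team lead": Raci.ACCOUNTABLE,
        "Security analyst": Raci.RESPONSIBLE,
        "Legal counsel": Raci.CONSULTED,
        "Executives": Raci.INFORMED,
    },
    "Contain affected systems": {
        "IR team lead": Raci.ACCOUNTABLE,
        "IT support": Raci.RESPONSIBLE,
        "Security analyst": Raci.CONSULTED,
    },
    "Notify law enforcement": {
        "IR team lead": Raci.ACCOUNTABLE,
        "Legal counsel": Raci.RESPONSIBLE,
        "Executives": Raci.CONSULTED,
        "Media relations": Raci.INFORMED,
    },
}


def find_gaps(matrix):
    """List activities without exactly one accountable role or with no responsible role."""
    problems = []
    for activity, assignments in matrix.items():
        values = list(assignments.values())
        accountable = values.count(Raci.ACCOUNTABLE)
        if accountable != 1:
            problems.append(f"{activity}: expected 1 accountable role, found {accountable}")
        if Raci.RESPONSIBLE not in values:
            problems.append(f"{activity}: no responsible role")
    return problems


def duties_for(matrix, role):
    """Return the activities a role appears in, mapped to its RACI letter."""
    return {
        activity: assignments[role].value
        for activity, assignments in matrix.items()
        if role in assignments
    }
```

Running `find_gaps(MATRIX)` whenever the plan changes is a cheap way to catch an activity that nobody owns, and `duties_for(MATRIX, "Legal counsel")` gives each participant a short reminder of what they signed up for.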

You need to answer a few questions as you assign responsibilities. You’ll want to decide who’s leading the group. The time for team friction isn’t during an incident. Defining a single point of accountability — an incident response team manager or team leader — as you plan out your team avoids such friction. This leadership role provides an unambiguous point of contact to executives, rapid decision-making and a clear arbiter of disputes. Likewise, having appropriate technical staff — both those with the skill sets to understand the technology, applications and environments in the organization and those who can research the threats, tradecraft, attacks and indicators of compromise — is important.

Lastly, remember that during an incident you might discover that you need new skills not available to you either on the team or even in-house. For example, you may lack a required specialist in a certain application or system tool, in forensics, in reverse engineering specific types of malware or in some other key area. This means that the team needs to be able to rapidly tap the necessary personnel, drawing either from other teams within the organization or from consultants or external specialists.

The team must therefore be able to communicate with the rest of the organization to locate the necessary resources and get quick access to them when needed. For skills you don’t already have in-house, the time to figure out how to budget for, contract and incorporate people with those skills is not during an incident. It can be beneficial to think through possible relationships with external parties, such as consulting teams, managed security service providers (MSSPs) and forensic specialists, so they can be brought to bear with minimal lag time when needed.

Force multipliers that amp up your team

If you’ve thought through and planned using the above suggestions as a guide, you will have a good idea of who should be on the team — both in terms of the core team and relationships with external groups, including some service providers (consultants or contractors). You will also probably have a good idea of what the roles and responsibilities of the members of the team should be and, ideally, you will have worked to make sure they are empowered to do what they need to do with minimal delay.

The next thing to think through is how to enable the team most effectively and what other resources will be available to it when the time comes to do its work. In other words, in addition to thinking through who will do the work, spend some time on how they’ll do it so they can be as effective as possible. Done well, the “how” of the team’s operations can act as a force multiplier: a combination of factors, such as personnel and tools, that helps the team be more productive during an actual incident.

First of all, consider the model of operation: Will the team be called together only under certain conditions (for example, when an incident is officially declared), or will it exist in some form, either fully or partially staffed, at all times? For example, some organizations might choose to maintain resources within a security operations center (SOC), while others might choose to employ a computer emergency response team (CERT) or computer security incident response team (CSIRT) model.

What’s the difference? A SOC consolidates security operations responsibility under one umbrella. It can be either centralized or virtual and distributed. The role of the SOC can include incident response, but it usually includes other aspects of security operations as well, such as security monitoring, vulnerability scanning and forensics. The SOC might oversee multiple environments, such as cloud environments, on-premises networks and data centers.

The CSIRT or CERT models, by contrast, focus specifically on responding to incidents. These can either operate as part of the SOC, if there is one, or exist independently of it. They can be spun up in an ad hoc fashion (i.e., pulled together from various resources to respond to a particular event) or exist as a fully staffed, separate, permanent operational group. Which one you choose may depend on the size of your organization and its business context.

In addition, think through how the group will function. How and where will it meet? Are participants all in one location, or are they geographically distributed? What tools will they have access to, and how will they communicate and collaborate?

For example, a team that exists as part of a broader SOC might be able to employ existing dedicated space and prefer in-person, physical meetings. They might use tools and software that the SOC already has access to as the backbone of their workflow — for example, using existing ticketing and workflow tools. An ad hoc CSIRT, where team members are all in one place, might choose to carve out a war room in the facility where those team members reside; a geographically distributed CSIRT might prefer to communicate primarily via collaboration tools like Slack, Zoom, Microsoft Teams or purpose-built IR collaboration tools.

The important thing is not to get hung up on the specifics of the right way to organize, operate or communicate, but instead to think it through ahead of time in a systematic, workmanlike way so you don’t run into surprises in the middle of a critical event. Consider what organizational model and communication methods make the most sense given the number of personnel you have involved, the context and budget available, the needs of the organization, the threat landscape and so on.

The point is, thinking these things through ahead of time saves time, frustration and risk down the road. Put in the work to make sure you’re doing what makes the most sense for you and your organization.
