Automating incident response
NIST's Computer Security Incident Handling Guide (SP 800-61) provides a comprehensive incident handling framework, which serves as a blueprint for organisations aiming to improve their security posture. This article looks at opportunities to automate incident response around this standard.
Preparation Phase:
Playbooks: Responses to expected incidents and known security findings or alerts can be automated. For example, you can build an AWS incident response runbook using Jupyter notebooks and CloudTrail Lake, or use Microsoft Sentinel (formerly Azure Sentinel) workbooks.
Asset Inventory: Begin by automating the process of maintaining an up-to-date inventory of assets, including hardware, software, and network components. Use tools for continuous monitoring and asset discovery to ensure comprehensive coverage.
Simulations: Test your IR capabilities by running simulations. These can be tabletop exercises, purple teaming, or red teaming. The latter two, which can involve running scripts or deploying test malware, can be automated using off-the-shelf or custom tools. This helps test readiness continuously, rather than only during scheduled exercises.
Threat landscape: Ingest threat intelligence and use it to enrich your environment and incidents. Automating this enrichment gives responders context for each incident, which saves time.
Consider adding automated follow-up actions when confidence that the incident is a true positive is high. For example, if an email contains a known bad URL or comes from a known bad sender, the incident can be resolved automatically.
Automating forensics: Avoid human error and time-consuming manual evidence collection by automating forensics. Base the workflow on Containment > Acquisition > Examination > Analysis. See https://github.com/awslabs/aws-automated-incident-response-and-forensics
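As a rough illustration of the acquisition step, the sketch below hashes every collected file and records it in a manifest, so the collection is documented without manual effort. It is a minimal sketch using only the Python standard library: the function names, the manifest fields, and the `case_id` and `collected_by` parameters are assumptions made for this example, not part of the AWS project linked above.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk or memory images need not fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(case_id: str, evidence_dir: Path, collected_by: str) -> dict:
    """Record the relative path, size, and SHA-256 of every file under evidence_dir."""
    artifacts = []
    for path in sorted(p for p in evidence_dir.rglob("*") if p.is_file()):
        artifacts.append({
            "path": str(path.relative_to(evidence_dir)),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of_file(path),
        })
    return {
        "case_id": case_id,            # hypothetical case identifier
        "collected_by": collected_by,  # person or automation that ran the collection
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "artifacts": artifacts,
    }
```

The resulting dictionary can be serialised with `json.dumps` and stored alongside the artifacts. Re-hashing a file later and comparing it with the manifest shows whether the evidence has changed since collection.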
Detection and Analysis Phase (Operations):
Detection rules: Consider ways to automate the creation of detection rules. If you create detections based on TTPs, do you rely on manually researching new TTPs, or is there a way to automate this?
Detection: What happens when a detection is triggered? A ticket is created, or a notification is sent to the SOC. Then what? How can you automatically trigger the next phase: contain, eradicate, and recover?
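One possible answer to the "then what?" question is a small triage step that enriches the alert with threat intelligence and then decides the hand-off. The sketch below is illustrative only: the `Alert` class, the `KNOWN_BAD` feed, and the action names are hypothetical, and in practice the feed would come from a threat intelligence platform and the action would call your SOAR or ticketing integration.

```python
from dataclasses import dataclass, field

# Hypothetical indicator feed; a real one would be pulled from a threat intelligence source.
KNOWN_BAD = {
    "url": {"http://malicious.example/login"},
    "sender": {"phish@bad.example"},
}


@dataclass
class Alert:
    alert_id: str
    indicators: dict  # for example {"url": [...], "sender": [...]}
    matches: list = field(default_factory=list)


def enrich(alert: Alert, known_bad: dict = KNOWN_BAD) -> Alert:
    """Attach every indicator on the alert that also appears in the known-bad feed."""
    alert.matches = [
        (kind, value)
        for kind, values in alert.indicators.items()
        for value in values
        if value in known_bad.get(kind, set())
    ]
    return alert


def next_action(alert: Alert) -> str:
    """Hand off to automated containment on a known-bad match, otherwise to an analyst."""
    return "auto_contain" if alert.matches else "analyst_review"
```

Keeping the decision in one small function makes the automation threshold easy to review and tighten, for example by requiring more than one match before acting without a human.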
Containment, Eradication, and Recovery Phase:
Validate, scope, and assess the impact of the alert: Check the origin and environment of the alert to confirm that it is a genuine incident.
Enrich: Add threat intelligence to the alert to support quick containment decisions. Implement an automation that queries logs (such as CloudTrail) for relevant activity (such as API calls) performed by the identity or resource named in the alert around the time of the alert, providing additional insight.
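The log query described above could look roughly like the following. This is a sketch under assumptions: it only builds the arguments for CloudTrail's `LookupEvents` API and summarises the response, so it runs without AWS access. The helper names and the 30-minute default window are choices made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def lookup_window(alert_time: datetime, username: str, minutes: int = 30) -> dict:
    """Build keyword arguments for CloudTrail LookupEvents, centred on the alert time."""
    delta = timedelta(minutes=minutes)
    return {
        "LookupAttributes": [{"AttributeKey": "Username", "AttributeValue": username}],
        "StartTime": alert_time - delta,
        "EndTime": alert_time + delta,
        "MaxResults": 50,  # LookupEvents returns at most 50 events per page
    }


def summarise(events: list) -> dict:
    """Count API calls by event name so a responder sees the shape of the activity."""
    return dict(Counter(event.get("EventName", "unknown") for event in events))


# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   response = boto3.client("cloudtrail").lookup_events(**lookup_window(alert_time, "alice"))
#   print(summarise(response["Events"]))
```

A single call returns only the first page of results, so a busy identity would need pagination via the response's `NextToken`. For longer retention or SQL-style queries, CloudTrail Lake may be a better fit.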
Forensics: Again, automate the evidence collection and automatically generate documentation of what was collected. Store the artifacts in read-only repositories.
Eradication: Once a decision to eradicate has been made, you can automate a handful of actions, such as IAM changes, key deletion or rotation, and resource deletion. This cuts response time and limits impact.
Recovery: Rebuilding resources can take time, so infrastructure-as-code (IaC) templates (Terraform, Azure templates, CloudFormation, etc.) can help you quickly rebuild safe and clean infrastructure.
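As one example of the key-related eradication actions above, the sketch below deactivates every active access key for an IAM user. It assumes a boto3 IAM client is passed in and uses the `list_access_keys` and `update_access_key` calls. The function name is hypothetical, and deactivating rather than deleting is a choice made here so the action is reversible and the key ID remains available for the investigation.

```python
def deactivate_access_keys(iam_client, user_name: str) -> list:
    """Set every active access key for user_name to Inactive and return the key IDs changed."""
    deactivated = []
    response = iam_client.list_access_keys(UserName=user_name)
    for key in response["AccessKeyMetadata"]:
        if key["Status"] == "Active":
            iam_client.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated


# With boto3 installed and credentials configured, usage would look like:
#   import boto3
#   changed = deactivate_access_keys(boto3.client("iam"), "compromised-user")
```

Passing the client in as a parameter keeps the function easy to test with a stub and lets the caller decide which account or role the action runs under.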
Post-Incident Activity Phase:
Incident Reporting and documentation: Automate the generation of incident reports and documentation to maintain a comprehensive record of security incidents, response actions taken, and lessons learned. This documentation is invaluable for post-incident analysis and compliance purposes.
Metrics: To help improve IR capabilities, automate the collection of metrics such as mean time to detect, acknowledge, respond, contain, and recover. These can be added automatically to the post-incident report.
IOCs: Some of the most valuable IOCs can come from known bad activity within your own organisation. Think of it as an internal honeypot: you know the traffic there will be bad, so collect that information, enrich it, and create detections against it. Automate the whole process.
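The metrics mentioned above can be computed directly from incident timestamps. The sketch below assumes each incident is a dictionary with hypothetical fields such as `occurred_at`, `detected_at`, `acknowledged_at`, `contained_at`, and `recovered_at`. Definitions of these metrics vary between organisations, so the start and end points chosen here are assumptions to adapt.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents: list, start_field: str, end_field: str) -> float:
    """Mean elapsed minutes between two timestamps, skipping incidents missing either one."""
    durations = [
        (incident[end_field] - incident[start_field]).total_seconds() / 60
        for incident in incidents
        if incident.get(start_field) and incident.get(end_field)
    ]
    return mean(durations) if durations else float("nan")


def ir_metrics(incidents: list) -> dict:
    """Summarise common IR metrics; all values are in minutes."""
    return {
        "mean_time_to_detect": mean_minutes(incidents, "occurred_at", "detected_at"),
        "mean_time_to_acknowledge": mean_minutes(incidents, "detected_at", "acknowledged_at"),
        "mean_time_to_contain": mean_minutes(incidents, "detected_at", "contained_at"),
        "mean_time_to_recover": mean_minutes(incidents, "detected_at", "recovered_at"),
    }
```

Incidents that are still open simply lack the later timestamps and are skipped for those metrics, so the same function can feed both a live dashboard and a post-incident report.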
Visualising your SOC processes or incident flows could help identify areas which can be automated. Taken from ISF's "Building a Successful SOC" report:
AWS have some detail and CloudFormation templates on automating IR and forensics:
SOAR or XDR solutions already exist. Using them instead of custom scripts helps with scalability and maintenance. I'll give an example of how Microsoft Defender can automate incident response:
Alert creates an incident
An automated investigation starts
The investigation results in a verdict: malicious, suspicious, or no threat found.
Remediation actions are identified for malicious or suspicious entities, such as quarantining a file, stopping a process, isolating a device, or blocking a URL. These actions can run automatically or require approval, depending on configuration.
Related alerts are added during investigation.
Signals are correlated and added.