Vikram Mehta, Associate Director - Information Security, MakeMyTrip
Attack vectors are getting more complicated as technology advances, and with this, triage, intelligence gathering and response procedures also become tedious and time-consuming. We are increasingly seeing automation used to launch attacks or detect vulnerabilities in systems; in fact, most reconnaissance is run unattended. And there’s more: compute is getting cheaper, faster and easier to obtain; exploitation toolkits are sophisticated; phishing comes as a managed service. Well, I guess it’s time we start responding to attacks even faster than before, machine to machine!
There has been significant advancement in SOC and incident response/remediation technology, with next-gen SOC implementations, SOAR, etc. I don’t intend to reinvent the wheel with this write-up; instead, I want to call out the overall approach an organization could take to stitch a few open source implementations together and build its very own cutting-edge, scalable next-gen SOC. The setup would not only detect advanced or unknown attacks, but would also be able to respond to them in an automated or semi-automated manner.
Let’s take a look…
1. The Right Skill-set: It takes more than just security know-how to build a next-gen SOC. The first aspect to consider: a next-gen SOC isn’t just an event processing system with a persistent data store and event correlation capabilities. It is a big data backed threat detection powerhouse and is best run on a full-fledged big data platform consisting of components such as Apache Hadoop, Storm, Kafka, Spark, Elasticsearch, MapReduce, Hive, and many more. It would be beneficial to on-board (or train) a set of people to implement and manage the platform itself.
2. Building the Pipeline: How does one manage diverse data sources and a high-throughput data pipeline at the same time? Some brilliant work has gone into “Apache Metron” just for this purpose (and more, as you will see). There are a few options for bringing the required telemetry into Metron’s parsing topology. Depending on your data sources (and potential throughput), you could choose a tool like Apache NiFi, which ingests events from diverse data sources and builds complex processing workflows, or kafkacat, which supports extremely high throughput with simple workflows. Once the events are pumped into Kafka from these systems, you can leave it to Metron’s parsing topology to convert them into neat JSON for you and forward them for further processing.
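To make the parsing step concrete, here is a minimal Python sketch of what a parser does conceptually: it takes one raw log line and emits a flat JSON document. This is not Metron's parser API. The log layout, the regular expression, and the field names (chosen to resemble the flat, normalized naming a SOC pipeline would use) are all assumptions made for illustration.

```python
import json
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical log layout: "<ISO timestamp> <client ip> <method> <url> <status>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<ip>\S+) (?P<method>[A-Z]+) (?P<url>\S+) (?P<status>\d{3})$"
)


def parse_line(raw: str) -> Optional[dict]:
    """Turn one raw log line into a flat dict, or None if it does not match."""
    line = raw.strip()
    match = LINE_RE.match(line)
    if match is None:
        return None  # a real pipeline would route this to an error queue
    ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)
    return {
        "timestamp": int(ts.timestamp() * 1000),  # epoch milliseconds
        "ip_src_addr": match["ip"],               # illustrative field name
        "method": match["method"],
        "url": match["url"],
        "status": int(match["status"]),
        "original_string": line,                  # keep the raw line for analysts
    }


event = parse_line("2019-03-01T10:00:00Z 10.0.0.5 GET /login 200")
print(json.dumps(event, sort_keys=True))
```

Keeping the original line next to the parsed fields is a deliberate choice: when a parser misreads something, the analyst can still see what actually arrived.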
3. Building the Intelligence: I would like to split intelligence across two contexts, 1) events and 2) alerts, and here’s why that matters. The more contextual intelligence the merrier, right? Everything, however, comes at a cost. Gathering intelligence and enriching thousands of events per second can get expensive. It becomes important to decide which enrichment or intelligence attributes you would like to tag in the event context (for example Geo, AD, DHCP, IP/domain-based threat intel) and which in the alert context (reverse DNS, prior alert/event history, WHOIS, external API-based threat intel).
As you can see, I’ve tried to classify any static lookup-based enrichment in the event context, and anything more real-time in the alert context. Metron gives us the enrichment topology that’s backed by an HBase data store, which is the perfect fit for event enrichment. As for alert-level enrichment, at MakeMyTrip we decided to create “blitz” (available at MakeMyTrip’s GitHub repo) with built-in plug-ins that help enrich an alert with virtually anything that has an API; there are other publicly available tools as well.
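The split between the two contexts can be sketched in a few lines of Python. This is an illustration only, not the interface of Metron's enrichment topology or of blitz: the lookup tables are made-up sample data, the plug-in registry is a hypothetical design, and the "expensive" lookups are stubbed so the example runs offline.

```python
from typing import Callable, Dict

# Event context: cheap, static lookups applied to every single event.
# Sample data standing in for a Geo database and a threat intel feed.
GEO_BY_IP = {"203.0.113.7": "NL", "198.51.100.9": "US"}
KNOWN_BAD_IPS = {"203.0.113.7"}


def enrich_event(event: dict) -> dict:
    """Tag an event with static lookups only; no network calls here."""
    ip = event.get("ip_src_addr")
    enriched = dict(event)
    enriched["geo_country"] = GEO_BY_IP.get(ip, "unknown")
    enriched["threatintel_hit"] = ip in KNOWN_BAD_IPS
    return enriched


# Alert context: costlier lookups, run only for the few events that became alerts.
ALERT_ENRICHERS: Dict[str, Callable[[dict], dict]] = {}


def alert_enricher(name: str):
    """Decorator that registers a plug-in under a name (hypothetical design)."""
    def register(func):
        ALERT_ENRICHERS[name] = func
        return func
    return register


@alert_enricher("reverse_dns")
def reverse_dns(alert: dict) -> dict:
    # Stub: a real plug-in would perform an actual reverse DNS lookup.
    host = "host-" + alert["ip_src_addr"].replace(".", "-") + ".example.net"
    return {"hostname": host}


@alert_enricher("history")
def prior_alerts(alert: dict) -> dict:
    # Stub: a real plug-in would query the alert store.
    return {"prior_alert_count": 0}


def enrich_alert(alert: dict) -> dict:
    """Run every registered plug-in; one failing plug-in must not block the rest."""
    enriched = dict(alert)
    for name, plugin in ALERT_ENRICHERS.items():
        try:
            enriched[name] = plugin(alert)
        except Exception as exc:
            enriched[name] = {"error": str(exc)}
    return enriched


event = enrich_event({"ip_src_addr": "203.0.113.7", "url": "/login"})
alert = enrich_alert(event)
```

The point of the registry is that adding a new intelligence source means writing one small function, while the per-event path stays limited to in-memory lookups.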
4. Profiling & Detecting the “Unknown”: Apache Metron also gives us a powerful utility called the “profiler”, which enables a SOC to build profiles out of any attribute that is available as an output of step 3 above. If designed well, profiles can give a SOC meaningful insights and can be used to detect deviations from normal behaviour or trends. For example, a profile can be built to power the following use cases:
a. Unusual user logon based on device or Geo location
b. Abnormal traffic to/from a Geo location
c. Abnormal traffic on a URL, from a user agent, or from an IP
d. Abnormal server activity based on ports, user agents, or connections
e. Abnormal volume of data exchanged by a client or a server
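The idea behind such profiles can be sketched as a rolling baseline per entity, with a flag raised when a new value strays too far from it. The code below is plain Python, not Metron profiler configuration; the class name, window size, threshold, and the choice of a z-score test are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class Profile:
    """Rolling per-entity baseline with a simple z-score deviation check."""

    def __init__(self, window: int = 24, threshold: float = 3.0, min_points: int = 5):
        self.threshold = threshold
        self.min_points = min_points
        # One bounded history per entity (an IP, a user, a URL, and so on).
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, entity: str, value: float) -> bool:
        """Record a value and report whether it deviates from the baseline."""
        past = self.history[entity]
        anomalous = False
        if len(past) >= self.min_points:  # stay quiet until a baseline exists
            mu = mean(past)
            sigma = pstdev(past)
            if sigma == 0:
                anomalous = value != mu
            else:
                anomalous = abs(value - mu) / sigma > self.threshold
        # Simplification: anomalous values also enter the baseline.
        past.append(value)
        return anomalous


profile = Profile()
# Hypothetical hourly request counts from one client IP.
for count in [100, 102, 98, 101, 99, 100]:
    profile.observe("10.0.0.5", count)
spike = profile.observe("10.0.0.5", 500)
print("spike flagged:", spike)
```

The same structure covers most of the use cases listed above by changing what is used as the entity (user, Geo location, URL, server) and what is measured (logons, requests, bytes).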
One can take profiling to the next level by leveraging machine learning capabilities offered by either Metron itself (using its Model-as-a-Service / MaaS module) or “dataShark” (an open source offering by MakeMyTrip). Again, there are numerous other alternatives to these as well.
5. Death by Automation: We have detected an adversary, great! So what? Responding to an attack is as important as detecting one, if not more so. There will always be a human SOC analyst, unless AI replaces them, right? Let’s take the mundane load off the analyst by automating or semi-automating routine tasks and actions wherever possible; that’s where security orchestration becomes most relevant. Here’s where one could use SOAR frameworks or “blitz” again. Some orchestration use cases could include:
a. Gathering the right intelligence in real-time (using API / DB calls, WHOIS, reverse DNS, etc.)
b. Automated blocking of web attack sources using pre-fed intelligence
c. Semi-automated blocking of web attack sources using a single click
d. Single click endpoint remediation using AV / stingers
e. Single click endpoint quarantine using network devices
f. Automated incident ticket creation, updates, and closure
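The difference between automated and semi-automated response can be sketched as a small playbook: sources already present in pre-fed intelligence are blocked straight away, everything else waits for a single analyst approval, and a ticket is opened either way. This is a hypothetical design in plain Python, not the API of blitz or of any SOAR product; the blocking and ticketing steps are in-memory stand-ins for real firewall and ticketing calls.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Playbook:
    blocklist_intel: Set[str]                       # pre-fed intelligence
    blocked: List[str] = field(default_factory=list)
    pending_approval: List[str] = field(default_factory=list)
    tickets: List[dict] = field(default_factory=list)

    def handle(self, alert: dict) -> str:
        """Decide the response for one alert and always open a ticket."""
        ip = alert["ip_src_addr"]
        if ip in self.blocklist_intel:
            self.blocked.append(ip)           # stand-in for a firewall/WAF call
            action = "auto_blocked"
        else:
            self.pending_approval.append(ip)  # waits for the analyst's click
            action = "awaiting_analyst"
        self.tickets.append(
            {"id": len(self.tickets) + 1, "ip": ip, "action": action}
        )
        return action

    def approve(self, ip: str) -> bool:
        """The 'single click': an analyst confirms a pending block."""
        if ip not in self.pending_approval:
            return False
        self.pending_approval.remove(ip)
        self.blocked.append(ip)
        for ticket in self.tickets:
            if ticket["ip"] == ip and ticket["action"] == "awaiting_analyst":
                ticket["action"] = "analyst_blocked"
        return True


playbook = Playbook(blocklist_intel={"203.0.113.7"})
first = playbook.handle({"ip_src_addr": "203.0.113.7"})   # known bad source
second = playbook.handle({"ip_src_addr": "198.51.100.9"})  # unknown source
approved = playbook.approve("198.51.100.9")
```

Keeping the approval step separate from the detection logic lets a team start semi-automated and move individual actions to fully automated as confidence in the underlying intelligence grows.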
Again, security is never 100 percent effective, and neither is a SOC. What matters is how much a SOC can detect and respond to in a timely manner within the given resources and limitations, how far it minimizes manual intervention, and how well it automates routine or obvious tasks and actions.
1. Remember, the entire platform we discussed is available as open source software; it’s just about the right skill-set and mind-set!
2. Keep your friends close and your enemies closer! A good understanding of your organization’s threat landscape is the first building block in drafting SOC use cases, so build them well!
3. Start slow: if you decide to embark on this journey yourself, take your time to digest the setup and its numerous moving parts. It can get tricky to work with.
4. Sky is the limit! We have worked with these platforms for a while now, and trust me, we have just scratched the surface. Once deployed well, you will see a plethora of opportunities awaiting you. It’s totally worth it.