Data Architecture Considerations Involving Cybersecurity and Data Analytics
One would be hard pressed to find two enterprise services that require more data than Cybersecurity and Data Analytics. According to informatica.com, “Data analytics is the pursuit of extracting meaning from raw data using specialized computer systems.” Data analytics cannot exist without access to data, which it turns into information. Likewise, without data, cybersecurity services such as vulnerability scanning, threat hunting and incident response are impossible. This is part of the reason why IoT (Internet of Things) is such a challenge for many organizations: there is often a lack of data about these devices, which makes them difficult to secure. Data architecture is what allows data to be collected, transformed and eventually consumed as actionable information.
1. While the Coin is the Same, How You Spend It Might Be Different
While data is critical for both Cybersecurity and Data Analytics, how each discipline uses it can differ, depending on the scope and maturity of the service. When cybersecurity tools such as SIEMs (Security Information and Event Management) are set up, attention is given to which data sources are ingested for dashboarding, querying and reporting. Part of the reason is to avoid creating too much ‘white noise’ when reviewing operations or investigating suspicious behavior. White noise is essentially false positives that lead analysts down digital rabbit holes.
2. Cybersecurity and Data Analytics Can Co-Exist Through Smart Data Architecture Planning
Between these disciplines, we are potentially talking about a lot of data. In an ideal world, the data architecture within an organization would serve both needs. In many cases, this is feasible. Let’s take the initial collection of data prior to consumption. If you are already forwarding security, event and application log data from servers and workstations to your SIEM, why not also forward the data needed for your overall analytics efforts? Better yet, if you need a wider swath of data for enterprise analytics, consider collecting all the data available from your endpoints and then parsing the feed down to only what you need for your SIEM. This might require you to design an aggregation point where you can create multiple feeds of varying data content to go to different locations (e.g., Hadoop or SIEM index servers).
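To make the aggregation point idea concrete, here is a minimal Python sketch. It assumes events arrive as simple dictionaries with a `category` field, and that the SIEM only needs a hypothetical set of categories; the field names, categories and feed names are illustrative assumptions, not part of any particular product.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]

# Hypothetical categories the SIEM cares about (an assumption for illustration).
SIEM_CATEGORIES = {"security", "authentication"}


def siem_filter(event: Event) -> bool:
    """Keep only the events the SIEM needs, to limit noise and ingest volume."""
    return event.get("category") in SIEM_CATEGORIES


def analytics_filter(event: Event) -> bool:
    """The analytics feed takes everything collected from the endpoints."""
    return True


def fan_out(
    events: Iterable[Event],
    routes: Dict[str, Callable[[Event], bool]],
) -> Dict[str, List[Event]]:
    """Split one collected stream into several named feeds."""
    feeds: Dict[str, List[Event]] = {name: [] for name in routes}
    for event in events:
        for name, keep in routes.items():
            if keep(event):
                feeds[name].append(event)
    return feeds


# Made-up sample events standing in for endpoint log data.
events = [
    {"host": "ws-01", "category": "authentication", "msg": "failed login"},
    {"host": "srv-02", "category": "application", "msg": "report generated"},
    {"host": "srv-02", "category": "security", "msg": "firewall rule changed"},
]

feeds = fan_out(events, {"siem": siem_filter, "analytics": analytics_filter})
print({name: len(items) for name, items in feeds.items()})
```

In a real deployment the filters would live in whatever log forwarder or message broker acts as the aggregation point; the sketch only shows the collect-once, route-many pattern.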
With data analytics, the more diverse the data sources, the ‘better’ the analysis
One important distinction you may find between Cybersecurity and Data Analytics is how much manipulation the data requires. In the case of SIEMs, you are typically just ingesting log data. Unless the logs are unsupported by your tool, it should just be a matter of collecting the data, which may be easier said than done. Depending on your analytics use case, the sources of data may go well beyond log information. If this is the case, there may be serious ETL (extract, transform and load) work needed. This should factor into what your ultimate data architecture looks like.
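As a rough illustration of what ETL work involves, the sketch below extracts rows from a CSV export, transforms them (normalizing host names, timestamps and missing values) and loads them into a SQLite table. The column names, the sample data and the assumption that timestamps are in UTC are all hypothetical; a real pipeline would be shaped by your actual sources.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Made-up CSV export standing in for a non-log data source.
RAW_CSV = """host,logged_at,bytes_sent
WEB01,2024-01-05 13:45:00,1024
db02,2024-01-05 13:46:30,
"""


def extract(text: str) -> list:
    """Extract: read the raw rows from the CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list) -> list:
    """Transform: normalize host case, timestamps and missing values."""
    cleaned = []
    for row in rows:
        # Assumption: source timestamps are already in UTC.
        ts = datetime.strptime(row["logged_at"], "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(tzinfo=timezone.utc)
        bytes_sent = int(row["bytes_sent"] or 0)  # empty field becomes 0
        cleaned.append((row["host"].lower(), ts.isoformat(), bytes_sent))
    return cleaned


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into a queryable table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS host_traffic "
        "(host TEXT, logged_at TEXT, bytes_sent INTEGER)"
    )
    conn.executemany("INSERT INTO host_traffic VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(bytes_sent) FROM host_traffic").fetchone()[0]
print(total)
```

Even in this toy case, most of the code sits in the transform step, which is typically where the effort goes when sources are diverse.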
3. Look at the Checkbook
Most of us exist in organizations with finite budgets. Understanding how tools for Cybersecurity and Data Analytics leverage resources is critical to ensuring your data architecture meets your needs. For example, with many data analytics tools, it isn’t a simple matter of dumping all your company’s data into an SQL data warehouse. How large are the data sets? Is the data generally structured or is it highly unstructured? Where does the data modeling occur? After answering these and other questions, ask yourself if your cybersecurity tools can leverage the same data architecture. Based on the security use cases, can you afford the architecture? A great example of this is how SIEM vendors are encouraging organizations to perform all enterprise data analysis, and not just cybersecurity investigations, with their tools. What does the license model for the SIEM look like? If it is consumption-based, can your company afford this approach? As noted above, data professionals tend to want access to all the data available so they can explore the relationships between data elements. This may run contrary to the model where you ingest only what your SIEM needs to properly monitor the security of your environment, which also keeps costs down.
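A quick back-of-the-envelope calculation can make the consumption-based licensing question tangible. The sketch below assumes a flat price per gigabyte ingested, which is a simplification; real license models vary by vendor and often use tiers. The volumes and price are placeholder numbers, not actual vendor pricing.

```python
def annual_ingest_cost(daily_gb: float, price_per_gb: float, days: int = 365) -> float:
    """Estimate yearly cost under a flat per-GB ingest price (a simplifying assumption)."""
    return daily_gb * price_per_gb * days


# Placeholder inputs for illustration only.
PRICE_PER_GB = 0.5       # hypothetical price per GB ingested
FULL_FEED_GB = 500.0     # hypothetical volume if everything goes to the SIEM
FILTERED_FEED_GB = 50.0  # hypothetical volume if only security-relevant data goes

full_cost = annual_ingest_cost(FULL_FEED_GB, PRICE_PER_GB)
filtered_cost = annual_ingest_cost(FILTERED_FEED_GB, PRICE_PER_GB)

print(f"Full feed:     {full_cost:,.2f} per year")
print(f"Filtered feed: {filtered_cost:,.2f} per year")
print(f"Difference:    {full_cost - filtered_cost:,.2f} per year")
```

Plugging in your own daily volumes and quoted pricing gives a first-order view of whether running all enterprise analysis through the SIEM is affordable.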
If your organization has a data architect or similar position, it is imperative that they take into consideration your analytics and cybersecurity functions and requirements. If planned properly, there are many synergies to be had, especially as it relates to the investment, implementation and execution of a mature data architecture.