Level Up: Data Classification

Doing anything fun, exciting, and meaningful requires risk. Discovering beauty of any kind most certainly requires risk. Doing business, in good times or in a pandemic, requires risk. It would be lovely to lower our risk to zero, but that isn’t possible. A framework for determining risk levels for what’s important is as practical as it is critical, yet it is often missing from organizations and from their leadership’s collective mental model.

Data drives business success and also incentivizes criminals to find new ways of compromising and exploiting it. This is why compliance standards exist: they give every organization a roadmap to protect, or at least show due diligence in protecting, its own data and that of its clients, partners, and employees. Data is valuable but not all data is valued equally.

That’s why data classification is the killer app for a few crucial things. It cuts the time you spend filling out security questionnaires for your clients’ requirements related to emerging data protection and privacy laws. It also levels up you, your organization, and its resilience to a broad scope of unplanned events, including fraud, identity theft, global events, and even pandemics, that can impact your productivity, reputation, and the bottom line.

What’s data classification?

Data classification means creating categories for specific types of data, along with their individual and collective risk factors. Generally, this begins with four categories:

  • Critical
    • Financial accounts (banking, credit cards, etc)
    • Encryption keys
    • Login info (usernames and passwords)
    • Protected health information (PHI)
    • Proprietary Secrets
  • Highly Confidential
    • Personally Identifiable Information of clients, employees, etc (PII)
      • Personal email addresses
      • Physical home addresses
      • Personal phone numbers
      • First and last names
    • Geolocation data
    • Source code
    • Intellectual property (product details)
    • Company financial processes
    • Tax information
    • Health benefits and usage (ID thieves often file bogus insurance claims)
    • Employee payroll details
  • Confidential
    • Internal workflow/processes
    • Internal domain names
    • Single components of PII without context
    • Internal tooling data (Slack, Teams, etc.)
    • IP addresses, network and endpoint logs, etc
  • Public
    • Employee names and emails
    • Company Address and Phone Numbers
    • Public Domain Names
    • Public Statements
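
For teams that want this list in a form their tooling can read, here is a minimal sketch in Python. The four tiers come from the list above; the catalog keys (such as `encryption_key`) and the `highest_tier` helper are hypothetical names chosen for illustration, and your own catalog will look different.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four tiers from the list above; a higher value means higher risk."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    HIGHLY_CONFIDENTIAL = 2
    CRITICAL = 3


# Hypothetical catalog keys. The tier for each follows the example list above.
CATALOG = {
    "encryption_key": Classification.CRITICAL,
    "login_credentials": Classification.CRITICAL,
    "home_address": Classification.HIGHLY_CONFIDENTIAL,
    "source_code": Classification.HIGHLY_CONFIDENTIAL,
    "internal_domain_name": Classification.CONFIDENTIAL,
    "public_statement": Classification.PUBLIC,
}


def highest_tier(data_types):
    """Return the strictest tier among the given data types."""
    return max(CATALOG[name] for name in data_types)


# A system holding both public statements and source code inherits the
# stricter of the two tiers.
print(highest_tier(["public_statement", "source_code"]).name)  # HIGHLY_CONFIDENTIAL
```

Ordering the tiers as integers is a design choice in this sketch: it lets a document, database, or vendor that holds several data types take on the strictest tier with a single `max` call.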

Each data point is categorized by a risk rating. Guess what happens when this information is available to your team and organization? They start to think about it. They begin to treat data with more care and intention, minimizing preventable risks to themselves as individuals and also as a collective. The impact over time is demonstrable to your board, your clients, and even your new business opportunities.

That’s right. Risk Management is sexy to potential new clients who might find it comforting to know you value their data enough to actively build a culture that values protecting it.

How do you begin?

First, identify what types of data the organization requires. Not every organization needs to collect and store customer payment information. Some only need customer email addresses. Most organizations need to handle employee data. Some need to handle Social Security numbers (SSNs), geolocation data, or voice-activated data. Start by identifying all the data types your organization requires.

Next, choose your classification categories. Begin with general categories like the ones above. You’ll likely find you’ll need some customized ones. It’s not the same for every organization.

Then, match each data type with one data classification. Each classification can have many data types associated with it. This can be a challenge if some data types don’t have a clear risk profile. For example, the last four digits of a credit card number can be low-risk by themselves, but when combined with other data (a first and last name and a birthdate) they can be used to confirm someone’s identity, legitimately or illegitimately.
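
One way to capture the credit card example above is a small set of combination rules that sit next to the one-to-one mapping. The sketch below is illustrative only: the field names, the baseline tiers, and the tier assigned to the combination are all assumptions, not a standard.

```python
# Tiers as numbers: 0 Public, 1 Confidential, 2 Highly Confidential, 3 Critical.
# Hypothetical baseline tiers for single fields seen without other context.
BASE_TIER = {
    "card_last_four": 1,
    "full_name": 1,
    "birthdate": 1,
}

# Hypothetical combination rules: when every field in the set appears in the
# same record, the record is treated as at least the given tier.
COMBINATION_RULES = [
    (frozenset({"card_last_four", "full_name", "birthdate"}), 2),
]


def record_tier(fields):
    """Return the tier for a record that contains the given fields."""
    present = set(fields)
    tier = max(BASE_TIER[name] for name in present)
    for combination, combined_tier in COMBINATION_RULES:
        if combination <= present:  # every field in the rule is present
            tier = max(tier, combined_tier)
    return tier
```

Under these assumed values, `record_tier(["card_last_four"])` returns 1, while a record holding all three fields returns 2. Each data type still has exactly one baseline classification; the rules only raise the tier of a record where risky fields meet.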

You’ll no doubt come across data types that are hard to classify. In those situations, ask yourself: ‘Does this data exist in bulk?’ or ‘Is there no other context?’ If the data is in bulk, then it’s typically a higher risk tier, particularly if it’s all stored in one platform. If the data is incomplete (imagine customer IDs, zip codes, and order numbers on their own), it may be classified in a lower risk tier.
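
The two questions above can be turned into a rough first-pass heuristic. This is a sketch under stated assumptions: the `suggest_tier` function, the threshold of 1,000 records for "bulk", and the one-step adjustments are hypothetical, and the result is a suggestion for a person to review, not a final classification.

```python
# Tiers as numbers, lowest to highest risk.
PUBLIC, CONFIDENTIAL, HIGHLY_CONFIDENTIAL, CRITICAL = range(4)


def suggest_tier(base_tier, record_count, has_context, bulk_threshold=1000):
    """Suggest a tier for a hard-to-classify data set.

    bulk_threshold is an assumed cut-off for "bulk"; pick one that fits
    your own organization.
    """
    if base_tier == PUBLIC:
        return PUBLIC  # public data stays public regardless of volume
    tier = base_tier
    if record_count >= bulk_threshold:
        tier += 1  # data in bulk is typically a higher risk tier
    if not has_context:
        tier -= 1  # incomplete data may sit in a lower risk tier
    # Keep the suggestion between Confidential and Critical, so the
    # heuristic never makes non-public data public on its own.
    return max(CONFIDENTIAL, min(CRITICAL, tier))
```

For example, `suggest_tier(HIGHLY_CONFIDENTIAL, record_count=50_000, has_context=True)` suggests Critical, and the same data type as a handful of context-free values suggests Confidential.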

It’s critical to understand how and where your data is stored, because how widely it is distributed across systems can determine how it’s classified.

“Can I just copy another one?”

It’s tempting to download a Data Classification guide from another organization and copy it for your own. If you do, be careful. These are unique to organizations. A template may seem like a good starting point but often creates more work than starting properly from scratch. It’s easy to miss data types that your company should be concerned with, and you’ll often end up classifying data types that your company is not concerned with at all.

Also, keep in mind that the categories in a borrowed Data Classification guide may not mirror the context of your organization’s business model and/or existing policies and procedures. This is fine while things are awesome. However, when an unplanned event raises the level of scrutiny, you’ll regret taking shortcuts that increased your liability.

There is no shortcut around making time to chat with people in your organization to learn what data drives your success and is worth understanding. Use those conversations to build trust and rapport with key stakeholders.

Next Steps

Creating a Data Classification guide for your organization minimizes the time you’ll spend on security questionnaires and reviews. It also helps with building better onboarding guides for FTEs and freelancers, with identity and access management, and with much more than compliance with GDPR, CCPA, CPRA, etc. You’ll be able to quickly prioritize things, and to deprioritize those that don’t pose much risk to your organization.

Here are the steps in a nutshell:

  1. Catalog all the data types your company handles
  2. Build your own, relevant classifications
  3. Assign those classifications to each data type
  4. Document and communicate this to the culture-at-large
  5. Consult it when you need to respond to security audits/questionnaires
  6. Review it annually and make adjustments as needed (maintaining it is easy once the heavy lifting is done)
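
Two of the steps above lend themselves to simple automated checks: spotting cataloged data types that still lack a classification, and flagging when the annual review is overdue. The sketch below assumes a hypothetical inventory and treats "annually" as 365 days; every name in it is illustrative.

```python
from datetime import date, timedelta

# Hypothetical inventory (step 1) and assignments (step 3).
DATA_TYPES = ["customer_email", "employee_payroll", "order_number"]
ASSIGNED = {
    "customer_email": "Highly Confidential",
    "employee_payroll": "Highly Confidential",
}


def unclassified(data_types, assigned):
    """List the cataloged data types that have no classification yet."""
    return [name for name in data_types if name not in assigned]


def review_due(last_reviewed, today, interval=timedelta(days=365)):
    """Return True once the last review is more than `interval` old."""
    return today - last_reviewed > interval


print(unclassified(DATA_TYPES, ASSIGNED))  # ['order_number']
print(review_due(date(2023, 1, 1), today=date(2024, 6, 1)))  # True
```

Checks like these don’t replace the conversations with stakeholders, but they make it harder for a new data type to slip in unclassified between reviews.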