Just Because You Can Keep It Doesn’t Mean You Should

by Tim Marley

Over the last few weeks, we have talked about knowing what data you have and who has access to it.

CIS Control 3.1 – We discussed the importance of creating a data management program, looking at the overall program before breaking down its key elements.

CIS Control 3.2 – We broke down and described the process and deliverables of a data inventory.

CIS Control 3.3 – We learned about data access control lists, better known as access permissions.

The next logical question is how long that data should exist at all. CIS 8.1.2, Safeguard 3.4 answers exactly that: “Retain data according to the enterprise’s documented data management process. Data retention must include both minimum and maximum timelines.”

Data retention often gets treated as an administrative exercise. In reality, it is a risk decision. Most organizations default to one of two positions: keep everything forever because storage is inexpensive, or delete things inconsistently when someone runs out of space. Neither approach reflects intentional governance.

Retention requires balancing two competing obligations. On one hand, you cannot delete data you are legally required to keep. Regulatory requirements often define minimum retention periods for specific types of records such as healthcare information, employment documentation, tax records, and financial reporting materials. On the other hand, keeping sensitive data longer than necessary increases exposure. If a breach occurs, you are responsible for everything you still have, not just the data you actively use. A minimum retention requirement does not mean indefinite retention.

Retention timelines can also be suspended when a legal hold is issued. When litigation, investigation, or regulatory inquiry is reasonably anticipated, the organization must preserve relevant data until the hold is lifted. In those circumstances, deletion pauses. Retention is no longer governed solely by the schedule, but by legal obligation.
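To make the interplay between minimums, maximums, and legal holds concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a prescribed implementation: the names `RetentionRule` and `disposition`, the status labels, and the example periods are all hypothetical, and real retention periods must come from your own legal and regulatory review.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical retention rule for one category of data."""
    category: str
    min_days: int  # minimum retention, e.g. a regulatory floor (assumed value)
    max_days: int  # maximum retention set by the enterprise (assumed value)

    def __post_init__(self):
        # A minimum longer than the maximum can never be satisfied.
        if self.min_days > self.max_days:
            raise ValueError("minimum retention cannot exceed maximum")


def disposition(created: date, today: date, rule: RetentionRule,
                legal_hold: bool = False) -> str:
    """Return what the schedule says should happen to a record today."""
    if legal_hold:
        # A legal hold overrides the schedule: deletion pauses.
        return "hold"
    age_days = (today - created).days
    if age_days < rule.min_days:
        return "must_retain"   # still inside the required minimum
    if age_days < rule.max_days:
        return "may_delete"    # minimum met, maximum not yet reached
    return "delete_due"        # past the maximum, disposal is overdue


# Example with made-up periods: keep at least 1 year, at most 2 years.
rule = RetentionRule("application materials", min_days=365, max_days=730)
print(disposition(date(2024, 1, 1), date(2024, 6, 1), rule))
print(disposition(date(2024, 1, 1), date(2026, 6, 1), rule, legal_hold=True))
```

The point of the sketch is the ordering of the checks: the legal hold is evaluated first, so no schedule-driven deletion can occur while a hold is in place.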

One of the most practical concepts in managing retention is identifying the document of record, sometimes called the system of record. For each category of information, there should be a single authoritative source: the official system or repository that governs how long that data is retained. Everything else is a convenience copy.

Consider a hiring process. Human Resources may serve as the system of record for resumes and application materials, with a defined retention period based on legal requirements. That is appropriate. But if hiring managers download copies of those resumes, store them locally, or retain notes indefinitely, those copies become part of the organization’s data footprint. In the event of litigation or a subpoena, convenience copies can complicate matters and expand exposure beyond what was required. Good retention practice means limiting unnecessary duplication and ensuring that the official source, and only the official source, is what the organization retains under its policy.

Retention also intersects directly with cybersecurity risk. It is common to see sensitive information handled appropriately in a centralized system and then exported into spreadsheets, emailed to colleagues, or stored on individual workstations. Even if the data is no longer actively used, its mere presence creates risk. I have encountered situations where individuals insisted they had not handled Social Security numbers in years, yet those records remained on their machines. Whether the files are old or current is irrelevant. If they still exist, the organization is responsible for protecting them.
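One way to find forgotten copies like these is to scan exported files for values that look like Social Security numbers. The sketch below is a simplified illustration, not a data loss prevention tool: the function names, the file extensions, and the pattern (three digits, two digits, four digits, separated by hyphens) are assumptions chosen for the example. The pattern will miss unformatted numbers and will flag some values that are not SSNs.

```python
import re
from pathlib import Path

# Simplified, assumed pattern: NNN-NN-NNNN. Real detection needs more care.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def count_ssn_like(text: str) -> int:
    """Count substrings that match the simplified SSN-like pattern."""
    return len(SSN_LIKE.findall(text))


def scan_folder(root: Path, suffixes=(".txt", ".csv")) -> dict:
    """Map each matching text file under root to its count of SSN-like values."""
    hits = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Ignore undecodable bytes so one odd file does not stop the scan.
            text = path.read_text(errors="ignore")
            found = count_ssn_like(text)
            if found:
                hits[path] = found
    return hits
```

A call such as `scan_folder(Path.home() / "Downloads")` would report candidate files for review. Anything it finds still needs a person to decide whether the file is a convenience copy that should be removed.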

Even large institutions struggle with this control. Detailed retention schedules may exist, sometimes spanning dozens of pages and covering extensive classifications of data types. But having a policy is not the same as operationalizing it. Without clear ownership and consistent enforcement, records linger well beyond their intended lifespan. Decentralized organizations often face additional challenges, as business units manage their own repositories without centralized oversight.

At a practical level, a retention program requires more than documentation. It involves identifying categories of data, understanding applicable legal and regulatory minimums, defining appropriate retention periods, and establishing processes for secure disposal. It also requires clarity about what constitutes the official record and discipline in eliminating unnecessary copies. Retention and backup should not be confused. Backups are designed for recovery. Retention defines how long data should exist in the first place.

If you have already developed a data inventory, you have the foundation to begin addressing retention. For each category of sensitive data, determine whether there is a legal obligation to retain it, how long it should be kept, where the official source resides, and whether uncontrolled copies exist elsewhere. Just as importantly, define what triggers deletion when the retention period expires.
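Those questions can be captured as extra fields on each data inventory entry. The following Python sketch shows one possible shape. The class `RetentionEntry`, its field names, and the helper `open_questions` are hypothetical, and the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RetentionEntry:
    """Hypothetical retention fields for one data inventory category."""
    category: str
    legal_obligation: Optional[str] = None  # None means not yet researched
    retention_period: Optional[str] = None
    official_source: Optional[str] = None
    copies_reviewed: bool = False
    deletion_trigger: Optional[str] = None


def open_questions(entry: RetentionEntry) -> List[str]:
    """List the retention questions this entry has not answered yet."""
    questions = []
    if entry.legal_obligation is None:
        questions.append("Is there a legal obligation to retain it?")
    if entry.retention_period is None:
        questions.append("How long should it be kept?")
    if entry.official_source is None:
        questions.append("Where does the official source reside?")
    if not entry.copies_reviewed:
        questions.append("Do uncontrolled copies exist elsewhere?")
    if entry.deletion_trigger is None:
        questions.append("What triggers deletion when the period expires?")
    return questions


# Invented example: an entry that is only partly documented.
entry = RetentionEntry(
    category="application materials",
    official_source="HR applicant tracking system",
)
for question in open_questions(entry):
    print(question)
```

An entry with no open questions is one where retention has actually been decided rather than assumed.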

Deletion, when done intentionally and consistently, is not reckless. It reduces exposure, limits breach impact, and puts you in a stronger position if something goes wrong. Keeping everything forever may feel safe, but in practice, it increases both legal and cybersecurity risk.

Data retention is not about preserving as much information as possible. It is about keeping what you need, for as long as you need it, and no longer.
