1. Overview
Data loss prevention (DLP) software, also called data leak prevention software, detects potential data leaks and exfiltration transmissions and prevents them by monitoring, detecting, and blocking sensitive data while in use (endpoint operations), in motion (network traffic), and at rest (data storage).
The terms “data loss” and “data leak” are related and often used interchangeably. A data-loss incident becomes a data-leak incident when media containing sensitive information is lost and subsequently acquired by an unauthorized party. However, a data leak can also occur without the data being lost on the originating side, for example when it is copied rather than removed. Related terms include ILDP (Information Leak Detection & Prevention), ILP (Information Leak Prevention), CMF (Content Monitoring & Filtering), IPC (Information Protection & Control), and EPS (Extrusion Prevention System, named by contrast with an intrusion prevention system).
2. Measures
Techniques for handling data-leak incidents fall into four categories: standard security measures, advanced/intelligent security measures, access control & encryption, and designated DLP systems. Only the last category is generally thought of as DLP today.
A common DLP approach is automated detection & response, which discovers malicious or unwanted activity and responds to it automatically. Most DLP systems rely on predefined rules to identify and classify sensitive information, which in turn helps administrators pinpoint weak spots and deploy additional safeguards there.
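As a rough illustration of rule-based detection and response, the sketch below matches text against a small set of predefined rules and returns the strictest action that any matching rule asks for. The rule names, patterns, and action labels are assumptions made for this example and are not taken from any particular DLP product.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str            # hypothetical rule label
    pattern: re.Pattern  # what the rule looks for
    action: str          # "log", "alert", or "block"


# Illustrative rules only; real policies are organization-specific.
RULES = [
    Rule("us_ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "block"),
    Rule("confidential_marker", re.compile(r"\bconfidential\b", re.IGNORECASE), "alert"),
]

SEVERITY = {"log": 0, "alert": 1, "block": 2}


def evaluate(text: str) -> tuple[str, list[str]]:
    """Return the strictest triggered action and the names of all matching rules."""
    matched = [rule for rule in RULES if rule.pattern.search(text)]
    if not matched:
        return "allow", []
    strictest = max(matched, key=lambda rule: SEVERITY[rule.action])
    return strictest.action, [rule.name for rule in matched]
```

Keeping the response inside the rule lets the same matching logic drive different reactions, which is one simple way to separate classification from enforcement.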
2.1 Standard Measures
Standard security measures—firewalls, intrusion-detection systems (IDS), and antivirus software—protect against external and internal attacks.
- Firewalls block external access to internal networks.
- IDS detects intrusion attempts.
- Antivirus scans detect Trojan horses that send confidential data out of the network.
- Thin-client architectures (no sensitive data stored on endpoints) mitigate insider threats.
2.2 Advanced Measures
Advanced measures use machine-learning and temporal-reasoning algorithms to detect anomalous data access (e.g., to databases or information-retrieval systems) or abnormal email exchanges. Techniques include honeypots for detecting authorized personnel with malicious intent, activity-based verification (e.g., recognition of keystroke dynamics), and user-activity monitoring to flag suspicious access patterns.
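The idea of flagging abnormal access can be sketched with a much simpler statistic than the machine-learning models mentioned above. The example below uses a plain z-score over a user's past daily access counts as a stand-in; the function name, the threshold of 3.0, and the sample numbers are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's access count if it is far above the user's historical mean.

    "Far above" means more than `threshold` sample standard deviations.
    """
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu  # perfectly flat history: any increase stands out
    return (today - mu) / sigma > threshold


# Hypothetical record counts a user read on each of the last seven days.
baseline = [10, 12, 11, 9, 13, 10, 11]
print(is_anomalous(baseline, 12))   # within the usual range
print(is_anomalous(baseline, 200))  # a sudden bulk read
```

A production system would likely model time of day, resource type, and peer-group behavior as well; the z-score only shows the basic shape of a baseline-and-deviation check.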
2.3 Designated DLP Systems
These systems detect and prevent unauthorized attempts to copy or transmit sensitive data, whether intentional or accidental, mainly by users who are authorized to access that data. They classify information as sensitive using exact-data matching, structured-data fingerprinting, statistical methods, rule and regular-expression matching, published dictionaries, concept definitions, keywords, and contextual information (e.g., the origin of the data).
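Of the classification mechanisms listed, exact-data matching is perhaps the easiest to sketch. The example below assumes the protected values (account numbers, email addresses) are known in advance, stores only their SHA-256 hashes, and checks each token of outgoing text against that index. The function names and the token pattern are hypothetical, and commercial products use more elaborate normalization and indexing.

```python
import hashlib
import re

# Tokens may contain word characters plus ".", "@" and "-" (an assumed tokenization).
TOKEN = re.compile(r"[\w.@-]+")


def _digest(value: str) -> str:
    # Lower-case before hashing so "ACCT-1" and "acct-1" compare equal.
    return hashlib.sha256(value.lower().encode("utf-8")).hexdigest()


def build_index(protected_values: list[str]) -> set[str]:
    """Store only hashes of the protected values, not the values themselves."""
    return {_digest(value) for value in protected_values}


def find_exact_matches(text: str, index: set[str]) -> list[str]:
    """Return the tokens in `text` whose hash appears in the index."""
    hits = []
    for raw in TOKEN.findall(text):
        token = raw.strip(".-")  # drop sentence punctuation caught by the pattern
        if token and _digest(token) in index:
            hits.append(token)
    return hits
```

Hashing means the detection component never needs to hold the sensitive values in clear text, at the cost of matching only exact (normalized) tokens.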
3. DLP Types
3.1 Network DLP
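Installed at network egress points near the perimeter, Network DLP analyzes traffic to detect sensitive data sent in violation of security policy. Multiple security control points can report activity to a central management server for analysis. A next-generation firewall (NGFW) or IDS can provide DLP-like functions on the network. A determined attacker can often evade Network DLP by masking data with encryption or compression.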
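The evasion problem is easy to reproduce. In the sketch below, a naive content inspector (a hypothetical `inspect` function using an SSN-like byte pattern) finds the pattern in a plain payload but not in the same payload after zlib compression, because the compressed bytes no longer contain the literal text.

```python
import re
import zlib

# Illustrative SSN-like pattern applied to raw bytes.
SSN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")


def inspect(payload: bytes) -> bool:
    """Return True if the raw payload contains an SSN-like pattern."""
    return SSN.search(payload) is not None


plain = b"employee ssn: 123-45-6789"
compressed = zlib.compress(plain)

print(inspect(plain))                        # pattern visible in the clear
print(inspect(compressed))                   # pattern hidden by compression
print(inspect(zlib.decompress(compressed)))  # visible again once decoded
```

An inspector can recover from well-known encodings by decoding them before matching, but it cannot do the same for encrypted payloads without access to the keys.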
3.2 Endpoint DLP
Runs on user workstations or servers, controlling both internal and external communications. Advantages:
- Monitors access to physical devices (e.g., USB drives).
- Intercepts data before encryption.
- Provides application-level blocking and real-time user feedback.
Disadvantages: requires an agent on every workstation, and cannot be used on mobile devices or where an agent cannot practically be installed (e.g., a workstation in an internet café).
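How an endpoint agent might combine device control, blocking, and user feedback can be sketched as a single policy decision. Everything in the example below is an assumption for illustration: the `Event` fields, the action names, and the policy set. It also presumes that an earlier step has already classified the content as sensitive or not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user: str
    action: str      # e.g. "copy_to_usb", "print", "upload"
    sensitive: bool  # result of an earlier content-classification step


# Hypothetical policy: actions that are blocked when the content is sensitive.
BLOCKED_WHEN_SENSITIVE = {"copy_to_usb", "upload"}


def decide(event: Event) -> tuple[bool, str]:
    """Return (allowed, message to show the user)."""
    if event.sensitive and event.action in BLOCKED_WHEN_SENSITIVE:
        return False, f"{event.action} blocked: content is classified as sensitive."
    return True, ""


allowed, message = decide(Event("alice", "copy_to_usb", sensitive=True))
print(allowed, message)
```

Returning a message alongside the decision is what allows the agent to explain a block to the user at the moment it happens.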
3.3 Cloud DLP
As organizations adopt cloud-native collaboration tools, Cloud DLP monitors and audits data stored in the cloud and enforces access and usage policies on it, with the aim of end-to-end visibility into external attacks, accidental leaks, and insider threats.
3.4 Data Identification
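DLP includes techniques for identifying confidential or sensitive information. Data identification is sometimes confused with discovery, but it is the process by which an organization uses DLP technology to determine what to look for. Data is classified as structured or unstructured: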
- Structured data resides in fixed fields (e.g., spreadsheets).
- Unstructured data is free-form text or media in text documents, PDF files, and video; an estimated 80% of all data is unstructured.
3.5 Data-Loss Protection
Sometimes data distributors inadvertently expose sensitive data to third parties or unauthorized use. When such data later appears in unauthorized locations (e.g., on a network share or a laptop), the distributor must trace the leak’s origin.
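One possible way to make such tracing feasible, not described in the text above, is to give every recipient a copy carrying a recipient-specific marker. The sketch below derives the marker with an HMAC over the document ID and recipient name; the key, the function names, and the visible `ref:` line are all assumptions for illustration.

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"distributor-secret"  # hypothetical key known only to the distributor


def marker_for(recipient: str, doc_id: str) -> str:
    """Derive a short marker that is unique to this document and recipient."""
    msg = f"{doc_id}:{recipient}".encode("utf-8")
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]


def distribute(document: str, doc_id: str, recipient: str) -> str:
    """Append the recipient-specific marker to the copy sent to `recipient`."""
    return f"{document}\nref:{marker_for(recipient, doc_id)}"


def trace(leaked_copy: str, doc_id: str, recipients: list[str]) -> Optional[str]:
    """Return the recipient whose marker appears in the leaked copy, if any."""
    for recipient in recipients:
        if marker_for(recipient, doc_id) in leaked_copy:
            return recipient
    return None
```

A visible marker like this is trivial to strip, so it only illustrates the bookkeeping; practical schemes would need to hide the marker in the content itself.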
3.6 Data at Rest
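Data at rest is information that is not moving, such as data held in a database or on a file share. The longer data sits unused in storage, the more likely it is to be retrieved by unauthorized individuals. Protection methods include access control, encryption, and retention policies.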
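A retention policy needs some way to find data that has gone unused. As a minimal sketch, the hypothetical `stale_files` helper below lists files whose last-access timestamp is older than a chosen number of days. It assumes the filesystem records access times, which is not always true (for example on volumes mounted with `noatime`).

```python
import time
from pathlib import Path
from typing import Optional


def stale_files(root: str, max_idle_days: float, now: Optional[float] = None) -> list[Path]:
    """List files under `root` whose last-access time is older than `max_idle_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_atime < cutoff
    )
```

The result could feed a review queue for archiving, encryption, or deletion; which of those applies is a policy decision outside this sketch.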
3.7 Data in Use
Data in use is data that a user is currently interacting with. DLP systems monitor and flag unauthorized actions such as screen capture, copy/paste, printing, faxing, and transmission over communication channels.
3.8 Data in Motion
Data transmitted across networks (internal or external). DLP systems monitor sensitive data traversing email, web, FTP, or other channels.
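For the email channel, monitoring data in motion might look like the sketch below, which uses Python's standard `email` package to hold a message when it carries an SSN-like pattern to a recipient outside the organization. The domain `example.com`, the pattern, and the `should_hold` function are assumptions for illustration, and only the plain-text body is checked.

```python
import re
from email.message import EmailMessage
from email.utils import getaddresses

SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative SSN-like pattern
INTERNAL_DOMAIN = "example.com"                   # hypothetical organization domain


def should_hold(msg: EmailMessage) -> bool:
    """Hold a message that carries a sensitive pattern to an external recipient."""
    addresses = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    external = any(
        not addr.lower().endswith("@" + INTERNAL_DOMAIN) for addr in addresses
    )
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    return external and SENSITIVE.search(text) is not None


outgoing = EmailMessage()
outgoing["To"] = "partner@other.org"
outgoing.set_content("Employee SSN: 123-45-6789")
print(should_hold(outgoing))
```

A fuller check would also cover Cc and Bcc recipients, HTML parts, and attachments, which this sketch leaves out.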