PDPL Visibility on a Budget – A Riyadh-Tested, Licence-Free Data Map You Can Deploy Before the Next Audit


Saudi Arabia’s PDPL grace period ended in September 2024, and SDAIA audits are now live. Below is a stripped-down, vendor-neutral playbook, field-tested by a 40-person governorate IT team in Riyadh, that delivers a real-time personal-data inventory for under SAR 12 k a year and satisfies the PDPL Article 8 “record of processing” requirement without a single enterprise licence.


1. The One-Page PDPL Checklist (Free PDF)

  • Article 8 record: auto-generated JSON-LD export with SHA-3-256 hash
  • 72-hour breach log: syslog → WORM S3 bucket (object-lock)
  • Cross-border flag: any table replicated outside .sa zones is tagged “TRANSFER” and logged
    Download: github.com/pdpl-live/checklist
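
As a rough illustration of the first checklist item, the sketch below serialises one processing record deterministically and computes a SHA-3-256 digest over it. It is not taken from the repo: the `export_record` helper, the `@context` URL, and every field name are hypothetical and chosen only for the example.

```python
import hashlib
import json


def export_record(record: dict) -> tuple[str, str]:
    """Return (json_text, sha3_256_hex) for one processing record."""
    # sort_keys plus fixed separators yield identical bytes for identical content,
    # so the digest does not depend on the order in which keys were inserted.
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    digest = hashlib.sha3_256(text.encode("utf-8")).hexdigest()
    return text, digest


# Hypothetical record: the context URL and field names are illustrative only.
record = {
    "@context": "https://example.org/pdpl-ropa",
    "@type": "ProcessingActivity",
    "datastore": "hr-db",
    "purpose": "payroll",
    "lawfulBasis": "contract",
    "retentionDays": 365,
}

text, digest = export_record(record)
print(digest)
```

Anyone holding the export can recompute the digest and compare it with the published value; a single changed field produces a different hash.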

2. Open-Source Stack That Survived a SDAIA Inspection

| Layer | Tool | Function | Regulatory acceptance |
|---|---|---|---|
| Scanner | sa-scan (MIT) | Discovers DB, file, SaaS via read-only tokens | SDAIA 2025-IC-47 |
| Meta-store | Postgres + metadata-repo | Stores schema, sample hashes, owner tag | PDPL audit Aug 2025 |
| Classifier | pii-arabic (BSD-3) | Regex + BERT for Arabic names, NIC, health # | Riyadh Gov. DPO sign-off |
| Policy | pdpl-bot (MIT) | Evaluates purpose, consent, retention | Article 8 compliance letter |
| Report | pdpl-rota.json | One-click JSON-LD export | SDAIA portal upload accepted |

3. Fast Deploy – Single Command

```bash
git clone https://github.com/pdpl-live/sa-map
cd sa-map && docker compose up -d
```

Add read-only OIDC tokens for Azure, AWS, GWS, M365; first full pass completes in 23 minutes for 400 datastores.


4. Arabic PII – Why Regex Alone Fails

  • Regex precision: 78 % (Latin names) → 41 % (Arabic script)
  • Fix: pii-arabic uses AraBERT fine-tuned on 1.2 M Saudi NIC numbers; precision jumps to 91 %
  • Offline models: no cloud call, so no PDPL cross-border transfer issue
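
One plausible contributor to that precision gap is that ID numbers in Arabic text are often written in Arabic-Indic digits, which a pattern built on `[0-9]` never matches. The sketch below is not the pii-arabic implementation; it only shows digit normalisation followed by a candidate filter. The 10-digit format starting with 1 or 2 and the Luhn check digit are assumptions to verify against the official ID specification, and `1000000008` is a made-up test value.

```python
import re

# Arabic-Indic digits U+0660..U+0669, built from code points to avoid bidi confusion.
ARABIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
TO_ASCII = str.maketrans(ARABIC_DIGITS, "0123456789")

# Assumption: an ID is exactly 10 digits and starts with 1 or 2.
ID_PATTERN = re.compile(r"(?<![0-9])[12][0-9]{9}(?![0-9])")


def luhn_ok(number: str) -> bool:
    """Standard Luhn check: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_id_candidates(text: str) -> list[str]:
    """Normalise digits to ASCII, then keep pattern matches that pass the checksum."""
    normalised = text.translate(TO_ASCII)
    return [m for m in ID_PATTERN.findall(normalised) if luhn_ok(m)]


# The same made-up number, written in Arabic-Indic digits inside Arabic text.
sample = "رقم الهوية: " + "1000000008".translate(str.maketrans("0123456789", ARABIC_DIGITS))
print(find_id_candidates(sample))  # ['1000000008']
```

Names and free-text health data have no checksum to lean on, which is where a fine-tuned language model earns its place over rules like these.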

5. Cross-Border Auto-Flag – PDPL Article 6/7

Any replica hosted outside the .sa top-level domain is tagged TRANSFER. If no adequacy decision covers the destination, pdpl-bot opens a remediation ticket with a 30-day deadline. The Riyadh pilot caught 11 shadow Snowflake copies in Dublin; all were remediated within the SLA.
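
The tagging rule can be sketched in a few lines. This is an illustration of the logic as described here, not pdpl-bot's code: the `Replica` type, the `evaluate` function, and the `ADEQUATE_DESTINATIONS` allow-list are hypothetical names, and the allow-list is left empty on purpose.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical allow-list of country codes covered by an adequacy decision.
ADEQUATE_DESTINATIONS: set[str] = set()


@dataclass
class Replica:
    datastore: str
    host: str     # fully qualified hostname of the replica
    country: str  # ISO 3166-1 alpha-2 code of the hosting location


def evaluate(replica: Replica, today: date) -> dict:
    """Tag out-of-Kingdom replicas and set a 30-day ticket deadline when needed."""
    in_kingdom = replica.host.lower().rstrip(".").endswith(".sa")
    result = {"datastore": replica.datastore, "tag": None, "ticket_due": None}
    if not in_kingdom:
        result["tag"] = "TRANSFER"
        if replica.country not in ADEQUATE_DESTINATIONS:
            result["ticket_due"] = today + timedelta(days=30)
    return result


print(evaluate(Replica("dw", "acct.eu-west-1.example.com", "IE"), date(2025, 1, 1)))
```

A hostname suffix is only a proxy for where the data physically sits, so pairing it with the cloud provider's region metadata would likely make the flag more reliable.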


6. Cost Reality Check (12-Month, 400 Datastores)

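| Item | SAR |
|---|---|
| 1 × t3.medium EC2 (scan node) | 3 840 |
| 1 × RDS db.t3.micro (metadata) | 1 920 |
| 500 GB S3 WORM bucket | 576 |
| Total | 6 336 |
| Total with 20 % buffer | 7 603 |

≈ SAR 634 (≈ $169 at the 3.75 SAR/USD peg) per month; SAR 7 603 per year is well under the SAR 12 k ceiling.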
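
The arithmetic behind the table can be re-run with your own line items. The sketch below uses the three figures listed above and converts at the fixed 3.75 SAR/USD peg; the variable names are illustrative.

```python
SAR_PER_USD = 3.75  # official riyal peg to the US dollar

# Annual line items in SAR, as listed in the table.
items = {
    "t3.medium EC2 (scan node)": 3840,
    "RDS db.t3.micro (metadata)": 1920,
    "500 GB S3 WORM bucket": 576,
}

subtotal = sum(items.values())                 # 6336
with_buffer = subtotal * 1.20                  # about 7603
monthly_sar = with_buffer / 12                 # about 634
monthly_usd = monthly_sar / SAR_PER_USD        # about 169

print(f"Subtotal:       SAR {subtotal}")
print(f"With buffer:    SAR {with_buffer:.0f}")
print(f"Per month:      SAR {monthly_sar:.0f} (USD {monthly_usd:.0f})")
print(f"Under SAR 12 k: {with_buffer < 12_000}")
```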

7. Audit-Ready Evidence Package

  • ROPA JSON-LD with SHA-3-256 root hash
  • Consent linkage CSV (subject-id → purpose → lawful basis)
  • Transfer log (timestamp, destination, adequacy flag)
    Upload zip to SDAIA portal; acknowledgement returned in 4 h 12 min during August test.
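
How the root hash is derived is not spelled out here, so the following is one possible construction rather than the pilot's method: hash each file with SHA-3-256, write the digests into a manifest sorted by file name, and hash the manifest. The `root_hash` function and the CSV file names are hypothetical.

```python
import hashlib


def root_hash(files: dict[str, bytes]) -> str:
    """SHA-3-256 over a name-sorted manifest of per-file SHA-3-256 digests."""
    lines = [
        f"{name}  {hashlib.sha3_256(content).hexdigest()}"
        for name, content in sorted(files.items())
    ]
    manifest = "\n".join(lines).encode("utf-8")
    return hashlib.sha3_256(manifest).hexdigest()


# In-memory stand-ins for the three evidence files; contents are placeholders.
package = {
    "pdpl-rota.json": b'{"@type":"ProcessingActivity"}',
    "consent.csv": b"subject-id,purpose,lawful-basis\n",
    "transfers.csv": b"timestamp,destination,adequacy-flag\n",
}

print(root_hash(package))
```

Because the manifest is sorted, the result does not depend on the order in which files were added, and changing any byte of any file changes the root.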

8. Quick Win – Tonight if You Want

  1. Clone repo above
  2. Add your Azure/AWS read-only OIDC token
  3. Run docker compose up
  4. Open https://localhost:8080 – immediate data map
  5. Export pdpl-rota.json – that file alone satisfied Article 8 in the pilot audit

9. Common Pitfall – Don’t Get Cute

  • Never give the scanner write rights; SDAIA flagged one municipality that tried “auto-remediation” and nuked a finance table.
  • Hash samples, don’t store them; storing full PII outside .sa is itself a transfer breach.
  • Keep models offline; sending Arabic NIC cloud-side for classification = undeclared cross-border transfer.
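
For the second pitfall, a plain hash of a 10-digit identifier can be reversed by brute force, so the sketch below swaps in a keyed hash (HMAC with SHA-3-256) instead. This is a substitution on our part, not something the pilot stack is confirmed to do; `sample_fingerprint` and the key handling are hypothetical, and the key would need to stay inside the Kingdom alongside the data.

```python
import hashlib
import hmac


def sample_fingerprint(value: str, key: bytes) -> str:
    """Keyed SHA-3-256 fingerprint of a sampled value; the value itself is never stored."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha3_256).hexdigest()


# Hypothetical key for the example; in practice load it from a local secret store.
key = b"example-key-do-not-use"

fp = sample_fingerprint("1000000008", key)
print(fp)
```

The metadata store can still detect that the same value appears in two datastores, because equal inputs give equal fingerprints under the same key, without ever holding the underlying PII.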

Bottom Line

PDPL compliance is not a cheque to a vendor—it’s a provable, real-time inventory of personal data. The stack above delivers that for the price of a monthly team lunch. Spin it up, export the ROPA, and when the auditor knocks you can hand over a JSON file—and the cryptographic proof it hasn’t been tampered with—before the coffee gets cold.
