Background
The 2024 E-Discovery Day survey closed on 3 December with 1 842 valid responses from 27 countries. The dataset, stripped of personal identifiers and released under CC-BY 4.0, provides the first longitudinal view of practitioner sentiment since 2019. This article summarises key findings, links to reproducible notebooks, and announces the 2025 open-mobile-images corpus designed to benchmark next-generation collection tools.
Methodology
Questions were delivered through an open-source LimeSurvey instance hosted on a volume encrypted with a FIPS 140-3 validated module. No third-party cookies or tracking pixels were used; IP addresses were discarded after geolocation to NUTS-2 level. Only respondents who completed more than 85 % of items were retained. Statistical analysis was performed in Python with pandas 2.2 and SciPy 1.13; visualisations were produced with matplotlib 3.9 and exported as SVG for accessibility.
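The retention rule can be sketched in pandas as follows. This is a minimal illustration rather than the organisers' actual notebook code: the function name, the item column names (q1 to q10) and the toy data are all hypothetical, and an unanswered item is assumed to be stored as a missing value.

```python
import pandas as pd

# Retain respondents who answered more than 85 % of items (threshold from the methodology).
COMPLETION_THRESHOLD = 0.85


def retain_complete_responses(responses: pd.DataFrame, item_columns: list[str]) -> pd.DataFrame:
    """Keep rows whose share of answered items exceeds the threshold.

    Assumption: an unanswered item is stored as NaN/None.
    """
    # notna() flags answered items; the row-wise mean is the completion share.
    completion = responses[item_columns].notna().mean(axis=1)
    return responses.loc[completion > COMPLETION_THRESHOLD].copy()


# Toy data with ten hypothetical items: r1 answers 10/10, r2 9/10, r3 8/10.
items = [f"q{i}" for i in range(1, 11)]
toy = pd.DataFrame(
    [[3] * 10, [4] * 9 + [None], [5] * 8 + [None, None]],
    columns=items,
    index=["r1", "r2", "r3"],
)
retained = retain_complete_responses(toy, items)
print(list(retained.index))  # r3 (80 % complete) is dropped
```

The comparison is strict, so a respondent at exactly 85 % would be excluded, which matches the "more than 85 %" wording.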
Key Quantitative Results
- Dwell time of ESI in unmanaged repositories: median 9.4 days (↓ 3 days vs 2022)
- Firms using transformer-based privilege detection: 61 % (↑ from 18 % in 2022)
- Error-adjusted cost per gigabyte reviewed: USD 146 (↓ 41 %)
- Practitioner-reported job satisfaction (Likert 1–5): mean 4.1 (↑ 0.3)
- Share of collection budget allocated to mobile devices: 28 % (↑ 9 %)
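As a rough cross-check, the earlier-wave values implied by some of these figures can be back-calculated. The sketch below assumes that the 41 % cost reduction is relative to the earlier value and that the cost and satisfaction changes are measured against the same 2022 wave as the dwell-time and adoption figures; the summary does not state either point, and the variable names are illustrative only.

```python
# 2024 values and changes as reported in the list above.
cost_2024_usd = 146.0
cost_relative_drop = 0.41        # assumed to be a relative (not absolute) reduction
dwell_2024_days = 9.4
dwell_drop_days = 3.0
satisfaction_2024 = 4.1
satisfaction_gain = 0.3
adoption_2024_pct = 61
adoption_2022_pct = 18

# Implied earlier-wave values under the assumptions stated in the lead-in.
cost_baseline_usd = cost_2024_usd / (1 - cost_relative_drop)
dwell_baseline_days = dwell_2024_days + dwell_drop_days
satisfaction_baseline = satisfaction_2024 - satisfaction_gain
adoption_change_points = adoption_2024_pct - adoption_2022_pct

print(f"Implied earlier cost per GB: USD {cost_baseline_usd:.2f}")
print(f"Implied earlier median dwell time: {dwell_baseline_days:.1f} days")
print(f"Implied earlier mean satisfaction: {satisfaction_baseline:.1f}")
print(f"Adoption change: {adoption_change_points} percentage points")
```

These are derived figures, not survey results; the released CSV remains the authoritative source for the earlier values.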
Emerging Themes
Deep-fake Authenticity
Seventy-four percent of respondents had encountered audio or video evidence that required authenticity validation, yet only 29 % have a documented workflow for it. A public reference set of 400 synthetic and bona fide media files, each labelled at frame level and signed with CRYSTALS-Dilithium, is now available on IEEE DataPort (DOI: 10.21227/xxx).
Hyper-linked Modern Attachments
Forty-two percent of respondents could not define a consistent legal-hold boundary for links shared in Teams and Slack. A PowerShell cmdlet that expands bit.ly, t.co and SharePoint-generated URLs and exports a Relativity-ready load file has been released under the MIT licence; initial pull requests add support for Notion and Miro.
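The released tool is a PowerShell cmdlet, and its interface is not described here. Purely to illustrate the idea, the Python sketch below expands shortened links by following HTTP redirects and writes a delimited load file. All function and field names are hypothetical, the delimiters are the Concordance-style defaults often used for .dat load files (an assumption about the target format), and SharePoint sharing links would normally need authentication that this sketch does not handle.

```python
import csv
import io
from typing import Callable, Iterable
from urllib.parse import urlsplit
from urllib.request import urlopen

SHORTENER_HOSTS = {"bit.ly", "t.co"}


def needs_expansion(url: str) -> bool:
    """True for known shorteners and for *.sharepoint.com hosts."""
    host = (urlsplit(url).hostname or "").lower()
    return host in SHORTENER_HOSTS or host.endswith(".sharepoint.com")


def resolve_over_http(url: str, timeout: float = 10.0) -> str:
    """Follow redirects and return the final URL (unauthenticated requests only)."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def build_load_file(
    urls: Iterable[str],
    resolver: Callable[[str], str] = resolve_over_http,
) -> str:
    """Return load-file text with one row per link: original and expanded URL."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter="\x14",      # Concordance-style field delimiter (ASCII 20)
        quotechar="\u00fe",    # Concordance-style text qualifier (thorn)
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerow(["OriginalUrl", "ExpandedUrl"])  # hypothetical field names
    for url in urls:
        expanded = resolver(url) if needs_expansion(url) else url
        writer.writerow([url, expanded])
    return buffer.getvalue()
```

Passing the resolver as a parameter keeps the network call replaceable, so the export logic can be exercised offline with a lookup table, for example `build_load_file(links, resolver=known_targets.__getitem__)`.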
Mobile Encryption Hurdles
Average logical acquisition time for iOS 18 devices was reported as 2 h 18 min; for Android 14 devices with File-Based Encryption the figure rose to 4 h 07 min. To accelerate research, the organisers published forensic images of 50 phones (25 iOS, 25 Android) along with ground-truth chat databases. SHA-256 hashes are recorded in the append-only sigstore/rekor transparency log to enable tamper-evident chain-of-custody verification.
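On the consumer side, checking a downloaded image against its logged SHA-256 value could look like the sketch below. It covers only the local hashing and comparison step; retrieving the recorded digest from the transparency log is out of scope, and the function names are illustrative.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large images never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path: Path, recorded_hex: str) -> bool:
    """Compare the computed digest with a recorded hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), recorded_hex.strip().lower())
```

A mismatch means the file differs from what was logged; a match shows only that the bytes are identical, not who produced them.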
AI Adoption Curve
Among firms with more than 50 attorneys, 78 % deploy large-language-model summarisation; only 31 % of boutiques do so. The primary barrier cited is “lack of transparent training data” (54 %), followed by “judge acceptance” (22 %). A draft affidavit template intended to satisfy Fed. R. Evid. 702 and the Daubert standard has been uploaded and will be updated quarterly via pull request.
Community Recognition
The Gayle O’Connor Spirit Award was presented to Maribel Rivera for sustaining a mentorship network that has placed 312 newcomers into e-discovery roles since 2021. Rivera open-sourced the mentor-matching algorithm; the Python package has 97 % unit-test coverage and has been translated into four languages.
Open Data & Reproducibility
The survey CSV, Jupyter notebooks and figure source files are archived on Zenodo under DOI: 10.5281/zenodo.xxxxx. A pre-commit hook ensures that any future update triggers automatic reproducibility checks, including a pandas-profiling (since renamed ydata-profiling) report and CSV-schema validation.
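A CSV-schema check of the kind such a hook might run can be sketched with the standard library alone. The schema below (column names and value rules) is hypothetical and does not reflect the released dataset, and the repository's actual hook may use different tooling.

```python
import csv
import io

# Hypothetical schema: column name -> predicate that accepts a valid cell value.
EXPECTED_SCHEMA = {
    "respondent_id": lambda v: v.isdigit(),
    "country": lambda v: len(v) == 2 and v.isalpha() and v.isupper(),
    "job_satisfaction": lambda v: v in {"1", "2", "3", "4", "5"},
}


def validate_csv(text: str) -> list[str]:
    """Return a list of human-readable schema violations (empty if the CSV is valid)."""
    reader = csv.DictReader(io.StringIO(text))
    expected_header = list(EXPECTED_SCHEMA)
    if reader.fieldnames is None or list(reader.fieldnames) != expected_header:
        return [f"header mismatch: expected {expected_header}, got {reader.fieldnames}"]
    errors = []
    for row in reader:
        for column, is_valid in EXPECTED_SCHEMA.items():
            value = row[column]  # None when the row has too few fields
            if value is None or not is_valid(value):
                errors.append(f"line {reader.line_num}: invalid {column}={value!r}")
    return errors
```

A hook script could call `validate_csv` on the staged file and exit with a non-zero status when the returned list is not empty, which blocks the commit.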
2025 Longitudinal Study Design
The next survey will add questions on post-quantum readiness, carbon footprint per gigabyte reviewed, and neurodiversity hiring practices. The instrument is hosted in a public GitHub repository; issues and pull requests are welcome. Data collection opens on 1 November 2025 and closes on 30 November 2025; results will be published within six weeks of closing under the same open licence.
Conclusion
The survey's metrics indicate that algorithmic review, mobile acquisition and authenticity-validation pipelines are moving from pilot to mainstream. Continuous release of reference data and reproducible code will be critical as synthetic media and double-extortion tactics raise the stakes for the next decade of digital discovery.