Last week, The Daily Beast reported the Jeffrey Epstein criminal trial will have a million pages of evidence, which will include materials seized from several devices.
A million pages of evidence makes for a great headline. It feels overwhelming! However, after reading the article from The Daily Beast, I began to wonder whether a million pages of evidence is a lot or a little. How many files are stored on a standard laptop or cell phone? How will the prosecution and defense identify the files to be admitted into evidence? These questions naturally got me thinking about the digital forensics and eDiscovery issues present in the Epstein sex abuse trial.
Now, if you read the blog post from last week, you’re probably wondering if I’m going to constantly write about sex abuse issues. The answer is no. However, when these topics fill our news and I have the ability to reach out to qualified expert witnesses for insights on issues of public import, I’m going to do so.
As of this writing, the Florida Governor has ordered a state criminal probe into the handling of the 2008 Jeffrey Epstein investigation. The Miami Herald reported this new probe yesterday afternoon. Some credit for Epstein’s current predicament is due to the “Perversion of Justice” exposé series by Miami Herald reporter Julie K. Brown, which detailed the 2008 sex trafficking investigation and settlement. The series is worth a read!
Now, back to the million pages of evidence. I’ve been working with digital forensics and eDiscovery experts for nearly 10 years. Even so, I’m a novice in their areas of expertise; what I can do is spot the issue when an attorney needs a particular type of expert. With that in mind, I posed some foundational questions to one of our members.
Questions & Answers for expert witness C. Matthew Curtin, CISSP:
C. Matthew Curtin, CISSP, founder and CEO of Interhack Corp., is a Certified Information Systems Security Professional. An expert in computers and information technology, Mr. Curtin and his team at Interhack help attorneys and executives use data and computer technology in high-stakes situations.
NR: According to The Daily Beast article, the Epstein trial will have more than 1 million pages of evidence, found on multiple devices. How will the prosecution and defense retrieve all of these documents and collate them into usable evidence?
CMC: One million pages of computer evidence is no big deal. Consider that in a typical computer system you’re looking at anywhere from 100,000 to 500,000 files, including all of the software, operating system, and user data. By the time you get through to the things being used by the prosecution and defense as evidence, the vast majority has been thrown out, but if you’ve got a phone or two, a couple of computers, and a few online services, it’s pretty easy to get into those numbers. Ultimately it depends on how they’re counting, of course: Are these Bates-numbered pages for presentation, or are they the raw input? If these are the results that are turned into exhibits and so on, that’s pretty big but not huge.
NR: What is the process for identifying the usable documents from those that are unrelated to a litigation?
CMC: Finding relevant documents and conducting a forensic examination are two fundamentally different processes. Finding relevant documents is typically a matter of “indexing” (reading the files for their contents) and then making “queries” of the “index” to return the documents and pages that are responsive to the search. Typically an attorney will then look at the responses and make a decision as to whether something is material. It’s basic data processing: data in, data out for a lawyer to use.
In the case of a forensic examination, the raw data will be subjected to various tests and analysis, ultimately resulting in reports that will be submitted as evidence. For a phone, a complete “extraction report” can easily produce a 5,000 page PDF document, and many get much, much larger. In any case, all of these things will wind up going into some kind of expert report that will outline opinions and findings that might be challenged and should be subjected to scrutiny. This is expert data analysis, where the data processing is performed to be consumed by an expert to form a technical opinion or finding.
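The “indexing” and “query” step described in the answer above can be illustrated with a minimal Python sketch. This is only a toy example for readers, not the tooling Mr. Curtin or any eDiscovery platform actually uses. The document IDs, the sample text, and the function names `build_index` and `query` are all made up for illustration, and real review tools add far more (phrase search, date filters, deduplication, and so on).

```python
import re
from collections import defaultdict


def build_index(documents):
    """Read each document once and map every word to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def query(index, terms):
    """Return the sorted IDs of documents that contain every search term."""
    matches = [index.get(term.lower(), set()) for term in terms]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical documents, invented purely for this example.
documents = {
    "DOC-001": "Flight schedule for March",
    "DOC-002": "Invoice for the March flight",
    "DOC-003": "Lunch menu",
}

index = build_index(documents)
print(query(index, ["flight", "march"]))  # prints ['DOC-001', 'DOC-002']
```

The index is built once, and each query after that is a quick lookup rather than a fresh read of every file. An attorney would then review the returned documents and decide which are material.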
NR: How much time would it take a forensics expert to comb through multiple devices to determine which documents are appropriate for discovery and evidentiary purposes?
CMC: Methodology and the size of the source matter for how long it takes. Generally speaking, I tell people to figure that to run through a forensic image of a raw computer hard drive and prepare it for human review, you’re looking at three days if you want to recover deleted files, compute the mathematical “hash” values that allow us to distinguish among files, and so on. A human will then need to go through the results and that can take anywhere from another day to another week or more, depending on what’s found, and how much work needs to be done without automated tools to manage the process. In some cases, no one cares about deleted files. In other cases, they’re critical. The only rule of thumb that applies generally is that the time it takes to do the job is between two and eight times what a lawyer thinks it should take.
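For readers curious about the “hash” values mentioned above, here is a minimal Python sketch of the idea. It assumes SHA-256 as the hash function (the interview does not say which algorithm is used) and uses hypothetical helper names, `sha256_of_file` and `group_by_hash`. It is an illustration of the concept, not a forensic tool: files with identical contents produce the same hash, so an examiner can tell copies apart from genuinely different files without opening each one.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hash of a file, reading it in chunks to handle large files."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_hash(root):
    """Group the names of all files under root by the hash of their contents."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample files: two identical copies and one different file.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "a.txt").write_bytes(b"same contents")
        Path(folder, "b.txt").write_bytes(b"same contents")
        Path(folder, "c.txt").write_bytes(b"different contents")
        for file_hash, names in group_by_hash(folder).items():
            print(file_hash[:12], names)
```

Here the two identical files land in one group and the third file in another. Computing these values for every file on a drive is part of why the automated preparation takes days before a human reviewer sees anything.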
NR: Is a million documents a lot of digital documents for a trial? Or is that common when dealing with digital files?
CMC: I addressed this a bit in my first answer, but one million computer files isn’t a big deal.
NR: I’m sure many of my questions are rudimentary, so please feel free to provide any additional information you think the public should know about digital forensics and e-discovery in this type of matter.
CMC: Something to add: when conducting forensic examination, we often see a law-enforcement view put forth: Suspect that X happened, so go search for evidence of X. Fail to find X, and you add “tampering” to the list of charges. The reality is, though, that it isn’t sound scientific process to go in search of confirmation of what you think is already happening. Various cognitive biases interplay to create serious problems with the results extracted this way. Far better to construct tests to look for the “null hypotheses,” the things that would disprove what you think is happening. At the very least, alternate theories of the case deserve exploration and there are plenty of cases that would not take the time and money put into them if they were given greater scrutiny.
For example, if someone is suspected of having illegal pornography on a computer—that is, possessing the material, knowing the character of its content—law enforcement will typically reconstruct deleted files, look at thumbnail image databases, and loose files found in caches and elsewhere on the disk managed by the computer operating system rather than the user directly. If they find material that looks like what they thought was there, in many places a prosecutor will go forward with charges. On the other hand, what if someone did get the files and not mean to have them? What other course would there be but to delete the material? If the material has been deleted, why would it be brought up in a prosecution? There are cases where it can be relevant to a legitimate legal question but we’re only in the last few years starting to see some sophistication in consuming these results and moving forward sensibly with discretion informed by understanding.
A huge thanks to C. Matthew Curtin for taking time to provide us with these excellent answers. Please check out his company at http://web.interhack.com/.