Don’t Drown in Data! Filter Your Documents for Relevancy
In the mid-1990s, a case that produced over 20 million documents was considered huge. Nowadays, cases routinely produce over 20 million e-mail messages alone! Electronic discovery has dwarfed the document counts of earlier paper-based cases and has become a tactic to drown the opposition in irrelevant data.
The reason is simple. Nearly all business documents are now created and stored electronically. When producing documents for litigation, it is fast and inexpensive simply to copy all the e-mail or correspondence related to a certain department, author, recipient or dates.
Your opposition’s strategy may well be to drown your side in an ocean of data. Fortunately, you have a life raft: document filtering. To take advantage of it, though, you must find a reputable litigation support provider that offers efficient electronic file processing (EFP), beginning with thorough and accurate filtering. Filtering segregates the relevant documents so that only those documents are processed for the case database.
Filtering Strategies
“Filtering” is a relatively simple concept. Electronic data filtering can quickly and inexpensively cull the documents to a workable, relevant population. The most important factor is the professional litigation support provider’s ability to work with the client law firm or corporate legal department to create a filtering strategy for the case. The litigation support provider must work with your litigation team to design key terms and values to filter the documents to a progressively finer level. Filtering tiers can migrate from very general to very specific:
Tier One: Apply “gross” filtering to the raw data on the CDs to identify and eliminate batches or types of documents that are not applicable to the case.
Tier Two: Search for key terms to sort the remaining documents further by relevancy.
Tier Three: Create groups of document sets within the relevant population. For example, a search for specific law firms and attorney names can segregate privileged documents.
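The three tiers above can be sketched in a few lines of Python. This is a minimal illustration only, not any provider's actual process: the `Document` class, the excluded file extensions, the key terms and the law firm name are all hypothetical placeholders that a real litigation team would define for its own case.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration; a real case would set these
# with the litigation team.
EXCLUDED_EXTENSIONS = (".exe", ".dll", ".tmp")
KEY_TERMS = ("merger", "contract")
PRIVILEGE_NAMES = ("smith & jones llp",)


@dataclass
class Document:
    name: str
    text: str
    groups: set = field(default_factory=set)


def tier_one(docs):
    """Gross filtering: drop file types that are not applicable to the case."""
    return [d for d in docs if not d.name.lower().endswith(EXCLUDED_EXTENSIONS)]


def tier_two(docs):
    """Keep only documents that mention at least one key term."""
    return [d for d in docs if any(term in d.text.lower() for term in KEY_TERMS)]


def tier_three(docs):
    """Tag documents naming a law firm or attorney as a 'privileged' group."""
    for d in docs:
        if any(name in d.text.lower() for name in PRIVILEGE_NAMES):
            d.groups.add("privileged")
    return docs


def filter_documents(docs):
    """Run the tiers in order, from most general to most specific."""
    return tier_three(tier_two(tier_one(docs)))
```

Simple substring matching is used here only to keep the sketch short; an actual filtering strategy would likely need more careful term matching.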
After the raw data is reduced to the relevant document population, the litigation support provider can convert the documents to the electronic document repository and code them. Then the client can run regular database searches to prepare for depositions and trial.
The Planning Meeting
The first step a litigation support provider should take is to set up a design meeting with the client to discuss the variables of the case, establish criteria for filtering and indexing (bibliographic coding and/or subjective coding) and create coding specs. Because there is a lot of new territory to cover, the litigation support provider should be experienced and knowledgeable in document strategies, ask probing questions and guide the law firm or legal department through the filtering process. Each litigation team organizes its information differently. In some instances, a corporation that is a party to a suit may already have a document retention policy, and its legal department may perform the first culling pass by going directly to the corporate network.
Predicting the Case Population
There is no foolproof way to predict the size of the relevant population within the gross total, although it is not uncommon for the relevant population to be less than 10 percent of the total. The litigation support provider may be able to give a rough estimate based on the “types” of documents and media that are produced. For example, if backup tapes have been produced, these are generally duplicates and can likely be eliminated up front.
Creating a Directory
Once the litigation support provider begins receiving the electronic documents, it can begin creating an electronic index that records incoming documents by type, as identified by file extension. Using this index, the provider can narrow the population by identifying “junk” in the files. The litigation support provider should give the client a copy of this index and update it with each shipment of documents.
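As a rough illustration of such an index, the Python sketch below counts incoming files by extension and lists the ones that look like “junk.” The function names and the set of junk extensions are assumptions made for this example; which file types count as junk depends on the case.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical list of extensions treated as junk for this example.
JUNK_EXTENSIONS = {".exe", ".dll", ".tmp"}


def build_index(filenames):
    """Count files by lowercased extension; files without one go under '(none)'."""
    return Counter(PurePath(name).suffix.lower() or "(none)" for name in filenames)


def junk_files(filenames):
    """Return the files whose extension is on the junk list."""
    return [n for n in filenames if PurePath(n).suffix.lower() in JUNK_EXTENSIONS]


# Updating the running index when a second shipment arrives.
index = build_index(["a.doc", "B.DOC", "mail.pst", "setup.exe", "README"])
index.update(build_index(["c.doc", "old.tmp"]))
```

Because a `Counter` adds counts on `update`, the same running index can be refreshed with each shipment and reported back to the client.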
“De-Duping” Documents
Eliminating duplicates is a further step the litigation support provider must take to cull the case population. The relevant documents will contain a number of exact duplicates, which must be eliminated by de-duping before the electronic documents are converted to the database. If the litigation support provider fails to perform this step, the client will pay unnecessary fees for processing the same document more than once. The litigation support provider should have an electronic process in place for finding duplicates.
E-mail produces the most electronic data and the most duplicates. A person may copy an identical e-mail message to many recipients. Attachments can be exact duplicates. However, weeding out duplicates can be tricky, and a document should not be eliminated if it differs in even the smallest way. Such differences may include handwritten notes, electronic annotations, or simply the forwarding of an e-mail message to another person, which gives the electronic file different metadata (statistics such as when it was sent, to whom, by whom and whether it was opened). Professional EFP providers record and process the metadata attached to each file as part of their service.
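One common way to find exact duplicates electronically is to compute a hash over each message's content and metadata and keep only the first message with each hash. The sketch below assumes each message is a dictionary with hypothetical `sender`, `recipients`, `sent` and `body` keys; it illustrates the idea and does not describe any particular provider's process.

```python
import hashlib


def fingerprint(message):
    """Hash the body together with its metadata, so a forwarded copy
    (different sender or send time) does not count as a duplicate."""
    parts = [
        message["sender"],
        ",".join(sorted(message["recipients"])),
        message["sent"],
        message["body"],
    ]
    # Join with a separator character that is unlikely to appear in the fields.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def dedupe(messages):
    """Keep the first occurrence of each distinct fingerprint, in order."""
    seen = set()
    unique = []
    for message in messages:
        fp = fingerprint(message)
        if fp not in seen:
            seen.add(fp)
            unique.append(message)
    return unique
```

Because the metadata is part of the fingerprint, the copy of a message found in each recipient's mailbox collapses to one document, while a forwarded copy is kept as a separate document.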
Don’t Panic!
Remember, when you are facing a tsunami of potential case documents, the actual relevant documents may only be a small percentage of this total. Even if you have 40 million documents that you estimate will be reduced to four million, today’s professional litigation support EFP providers can filter and process the documents quickly and inexpensively using electronic processes. Your major challenge is performing due diligence in selecting the litigation support provider. Be sure to ask them about:
Electronic document filtering services
Strategic planning process in creating the culling and indexing strategy
Electronic de-duping processes
Electronic file conversion processes
Online repository services and/or the ability to load data to an in-house ALS database
Don’t Sink—Swim!
Do your due diligence carefully and pick the litigation support provider that’s right for you, and I confidently predict you will be able to brave the wave of electronic documents.
About our author . . .
E. Jane Warhshuis is one of the nation's most experienced experts in electronic discovery and all aspects of discovery document management. She is the founder (in 1979) and president of Compulit in Grand Rapids, Michigan (www.compulit.com).