Is Predictive Coding a Cure for Out-of-Control Discovery Costs?

Business Litigation Update

Date: November 28, 2012


The prevalence of electronic documents, including emails, has made business and personal communications less expensive and more efficient. But where litigation is concerned, quite the opposite is true. In an earlier era, a big litigation case might have involved a few dozen boxes of paper documents and a small group of associates to eyeball each document; now, large teams of contract lawyers are pulled together to cull through terabytes of electronic data. Consequently, the time and costs involved in discovery have skyrocketed, sometimes completely overwhelming the normal cost-benefit analysis in litigation.

Is there a technological solution? Maybe a partial one, in certain circumstances.

Over the last several years, a number of companies have developed software algorithms they claim can provide a significant shortcut in large document reviews, thereby providing considerable cost savings. These techniques, known as predictive coding or technology-assisted review, have come a long way. Parties on both sides of a case who face substantial document review costs are increasingly considering, often by mutual agreement, whether to try these solutions.

In general terms, predictive coding is a way of using technology to make sorting decisions on large sets of data by extrapolating from human coding decisions made on a subset of that data. Rather than having a large number of less experienced attorneys review every document, a smaller number of more experienced lawyers, those most familiar with the issues in the case or with a set of document requests, review the subset and generate a "seed set" that is coded for relevance, privilege or other criteria. The seed set will include both documents deemed relevant and documents deemed irrelevant. The computer then uses those selections to generate relevance rankings for the larger group of documents. Reliability is honed through an iterative process involving quality assurance by lawyers and refinement of the computer analysis. Many analogize the process to a spam filter, in which the lawyers and the computer interact to achieve a level of certainty as to what is relevant. Some published studies maintain that the results of this approach are more accurate than an entirely human review of the results of keyword or Boolean searches. While the initial setup of this kind of technology-assisted review may be more expensive than simply developing and running a set of search terms, the savings in cases involving millions of documents can be substantial.
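For readers who want to see the seed-set idea in concrete form, the following is a minimal sketch in Python, not a description of any vendor's product. It assumes a very simple word-weighting model (smoothed log-odds, similar in spirit to a basic spam filter); the function names, the sample documents and the scoring method are all illustrative assumptions. Commercial tools use far more sophisticated methods and validation steps.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def train(seed_set):
    """Learn per-word weights from (text, is_relevant) pairs coded by lawyers.

    Positive weights mark words seen more often in relevant documents,
    negative weights mark words seen more often in irrelevant ones.
    """
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in seed_set:
        counts[is_relevant].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing so a word unseen in one class never divides by zero.
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_rel / p_irr)
    return weights

def score(weights, text):
    """Sum the weights of known words; words outside the seed set count as 0."""
    return sum(weights.get(w, 0.0) for w in tokenize(text))

def rank(weights, documents):
    """Order unreviewed documents from most to least likely relevant."""
    return sorted(documents, key=lambda d: score(weights, d), reverse=True)

# Hypothetical seed set: a few documents coded by senior lawyers.
seed_set = [
    ("Pricing agreement with competitor on containerboard", True),
    ("Call to discuss pricing agreement before the bid", True),
    ("Lunch menu for the office party", False),
    ("Parking garage closed for the office holiday", False),
]

# Hypothetical unreviewed documents to be ranked by the model.
unreviewed = [
    "Reminder: office party lunch on Friday",
    "Draft pricing agreement for competitor review",
    "Quarterly parking pass renewal",
]

weights = train(seed_set)
ranked = rank(weights, unreviewed)
for doc in ranked:
    print(f"{score(weights, doc):+.2f}  {doc}")
```

In this sketch, the iterative step described above would correspond to lawyers checking a sample of the ranked documents, adding their coding decisions to `seed_set`, and calling `train` again until the rankings are judged reliable enough.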

Parties are free to agree to this or other shortcuts to extensive human review of the results of keyword searches. Where no agreement is present, courts recently have entered the fray on predictive coding. In the last year, at least four courts have entered orders that address the use of technology-assisted review as a means of identifying responsive documents requested in discovery. Parties in large matters who are faced with significant costs for document review would do well to review these decisions. They are:

In these cases, the parties advocating technology-assisted review laid out for the court the anticipated costs of complying with discovery requests using search terms and human review. They cited studies documenting the accuracy of properly constructed technology-assisted reviews and estimates of the anticipated cost savings of using such approaches. Opponents of the proposed approaches questioned the soundness or applicability of the studies, and objected that not every document is actually reviewed by a person. In addition, since the seed set, which includes both relevant and irrelevant documents, will generally be turned over to the receiving party, the producing party may be reluctant to produce documents that it would not have turned over in a manual review. Finally, given the lack of consensus on the appropriateness of technology-assisted review, concerns about the costs of resolving disputes and engaging in motion practice over the discovery process itself may be a barrier to implementing the process.

In the four cases listed above, the courts in all but the Kleen Products case endorsed the use of technology-assisted review with varying degrees of specificity as to the protocol for the review. In Kleen Products, the court did not order a particular approach and encouraged the parties to work together to come up with an appropriate solution within the parameters of the discovery it ordered. An example of a detailed protocol can be found in the Actos decision.

Technology-assisted review is worth serious consideration in any case that involves the review of at least 100,000 documents, or where the volume of data is likely to take a dozen or more lawyers two weeks or longer to review. Of course, the more document custodians are involved, the more likely it is that technology-assisted review will add efficiency and save costs. Class actions or large-scale investigations may be good candidates for this approach, whereas smaller cases with few custodians and limited data generally are not.
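The rule of thumb above can be written as a short screening check. This is only a sketch of the thresholds stated in this article (100,000 documents, or a dozen lawyers for two weeks); the function name and parameters are hypothetical, and the result is a prompt for further analysis, not a legal standard.

```python
def worth_considering_tar(num_documents, reviewers=0, review_weeks=0.0):
    """Return True if a matter meets either threshold described in the article."""
    large_volume = num_documents >= 100_000
    # "A dozen or more lawyers two weeks or longer" to complete the review.
    heavy_staffing = reviewers >= 12 and review_weeks >= 2
    return large_volume or heavy_staffing

print(worth_considering_tar(250_000))                              # volume threshold met
print(worth_considering_tar(40_000, reviewers=12, review_weeks=3)) # staffing threshold met
print(worth_considering_tar(40_000, reviewers=4, review_weeks=1))  # neither threshold met
```

A check like this ignores factors the article also treats as important, such as the number of custodians and whether a particular custodian's data warrants full human review.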

In any case, technology-assisted review is not a substitute for careful lawyering, sound judgment and good planning. Lawyers must become familiar with the issues in a case and know their client's data well in order to construct a proper protocol and either work with opposing parties to agree on that protocol or prepare to apply for court approval. Particularly because most courts are not likely to have a great deal of familiarity with technology-assisted review, it is important that the process designed be transparent and open so that it appears fair and will be easily understood.

Finally, even in cases where technology-assisted review is beneficial, it may not be appropriate for all categories of documents. For example, where the conduct or knowledge of a particular custodian is at issue, it may be advisable to conduct a full-scale human review of all of his or her data. In the end, technology-assisted review should be viewed as a promising tool in the arsenal of lawyers and clients who are looking to handle large-scale document productions in the most efficient and economical manner.