Saturday, December 20, 2014

Using Analytics to Clean Out the ESI Garage

by Robert D. Brownstone and Gabriela P. Baron

As time passes and we acquire more "stuff," it gets harder to winnow down our possessions. Who longs to spend the weekend cleaning the garage? It's easier to keep piling things up, with no discipline for storage or removal. Thus, many a garage keeps getting fuller and more chaotic until two distinct problems emerge: difficulty finding a particular object and increased risk of hidden hazards.
The same issues pervade the electronic information management (EIM) environment. The overwhelming volume of data generated daily leads to a similar approach to electronically stored information (ESI) for organizations of all shapes and sizes. Especially as companies grow, they are lulled into the sense that it is easier and better to focus on the urgent matters at hand and let the emails, electronic files and database contents keep stacking up.
But there are many risks inherent in saving all ESI forever. Potentially harmful content resides all over the place, whether it's a "smoking gun" message, or something written and kept so long that it becomes susceptible to misinterpretation when taken out of context years later.
Within an organization that has a save-everything policy, there likely are redundant copies of information, which means sourcing and paying for extra storage space. These costs are multiplied by the "rule of three," by which all live data is backed up in at least two places. Moreover, the search for particular information becomes a nearly impossible and expensive chore. Additionally, more personally identifiable information (PII) and sensitive confidential data (e.g., intellectual property and trade secrets) stored at more locations means bigger risks.
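The "rule of three" arithmetic can be sketched in a few lines. This is only an illustration, assuming exactly two backup copies per live gigabyte and assuming that purged data eventually ages out of the backups as well; the function names are hypothetical.

```python
def total_stored_gb(live_gb, backup_copies=2):
    """Live data plus its backups: with two backups, each gigabyte is stored three times."""
    return live_gb * (1 + backup_copies)


def gb_freed_by_purging(redundant_live_gb, backup_copies=2):
    """Storage released once redundant live data and its backup copies are gone.

    Assumption: backups are rotated, so purged data eventually leaves them too.
    """
    return total_stored_gb(redundant_live_gb, backup_copies)
```

Under these assumptions, removing 40 GB of redundant live data releases 120 GB of total storage.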
The ESI garage model is the information governance (IG) strategy upon which organizations have traditionally relied. Even when an IT department is tasked with the responsibility of managing the data, this strategy falls flat. The primary reason: IT focuses on what it does best, maintaining access to data rather than extracting the most value from data.
The notion of IG is vague and no panacea, but organizations need to start somewhere. An IG initiative should entail the use of advanced analytics and intelligent automated assessments of big data sets to cull out irrelevant data, keep relevant data, and identify PII, intellectual property and other sensitive data that must be kept and segmented in order to ensure data security and privacy.
Savvy C-suites are adopting sound IG policies not only to promote efficiency when locating information, but also to facilitate compliance with legal requirements for electronic discovery, data security and privacy. IG can help contend, for example, with thorny international legal issues in cross-border data transfers day-to-day, as well as in e-discovery. Furthermore, as corporations look toward the next generation of technology and archiving systems, a solid IG program makes moving data from one system to another, and retrieving it, easier.
Companies implementing effective IG will also benefit from enhanced visibility of corporate data, enabling the use of more in-depth analytics and the discovery of valuable insights and trends to maximize the value of retained data. If data is a crystallization of a moment in time, then IG is the storyteller, piecing together facts and information into a narrative.
Even more significant, IG enables multiple cost savings. In proactive mode, IG-savvy organizations experience lower storage costs for live and backed-up data. In reactive mode (for example, addressing a lawsuit) they will see reduced e-discovery costs. Indeed, because IG and e-discovery have parallel workflows (finding relevant data is always the first step), IG-strong corporations will be in a stronger litigation posture.


Embarking on an IG program is daunting for any organization. In a packed garage, one would start by manually reviewing and organizing what's been tucked away, shelf by shelf, until the space is neat and tidy. With ESI, the concept is similar. Cull through the data in discrete chunks until all of it has been reviewed, and a system is in place for future storage. By tackling small portions at a time, organizations will see results and a return on investment.
A careful, considered approach is key when starting to parse organizational data via this "data remediation" process. As a first step, Legal and Compliance should ensure the organization's IG policies and procedures are sound. Some organizations may need to start by designing and implementing a corporate governance framework, while others will need to update their existing records retention policies and procedures.
This first step is critical from a risk and compliance standpoint because it can guard against future spoliation allegations. The organization's data deletion project must be defensible, meaning it has memorialized reasons for the data destruction, covering what, why, how, by whom and when the ESI was destroyed.
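One way to picture the memorialization requirement is a log entry that refuses to be created unless all five elements are filled in. The sketch below is a minimal illustration only; the record layout, field names and JSON log format are assumptions, not a prescribed or legally vetted format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeletionRecord:
    what: str     # description of the ESI destroyed
    why: str      # policy or retention-schedule basis
    how: str      # destruction method
    by_whom: str  # person or role responsible
    when: str     # ISO-8601 timestamp


def record_deletion(what, why, how, by_whom, when=None):
    """Build a log entry; rejects blank fields so no element goes unrecorded."""
    when = when or datetime.now(timezone.utc).isoformat()
    entry = DeletionRecord(what, why, how, by_whom, when)
    missing = [name for name, value in asdict(entry).items() if not str(value).strip()]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return entry


def to_log_line(entry):
    """Serialize one entry as a single JSON line for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Each call to `record_deletion` yields one immutable entry, and `to_log_line` turns it into a line that could be appended to whatever audit log the organization already keeps.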
Defensible deletion involves careful consideration of what ESI the organization intends to exercise its discretion to retain or purge, bearing in mind the nuances and contents of ESI. Different file types are used for different job functions. In addition, Legal should ensure retention of ESI that may be subject to a litigation hold or relevant to issues in litigation or government inquiries.
Once the process is clearly defined and memorialized, there are two approaches for data remediation. The first approach is akin to damming a stream. With this approach, the organization must adopt a disciplined plan for newly generated data and information. The second approach is akin to cleaning a swamp. With that approach, companies must cull through existing data troves and purge the excess.
Interestingly, some organizations find the latter approach easier to implement because most already have at least some applicable e-discovery tools in place. These tools automatically classify ESI using specified criteria, such as date and keywords.
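Classification by date and keywords can be sketched as a simple rule. This is a hypothetical illustration, not how any particular e-discovery product works: the document layout (`text`, `modified`), the cutoff date and the hold keywords are all assumptions, and anything flagged here would still go to human review rather than straight to deletion.

```python
from datetime import date


def classify(doc, cutoff, hold_keywords):
    """Return 'retain' or 'review_for_deletion' for one document.

    doc is assumed to be a dict with 'text' (str) and 'modified' (date).
    """
    text = doc["text"].lower()
    # Anything mentioning a litigation-hold keyword is kept regardless of age.
    if any(keyword.lower() in text for keyword in hold_keywords):
        return "retain"
    # Otherwise, documents older than the cutoff become deletion candidates.
    if doc["modified"] < cutoff:
        return "review_for_deletion"
    return "retain"
```

For example, with a cutoff of `date(2010, 1, 1)` and the hold keyword "Acme", a 2006 memo that mentions Acme is retained, while a 2006 lunch invitation is flagged for review.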


Using the right tools is essential for maximizing efficiency and cost-effectiveness. Some e-discovery analytics can be applied to IG simply by being deployed upstream in the process. Those analytical tools, usually used for making sense of large data sets in incident-response scenarios, include:

• De-duplication: identifying exact copies or similar versions of documents and messages.
• Concept analysis: clustering of e-documents, messages, etc. under substantive topics chosen/created by software.
• Email redundancy: separating last message from each string.
• Relationship analysis: graphically depicting who knows/communicates with whom.
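Two of the techniques listed above, de-duplication and email redundancy, lend themselves to a short sketch. The version below is a simplified illustration under stated assumptions: it catches only exact copies (after normalizing case and whitespace), not similar versions, and it assumes each email already carries a hypothetical `thread_id` and a sortable `sent` value.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def deduplicate(docs):
    """Keep the first document seen for each distinct content hash."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc["text"]).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def last_messages(emails):
    """For each thread, keep only the most recently sent message."""
    latest = {}
    for message in emails:
        current = latest.get(message["thread_id"])
        if current is None or message["sent"] > current["sent"]:
            latest[message["thread_id"]] = message
    return list(latest.values())
```

Hashing keeps the comparison cheap even on large collections; detecting near-duplicates would need a different technique, such as similarity scoring, which this sketch does not attempt.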

Another key e-discovery analytical tool is artificial intelligence-based, technology-assisted review, often called "predictive coding," which uses statistical modeling and machine learning. The technology underpinning predictive coding software functions like spam filters and targeted advertising. Predictive coding leverages machine learning and human review of samples in an iterative process, until the team is comfortable with the system's decision-making.
In e-discovery, that person-plus-machine process parses relevant from irrelevant documents. In IG, that same process can parse to-be-retained from to-be-deleted documents.
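The person-plus-machine loop can be illustrated with a toy text classifier. The sketch below uses a small naive Bayes model written from scratch, chosen here only for brevity; commercial predictive coding tools use their own, typically more sophisticated, models. The labels `retain` and `delete`, the function names and the `human_label` callback are all assumptions made for the example.

```python
import math
from collections import Counter

LABELS = ("retain", "delete")


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """labeled: list of (text, label) pairs coded by human reviewers."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the more probable label; ties fall to 'retain', the safer choice."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        # Log prior and log likelihoods, both with add-one smoothing.
        score = math.log((doc_counts[label] + 1) / (total_docs + len(LABELS)))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab) + 1))
        scores[label] = score
    return max(scores, key=scores.get)


def review_round(coded_docs, sample, human_label):
    """One iteration: reviewers code a new sample, then the model is retrained."""
    coded_docs = coded_docs + [(text, human_label(text)) for text in sample]
    return coded_docs, train(coded_docs)
```

In practice, `review_round` would be repeated with fresh samples, and the team would compare the model's predictions against reviewer decisions until it is comfortable with the agreement rate.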
Lawyers and records managers should stay abreast of ESI technologies. Pertinent innovative technologies are evolving from the e-discovery and enterprise content management fields. Savvy e-discovery providers will incorporate ECM technology into their existing review and analysis tools to help organizations save money by tackling ESI for both IG and e-discovery.


A data remediation program can begin anywhere the organization prefers. Tools can be deployed as part of a legacy data clean-up project, a litigation hold tracking system, a data loss prevention initiative, a big data analytics project or an enterprise-wide archiving migration plan.
Many organizations prefer to start by tackling unstructured data (e.g., email or instant messaging), because it is riskier than structured data (e.g., database-stored information). Individuals often feel freer to express themselves in informal, unstructured environments, and unstructured ESI is more difficult to parse than already automatically classified information.
No matter where the process begins, cull through the ESI first, then move data to new locations after remediation. Before you get to the details of deployment, vet any e-discovery or ECM platform for sufficient scalability to your IG initiative.
Ensure that IG becomes part of the corporate culture. Employees need to be aware of the corporation's records retention and information-management policies just as they are mindful of corporate expectations regarding HR practices, regulatory compliance or confidentiality requirements. Like violations in those areas, amassing large ESI volumes companywide can have a very high ultimate price.
Training on IG should teach managers and staff to rethink how they use data, so that they keep only what is required or needed, and no more. Individuals should be guided by the Legal and Compliance specialists as well as e-discovery specialists conversant in defensible deletion. Training contemporaneous with regime change also provides an opportunity to emphasize the importance of litigation holds.
Once IG becomes embedded in the fabric of corporate culture, organizations will reap the rewards from a cost-savings, risk-mitigation and business-value perspective. While cleaning up decades of ESI is daunting, it only becomes more so as more data is stuffed into the company storage bin. The time to start the clean-up is now.


Robert D. Brownstone is Technology and E-discovery Counsel, Litigation, and co-chair of the Electronic Information Management group at Silicon Valley-headquartered Fenwick & West LLP. He advises clients on a wide range of legal and IT issues. He has also taught e-discovery law and process as adjunct professor at a number of universities, and in 2015 will teach the course at the Brooklyn and University of San Francisco schools of law.

Gabriela P. Baron is the Senior Vice President of Xerox Litigation Services (XLS). She has assisted clients with regulatory investigations, major class actions, employment matters and commercial cases filed in federal and state courts.

Today's General Counsel, Nov 2014, p. 22.

Developments In Mobile Device Electronic Discovery

by Michael Weil and Mark Michels

Legal counsel and their supporting forensic teams face vexing challenges when it comes to preserving and collecting mobile device data. Smartphones and tablets frequently contain unique data that must be preserved, collected, processed, reviewed and produced in litigation just like any other form of electronically stored information.
Mobile device data is often critical for internal and regulatory investigations, as well. Unlike personal computer data that can often be collected remotely with relatively little impact on custodians, mobile device data collection usually requires separating custodians from their phones, sometimes for a very long time. Fortunately, there have been some important breakthroughs that may allow for remote, over-the-air, data collection from mobile devices, permitting a more efficient and less disruptive process.
It is not uncommon for a litigation matter or investigation to involve a large number of custodians, sometimes into the hundreds. In general, computer forensics professionals can gain access to the mobile device ESI only by physically connecting specialized forensic collection tools directly to the smartphone or tablet. This is unlike personal computer or server data collection, where they can remotely access hard drive files, or export email from a server for preservation, collection, processing and hosting.
Since physical access to the mobile device is the only way to collect email, text messages and other ESI, the custodian must part with the phone, causing serious "separation anxiety," and loss of a business tool and a personal lifeline. In some cases, companies have found that they must immediately issue new phones to custodians.
Mobile device management (MDM) systems allow IT teams to provision devices, maintain some level of security, and otherwise track mobile devices over-the-air. Some MDMs also enable recording of SMS messages, but not messages from other text messaging applications. MDMs cannot access all of the files on the device because the mobile device operating system's security scheme does not allow remote access to some critical data. For example, mobile devices may hold SMS messages that have not been logged, third-party text messages and other application data that cannot be accessed remotely through the MDMs.
There is some cause for hope, however. At the 2014 Mobile World Congress in Barcelona, a few companies showcased remote collection concepts. Furthermore, through some of our R&D efforts we have completed a proof-of-concept that demonstrated viable over-the-air remote data collection for most of the data on a smartphone.
While these remote-collection developments are encouraging, it will take some time for the operating system owners and the forensic tool developers to create protocols for complete remote over-the-air mobile device data collection. Until they do, counsel and their forensic team will need to contend with in-person device collections or cumbersome mobile device backups.

Michael Weil is a Chicago-based director for Deloitte Discovery in Deloitte Financial Advisory Services LLP, where he leads the Computer and Cyber Forensics Market Offering. He has 16 years of computer forensic examination experience, including criminal, civil, and national security matters.
Mark Michels is a San Jose-based director for Deloitte Discovery in Deloitte Transactions & Business Analytics LLP. He has 15 years of experience managing corporate discovery issues as well as 8 years of experience in patent litigation, pre-merger reviews and internal investigations.

Today's General Counsel, Nov 2014, p. 34.

Best Books of 2014 - NYTimes recommendations

Redeployment by Phil Klay

Little Failure: A Memoir by Gary Shteyngart

The Dog: Stories by Jack Livings

Lila by Marilynne Robinson

All Our Names by Dinaw Mengestu

We Are Not Ourselves by Matthew Thomas

Foreign Gods, Inc. by Okey Ndibe

Being Mortal by Atul Gawande

Fourth of July Creek by Smith Henderson

Art in America, 1945-1970 edited by Jed Perl

The Empathy Exams: Essays by Leslie Jamison

10:04 by Ben Lerner

How to Build A Girl by Caitlin Moran

My Struggle: Book Three: Boyhood by Karl Ove Knausgaard

Slant Six by Erin Belieu