Version Control in Source Control: When IP Investigations and Dev Tech Collide.
IP disputes and investigations are a growing concern among companies that rely on constant innovation to drive their business. A recent Norton Rose Fulbright survey found that technology companies were more likely to be concerned about IP disputes than any other type and 46% listed them among the most concerning. Moreover, Skadden recently reported a “burst of arbitration activity involving IP licensing, trade secrets and disputes.”
Investigations and e-discovery exercises in response to IP theft, trade secret misappropriation and other related disputes are fraught with challenges, writes Stu Craft, Managing Director, FTI Technology.
Finding evidence of wrongdoing, calculating the value of lost IP, proving ownership and recovering sensitive files are only a few examples of the issues that can arise in these complicated matters. They often involve uncommon data types, large groups of custodians, highly sensitive material and significant time pressure.
As an example, our team was recently engaged as an independent data expert to help resolve a trade secret case in Dubai involving digital forensic collection, analysis and remediation across six different jurisdictions, 90 custodians, numerous systems and more than 100 devices.
Additional factors behind the rising challenges in IP disputes are the accelerating pace of innovation and the growing complexity of data sources. To power competitive progress in this fast-paced environment, organisations are increasingly utilising development technology and source control platforms to support collaboration among their research and development teams. In turn, these platforms are creating new technical difficulties and discovery obstacles in IP-related investigations.
Source control platforms such as Git are used by development teams to track changes and versions across one or more contributors. They are typically used in code and application development, but can be used for any set of files. At a high level, many are hosted in the cloud and are similar to SharePoint, in that they save a version history as changes are made and allow users to see the changes their colleagues are making. New changes can also be merged into existing files, and users can save versions or branches locally on their own computers to conduct additional work separate from the activity tracked within the shared system.
Version history alone presents challenges in the discovery and investigations process. It can become very difficult to pinpoint which version is the “correct” one for evidence preservation, and to fully understand when and how any given individual interacted with any given version. Source control platforms become even trickier in their ability to support branching, which allows multiple versions of a file to exist in independent branches. Further changes and comments can be inserted into branches or into versions within branches. As developers work and conduct testing on their projects, they can create branches upon branches from a single original file, all while their teammates do the same, leading to hundreds or thousands of unique instances of one item. Branches can also be merged, so two or more sets of changes may be combined within one file.
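To make the branching problem concrete, the following is a minimal Python sketch of a version history modelled as a graph of commits. It is not how Git or any particular platform stores its data: the `Commit` class, the branch names, the authors and the file contents are all hypothetical and exist only for illustration. It shows how a single original file can end up with several distinct versions, each visible from a different set of branches, once work diverges and is merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """One saved snapshot of a single file (hypothetical, simplified model)."""
    commit_id: str
    author: str
    content: str
    parents: tuple = ()  # two parents means the commit is a merge


def history(tip, commits):
    """Return the ids of every commit reachable from a branch tip."""
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(commits[cid].parents)
    return seen


def distinct_versions(branches, commits):
    """Map each distinct file content to the branches whose history contains it."""
    versions = {}
    for name, tip in branches.items():
        for cid in history(tip, commits):
            versions.setdefault(commits[cid].content, set()).add(name)
    return versions


# Hypothetical repository: one original file, two developers, one merge.
commits = {c.commit_id: c for c in [
    Commit("c1", "alice", "v1"),
    Commit("c2", "alice", "v1 + feature A", ("c1",)),
    Commit("c3", "bob", "v1 + feature B", ("c1",)),
    Commit("c4", "alice", "v1 + feature A + feature B", ("c2", "c3")),
]}
branches = {"main": "c4", "feature-b": "c3", "experiment": "c2"}

versions = distinct_versions(branches, commits)
for content, on_branches in sorted(versions.items()):
    print(f"{content!r} appears on: {sorted(on_branches)}")
```

Even in this four-commit example, "the latest version" depends on which branch is asked, which is why the choice of branch and version has to be made explicitly before anything is collected.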
In a dispute, especially one centered around IP created, stored and modified within a source control platform, how can the legal department determine which version or branch is relevant to the matter? Which contains evidence of IP ownership? Which branches and comments capture the activity of a custodian suspected of infringing on or stealing IP?
It's a complex process that begins with forensic collection of files within the source control platform and, often, from devices that contain corresponding files and versions. In some scenarios, counsel may request that only the latest version in the system be collected. However, the latest version in the platform may not be the same as the latest version that’s been saved to the lead developer’s computer. There are dozens of potential nuances that could affect which version may need to be reviewed by legal or preserved for evidence in an ongoing dispute. Our team has also encountered scenarios in which we must review the differences between versions, which requires a separate collection for each version.
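As an illustration of what reviewing differences between versions can look like once two copies have been collected, here is a small Python sketch using the standard library's `difflib`. The file contents and the labels `platform/latest` and `developer-laptop/latest` are invented for the example; in a real matter the inputs would be the forensically collected copies, and the tooling would likely be more specialised.

```python
import difflib


def diff_versions(old_text, new_text, old_label, new_label):
    """Return a unified diff between two collected versions of a file."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Hypothetical contents of one file as collected from two places.
platform_copy = "def price(qty):\n    return qty * 10\n"
laptop_copy = "def price(qty):\n    return qty * 12  # revised rate\n"

changes = diff_versions(platform_copy, laptop_copy,
                        "platform/latest", "developer-laptop/latest")
print("\n".join(changes))
```

An empty result would indicate that the two copies are textually identical; any lines prefixed with `-` or `+` mark where the platform copy and the locally saved copy diverge.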
Throughout all decision points of which versions to collect and review — and the technical steps of doing so — defensibility is a key issue. There are several best practices to follow to ensure any collection of material from a source control platform is done in a repeatable, defensible and forensically sound manner. These include:
- Carefully evaluate which versions may need to be collected and whether any developers may have separate versions stored locally on their individual computers. This includes examining how the platform is configured in terms of user access to projects, groups and versions.
- Determine whether, and to what extent, the platform in use has search capabilities that support discovery and investigations workflows.
- Make sure the repository is set to the correct version and branch before collecting and processing the information within it; otherwise the correct copy may be missed in the collection.
- Closely monitor collections undertaken by internal legal, IT or other stakeholders. Even if the organisation is unwilling to allow an outside provider access to the platform, these efforts must be assisted and overseen by an investigator with technical knowledge of the source control platform in use and the collection methods that are compatible with it.
- When multiple versions and branches have been collected, utilise visual analytics to make sense of the full scope of versions and how they differ from one another.
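One common way to support repeatability, offered here as an assumption rather than as a practice named in the list above, is to record a cryptographic hash of every collected file so that a later collection of the same version and branch can be checked against the first. The Python sketch below builds such a manifest with SHA-256. The function name `build_manifest` and the sample files are hypothetical, and the temporary directory simply stands in for an exported copy of a repository.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root):
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


# Demonstration on a throwaway directory standing in for a collected export.
with tempfile.TemporaryDirectory() as export_dir:
    (Path(export_dir) / "src").mkdir()
    (Path(export_dir) / "src" / "model.py").write_bytes(b"print('v1')\n")
    (Path(export_dir) / "README.txt").write_bytes(b"collected copy\n")

    first = build_manifest(export_dir)
    second = build_manifest(export_dir)  # repeat run over the same files

for name, digest in first.items():
    print(digest, name)
```

If two manifests taken from what should be the same version differ, that is a signal that the repository was not set to the same branch or version on both occasions, or that a file changed between collections.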
Popular opinion suggests that organisations within innovative industries are going to see an increase in disputes and investigations relating to their IP. As a result, legal teams within organisations that are centered around research, development and technology advancement are going to see more matters that require the collection and review of evidence from source control platforms, another of the many emerging data sources that are complicating investigations and e-discovery processes. It will be key for counsel to recognise the nuances of collecting, processing and reviewing files from emerging data sources, and to plan accordingly at the outset of a matter to ensure all necessary evidence is collected and that every step is defensible.