Doe v. Github, Inc.
Cadio R. Zirpoli, Christopher Kar-Lun Young, Steven Noel Williams, Travis Luke Manfredi, Joseph R. Saveri, Elissa Amlin Buchanan, Joseph Saveri Law Firm, LLP, San Francisco, CA, Matthew Butterick, Matthew Butterick, Attorney at Law, Los Angeles, CA, for Plaintiffs.
Alyssa M. Caridis, William Wayne Oxley, Orrick, Herrington & Sutcliffe LLP, Los Angeles, CA, Annette L. Hurst, Daniel Denny Justice, Orrick, Herrington & Sutcliffe LLP, San Francisco, CA, for Defendants GitHub, Inc., Microsoft Corporation.
Michael A. Jacobs, Tiffany Cheung, Joseph Charles Gratz, Melody Ellen Wong, Morrison & Foerster LLP, San Francisco, CA, Alexandra Marie Ward, Los Angeles, CA, Allyson Roz Bennett, Rose S. Lee, Morrison & Foerster LLP, Los Angeles, CA, for Defendants OpenAI, Inc., OpenAI, L.P., OpenAI GP, L.L.C., OpenAI Startup Fund GP I, L.L.C., OpenAI Startup Fund I, L.P., OpenAI Startup Fund Management, LLC.
ORDER GRANTING IN PART AND DENYING IN PART MOTIONS TO DISMISS
Re: ECF Nos. 50, 53
Before the Court are motions to dismiss filed by Defendants GitHub, Inc. and Microsoft Corporation, ECF No. 50; and Defendants OpenAI, Inc., OpenAI, L.P., OpenAI GP, L.L.C., OpenAI Startup Fund GP I, L.L.C., OpenAI Startup Fund I, L.P., and OpenAI Startup Fund Management, LLC (collectively "OpenAI Defendants"), ECF No. 53. The Court will grant the motions in part and deny them in part.
Plaintiffs are software developers who challenge Defendants' development and operation of Copilot and Codex, two artificial intelligence-based coding tools.1 For the purposes of the present motions, the Court accepts as true the following facts in the operative complaint.2
GitHub, which was acquired by Microsoft in 2018, is the largest internet hosting service for software projects stored in Git, a widely used open-source version control system for managing software source code. Using GitHub permits software developers or programmers to collaborate on projects stored in repositories. Repositories may be private or public; anyone can view and access code stored in public repositories.
All code uploaded to GitHub is subject to the GitHub Terms of Service, which provide that users retain ownership of any content they upload to GitHub, but grant GitHub the "right to store, archive, parse, and display [the content], and make incidental copies, as necessary to provide the Service, including improving the Service over time." No. 22-cv-7074-JST, ECF No. 1-2 at 27. This "includes the right to do things like copy [the code] to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; [and] share it with other users." Id. at 27-28. Further, the Terms of Service provide that users who set their repositories to be viewed publicly "grant each User of GitHub a nonexclusive, worldwide license to use, display, and perform [the content] through the GitHub Service and to reproduce [the content] solely on GitHub as permitted through GitHub's functionality." Id. at 28.
Without AI-based assistance, programmers generally write code "both by originating code from the writer's own knowledge of how to write code as well as by finding pre-written portions of code that—under the terms of the applicable license—may be incorporated into the coding project." Compl. ¶ 78. Plaintiffs have each published licensed materials in which they hold a copyright interest to public repositories on GitHub. When creating a new repository, a GitHub user may "select[ ] one of thirteen licenses from a dropdown menu to apply to the contents of that repository." Id. ¶ 34 n.4. Two of the suggested licenses waive copyrights and related rights. The remaining eleven suggested licenses3 require that any derivative work or copy of the licensed work include attribution to the owner, inclusion of a copyright notice, and inclusion of the license terms. Each Plaintiff published code to a public repository on GitHub under one of the eleven suggested licenses that include these three requirements.
In June 2021, GitHub and OpenAI released Copilot, an AI-based program that can "assist software coders by providing or filling in blocks of code using AI." Id. ¶ 8. In August 2021, OpenAI released Codex, an AI-based program "which converts natural language into code and is integrated into Copilot." Id. ¶ 9. Codex is integrated into Copilot: "GitHub Copilot uses the OpenAI Codex to suggest code and entire functions in real-time, right from your editor." Id. ¶ 47 (quoting GitHub website). GitHub users pay $10 per month or $100 per year for access to Copilot. Id. ¶ 8.
Codex and Copilot employ machine learning, "a subset of AI in which the behavior of the program is derived from studying a corpus of material called training data." Id. ¶ 2. Using this data, "through a complex probabilistic process, [these programs] predict what the most likely solution to a given prompt a user would input is." Id. ¶ 79. Codex and Copilot were trained on "billions of lines" of publicly available code, including code from public GitHub repositories. Id. ¶¶ 82-83.
Despite the fact that much of the code in public GitHub repositories is subject to open-source licenses which restrict its use, id. ¶ 20, Codex and Copilot "were not programmed to treat attribution, copyright notices, and license terms as legally essential," id. ¶ 80. Copilot reproduces licensed code used in training data as output with missing or incorrect attribution, copyright notices, and license terms. Id. ¶¶ 56, 71, 74, 87-89. This violates the open-source licenses of "tens of thousands—possibly millions—of software developers." Id. ¶ 140. Plaintiffs additionally allege that Defendants improperly used Plaintiffs' "sensitive personal data" by incorporating the data into Copilot and therefore selling and exposing it to third parties. Id. ¶¶ 225-39.
Plaintiffs filed multiple cases against Defendants, which were subsequently consolidated. ECF No. 47. Plaintiffs, on behalf of themselves and two putative classes,4 plead twelve counts against Defendants: (1) violation of the Digital Millennium Copyright Act ("DMCA"), 17 U.S.C. §§ 1201-05; (2) common law breach of open-source licenses; (3) common law tortious interference in a contractual relationship; (4) common law fraud; (5) false designation of origin in violation of the Lanham Act, 15 U.S.C. § 1125; (6) unjust enrichment in violation of Cal. Bus. & Prof. Code §§ 17200, et seq., and the common law; (7) unfair competition in violation of the Lanham Act, 15 U.S.C. § 1125; Cal. Bus. & Prof. Code §§ 17200, et seq., and the common law; (8) breach of contract for violation of the GitHub Privacy Policy and Terms of Service; (9) violation of the California Consumer Privacy Act ("CCPA"); (10) common law negligence; (11) common law civil conspiracy; and (12) declaratory relief under 28 U.S.C. § 2201(a) and Cal. Code Civ. Proc. § 1060.5 Defendants now move to dismiss the complaint. ECF Nos. 50, 53.
The Court has jurisdiction over Plaintiffs' federal claims under 28 U.S.C. § 1331 and supplemental jurisdiction over Plaintiffs' state law claims under 28 U.S.C. § 1367.
"Article III of the Constitution confines the federal judicial power to the resolution of 'Cases' and 'Controversies.' " TransUnion LLC v. Ramirez, 594 U.S. 413, 141 S. Ct. 2190, 2203, 210 L.Ed.2d 568 (2021). "For there to be a case or controversy under Article III, the plaintiff must have a 'personal stake' in the case—in other words, standing." Id. (quoting Raines v. Byrd, 521 U.S. 811, 819, 117 S.Ct. 2312, 138 L.Ed.2d 849 (1997)). A defendant may attack a plaintiff's assertion of jurisdiction by moving to dismiss under Rule 12(b)(1) of the Federal Rules of Civil Procedure. Cetacean Cmty. v. Bush, 386 F.3d 1169, 1174 (9th Cir. 2004); see also Maya v. Centex Corp., 658 F.3d 1060, 1067 (9th Cir. 2011) ().
"A Rule 12(b)(1) jurisdictional attack may be facial or factual." Safe Air for Everyone v. Meyer, 373 F.3d 1035, 1039 (9th Cir. 2004). "In a facial attack, the challenger asserts that the allegations contained in a complaint are insufficient on their face to invoke federal jurisdiction." Id. Where, as here, a defendant makes a facial attack, the court assumes that the complaint's allegations are true and draws all reasonable inferences in the plaintiff's favor. Wolfe v. Strankman, 392 F.3d 358, 362 (9th Cir. 2004).
"Dismissal under [Federal Rule of Civil Procedure] 12(b)(6) is appropriate only where the complaint lacks a cognizable legal theory or sufficient facts to support a cognizable legal theory." Mendiondo v. Centinela Hosp. Med. Ctr., 521 F.3d 1097, 1104 (9th Cir. 2008). A complaint must contain "a short and plain statement of the claim showing that the pleader is entitled to relief." Fed. R. Civ. P. 8(a)(2). Facts pleaded by a plaintiff "must be enough to raise a right to relief above the speculative level." Bell Atl. Corp. v. Twombly, 550 U.S. 544, 555, 127 S.Ct. 1955, 167 L.Ed.2d 929 (2007).
"To survive a motion to dismiss, a complaint must contain sufficient factual matter, accepted as true, to 'state a claim to relief that is plausible on its face.' " Ashcroft v. Iqbal, 556 U.S. 662, 678, 129 S.Ct. 1937, 173 L.Ed.2d 868 (2009) (quoting Twombly, 550 U.S. at 570, 127 S.Ct. 1955). "A claim has facial plausibility when the plaintiff pleads factual content that allows the court to draw the reasonable inference that the defendant is liable for the misconduct alleged." Id. In determining whether a plaintiff has met this plausibility standard, the Court must "accept all factual allegations...