Built-in vs 3rd Party AI: How to Approach Adding Generative AI to Your Software Stack
It’s easy to start using AI at work. Almost too easy, in fact.
13 million people used ChatGPT each day in January 2023. That number tripled only six months later. SimilarWeb lists ChatGPT as the 28th most popular site in the world today, alongside Pinterest and Netflix.
“We cannot recall a faster ramp in a consumer internet app,” a UBS analyst told Reuters.
The odds are extremely high that your team has already used ChatGPT in their work. If that speeds up their work and reduces repetitive busy work, that’s a win for your team’s productivity.
If that comes at the expense of data security, though, or opens up your company to potential copyright lawsuits, the benefits might not be worth the risk.
The safest way to use large language model (LLM) generative AI today is in the software your team has already deployed.
A crisis of AI confidence
Just four months after ChatGPT was released, Samsung banned the AI-powered chat tool from its premises. “There are growing concerns about security risks presented by generative AI,” wrote Samsung’s management team in May 2023, after a team member shared sensitive code with the chatbot. Apple followed suit, restricting the use of ChatGPT and GitHub’s Copilot AI over concerns that “workers who use these types of programs could release confidential data,” the Wall Street Journal reported later that month.
Generative AI brings three core concerns to the workplace. The first is data leakage from information entered into the LLM. Will data typed into ChatGPT be used to train AI models? Could the AI then accidentally leak that info to others? That’s worrying if you’re creating new products still under wraps, where leaked details would hand competitors an advantage. It’s also potentially illegal if personally identifiable data covered by HIPAA and other privacy frameworks is shared with and stored by the AI system.
Then come two equally pressing concerns around generative AI’s output. Who’s liable if the AI generates misleading or libelous content? And who owns the copyright of AI-generated output, especially if it’s created from copyrighted input?
The former is yet to be clearly decided, but there are strong indicators that AI platforms will be held responsible for their output. The European Union has taken concerns over misleading AI output seriously enough that their upcoming AI Liability Directive may “allow individuals to claim compensation for harm caused by a defective product” from the service provider.
Across the Atlantic, similar concerns have brought challenges to US Section 230, the law that protects platform operators from suits over user-generated content. That protection might not extend to AI-generated output: “The law’s 1996 drafters told DealBook that it does not,” the New York Times reported. If you operate a generative AI service and it publishes libelous content, the jury’s still out on whether you’re liable.
Fair use is under similar scrutiny. Spotify, for example, removed AI-generated songs shortly after Universal Music’s CEO stated that such creations “create rights issues with respect to existing copyright law.” Similar concerns led developer Benji Smith to take down Prosecraft, a site offering AI-powered analysis of in-copyright books, while over ten thousand Authors Guild members signed a July petition calling on AI teams to stop indexing their books.
Generative AI is operating in a gray space today. It’s too easy to use LLMs without thinking through the consequences of a quick copy from internal tools and a paste into ChatGPT.
That’s one of the many reasons why, on the whole, consumers and business users are not paying for AI today. It’s an untrustworthy black box, fun for experiments but with enough uncertainties to make a legal team want to hold back.
Professional AI, backed by indemnification
That’s changing. Today’s best AI integrations come with increasingly strong legal protections.
You’re on your own with AI images generated by the open-source Stable Diffusion or the more advanced, paid Midjourney service. If they’re found to infringe on others’ copyrights, “you are responsible for Your use of the service,” state Midjourney’s terms of service.
Not so with Adobe. While it came to market months later, Adobe’s generative AI is “trained on Adobe Stock images, openly licensed content, and public domain content” and “designed to be safe for commercial use.”
That gives Adobe the confidence to stand behind their AI-generated output. Claude Alexandre, Adobe’s VP of digital media, promised on Adobe Firefly’s launch “full indemnification for the content created.” “We stand behind the images that we provide,” she concluded. That commercial use readiness now also applies to the newly out-of-beta generative AI built into Photoshop.
Shutterstock offers similar protections for its AI-generated images, backed by human checks. Enterprise customers can generate AI-powered images and submit them to Shutterstock for clearance; after review, “our AI-generated content receives the same legal protection and backing as traditional stock images,” promises the stock photo company.
Text-centric AI launched with the same hands-off approach Midjourney took. Many of the products that added GPT-powered AI launched with clauses shielding the software provider from liability. Zendesk’s terms, for example, state that the company “will not be liable in connection with any inaccuracies or biases produced by OpenAI Functionality,” and that AI is not covered under its HIPAA offerings. ClickUp’s terms caution that “Output may not be unique and ClickUp AI may generate the same or similar output to ClickUp or other third parties,” and similarly forbid sharing protected data. Other OpenAI-powered apps include similar clauses. HubSpot goes further, noting that “we may use Customer Data to train our machine learning models,” with options to opt out if desired.
Increasingly, though, the largest software vendors are building more trust-focused AI integrations. Notion’s AI doesn’t guarantee output, but it does promise to “not use your data to train our models.” Oracle promises that their “generative AI doesn’t mix customer data. The models trained by customers are unique to them.” Salesforce takes a similar approach. “No customer data is stored outside of Salesforce,” states the CRM company. “Generative AI prompts and outputs are never stored in the LLM, and are not learned by the LLM. They simply disappear.”
OpenAI itself is following suit. While not offering any guarantees around its output, the new ChatGPT Enterprise does protect users' data. “We do not train on your business data, and our models don’t learn from your usage,” promises OpenAI, with clauses not included in their free ChatGPT offering.
Then came Microsoft, with the strongest protection for their generative AI features. Microsoft was early to the LLM game, investing $1 billion in OpenAI in 2019, shipping GitHub Copilot in October 2021 as an AI tool to aid coding, and promising not to train the LLM on your code or other data. Two years later, Copilot is being added to Windows and Office software, and the software giant is confident enough to guarantee its AI output.
“If a third party sues a commercial customer for copyright infringement for using Microsoft’s Copilots or the output they generate,” promises Microsoft’s Copilot Copyright Commitment, “we will defend the customer and pay the amount of any adverse judgments or settlements that result from the lawsuit, as long as the customer used the guardrails and content filters we have built into our products.”
That confidence stems from the way Microsoft trained their LLM. Similar to Adobe, Microsoft has “built important guardrails into our Copilots to help respect authors’ copyrights, [with] filters ... to reduce the likelihood that Copilots return infringing content.”
There’s still no clear guarantee from any generative AI software provider that you can own the copyright on AI-generated output. But with Adobe, Microsoft, and Shutterstock, at least, you can trust that the output isn’t infringing on others’ copyright, and that you’re safe from liability.
The case to only use AI inside trusted software
That feels like the future of AI.
Software license agreements have long been written to shield platforms and providers from liability over software usage. Many of Microsoft’s older license agreements included a “No High Risk Use” clause, forbidding use in the operation of nuclear reactors among other dangerous scenarios.
Extending those agreements to cover AI output is a far more professional way to use generative AI than simply copying and pasting data into an LLM chatbox.
Today, the best way to deploy generative AI at work is in tools you already trust. That’s how our team is deploying AI, and what we recommend to our customers. It’s more expensive, especially compared to the free ChatGPT, but more trustworthy. You’ll often get more useful results, as the LLM can draw on your existing data to generate contextual responses. At the very least, you’ll be covered by existing terms of service and license agreements. And with the largest software providers, your legal team can rest easy knowing your AI-powered work may be protected against copyright claims.
Then, when deploying new software today, check its terms of service for details about AI, even if you’re buying it for other features. Your company needs to ensure that your data isn’t being used in ways you’re not comfortable with, or in ways that could be illegal when protected, sensitive data is involved.
And if you’re building generative AI features into your products today, consider how to give your customers confidence. Guarantee that your models aren’t trained on customer data, or let customers opt out if they wish. Test and monitor output for potential issues. If possible, offer protections around the output, or perhaps consider adding AI insurance, as Harvard Business Review suggested in early 2020. That’s how you build the confidence customers need to pay for AI software.
“We want to protect client information used in ML models but we don’t know how to get there,” a bank’s head of security told the article’s authors. That’s how most customers feel about AI-powered tools and software in the workplace.
It’s only worth getting robots to aid work if we can trust them to first do no harm.