Dropbox’s AI integration with OpenAI turns into a messaging mess – as Amazon’s CTO apologises over data protection post
"To data scientists and developers in the domain, the answers to these questions may be laughably obvious and the questions naive, but to most end-users they will not be."
Dropbox has alarmed many users by launching a product that sends some customers’ files to OpenAI to underpin its new Dropbox AI capabilities – leading Amazon’s CTO Werner Vogels to post on X “did I agree to this somewhere?” and tag a European data watchdog in the same post.
Dropbox AI is available in countries “with the preferred language set to English – excluding Canada, the UK, and countries within the EEA” an FAQ from Dropbox showed, as many customers apparently unexpectedly found a toggle to “share with third-party AI” turned on in their settings.
A Dropbox spokesperson told The Stack that the toggle is "only turned on to give all eligible customers the opportunity to view our new AI features and functionality, like Dropbox AI... currently only available to select customers - see our Help Center article for details on who has access [and] does not enable customers to use these features without notice. Our customers are still in control of when and how they use these features."
Vogels said on X he had “created the account when I was in the US, but [was] now in the EU” – approached with a public clarification on the service from Dropbox CEO Drew Houston, the Amazon CTO then said “I drew the wrong conclusion” and apologised “sincerely” for his “error.”
Yet Vogels was not alone in initial confusion and concern about how the new Dropbox AI service and its integration with OpenAI works; a stark reminder that generative AI-powered product releases are fraught with reputational risk and the need for utter clarity by providers on the complex plumbing behind large language models augmented with customer data.
The brouhaha on Wednesday saw Dropbox’s CEO say that “any customer confusion about this is on us, and we'll take a turn to make sure all this is abundantly clear” as he attempted to highlight who was opted in to the feature and precisely what it entailed in terms of sharing customer data.
Dropbox CEO on OpenAI integration: “Clearly labeled”
In June Dropbox announced its “Dash” and “Dropbox AI” offerings in Alpha and Beta releases respectively. Dash is an “AI-powered universal search that connects all of your tools, content, and apps” including across platforms like Google Workspace and Microsoft Outlook. It described Dropbox AI meanwhile as helping customers “quickly understand large documents or videos without parsing through the entire file.”
The company said in a December 13 FAQ that “your files within Dropbox are sent to a third-party AI only when you chose to interact with AI powered features” – and CEO Houston told customers that “third-party AI services are only used when customers actively engage with Dropbox AI features which themselves are clearly labeled,” he said in a post on X.
"The third-party AI toggle in the settings menu enables or disables access to DBX [Dropbox] AI features and functionality. Neither this nor any other setting automatically or passively sends any Dropbox customer data to a third-party AI service” Dropbox CEO Drew Houston explained.
Dropbox’s FAQ on its AI services says that “some of our AI features utilize software from third-party partners that requires sending data through outside LLMs and generative AI models to generate responses.”
Data never used to train models, but confusion abounds
“At this time, we’re partnered with one third-party AI partner, OpenAI” it clarified, saying that “your data is never used to train their internal models, and is deleted from OpenAI’s servers within 30 days.”
Dropbox claims in its FAQ that “only the content relevant to an explicit request or command is sent to our third-party AI partners to generate an answer, summary, or transcript”. But the company could not immediately answer a question from The Stack about how a third-party AI “knew” what content was relevant to an “explicit request” without having parsed that content in its entirety, nor did an otherwise promptly responding spokesperson confirm if it was using Retrieval Augmented Generation.
(To data scientists and some developers in the domain, the answers to these questions may be laughably obvious and the questions themselves naive, but to most end-users they will not be. We suspect that it involves something like embedding a question with the OpenAI embeddings
endpoint; running a semantic search on an Elasticsearch index using the encoded question and sending the top search results to the OpenAI Chat Completions API endpoint for RAG. Explaining a process like this and what happens where clearly, perhaps in a user-friendly chart, is no mean feat however. Send your best efforts over if you fancy the challenge!)
AI is a big Dropbox focus
Dropbox, meanwhile, finds itself having to deal with some customers howling over data privacy fears at a time when it should have been celebrating a new product feature designed, on paper, to make both customers and its shareholders happy.
The incident comes weeks after the company reported fiscal Q3 revenues of $633 million, with GAAP net income of $114.1 million. On its Q3 earnings call CFO Tim Regan emphasised the extent to which AI was a priority for the company, saying it sees "the shift from files and folders, along with recent advancements in AI, giving way to new market opportunities. In particular, the search and knowledge discovery software market. IDC sizes this as a $7 billion market today that's expected to triple over the next four years. And we believe we're well positioned to take part in this secular wave... We recognize in the rapidly evolving world of AI, customers are looking for tools they can trust to keep their content safe. This is why we're building in the right controls, admin and compliance features so customers can feel safe deploying [such features]."