In late May, the Pentagon seemed to be on fire.
A few miles away, White House aides and reporters scrambled to determine whether a viral online image of the exploding building was in fact real.
It wasn’t. It was AI-generated. Yet government officials, journalists, and tech companies were unable to act before the image had real impact. It not only caused confusion but also led to a dip in financial markets.
Manipulated and misleading content is not a new phenomenon. But AI enables increasingly accessible, sophisticated, and hyperrealistic content creation that, while it can be used for good in artistic expression or accessibility improvements, can also be abused to cast doubt on political events, or to defame, harass, and exploit.
Whether to promote election integrity, protect evidence, reduce misinformation, or preserve historical records, audiences may benefit from knowing when content has been manipulated or generated with AI. Had the Pentagon image contained signs that it was AI-generated, technology platforms might have been able to act more quickly; they could have promptly reduced its distribution or labeled the content so that audiences could more easily identify it as fake. The confusion, and by extension the market movement, might have been avoided.
There is no question that we need more transparency if we are going to be able to distinguish between what is real and what is synthetic. Last month, the White House weighed in on how to do this, announcing that seven of the most prominent AI companies have committed to “develop robust technical measures to ensure that users know when content is AI-generated, such as watermarking.”
Disclosure methods like watermarks are a good start. However, they are complicated to put into practice, and they are not a quick fix. It is unclear whether watermarks would have helped Twitter users recognize the fake image of the Pentagon or, more recently, identify Donald Trump’s voice in an ad campaign as synthetic. Could other methods, such as provenance disclosure and metadata, have more impact? And most important, would merely disclosing that content was AI-generated help audiences differentiate fact from fiction, or mitigate real-world harm?
To begin to answer these questions, we need to clarify what we mean by watermarking and other types of disclosure methods: what they are, what we can reasonably expect them to do, and what problems remain even after they are introduced. Although definitional debates can seem pedantic, the broad use of the term “watermark” is currently contributing to confusion and a lack of coordination across the AI sector. Defining what we mean by these different methods is a crucial prerequisite for the AI field to work together and agree on standards for disclosure. Otherwise, people are talking at cross-purposes.
I’ve observed this problem firsthand while leading the nonprofit Partnership on AI (PAI) in its multi-sector work to develop guidelines for responsible synthetic media, with commitments from organizations like OpenAI, Adobe, Witness, Microsoft, the BBC, and others.
On the one hand, watermarking can refer to signals that are visible to end users (for instance, the “Getty Images” text emblazoned on the image supplier’s media). On the other, it can refer to technical signals embedded in content that are imperceptible to the naked eye or ear. Both types of watermarks, described as “direct” and “indirect” disclosure respectively, are important to get right in order to ensure transparency. Any conversation about the challenges and opportunities of watermarking must therefore specify which kind of watermarking is being evaluated.
Further complicating matters, watermarking is often used as a catch-all term for the general act of providing content disclosures, even though there are many different methods. A closer read of the White House commitments describes another method of disclosure known as provenance, which relies on cryptographic signatures rather than invisible signals; yet this, too, is often described as watermarking in the popular press. If you find this mishmash of terms confusing, rest assured you’re not the only one. But clarity matters: the AI sector cannot implement consistent and robust transparency measures if there is not even agreement on how we refer to the different techniques.
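To make the distinctions concrete, here is a deliberately simplified sketch in Python. It is not drawn from any company’s actual system; the least-significant-bit scheme and the function names are illustrative assumptions. A visible label is a direct disclosure, bits hidden imperceptibly in pixels are an indirect disclosure, and a cryptographic signature over the content (plus, in real provenance systems, signed metadata about how the content was made) is what the provenance approach relies on.

```python
# Toy illustrations of the three disclosure ideas discussed above. None of this
# reflects a production system; the LSB scheme in particular is a deliberately
# naive stand-in for real invisible watermarking.
import numpy as np
from PIL import Image, ImageDraw
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def direct_disclosure(img: Image.Image) -> Image.Image:
    """Direct (visible) disclosure: a label any viewer can see."""
    labeled = img.copy()
    ImageDraw.Draw(labeled).text((10, 10), "AI-GENERATED", fill="red")
    return labeled


def indirect_disclosure(img: Image.Image, message: bytes) -> Image.Image:
    """Indirect (invisible) disclosure: hide bits in the lowest bit of pixel values."""
    arr = np.array(img.convert("RGB"), dtype=np.uint8)
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = arr.reshape(-1).copy()
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # overwrite least significant bit
    return Image.fromarray(flat.reshape(arr.shape))


def provenance_signature(content: bytes, key: Ed25519PrivateKey) -> bytes:
    """Provenance: a cryptographic signature over the content itself."""
    return key.sign(content)


if __name__ == "__main__":
    img = Image.new("RGB", (256, 256), "gray")
    visible = direct_disclosure(img)             # anyone can see this label
    invisible = indirect_disclosure(img, b"AI")  # looks identical to the eye
    key = Ed25519PrivateKey.generate()
    sig = provenance_signature(img.tobytes(), key)
    key.public_key().verify(sig, img.tobytes())  # raises an error if the content was altered
```

The point is not these particular techniques but that each one answers a different question, which is why lumping them all together as “watermarking” breeds confusion.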
I’ve come up with six initial questions that could help us evaluate the usefulness of watermarks and other disclosure methods for AI. They should help ensure that different parties are discussing exactly the same thing, and that we can evaluate each method in a thorough, consistent manner.
Can the watermark itself be tampered with?
Ironically, the technical signals touted as helpful for gauging where content comes from and how it has been manipulated can sometimes be manipulated themselves. While it is difficult, both invisible and visible watermarks can be removed or altered, rendering them useless for telling us what is and isn’t synthetic. Notably, the ease with which they can be manipulated varies with the type of content you’re dealing with.
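As a toy demonstration of that fragility, consider the naive invisible watermark sketched above (again, an illustrative assumption rather than any vendor’s scheme): something as ordinary as re-saving the image as a JPEG wipes out the embedded signal, and a motivated bad actor can do far more than that.

```python
# Toy demonstration: a naive least-significant-bit watermark (an illustrative
# assumption, not a real vendor scheme) does not survive ordinary lossy
# re-encoding, let alone deliberate tampering.
import io

import numpy as np
from PIL import Image


def embed(img: Image.Image, message: bytes) -> Image.Image:
    arr = np.array(img.convert("RGB"), dtype=np.uint8)
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = arr.reshape(-1).copy()
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    return Image.fromarray(flat.reshape(arr.shape))


def extract(img: Image.Image, n_bytes: int) -> bytes:
    flat = np.array(img.convert("RGB"), dtype=np.uint8).reshape(-1)
    return np.packbits(flat[: n_bytes * 8] & 1).tobytes()


marked = embed(Image.new("RGB", (128, 128), "gray"), b"AI")
print(extract(marked, 2))        # b'AI': the watermark reads back correctly

buffer = io.BytesIO()
marked.save(buffer, format="JPEG", quality=90)   # an everyday lossy re-encode
buffer.seek(0)
print(extract(Image.open(buffer), 2))            # no longer b'AI': the signal is gone
```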
Is the watermark’s durability consistent for various content types?
While invisible watermarking is often promoted as a broad solution for dealing with generative AI, such embedded signals are much more easily manipulated in text than in audiovisual content. That likely explains why the White House’s summary document suggests that watermarking will be applied to all types of AI, while the full text makes clear that companies only committed to disclosures for audiovisual material. AI policymaking must therefore be specific about how disclosure techniques like invisible watermarking vary in their durability and broader technical robustness across different content types. One disclosure solution may be great for images but useless for text.
Who can detect these invisible signals?
Even if the AI sector agrees to implement invisible watermarks, deeper questions will inevitably emerge around who has the capacity to detect these signals and, eventually, to make authoritative claims based on them. Who gets to decide whether content is AI-generated and, perhaps by extension, whether it is misleading? If everyone can detect watermarks, that might make them liable to misuse by bad actors. On the other hand, controlled access to the detection of invisible watermarks, especially if it is dictated by large AI companies, might degrade openness and entrench technical gatekeeping. Implementing these sorts of disclosure methods without working out how they are governed could leave them distrusted and ineffective. And if the techniques are not widely adopted, bad actors might simply turn to open-source technologies that lack invisible watermarks to create harmful and misleading content.
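One way to picture the governance trade-off is a detection scheme gated by a secret key. The sketch below is a toy construction for illustration only, not any company’s method: the positions carrying the signal are derived from a key, so whoever holds the key can verify the watermark, and whoever does not cannot. The same key-holder could also locate and strip the mark, which is exactly why who controls detection matters.

```python
# Toy construction (not any company's actual scheme): the pixel positions
# carrying the invisible signal are derived from a secret key, so detection
# is only possible for whoever holds that key.
import hashlib
import hmac

import numpy as np
from PIL import Image


def keyed_positions(key: bytes, n_bits: int, n_values: int) -> np.ndarray:
    """Derive pseudorandom, key-dependent embedding positions from HMAC-SHA256."""
    seed = int.from_bytes(hmac.new(key, b"positions", hashlib.sha256).digest(), "big")
    rng = np.random.default_rng(seed)
    return rng.choice(n_values, size=n_bits, replace=False)


def embed(img: Image.Image, message: bytes, key: bytes) -> Image.Image:
    arr = np.array(img.convert("RGB"), dtype=np.uint8)
    flat = arr.reshape(-1).copy()
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    pos = keyed_positions(key, bits.size, flat.size)
    flat[pos] = (flat[pos] & 0xFE) | bits
    return Image.fromarray(flat.reshape(arr.shape))


def detect(img: Image.Image, n_bytes: int, key: bytes) -> bytes:
    flat = np.array(img.convert("RGB"), dtype=np.uint8).reshape(-1)
    pos = keyed_positions(key, n_bytes * 8, flat.size)
    return np.packbits(flat[pos] & 1).tobytes()


noise = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
key = b"held by whoever governs detection"
marked = embed(Image.fromarray(noise), b"AI", key)
print(detect(marked, 2, key))           # b'AI' with the right key
print(detect(marked, 2, b"wrong key"))  # unrelated bytes without it
```

Publishing the key opens detection to everyone, including those who would use it to strip the mark; keeping it private concentrates that power in a few hands. Neither default settles the governance question.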
Do watermarks preserve privacy?
As key work from Witness, a human rights and technology group, makes clear, any tracing system that travels with a piece of content over time can also introduce privacy issues for those creating the content. The AI sector must ensure that watermarks and other disclosure techniques are designed in a manner that does not include identifying information that might put creators at risk. For example, a human rights defender might capture abuses through photographs that are watermarked with identifying information, making the person an easy target for an authoritarian government. Even the knowledge that watermarks could reveal an activist’s identity might have chilling effects on expression and speech. Policymakers must provide clearer guidance on how disclosures can be designed to preserve the privacy of those creating content while still including enough detail to be useful and practical.
Do visible disclosures help audiences understand the role of generative AI?
Even if invisible watermarks are technically durable and privacy-preserving, they won’t help audiences interpret content. Though direct disclosures like visible watermarks have an intuitive appeal for providing greater transparency, they don’t necessarily achieve their intended effects, and they can often be perceived as paternalistic, biased, and punitive, even when they say nothing about the truthfulness of a piece of content. Moreover, audiences might misinterpret direct disclosures. A participant in my 2021 research misinterpreted Twitter’s “manipulated media” label as suggesting that the institution of “the media” was manipulating him, not that the content of the specific video had been edited to mislead. While research is emerging on how different user experience designs affect audience interpretation of content disclosures, much of it is concentrated within large technology companies and focused on distinct contexts, like elections. Studying the efficacy of direct disclosures and user experiences, rather than merely relying on the visceral appeal of labeling AI-generated content, is vital to effective policymaking for improving transparency.
Could visibly watermarking AI-generated content diminish trust in “real” content?
Perhaps the thorniest societal question to evaluate is how coordinated, direct disclosures will affect broader attitudes toward information and potentially diminish trust in “real” content. If AI organizations and social media platforms simply label content as AI-generated or modified (an understandable, albeit limited, way to avoid making judgments about which claims are misleading or harmful), how does this affect the way we perceive what we see online?
Media literacy via disclosure is a noble endeavor, yet many working on policy teams inside and beyond tech companies understandably worry that a premature push to label all generated content will usher in the liar’s dividend: a dynamic in which societal skepticism of all content as potentially AI-generated becomes so pronounced that it undermines trust in real content that is not generated with AI. This prospect also contributes to uncertainty about whether all seemingly low-stakes uses of AI in content creation (for instance, the iPhone’s portrait mode, which relies on AI techniques, or the voice assistants mentioned in the White House commitments) warrant a disclosure that AI was involved. The field must work together to measure societal attitudes toward information over time and determine when it makes sense to disclose the involvement of AI. Most important, it must evaluate the impact of visible disclosures that merely describe the method of content creation, stating that something was generated or edited by AI, as a proxy for what we actually care about: indicating whether the content’s claim is true or false.
The challenges that watermarks and other disclosure techniques pose should not be used as an excuse for inaction or for limiting transparency. Instead, they should provide an impetus for companies, policymakers, and others to work together on definitions and decide how to evaluate the inevitable trade-offs involved in implementation. Only then can generative AI policies adequately help audiences differentiate fact from fabrication.