watermarking for privacy (with no durian)
In Scalable watermarking for identifying large language model outputs, published by Dathathri, See, Ghaisas, et (many) al. in Nature on 23 October 2024, the authors propose an algorithm to (voluntarily) watermark synthetic texts so that they can be identified as such through a statistical test. Here are a few quotes outlining the authors’ solution.
“LLMs generate text based on preceding context (…) given a sequence of input text x<t = x1, …, xt−1 consisting of t − 1 tokens from a vocabulary V, the LLM computes the probability distribution pLM(⋅∣x<t) of the next token xt given the preceding text x<t. To generate the full response, xt is sampled from pLM(⋅∣x<t), and the process repeats until either a maximum length is reached or an end-token is generated.
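The autoregressive loop quoted above can be sketched as follows; `next_token_distribution` is a hypothetical placeholder standing in for pLM(⋅∣x<t), not part of the paper:

```python
import random

def next_token_distribution(context):
    # Placeholder "LLM": a uniform distribution over a tiny vocabulary.
    # A real model would condition on the preceding tokens in `context`.
    vocab = ["the", "cat", "sat", "<eos>"]
    return {tok: 1.0 / len(vocab) for tok in vocab}

def generate(prompt, max_len=20, rng=None):
    rng = rng or random.Random(0)
    tokens = list(prompt)
    while len(tokens) < max_len:
        dist = next_token_distribution(tokens)
        # Sample x_t from p_LM(.|x_<t) ...
        xt = rng.choices(list(dist), weights=list(dist.values()))[0]
        if xt == "<eos>":       # ... until an end-token is generated
            break
        tokens.append(xt)       # ... or the maximum length is reached
    return tokens
```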
In a watermarking scheme, a sampling algorithm is an algorithm that takes as input a probability distribution p ∈ ΔV and a random seed and returns a token.
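In code, that interface is just a function from a distribution and a seed to a token; plain multinomial sampling is the unwatermarked baseline (a minimal sketch, not from the paper):

```python
import random

def sample_token(p, seed):
    # A "sampling algorithm" in the quoted sense: distribution + seed -> token.
    # Here p is a dict mapping tokens to probabilities; the seed makes the
    # draw reproducible, which watermarking schemes exploit.
    rng = random.Random(seed)
    tokens, weights = zip(*p.items())
    return rng.choices(tokens, weights=weights)[0]
```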
Tournament sampling selects a token from the LLM distribution that is likely to score higher under the random watermarking functions (…) Given the selection of tokens xt based on higher g-values, we expect watermarked text generally to score higher under this score than unwatermarked text (…) [It] requires g-values to decide which tokens win each match in the tournament. Intuitively, we want a function that takes a token x ∈ V, a random seed and the layer number ℓ ∈ {1, …, m}, and outputs a g-value gℓ(x, r) that is a pseudorandom sample from some probability distribution fg (the g-value distribution).”
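A hedged sketch of the tournament idea: draw 2^m candidates from the LLM distribution, then run m knockout layers where the candidate with the higher g-value wins each match. The g-value function below is a toy (a hash-derived Bernoulli keyed on token, seed and layer); the paper's keyed pseudorandom functions differ:

```python
import hashlib
import random

def g_value(token, seed, layer):
    # Toy g_l(x, r): pseudorandom Bernoulli(0.5) sample derived by hashing
    # (token, seed, layer). Stands in for the paper's g-value distribution f_g.
    h = hashlib.sha256(f"{token}|{seed}|{layer}".encode()).digest()
    return h[0] & 1

def tournament_sample(p, seed, m=3, rng=None):
    rng = rng or random.Random(seed)
    tokens, weights = zip(*p.items())
    # Draw 2^m candidate tokens i.i.d. from the LLM distribution p.
    candidates = rng.choices(tokens, weights=weights, k=2 ** m)
    for layer in range(1, m + 1):
        winners = []
        for a, b in zip(candidates[::2], candidates[1::2]):
            ga, gb = g_value(a, seed, layer), g_value(b, seed, layer)
            if ga == gb:                       # tie: break uniformly at random
                winners.append(rng.choice([a, b]))
            else:
                winners.append(a if ga > gb else b)
        candidates = winners
    return candidates[0]                       # tournament winner
```

Because winners are biased toward high g-values, averaging g-values over a text gives the statistical test the authors describe: watermarked text scores higher than unwatermarked text.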
At the Ocean privacy workshop, someone raised the question of whether synthetic data can be trusted, and I remembered this article. I suggested watermarking such synthetic data: providers or agents would (privately) run a disclosed or registered code that embeds a watermark, giving strong support to the claim that the synthetic data was indeed produced that way. I remain uncertain this is realistic.
