[Openid-specs-ab] [for browser geeks only] comments on privacy trials now on-going

Tom Jones thomasclinganjones at gmail.com
Tue May 25 15:46:57 UTC 2021

This is interesting, in part, because it talks about trust tokens as a
potential replacement for third-party cookies. How would that fit into an
identity protocol?

<<<<<<<<< Here are the comments from the TT explainer.  >>>>>>>>>>>>>
We'd like to request an increase from the standard “0.5% of page loads”
origin trial usage target, to 5%. We feel this won’t lead to burn-in
because the API is intended to be a *less useful* (but privacy-better)
partial substitute for an existing web feature (third-party cookies): *the
raison d’etre of the API in the *current* world is prototyping and

The main goal of this origin trial is to understand whether trust tokens
are useful for fraud detection. A key challenge is distinguishing the
effect of sparsity---if we enable Trust Tokens for 1% of clients, models
trained on observed tokens will see a roughly 1% sample---from the effect
of the inherent coarseness of tokens’ encoded information.

Our experimentation partners have evaluated existing models analogous to
potential trust token-based models on sparse subsamples of their training
data; they’ve found that roughly a 10% sample would suffice to retain most
of the models’ quality. This translates to estimated UseCounter values of
under 5%.

The primary argument for keeping origin trial usage low is the ability to
avoid large developers depending on the feature. We don’t expect developers
to develop a production reliance on tokens’ embedded information during
this origin trial, because trust tokens are harder to use and contain far
less information than third-party cookies (the benefit being that they’re
much better for privacy). Further, since Trust Tokens is independently
gated by a base::Feature, origin trial participants will still see the API
unavailable for a substantial majority of their users.

<<<<<< following is from the original email on [blink dev]. >>>>>>>>>>>
<<<<<< this presages how privacy innovation might be deployed >>>>>>>>

The API owners plan to approve increased Origin Trial limits for certain
trials, feedback welcome.


* In cases where there is a strong justification, the API owners have
approved increasing the total amount of web traffic for an Origin Trial
from 0.5% to a higher number, potentially as high as 20%. The three origin
trials we considered were FLoC [1], Conversion Measurement [2] and Trust
Tokens [3].
* The higher limits are justified by situations where the experimental goal
of the trial is inherently dependent on statistically significant
measurements aggregated across a number of sites.
* Approval for these increased limits comes with strong additional
reporting requirements to the API OWNERS and transparency to blink-dev.


The API OWNERS have been asked to consider relatively high per-feature
Origin Trials limits for a succession of privacy features in recent months
[1][2][3]. We recently met with teams driving these APIs to help build a
mutual understanding of the goals of the trials and the API OWNERS' approach
to risk in evaluating trials.

This message is an attempt to capture some of that context, document
agreement about trial limits, and outline next steps.

*A Risk-Based Approach*
The job of the API OWNERS is to ensure the Blink process is followed in
spirit. The design goal of our process has been to maintain the health of
the platform, acknowledging that:
 * Progress requires different views to be sifted in order to find good
outcomes that maximize benefits over the long haul
 * Features should be able to demonstrate that they solve important
problems and solve them well
 * *Somebody* must lead in designing features. Given how frequently
Chromium is out in front, that’s uncomfortable, so we raise the bar where
we’re taking larger risks
 * We want to ensure that when features are developed, they have (and take)
every opportunity to learn and adapt while they’re still malleable, both to
avoid “burn in” of regretted designs, and cast the widest net for developer
feedback possible

Basically, we guide you to pack as much iteration and feedback gathering
into your development process as possible, recognizing that mistakes are
incredibly expensive and that we learn by listening.

Commensurately, we give maximum consideration to developer feedback, and
somewhat less to other factors (working group consensus, TAG feedback,
etc.) in I2S threads. They’re important quality signals, but what gives us
the most confidence that we’ve solved an important problem well are the
results from the field.

*Complex OT Requirements*
Origin Trials are one of our best ways for getting feedback on the design
and usefulness of features without creating new, undue risks for the
ecosystem. But since these trials are in-the-wild experiments, we must be
very careful to avoid burn in or “soft launch” risks.

Some privacy-oriented features being trialed now have several additional
properties that complicate the picture:

 * These APIs are being consumed by third parties who are attempting to
verify ML models in concert with the signals being given through the
k-anonymity algorithm the APIs provide
 * Multiple variants may need to be tried
 * It’s difficult to build confidence in whether these ML models work at
the fractional levels of traffic we’ve green-lit in the past

Instead of treating each API as a third-party Origin Trial with one-off
limits, we want to unify the story for these trials, manage them
coherently, and report back to the community about how it goes.

*Risks We Face*
By potentially raising limits for OTs, we face risks that we want to make
explicit and address head-on:

 * Burn-in: Developers who come to rely on a feature may “lean on” us not
to take it away as we consider alternatives, change direction, or iterate
on the design. Low usage limits + single-digit milestone time limits have
been the way we have addressed this in the past. Some OTs have been allowed
to use well above the general limit for short periods of time to collect
data across big sites, so long as total use over, say, a week remains under
the threshold. This requires careful partnership and coordination.

 * Reputation risk: The OWNERS acknowledge a risk to the project from being
seen to “pre launch” features. Required breakage between OT and launch has
been our mechanism for tempering “go fever”. Recently we modified this to
add a (new) extra step for requesting a “gapless OT” as part of an Intent
to Ship which comes with a higher evidentiary bar regarding developer

 * Precedent: Each exception we make to policy is potential precedent. To
ensure that we are not “playing favourites”, our approach has been to
green-light exceptions after discussion with requesters, document them as
we go, and (if they work) to perhaps change the process. Recent examples
here include Gapless OT exceptions and the (conditional) TAG review
exception policy.
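
The burn-in bullet above allows short bursts over the general limit so long
as total use over a window stays under the threshold. A minimal sketch of
that check, assuming per-day page load counts are available. The function
name, the 7-day window, the sample counts, and the use of 0.5% as the
threshold are illustrative assumptions, not part of any Chromium tooling:

```python
from typing import Iterable, Tuple

GENERAL_LIMIT = 0.005  # the standard 0.5% of page loads cited in this thread


def window_usage_fraction(days: Iterable[Tuple[int, int]]) -> float:
    """Return API page loads / total page loads summed over the window.

    Each entry is (api_page_loads, total_page_loads) for one day.
    Hypothetical helper, not a real UMA or Chromium API.
    """
    days = list(days)
    api = sum(a for a, _ in days)
    total = sum(t for _, t in days)
    if total <= 0:
        raise ValueError("window has no page loads")
    return api / total


# Made-up week: one day spikes to 3% of that day's loads, the rest are zero.
week = [(0, 1000), (0, 1000), (30, 1000), (0, 1000),
        (0, 1000), (0, 1000), (0, 1000)]
peak_day = max(a / t for a, t in week)
weekly = window_usage_fraction(week)
print(peak_day > GENERAL_LIMIT, weekly < GENERAL_LIMIT)
```

Summing counts before dividing, rather than averaging daily fractions,
keeps a low-traffic day from distorting the weekly figure.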

We have discussed each of these risks with the folks running these origin
trials and have come to a more nuanced understanding of the parties
involved and our ability to count on the folks running the trials to
prevail on partners regarding breaking changes.

As our approach is risk-based, these discussions are helpful in reducing
the potential for burn-in from the API OWNERS' perspective. They may also
suggest a path for others to request higher limits over longer time-scales
in conversation with us.

*Proposed Policy*
We want to facilitate broad-scale learning across the ecosystem in a way
that will shorten the eventual path to launch.

Respectful of the risks above, the tentative (straw-person) agreement we
have reached is:

 * Frequent (perhaps monthly) reporting of total page load usage (via UMA)
of the APIs in question. This matters because the top-line traffic limit
won’t necessarily map to the amount of use by third parties who will be
running their own experiments.

 * Teams whose experiments are also gated by finch flags will report where
those limits have been set and if/when they are substantially changed.

 * High nominal limits for the length of the trial. The OT might _in theory_
enable more than 80% of page loads, but that would be “cut” with a
fractional rollout via finch to ensure that limits of ~15-20% of page
loads are never breached. This is much higher than usual, and is an
acknowledged risk.

 * A two-release duration for each variant of the trial, with agreement to
change the API in ways that ensure partners must change their systems
within the trial to keep pace. The goal here is to prevent burn-in despite
high usage.
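
The finch "cut" in the limits bullet above comes down to simple arithmetic:
if the OT nominally enables some fraction of page loads and finch separately
enables the feature for some fraction of clients, effective usage is roughly
their product. A rough sketch, assuming the two gates are independent and
page loads are spread evenly across clients (both simplifications). The
function names are hypothetical and the numbers are the ones quoted above:

```python
def effective_usage(ot_nominal: float, finch_fraction: float) -> float:
    """Approximate fraction of page loads where the API is actually on."""
    return ot_nominal * finch_fraction


def max_finch_fraction(ot_nominal: float, cap: float) -> float:
    """Largest finch rollout fraction keeping effective usage <= cap."""
    if not 0 < ot_nominal <= 1:
        raise ValueError("ot_nominal must be in (0, 1]")
    if not 0 <= cap <= 1:
        raise ValueError("cap must be in [0, 1]")
    return min(1.0, cap / ot_nominal)


# 80% nominal OT enablement, capped at 20% of page loads.
finch = max_finch_fraction(0.80, 0.20)
print(finch, effective_usage(0.80, finch))
```

Under these assumptions, an 80% nominal limit with a 20% cap implies a
finch rollout of at most 25% of clients.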

As usual, we expect the developers running the trials to report back on the
experience of developers and how it informs their APIs going forward.

Obviously, this is a departure from current practice and something we want
community feedback on.

Thanks for taking the time to read this, and for your feedback in this




Be the change you want to see in the world ..tom