I2I: Run 3p service scripts in AMP #30193

Closed
zhouyx opened this issue Sep 11, 2020 · 23 comments · Fixed by #33643
Labels
INTENT TO IMPLEMENT Proposes implementation of a significant new feature. https://bit.ly/amp-contribute-code

Comments

@zhouyx
Contributor

zhouyx commented Sep 11, 2020

Summary

The AMP team decided to restrict all cross-domain iframes running in the background. To allow 3p service providers to run their scripts in AMP documents, we propose a workaround: run all 3p scripts in a web worker.

This is different from <amp-script> in two ways. 1. No DOM access or DOM change. 2. Scripts must come from publishers' trusted service providers. (Provided by an AMP component)

The following proposal serves as a workaround. It is still recommended that 3p services integrate with AMP as first-party AMP components.

Design document

image

ScriptRunner Service

run(scriptUrl) {
  // The service will import and run script in the web worker
}

send(type, data) {
  // send data to the worker. ScriptRunner handles msg buffering before the worker is up.
}

listen(type, listener) {
  // register listener
}

It’s up to the 3P AMP component to define the communication API between its AMP component and the web worker via the listen and send methods.
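As an illustration only, the three methods above could be sketched as follows. This is not the AMP implementation: `createWorker` is a hypothetical injected factory (anything with `postMessage` and `onmessage`), and the `import` message type is an assumed convention between the service and the worker.

```javascript
// Hypothetical sketch: `createWorker` is an injected factory, not an AMP API.
class ScriptRunner {
  constructor(createWorker) {
    this.createWorker_ = createWorker;
    this.worker_ = null;
    this.pending_ = []; // messages sent before the worker exists
    this.listeners_ = new Map(); // message type -> array of listeners
  }

  // Creates the shared worker on first use and asks it to import the script.
  run(scriptUrl) {
    const isFirstRun = !this.worker_;
    if (isFirstRun) {
      this.worker_ = this.createWorker_();
      this.worker_.onmessage = (event) => this.dispatch_(event.data);
    }
    // 'import' is an assumed message type understood by the worker bootstrap.
    this.worker_.postMessage({type: 'import', data: scriptUrl});
    if (isFirstRun) {
      // Flush everything that was queued while no worker existed.
      this.pending_.forEach((msg) => this.worker_.postMessage(msg));
      this.pending_ = [];
    }
  }

  // Sends to the worker, or buffers the message if the worker is not up yet.
  send(type, data) {
    const msg = {type, data};
    if (this.worker_) {
      this.worker_.postMessage(msg);
    } else {
      this.pending_.push(msg);
    }
  }

  // Registers a listener for messages of the given type from the worker.
  listen(type, listener) {
    if (!this.listeners_.has(type)) {
      this.listeners_.set(type, []);
    }
    this.listeners_.get(type).push(listener);
  }

  dispatch_(msg) {
    (this.listeners_.get(msg.type) || []).forEach((fn) => fn(msg.data));
  }
}
```

In this sketch a second call to `run()` reuses the existing worker, which matches the single shared worker described in the proposal.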

Worker Initialization & Termination

The Worker will be created lazily when ScriptRunner.run() is called for the first time.
Messages from AMP to the worker will be buffered for a certain time before the worker has been created.
The worker will remain alive. But AMP runtime may choose to terminate it after a certain timeout.
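One possible reading of "buffered for a certain time" is that messages older than some limit are dropped when the worker finally starts. The sketch below assumes that reading. `TimedMessageBuffer`, the 5000 ms default, and the injectable `now` clock are all hypothetical and not taken from the proposal.

```javascript
// Hypothetical helper; the 5000 ms default is an illustrative value only.
class TimedMessageBuffer {
  constructor(maxAgeMs = 5000, now = () => Date.now()) {
    this.maxAgeMs_ = maxAgeMs;
    this.now_ = now;
    this.entries_ = [];
  }

  // Queues a message together with the time it was enqueued.
  push(message) {
    this.entries_.push({message, time: this.now_()});
  }

  // Returns the messages that are still fresh, in order, and empties the
  // buffer. Messages older than maxAgeMs are silently dropped.
  drain() {
    const cutoff = this.now_() - this.maxAgeMs_;
    const fresh = this.entries_
      .filter((entry) => entry.time >= cutoff)
      .map((entry) => entry.message);
    this.entries_ = [];
    return fresh;
  }
}
```

The service would call `drain()` once the worker is created and post each returned message to it.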

Multiple scripts

Only one web worker will be created for all 3rd party scripts to share.

Security Concern

We'll need to sandbox the web worker. (e.g. dereferencing Worker globals)

Based on the feedback from the security review, we need to place the web worker within a sandboxed cross-origin iframe. The iframe will serve as a proxy to post messages between AMP and the web worker.

To be discussed: Given the sandboxed iframe, do we still need to sandbox the web worker? (e.g. fetching scripts)
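The proxy role of the iframe could look roughly like the sketch below. It assumes the iframe talks to AMP through a MessagePort-like object (`parentPort`) and that only known message types are forwarded to the worker. `installProxy` and `isAllowedType` are hypothetical names, and the filtering policy is an assumption rather than something the security review specified.

```javascript
// Hypothetical sketch of the relay inside the sandboxed iframe.
// `parentPort` and `worker` only need postMessage/onmessage members here.
function installProxy(parentPort, worker, isAllowedType) {
  // AMP -> worker: forward only message types the proxy recognizes.
  parentPort.onmessage = (event) => {
    const msg = event.data;
    if (msg && isAllowedType(msg.type)) {
      worker.postMessage(msg);
    }
  };
  // worker -> AMP: forward everything the worker emits.
  worker.onmessage = (event) => {
    parentPort.postMessage(event.data);
  };
}
```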

AMP Analytics Integration

There's a request to feed the web worker with data from <amp-analytics>. We think this is reasonable and propose adding a new transport method, worker (name tbd), to <amp-analytics>.
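To make the idea concrete, a vendor config using such a transport might look like the object below. This is purely a guess at the shape: the `worker` key does not exist in <amp-analytics>, its name is still to be decided, and the request URL and trigger name are placeholders.

```javascript
// Hypothetical vendor config; the `worker` transport key does not exist in
// <amp-analytics> today and its name is still to be decided.
const vendorConfig = {
  requests: {
    pageview: 'https://vendor.example/collect?event=pageview',
  },
  triggers: {
    trackPageview: {on: 'visible', request: 'pageview'},
  },
  transport: {
    beacon: false,
    xhrpost: false,
    image: false,
    worker: true, // assumed name for the proposed transport
  },
};
```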

Motivation

Permutive needs to run their scripts to calculate the <amp-ad> JSON config on the client side. While they're willing to introduce an <amp-permutive> component, they can't generalize their publisher-based solution. #28095

This is similar to why we introduced <amp-script>: being able to run a small piece of their scripts may unblock more service providers from integrating with AMP.

Additional context

Some 3P service providers integrate with AMP by running their scripts in an iframe. This has been discussed in Design Review #28471. AMP will no longer allow cross-domain iframes to run in the background (e.g. <amp-pinterest> is allowed because there's content) unless an exception has been made with review.

Launch tracker

/cc @ampproject/wg-approvers @joshfg

@zhouyx zhouyx added the INTENT TO IMPLEMENT Proposes implementation of a significant new feature. https://bit.ly/amp-contribute-code label Sep 11, 2020
@cramforce
Member

Do you have a list of use cases? I'm not against this, just want to make sure there is a broad set of use cases that both

  • want this
  • and are satisfied by this

@zhouyx
Contributor Author

zhouyx commented Sep 11, 2020

Permutive's integration #28095 is the only use case I know of right now. We have discussed this approach with them offline.

We thought about two potential use cases. One is a client-side ad targeting solution similar to the Permutive case. The other is to allow analytics vendors to compose their outgoing requests in their own custom way. But I'm not aware of any immediate use case apart from the Permutive request.

We saw 21 thumbs up when the Permutive team asked for this feature: ) #28471 (comment)

All, Please let us know if the proposed feature will be helpful. Thanks!

@JenCastilloSlate

This proposal would be extremely helpful in ensuring we have a product that covers a large part of our mobile inventory as it pertains to 1P audience targeting.

@KayeESI

KayeESI commented Sep 21, 2020

We are not currently able to make the most of our increasingly mobile audience due to the AMP restrictions. This proposal would help us to better understand and activate this audience.

@jadbousaleh

This will be super helpful as it will allow publishers to unlock all their audiences on all platforms.

@dpetrieldn

This would be huge for us, our mobile product optimisation is lacking in this space due to the limitations around our ability to unlock inventory for audience targeting.

@zhouyx
Contributor Author

zhouyx commented Sep 22, 2020

Add @lannka
Thanks for the feedback! Seems the major use case is to support audience targeting.

I'm not very familiar with that. What are the common audience targeting options available? I'd assume this is done on the server side. Is client-side audience targeting a very common approach? Thanks

@albertosenia

This feature would be a terrific way for our Publisher Alliance audience-based company to reach audiences and do better targeting in the AMP environment.

@morsssss
Contributor

Is this a case in which cookies support would be helpful? (Even though it sounds like cookies aren't needed for #28095)

See also ampproject/worker-dom#451 .

@samouri
Member

samouri commented Sep 30, 2020

I think I have good news! A similar use-case for dynamically manipulated json was just solved for <amp-list>. I believe the recent changes to worker-dom and amp-script have mostly closed the gap between <amp-script> and this proposal.

This is different from <amp-script> in two ways. 1. No DOM access or DOM change. 2. Scripts must come from publishers' trusted service providers. (Provided by an AMP component)

  • (1) Was recently added to amp-script with the no-dom attribute. It removes access to the DOM and reduces bundle size of the worker mjs from 12.5kb --> 1.42kb. It also forces visibility:hidden on the component.
  • (2) This is more restrictive than amp-script. Unless we are granting greater permissions to these scripts than we give to amp-script, it seems like an argument for reuse.

Permutive needs to run their scripts to calculate the <amp-ad> JSON config on the client side.

Scripts in worker-dom can export functions for use in the main-thread (ampproject/worker-dom#850). In amp-list, publishers can now specify an amp-script function to use to calculate the json -- could we use the same plumbing/format for <amp-ad>?

Note that I still do see some diffs in functionality between this I2I and current capabilities. E.g.

  1. There isn't great support for arbitrary message passing back/forth (currently only one way callFunction from main-thread to worker).
  2. Each script would run in its own Worker instead of all in the same one.
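For context, a worker script that exports a function through worker-dom might look like the sketch below. `exportFunction` is the worker-dom global mentioned above. The `getTargeting` function, its `pageviews` parameter, and the segment name are made up for illustration, and the fallback registry exists only so the sketch can run outside <amp-script>.

```javascript
// Sketch of a worker script. `exportFunction` is provided by worker-dom when
// the script runs inside <amp-script>; the fallback below exists only so the
// sketch can also run outside that environment.
const exported = new Map();
const exportFn =
  typeof exportFunction === 'function'
    ? exportFunction
    : (name, fn) => exported.set(name, fn);

// Hypothetical targeting function: the segment logic is made up.
function getTargeting(pageviews) {
  const segments = [];
  if (pageviews >= 3) {
    segments.push('returning-reader');
  }
  return {targeting: {segments}};
}

// Makes the function callable from the main thread under this name.
exportFn('getTargeting', getTargeting);
```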

@zhouyx
Contributor Author

zhouyx commented Oct 1, 2020

@samouri Wonderful, didn't know we have the no-dom <amp-script> support.

Sounds to me this new <amp-script> worker is a great start. I believe we can reuse the infrastructure of the no DOM worker!

A few notes

(2) This is more restrictive than amp-script. Unless we are granting greater permissions to these scripts than we give to amp-script, it seems like an argument for reuse.

Because scripts are loaded from a trusted service provider, we'd also like to remove the script hash requirement from <amp-script>: https://amp.dev/documentation/components/amp-script/#calculating-the-script-hash

There isn't great support for arbitrary message passing back/forth (currently only one way callFunction from main-thread to worker).

If scripts in worker-dom can export functions for use in the main thread, we probably won't need to pass messages back from the worker to AMP. However, to feed the worker with analytics data, we'd still need to pass messages to the worker.

Each script would run in its own Worker instead of all in the same one.

We propose reusing the same worker due to concern about the core number limit. The scripts don't have to run in the same worker, but I believe some type of limit on the number of workers needs to be applied.
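A limit on the number of workers could be enforced in several ways. One minimal sketch, assuming scripts simply share workers in round-robin order once the cap is reached, is below. `WorkerLimiter`, `maxWorkers`, and the injected factory are hypothetical names, not part of AMP.

```javascript
// Hypothetical limiter; `maxWorkers` and the factory are illustrative.
class WorkerLimiter {
  constructor(maxWorkers, createWorker) {
    if (maxWorkers < 1) {
      throw new Error('maxWorkers must be at least 1');
    }
    this.maxWorkers_ = maxWorkers;
    this.createWorker_ = createWorker;
    this.workers_ = [];
    this.next_ = 0;
  }

  // Returns a new worker while under the cap, otherwise reuses an existing
  // one in round-robin order so scripts share workers.
  acquire() {
    if (this.workers_.length < this.maxWorkers_) {
      const worker = this.createWorker_();
      this.workers_.push(worker);
      return worker;
    }
    const worker = this.workers_[this.next_];
    this.next_ = (this.next_ + 1) % this.workers_.length;
    return worker;
  }
}
```

With `maxWorkers` set to 1 this reduces to the single shared worker from the original proposal.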

@TameImp

TameImp commented Oct 1, 2020

As a publisher who has worked with permutive for over 3 years now, we've benefited immensely from their technology's ability to maximise the visibility and targeting of our audiences. However, an ever-increasing proportion of our users engage with us in AMP environments and so this proposal would be hugely beneficial for us in the insights and targeting we can provide on these users.

@dbl-wemass

Being able to pass the permutive (or other DMP) data to our adserver is critical for our core business. We'd benefit a lot with this solution.

@zhouyx
Contributor Author

zhouyx commented Oct 2, 2020

We've got a lot of positive feedback on this proposal!

Would someone from Permutive or another DMP take a look and let us know whether the proposed solution could fulfill the audience targeting requirement?

It would be helpful to get answers to the following questions

  1. What data you're expecting to get from AMP documents
  2. What functionality or data you're expecting to pass back to AMP documents
  3. Would your script need to access cookies or storage value

Thanks!

@joshfg
Contributor

joshfg commented Oct 3, 2020

Great to see the progress on finding a solution here!

@zhouyx yes I believe the proposed solution would fulfil our audience targeting requirement, as long as there is a way to communicate targeting data back to amp-ad components from the worker.

Removing the script hash requirement for trusted service providers will work well for our use case. Just to double check my understanding - how would we register as a trusted service provider? Would this be a case of us creating an amp-permutive component which is responsible for loading our worker script?

To answer your questions:

  1. What data you're expecting to get from AMP documents

The only data we’d expect to get is analytics events, triggered by our amp-analytics vendor config. Our script running in the worker should be able to consume this data. Also we may need to pass some publisher-provided config into the worker (e.g. when publisher configures amp-permutive on their page they may be required to specify an API key as an attribute, which we should be able to pass into our worker script).

  2. What functionality or data you're expecting to pass back to AMP documents

I believe the only data we’d need to pass back is targeting data to be used by amp-ads. This is essential for the audience targeting requirement. Ideally there will be a mechanism for publisher to configure their amp-ad to wait for our worker to return targeting data (similar to how RTC works with a timeout/max-wait)

  3. Would your script need to access cookies or storage value

Yes, our worker script would need access to storage. Not cookies, but ideally localStorage and IndexedDB (we rely on these storage mechanisms to cache state on non-AMP pages). We spoke about the limitations of storage on SERP in Safari previously @zhouyx (storage being limited to very few bytes due to it being stored on the google.com domain). Would our script be able to use storage without these strict limits outside of SERP in Safari (using the publisher CDN domain)?
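The "wait for targeting data with a timeout" behavior described in the second answer above can be sketched with `Promise.race`. This is only an illustration of the pattern: `waitForTargeting`, `maxWaitMs`, and the empty-object fallback are hypothetical and do not reflect how <amp-ad> or RTC is implemented.

```javascript
// Hypothetical helper: resolves with the worker's targeting data, or with
// `fallback` if the data fails to arrive within `maxWaitMs`.
function waitForTargeting(targetingPromise, maxWaitMs, fallback = {}) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), maxWaitMs);
  });
  // A rejected targeting promise also falls back, so the ad is never blocked.
  const targeting = Promise.resolve(targetingPromise).catch(() => fallback);
  return Promise.race([targeting, timeout]).finally(() => {
    clearTimeout(timer);
  });
}
```

An ad component using this pattern would request its ad with whatever the returned promise resolves to, so a slow or failing worker only costs `maxWaitMs` at most.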

@zhouyx
Contributor Author

zhouyx commented Oct 9, 2020

Just got feedback from the security review. Based on the feedback, we'd need to sandbox the web worker within an iframe. The infrastructure will be somewhat similar to the ads iframe, but all scripts will run within the web worker in the iframe. Something like this:
[image]

@zhouyx
Contributor Author

zhouyx commented Oct 9, 2020

Removing the script hash requirement for trusted service providers will work well for our use case. Just to double check my understanding - how would we register as a trusted service provider? Would this be a case of us creating an amp-permutive component which is responsible for loading our worker script?

If you need to handle the response back from the worker specially, I think an amp-permutive component makes sense. However AMP's goal is to generalize the use cases. Can we expect the response to be in a standard format that a generalized component can handle?

Would your script need to access cookies or storage value
Yes, our worker script would need access to storage. Not cookies, but ideally localStorage and IndexedDB (we rely on these storage mechanisms to cache state on non-AMP pages). We spoke about the limitations of storage on SERP in Safari previously @zhouyx (storage being limited to very few bytes due to it being stored on the google.com domain). Would our script be able to use storage without these strict limits outside of SERP in Safari (using the publisher CDN domain)?

Yes it's possible to expand the storage usage outside of AMP Viewer (when Storage API is not used). We currently don't have ways to retrieve localStorage value in amp-analytics. I assume that would be something you need as well.
Regarding IndexedDB, AMP doesn't support writing to IndexedDB today. That would be another discussion topic : )

@joshfg
Contributor

joshfg commented Oct 12, 2020

If you need to handle the response back from the worker specially, I think an amp-permutive component makes sense. However AMP's goal is to generalize the use cases. Can we expect the response to be in a standard format that a generalized component can handle?

OK that makes sense, in that case we may be able to get away without an amp-permutive component. I can't immediately think of a reason we'd need to handle responses from the worker, under the following assumptions:

  • there is a separate communication mechanism for the integration between our worker script and amp-ad
  • all interaction with local storage can be done from within the worker itself
  • it's possible for the publisher to pass Permutive-specific configuration into the worker, so that we avoid the need for passing configuration using attributes on an amp-permutive component

Yes it's possible to expand the storage usage outside of AMP Viewer (when Storage API is not used).

That's great news! Will it be possible to read/write this storage directly from the worker script? Are there any local storage usage limits imposed here aside from the standard browser limits?

We currently don't have ways to retrieve localStorage value in amp-analytics. I assume that would be something you need as well.

Not sure exactly what you meant here, I don't believe this is a requirement for our use case.

Regarding IndexedDB, AMP doesn't support writing to IndexedDB today. That would be another discussion topic : )

For now, we can ensure our worker script relies on local storage only (with a view to changing this if IndexedDB is supported in the future).

@KEcclesTelegraph

We would be very interested in seeing this proposal progress, so we are able to target across our scaling amp inventory. Following for updates.

@zhouyx
Contributor Author

zhouyx commented Oct 26, 2020

Filed an I2I to expand the LocalStorage usage to all #30872

Let's bring the topic to the design review. Otherwise I don't see any blocker to building this messaging infrastructure and allowing 3p scripts to run in AMP.

@samouri
Member

samouri commented Oct 28, 2020

@zhouyx: Is the plan to expand the functionality of <amp-script> to meet this use-case or to develop a new system?

@zhouyx
Contributor Author

zhouyx commented Oct 28, 2020

Feedback from Design Review.

  1. We should be able to reuse <amp-script> here. We will expand <amp-script> to support a new mode that runs in a sandbox with no DOM access, something like the following:
<amp-script sandbox no-dom src="https://permutive.example.js">

where no script hash will be required.

  2. We already have a messaging API built for <amp-script> (Sending & Receiving messages).
    If it cannot fulfill the use case, we can build the messaging channel into the <amp-script> code rather than having the ScriptRunner above.

  3. We will first limit the number of such workers to one. This will be enforced via the validator.
    In the future we can loosen the restriction if there's a requirement.

  4. The <amp-analytics> integration will be the same as proposed above. The analytics service will need to identify the worker to send messages to.

  5. The <amp-script> bootstrap code is small (1.4kb mjs).

  6. We will report the use case (forcing iframes to run in a separate thread) to web platforms.

@spormeon

spormeon commented Dec 1, 2020

bump on this, just so i stay in the thread and get notified on developments
