GW> On 12 August 2014 01:53, Edward Burns <edward.burns_at_oracle.com> wrote:
EB> If the server is going to pre-push resources that go along with
EB> a page, how can the server know if the user-agent already has an
EB> up-to-date copy of the resource in its cache? In the case where the
EB> user-agent already has the resource, pre-pushing it would be a waste of
EB> bandwidth.
EB>
>>>>> On Tue, 12 Aug 2014 18:16:39 +1000, Greg Wilkins <gregw_at_intalio.com> said:
GW> Ed, this is a very important question!
GW> How does http2 deal with this very fundamental problem?
GW> Not at all!
Well, that's surprising. That seems to be *the* hard problem here. At
least I was not off base in asking about it.
GW> In Jetty, we have implemented the heuristics based approach to build up a
GW> push cache of related resources. Getting the framework involved (or just
GW> delegating the entire responsibility to the framework) is going to allow
GW> for better strategies because the framework should have more knowledge
GW> about how pages are related. So this is where framework developers are
GW> going to have to be smart to come up with strategies and heuristics.
I think the best we can do is make it possible for frameworks to build
heuristic-based strategies by providing adequate API support.
GW> The Jetty heuristics are based on the referer and if-modified-since
GW> headers. Firstly we work out which resources are related to each other
GW> by tracking the referer headers for requests from the same session. If a
GW> request is referred from another within a short timeout, then we assume it
GW> is a closely associated resource and should be pushed.
Such an approach is worthy of investigation, but whenever I see
referer(sic) being used, my CSRF alarm bells start going off.
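
To make sure I understand the referer heuristic, here is a minimal
sketch in plain Java. This is my own illustration, not Jetty's code:
the class name, method names, and the millisecond window parameter are
all hypothetical, and I assume the referer has already been normalized
to the same form as the request URI.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of referer-based association; not Jetty's implementation. */
class PushAssociations {
    private final long windowMillis;
    // (session id + URI) -> time that session last requested the URI
    private final Map<String, Long> lastRequest = new ConcurrentHashMap<>();
    // URI -> resources that were requested with that URI as referer
    private final Map<String, Set<String>> associated = new ConcurrentHashMap<>();

    PushAssociations(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Record one request; referer may be null. */
    void onRequest(String sessionId, String uri, String referer, long nowMillis) {
        if (referer != null) {
            Long referrerTime = lastRequest.get(sessionId + " " + referer);
            // Associate only if the same session fetched the referrer recently.
            if (referrerTime != null && nowMillis - referrerTime <= windowMillis) {
                associated.computeIfAbsent(referer, k -> ConcurrentHashMap.newKeySet())
                          .add(uri);
            }
        }
        lastRequest.put(sessionId + " " + uri, nowMillis);
    }

    /** Candidate resources to push when the given URI is requested. */
    Set<String> resourcesFor(String uri) {
        Set<String> found = associated.get(uri);
        return found == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(found);
    }
}
```

Note that the sketch trusts the referer header only as a hint for what
to push. It never grows or expires entries, which a real implementation
would have to handle.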
GW> Any request that has an if-modified-since header we assume is coming from a
GW> client with a hot cache, or at least a warm cache, so we don't push any
GW> resources at it. If the request doesn't have an if-modified-since header,
GW> then we do push associated resources at it.
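
As I read it, the cache check boils down to something like the
following. Again, this is only a sketch under my own assumptions: the
class and method names are hypothetical, and I model the request
headers as a plain Map rather than any real servlet or Jetty type.

```java
import java.util.Map;

/** Hypothetical sketch of the warm-cache check; not Jetty's implementation. */
class PushDecision {
    /**
     * Returns false when the request carries an If-Modified-Since header,
     * which is taken as a sign that the client already has cached content.
     */
    static boolean shouldPush(Map<String, String> requestHeaders) {
        for (String name : requestHeaders.keySet()) {
            // Header names are case-insensitive.
            if ("if-modified-since".equalsIgnoreCase(name)) {
                return false;
            }
        }
        return true;
    }
}
```

The check is deliberately coarse: one conditional header on the page
request suppresses every push, even for resources the client may not
actually hold.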
GW> But there are a lot more smarts that could be done:
GW> - how are query strings handled in associated requests? merged? replaced?
GW> ignored?
GW> - should there be per session resource as well as per page resources
GW> pushed?
GW> But also note that from a transport point of view, pushing a resource can
GW> be cancelled by the client if they have it already, so it does not use too
GW> much bandwidth... but have to make sure we don't commit a server thread to
GW> build something expensive that gets cancelled after sending 1 frame.
And we have to trust the client will be smart enough to cancel it.
Greg, is this the rationalization used by the httpbis wg to justify not
handling the problem more explicitly? Is there text in the http/2 spec
that at least says user agents SHOULD do such a thing?
At this point in our development cycle, I would like to ask for
agreement that we should at least attempt to support server push, with a
view toward giving framework authors an API that lets them use that key
feature of http/2 intelligently.
Ed
--
| edward.burns_at_oracle.com | office: +1 407 458 0017
| 24 work days til JavaOne 2014