jsr369-experts@servlet-spec.java.net

[jsr369-experts] Re: [servlet-spec users] Re: Re: Re: Re: Re: Re: Re: Re: revised trailer proposals

From: Shing Wai Chan <shing.wai.chan_at_oracle.com>
Date: Thu, 20 Apr 2017 15:33:36 -0700

Let me summarize what we discussed in #getTrailers and with additional comemnts.

I. Return null?
   There are two options: NeverReturnNull and SometimesReturnNull.
   For option NeverReturnNull, we have: Stuart, Ed and I.
   For option SometimesReturnNull, we have: Mark, Greg.
   Neither Greg nor Mark think it's a big issue.
   And no-one has articulated a compelling use case for SometimesReturnNull.

   In this case, I will choose the NeverReturnNull option.
   I have updated the javadoc in HttpServletRequest as follows:

    /**
     * Get the request trailers.
     *
     * <p>The returned map is not backed by the {_at_code HttpServletRequest} object,
     * so changes in the returned map are not reflected in the
     * {_at_code HttpServletRequest} object, and vice-versa.</p>
     *
     * <p>This method should only be called after all the request content is read.
     * Calling this method before that point will cause the action described in the
     * throws clause below.</p>
     *
     * @implSpec
     * The default implementation returns empty Map.
     *
     * @return A map of trailers in which all the keys are in lowercase,
     * regardless of the case they had at the protocol level, or the empty map
     * if there are no trailers or the underlying transport does not support trailers.
     *
     * @throws IllegalStateException if neither
     * {_at_link javax.servlet.ReadListener#onAllDataRead} has been called nor
     * an EOF indication has been returned from the
     * {_at_link #getReader} or {_at_link #getInputStream}
     *
     * @since Servlet 4.0
     */
    default public Map<String, String> getTrailers() {
        return Collections.emptyMap();
    }


III. Trailer vs Trailers.
    In RFC 7230, Appendix B. Collected ABNF, we have the followimg:
        Trailer = *( "," OWS ) field-name *( OWS "," [ OWS field-name ] )
        chunked-body = *chunk last-chunk trailer-part CRLF
        t-codings = "trailers" / ( transfer-coding [ t-ranking ] )
        trailer-part = *( header-field CRLF )

    In 4.1.2 of RFC 7230, we have
        A trailer allows the sender to include additional fields at the end
        of a chunked message in order to supply metadata that might be
        dynamically generated while the message body is sent, such as a
        message integrity check, digital signature, or post-processing
        status. The trailer fields are identical to header fields, except
        they are sent in a chunked trailer instead of the message's header
        section.

    In 4.3 of RFC 7230, we have
        The TE field-value consists of a comma-separated list of transfer
        coding names, each allowing for optional parameters (as described in
        Section 4), and/or the keyword "trailers".

    So, look like "trailers" is a keyword used in TE.
    And a trailer is a collection of headers.

    Should we rename our API as
        #getTrailerFields, #setTrailerFields or
        #getTrailer, #setTrailer?
    I prefer the former.

III. The data structure for trailer.
   In the current proposal, we use Map<String, String>.
   Here, lower case is used to store header name as key.
   Also, we assume that multiple headers with the same header name will be combined into one.

   MT> If you have multiple header fields with the same name, then order is
   MT> important but that is why I was suggesting Map<String,List<String>>

   Let us take a look at the 3.2.2 of RFC 7230:
        A sender MUST NOT generate multiple header fields with the same field
        name in a message unless either the entire field value for that
        header field is defined as a comma-separated list [i.e., #(values)]
        or the header field is a well-known exception (as noted below).
     
        A recipient MAY combine multiple header fields with the same field
        name into one "field-name: field-value" pair, without changing the
        semantics of the message, by appending each subsequent field value to
        the combined field value in order, separated by a comma. The order
        in which header fields with the same field name are received is
        therefore significant to the interpretation of the combined field
        value; a proxy MUST NOT change the order of these field values when
        forwarding a message.
     
           Note: In practice, the "Set-Cookie" header field ([RFC6265]) often
           appears multiple times in a response message and does not use the
           list syntax, violating the above requirements on multiple header
           fields with the same name. Since it cannot be combined into a
           single field-value, recipients ought to handle "Set-Cookie" as a
           special case while processing header fields. (See Appendix A.2.3
           of [Kri2001] for details.)

   In other words, all header fields except "Set-Cookie" can be combined into one.
   However, "Set-Cookie" cannot be used as a trailer.
   Do we still want to have the flexibility to have multiple header fields with
   the same field name?

   GW> Maybe List<Map.Entry<String,String>> is a better type to use as it is
   GW> fully semantically equivalent to HTTP headers rather than an approximation.

   Can we keep a simpler data structure here?

Please let me know your comments.
Thanks.
    Shing Wai Chan


> On Apr 19, 2017, at 9:10 PM, Greg Wilkins <gregw_at_webtide.com> wrote:
>
>
> Ed,
>
> I certainly hear you regarding complexity.
>
> Still I don't mind a null check if it is semantically meaningful, so I'm +0 for the null return, but it's not a big issue. I can think of not entirely contrived use-cases that would need to know the difference. Let's say you are using trailers to send checksum or even digital signatures of the content, but if trailers are unavailable you are happy to continue in a less-trusted mode, thus the app would want to know the difference between a check-sum not sent and one not able to be sent.
>
> With regards to case sensitivity, the TreeMap approach violates the basic contract of Map as equals does not work. I'd prefer to just document that keys are in lowercase and return a normal map. However, thinking about TreeMap also indicates another deficiency of using Map, in that the order of headers is lost, which can be significant.
>
> Maybe List<Map.Entry<String,String>> is a better type to use as it is fully semantically equivalent to HTTP headers rather than an approximation.
>
> Because that is a little unwieldy, on the request side we could have:
>
> Enumeration<String> getTrailerNames()
> String getTrailer(String name)
> Enumeration<String> getTrailers(String name)
>
> I know it's a very retro API, but it is semantically correct and consistent with the headers. The it would only be on the request side where we would have:
>
> setTrailers(Supplier<Iterable<Map.Entry<String,String>>>)
>
> cheers
>
>
>
>
>
>
>
>
>
>
> On 20 April 2017 at 08:59, Edward Burns <edward.burns_at_oracle.com <mailto:edward.burns_at_oracle.com>> wrote:
> GW> Chunked encoding or HTTP/2 would give an empty map if there are no
> GW> trailers, otherwise null to indicate no trailers could have been
> GW> transferred.
>
> >>>>> On Wed, 19 Apr 2017 13:52:22 -0700, Shing Wai Chan <shing.wai.chan_at_oracle.com <mailto:shing.wai.chan_at_oracle.com>> said:
>
> SW> So, the former means no TE header.
> SW> Do we need to distinguish this in #getTrailers()? Or we have to check the presence of TE header?
> SW> So we have to do the null check for #getTrailers() here. Do we want that?
>
> >>>>> On Thu, 20 Apr 2017 07:48:28 +1000, Greg Wilkins <gregw_at_webtide.com <mailto:gregw_at_webtide.com>> said:
>
> GW> It is more that TE header as Trailers can be carried on chunked HTTP/1.1 or
> GW> HTTP/2, plus some other transports, so the text needs to not be protocol
> GW> specific.
>
> GW> Now I'm not sure if we really need to know the difference between
> GW> no-trailers and not-able-to-transport-trailers, but the difference between
> GW> a null return and an empty map is a good way to convey that semantic so why
> GW> not use it.
>
> To answer why not I'll observe that having to check for null returns is
> a pain and we should avoid forcing people to do it if possible. Now, I
> grant that we already return null from
> HttpServletRequest.newPushBuilder() to indicate that push is not
> available, but here we have an opportunity to return an empty Map. I
> think of it as a complexity tax. If we are going to ask users to pay
> this complexity tax, I'd like to know what we are giving them in return
> is valuable. If there is a compelling case to distinguish between
> no-trailers and not-able-to-transport-trailers, then it might be worth
> paying the tax. If not, I'd say no it's not.
>
> GW> Note also that Simone Bordet has pointed out to me that Map<String,String>
> GW> is a poor data structure to use as it is case sensitive and trailers are
> GW> case insensitive. If we are to use map, then we should say that all the
> GW> keys are set as lowercase rather than any case passed by the client -
> GW> which would make it hard to find headers if strange casings were
> GW> passed.
>
> Again, the complexity tax of introducing an entirely new data structure
> was judged not worth paying. What about this implementation approach to
> this problem?
>
> http://stackoverflow.com/questions/11929542/how-to-ignore-the-case-sensitive-when-we-look-for-a-key-in-the-map <http://stackoverflow.com/questions/11929542/how-to-ignore-the-case-sensitive-when-we-look-for-a-key-in-the-map>
>
> >>>>> On Wed, 19 Apr 2017 15:43:35 -0700, Shing Wai Chan <shing.wai.chan_at_oracle.com <mailto:shing.wai.chan_at_oracle.com>> said:
>
> SW> Yes, we need to clarify that the header keys are in lowercase.
>
> Thanks,
>
> Ed
>
> --
> | edward.burns_at_oracle.com <mailto:edward.burns_at_oracle.com> | office: +1 407 458 0017 <tel:%2B1%20407%20458%200017>
> | 1 business days until planned start of Servlet 4.0 Public Review
>
>
>
> --
> Greg Wilkins <gregw@webtide.com <mailto:gregw@webtide.com>> CTO http://webtide.com <http://webtide.com/>