[Fwd: Re: [ietf-dkim] canonicalized null body and dkim]

Charles Lindsey chl at clerew.man.ac.uk
Tue Dec 19 06:25:03 PST 2006


On Mon, 18 Dec 2006 16:02:55 -0000, Tony Hansen <tony at att.com> wrote:

>>
>> Making the entirely reasonable assumption that "body" means exactly what
>> RFC 2822 defines it to mean, then here is what gets hashed in all of
>> those cases:
>>
>> 1) ordinary message with <body> of 1 non-empty line:
>> ---------------------
>> barbazCRLF
>> ---------------------
>>
>> 2) <body> consisting of 2 empty lines
>> ---------------------
>> CRLF
>> ---------------------
>>
>> 3) <body> consisting of 1 empty line
>> ---------------------
>> CRLF
>> ---------------------
>>
>> 4) <body> containing no lines
>> ---------------------
>> CRLF
>> ---------------------
>>
>> 5) message with absent <body>
>> ---------------------
>> ---------------------
>
> I contend that the current wording in base-07 also requires that example
> 5 canonicalize into a
>
> ---------------------
> CRLF
> ---------------------
>
Well that's a philosophical issue :-( .
The present text instructs you to do various things with "the body". If
there is no body, then you can't do those things. So there is nothing to
canonicalize and nothing to hash. The only sensible thing to do in
that case is to hash nothing. So I seem to disagree with you :-( .

But what we should actually do, when this is sorted out, should, I
think, be that which causes the "least astonishment". So one would hardly
expect the canonicalization to finish up with more lines than you started
with.

The easiest rule for everyone to understand is an invariant such as:

    "After canonicalization, there will NEVER be an empty line at the end
     of what remains to be hashed."
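
The invariant above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the wording of any draft: the helper name strip_trailing_empty_lines is hypothetical, the body is assumed to be a byte string with CRLF line endings, and cases 4 and 5 are both modelled as the empty byte string.

```python
CRLF = b"\r\n"

def strip_trailing_empty_lines(body: bytes) -> bytes:
    """Hypothetical helper: drop every empty line at the end of body."""
    # A trailing empty line shows up as two CRLFs in a row at the end.
    while body.endswith(CRLF + CRLF):
        body = body[:-len(CRLF)]
    # A body that is nothing but one empty line becomes the empty body.
    return b"" if body == CRLF else body

# Cases 1 to 4 from the quoted list (case 5 is modelled like case 4).
cases = {
    1: b"barbaz\r\n",  # one non-empty line
    2: b"\r\n\r\n",    # two empty lines
    3: b"\r\n",        # one empty line
    4: b"",            # no lines
}
for number, body in cases.items():
    print(number, strip_trailing_empty_lines(body))
```

Under these assumptions case 1 is left alone, and cases 2 to 4 all reduce to the empty byte string, so nothing is hashed for them.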

> Even when the body doesn't exist, it still must be treated as having 0
> lines following, which still canonicalize to a CRLF.
>
> But even with my contention on case #5, I don't disagree with your
> conclusions here:
>

> I firmly believe that we *intended* to canonicalize each of these cases
> into the empty body
>
> ---------------------
> ---------------------
>
Which is indeed the "least astonishing".

> PS. For completeness, the only missing cases, after taking into
> consideration RFC 3030 and MIME, are as follows. *These* are the reason
> that the 0*CRLF rule was added and where it needs to be applied:
>
> 6) ordinary message with <body> of >1 non-empty line, not ending in CRLF
>
> Content-Type: binary
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> somethingCRLF
> anything
> ---------------------
>
> 7) ordinary message with <body> of 1 non-empty line, not ending in CRLF
>
> Content-Type: binary
> Last-Header: foobarCRLF
> CRLF
> ---------------------
> anything
> ---------------------

I think those cases are covered by the rule that tells you to add a CRLF  
if there is no trailing CRLF anyway. But weird things can happen when you  
have binary. Here are some more cases for you all to chew over:

8) Binary text with assorted line endings:

Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
anythingLF
somethingCRLF
nothingCR
---------------------

So do I fix that last line by adding CRLF, or by adding just LF, or by  
leaving it alone?
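
To make the question concrete, here is a sketch of the most literal reading of the "add a CRLF if there is no trailing CRLF" rule, applied to case 8. The function name ensure_trailing_crlf is hypothetical, and leaving an empty body untouched is an assumption made here, not something the text settles. The sketch does not claim this reading is the right one; it only shows what it produces.

```python
CRLF = b"\r\n"

def ensure_trailing_crlf(body: bytes) -> bytes:
    """Hypothetical literal reading: append CRLF unless body already ends in one."""
    # Assumption: an empty body is left alone (that is the disputed case 5).
    if body and not body.endswith(CRLF):
        return body + CRLF
    return body

# Case 8: three lines ending in LF, CRLF and a bare CR respectively.
case8 = b"anything\nsomething\r\nnothing\r"
print(ensure_trailing_crlf(case8))
```

With this reading the last line ends in CR CR LF, which is exactly the sort of result that needs an agreed answer.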

Moreover, if I specify "l=1" in the DKIM-Signature, exactly which of those
lines gets included in the process? And if I do any sort of
canonicalization, is the count for "l=?" done before or after the
canonicalization? (I presume before.)
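
A small sketch of why the order matters. It assumes, purely for illustration, that the count is in octets and that canonicalization does nothing except drop trailing empty lines; the names toy_canon, count_then_canon and canon_then_count are hypothetical.

```python
CRLF = b"\r\n"

def toy_canon(body: bytes) -> bytes:
    """Toy canonicalization (assumption): drop empty lines at the end."""
    while body.endswith(CRLF + CRLF):
        body = body[:-len(CRLF)]
    return b"" if body == CRLF else body

def count_then_canon(body: bytes, length: int) -> bytes:
    # Apply the count to the raw body, then canonicalize what is left.
    return toy_canon(body[:length])

def canon_then_count(body: bytes, length: int) -> bytes:
    # Canonicalize first, then apply the count to the result.
    return toy_canon(body)[:length]

# One non-empty line followed by two empty lines.
body = b"a" + CRLF + CRLF + CRLF
print(count_then_canon(body, 4))
print(canon_then_count(body, 4))
```

For this input the two orders feed different octets to the hash (the first keeps a stray CR, the second does not), so signer and verifier must agree on which one is meant.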

And if I specify "l=0" and then treat my case (5) as you proposed to treat  
it, do I then have to hash that CRLF that appeared out of nowhere?

Canonicalization, you see, is tricky stuff to specify (as indeed is any  
cryptographic process). And agreeing the count for "l=?" is another of the  
things that has to be gotten right.

So, all in all, I think what is needed is an Appendix containing  
interesting test cases for which everybody needs to get the same results,  
and maybe another Appendix containing a model implementation (as the  
punycode people did, which is as well because nobody would have understood  
their algorithm without it).

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131                Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

