UPDATE: as of 01-10-2012, bytecdn.nl has been deactivated. The ByteCDN service can no longer be used.

ByteCDN

ByteCDN in use

For some time now we’ve been running a very successful beta test of the ByteCDN used in our Magento Accelerator tool. When using the ByteCDN for, say, shop.domain.com, your static files become available on shop-domain-com.bytecdn.nl. If your domain contains a hyphen, as in shop.my-domain.com, the hyphen is doubled, so it translates to shop-my--domain-com.bytecdn.nl.
This is a highly efficient way to dynamically map many hostnames back to their original form, without the overhead of gigantic hashes and maps of thousands of domain names, a definite must for high-speed applications. The reason we have to do this is SSL. Because all the configuration has to be lean and dynamic, we got a wildcard certificate for *.bytecdn.nl so that we can serve the static content over a secured SSL connection. A wildcard certificate only covers a single label in front of bytecdn.nl, which is why the dots in the original hostname have to go. This all worked perfectly fine until, a few days ago, we got a bug report from a customer who had problems with their CSS and JavaScript while using the ByteCDN…
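The hostname translation described above can be sketched in a few lines. This is only an illustration, not the actual ByteCDN code: the function names are hypothetical, and the escaping rule (double every "-", then turn every "." into "-") is an assumption inferred from the examples in this post.

```python
CDN_SUFFIX = ".bytecdn.nl"  # the CDN zone named in the post


def to_cdn_host(hostname: str) -> str:
    """Flatten a hostname into a single label under the CDN zone.

    Assumed rule: '-' is doubled to '--' first, then '.' becomes '-'.
    """
    return hostname.replace("-", "--").replace(".", "-") + CDN_SUFFIX


def from_cdn_host(cdn_host: str) -> str:
    """Reverse the assumed mapping without any lookup table."""
    label = cdn_host[: -len(CDN_SUFFIX)]
    # Protect doubled hyphens with a placeholder, turn the remaining
    # single hyphens back into dots, then restore the real hyphens.
    return label.replace("--", "\0").replace("-", ".").replace("\0", "-")


print(to_cdn_host("shop.domain.com"))     # shop-domain-com.bytecdn.nl
print(to_cdn_host("shop.my-domain.com"))  # shop-my--domain-com.bytecdn.nl
print(from_cdn_host("shop-my--domain-com.bytecdn.nl"))  # shop.my-domain.com
```

Because the mapping is purely a string transformation in both directions, no per-customer table is needed, which is the property the paragraph above is after.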

The Problem

The customer let us know that with the CDN enabled, extra JavaScript was loaded in some browsers and some CSS was missing in others. It took us a while to debug this, especially since the problem was hard to reproduce on our test domains, until we managed to reproduce it on a test domain that also had a ‘-’ sign in the domain name.
It didn’t help, of course, that Firefox was nice enough to display part of the problem in View: Page Source but chose to hide the rest. This is what we saw with the ByteCDN enabled:

[Screenshot: Broken Conditional Comments]

Conditional Comments

For those of you not familiar with them, these are so-called HTML Conditional Comments: statements in the HTML source that are interpreted by Microsoft Internet Explorer. It’s a trick to hide CSS, JavaScript and other HTML code inside an HTML comment. Knowing that, the problem becomes quite clear. For some reason the second script tag is not treated as a comment (which Firefox would have colored green) but as valid code. The comment’s closing tag also appears to be malformed. This is how it should look:

[Screenshot: Valid Conditional Comments]

Here you can clearly see that the code has been interpreted entirely as a (green) comment, and thus will work the way the webmaster intended: a comment to be ignored by most browsers, but valid code for older Internet Explorer versions.
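For readers who have never seen one, the shape of such a Conditional Comment can be sketched as follows. The helper name, the condition and the script path are made up for illustration; they are not taken from the customer's shop.

```python
def conditional_comment(condition: str, inner_html: str) -> str:
    """Wrap markup in an IE Conditional Comment.

    Most browsers see one ordinary HTML comment; older versions of
    Internet Explorer evaluate the condition and render inner_html.
    """
    return f"<!--[if {condition}]>\n{inner_html}\n<![endif]-->"


# Hypothetical example: a script that only IE versions below 7 should load.
snippet = conditional_comment(
    "lt IE 7",
    '<script type="text/javascript" src="/js/ie6-fixes.js"></script>',
)
print(snippet)
```

Everything between the opening "<!--" and the closing "-->" is meant to be one comment, which is exactly the assumption that breaks later in this story.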

At first this confused us greatly, since the page that contains these comments was not served via the ByteCDN but directly from our Magento clusters. That is, until we downloaded the page directly from a Linux terminal, looked at the source there, and discovered the following: even with the ByteCDN enabled, the code looked perfectly fine. Apparently Firefox took it upon itself to change the displayed HTML code for us, showing what it interpreted instead of what it downloaded.

Now things are starting to make more sense. Nothing weird has changed in the HTML generated by Magento, so it’s not a bug in Magento. Then what, a bug in Firefox’s HTML parser? To answer that question there’s only one place to look.

The Standards

Time to check the official HTML standards. The World Wide Web Consortium (W3C) maintains all the different HTML, XML and related standards. So let’s have a look at the W3C definition of HTML Comments.

It took a few reads, but pay close attention to how the end of a comment is defined. Contrary to popular belief, the comment does NOT end with “-->”. Instead it ends at “--”, and anything after that is ignored until the “>” sign is found. Now, if we look at the above screenshot of Firefox parsing the HTML comments, it all makes sense. Firefox correctly assumes the comment ends at the first “--” inside the comment tag, but unfortunately there is now a “--” inside the commented URL. So as far as Firefox is concerned, the comment is as follows:

After that there’s some more text, and the comment ends with a “>”. It keeps on parsing, finds a “</script>” tag it doesn’t care about, and on the next line finds a new set of <script …></script> tags. Firefox is no longer in comment mode, so it runs this script and ignores the following (malformed) “<![endif]-->” tag.
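The behaviour described above can be imitated with a toy scanner. This is a simplified sketch of the rule as explained in this post (the comment closes at the first “--”, and everything up to the next “>” is skipped), not Firefox's actual parser. The hostnames and script paths are hypothetical.

```python
from typing import Tuple


def split_comment(html: str) -> Tuple[str, str]:
    """Split html into (what is treated as comment, what is parsed as markup)."""
    assert html.startswith("<!--"), "input must start with a comment opener"
    close = html.index("--", 4)       # first "--" after the opening "<!--"
    end = html.index(">", close + 2)  # skip ahead to the next ">"
    return html[: end + 1], html[end + 1:]


def page(host: str) -> str:
    """Build a Conditional Comment loading two scripts from the given host."""
    return (
        "<!--[if lt IE 7]>\n"
        f'<script src="https://{host}/js/a.js"></script>\n'
        f'<script src="https://{host}/js/b.js"></script>\n'
        "<![endif]-->"
    )


# No "--" in the hostname: the whole block stays a comment.
comment, rest = split_comment(page("shop-domain-com.bytecdn.nl"))
print(repr(rest))

# "--" in the hostname: the comment ends inside the first script tag,
# and the second script tag is left over as live markup.
comment, rest = split_comment(page("shop-my--domain-com.bytecdn.nl"))
print(rest)
```

In the second case the leftover markup starts with a stray “</script>”, followed by the second script tag and the now meaningless “<![endif]-->”, which matches what Firefox showed.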

Hurray! Problem found!

Normally this shouldn’t be a problem: if your comment isn’t entirely seen as a comment, you just change it so that it no longer conflicts with the standard. In this case, however, Microsoft’s own Conditional Comments syntax is not as compatible with everything as they’d hoped. It really sucks that our customers are having problems with the ByteCDN-translated URLs inside Conditional Comments, but the problem exists on a larger scale than this. Any URL with a “--” in it cannot be used inside these Conditional Comments, whether the “--” is in the path section of the URL or in the hostname.

Now, you may not see many hostnames with “--” in them, but they are most certainly used. IDN uses the prefix “xn--” to signify that a name has been encoded using Punycode. Punycode is a simple translation that turns a domain name like “Bücher.ch” into “xn--bcher-kva.ch”. This translated domain name has the same “--” in its hostname as our ByteCDN URLs, and thus it too can never be used inside Microsoft’s Conditional Comments. They are non-standard after all, and even the W3C says so: “Information that appears between comments has no special meaning…”
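The Punycode example can be reproduced with Python's built-in "idna" codec, and a one-line check is enough to flag URLs that would trip up a Conditional Comment. The helper names and the example URLs are hypothetical; the check simply looks for the “--” sequence discussed above.

```python
def idn_to_ascii(hostname: str) -> str:
    """Encode an internationalized hostname to its ASCII (Punycode) form."""
    return hostname.encode("idna").decode("ascii")


def breaks_conditional_comment(url: str) -> bool:
    """True if the URL contains '--', in the hostname or in the path."""
    return "--" in url


ascii_host = idn_to_ascii("b\u00fccher.ch")  # "bücher.ch"
print(ascii_host)  # xn--bcher-kva.ch

for url in (
    "https://shop-domain-com.bytecdn.nl/skin/style.css",
    "https://shop-my--domain-com.bytecdn.nl/skin/style.css",
    f"https://{ascii_host}/skin/style.css",
):
    print(url, breaks_conditional_comment(url))
```

Only the first URL passes; both the doubled-hyphen CDN hostname and the Punycode hostname contain “--”.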
