Troubleshooting Varnish with Magento 2

tl;dr: Set the launch parameter:


http_resp_hdr_len

to something very large if you are using Varnish with Magento 2. The default is not enough, and terrible things can happen to you if you don't change this.


Magento 2 uses Varnish as its full page caching back-end. This is terrific, and it's quite well implemented. Magento will generate a VCL for you from a template, and carefully craft HTTP response headers to get Varnish to do the right thing.

But it can cause problems if you're not careful. We recently enabled Varnish on one of our Magento 2 sites, only to find that while most pages were served startlingly quickly, a large fraction of category pages were giving us "503: Backend Fetch Failed" errors.

To track this down, I first tried turning on debug headers, but that wasn't enough. I was getting 503s, and I really needed to view the raw response. So I did two things:

First, I checked the logs by running:


varnishlog > /tmp/varnishlog

and searching for the error code quoted on the response page. This allowed me to view the nature of the error. I learned that it was an HTTP fetch error, caused by Magento sending a malformed HTTP response to Varnish. This seemed strange to me.
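Here is a rough sketch of that log search, assuming the capture was saved to /tmp/varnishlog as above. The helper name fetch_errors is made up for illustration; FetchError, BereqURL and BerespStatus are standard varnishlog record tags, and the live query in the comment assumes a Varnish version with VSL query support (4.0 or later).

```shell
# Show only the backend-fetch records from a saved varnishlog capture:
# the URL requested from the backend, the status it returned (if any),
# and the reason Varnish gave for abandoning the fetch.
fetch_errors() {
    grep -E 'FetchError|BereqURL|BerespStatus' "$1"
}

# Usage against the capture from above:
#   fetch_errors /tmp/varnishlog
#
# Or skip the file and watch only failing requests live:
#   varnishlog -g request -q 'RespStatus == 503'
```

The FetchError line is the one to read first, since it names the reason the backend response was rejected.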

Next, I turned on full page caching in Magento, but passed web traffic directly to PHP-FPM. This allowed me to use curl to view the back end response that Varnish was complaining about.

Chrome certainly didn't mind the response, so I passed the URL to redbot, an HTTP response validator. It reported no errors, but warned me that "X-Magento-Tags" was an excessively long header, with which some clients might take issue.

Finally, a lead. After Googling around for Varnish header length, I happened upon the answer: Varnish has a launch parameter called


http_resp_hdr_len

By default, this is set to 8192. A quick "wc -c" of the "X-Magento-Tags" header gives somewhere around 12000. Changing this parameter fixed all my problems.
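The "wc -c" check can be wrapped up like this. It is a minimal sketch: the function name tags_header_len is made up, and the URL in the usage comment is a placeholder for wherever your back end listens when Varnish is bypassed.

```shell
# Read raw HTTP response headers on stdin and print the length of each
# X-Magento-Tags header line (header name included, trailing CR stripped).
tags_header_len() {
    awk 'tolower($0) ~ /^x-magento-tags:/ { sub(/\r$/, ""); print length($0) }'
}

# Usage (placeholder host, port and path; point it at the back end, not Varnish):
#   curl -s -o /dev/null -D - http://127.0.0.1:8080/some-category.html | tags_header_len
```

Anything printed here that is larger than the current http_resp_hdr_len is a page Varnish will refuse to fetch.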

But what should I change it to? Change it to something big, okay? The rub here is that Magento adds a tag for every product in a category, and each tag is of the form "catalog_product_{{PRODUCT_ID}}". That takes up about 20 characters, so a back-of-the-envelope calculation (8192 / 20) gives you roughly 400 products in a category before you hit this error. This is not an unreasonable number of products to have in a category.
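To make the back-of-the-envelope estimate concrete, here is the same arithmetic as a small shell function. It assumes the tags are comma-separated, that the limit applies to the whole header line including the "X-Magento-Tags: " name, and that only product tags are present (real pages carry a few other tags too). The function name max_tags is made up for illustration.

```shell
# max_tags LIMIT ID_DIGITS
# Estimate how many "catalog_product_<id>" tags fit in one header line.
max_tags() {
    limit=$1          # header length limit in bytes, e.g. 8192
    id_digits=$2      # typical number of digits in a product ID
    prefix_len=16     # length of "catalog_product_"
    name_len=16       # length of "X-Magento-Tags: " (assumed to count)
    per_tag=$(( prefix_len + id_digits + 1 ))   # +1 for the separating comma
    echo $(( (limit - name_len) / per_tag ))
}

max_tags 8192 3   # prints 408
max_tags 8192 4   # prints 389
```

So the ceiling sits around 400 products either way, and it drops as product IDs get longer.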

If you want my advice, I'd set it to something like 30 ⨉ (maximum number of products in a category). This gives you a decent buffer while keeping the value sensible. I'm not sure how Varnish allocates memory for storing responses, but my guess would be that Varnish does a malloc of "http_resp_hdr_len" ⨉ "http_max_hdr" + "http_resp_size". If that guess is right, your memory consumption would increase as you increase this header length limit.
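As a sketch of how that rule of thumb might turn into launch flags: the function below applies the 30 ⨉ rule and prints "-p" options for varnishd. One thing to watch is that a single header also has to fit inside "http_resp_size" (32k by default), which in turn is carved out of "workspace_backend" (64k by default), so the sketch raises those when needed. The doubling used for headroom is an arbitrary assumption of mine, not a documented recommendation, and the function name is made up.

```shell
# varnish_hdr_params MAX_PRODUCTS
# Print varnishd "-p" flags sized for the largest category.
varnish_hdr_params() {
    max_products=$1
    hdr_len=$(( 30 * max_products ))        # the 30-bytes-per-product rule of thumb

    resp_size=32768                          # Varnish default for http_resp_size
    if [ "$hdr_len" -ge "$resp_size" ]; then
        resp_size=$(( hdr_len * 2 ))         # arbitrary headroom (assumption)
    fi

    workspace=65536                          # Varnish default for workspace_backend
    if [ $(( resp_size * 2 )) -gt "$workspace" ]; then
        workspace=$(( resp_size * 2 ))       # arbitrary headroom (assumption)
    fi

    echo "-p http_resp_hdr_len=$hdr_len -p http_resp_size=$resp_size -p workspace_backend=$workspace"
}

varnish_hdr_params 1000
# prints: -p http_resp_hdr_len=30000 -p http_resp_size=32768 -p workspace_backend=65536
```

The printed flags go on the varnishd command line (for example in the service's startup options). The same parameters can be changed on a running instance with "varnishadm param.set", though that does not survive a restart.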

My reasoning is that dynamically allocating memory here would be difficult: Varnish would have to keep some big bucket around to parse the response, and then store it somewhere else. This would also have an impact on performance, and my understanding of the philosophy of Varnish is that a bit of wasted memory is fine if it shaves a few more milliseconds off response time. Furthermore, why bother asking for a maximum if you're going to dynamically allocate memory? Especially if you're asking at launch and not allowing it to be configured in VCL?

I did my good citizen thing and sent a pull request to the devdocs team with a note in the documentation that someone might want to look here.