Monday, June 04, 2012

Problems of scale

Recently I noticed that the web logs of the Eclipse Download servers were being overrun by an inordinate number of requests for various p2.index files.  All the responses appeared to be "404 Not Found".  This seemed odd to me, so I began to investigate.

My findings: 6,000,000 each day. We respond to 6 million requests for p2.index files each day.  Ouch.

Why not?

I won't go into the "why?" but I do want to elaborate on the "why not."  "Checking for a small file is a trivial request, no one will ever notice, impact will be minimal" is usually the thinking.  And that thinking is correct -- go ahead and click this link, and it will all be over before you know it.

Small, insignificant request

Problems arise when small, insignificant requests add up by the millions.  An Ethernet packet cannot be smaller than 64 bytes, so although a "404 Not Found" response is only a few bytes, 6 million of them add up to 384 MB of actual data sent on the wire each day.
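To make that arithmetic concrete, here is a minimal sketch.  It follows the post's own assumptions (one minimum-size 64-byte Ethernet frame per 404 response, decimal megabytes); the names are illustrative and not taken from any Eclipse tooling.

```python
# Back-of-the-envelope estimate of daily 404 traffic.
# Assumption: each response fits in one minimum-size Ethernet frame.
REQUESTS_PER_DAY = 6_000_000
MIN_ETHERNET_FRAME_BYTES = 64


def daily_bytes(requests: int, frame_bytes: int = MIN_ETHERNET_FRAME_BYTES) -> int:
    """Bytes on the wire per day if every response is one frame."""
    return requests * frame_bytes


total = daily_bytes(REQUESTS_PER_DAY)
print(f"{total / 1e6:.0f} MB per day")  # prints "384 MB per day"
```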

Where have all the bandwidths gone?

Unfortunately, there is more.  Until recently, Eclipse wasn't able to reuse an HTTP connection for multiple requests, so it would go through a 10-packet TCP exchange for each and every file it wanted.  That's 4 Ethernet packets departing the server for each p2.index "404" response.  All of a sudden we're sending 1.5 GB of 404 material each day.
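The same estimate can be extended to count every packet the server sends per request.  This is only a sketch under the figures quoted above (4 outbound packets per response, each treated as a minimum-size 64-byte frame); the function and parameter names are hypothetical.

```python
# Estimate server-side outbound traffic when each request needs its own
# TCP connection. Assumption: every outbound packet is a 64-byte frame.
def outbound_bytes_per_day(requests: int,
                           packets_per_response: int,
                           frame_bytes: int = 64) -> int:
    """Total bytes leaving the server per day."""
    return requests * packets_per_response * frame_bytes


without_reuse = outbound_bytes_per_day(6_000_000, packets_per_response=4)
print(f"{without_reuse / 1e9:.2f} GB per day")  # prints "1.54 GB per day"
```

Changing `packets_per_response` is a quick way to see how much of the total is connection overhead rather than the response itself.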

It's happened before!

Client-server developers typically don't know much about what's going on server-side; there's nothing new there.  I remember many years ago when the Mylyn (née Mylar) project became wildly successful and was killing Eclipse's Bugzilla by fetching too many bugs and too many repository listings too often.

But here's one small tip: if you catch yourself saying, "it's small and insignificant", "no one will notice" or "no impact", go ahead and multiply the action by 6 million to see if the answer is the same.

Or engage with your IT team.  They want to help  :-)


Blogger Ian Bull said...

Denis. Thanks for the information about the concrete problems this file is causing. However, I think it is important to understand the why, because any solution we arrive at should still address the problems the p2.index file was designed to address.

I wrote up a post on the 'why' to hopefully help us get some more good ideas flowing.

12:24 AM  
Blogger Ian Bull said...

BTW, you should do another classic-404-webmaster-poll. will be my write in response :-).

12:27 AM  
Blogger Maarten Meijer said...

Yeah, but the Mylyn/Mylar problem was that we were downloading a huge XML file that contained 30% spaces. There is little scope for reducing the p2.index file :(
But you are absolutely right that "just adding a request for a small file" often seems the right way forward, but introduces tremendous HTTP-level overhead for the reasons you describe. A similar problem often occurs in poorly designed AJAX websites, where there are many small requests instead of a single provide-all one.

4:01 AM  
