WAN optimization testing

June 1st, 2009 steve

I’ve been thinking a lot about testing WAN optimization devices lately, and I’m surprised at how many of the test scenarios needed to validate a product like this still can’t be run with the test tools on the market.

One of the missing pieces is control over the randomization of content. Many vendors offer the ability to generate random content, but in some cases the overall file size is extremely limited (typically no larger than 10MB) or the “randomness” is not really random. I’d like to see the randomness controlled by adjustable percentages, so that you can emulate the various scenarios required for testing compression-related algorithms.
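To make the "percentage" idea concrete, here is a minimal Python sketch of what I have in mind. It is not how any vendor does it; the function names, the 4 KB block size, and the constant filler byte are all my own assumptions for illustration. It mixes random blocks with trivially compressible blocks at a chosen percentage, then uses zlib to show how compressible the result is.

```python
import os
import zlib


def mixed_payload(size, random_percent, block=4096):
    """Build `size` bytes in which about `random_percent`% of blocks are random.

    Random blocks come from os.urandom; the remaining blocks are a constant
    filler byte (an assumption for this sketch), which compresses almost fully.
    `random_percent` is expected to be an integer from 0 to 100.
    """
    if not 0 <= random_percent <= 100:
        raise ValueError("random_percent must be between 0 and 100")
    out = bytearray()
    credit = 0  # integer accumulator: every 100 blocks contain exactly random_percent random ones
    while len(out) < size:
        n = min(block, size - len(out))
        credit += random_percent
        if credit >= 100:
            credit -= 100
            out += os.urandom(n)   # incompressible block
        else:
            out += b"A" * n        # highly compressible block
    return bytes(out)


def compressed_ratio(data):
    """Compressed size divided by original size, using zlib at its default level."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    for pct in (0, 25, 50, 75, 100):
        payload = mixed_payload(1 << 20, pct)
        print(pct, round(compressed_ratio(payload), 3))
```

A constant filler is the easiest possible case for a compressor, so a more realistic mix would probably swap it for text or other real-world data. The point is only that the random share is a dial you can turn.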

The other missing piece is the size of the files that can be retrieved from a server. This issue has existed in almost everyone’s solution for years, and although it has been getting more attention lately because of this and other markets, it is still not fully addressed. Some vendors have limits of less than 1MB, while others can serve files as large as 100MB, but with severe performance limitations or other restrictions.

File size limitations are really frustrating; this should have been addressed ages ago. Typical file downloads, even for things like PowerPoint or Word documents in a corporate environment where WAN optimization is present, are often several megabytes. Outside of WAN optimization testing, typical downloads on the internet are far larger.

It’s understandable that, given many test vendors’ architectures, serving files of any serious size, or randomized files, is very challenging or in some cases impossible. Many of the architectures have extremely limited space to store files because they are optimized for connections per second or other more CPU-intensive items. However, from the pure file size perspective, all of these architectures could adapt to grab a file from a network location and serve it out; almost always the limitation here isn’t CPU cycles on the test gear, but bandwidth or throughput. Obviously I don’t know all the intricacies of everyone’s architecture, but it seems like larger files alone would be an easy challenge to solve.

Randomized content would definitely be more difficult if you were doing it on the fly. Making the content completely random all of the time, or even a percentage of it, could be CPU intensive for the test equipment, but I bet there are some options there, whether hardware offloaded or not.
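One software-only option, sketched below in Python, is to generate the content from a seeded pseudo-random generator instead of storing it. This is just an illustration under my own assumptions (the function name, the seed scheme, and the chunk size are made up here), and I haven't measured what it would cost on real test gear. The useful property is that the same seed always reproduces the same bytes, so a "second pass" test against a cache can request identical content without anyone having to keep the file around.

```python
import random


def seeded_stream(seed, total_bytes, chunk=65536):
    """Yield `total_bytes` of pseudo-random data, reproducible from `seed`.

    Keep `chunk` fixed between runs: the byte sequence depends on it.
    """
    rng = random.Random(seed)  # Mersenne Twister; not cryptographic, fine for test payloads
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk, remaining)
        # getrandbits + to_bytes works on any Python 3; 3.9+ also offers rng.randbytes(n).
        yield rng.getrandbits(8 * n).to_bytes(n, "little")
        remaining -= n


if __name__ == "__main__":
    first = b"".join(seeded_stream(seed=42, total_bytes=1_000_000))
    again = b"".join(seeded_stream(seed=42, total_bytes=1_000_000))
    print(len(first), first == again)
```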

There are some good resources out there for WAN optimization testing, including the Silesia Corpus, a library of files of varying compressibility and content types, like medical X-rays, text files, images, etc. Even integrating something like this into the test gear would be great. Of course, it doesn’t solve the need for large datasets of random content for testing memory and disk caches.

To solve this now, I’ve built real servers that house large random datasets. This is the approach that others I’ve talked to have taken as well. The problem here is that the server isn’t instrumented for stats like a traditional performance harness, so you have a whole set of additional pain and problem points, not to mention potential test failures that you won’t be aware of. Additionally, it requires a decent amount of coordination work on the client side to make sure you’re actually pulling the right objects in your test.
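For the client-side coordination, one approach that can help is a checksum manifest: hash every object on the server once, then have the client check each body it pulls against that list. The Python sketch below is only an illustration of the idea; the function names and the in-memory streams are placeholders I made up, and a real setup would read files on the server and HTTP responses on the client.

```python
import hashlib
import io


def sha256_of(stream, chunk=1 << 20):
    """Hash a binary stream in chunks so large objects never sit fully in memory."""
    digest = hashlib.sha256()
    for piece in iter(lambda: stream.read(chunk), b""):
        digest.update(piece)
    return digest.hexdigest()


def build_manifest(objects):
    """Map each object name to its SHA-256; `objects` is {name: binary stream}."""
    return {name: sha256_of(stream) for name, stream in objects.items()}


def verify(manifest, name, stream):
    """Return True only if the fetched body matches the digest recorded for `name`."""
    expected = manifest.get(name)
    return expected is not None and sha256_of(stream) == expected


if __name__ == "__main__":
    # Stand-ins for a served object and the body a client later fetched.
    served = {"obj-001.bin": io.BytesIO(b"\x00" * 1024)}
    manifest = build_manifest(served)
    fetched = io.BytesIO(b"\x00" * 1024)
    print(verify(manifest, "obj-001.bin", fetched))
```

A mismatch here would at least surface the silent failures mentioned above, such as a truncated transfer or the wrong object coming back, even though the server itself reports no stats.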

Many vendors have expressed interest in this market and are working on solutions.  At least until there are solutions from some of the big ones, this leaves a gap for the smaller or newer players to fill, and I’d say rather easily given some of their architectures.  The WAN optimization market is growing very well even in these tough economic times, so there’s money to be had pretty easily.  Hopefully we’ll be seeing some new solutions here soon!