Wednesday, April 16, 2008

Netezza

This is just a quick post....

A colleague of mine was talking about how he was working with Netezza on a project of his, and I thought I'd look it up. It seems to be a super fast data warehousing database. It's actually a DW appliance!

They move the analytics right next to the database, so work you would normally do by pulling the data out is instead done right on the Db. Wow!

Something to check out I guess, to see if it makes sense for your app.


Here's a picture from their website:

Tuesday, April 15, 2008

Etags - Performance slowdowns / network bandwidth waste

A friend and ex-colleague of mine (Anand) introduced me today to the Firefox plugin YSlow.

I quickly installed it on Firefox since I already had Firebug installed. This is an awesome plugin!

I ran it on our internal website, and found a number of things, but what caught my eye was Etags. I never knew what they were and wanted to dig into it a little more.

So what are Etags?
From Yahoo:

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word for a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraints are that the string be quoted. The origin server specifies the component's ETag using the ETag response header.

Here's the entire article: Etags

Yahoo further goes on to talk about what the problem is with Etags.

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won't match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests.

Apparently, the problem exists with Apache as well as IIS 5.0 and 6.0. What does this mean for load balanced servers? It means that if the request switches from one webserver to another for any reason, the Etag no longer matches and your cached image is reloaded completely, instead of the server just sending a quick 304. That is a waste of bandwidth and perhaps worse performance. (Basically, by using an Etag on load balanced servers, your proxy caching and content caching don't work as well as intended!)
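
To make the mismatch concrete, here is a minimal sketch in Python. It assumes an Apache-style Etag built from the file's inode, size and modification time; the Server class, the server names and all the numbers are made up for illustration and are not taken from the Yahoo article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    """One web server in a cluster, serving the same file from its own disk."""
    name: str
    inode: int   # differs per machine even when the file contents are identical
    size: int    # file size in bytes
    mtime: int   # last-modified time, seconds since the epoch

    def etag(self) -> str:
        # Assumed Apache-style format: hex inode, size and mtime, quoted.
        return f'"{self.inode:x}-{self.size:x}-{self.mtime:x}"'

    def handle(self, if_none_match: Optional[str]) -> int:
        """Return the HTTP status for a (conditional) GET."""
        if if_none_match == self.etag():
            return 304  # Not Modified: only headers go back
        return 200      # full body is sent again


# Same file on both boxes, but each disk gives it a different inode.
web1 = Server("web1", inode=0x1A2B, size=5120, mtime=1208000000)
web2 = Server("web2", inode=0x9F3C, size=5120, mtime=1208000000)

cached_etag = web1.etag()                # first fetch was served by web1
same_server = web1.handle(cached_etag)   # validation lands on web1 again
other_server = web2.handle(cached_etag)  # load balancer picked web2 instead
print(same_server, other_server)
```

The file is byte-for-byte the same on both servers, yet only the server that issued the Etag can answer with a 304.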

By removing the Etag completely (especially in LB scenarios), the browser instead validates its cached copy with just the Last-Modified header.
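
Here is a rough sketch of what validation looks like once only Last-Modified is left. It assumes the deployment keeps the same modification time for the file on every server (if it doesn't, this check has the same problem as the Etag). The validate function and the timestamps are hypothetical.

```python
from email.utils import formatdate, parsedate_to_datetime

FILE_MTIME = 1208000000  # assumed identical on every server in the cluster


def validate(server_mtime, if_modified_since):
    """Return the HTTP status a server gives for a GET when no Etag is used."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        if server_mtime <= since:
            return 304  # cached copy is still current
    return 200          # no validator sent, or the file is newer


# The first response carries this Last-Modified header; the browser echoes it
# back later as If-Modified-Since.
last_modified = formatdate(FILE_MTIME, usegmt=True)

# Two later validations, each landing on a different server.
statuses = [validate(FILE_MTIME, last_modified) for _ in ("web1", "web2")]
first_visit = validate(FILE_MTIME, None)
print(statuses, first_visit)
```

Because the date does not depend on anything specific to one machine, it does not matter which server the load balancer picks.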

I'm reiterating here what's already been said very eloquently. Be sure to read the article. They have code for fixing both Apache & IIS (link to Microsoft's article).

Monday, April 07, 2008

Keeping tests short & Data Manageable: An Example

A few days ago I wrote up a blog post with some essential points to remember while performance testing. Here's why I wrote it and why I think it's a valid approach.

We had to test a very complicated application with many facets: a Web component, a database component, lots of Windows applications, COM objects and so on and so forth.

I first asked: What is it we want to find out from this test?
The response: Given what we do and our major transactions, how will our application be affected by migrating from Db Version X to Db Version X+1?

That was it: a simple, concise objective.

What did I (we) do?
Background: All our transactions start from the Web Layer i.e the browser.

I used Coradiant to find out the most traversed paths and listed the top 50 use cases. From there, I spoke to the project manager to get a sense of which use cases are the most critical. For this particular test, all I cared about were the most critical ones. We came up with 5 (not including, of course, the Login component).
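
Coradiant did this for me, but the idea of ranking the most traversed paths can be sketched in a few lines of Python. The sessions list and the top_paths function are hypothetical and only stand in for whatever your monitoring tool exports.

```python
from collections import Counter

# Hypothetical sessions: each one is the ordered list of pages a user visited.
sessions = [
    ["/login", "/search", "/order"],
    ["/login", "/search", "/order"],
    ["/login", "/reports"],
    ["/login", "/search", "/order"],
    ["/login", "/reports"],
    ["/login", "/profile"],
]


def top_paths(sessions, n):
    """Count identical page sequences and return the n most common ones."""
    counts = Counter(tuple(pages) for pages in sessions)
    return counts.most_common(n)


ranked = top_paths(sessions, 2)
for path, hits in ranked:
    print(hits, " -> ".join(path))
```

The ranked list is only the starting point; picking the critical few from it is still a conversation with the project manager.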

Next: Develop the scripts and test them in house. We then took this highly portable script (I used VTS in Loadrunner to give me unique users and such) to the Testing Center.

What did we set up: The boxes were all set up by the DBA and Sys admin.
The scripts were so simple that it took a very short time (around 1/2 hr or less) after the boxes were set up to kick off the tests (mind you, even the scenario was created at the test center). This is what I mean by keeping the tests simple and reusable.

How did we test?
We first tested on Db Version X. We got X number of transactions and looked at critical response times.

Next we tested on Db Version X+1. We got a significantly lower number of transactions and a much larger response time for a few transactions. We were immediately able to tell the vendor that there was something missing. By keeping both the test length and the number of transactions small, we were able to give this data back immediately. That means right after we validated that our test was not bogus (by resetting all the variables and running the test again), we were able to tell our Db vendor that there was a problem. From start to finish, this took less than 2 hrs (counted from when we started running tests on the new Db version).

This was because we were able to see our data really quickly. I stress this a lot: it's important to look at your data soon after the test. The most important part of your test is the data. If you get overwhelmed with it, you're doing your entire team a disservice.

How did we solve it? We got the vendor to give us a patch which was expected in the following service pack. It immediately got our transaction count up and we were off to the races.

This is a simple validation test. There are a lot more complicated things in performance testing. The reason I show this as an example is that I believe that when people performance test, we try to get a lot of data. That is good if that's what you're looking for, but if you want to validate a build or something like this, getting overwhelmed with data is not a good thing.
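
With only a handful of transactions, the comparison itself is tiny. Here is a minimal sketch of that kind of check; the transaction names, the response times and the 25% threshold are invented for illustration and are not our real numbers.

```python
# Hypothetical average response times in seconds for each transaction.
baseline = {"login": 1.2, "search": 2.0, "checkout": 3.1}   # Db Version X
candidate = {"login": 1.3, "search": 4.4, "checkout": 3.0}  # Db Version X+1


def regressions(baseline, candidate, threshold=0.25):
    """Return transactions whose response time grew by more than threshold."""
    flagged = {}
    for name, base in baseline.items():
        change = (candidate[name] - base) / base  # relative change vs. baseline
        if change > threshold:
            flagged[name] = round(change, 2)
    return flagged


flagged = regressions(baseline, candidate)
print(flagged)
```

Three numbers per run are easy to read the minute the test ends, which is the whole point of keeping the test small.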

Caveat: If you're doing a full scale Db X to Db Y migration, then you need to have a much more comprehensive test. Also, if you're going to look for memory leaks in your application or any other kind of leak, you're probably going to need a long running test (maybe an hr or even more) to see it. (I should mention here that my colleague actually reminded me of memory leaks and the use of long running tests after my previous post, and I thought it would be good to note it here.)
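
One simple way to look for a leak in a long running test is to fit a straight line through memory samples and look at the slope. This is only a sketch with made-up samples (minute, megabytes); slope_per_minute is a hypothetical helper using plain least squares, not something from Loadrunner.

```python
def slope_per_minute(samples):
    """Least-squares slope of (minute, megabytes) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


# One sample every 5 minutes over an hour.
steady = [(m, 500 + (m % 2)) for m in range(0, 60, 5)]   # jitters around 500 MB
leaking = [(m, 500 + 2 * m) for m in range(0, 60, 5)]    # grows 2 MB per minute

steady_slope = slope_per_minute(steady)
leaking_slope = slope_per_minute(leaking)
print(round(steady_slope, 3), round(leaking_slope, 3))
```

A slope that stays clearly above zero for the whole run is a hint to dig deeper; a short test usually doesn't collect enough samples to tell growth from noise.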

Friday, April 04, 2008

Coradiant - How I use it

In my previous post, Performance Testing, I spoke about using Coradiant to figure out what to performance test. I must note here that Coradiant is only good for HTTP / HTTPS applications (basically webapps; for other apps, you can't use this tool).

So, what do I use Coradiant for?

1. Keep an eye on the application as a whole.
2. Figure out why, how and where a User is having trouble. (Especially when they call help and it gets escalated, this is just an awesome way to figure out where the trouble is - is it the network or the server)
3. Figure out what to put in my Loadrunner tests.
4. Help the sys admins by creating some really useful watch points to let us know what the error pages are. We found that our application was so clean (in terms of 404s), the only offending link was a missing css.
5. See if users having trouble are having trouble because of application issues or network issues, so that the correct people are notified.
6. Proactively find where pages are taking too long and let people know.
7. Use their Truesight & TruesightBI product to generate some Before and After fix graphs to validate any particular fix that has been moved into production.

Performance Testing...

Well, I've left my old job as a Performance Engineer and moved into a new job with some Perf Eng responsibilities. I've used Loadrunner to do my performance testing and some of the terminology may be unique to Loadrunner.

Well, this got me reflecting, and thinking how best to implement a process here so that we are able to performance test well and rapidly implement the whole process from scratch.

Here's what I came up with.
0. Figure out WHY you are testing: Response time? Capacity Planning? Figure out WHAT you want to report (Current Performance?, Performance after changes?, Available headroom on hardware?) Now you can plan....

1. First off, identify what you want to test: Sure you know that you want to test your critical use cases, but what about after that? Sit down with some users and figure out what they use the most. Or you can use a really cool tool like Coradiant to do what you need. Using a tool like Coradiant will give you the most accessed pages, and you can get user sessions to see the most used paths. Guessing will get you only so far, but if you want to regularly test your application so that your users are always happy, tools like Coradiant are very handy.

Test anything new that's going to be added to the application. Most Web applications are in a constant state of development, and anything new you add should also be tested, lest you end up with a bunch of very unhappy users.

2. Keep your tests small and repeatable: Testing 100 things at the same time will overload you with data and nothing more. You also will perhaps end up producing conditions that will never occur. That doesn't mean you should test every use case individually (which in my opinion may give you some data, but will probably never catch race conditions, deadlocks etc), it means keep a core number of tests.

When you add something new to the mix, be prepared for changes in numbers. If you are looking for exact same numbers, then you should be doing the exact same tests.

3. Length of Tests: Depending upon the application, you need to define the length of your tests. Tests that are too long mean that you'll have to wait a long time for results. Really short tests don't give you any reasonable data, because the users have not yet reached a stable state. I used to do 1 hr tests, but now I think that was overkill. A 10 min ramp up (100 VUsers) and 15 min tests would have given us just about the same amount of information as an hr long test.

4. Don't over complicate your testing: Remember your audience. Remember that you want to test whether the application will perform well under load. There are 2 different types of load:
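
To put numbers on the 10 min ramp up with 100 VUsers, here is a small sketch. It assumes users are started at an even pace and that only samples taken after the ramp up count as steady state; the function names and the sample data are hypothetical and not part of Loadrunner.

```python
RAMP_UP_S = 10 * 60   # 10 minute ramp up
STEADY_S = 15 * 60    # 15 minute measured window
VUSERS = 100


def start_times(vusers, ramp_up_s):
    """Evenly spaced start offsets (seconds) for each virtual user."""
    interval = ramp_up_s / vusers
    return [i * interval for i in range(vusers)]


def steady_state(samples, ramp_up_s):
    """Keep only (timestamp, response_time) samples taken after the ramp up."""
    return [(t, rt) for t, rt in samples if t >= ramp_up_s]


starts = start_times(VUSERS, RAMP_UP_S)

# Made-up samples: two during the ramp up, two after it.
samples = [(30, 0.8), (540, 1.1), (600, 1.4), (1200, 1.5)]
measured = steady_state(samples, RAMP_UP_S)
total_minutes = (RAMP_UP_S + STEADY_S) / 60
print(starts[1], starts[-1], measured, total_minutes)
```

That works out to one new user every 6 seconds and a 25 minute run from start to finish.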
1. Large data set
2. Large number of users on the system.

For regular testing, you need to find some sort of mid point and test. If you have written your performance tests right, you should be able to simulate 2 very easily. For 1, you will have to rely on your developers to provide an adequate data set.

Most important of all, always remember, you need to be providing useful information back to the team. Doing the same test 20 different ways will not give you useful results. Unless you define what & why you're testing, you're perhaps wasting your time.