WYSIWYG

http://kufli.blogspot.com
http://github.com/karthik20522

Monday, November 18, 2013

Speedier upload using Nodejs and Resumable.js

[updated] View source code at https://github.com/karthik20522/MultiPortUpload

Resumable.js is by far one of the best file-upload plugins I have used, followed by Plupload. Resumable.js provides offline support: if a user gets disconnected while uploading, the upload automatically resumes once they are back online. Like Plupload, it supports chunking. Node.js, on the other hand, provides non-blocking I/O, which is perfect for upload handling.

There is no real upload-speed difference between the upload plugins themselves (Resumable.js, Plupload, etc.), only a few features here and there. Recently I developed a proof of concept for speedier uploads using existing plugins and systems. Part of the research was to emulate file accelerators that upload over multiple ports, thus making uploading quicker.

Using that concept, I modified resumable.js to accept multiple URLs as an array and to upload individual chunks to the different URLs in round-robin style. On the backend I spawned Node.js on multiple ports. Out of the box, resumable.js uploads multiple chunks in parallel but only one file at a time; this limitation was overcome with a simple code change. Following are test results for various scenarios.

Note: in resumable.js, the number of simultaneous uploads was set to 3.

                      Single server,     Multiple servers,    Multiple servers +
                      single file        single file          multiple files
1 file (109 MB)       54 secs            56 secs              56 secs
59 files (109 MB)     152 secs           156 secs             17 secs


Single server, single file upload – default resumable.js configuration with a single Node.js server accepting the files/chunks.
Multiple servers, single file upload – resumable.js modified to take multiple URLs, with Node.js configured to listen on different ports (3000, 3001, 3002); when uploading chunks, resumable.js sends them to the different ports in parallel.
Multiple servers + multiple file upload – resumable.js modified to upload multiple files and multiple chunks in parallel instead of one file at a time.

But the above test results are for only 3 simultaneous connections. Modern browsers can handle more than 3 connections; following is the number of connections per server supported by the browsers of the time. The theory is that browsers open parallel connections when different domains are used, so uploading in parallel makes use of the user's full bandwidth for a faster upload.

Browser        Connections
IE 6, 7        2
IE 8           6
Firefox 2      2
Firefox 3      6
Firefox 4      6 (12?)
Safari 4       6
Opera          4
Chrome 6       7


Let’s test the above scenario with 10 simultaneous connections:

                      Single server,     Multiple servers,    Multiple servers +
                      single file        single file          multiple files
1 file (109 MB)       27 secs            18 secs              18 secs
59 files (109 MB)     156 secs           158 secs             14 secs


The server was using almost the entire bandwidth on a multi-file upload: ~1 Gbps (986 Mbps)!

As you can clearly see from the above results, having different upload endpoints (ports/host names) allows the browser to make parallel connections, since it treats each endpoint as a new host.

Advantages:
  • Customizable, in-house development
  • As fast as the user's bandwidth allows
  • Uses the Resumable.js plugin, so offline support comes for free! Win-win for everyone!
Disadvantages:
  • HTML5 only, i.e. no support for IE 9 and below!
  • The server software needs to be reliable enough to handle heavy data and I/O operations
Note: the maximum chunk size for the above tests was set to 1 MB. There is a bit of code that estimates the user's Internet speed and derives the chunk size from it; it does this by downloading a JPEG file and measuring the time taken to download it. This chunk-size calculation is just a POC.


Tuesday, February 19, 2013

JMeter - Posting JSON

Apache JMeter is probably one of the most comprehensive load/stress-testing tools on the market that is free! One of my recent tasks was to load test a WebApi service with production data in order to simulate real-world traffic. Unfortunately, there were not many articles that discussed a simple way of using JMeter to post data. Following is a step-by-step procedure to post JSON to a URI:

Step 1: Assuming you already have your root set up (a Thread Group).



Step 2: Add an "HTTP Request" sampler and fill in "Server Name or IP" with your server address and the port number, if you have one; change the protocol to http and the Method to POST. Update the Path with the URI. In the HTTP request parameters, add a new row with the Value set to "${somename}".



Step 3: Add "CSV Data Set Config" [Config Element] and add your Filename with the relative path and provide the same variable name "somename" as mentioned above. If you are not using a comma as the delimiter then add the delimiter. The options, "Recycle at EOF" when set to True would start back from the beginning of the file if the test reached the end of file.



Step 4: Add an HTTP Header Manager and set Content-Type to application/json.



Done. Hit Start (Ctrl + R) and it should start posting JSON to your service.


Sunday, June 14, 2009

ASP.NET Least Recently Used (LRU) Caching c#

I would assume that anyone who is or has been a web developer has at some point used ASP.NET's caching feature: very powerful, yet very easy to implement. One fine day I decided to start caching all the common objects on the website, such as auto-complete data, rewritten URLs and other database-intensive objects. It worked fine and helped reduce our DB load by more than 30-40%, but unfortunately all these thousands of objects started building up private bytes (memory) on the server and ended up recycling IIS (app pool recycles) more often than before, mainly because of out-of-memory exceptions.

Just to confirm the extent of the memory build-up, I wrote a simple CacheViewer handler; after 20 minutes there were more than 20K objects in server memory, many being served with a very long TTL (time to live)! Following is a screen shot of the cache size after 20 minutes (memory dump extracted using DebugDiag and analyzed with Tess's DotNetMemoryAnalysis script):




To solve this caching problem, I had to implement LRU (Least Recently Used) based caching so that only the most used/accessed data stays cached. There are many implementations of LRU caching online, but they seemed a little complicated for such simple functionality. Following is the source code of my implementation; to my surprise, it works like a charm :). Basically, the way it works is that there is a counter for each key in the cache, and the counter is incremented whenever that object/key is requested. If the LRULimit is reached, the least-accessed object is replaced by the new object. Following is a screen shot of what the LRUCache looks like:




Syntax:
[Add] LRUCache.Add("keyName", object);
[Add with TTL] LRUCache.Add("keyName", object, 100);
[Get from Cache] object value = LRUCache.Get("keyName");
[Delete from Cache] LRUCache.Clear("keyName");
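
Since the implementation itself appears above only as a screen shot, here is a minimal sketch of an LRU-style cache with the same Add/Get/Clear shape, following the counter-per-key description above; the limit value, the TTL unit (seconds) and the internal structure are my assumptions, not the original code:

using System;
using System.Collections.Generic;
using System.Linq;

// Minimal LRU-style cache sketch: every key carries a hit counter that is
// incremented on each Get; when the limit is reached, the least-accessed
// entry is evicted to make room for the new one.
public static class LRUCache
{
    private const int LRULimit = 1000;   // assumed limit, not the original value
    private static readonly object SyncRoot = new object();
    private static readonly Dictionary<string, CacheEntry> Entries =
        new Dictionary<string, CacheEntry>();

    private class CacheEntry
    {
        public object Value;
        public int HitCount;
        public DateTime Expiry;
    }

    public static void Add(string key, object value)
    {
        Add(key, value, 0);   // 0 = no expiry in this sketch
    }

    public static void Add(string key, object value, int ttlSeconds)
    {
        lock (SyncRoot)
        {
            if (!Entries.ContainsKey(key) && Entries.Count >= LRULimit)
            {
                // Evict the least-accessed key to stay within the limit.
                string coldest = Entries.OrderBy(e => e.Value.HitCount).First().Key;
                Entries.Remove(coldest);
            }

            Entries[key] = new CacheEntry
            {
                Value = value,
                HitCount = 0,
                Expiry = ttlSeconds > 0 ? DateTime.UtcNow.AddSeconds(ttlSeconds)
                                        : DateTime.MaxValue
            };
        }
    }

    public static object Get(string key)
    {
        lock (SyncRoot)
        {
            CacheEntry entry;
            if (!Entries.TryGetValue(key, out entry)) return null;
            if (entry.Expiry < DateTime.UtcNow) { Entries.Remove(key); return null; }

            entry.HitCount++;   // usage tracking drives the eviction above
            return entry.Value;
        }
    }

    public static void Clear(string key)
    {
        lock (SyncRoot) { Entries.Remove(key); }
    }
}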


Friday, June 12, 2009

HTML Strip [ RegEx vs String ]

Since I work on an extremely user-driven content website, I have to make sure that no user-entered HTML on the page breaks the CSS or the layout of the page. So we had to build an HTML-stripping function to strip out the HTML on the fly. We initially used the obvious RegEx technique, but as traffic increased, page performance/page load time started to degrade. We enabled tracing on the page to determine the most expensive operation, and while refactoring the code we realized that the HTML-stripping function was adding to the page load time.

While digging around the internet for optimized stripping code, I came across two sites: 1) DotNetPerls [http://dotnetperls.com/remove-html-tags] and 2) StackOverflow [http://stackoverflow.com/questions/473087/string-benchmarks-in-c-refactoring-for-speed-maintainability].

Both sites compare string operations with RegEx, so I implemented the technique they describe. Following is the result from Page.Trace:




As you can see from the above data, there is a huge speed difference (though it is only a matter of milliseconds!), and fewer string objects are created, which is good for memory usage.

Though the original string-index code works wonders, since we are optimizing for performance (speed and memory usage) we can refactor the code to use a StringBuilder for better memory management, as sketched below.
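
The measured code itself was shown only in the trace screenshot, so the following is a sketch of the two approaches being compared: a one-line Regex.Replace versus a single pass over the characters that copies everything outside <...> tags into a StringBuilder. The method names are mine, and the character walk is the simple form described in the DotNetPerls article, not a full HTML parser:

using System.Text;
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    // RegEx approach: concise, but slower under load and allocation-heavy.
    public static string StripTagsRegex(string source)
    {
        return Regex.Replace(source, "<[^>]*>", string.Empty);
    }

    // String/StringBuilder approach: walk the characters once and append
    // everything that is not inside a <...> tag.
    public static string StripTagsStringBuilder(string source)
    {
        var result = new StringBuilder(source.Length);
        bool insideTag = false;

        foreach (char c in source)
        {
            if (c == '<') { insideTag = true; continue; }
            if (c == '>') { insideTag = false; continue; }
            if (!insideTag) result.Append(c);
        }

        return result.ToString();
    }
}

Note that this simple walk treats anything between '<' and '>' as a tag; that is exactly the trade-off the string-based approach makes in exchange for speed.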


Thursday, June 11, 2009

Using String.Intern Benefits C#

Recently I was given the daunting task of figuring out why the web server (IIS) was recycling every couple of hours. When I analyzed the crash dump file, it was obvious that System.String objects were occupying about ~250 MB of memory and probably not being released or garbage collected fast enough. Some of the string objects were larger than 85,000 bytes, so they were stored on the Large Object Heap (LOH, collected with Gen 2) instead of Gen 0, and were probably highly fragmented. I decided to replicate the issue on my development machine, so I attached the DebugDiag memory-leak rule to my aspnet_wp.exe process (IIS 5.1 on Windows XP) and simulated traffic using tinyget (a Microsoft IIS tool) against one of my test pages, which performed string manipulation similar to the live page.



From the above screen shot we can see that the System.String type accounted for ~10 MB across 109,894 objects in memory, which looked like quite a lot just to display data from the DB with some simple string manipulation.

While digging around MSDN [http://msdn.microsoft.com/en-us/library/system.string.intern.aspx], I was introduced to the String.Intern method. Basically, .NET maintains an intern pool that contains a single reference to each unique literal string declared or created programmatically.

After I modified my application to use String.Intern (basically I added String.Intern to the property getters/setters in my business entity classes), I simulated the traffic again and created a new memory dump. Following was the result:



As you can see from the above screen shot, the System.String memory and the number of instances were reduced to ~3.5 MB and 21,447 respectively, compared to ~10 MB and over 100K objects before. Though String.Intern reduced the string objects, one main drawback is that the intern pool is not flushed until the app recycles: the pool is global, and objects in it remain for the life of the application.

Following is a sample of my BusinessEntity Class:



Syntax: string.intern("string object" or "text")
The reason why I used string.Intern(value ?? "") is that String.Intern does not accept null values. Since I added String.Intern in the getters and setters, I had to make sure the value is never null, so by default I set it to an empty string ("").
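
The BusinessEntity class itself appears above only as a screen shot, so here is a minimal sketch of the getter/setter pattern just described; the class and property names are hypothetical:

// Hypothetical entity: each string property interns its value so that rows
// repeating the same value share a single instance from the intern pool.
public class ProductEntity
{
    private string _category = "";

    public string Category
    {
        get { return _category; }
        // String.Intern throws on null, hence the value ?? "" coalesce.
        set { _category = string.Intern(value ?? ""); }
    }
}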



Another example is concatenating the same data over and over again. In my case, the constant prefix "/images/ratingBig-" was concatenated for every rating image rendered; because the same strings are built again and again, it makes sense to use String.Intern so that only one instance of each repeated string is kept in the intern pool.
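
A hypothetical reconstruction of that kind of loop might look like the following; the method shape, the rating source and the ".gif" extension are my assumptions, not the original code:

using System.Collections.Generic;

public static class RatingImageHelper
{
    private const string Prefix = "/images/ratingBig-";   // constant prefix from the post

    // Builds one image path per rating; interning the composed path means
    // rows that repeat the same rating share one pooled string instead of
    // each row holding its own copy.
    public static List<string> BuildRatingImagePaths(IEnumerable<int> ratings)
    {
        var paths = new List<string>();
        foreach (int rating in ratings)
        {
            paths.Add(string.Intern(Prefix + rating + ".gif"));
        }
        return paths;
    }
}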

I personally feel that String.Intern works wonders, or at least it works for my situation, but a word of caution about the global scope and lifetime of intern-pool objects.


Tuesday, December 25, 2007

IIS 7.0 HTTP Compression

While poking around IIS 7, I stumbled upon how easy it is to set up HTTP compression for both static files and dynamic responses. One of the advantages of compressing static files on IIS 7 is that the compressed output can be cached, so all requests for the static file are served from the cache, unlike dynamic responses, where the response data has to be compressed on the fly (a performance hit on the server CPU).
To enable static compression, do the following:
1) By default, the compression feature was not installed on my IIS, so I had to go to Windows Features in Control Panel and, under the IIS performance features, select "Static Content Compression".

2) Now, to enable compression for your site, follow these steps:
a. Open IIS Manager and navigate to the level you want to manage.
b. In Features View, double-click Compression.

c. On the Compression page, select the box next to "Enable static content compression".

d. Click Apply in the Actions pane.

3) To verify that static compression is working, I used Firebug (a Firefox plugin) and checked the response headers for Content-Encoding: gzip.
