WYSIWYG

http://kufli.blogspot.com
http://github.com/karthik20522

Monday, November 18, 2013

Speedier upload using Nodejs and Resumable.js

[updated] View source code at https://github.com/karthik20522/MultiPortUpload

Resumable.js is by far one of the best file-upload plugins I have used, followed by Plupload. Resumable.js provides an offline mode where, if a user gets disconnected while uploading, the upload automatically resumes when they are back online. Similar to Plupload, it has chunking options. Node.js, on the other hand, provides non-blocking I/O, which is perfect for upload purposes.

There is no upload speed difference between the upload plugins (resumable.js, Plupload etc.) except for a few features here and there. Recently I developed a proof of concept for speedier upload using existing plugins and systems. Part of the research was to emulate file accelerators, where multiple ports are used to upload files, thus making uploading quicker.

Using the same concept, I modified resumable.js to accept multiple URLs as an array and upload individual chunks to different URLs in round-robin style. On the backend I spawned Node.js servers on multiple ports. Resumable.js uploads multiple chunks in parallel, but not multiple files; this limitation was overcome with a simple code change. Following are test results for various scenarios.
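The round-robin selection can be sketched roughly as follows (the URLs, ports, and helper name are illustrative, not from the actual patch):

```javascript
// Hypothetical sketch: pick an upload target per chunk, round-robin.
// The ports match the Node.js servers used in the tests (3000-3002).
const targets = [
  'http://uploads.example.com:3000/upload',
  'http://uploads.example.com:3001/upload',
  'http://uploads.example.com:3002/upload'
];

// Each chunk is posted to the next endpoint in the list, so the
// browser treats every port as a separate host and opens
// parallel connections.
function targetForChunk(chunkIndex) {
  return targets[chunkIndex % targets.length];
}
```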

Note: in resumable.js, simultaneous sends option was set to 3

                     Single server,     Multiple servers,   Multiple servers +
                     single file        single file         multiple files
1 file (109MB)       54 secs            56 secs             56 secs
59 files (109MB)     152 secs           156 secs            17 secs


Single server, single file upload – default resumable.js configuration and a single Node.js server accepting files/chunks.
Multiple servers, single file upload – resumable.js modified to take multiple URLs, with Node.js configured to listen on different ports (3000, 3001, 3002); resumable.js uploads chunks to the different ports in parallel.
Multiple servers + multiple file upload – resumable.js further modified to upload multiple files and multiple chunks in parallel, instead of one file at a time.

But the above test results are for only 3 simultaneous connections. Modern browsers can handle more than 3; following is the number of connections per server supported by current browsers. The theory is that browsers open parallel connections when different domains are used, so uploading in parallel makes use of the user's full bandwidth for a faster upload.

Browser       Connections
IE 6,7        2
IE 8          6
Firefox 2     2
Firefox 3     6
Firefox 4     6 (12?)
Safari 4      6
Opera         4
Chrome 6      7


Let’s test the above scenario with 10 simultaneous connections:

                     Single server,     Multiple servers,   Multiple servers +
                     single file        single file         multiple files
1 file (109MB)       27 secs            18 secs             18 secs
59 files (109MB)     156 secs           158 secs            14 secs


The server was using almost the entire bandwidth on a multi-file upload: ~1Gbps (986Mbps)!

As you can clearly see from the above results, having different upload endpoints (ports/host-names) allows the browser to make parallel connections, since it treats each endpoint as a new host.

Advantages:
  • Customizable, in-house development
  • As fast as the user's bandwidth allows
  • Uses the resumable.js plugin for offline support! Win-win for everyone!
Disadvantages:
  • HTML5 only, i.e. no support for IE 9 and below!
  • Server software needs to be reliable enough to handle heavy data and I/O operations
Note: the maximum chunk size for the above tests was set to 1MB. There is a bit of code which determines the user's Internet speed and derives the chunk size from it; I do this by downloading a JPEG file and measuring the time taken to download it. This chunk-size calculation is just a POC.
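That calculation can be sketched as below. This is a minimal, hypothetical version: the bandwidth thresholds and chunk sizes are assumptions, not the POC's actual values (only the 1MB tier matches the test setup above).

```javascript
// Hypothetical sketch: derive a chunk size from a timed test download.
// bytesDownloaded: size of the test JPEG; millis: time taken to fetch it.
function chunkSizeForBandwidth(bytesDownloaded, millis) {
  // bits per millisecond == kilobits per second
  const kbps = (bytesDownloaded * 8) / millis;

  if (kbps > 8000) return 4 * 1024 * 1024; // fast link: 4MB chunks
  if (kbps > 2000) return 1 * 1024 * 1024; // ~1MB, as in the tests above
  return 256 * 1024;                       // slow link: 256KB chunks
}
```

The returned value would be fed to resumable.js as its chunk-size option before the upload starts.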


Thursday, March 15, 2012

Async WCF web-Services

In the world of scalable programming, it's all about event-driven, or asynchronous, programming. Event-driven, callback-based platforms like Node.js have taken the programming world by storm, and a few .NET-based open source event-driven servers like KayakHttp (OWIN) have their uses, but when it comes to ASP.NET MVC or WCF, asynchronous programming can be achieved using a Task-based approach. Do remember that the asynchronous programming model requires a good data access/interaction design. A good async design can provide better scalability and potentially higher server throughput. Note: higher throughput means the server handles more requests, not that each request executes faster (though in some cases it does).

A WCF service with async operations can provide higher server throughput, since the server is no longer waiting for an operation to complete before serving the next request. Of course, an async design pattern adds complexity to the system. Following is how to provide an async operation.

Step 1, in the service interface we need to let the OperationContract know it's an async operation.



Step 2 is to make the call asynchronous by providing Begin and End operations. The Begin method is called when the operation starts, and the End method is the callback invoked when the operation completes. Following is the same function as above, but with Begin and End.
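The original code screenshots did not survive the export; the contract would look roughly like the following hedged reconstruction, assuming a string GetData(int) operation (the name matches the later discussion):

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IDataService
{
    // Step 1's attribute: AsyncPattern = true marks the operation as
    // asynchronous. Step 2's shape: Begin starts the work, End is the
    // callback that hands back the result.
    [OperationContract(AsyncPattern = true)]
    IAsyncResult BeginGetData(int value, AsyncCallback callback, object state);

    // No [OperationContract] here; WCF pairs EndGetData with
    // BeginGetData by naming convention.
    string EndGetData(IAsyncResult result);
}
```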




Step 3, once the operations are modified with Begin and End methods in the service interface, we need to build out the functions in the actual service class.
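The service-class screenshot is also missing; a sketch consistent with the surrounding explanation (Task.Factory.StartNew plus ContinueWith for the callback), with the GetData body assumed:

```csharp
using System;
using System.Threading.Tasks;

public class DataService : IDataService
{
    // The actual business logic, run on a worker thread.
    private string GetData(int value)
    {
        return "You entered: " + value;
    }

    public IAsyncResult BeginGetData(int value, AsyncCallback callback, object state)
    {
        // Spawn a Task<string> running GetData; 'state' is passed
        // through so WCF can correlate the call.
        var task = Task<string>.Factory.StartNew(s => GetData(value), state);

        // Fires once GetData returns (or throws an unhandled exception)
        // and hands the completed task to WCF's callback, which in turn
        // triggers EndGetData.
        if (callback != null)
            task.ContinueWith(t => callback(t));

        return task;
    }

    public string EndGetData(IAsyncResult result)
    {
        // Unwrap the task and return its value to the calling client.
        return ((Task<string>)result).Result;
    }
}
```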




In the BeginGetData function, a new async Task with a return value of "string" is spawned, and once GetData has executed, the callback function is invoked. Task.ContinueWith is triggered only once the value is returned or an unhandled exception is thrown. The EndGetData function simply takes the result and returns it to the calling client.

An example of calling client:
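The client screenshot is likewise missing; a hypothetical sketch, assuming a generated proxy class named DataServiceClient that exposes the same Begin/End pair:

```csharp
using System;

class Program
{
    static void Main()
    {
        // DataServiceClient is the assumed svcutil-generated proxy.
        var client = new DataServiceClient();

        client.BeginGetData(42, ar =>
        {
            // Runs once the service operation has completed.
            string result = client.EndGetData(ar);
            Console.WriteLine(result);
        }, null);

        Console.ReadLine(); // keep the process alive for the callback
    }
}
```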




To learn more about Tasks, MSDN is a good starting point: http://msdn.microsoft.com/en-us/library/system.threading.tasks.task.aspx
