  • Welcome to the Hamsters.js wiki!

    Welcome to the new and improved Hamsters.js documentation! We now have JavaScript code examples for different versions of the library and a far more in-depth explanation of how the library works under the hood. You can view code examples in the dark areas under each section and use the menu on the left to quickly navigate or search the wiki.

    If you have any questions or suggestions do not hesitate to contact us so we can improve the wiki.

    Features & Benefits

    Hamsters.js provides several unique features, detailed below, that are especially helpful for managing multi-threading and parallel processing. We hope these features spark your interest in making use of the library and help us reach our goal of bringing JavaScript multithreading and parallelism to an even wider audience.

    Intelligent Execution & Thread Management

    Traditionally, multi-threading and parallel processing are complicated to implement in an application due to a myriad of issues with asynchronous timing and inconsistent thread outputs. The library abstracts all of this complexity away and handles the timing and thread synchronization for you, allowing individual functions to run on their own threads, or even across multiple threads for parallel execution.

    Thread Pool System

    The library includes a fully featured thread pool system, allowing extremely flexible and scalable operation regardless of the number of logical cores available on a system. When initialized, the library detects the number of logical processors available on the host system and spawns the same number of threads. This pool is intelligently managed so that the same threads are used and reused, completely eliminating the overhead of creating new threads during execution. It also includes an intelligent queue system that keeps pending work in order and keeps threads busy until all work is completed.

    Memoization (Result Caching)

    Memoization is a feature aimed at reducing processor usage and battery drain, with mobile and ultrabook users in mind. If you know you will be performing the same calculation multiple times, enabling this feature instructs the library to save the function output, allowing results for individual functions to be returned from a local cache instead of wasting CPU cycles and power doing the same work again.

    Automagical Sorting

    Automagical sorting is another useful feature designed to make using the library even easier. It can sort the final execution results of a function either alphabetically or numerically, regardless of the number of threads your function has been executed across, reducing the number of steps needed to get your desired output.

    Installing Hamsters

    HTML

    1. Download a copy of the latest release version, or clone the repo locally
    2. Upload the contents of the src directory to your server and add the script to your project as described below
      <!-- HTML4 and (x)HTML -->
      <script type="text/javascript" src="path/to/hamsters.js"></script>
    
      <!-- HTML5 -->
      <script src="path/to/hamsters.js"></script>
    

    React Native

    1. Download a copy of the latest release version, or clone the repo locally
    2. Add the contents of the src directory to your project and import the library as described below
      import hamsters from 'path/to/hamsters';
    

    Once you've installed a third party worker implementation, simply pass Worker as an argument in the library initialization start options

      import Worker from '...';
      import hamsters from 'hamsters.js';
    
      hamsters.init({
        Worker: Worker
      });
    

    Node

    1. Use npm to install hamsters.js: npm install --save hamsters.js
    2. Import the library into your app.js file like below
      var hamsters = require('hamsters.js');
    

    Once you've installed a third party worker implementation, simply pass Worker as an argument in the library initialization start options

      var Worker = require('...').Worker;
      var hamsters = require('hamsters.js');
    
      hamsters.init({
        Worker: Worker
      });
    

    Bower

    You can also use bower to install the library in other environments, though support is not guaranteed; submit a ticket if you encounter problems.

      bower install WebHamsters
    

    NPM

    You can also use npm to install the library in other environments, though support is not guaranteed; submit a ticket if you encounter problems.

      npm install hamsters.js
    

    Next Steps

    Once you've downloaded and added the library to your project you should have a variable named hamsters available; this variable is the core of the library. Do not create any globally scoped variables with the same name, or you risk causing compatibility issues. Now that you've successfully added the library to your project, continue reading to begin using the library.

    Getting Started

    Hamsters.js attempts to mimic exactly how you would normally write functions in JavaScript, in order to make threading your functions feel as natural as possible in your everyday workflow. The library is traditionally invoked by calling a function named hamsters.run; this function takes several arguments that are paramount to making multi-threading easy on you. These arguments are described below and will be important to understand moving forward.

    The first thing to understand is that Hamsters.js is a message passing interface at its core, thus when invoking functions with the library we need to instruct the library how to operate by passing a params object (message) to the library.

        var params = {
            bar: 'foo'
        };
    
        hamsters.run(params, ....);
    

    The next argument is the logic we want executed inside a thread (or threads); the params object we passed before will be accessible within the context of our function. You should now be able to see how to ensure things like variables and functions can be accessed within your threads.

        hamsters.run(params, function() {
            var foo = params.bar;
        }); 
    

    The third and final argument is our onSuccess callback method; the only argument passed to this function is your output.

      hamsters.run(params, function() {
         var foo = params.bar;
      }, function(results) {
         console.log(results);
      });
    

    Back to the original params object: there are some conventions to follow in order to get the best performance and reliability out of the library. Hamsters.js was built with the goal of parallelism rather than concurrency; though the library does both very well, the primary goal was parallel execution. Various design decisions were made to assist in that aim. One of those decisions is how the library splits data between threads for execution; therefore, any array that you want accessed within more than one thread must use the key array within your params object.

        var params = {
            array: [1, 2, 3, 4]
        };
    
        hamsters.run(params, function() {
          for(var i = 0; i < params.array.length; i++) {
            rtn.data.push(params.array[i] * 4);
          }
        }, function(results) {
    
        });
    

    Using this convention makes it extremely simple to parallelize the method above by changing one option in your params object. With the version below, 4 threads will complete the same task, each thread operating on a single element of the array.

        var params = {
            array: [1, 2, 3, 4],
            threads: 4
        };
    
        hamsters.run(params, function() {
          for(var i = 0; i < params.array.length; i++) {
            rtn.data.push(params.array[i] * 4);
          }
        }, function(results) {
    
        });
    

    Hopefully you can now start to see the power the library gives you. Taking things a step further, the library uses an internal return object called rtn; this rtn object gives the library a consistent way to handle thread outputs. Thus, when we want to return a value from a thread we need to push our results into the rtn.data array. Alternatively, you can assign your output directly to rtn.data, but only if your output is already an array.

     hamsters.run(params, function() {
       rtn.data.push(params.bar);
     }, function(results) {
        console.log(results); // 'foo';
     });
    

    Now that you've seen how to make use of the library, let's break down the available options for execution.

     var params = {
        threads: Integer,
        aggregate: Boolean,
        dataType: String,
        memoize: Boolean,
        sort: String
     };
    
    1. threads This optional argument tells the library how many threads to execute the declared function across, making it very easy to change how many threads you are using. If you do not supply a value the library defaults to 1.

    2. aggregate This optional argument tells the library whether or not to aggregate the individual thread outputs together after execution; this is only relevant when executing across multiple threads, and defaults to false.

    3. dataType This optional argument informs the library that our data array is one of JavaScript's typed arrays; when making use of this argument the library will automatically format your output to match the specified dataType.

    4. memoize This optional argument is intended to be used in conjunction with memoization mode. When memoization mode is enabled, this argument controls on an individual function level whether or not the results from that function are cached; it defaults to false.

    5. sort This optional argument tells the library to automatically sort your final output either alphabetically or numerically. It defaults to null and can be configured using the sorting options.

    Anything else included in the params object will be accessible within the execution context of a thread or multiple threads depending on how you use the library.
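    Putting the options together, a fully populated params object might look like the sketch below; the values are illustrative, and scale is a made-up custom key used only to show pass-through behavior.

```javascript
// Example params object combining the reserved options above with
// a custom value; 'scale' is a made-up key for illustration and is
// passed through to the thread as params.scale.
var params = {
  array: [4, 2, 3, 1], // input data, split between threads
  threads: 2,          // execute across two threads
  aggregate: true,     // merge the two thread outputs into one
  dataType: 'Int32',   // format the final output as Int32
  memoize: false,      // skip the result cache for this call
  sort: 'asc',         // sort the merged output numerically ascending
  scale: 4             // custom value, accessible inside the thread
};
```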

    Restructuring Standard Functions

    Hamsters.js makes use of parameter based execution; the params object is a main component of using the library. Remember, behind the scenes thread communication takes place using a message passing interface (MPI), so we must send everything we want accessible within the context of a thread.

    
      function() {
        var array = [0,1,2,3,4,5,6,7,8,9];
        var results = [];
        array.forEach(function(item) {
          results.push((item * 120)/10);
        });
        console.log(results);
      }
    
    

    Now we can put this task onto its own thread like so

    
      //1 thread and do not aggregate thread results (only one thread output)
      function() {
        var params = {
         array: [0,1,2,3,4,5,6,7,8,9],
         threads: 1
        };
        hamsters.run(params, function() {
           var arr = params.array;
           arr.forEach(function(item) {
             rtn.data.push((item * 120)/10);
           });
        }, function(results) {
          console.log(results);
        });
      }
    
    

    Alternatively, we can split this task among 2 threads for parallel execution like so

    
      //2 threads and let's aggregate our individual thread results into one final output
      function() {
        var params = {
          array: [0,1,2,3,4,5,6,7,8,9],
          threads: 2,
          aggregate: true
        };
        hamsters.run(params, function() {
            var arr = params.array;
            arr.forEach(function(item) {
              rtn.data.push((item * 120)/10);
            });
        }, function(results) {
          console.log(results);
        });
      }
    
    

    We can even define a function to split across all available threads like so

    
      //All threads and let's aggregate our individual thread results into one final output
      function() {
        var params = {
          array: [0,1,2,3,4,5,6,7,8,9],
          threads: hamsters.maxThreads,
          aggregate: true
        };
        hamsters.run(params, function() {
            var arr = params.array;
            arr.forEach(function(item) {
              rtn.data.push((item * 120)/10);
            });
        }, function(results) {
          console.log(results);
        });
      }
    
    

    Promises

    New in v5.0.0 is native promise support; you can make use of the library in exactly the same way as before, only now with more sequential control over execution.

        var params = {
            array: [1, 2, 3, 4]
        };
        hamsters.promise(params, function() {
            for(var i = 0; i < params.array.length; i++) {
                rtn.data.push(params.array[i] * 4);
            }
        }).then(function(results) {
            console.log(results);
        }).catch(function(error) {
            console.error(error);
        });
    

    This also enables the use of async/await, giving you the power to take the above example and make it even easier.

        var params = {
            array: [1, 2, 3, 4]
        };
        // await must be used within an async function
        var results = await hamsters.promise(params, functionToRun);
    

    Sorting

    Sorting, introduced in version 2.7, is an optional parameter which allows for automatic sorting of the data contained within the rtn.data array.

    1. Numerical Sorting
      • asc
      • desc
    2. Alphabetical Sorting
      • ascAlpha
      • descAlpha
    
      function() {
        var params = {
          array: [0,1,2,3,4,5,6,7,8,9],
          threads: 2,
          aggregate: true,
          dataType: 'Int32',
          sort: 'ascAlpha'
        };
        hamsters.run(params, function() {
          params.array.forEach(function(item, index) {
            rtn.data[index] = ((item * 120) / 10);
          });
        }, function(results) {
          console.log(results);
        });
      }
    
    

    Memoization

    Memoization, introduced in version 2.2, is an optional operating mode for the library as well as an optional parameter for individual functions. If you know you will be performing the same calculation over and over, making use of memoization can dramatically reduce the CPU cycles your application consumes: after the first computation, future requests for the same function with matching input will return the result from cache.

    The implementation has changed several times. Versions 2.2 through 3.9.7 made use of sessionStorage and were limited to roughly 5MB of cache depending on the browser. Versions 3.9.8 and later implement an in-memory cache that is only limited by how much memory can be allocated to the library.
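    As a rough illustration of what such an in-memory cache looks like, the sketch below keys results on the serialized input; this is a minimal hypothetical example, not the library's actual implementation.

```javascript
// Tiny in-memory memoization cache: results are stored against the
// stringified input, so a repeat call with matching input is a hit.
function createCache() {
  var store = {};
  return {
    get: function(input) { return store[JSON.stringify(input)]; },
    set: function(input, output) { store[JSON.stringify(input)] = output; }
  };
}

var cache = createCache();
cache.set([1, 2, 3], [4, 8, 12]);
cache.get([1, 2, 3]); // cache hit: returns [4, 8, 12]
```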

    In order to enable memoization mode you should set hamsters.cache to true, like below.

    Version 4.0.0 and below

     hamsters.cache = true;
    

    Version 4.1.0 and above

      hamsters.init({
        cache: true
      });
    

    In order to enable caching for individual functions you must set the optional memoize parameter to true.

    function() {
      var params = {
        array: [0,1,2,3,4,5,6,7,8,9],
        threads: 2,
        aggregate: true,
        dataType: 'Int32',
        memoize: true
      };
      hamsters.run(params, function() {
        var arr = params.array;
        arr.forEach(function(item, index) {
          rtn.data[index] = ((item * 120) / 10);
        });
      }, function(results) {
        console.log(results);
      });
    }
    

    Transferable Objects

    To obtain the best performance possible, version 2.0 introduced an optional dataType argument; if your problem array is any one of JavaScript's typed arrays you can see up to a 10x performance boost over previous releases. If you do not know what typed arrays are, please take a look at this guide: Typed Arrays

    Starting in version 4.1.0, any items passed into the hamsters.run parameters object that contain an ArrayBuffer will automatically make use of Transferable Object support when available, regardless of the supplied dataType argument value. This means that supplying the optional dataType argument is only required to ensure your final output matches the supplied input array type.

    You may write a function to make use of these like so

    
    function() {
      var params = {
        array: new Float64Array([0,1,2,3,4,5,6,7,8,9]),
        threads: 2,
        aggregate: true,
        dataType: 'Float64',
      };
      hamsters.run(params, function() {
        var arr = params.array;
        arr.forEach(function(item, index) {
          rtn.data[index] = ((item * 120) / 10);
        });
      }, function(output) {
        invokeFunction(output);
      });
    }
    
    

    Where dataType corresponds to one of JavaScript's typed array types, such as 'Int8', 'Uint8', 'Int16', 'Uint16', 'Int32', 'Uint32', 'Float32', or 'Float64'.

    Persistence

    Persistence, introduced in version 3.3, is an optional operating mode which is enabled by default and can dramatically reduce runtime latency at the cost of somewhat higher heap allocation. When enabled, the library spawns all threads on initialization and reuses them. When disabled, the library instead spawns threads on demand and destroys them once they've completed execution. It is recommended you keep this enabled unless you are developing for memory constrained systems or do not require real time performance. You can disable it easily by setting hamsters.persistence to false.

    Version 4.1.0 and below

        hamsters.persistence = false;
    

    Version 4.1.1 and above

        hamsters.init({
          persistence: false
        });
    

    Thread Pool

    The library includes a fully featured thread pool system allowing extremely flexible library operation regardless of the logical cores available on a system. The thread pool is a first-in first-out system, meaning that if there are pending tasks waiting for an available thread, the library will immediately make use of the first thread that returns an output. There are a few configuration options you can control, either on an individual function level or globally.

    The first of these, maxThreads, is a global setting limiting the number of threads the library will use during operation; it is independent of the number of threads you define for a specific function. A global thread limit of 1 does not limit the number of threads a function can be invoked across: a function invoking 32 threads will still be split across 32 threads, and pending work will be queued, keeping the single available global thread active until all 32 tasks have completed.

    By default this option is set to the number of logical cores on the system, or 4 if the library is unable to detect the number of logical cores available. The option has no enforced upper limit, though it should be noted that some environments, namely Firefox, enforce a per origin thread limit of 20. You should probably leave the default or set a realistic amount, as most workloads are not going to require more than 4 threads.

    You can set this value by simply passing the option within your library init call.

      hamsters.init({
        maxThreads: 16 
      });
    

    Independent of the setting above, you can also define the number of threads to split a given function across (assuming your input data can be broken into smaller chunks). This option is likewise not arbitrarily limited by the library, and thread counts of up to 500 have been tested in the past. It's recommended to keep thread counts low, since every additional thread increases overhead and you will eventually encounter diminishing returns.

    Also, if your input data cannot be split into reasonably sized chunks you will encounter errors. For example, if you only have 100 integers to work with and you ask the library to split your task across 120 threads, you've now created 20 threads with no data to work with, since 100 integers can occupy at most 100 threads each computing a single integer.
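    One way to avoid those empty threads is to cap the requested thread count by the input length; safeThreadCount below is a hypothetical helper for illustration, not a library function.

```javascript
// Clamp a requested thread count so every thread receives at least
// one element of the input array (and never drop below one thread).
function safeThreadCount(arrayLength, requestedThreads) {
  return Math.max(1, Math.min(requestedThreads, arrayLength));
}

safeThreadCount(100, 120); // 100 threads, one integer each
safeThreadCount(1000, 8);  // 8 threads as requested
```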

    You can easily change this value by changing the threads value in your params object.

    //2 threads and let's aggregate thread results into a single output
    function() {
      var params = {
        array: [0,1,2,3,4,5,6,7,8,9],
        threads: 2,
        aggregate: true
      };
      hamsters.run(params, function() {
        var arr = params.array;
        arr.forEach(function(item) {
          rtn.data.push((item * 120)/10);
        });
      }, function(results) {
        console.log(results);
      });
    }
    

    Debugging

    The library supports two modes of debugging, each providing useful information which may assist in fine tuning performance and uncovering issues with your logic.

    You can enable debug mode by setting hamsters.debug to true or 'verbose', like below. Verbose mode produces quite a large amount of console output and should be used with that in mind. Normal debug mode is useful for performance profiling; verbose mode will introduce its own slight performance drawbacks.

    Version 4.0.0 and below

        hamsters.debug = 'verbose';
    

    Version 4.1.0 and above

        hamsters.init({
          debug: 'verbose'
        });
    

    Limitations

    Hamsters.js makes use of Web Workers to accomplish its multi-threading functionality; due to the sandboxed nature of Web Workers there are some limitations in how you can make use of them. Hamsters.js continually works to minimize the limitations imposed on the library's functionality, however some things cannot be bypassed.

    You cannot make use of localStorage or sessionStorage from within a thread, as this would expose the main thread to changes made within another thread. Additionally, you cannot access the DOM (Document Object Model) from within a thread for the same reason; changes to the DOM can only be made from the main thread.

    When not making use of transferables, data passed between the main page and workers is copied, not shared. Objects are serialized as they're handed to the worker and subsequently de-serialized on the other end. The page and worker do not share the same instance, so the end result is that a duplicate is created on each end. Most browsers implement this feature as structured cloning.
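    The copy semantics described above can be observed directly with the structuredClone function that modern browsers and Node.js 17+ expose; this is just an illustration of the algorithm, as the library performs the cloning implicitly when messages cross the thread boundary.

```javascript
// Structured cloning produces an independent duplicate, not a
// shared reference, mirroring what happens when data is handed
// to a worker without a transferable.
var original = { values: [1, 2, 3] };
var copy = structuredClone(original);

copy.values.push(4);

// The original is untouched by changes made to the copy.
console.log(original.values); // [1, 2, 3]
console.log(copy.values);     // [1, 2, 3, 4]
```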

    The above means there are some limitations in what data can and cannot be passed to a thread; the known limitations are those of the structured clone algorithm, though some may have a workaround at the cost of slower performance.

    Lastly, due to a bug in how JavaScript handles data aggregation, if you wish to have your individual thread outputs aggregated into a final result, the maximum number of threads any single function can invoke is 20. There is no limit on thread count if you are not asking the library to aggregate your individual thread outputs back together. Coincidentally, Firefox enforces a per origin thread limit of 20, so on systems with more than 20 logical cores the maximum thread count will be limited to 20 when using Firefox; functions invoking more than 20 threads will have threads pooled until execution is complete.
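    A hypothetical guard reflecting those limits might look like the sketch below; clampThreads is not part of the library's API, only an illustration of the rule.

```javascript
// Cap aggregated runs at the 20-thread aggregation limit noted
// above; non-aggregated runs are left unrestricted.
function clampThreads(requested, aggregate) {
  var limit = aggregate ? 20 : Infinity;
  return Math.min(requested, limit);
}

clampThreads(32, true);  // 20: aggregation caps the thread count
clampThreads(32, false); // 32: no aggregation, no cap
```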

    Performance Considerations

    Not every task can be easily parallelized, and depending on the size of the task, putting it onto its own thread may introduce its own performance drawbacks: any benefit may be outweighed by the overhead of the runtime itself. I highly recommend, especially for a larger scale application, that you spend some time learning about Amdahl's Law

    Alternatively, if your problem size scales with the number of threads you use, you can see some serious performance gains; this is known as Gustafson's Law, and you can read more about it at Gustafson's Law. Also be sure to check out this performance example demonstrating the performance boost additional threads can have.

    The library attempts to detect the number of available cores on a client machine and formulates a maximum concurrent thread count based on that value; if the library is unable to detect a valid core count it falls back to a maxThreads count of 4.
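    The detection and fallback can be sketched as follows; navigator.hardwareConcurrency is the standard browser API for the logical core count, though the library's actual detection logic may differ.

```javascript
// Sketch of the core-count fallback: use the environment's reported
// logical core count when available, otherwise assume 4 threads.
function detectMaxThreads() {
  if (typeof navigator !== 'undefined' && navigator.hardwareConcurrency) {
    return navigator.hardwareConcurrency;
  }
  return 4; // no valid core count detected
}
```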

    The library will automatically pool and manage execution across all available threads. With persistence mode disabled, the library creates threads on demand and destroys them when they have no pending work to complete. With persistence mode enabled, explicit threads are reused, keeping threads as active as possible; this is the default, as it provides a significant performance boost for real time applications.

    Threads are not the same as logical cores. Assuming your machine has 4 logical cores, you can ask the library to split a given task across exactly 4 threads; however, there is no guarantee that each thread will have access to its own core for execution. The operating system manages thread allocation to individual cores; Hamsters.js simply manages splitting tasks across individual threads, and the library cannot control how the OS schedules those threads once they are handed off for execution.

    With the above in mind, design decisions were made to maximize performance in parallel tasks especially on lower end machines. One of these decisions revolved around how the thread pool system handles pending work thread allocation, which is a first in - first out system.

    As an example, your application may have 8 threads available for processing. Assuming a function is invoked spanning 16 threads, the first 8 threads will begin work, and the first threads to return will immediately be used to process the remaining queued work. Thus, assuming an even distribution of work across all threads, threads #0 and #1 will be reused significantly more often than threads #2 and #3, with this trend continuing downward, making thread #7 the least reused thread. This design choice allows the execution engine to optimize the boilerplate logic within these threads, making threads more performant as they're used, ensuring the library truly scales with your demands while staying optimized for smaller tasks that may not require all of a system's resources.
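    The first-in first-out allocation described here can be sketched as a simple queue; this is illustrative only, as the library's internal pool is more involved.

```javascript
// Minimal first-in first-out work queue: tasks are taken in the
// order they were added, so whichever thread frees up first always
// receives the oldest pending task.
function createQueue() {
  var pending = [];
  return {
    enqueue: function(task) { pending.push(task); },
    next: function() { return pending.shift(); }, // oldest task first
    size: function() { return pending.length; }
  };
}

var queue = createQueue();
queue.enqueue('task A');
queue.enqueue('task B');
queue.next(); // 'task A' is handed out before 'task B'
```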

    Also remember that JavaScript is an interpreted language. The library has been built and optimized to be as fast as possible, but there are unavoidable situations that will impact performance, such as garbage collection, how well the browser can optimize hot spots in the logic, and the seemingly random nature of JavaScript performance; results may vary.

    Environment Support

    Hamsters.js attempts to be as isomorphic as possible, meaning any logic written for the library should transition seamlessly to any platform where JavaScript can be executed. This does not mean, however, that every device will have full support for all features of the library; many environments, especially older versions of Internet Explorer, do not have support for workers, which presents some caveats.

    Environments that have full support for workers and transferable objects will see the greatest performance, as operations can be sent to available threads with almost no overhead. Some environments support workers but not transferable objects; in those environments data will be serialized and copied (read: duplicated) to threads instead. This means additional overhead, though in practice for the majority of tasks the impact on overall performance is very small, as large parallelized tasks typically have a long execution time and the benefits of spreading the work across multiple cores far outweigh the overhead.
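    A rough feature-detection sketch of these support tiers is shown below; this is illustrative only and is not the library's actual detection logic.

```javascript
// Classify an environment into the three support tiers described
// above: full workers with transferables, workers with copied
// messages, or the simulated-thread legacy fallback.
function detectTier(scope) {
  if (typeof scope.Worker === 'undefined') {
    return 'legacy';               // simulated async threads
  }
  if (typeof scope.ArrayBuffer === 'undefined') {
    return 'workers';              // messages serialized and copied
  }
  return 'workers-transferable';   // zero-copy transfers possible
}

detectTier({}); // 'legacy'
```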

    In environments that do not support workers, the library makes use of a legacy fallback mode which simulates the async nature of real threads; library operation is exactly the same and the fallback is seamless to the end user. These users will not see the performance benefits of multiple threads. However, the simulated threading can in some instances provide a very small performance boost due to computations being run asynchronously, and it also adds a level of concurrency for legacy users, allowing the library to hand tasks off to the execution engine while new tasks can still be added to the pipeline.

    Tested Devices & Browsers

    Devices

    Browsers