Need advice on a specific scenario

  • Question

  • Hello Everyone,

         I have noticed that a common scenario in HPC is to process slightly different data with the same algorithm. My scenario is the reverse: processing exactly the same data with different algorithms. More specifically, I am working on a financial application that analyzes streaming stock quote data in real time, so I need services or executables that compute various trend lines, moving averages, and so on. At a high level, how would I go about processing the same data stream in multiple ways?

    Thanks in advance,

    Bob

    Thursday, May 5, 2011 6:03 PM

Answers

  • Hi Bob,

    It seems like you have two ways to approach this problem:

    1.) If you can only have a single connection to the incoming data feed, you could use the SOA infrastructure and start as many sessions as you have algorithms. Each session could have a single core allocated to it, which ensures that only one service instance handles that session's requests and lets that instance build up state. You would need to build a client executable that creates these sessions and attaches to the data feed. Then, for each incoming piece of data, the client sends a copy to every session (where each session hosts a different algorithm).

    You could either package all the algorithms together into a single SOA service and use different service operations to distinguish between them, or package each algorithm as a separate service DLL and distinguish at the session level.

    2.) If you can have any number of connections to the incoming data feed, you could just use a single job with a separate task for each algorithm. Each task would then need to make its own connection to the data feed and process the incoming data independently.
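
The fan-out in option 1 can be sketched in a few lines. This is a minimal, language-neutral illustration in Python and does not use the HPC Pack or SOA session API; the class names `MovingAverage` and `RunningHigh`, the `on_tick` method, and the `fan_out` helper are all hypothetical. Each algorithm object stands in for one session that keeps its own state, and the loop stands in for the client that copies every incoming quote to every session.

```python
from collections import deque


class MovingAverage:
    """Simple moving average over the last `window` prices (hypothetical example)."""

    def __init__(self, window):
        # deque with maxlen drops the oldest price automatically
        self.prices = deque(maxlen=window)

    def on_tick(self, price):
        self.prices.append(price)
        return sum(self.prices) / len(self.prices)


class RunningHigh:
    """Highest price seen so far (hypothetical example)."""

    def __init__(self):
        self.high = None

    def on_tick(self, price):
        if self.high is None or price > self.high:
            self.high = price
        return self.high


def fan_out(feed, algorithms):
    """Send a copy of every tick to every algorithm and collect results by name."""
    results = {name: [] for name in algorithms}
    for price in feed:
        for name, algo in algorithms.items():
            results[name].append(algo.on_tick(price))
    return results


# Stand-in for the live feed: a short list of prices.
algorithms = {"sma3": MovingAverage(3), "high": RunningHigh()}
results = fan_out([10.0, 12.0, 11.0, 15.0], algorithms)
```

Because each algorithm keeps its own state and never reads another's, the inner loop is the only point where the algorithms meet, which is what makes it straightforward to move each one to its own session or core.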

     

    Jeremy

    Thursday, May 12, 2011 6:02 PM

All replies

  • One job with multiple tasks, meaning each task is a different executable?  Anyone?
    Saturday, May 7, 2011 6:30 PM
  • HPC (Parametric Sweep and SOA) is usually used with different data and the same algorithm. However, there is nothing to stop you from running different algorithms on the same data.

    The questions are: how big is your data, and how many algorithms do you want to apply to it? I assume the algorithms are all isolated from one another, so there is no data contention and you should be able to scale out.

    Tuesday, May 10, 2011 3:14 AM
  • A new piece of data comes in about once a second, I guess, and it would need to go to each algorithm. The algorithms do not communicate with one another, only with the head node. The data piece itself is small, usually less than 20 bytes. Basically, I want to have one algorithm per core. Any thoughts on a specific way I might implement this? One job that has one task for each algorithm, with each task set to run on one core? That's my best idea so far; let me know if you have a better suggestion.

    Bob

    Tuesday, May 10, 2011 3:59 AM
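
One way to picture the "one job, one task per algorithm" idea is a single executable that selects its algorithm by name, so each task differs only in its argument rather than in its binary. The sketch below is plain Python under that assumption and does not use any HPC Pack API; the registry `ALGORITHMS`, the names `"mean"` and `"low"`, and the `run_task` function are hypothetical, and a list of prices stands in for the task's own connection to the feed.

```python
def running_mean():
    """Return a stateful step function that tracks the mean of all prices so far."""
    total, count = 0.0, 0

    def step(price):
        nonlocal total, count
        total += price
        count += 1
        return total / count

    return step


def running_low():
    """Return a stateful step function that tracks the lowest price so far."""
    low = None

    def step(price):
        nonlocal low
        low = price if low is None else min(low, price)
        return low

    return step


# Hypothetical registry: algorithm name -> factory that builds fresh state.
ALGORITHMS = {"mean": running_mean, "low": running_low}


def run_task(algorithm_name, feed):
    """Body of one task: pick one algorithm by name and process the whole feed."""
    step = ALGORITHMS[algorithm_name]()
    return [step(price) for price in feed]


# Each call stands in for a separate task reading the same feed independently.
mean_out = run_task("mean", [10.0, 12.0, 11.0])
low_out = run_task("low", [10.0, 12.0, 9.0])
```

In a real job the algorithm name would presumably come from each task's command line, and the feed would be a live connection instead of a list.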
  • Jeremy,

         Thanks, I'll give both of those a try and see which one works best for my scenario.

    Bob

    Thursday, May 12, 2011 7:59 PM