I am using LINQ to HPC on a cluster with 8 compute nodes (500 cores in total).
I have a final fileset that contains around 200K lines, and each line needs to be matched against a collection of regexes. The collection contains around 10 million regular expressions, loaded into memory from a text file. After
evaluating each row in the fileset, at least one column in the row gets its value from a regex match.
This process takes around 2 days. Does anyone have any idea why it is taking so long? Doesn't HPC divide this calculation among all the compute nodes so that it runs more efficiently?
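To put the workload in numbers, here is a rough back-of-envelope sketch in Python (not my actual LINQ to HPC job). It assumes that every line is tested against every regex and that the work is spread evenly over all 500 cores; both are assumptions, not something I have measured.

```python
# Back-of-envelope estimate of the workload described above.
# Assumptions (not measured): brute-force matching of every line against
# every regex, and a perfectly even spread of work over all cores.
LINES = 200_000              # around 200K lines in the fileset
REGEXES = 10_000_000         # around 10 million regular expressions
CORES = 500                  # total cores across the 8 compute nodes
ELAPSED_SECONDS = 2 * 24 * 60 * 60  # around 2 days


def estimate(lines, regexes, cores, elapsed_seconds):
    """Return (total evaluations, evaluations per core per second,
    microseconds per evaluation) under the assumptions above."""
    total_matches = lines * regexes
    per_core_per_second = total_matches / (cores * elapsed_seconds)
    microseconds_per_match = 1e6 / per_core_per_second
    return total_matches, per_core_per_second, microseconds_per_match


total, rate, us = estimate(LINES, REGEXES, CORES, ELAPSED_SECONDS)
print(f"total regex evaluations: {total:.2e}")
print(f"evaluations per core per second: {rate:,.0f}")
print(f"time per evaluation: {us:.1f} microseconds")
```

Under those assumptions the job is about 2 × 10^12 regex evaluations, which works out to roughly 43 microseconds per evaluation on each core.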
Any clues?
Thanks
Harvail