Wednesday, October 17, 2012 6:34 AM
I need help with an exception I am getting: "Microsoft.Hpc.Dsc.DscException: Failed to seal fileset /T0-GA-PAGEVIEWS-TechCenter-WK-33-2013. Failed to assign all replicas associated with the fileset. Check whether there are enough nodes available with adequate disk space"
I have at least 6 DSC nodes available, all tagged to the head node, and every node has at least 500 GB of free space. I tried to process and seal a small file of approximately 30 MB and got the exception above. I then tried processing and sealing an even smaller file of about 1 MB and got the same message.
I need your help to find out whether I am missing some configuration that is causing this. It was working fine until yesterday, but a Windows patch was installed and the server was rebooted, so I wonder whether that caused some configuration to go missing.
Let me explain the issue in detail.
My HPC server details: I have 8 servers in one of the environments.
serverHPC01 is the admin node and the rest are DSC nodes; all the DSC nodes are tagged to the admin node.
The issue: I need to process big data on these boxes, for which I have written a few MapReduce jobs and similar. I am able to create a fileset and add files to it. Once I finish adding the files, I need to seal the fileset, after which HPC is supposed to replicate the fileset according to the configured replica count, which defaults to 3.
The problem is that it fails to replicate the fileset and throws the exception below:
Trace level set to VERBOSE
Error sealing output fileset: Microsoft.Hpc.Dsc.DscException: Failed to seal fileset /T3-DELL-PROPS-ENU-2012-10-12-00. Failed to assign all replicas associated with the fileset. Check whether there are enough nodes available with adequate disk space.
at DrDscOutputStream.FinalizeSuccessfulPartitions(DrArray<DrOutputPartition> partitionArray)
Critical error occured: code = -2146233088
Application failed with error code 0x80131500.
According to the error message, I think the HPC admin node somehow believes it does not have enough DSC nodes tagged to it, or it is referring to some configuration or DB table entry that makes it believe there are not enough DSC nodes tagged.
However, if I set the replica parameter to 2 or 1 it works fine; it fails only if I set it to 3 or above.
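To make that suspicion concrete, here is a minimal Python sketch of a simplified placement model. It is an assumption for illustration only, not DSC's actual algorithm: it supposes that each replica must land on a distinct node that the admin node sees as online and that has enough free space. The node names, the `online` flag, and the helper functions are all hypothetical.

```python
GB = 1024 ** 3
MB = 1024 ** 2

def eligible_nodes(nodes, required_bytes):
    """Return nodes that are visible to the admin node and have enough free space."""
    return [n for n in nodes if n["online"] and n["free_bytes"] >= required_bytes]

def can_seal(nodes, required_bytes, replicas):
    """Assumed rule: sealing succeeds only if every replica gets its own eligible node."""
    return len(eligible_nodes(nodes, required_bytes)) >= replicas

# Hypothetical cluster: six DSC nodes with 500 GB free each,
# but only two are seen as online after the reboot.
nodes = [
    {"name": f"node{i:02d}", "free_bytes": 500 * GB, "online": i <= 2}
    for i in range(1, 7)
]

for replicas in (1, 2, 3):
    print(replicas, can_seal(nodes, 30 * MB, replicas))
```

Under this assumed model, success with 1 or 2 replicas and failure with 3 would be consistent with only two nodes being treated as eligible, regardless of how much disk space the others have.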
Earlier the same setup worked even with 3 replicas. Last Friday a Windows patch was applied to the HPC servers and they were rebooted; since then I have been seeing these errors, and I am not sure what configuration got messed up or where.
I tried looking into the DB on the admin node but did not find anything unusual.
I have not made any changes since the issue occurred. I need your help in troubleshooting this.
Thanks a lot in advance.