DryadLINQ sample DupPic2

  • Question

  • Hi -

    Not sure if this is the right forum to ask this question.

    I am trying to execute the DryadLINQ sample DupPic2 on our HPC cluster.  The program runs fine when I run it locally.  When I change the app.config to point to the cluster head node and step through the app in the debugger, I get a FileNotFound exception saying the following DLL could not be found.  No other code changes were made.

    Could not load file or assembly 'Microsoft.WindowsAzure.StorageClient, Version=1.1.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.

    Where is this dll supposed to come from?  I tried repairing all of my VS, SQL, DSC, and HPC installs.  There is no Microsoft.WindowsAzure.StorageClient.dll anywhere on my machine.

    Please advise-

    Thanks-

    Steve

    Monday, January 10, 2011 7:32 PM

Answers

All replies

  • Hi,

    What is the version of Windows HPC Server installed?

    Thanks,
    Łukasz

    Monday, January 10, 2011 10:26 PM
  • HPC Pack 2008 R2 Server 3.1.3267.0

     

    However, I found the reason behind this issue.  I needed to install the WindowsAzure SDK from here:

    http://www.microsoft.com/downloads/en/details.aspx?FamilyID=7a1089b6-4050-4307-86c4-9dadaa5ed018&displaylang=en

     

    Note that this requirement was not mentioned anywhere in the documentation.

     

    After I installed that, I am now running into a different exception:

     

    DscException was unhandled 

    Error communicating with DSC. Error code=1  Message=Failed 

     

     

    Is there some magic place that I'm supposed to place the files when I want to run on the cluster?

    This is not very well documented.

    For example, the Histogram sample refers to this uri:

    "hpcdsc://localhost/Samples/XmasBook"

    Where is that supposed to be mapped?  inetpub?  somewhere else?  Or does DryadLINQ create these locations on the head node on the fly?

     

    I AM seeing things added to the XC directory under a staging/<myname>/ directory.  The query plans do refer to output dirs that match the above uri:

     

        <StorageSet>
          <DataPath>hpcdsc://localhost/Samples/XmasBook</DataPath>
          <IsTemporary>false</IsTemporary>
          <Attribute>
            <Data>
              <XmlDataBlob>
                <PartitionedData xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Version="0.1" xmlns="http://microsoft.com/Distributed/Internal/PTX.xsd">
                  <DataInfo>
                    <RecordType Encoding=".NETTypeName">System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</RecordType>
                    <IsAutoSerialized>false</IsAutoSerialized>
                  </DataInfo>
                </PartitionedData>
              </XmlDataBlob>
            </Data>
          </Attribute>
        </StorageSet>

     

    I'm sure this is pretty simple, once I get past these first sticky points.

    Thanks-

     


    Thanks- Steve
    Monday, January 10, 2011 11:32 PM
  • When you want to run the samples against the cluster, you need to change the host in the URI from "localhost" to "localcluster".  This should be in the documentation, but maybe in a non-obvious location.
    Tuesday, January 11, 2011 6:14 PM
  • I'm getting a similar error when running trivial code based on the sample:

       int[] data = new int[] { 1, 2, 3 };
       var results = data.AsDistributed().Select(a => a + 1);

    It throws an AggregateException:

    Distributed query execution failed, however no vertex exceptions could be retrieved. Job ID = 1479

    Error files are empty.

    I've installed the Azure SDK mentioned above on my client. Do I have to install it on all nodes?

    Does there have to be a SQL Server 2008 on the Head Node as well as the client?

     

    The sample had no storage set. I assume the uri isn't relevant in this case.

     

    Any suggestions would be appreciated.


    Nick Nussbaum
    • Proposed as answer by Nick Nussbaum1 Thursday, January 13, 2011 12:48 AM
    Wednesday, January 12, 2011 11:50 PM
  • The problem showed up in the job log for the head node task: 'HpcLinqToDryadJM.exe' is not recognized as an internal or external command, operable program or batch file. The task was invoking it with a QueryPlan XML file.

    The fix was to install the vertex code (DISCVertex_x64.msi) on the head node after installing the Server. I wasn't using clusrun, which may do that automatically. I installed it from an elevated command prompt and life is good.

    Note that I had already installed .NET 4 on the head node, not just .NET 3.5. I don't know if this is needed for the Vertex utility being used.


    Nick Nussbaum
    • Proposed as answer by Nick Nussbaum1 Thursday, January 13, 2011 12:54 AM
    • Marked as answer by SAFti618 Thursday, January 13, 2011 1:01 AM
    • Unmarked as answer by SAFti618 Monday, January 31, 2011 9:29 PM
    Thursday, January 13, 2011 12:54 AM
  • Can you elaborate on this? Do you mean localcluster as a distinguished keyword, or a specific machine name like "myheadnodemachine"?

     

    I'm able to get programs to run if I set up an explicit cluster head node in the app.config

    <microsoft.distributedquery HPCDryadExecutor_HeadNode="myheadnodemachine" Executor="Cluster" />

    with uris for pub3

    .Execute("hpcdsc://myheadnodemachine/samples/images"); and .open

    I haven't been able to make it work

    This doesn't work for local host, either by omission

    <microsoft.distributedquery HPCDryadExecutor_HeadNode="myheadnodemachine" Executor="Local" />

    with 

    .Execute("hpcdsc://localhost/samples/images"); and .open

    or by

    .Execute("hpcdsc://localcluster/samples/images"); and .open

     

    Where does localcluster go, and is it a keyword?


    Nick Nussbaum
    Thursday, January 27, 2011 6:28 PM
  • I have been able to get this sample to run just fine on the cluster, but local mode fails with the following exception:

    Microsoft.Distributed.Linq.DscException was unhandled
     Message=Error communicating with DSC. Error code=1 Message=Failed
     Source=Microsoft.Distributed
     StackTrace:
      at Microsoft.Distributed.Internal.DscUtils.ThrowIfFailed(DscResult dscResult)
      at Microsoft.Distributed.Internal.DscUtils.CreateStreamFromLocalFiles(Uri fullyQualifiedUri, DataInfo dataInfo, Type recordType, String[] partitionDataFullPaths)
      at Microsoft.Distributed.Internal.DscUtils.CreateStreamFromLocalFiles[TRecord](Uri fullyQualifiedUri, DataInfo`1 dataInfo, String[] partitionDataFullPaths)
      at Microsoft.Distributed.Internal.DistributedDataHelper.CreatePartitionsAsDsc[TRecord](String tmpDir, Uri fullyQualifiedUri, DataInfo`1 dataInfo, IEnumerable`1 dataSets, IReaderWriterFactory`1 readerWriterFactory)
      at Microsoft.Distributed.Linq.AsDistributedInputQueryNode`1.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.UnaryQueryNode`2.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.HashPartitionQueryNode`2.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.UnaryQueryNode`2.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.UnaryQueryNode`2.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.UnaryQueryNode`2.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.DistributedOutputQueryNode`1.BuildQueryPlanAndCodeDom(BuildQueryState queryExec)
      at Microsoft.Distributed.Linq.DistributedQuery`1.ExecuteInternal(DistributedQueryConfiguration config, Uri outputUri, IReaderWriterFactory`1 readerWriterFactory)
      at Microsoft.Distributed.Linq.DistributedQuery`1.ExecuteInternal(Uri outputUri, IReaderWriterFactory`1 readerWriterFactory)
      at Microsoft.Distributed.Linq.DistributedQuery.Execute[TSource](DistributedQuery`1 source)
      at Microsoft.Distributed.Linq.DistributedQuery`1.<GetEnumerator>d__1.MoveNext()
      at DupPic2.Program.Main(String[] args) in D:\Dryad-SampleCode-CTP1\ProgrammingGuideSamples\DupPic2\DupPic2\Program.cs:line 28
      at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
      at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
      at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
      at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
      at System.Threading.ThreadHelper.ThreadStart()
     InnerException: 
    
    
    
    
    using System;
    using System.IO;
    using System.Linq;
    using System.Security.Cryptography;
    using Microsoft.Distributed.Linq;
    
    namespace DupPic2
    {
     public class Program
     {
      public static void Main(string[] args)
      {
       string directoryName = @"\\MyServerName\Temp\pics";
       var duplicatedFiles =
        Directory.GetFiles(directoryName, "*.jpg",
         SearchOption.AllDirectories)
        .AsDistributed() //<== comment out this line and everything works fine
        .Select(filename => new
        {
         hash = GetChecksum(filename),
         name = filename
        })
        .GroupBy(record => record.hash)
        .Where(group => group.Count() > 1)
        .SelectMany(group =>
         group.Select(record => record.name));
    
       foreach (var file in duplicatedFiles)
       {
        Console.WriteLine(file);
       }
      }
    
      public static string GetChecksum(string file)
      {
       using (FileStream stream = File.OpenRead(file))
       {
        SHA256Managed sha = new SHA256Managed();
        byte[] checksum = sha.ComputeHash(stream);
        return BitConverter
         .ToString(checksum)
         .Replace("-", String.Empty);
       }
      }
     }
    }
    
    

    I have followed all of the instructions for installation on the client side.  The ONLY thing that I have noticed is that I am running Win7 Ultimate 64 instead of Enterprise 64.  Could this possibly make the difference? (All the docs out there comparing the 2 editions indicate it should work, except for the DryadLINQ install doc itself - which says it MAY)

    I have verified that the local XC share and DryadData share exist, but they are never populated.  I created the picture directory, shared it, and ran the sample on the cluster, and it works fine.  Commenting out the cluster configuration in the app.config so that it runs locally causes the above exception.  Commenting out the .AsDistributed() line makes the sample work fine locally when the config says to run locally.  VS was launched in Admin mode.

    I have uninstalled and reinstalled the HPC DISC Server, Vertex, and Client in that order (all 64 bits).

    The only thing I can think of is that somehow the local process does not have proper File IO permissions for writing out the data into the XC shared directory.  Is there an additional user (from inside SQL maybe?) that should have write access to these shares?  Is there a specific user (NT AUTHORITY) that needs to be set up for that?  (My local box sql server uses NETWORK SERVICE.)

    Any other ideas about where I should check next?

    Thanks-

     


    Thanks- Steve
    Monday, January 31, 2011 11:04 PM
  • For issues concerning the pre-release Dryad software please either log in to the Connect beta website and submit a 'feedback' item or use the beta forum at http://social.msdn.microsoft.com/Forums/en-US/dryad/threads 
    • Marked as answer by Don Pattee Saturday, February 5, 2011 12:22 AM
    Saturday, February 5, 2011 12:21 AM