parallelize my script to process/move files

  • Question

  • Hello, I am trying to parallelize my script so that it can run on 8 or more threads. I have a folder that contains roughly 6 million files, and the script I am using now is extremely slow and only runs on one core. I could implement this in C# more easily and quickly, but my manager wants to use PowerShell. I have tried to adapt my script with Start-Job, -Parallel, etc., but I keep getting errors. As you can tell, I am not a scripting guy (I am a programming guy). Any help would be appreciated. Thanks.

    $FilePath = Resolve-Path * -Relative
    Get-ChildItem $FilePath | %{
    
      $ScriptBlock = {
        Foreach {$i = $j = 0} { 
            if ($i++ % 13000 -eq 0) { 
                $dest = "files $j"
                md $dest
                $j++ 
            }
            Move-Item $_ $dest 
        }
      }
    
      Start-Job $ScriptBlock
    }
    
    Get-Job
    
    While (Get-Job -State "Running")
    {
      Start-Sleep 10
    }
    
    Get-Job | Receive-Job


    • Moved by Bill_Stewart Thursday, January 2, 2014 8:47 PM Abandoned
    Sunday, November 17, 2013 6:35 PM

Answers

  • You can use the Wait-Job cmdlet to wait until the jobs have completed.

    I'm not sure that I understood your idea correctly in the line

    Foreach {$i = $j = 0}

    but I changed it to

        while ($i = $j = 0) { 


    I hope the following code gives you some ideas about how to manage your issue:

    $jobs = @()
    $FilePath = Resolve-Path * -Relative
    Get-ChildItem $FilePath | %{

      $ScriptBlock = {
        while ($i = $j = 0) {
            if ($i++ % 13000 -eq 0) {
                $dest = "files $j"
                md $dest
                $j++
            }
            Move-Item $_ $dest
        }
      }

      $jobs += Start-Job $ScriptBlock
    }

    Wait-Job $jobs
    foreach($job in $jobs){
        Receive-Job -Job $job
    }




    • Marked as answer by justin rassi Monday, July 21, 2014 10:51 PM
    Sunday, November 17, 2013 7:15 PM

All replies

  • If you are just trying to move files, RoboCopy can do this on as many threads as you want. It is just about the fastest way to move or copy files.

    Note that a move is much slower because it deletes each file as it goes along. Copy-then-delete is faster.
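    For example, a sketch of a multithreaded RoboCopy move (the source and destination paths here are placeholders; /MT:8 requests eight copy threads, /E includes subfolders, and /MOV deletes each source file after it is copied):

    ```powershell
    # Copy on 8 threads, then delete the sources (/MOV).
    # C:\bigfolder and D:\split are placeholder paths.
    robocopy C:\bigfolder D:\split /MOV /E /MT:8
    ```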


    ¯\_(ツ)_/¯

    Sunday, November 17, 2013 7:35 PM
  • If you are just trying to move files, RoboCopy can do this on as many threads as you want. It is just about the fastest way to move or copy files.

    Note that a move is much slower because it deletes each file as it goes along. Copy-then-delete is faster.


    ¯\_(ツ)_/¯

    I am trying to split the large folder and move the contents into smaller sub-folders.

    Monday, November 18, 2013 1:16 AM
  • You can use the Wait-Job cmdlet to wait until the jobs have completed.

    I'm not sure that I understood your idea correctly in the line

    Foreach {$i = $j = 0}

    but I changed it to

        while ($i = $j = 0) { 


    I hope the following code gives you some ideas about how to manage your issue:

    $jobs = @()
    $FilePath = Resolve-Path * -Relative
    Get-ChildItem $FilePath | %{

      $ScriptBlock = {
        while ($i = $j = 0) {
            if ($i++ % 13000 -eq 0) {
                $dest = "files $j"
                md $dest
                $j++
            }
            Move-Item $_ $dest
        }
      }

      $jobs += Start-Job $ScriptBlock
    }

    Wait-Job $jobs
    foreach($job in $jobs){
        Receive-Job -Job $job
    }



    This code just spawns something like 1,000 different PowerShell and command-window processes.
    Monday, November 18, 2013 1:19 AM
  • Use RoboCopy and split by either a date or a name filter. It will be the absolute fastest way.


    ¯\_(ツ)_/¯

    Monday, November 18, 2013 1:28 AM
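  • For reference, here is a sketch of the batching approach that avoids starting one job per file: it slices the file names into groups of 13,000 (the size from the original script) and starts one background job per group. The folder-naming scheme ("files 0", "files 1", …) follows the question; this is untested at the 6-million-file scale, and since each Start-Job launches a separate powershell.exe, with a few hundred batches you may still want to throttle how many jobs run at once.

    ```powershell
    # Split the current folder into subfolders of 13,000 files each,
    # starting one background job per batch instead of one per file.
    $batchSize = 13000
    $root  = (Get-Location).Path
    $files = Get-ChildItem -File | Select-Object -ExpandProperty Name

    $jobs = @()
    $j = 0
    for ($i = 0; $i -lt $files.Count; $i += $batchSize) {
        $dest = Join-Path $root ("files $j")
        New-Item -ItemType Directory -Path $dest | Out-Null
        $last  = [Math]::Min($i + $batchSize, $files.Count) - 1
        $batch = $files[$i..$last]
        # The leading comma passes $batch as a single array argument.
        $jobs += Start-Job -ScriptBlock {
            param($names, $target, $root)
            foreach ($n in $names) {
                Move-Item -LiteralPath (Join-Path $root $n) -Destination $target
            }
        } -ArgumentList (,$batch), $dest, $root
        $j++
    }

    $jobs | Wait-Job | Receive-Job
    $jobs | Remove-Job
    ```

    Note that moving files within one volume is mostly disk-bound, so extra jobs may buy little; as suggested above, RoboCopy is likely faster in practice.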