Copy a section of a text file

  • General discussion

  • Hi All,

    I am new to PowerShell. I have a main report, in txt format, that contains several different reports. I would like to copy the content of one of those reports, up to the page header of the next report.

    try {

        while($true){

            $line = $reader.ReadLine()

            $strToCheck = select-string -InputObject $line -pattern "Job Listing"

            if ($strToCheck)
            {
                $nextline = $reader.ReadLine()

                $nextStrToCheck = select-string -InputObject $nextline -pattern "Source: TEST-Job: 9100"

    I wrote the code above, but my main issue is how to copy the content under "Source: TEST-Job: 9100" until it reaches the next page header, which starts with "job summary", once the test above has passed.

    Basically, I want my script to copy all content after "Source: TEST-Job: 9100" and stop copying when it hits the next page header, "job summary".

    Please advise.

     
    • Changed type Bill_Stewart Tuesday, November 7, 2017 10:05 PM
    • Moved by Bill_Stewart Tuesday, November 7, 2017 10:06 PM This is not "keep redesigning a parsing script based on my continuously changing list of requirements" forum
    Friday, September 22, 2017 7:15 PM

All replies

  • Post a short example of the text file format you are talking about. (A short example with very few sample lines.)

    -- Bill Stewart [Bill_Stewart]

    Friday, September 22, 2017 7:18 PM
  • Hi Bill, thanks for your response. Below is a short sample of one of the reports in the main file. Basically, I want the script to copy all content below "Source: TEST-Job: 9100" and stop when it hits "Audit of All", using "Audit of All" as a pattern. Then I'll export it to a text file.

                                                                           Job Listing
                                                                     Source: TEST-Job: 9100

    Computer Disabled on 07/30/2015 09:36:33 CN=NSCA,OU=Temp,OU=Inactive Computers,OU=macy,DC=macy,DC=int
    Computer Disabled on 07/30/2015 09:36:47 CN=XPRES3UM,OU=Temp,OU=Inactive Computers,OU=macy,DC=macy,DC=int                    

                                                                                  Audit of All 
                                                             Source: Servers Returns  Job: 8001

    Computer Disabled on 07/30/2015 09:36:44 CN=VMDWHDB03,OU=Temp,OU=Inactive Computers,OU=macy,DC=macy,DC=int
    Computer Disabled on 07/30/2015 09:36:42 CN=T24DATAMIG,OU=Non Windows Servers,OU=Servers,DC=macy,DC=int
    Computer Disabled on 07/30/2015 09:33:35 CN=820EXCHNODE02,OU=Temp,OU=Inactive

     

    Friday, September 22, 2017 7:39 PM
  • Here's one way:


    $content = Get-Content "report.txt"
    for ( $i = 0; $i -lt $content.Count; $i++ ) {
      $line = $content[$i]
      if ( $line -match '^\W+audit of all') {
        break
      }
      if ( $line -match '^\w' ) {
        $line
      }
    }
    


    -- Bill Stewart [Bill_Stewart]

    Friday, September 22, 2017 8:47 PM
  • Thanks, Bill.

    Unfortunately, this wouldn't work because the main report contains several reports, and each one has "Job Listing" and "Source: TEST-Job: ****" in its header. The **** represents the job number, which varies, which is why I decided to search using -pattern. If the two conditions $strToCheck and $nextStrToCheck are both met, the script has found the right report, and the content under it should be copied until it reaches the next report, which could have any page header name, "Audit of All" being one example.

    Friday, September 22, 2017 9:11 PM
  • It probably can be done. My code is merely an example.

    I would recommend getting the original report in CSV format or something that's easier to process.
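
    As an illustration only, the loop above might be adapted to the two-line header along these lines. `Get-ReportSection` is a hypothetical helper name, not something from the thread, and the sketch assumes that every page header is exactly two lines with a "Source:" line second, as in the posted sample.

```powershell
# Hypothetical helper: returns the body lines of one report inside a combined report.
function Get-ReportSection {
    param(
        [string[]] $Lines,          # all lines of the main report
        [string]   $TitlePattern,   # regex for the first header line, e.g. 'Job Listing'
        [string]   $SourcePattern   # regex for the second header line, e.g. 'Source: TEST-Job: 9100'
    )

    # Find a title line that is immediately followed by the wanted source line.
    $start = -1
    for ( $i = 0; $i -lt $Lines.Count - 1; $i++ ) {
        if ( $Lines[$i] -match $TitlePattern -and $Lines[$i + 1] -match $SourcePattern ) {
            $start = $i + 2
            break
        }
    }
    if ( $start -lt 0 ) { return }   # wanted report not found

    for ( $i = $start; $i -lt $Lines.Count; $i++ ) {
        # Assumption: the next report begins where a line is followed by a "Source:" line.
        if ( $i + 1 -lt $Lines.Count -and $Lines[$i + 1] -match '^\s*Source:' ) { break }
        if ( $Lines[$i].Trim() ) { $Lines[$i] }   # emit non-blank body lines
    }
}

# Possible usage (file names are placeholders):
# $content = Get-Content 'report.txt'
# Get-ReportSection -Lines $content -TitlePattern 'Job Listing' -SourcePattern 'Source: TEST-Job: 9100' |
#     Set-Content 'extract.txt'
```

    Because the end of the section is detected from the "Source:" line rather than from a fixed title, the next report's header can have any name.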


    -- Bill Stewart [Bill_Stewart]

    Friday, September 22, 2017 9:14 PM
  • Thanks, Bill.


    Saturday, September 23, 2017 1:52 PM
  • I still haven't been able to get this to work. Could anyone please look at my code below and check why it isn't working as required? Thanks.

                                                                       

    $All_input_files = get-childitem -path $input\* -include ($Finalstring) | where-object {!($_.PSIscontainer)}
    $path_in = $All_input_files.fullname

    $numberoflines = Get-content $path_in
    $lines = $numberoflines.count

    $reader = [System.IO.File]::OpenText($path_in)

    $index = 0;

    try {

        while($true){

            $line = $reader.ReadLine()

            $strToCheck = select-string -InputObject $line -pattern "Job Summary"

            if ($strToCheck)
            {
                $nextline = $reader.ReadLine()

                $nextStrToCheck = select-string -InputObject $nextline -pattern "Source: TEST-609-Job: 9100"

                if ($nextStrToCheck){
                    add-content $path_out $line
                    add-content $path_out $nextline

                    for ( $i = 0; $i -lt $content.Count; $i++ ) {
                        $mainline = $content[$i]
                        if ( $mainline -match '^\w' ) {
                            $mainline
                        }
                        elseif ($mainline -match '^\W+audit of all') {
                            break
                        }
                        add-content $path_out $mainline
                    }
                }

            }

            if($index -ge $lines){

                write-host $index+ "I'm exiting"

                exit
            }

        }
    }

    finally {
        $reader.Close()
    }

    write-host "program execution:`ncompleted"

    Saturday, September 23, 2017 1:55 PM
  • Please take the time to learn how to post code correctly.  The code posting tool is on the edit bar. 

    Start by using "Get-Content" to read the file.  You will need a switch statement to detect the type of each line, and a state flag to know when you have found a block and when you have found the end of a block.  A RegEx switch would be best for this.

    You will also have to become better at programming to see how to do this.

    Parsing files is a very advanced programming skill.  It is a bad way to start learning how to program.  Start with one file and figure out how to search for the "key" lines.

    This is how to parse this kind of a structure.

    $results = Get-Content somefile.txt |
        ForEach-Object {
            switch -regex ($_) {
                'job listing' {
                    # header line: start collecting
                    write-host 'Found start line'
                    $inblock = $true
                }
                { $_.Trim().Length -eq 0 } {
                    # blank line: stop collecting
                    write-host 'Found end line'
                    $inblock = $false
                }
                default {
                    if ($inblock) {
                        # code to skip unwanted lines
                        $_ # output line
                    }
                }
            }
        }


    \_(ツ)_/




    • Edited by jrv Saturday, September 23, 2017 6:13 PM
    Saturday, September 23, 2017 6:11 PM
  • Thanks for the explanation, jrv. The issue is that the main report contains several reports that all start with "Job Listing", which is why I used two patterns: "Job Listing" and "Source: TEST-Job: 9100", the latter being unique to the report I want to extract. Using "Job Listing" alone would make the script run through all the reports, since the page header of each one starts with "Job Listing". Is there a way to match both "Job Listing" and "Source: TEST-Job: 9100" to mark the beginning of the extract, and have $inblock go back to false when it hits the header "Audit of All"? Because there are blank lines within the report, the Trim().Length -eq 0 check wouldn't work; I need to end on a particular character pattern.
    Saturday, September 23, 2017 6:51 PM
  • So just change the "end" detector.
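
    A rough sketch of what that could look like, assuming the layout in the sample posted earlier: a title line, then the "Source:" line directly under it, and "Audit of All" as the header of the following report. The function name `Select-ReportBlock` and its parameters are made up for illustration.

```powershell
# Hypothetical function: state-flag parser built around a regex switch.
function Select-ReportBlock {
    param(
        [string[]] $Lines,
        [string] $TitlePattern  = 'Job Listing',
        [string] $SourcePattern = 'Source: TEST-Job: 9100',
        [string] $EndPattern    = 'Audit of All'
    )

    $sawTitle = $false   # previous line was a title line
    $inBlock  = $false   # currently inside the wanted report

    foreach ( $line in $Lines ) {
        switch -regex ( $line ) {
            $TitlePattern {
                if ( $inBlock ) { return }   # a new report starts: stop
                $sawTitle = $true            # candidate header; the next line decides
                break
            }
            $SourcePattern {
                if ( $sawTitle ) { $inBlock = $true }   # both header lines matched
                $sawTitle = $false
                break
            }
            $EndPattern {
                if ( $inBlock ) { return }   # the changed "end" detector
                $sawTitle = $false
                break
            }
            default {
                $sawTitle = $false
                if ( $inBlock -and $line.Trim() ) { $line }   # emit non-blank body lines
            }
        }
    }
}

# Possible usage (file names are placeholders):
# Select-ReportBlock -Lines (Get-Content 'report.txt') | Set-Content 'extract.txt'
```

    Blank lines inside the report no longer end the block; only a matching header line does.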

    You will have to work out the logic of how to make this work.  You can also just use the string converter and a template file:

    https://blogs.msdn.microsoft.com/powershell/2014/10/31/convertfrom-string-example-based-text-parsing/

    Techniques of Plain Text Parsing

    Example of example based text parsing:

    Get-Content stuff.txt |
    	ConvertFrom-String  -TemplateFile .\template.txt |
    	select -expand inforecord |
    	Export-Csv converted.csv -NoType
    

    Example parsing template:

    {InfoRecord*:NAMEx: "{Name1: Module 1}", DESCR: "{Desc:20x10GE/Supervisor}"
    PID: {Pid:N5K-C5010P-BF     }, VID: {Vid:V03} , SN: {sn:JAF1412AGFP}}
    {InfoRecord*:NAMEx: "{Name1:Chassis}", DESCR: "{Desc:Nexus5010 Chassis}"
    PID: {Pid:N5K-C5010P-BF     }, VID: {Vid:V03} , SN: {sn:SSI141004M2}}
    
    

    Input text file:

    NAME: "Chassis", DESCR: "Nexus5010 Chassis"
    PID: N5K-C5010P-BF     , VID: V03 , SN: SSI141004M2
    
    NAME: "Module 1", DESCR: "20x10GE/Supervisor"
    PID: N5K-C5010P-BF     , VID: V03 , SN: JAF1412AGFP
    
    NAME: "Fan 1", DESCR: "Chassis fan module"
    PID: N5K-C5010-FAN     , VID: N/A , SN: N/A
    
    NAME: "Fan 2", DESCR: "Chassis fan module"
    PID: N5K-C5010-FAN     , VID: N/A , SN: N/A
    
    NAME: "Power supply 1", DESCR: "AC power supply"
    PID: N5K-PAC-550W      , VID: V02 , SN: DTM1413034R
    
    NAME: "FEX 100 CHASSIS", DESCR: "N2K-C2148T-1GE  CHASSIS"
    PID: N2K-C2148T-1GE    , VID: V02 , SN: FOX1406G9KH
    
    NAME: "FEX 100 Module 1", DESCR: "Fabric Extender Module: 48x1GE, 4X10GE Supervisor"
    PID: N2K-C2148T-1GE    , VID: V02 , SN: JAF1414BDPA
    
    NAME: "FEX 100 Fan 1", DESCR: "Fabric Extender Fan module"
    PID: N2K-C2148-FAN     , VID: N/A , SN: N/A
    
    NAME: "FEX 100 Power Supply 1", DESCR: "Fabric Extender AC power supply"
    PID: N2K-PAC-200W      , VID: 01  , SN: AC13493MHE
    
    NAME: "FEX 100 Power Supply 2", DESCR: "Fabric Extender AC power supply"
    PID: N2K-PAC-200W      , VID: 01  , SN: AC14023NGS
    


    \_(ツ)_/

    Saturday, September 23, 2017 7:05 PM