Inconsistent output using Export-Csv - returns partial results -OR- System.Object[]

  • Question

  • As will become clear, I am new to scripting and even newer to PowerShell.  That said, I have searched for an answer to these specific problems for going on 3 days now, including the three threads that were recommended as I began this one.  Time to ask for a fish (or at the very least a link to a fish) rather than trying to catch the fish myself.

    I captured most of my notes within the PowerShell script itself along with example links that demonstrate the issues.  In addition to being open to suggestions on how to clean up this script, the gist of my 3 problems is:

    1. Problems 1 and 3 - When the page I am searching has multiple results, I get unreliable output, particularly in the .csv.  Either a.) I only get one of the results when there should be multiple pulled off the page, or b.) I get "System.Object[]".
    2. Problem 2 - When a regex pattern is found multiple times, even though the surrounding text differs, I would like only the unique occurrences.

    Thanks to everyone in this community who is always so helpful and patient.

    $output = @()
    $web = New-Object Net.WebClient
    $urls = get-content "C:\Scripts\ResearchURLs.txt"
    #If you want to build this file here are four good example URLS:
        #Example URL1: https://securelist.social-kaspersky.com/en/advisories/59781
        #Example URL2: http://www.viruslist.com/fr/advisories/59781
            #This is just the French version of Example 1, but sometimes has more information
        #Example URL3: https://securelist.social-kaspersky.com/en/advisories/59003
        #Example URL4: http://www.viruslist.com/fr/advisories/59003
            #Again the French version of URL3
    $FormatEnumerationLimit = 150
    foreach($url in $urls){
        $results = $web.DownloadString("$url")
        $matchesCVE = $results | Select-String '(?<n>CVE-\d+-\d+)' -AllMatches
        #Problem #1 - I can only get the output in a .txt file.  The .csv only shows System.Object[].
            #Example of both outputs shown in output notes.
        #Problem #2 - The same CVE may appear on the page more than once, so it repeats over and over in the output.
            #Is there a way to grab only the unique matches?
            #I would also like to clean this up as much as possible and grab the description from both sites as well as the original advisory.  Haven't figured that one out yet.
        $matchesImpact = $results | Select-String -Pattern ("Denial of Service","DoS","Exposure of sensitive information","Manipulation of data","System access","Cross Site Scripting","Unknown","Hijacking","Security Bypass","Privilege Escalation","Spoofing") -AllMatches
        #Problem 3 - When this works, it only matches one of the patterns.
            #I say "when it works" because (as with advisory 59003) there should be 2 impacts.
            #The .txt and .csv both only show Exposure of sensitive information.
        if ($matchesCVE.Matches){
            $Object = New-Object PSObject
            $Object | Add-Member NoteProperty URL    $url
            $Object | Add-Member NoteProperty CVE    $matchesCVE.Matches.Value
            $Object | Add-Member NoteProperty Impact $matchesImpact.Matches.Value

            $output += $Object
        }
    }
    $output | Export-Csv "C:\Scripts\ResearchOutput.csv" -NoTypeInformation -force
        #Example of .csv file output:
            #URL                                                            CVE                Impact
            #https://securelist.social-kaspersky.com/en/advisories/59003	System.Object[]    Exposure of sensitive information
            #http://www.viruslist.com/fr/advisories/59003                   System.Object[]    Exposure of sensitive information
            #https://securelist.social-kaspersky.com/en/advisories/59781	System.Object[]    System.Object[]
            #http://www.viruslist.com/fr/advisories/59781	                System.Object[]    System.Object[]
    $output | Format-Table -AutoSize | Out-String -Width 4096 | Out-File "C:\Scripts\ResearchOutput.txt" -Force
        #Example of .txt file output:
            #URL                                                         CVE                                                                                        Impact                                             
            #---                                                         ---                                                                                        ------                                             
            #https://securelist.social-kaspersky.com/en/advisories/59003 {cve-2014-0497, CVE-2014-0497, CVE-2014-0497, CVE-2013-1500}                               Exposure of sensitive information                  
            #http://www.viruslist.com/fr/advisories/59003                {CVE-2013-1500, CVE-2013-1500}                                                             Exposure of sensitive information                  
            #https://securelist.social-kaspersky.com/en/advisories/59781 {cve-2014-0497, CVE-2014-0497, CVE-2014-0497, CVE-2014-0537, CVE-2014-0539, CVE-2014-4671} {Security Bypass, Security Bypass}                 
            #http://www.viruslist.com/fr/advisories/59781                {CVE-2014-0537, CVE-2014-0537, CVE-2014-0539, CVE-2014-0539, CVE-2014-4671, CVE-2014-4671} {Security Bypass, Security Bypass, Security Bypass}
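
    For reference, a minimal sketch of one way around all three symptoms (untested against the live pages; it assumes the $matchesCVE and $matchesImpact variables built above, and PowerShell 3.0+ for the .Matches.Value member enumeration).  Export-Csv renders an array-valued property as "System.Object[]", so join each match list into a single string, and de-duplicate first with Sort-Object -Unique:

    #Sketch only - flatten the match arrays into scalar strings before Export-Csv
    $cveList    = $matchesCVE.Matches.Value    | Sort-Object -Unique   #drops repeats, case-insensitively by default
    $impactList = $matchesImpact.Matches.Value | Sort-Object -Unique
    $Object = New-Object PSObject
    $Object | Add-Member NoteProperty URL    $url
    $Object | Add-Member NoteProperty CVE    ($cveList    -join '; ')  #one CSV cell instead of System.Object[]
    $Object | Add-Member NoteProperty Impact ($impactList -join '; ')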
    • Moved by Bill_Stewart Friday, October 24, 2014 3:21 PM Abandoned
    Wednesday, July 23, 2014 2:38 PM

All replies

  • What page is an example of a failure?  We cannot be of much help without knowing what you are seeing.


    ¯\_(ツ)_/¯

    Wednesday, July 23, 2014 3:02 PM
  • Thanks for the quick reply jrv.  There are 4 example pages that I put within the script itself, but here they are again:

    #Example URL1: https://securelist.social-kaspersky.com/en/advisories/59781
    #Example URL2: http://www.viruslist.com/fr/advisories/59781
        #This is just the French version of Example 1, but sometimes has more information
    #Example URL3: https://securelist.social-kaspersky.com/en/advisories/59003
    #Example URL4: http://www.viruslist.com/fr/advisories/59003
        #Again the French version of URL3

    There is also an example of the output in that section of the script.

    Wednesday, July 23, 2014 3:11 PM
  • Which one is failing and how do you know it is failing?


    ¯\_(ツ)_/¯

    Wednesday, July 23, 2014 3:53 PM
  • This is how I would get the CVE reports:

    $uri='http://www.cvedetails.com/json-feed.php?numrows=10&vendor_id=0&product_id=0&version_id=0&hasexp=0&opec=0&opov=0&opcsrf=0&opfileinc=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opginf=0&opdos=0&orderby=3&cvssscoremin=0'
    $wc=New-Object System.Net.WebClient
    $json=$wc.DownloadString($uri)
    $cve=$json|ConvertFrom-Json
    $cve|select cve_id

    With the CVE SOAP service or the JSON feed download we can keep up with the details without having to scrape web pages.


    ¯\_(ツ)_/¯

    Wednesday, July 23, 2014 4:09 PM
  • Ok, so I ran that script and it pulled several CVEs from the cvedetails.com site.  But it's unclear how that alternative will help me.  It produced a listing of 10 CVEs, but I need many more pieces of information.  I leverage cvedetails.com as part of my process, but not until I have whittled my list down to only the specific products I am interested in.

    It may also help to know that the script I posted is pared down significantly to only show the pieces I am having problems with.  I am also pulling 'title', 'criticality', 'solution status', and 'from where' using these two primary sources, which do a good job of aggregating newly released vulnerabilities.

    Wednesday, July 23, 2014 4:43 PM
  • That still doesn't tell us what doesn't work.

    By the way, on the NVD site you can query by product, alert level, and anything else you need with the SOAP tools.

    If you just want to screen scrape, you are going to have to provide better information on what doesn't work.


    ¯\_(ツ)_/¯

    Wednesday, July 23, 2014 4:49 PM
  • I only summarized the issues in the main question since most of them were detailed better in the actual script itself.  Here they are, slightly modified.  Hopefully this is better?

    #Problem #1 - I can only get the cve output in a .txt file.  The .csv only shows System.Object[].
        #Example of both outputs shown [below problem 3].
    #Problem #2 - The same CVE may appear on the page more than once, so it repeats over and over in the output.
        #Is there a way to grab only the unique matches?
        #I would also like to clean this up as much as possible and grab the description from both sites as well as the original advisory.  Haven't figured that one out yet.
    #Problem 3 - When [the matchesImpact] works, it only matches one of the patterns [also displayed in the output example below].
        #I say "when it works" because (as with advisory 59003) there should be 2 impacts.
        #The .txt and .csv both only show Exposure of sensitive information.

    #Example of .csv file output:
        #URL                                                            CVE                Impact
        #https://securelist.social-kaspersky.com/en/advisories/59003    System.Object[]    Exposure of sensitive information
        #http://www.viruslist.com/fr/advisories/59003                   System.Object[]    Exposure of sensitive information
        #https://securelist.social-kaspersky.com/en/advisories/59781    System.Object[]    System.Object[]
        #http://www.viruslist.com/fr/advisories/59781                   System.Object[]    System.Object[]

    #Example of .txt file output:
        #URL                                                         CVE                                                                                        Impact
        #---                                                         ---                                                                                        ------
        #https://securelist.social-kaspersky.com/en/advisories/59003 {cve-2014-0497, CVE-2014-0497, CVE-2014-0497, CVE-2013-1500}                               Exposure of sensitive information
        #http://www.viruslist.com/fr/advisories/59003                {CVE-2013-1500, CVE-2013-1500}                                                             Exposure of sensitive information
        #https://securelist.social-kaspersky.com/en/advisories/59781 {cve-2014-0497, CVE-2014-0497, CVE-2014-0497, CVE-2014-0537, CVE-2014-0539, CVE-2014-4671} {Security Bypass, Security Bypass}
        #http://www.viruslist.com/fr/advisories/59781                {CVE-2014-0537, CVE-2014-0537, CVE-2014-0539, CVE-201

    Wednesday, July 23, 2014 4:58 PM
  • The results of a match will be multiple objects.  You will have to find a way to enumerate them into the file.

    The same is true of duplicate items.  Each page can have dozens of CVE strings that are the same.  You need to group these to fix this.

    As I said, you will need to use a SOAP source or become a master of RegEx.  This is why we have data sources, and the NVD is the standard source for all.

    Screen scraping is always a very frustrating experience.  HTML is notoriously inconsistent and can create many issues unless you design very robust expressions.

    I have looked at these pages.  They are very unstructured because they are code generated.  They can generate differently under many circumstances.

    Get the text CVEs, then query the NVD for the details.
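
    A quick sketch of the enumerate-and-group idea (the input string is hypothetical, and member enumeration of .Matches.Value needs PowerShell 3.0 or later):

    #Sketch only - enumerate every regex hit, then group identical CVE IDs
    $text = 'CVE-2014-0497 ... cve-2014-0497 ... CVE-2013-1500'
    $ids  = ($text | Select-String '(?<n>CVE-\d+-\d+)' -AllMatches).Matches.Value
    $ids | Group-Object | Select-Object Count, Name    #one row per distinct CVE (grouping is case-insensitive by default)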


    ¯\_(ツ)_/¯

    Wednesday, July 23, 2014 5:32 PM
  • # the following gets exactly one CVE number from the page.
    $uri='https://securelist.social-kaspersky.com/en/advisories/59781'
    $wc=New-Object System.Net.WebClient
    $wc.DownloadFile($uri,"$pwd\results.txt")
    $cveid=cat results.txt |%{if($_ -match 'title="(?<n>CVE-\d+-\d+)'){$matches['n']}}
    
    
    


    ¯\_(ツ)_/¯


    • Edited by jrv Wednesday, July 23, 2014 7:33 PM
    Wednesday, July 23, 2014 7:33 PM
  • Years ago I worked with the SCAP and NVD data.  I also worked on methods to extract CVE data from the NVD.  I managed to track down one of my old posts on this from which I have extracted a function that queries the online NVD data.

    function Get-CVEByIDFromNVD{
        Param(
            $cveid='CVE-2014-4269',
            $nvddb='http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml'
        )

        $nvdxml=[xml]'<root/>'
        $nvdxml.Load($nvddb)

        $nsmgr = New-Object System.Xml.XmlNamespaceManager($nvdxml.NameTable)
        $nsmgr.AddNamespace('default','http://scap.nist.gov/schema/feed/vulnerability/2.0')
        $nsmgr.AddNamespace('xsi','http://www.w3.org/2001/XMLSchema-instance')
        $nsmgr.AddNamespace('vuln','http://scap.nist.gov/schema/vulnerability/0.4')
        $nvdxml.SelectSingleNode("//default:entry[@id='$cveid']",$nsmgr)
    }

    The query returns an XML node with all of the CVE data.  To extract the data you will either have to remove the namespaces or use the namespace manager.  I would expand the function to convert the XML into a PsCustomObject.
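
    A rough sketch of that conversion (the element names are recollections of the NVD 2.0 schema and should be treated as assumptions; PowerShell's XML adapter lets you dot into the node without spelling out the vuln: prefixes):

    #Sketch only - flatten the entry node into a plain object
    $entry = Get-CVEByIDFromNVD -cveid 'CVE-2014-4269'
    if($entry){
        New-Object PSObject -Property @{
            Id        = $entry.id
            Published = $entry.'published-datetime'   #assumed element name
            Summary   = $entry.summary                #assumed element name
        }
    }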

    Here is a link to the original post, which is a good discussion of how to query XML data: http://social.technet.microsoft.com/Forums/de-DE/6d192879-6b6c-4017-9276-1b43599c3379/xpath-queries-for-xml-objects-in-powershell?forum=ITCG

    You can use this to query downloaded copies of the archive.  All active CVEs should be in the current DB, but sometimes we want the history or to look up old vulnerabilities.

    The plus with this is that it is more up-to-date than most vendor systems.


    ¯\_(ツ)_/¯




    • Edited by jrv Thursday, July 24, 2014 4:09 PM
    Thursday, July 24, 2014 4:05 PM
  • Here is an example of how to extract data directly.

    PS C:\scripts> $cve.'vulnerable-software-list'.product
    cpe:/a:oracle:hyperion:11.1.2.3
    cpe:/a:oracle:hyperion:11.1.2.2
    PS C:\scripts>


    ¯\_(ツ)_/¯

    Thursday, July 24, 2014 4:15 PM
  • I had a few minutes so I converted the function to return objects.  Here is a link to the most current version.

    http://1drv.ms/1pMq1hm


    ¯\_(ツ)_/¯

    Thursday, July 24, 2014 4:52 PM
  • Great stuff!  Thanks jrv.  I wish I was more experienced with PowerShell (and scripting in general) so I could dissect and put this together for what I am doing.  I appreciate all the help.  I'm trying an alternative of pulling the xml feeds into a database. 

    Friday, July 25, 2014 2:26 PM
  • Great stuff!  Thanks jrv.  I wish I was more experienced with PowerShell (and scripting in general) so I could dissect and put this together for what I am doing.  I appreciate all the help.  I'm trying an alternative of pulling the xml feeds into a database. 

    There are many databases on NVD.  There is an update almost daily.  It is the source of CVE records and all other sites get the records from NVD via a remote pull of these databases.  You will need to check for the updates.  I would set up a scheduled task to pull the files daily if they have been updated.

    Check the daily file.  If it does not have the CVE, then check the annual file for the year embedded in the ID (CVE-YYYY-nnnn).
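
    Picking the annual file can be automated from the ID; a sketch, assuming the annual feeds follow the same naming pattern as the recent feed above (nvdcve-2.0-YYYY.xml):

    #Sketch only - derive the annual feed URL from the CVE ID's year
    $cveid  = 'CVE-2013-1500'
    $year   = ($cveid -split '-')[1]    #'2013'
    $annual = "http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-$year.xml"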


    ¯\_(ツ)_/¯

    Friday, July 25, 2014 3:44 PM
  • Note also that you can pull an RSS feed that has only the daily changes as a summary.  These come as XML in RSS schema format and can be read very easily in PowerShell.  AV vendors have their own RSS feeds of posted vulnerabilities which are also in XML. (RSS is XML)
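
    Reading such a feed takes only a few lines; a sketch (the feed URL here is a placeholder, not a real endpoint):

    #Sketch only - RSS is just XML, so cast the download and walk the items
    $wc = New-Object System.Net.WebClient
    [xml]$rss = $wc.DownloadString('http://example.com/vulnerability-feed.rss')   #placeholder URL
    $rss.rss.channel.item | Select-Object title, pubDate, link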


    ¯\_(ツ)_/¯

    Friday, July 25, 2014 3:47 PM