ASP data collection

Question
-
This is a post from ASP data collection in which I was referred to this forum. Sorry I did not start here. I have since found a substandard, overly complex, and resource consuming solution, but have not had time to drill down to make it work yet. If I could figure out how to get the method of this question to work, it should be much easier to work. Also, I will be excited and very grateful!
Hello, I am trying to collect a small list with Full Product, SHA1, and File Name from http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx for several different servers. I am going to make an AIO install disk with dism/wim, and need to download all the different ISOs. I am planning to do so for Server 2003, 2008, and 2012, so there are a lot of different images I will need to download. I wanted to do the download with a manager which will check the SHA1s automatically.
Anyway, I thought this would be a quick and easy project, but I have been struggling with it. I do not know asp, but I have been trying to use Invoke-WebRequest and Invoke-RestMethod to retrieve this information. I have not been able to make much sense of it at all. I have been working on http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx#searchTerm=&ProductFamilyId=137&Languages=en&PageSize=100&PageIndex=0&FileId=0 and have not been able to retrieve the information yet! Usually I can get a little something, but I am completely stuck. I tried to do $var.Forms[0].Fields.Add("ProductFamilyId","137") and then use Invoke-RestMethod -Body $var.Forms[0] but have not been able to retrieve the info.
Is there a quick way to do this within PowerShell? Would the HtmlAgilityPack work better? I am completely flummoxed and tired, so any help will be greatly appreciated. I am going to go to bed before I fall asleep on my keyboard. Thanks for any help you can render!
I added a little more information
If you were navigating the page in a web browser, you would need to press the "Details" link for each product to show the "SHA1" and "File Name" information. I do not have a lot of free time for this, so I was hoping someone would be able to provide a little guidance to get past the posting/requesting hump. I believe there are two posts/requests, one when the page first loads which requests the specific search / ProductFamilyId, and two needing to open all of the "Details" links.
Thanks for any and all help!
Sincerely,
John
- Edited by -J-o-h-n- Saturday, October 12, 2013 3:56 AM
- Moved by Bill_Stewart Tuesday, December 31, 2013 10:03 PM Off-topic post
Saturday, October 12, 2013 3:04 AM
Answers
-
The simple answer is the same as before. It cannot be done. Leaving this open will not change that.
The reason it cannot be done is not about scripting but is about web site design. It is about a product that is not related to scripting.
No need to worry as this has already been moved to the dust bin.
¯\_(ツ)_/¯
Thursday, January 2, 2014 9:12 PM
All replies
-
You appear to be asking how to download product disks from the MSDN subscription site.
This is only possible using the MSDN site downloader for obvious reasons. They provide no API or tools to automate this.
Sorry. You might post to the MSDN site to see if they plan on adding this to their support.
¯\_(ツ)_/¯
- Proposed as answer by jrv Tuesday, December 31, 2013 10:25 PM
Saturday, October 12, 2013 3:47 AM -
No, I am asking how to get the Full Product (commercial name), SHA1, and File Name into a csv format not how to download the actual files (ISOs) from MSDN.
- Edited by -J-o-h-n- Saturday, October 12, 2013 4:01 AM
Saturday, October 12, 2013 4:00 AM -
Just pull the screen and extract the table data.
You can download the HTML using the WebClient in PowerShell.
I don't see what any of this has to do with ASP. There is no ASP involved. It is all HTML.
¯\_(ツ)_/¯
- Edited by jrv Saturday, October 12, 2013 4:29 AM
Saturday, October 12, 2013 4:06 AM -
Here: You will have to figure out the page structure yourself. That is the time-consuming part.
$url='http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx#searchTerm=&ProductFamilyId=479&Languages=en&PageSize=10&PageIndex=0&FileId=0'
$wc=New-Object System.Net.WebClient
$page=$wc.DownloadString($url)
$xml=[xml]$page
$xml.html.body.div.div.div.ul.li
¯\_(ツ)_/¯
Saturday, October 12, 2013 4:30 AM -
(New-Object System.Net.WebClient).DownloadFile("http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx#searchTerm=&ProductFamilyId=137&Languages=en&PageSize=100&PageIndex=0&FileId=0", ".\Desktop\temp\download.html")
(New-Object System.Net.WebClient).DownloadString("http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx#searchTerm=&ProductFamilyId=137&Languages=en&PageSize=100&PageIndex=0&FileId=0")
Neither of the above two commands returns the requested parameters, no Windows Server 2003 family, no Full Product (commercial name), no SHA1, and no File Name. Am I missing your point?
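One likely reason the downloads above come back without the search results: everything after the `#` in these URLs is a fragment, and an HTTP client does not send the fragment to the server. The page's own script reads it in the browser. The sketch below only inspects the URL with `[System.Uri]` and makes no network call; the `$params` table is just an illustrative way to look at the fragment.

```powershell
# URL copied from the post above.
$url = 'http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx#searchTerm=&ProductFamilyId=137&Languages=en&PageSize=100&PageIndex=0&FileId=0'
$uri = [System.Uri]$url

# The part of the URL that is actually requested from the server.
$uri.PathAndQuery

# The part that stays on the client (includes the leading '#').
$uri.Fragment

# Split the fragment into name/value pairs to see the search parameters.
$params = @{}
foreach ($pair in $uri.Fragment.TrimStart('#') -split '&') {
    $name, $value = $pair -split '=', 2
    $params[$name] = $value
}
$params['ProductFamilyId']
```

So WebClient and a browser both ask the server for the same default.aspx page; the search parameters never reach it this way.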
Saturday, October 12, 2013 4:34 AM -
No point. If it is available that is how to get it. Chances are it is XML injected dynamically and rendered into the data div using XSLT. That is not available. The XML source is only known to the server.
I could decode it by disassembling the page and its scripts. It is time consuming and tedious and would not likely result in a solution.
¯\_(ツ)_/¯
Saturday, October 12, 2013 4:44 AM -
The only way I could imagine to do that would be an attempt to send a web request to https://msdn.microsoft.com/en-us/subscriptions/securejson/GetFileDetail. If you use the IE developer tools/network tab you can capture the request that is being sent when clicking on the "Details" tab and try to programmatically create the same. This might still fail due to lack of required permissions.
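A minimal sketch of what replaying such a captured request could look like. Only the endpoint URL comes from this thread. The POST method, the content type, the helper name `New-DetailRequest`, and the body placeholder are all assumptions; the real body has to be copied from the request captured in the developer tools. The actual call is left commented out because it needs a signed-in session.

```powershell
# Endpoint named in the post above.
$detailUri = 'https://msdn.microsoft.com/en-us/subscriptions/securejson/GetFileDetail'

# HYPOTHETICAL helper: builds the argument table for Invoke-RestMethod
# without sending anything, so the request can be inspected first.
function New-DetailRequest {
    param(
        [Parameter(Mandatory)] [string] $Body,         # body copied from the captured request
        [string] $ContentType = 'application/json',    # assumption, check the captured headers
        $WebSession                                    # optional session holding the login cookies
    )
    $request = @{
        Uri         = $detailUri
        Method      = 'Post'                           # assumption, use the captured method
        Body        = $Body
        ContentType = $ContentType
    }
    if ($WebSession) { $request.WebSession = $WebSession }
    return $request
}

# Placeholder only: the real body is whatever the browser sent.
$capturedBody = '<paste the request body captured in the developer tools here>'
$request = New-DetailRequest -Body $capturedBody

# Uncomment to send; this may still fail without the required permissions.
# $detail = Invoke-RestMethod @request
```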
Saturday, October 12, 2013 8:25 AM -
Dirk - Tried that. It doesn't work. It appears that the data is rendered dynamically via JavaScript and is not directly available in the request stream. If there is a way to do this it is not simple.
Remember that this is an Ajax page. Even developer tools do not capture the page data.
I suspect there is a way to do this but it may require direct management of the TCP link.
Maybe someone who has experience debugging an Ajax page might know how to do this. Posting in the IIS ASP.Net developer forums would be better I suspect.
¯\_(ツ)_/¯
- Edited by jrv Saturday, October 12, 2013 2:17 PM
Saturday, October 12, 2013 2:14 PM -
Strange I can see the Filename Product name and SHA within the response body.
Saturday, October 12, 2013 5:14 PM -
Strange I can see the Filename Product name and SHA within the response body.
Using what tool?
¯\_(ツ)_/¯
Saturday, October 12, 2013 5:39 PM -
If I force a Json call this is what is returned.
{ "FileId":47977, "DownloadProvider":1, "NotAuthorizedReasonId":null, "FileName":"en_windows_server_2003_sp2_ia64_cd.iso", "Description":"Windows Server 2003 Service Pack 2 (ia64) - CD (English)\r\n", "Notes":null, "Sha1Hash":"D5829B080FF2401AD259A15C54A7704529FE392D", "ProductFamilyId":138, "PostedDate":"\/Date(1318351988783)\/", "LanguageCodes":["en"], "Languages":["English"], "Size":"574 MB", "IsAuthorization":false, "BenefitLevels":["BizSpark", "BizSpark Admin", "MSDN OS (Retail)", "MSDN OS (VL)", "MSDN Platforms", "VS Premium with MSDN (MPN)", "VS Premium with MSDN (Retail)", "VS Premium with MSDN (VL)", "VS Pro with MSDN (Retail)", "VS Pro with MSDN (VL)", "VS Pro with MSDN Premium (Empower)", "VS Pro with MSDN Premium (MPN)", "VS Test Pro with MSDN (Retail)", "VS Test Pro with MSDN (VL)", "VS Ultimate with MSDN (MPN)", "VS Ultimate with MSDN (NFR FTE)", "VS Ultimate with MSDN (Retail)", "VS Ultimate with MSDN (VL)"], "IsProductKeyRequired":false }
We would need to parse all of the action links to return all of the details.
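If a response like the one above can be obtained, turning it into the three columns the question asks for is the easy part. A minimal sketch, assuming the response text is already in a string: the sample below is a shortened copy of the JSON above, and the column names `FullProduct`, `SHA1`, and `FileName` are chosen here to match the question, not taken from the site.

```powershell
# Shortened copy of the JSON shown above (only some of the fields).
$json = @'
{ "FileId":47977, "FileName":"en_windows_server_2003_sp2_ia64_cd.iso", "Description":"Windows Server 2003 Service Pack 2 (ia64) - CD (English)\r\n", "Sha1Hash":"D5829B080FF2401AD259A15C54A7704529FE392D", "ProductFamilyId":138, "Size":"574 MB" }
'@

# Parse the text into an object (ConvertFrom-Json needs PowerShell 3.0 or later).
$detail = $json | ConvertFrom-Json

# Keep only the three requested values; Trim() drops the trailing line break
# that the Description field carries.
$row = [pscustomobject]@{
    FullProduct = $detail.Description.Trim()
    SHA1        = $detail.Sha1Hash
    FileName    = $detail.FileName
}

# Render as CSV text; pipe to Export-Csv instead to write a file.
$csv = $row | ConvertTo-Csv -NoTypeInformation
$csv
```

Collecting one `$row` per file into an array and converting the whole array would give one CSV line per ISO.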
¯\_(ツ)_/¯
- Edited by jrv Saturday, October 12, 2013 5:57 PM
Saturday, October 12, 2013 5:56 PM -
Sorry for the delay, it's been a little busy recently. The messages kept getting squeezed to the right side, so I quoted instead of replied.
If I force a Json call this is what is returned.
{ "FileId":47977, "DownloadProvider":1, "NotAuthorizedReasonId":null, "FileName":"en_windows_server_2003_sp2_ia64_cd.iso", "Description":"Windows Server 2003 Service Pack 2 (ia64) - CD (English)\r\n", "Notes":null, "Sha1Hash":"D5829B080FF2401AD259A15C54A7704529FE392D", "ProductFamilyId":138, "PostedDate":"\/Date(1318351988783)\/", "LanguageCodes":["en"], "Languages":["English"], "Size":"574 MB", "IsAuthorization":false, "BenefitLevels":["BizSpark", "BizSpark Admin", "MSDN OS (Retail)", "MSDN OS (VL)", "MSDN Platforms", "VS Premium with MSDN (MPN)", "VS Premium with MSDN (Retail)", "VS Premium with MSDN (VL)", "VS Pro with MSDN (Retail)", "VS Pro with MSDN (VL)", "VS Pro with MSDN Premium (Empower)", "VS Pro with MSDN Premium (MPN)", "VS Test Pro with MSDN (Retail)", "VS Test Pro with MSDN (VL)", "VS Ultimate with MSDN (MPN)", "VS Ultimate with MSDN (NFR FTE)", "VS Ultimate with MSDN (Retail)", "VS Ultimate with MSDN (VL)"], "IsProductKeyRequired":false }
We would need to parse all of the action links to return all of the details.
¯\_(ツ)_/¯
I am uncertain what you mean by "force Json call". I don't mind parsing the information. So, you used Webclient->DownloadString($url)->$xml=[xml]$page -> some sort of Json call?
Thursday, October 17, 2013 2:02 PM -
I used the IE developer tool bar and caused the link to be evaluated. Each link on the page makes a JSON call and uses XSLT to render the results. You cannot do this directly with PowerShell.
There is no URL. There is a post back with an ID that retrieves XML. The call appears to be part of the Ajax for the page.
¯\_(ツ)_/¯
Thursday, October 17, 2013 3:38 PM -
- Moved by Bill_Stewart 9 minutes ago Off-topic post
Tuesday, December 31, 2013 10:15 PM -
- Moved by Bill_Stewart 9 minutes ago Off-topic post
The question was answered and it is not a scripting issue. You are asking how to screen scrape a page that has been designed in such a way that it cannot be "scraped". If you can figure out a way to do it then good, but most of us highly doubt that it is possible.
It also appears that the question was abandoned many months ago.
¯\_(ツ)_/¯
Tuesday, December 31, 2013 10:29 PM -
Maybe it is just me, but answered means resolved, and this question did not come to a solution so much as a dead end: not resolved. Since it was unresolved (nothing marked as the answer by the OP, me), I thought it wise to allow it to remain open for others to offer solutions.
However, if data collection is actively obfuscated by MSDN, I can see how this might be considered an integrity issue which causes incongruent policy. While I fervently disagree this to be an issue, I will capitulate and not question the matter any more.
Thursday, January 2, 2014 4:31 PM -