
#1 11 Feb 2011 05:26

PowerShell2737
Member
Registered: 11 Feb 2011
Posts: 1

Help Parsing HTML File

Hello,
I need to extract specific data from an HTML page.

Data I need:

http://pics.ABC123.com/thumbnails/78/99 … 337313.jpg
http://pics.ABC123.com/thumbnails/78/99 … 849839.jpg

I found the command below. The problem is that it finds the substring and
then returns everything after it. I am new to PowerShell.

Get-Content TEMP.html | out-string |% {$_.substring($_.IndexOf('http://pics.ABC123.com/thumbnails/'))}

I need it to find every instance of the link and stop at the .jpg, since everything after /thumbnails/ is unknown.

Thanks in advance.


#2 15 Feb 2011 12:45

allal
Member
Registered: 10 Jan 2011
Posts: 48

Re: Help Parsing HTML File

Try this. It returns every line of the file that contains a matching link (the whole line, not just the URL):

Get-Content TEMP.html | Where-Object { $_ -match 'http://pics\.ABC123\.com/thumbnails/.*\.jpg' }
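
If only the URLs themselves are wanted rather than the lines that contain them, one possible approach is to hand the text to the .NET regex class and keep just the matched values. This is a rough sketch, not a tested solution for the real page: the sample HTML and the file names in it are made up, and it assumes each link sits inside a quoted attribute and contains no spaces.

```powershell
# Hypothetical sample standing in for the contents of TEMP.html.
# For the real file, something like: $html = Get-Content TEMP.html | Out-String
$html = @'
<a href="/view/1"><img src="http://pics.ABC123.com/thumbnails/12/34/aaa.jpg" alt="one"></a>
<a href="/view/2"><img src="http://pics.ABC123.com/thumbnails/56/78/bbb.jpg" alt="two"></a>
'@

# Dots are escaped so they match literally. The character class excludes
# quotes, whitespace and angle brackets, and the lazy +? stops at the
# first ".jpg", so each match ends where its own link ends.
$pattern = 'http://pics\.ABC123\.com/thumbnails/[^"''\s<>]+?\.jpg'

# Matches() returns every occurrence; keep only the matched text.
# @( ) forces an array even when there is a single match.
$links = @([regex]::Matches($html, $pattern) | ForEach-Object { $_.Value })

$links
```

The lazy quantifier matters when several links share one line: a greedy .* would run from the first http:// to the last .jpg on that line and return one long string.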

Last edited by allal (15 Feb 2011 12:46)

