I recently needed to write a script to pull some data out of an XML file, in this case the IDs of all Topic elements. I decided to use PowerShell but quickly ran into an issue due to the size of the uncompressed file: 127 MB. Here's my initial script:

# Get the ids
$File = ".\myFile.xml"
[xml]$topic = get-content $File
$arrayTopicIds = $topic.TopicResults.Topic | %{$_.Authors.Author} | %{$_.Id}
Needless to say, it consumed all my memory and froze my machine. Undeterred, I turned to Where-Object and its -FilterScript parameter. Rather than build an XML object, I used the Get-Content cmdlet to read the file as newline-delimited strings. Since I knew the pattern I was looking for, I could use -FilterScript to keep only the matching lines, and format each string the way I wanted at the same time:
# Get the ids
$topic = Get-Content $File | Where-Object -FilterScript { $_ -ilike "*Author id=*" } | %{ $_ -replace "" } | %{ $_.Trim() }
Simple, and it ended up running in under 4 minutes, which was nice.
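For illustration, here is a minimal sketch of the same line-filtering idea run against a tiny made-up sample file. The function name Get-AuthorId, the sample data, and the regex are all assumptions: the -replace pattern from the original script is not shown above, so this version uses -match with a capture group to pull out the id instead. It also assumes each Author element starts on its own line and that the id value is double-quoted.

```powershell
# Hypothetical helper (not from the original script): emits the id of every
# line that contains an opening <Author id="..."> tag.
function Get-AuthorId {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Get-Content sends the file down the pipeline one line at a time,
    # so no XML document object is ever built in memory.
    Get-Content -Path $Path |
        Where-Object -FilterScript { $_ -like '*Author id=*' } |
        ForEach-Object {
            # Assumed pattern: capture the text between the quotes after id=
            if ($_ -match 'Author id="([^"]*)"') {
                $Matches[1]
            }
        }
}

# Build a small sample file to try it on (made-up data).
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-topics.xml'
@'
<TopicResults>
  <Topic>
    <Authors>
      <Author id="101">
        <Name>First</Name>
      </Author>
      <Author id="102" />
    </Authors>
  </Topic>
</TopicResults>
'@ | Set-Content -Path $sample

# Wrap in @() so the result is an array even when only one id is found.
$arrayTopicIds = @(Get-AuthorId -Path $sample)
$arrayTopicIds
```

Because this works on raw text lines, it only holds up while the file keeps that one-element-per-line layout. If an Author tag were split across lines, or several shared one line, ids would be missed.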