Yesterday I updated the link parser that the results detail page uses to extract links to mp3s. It can now parse most of the pages that the previous version of the parser could not.
As I mentioned in my previous post, the old link parser used regular expressions to extract links from a page. This method of parsing was very fast, but not accurate enough. So I decided to switch over to a DOM-based parsing strategy: I convert the target page into a DOM structure and then pick out the links that I want using XQuery. The end result is more accurate, giving the end user more mp3 links, at the cost of a slightly slower runtime.
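To make the DOM-based idea concrete, here is a minimal sketch in Python. It is not the parser's actual code. It assumes the page is well-formed markup without XML namespaces, and it uses the small XPath subset built into Python's `xml.etree.ElementTree` as a stand-in for the XQuery step. The function name `extract_mp3_links_dom` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_mp3_links_dom(page_source):
    """Return the href of every <a> element that points at an .mp3 file.

    Hypothetical helper: assumes well-formed markup with no XML namespace.
    ET.fromstring raises ET.ParseError if the page cannot be turned into a tree.
    """
    root = ET.fromstring(page_source)
    links = []
    # ".//a[@href]" selects every <a> descendant that has an href attribute.
    for anchor in root.findall(".//a[@href]"):
        href = anchor.get("href")
        if href.lower().endswith(".mp3"):
            links.append(href)
    return links
```

Because the query runs against real elements and attributes, it only sees actual anchor tags, where a regular expression can be thrown off by unusual spacing or attribute order. Building the tree first is also where the extra runtime goes.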
One good thing about adding a second strategy is that I still have the first one in case the second doesn't work. The link parser will fall back to the regular expression parsing method if for some reason it can't convert the target page into its DOM.
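The fallback can be sketched roughly like this in Python. Again, this is an illustration under assumptions, not the real implementation: the function names and the regular expression are hypothetical, and `xml.etree.ElementTree` (which only accepts well-formed markup) stands in for the DOM conversion step.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a quoted href value that ends in .mp3.
MP3_HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+\.mp3)["\']', re.IGNORECASE)


def links_via_dom(page_source):
    """DOM strategy: build a tree, then query it for anchors with an href."""
    root = ET.fromstring(page_source)  # raises ET.ParseError on broken markup
    return [
        a.get("href")
        for a in root.findall(".//a[@href]")
        if a.get("href").lower().endswith(".mp3")
    ]


def links_via_regex(page_source):
    """Regex strategy: faster and works on any text, but less accurate."""
    return MP3_HREF_RE.findall(page_source)


def extract_mp3_links(page_source):
    """Try the DOM strategy first; fall back to regex if the page won't parse."""
    try:
        return links_via_dom(page_source)
    except ET.ParseError:
        return links_via_regex(page_source)
```

For example, a page with an unclosed `<br>` tag fails to parse as a tree in this sketch, so `extract_mp3_links` quietly returns whatever the regular expression finds instead.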