Scrape flash

flash, screen-scraping

I'd like to scrape a website to programmatically collect any external links inside Flash elements on the page. I'd also like to collect any other text, if possible, but the links are the important part. Is this possible? A freeware library or service would be preferable, but if none exists, how can I accomplish the task on my own? Is it possible to get the source code and pull the links from that?

Best Answer

Decompiling the Flash file would let you see its ActionScript, which in my experience often contains information such as links.

A free decompiler is Flare. It's command-line only and works fine, but it won't decode some of the information in newer Flash formats (later than CS3, I think). It dumps all the ActionScript into one file.

Sothink SWF Decompiler is a more sophisticated commercial program. It has worked fine with every Flash file I've tried, and the results are quite thorough and well organized. It's GUI-based, and I don't know whether it can be easily automated.

With Flare, since it's a command line tool, one could easily write a script to obtain the SWF, decompile it, grep for 'http://', and log the results.
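A minimal sketch of such a script in Python might look like the following. It rests on a few assumptions that are not confirmed above: that a `flare` binary is on your PATH, that it is invoked as `flare movie.swf`, and that it writes its output next to the input as `movie.flr`. The function names and the URL pattern are my own and purely illustrative; the pattern also matches `https://`, which goes slightly beyond a plain grep for 'http://'.

```python
import re
import subprocess
import tempfile
import urllib.request
from pathlib import Path

# Matches http:// and https:// URLs up to the first quote, whitespace,
# angle bracket, or closing parenthesis.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def extract_links(actionscript: str) -> list[str]:
    """Return the unique URLs in decompiled ActionScript, in order of appearance."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(URL_PATTERN.findall(actionscript)))


def decompile_with_flare(swf_path: Path) -> str:
    """Run Flare on an SWF file and return the decompiled ActionScript.

    Assumption: `flare` is on PATH and writes <name>.flr beside the input.
    """
    subprocess.run(["flare", str(swf_path)], check=True)
    return swf_path.with_suffix(".flr").read_text(errors="replace")


def scrape_swf_links(swf_url: str) -> list[str]:
    """Download an SWF, decompile it, and return the links found inside."""
    with tempfile.TemporaryDirectory() as tmp:
        swf_path = Path(tmp) / "movie.swf"
        urllib.request.urlretrieve(swf_url, swf_path)
        return extract_links(decompile_with_flare(swf_path))
```

Calling `scrape_swf_links()` with the address of an SWF file returns the list of links to log. Finding the SWF addresses on a page in the first place (for example, from `<embed>` or `<object>` tags) is a separate step not covered here.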
