Ruby Image Grabber
Posted by oxaric on December 7, 2008
Yesterday I coded an image grabber in Ruby. It takes a URL, fetches the HTML file, and downloads every image referenced in that file. It has one extra option: the minimum file size of a picture to download. It is not a web crawler and will not follow links to grab other images, but I plan to build an image web crawler on top of this code and hopefully I’ll have that up soon.
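To make the fetch-and-filter idea concrete, here is a minimal sketch of the two steps described above. It is not the code from grabimages.rb: the helper names `fetch` and `save_image` are hypothetical, the sketch assumes 1 kB means 1024 bytes, and it does not follow HTTP redirects.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: fetch a URL and return the response body,
# or nil when the server does not answer with a 2xx status.
def fetch(url)
  response = Net::HTTP.get_response(URI(url))
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
end

# Hypothetical helper: write the data to path only when it is at least
# min_kb kilobytes (assumed here to be 1024 bytes each).
# Returns true if the file was written, false if it was skipped.
def save_image(data, path, min_kb = 0)
  return false if data.nil? || data.bytesize < min_kb * 1024

  File.binwrite(path, data)
  true
end
```

Something like `save_image(fetch(image_url), 'download/cat.jpg', 50)` would then keep only pictures of 50 kB or more.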
Note that it cannot download images referenced through a PHP script. For example, certain forums display images with a PHP script and reference them with something like “show.php?image_file_name.jpg”. The program does not follow links to other pages, so it cannot download PHP-referenced images.
However, if a gallery has thumbnails that link directly to the bigger image, this program will grab both images.
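The gallery case works if both `src` and `href` attributes are treated as candidate image references. The sketch below shows one way to do that. It is an assumption about the approach, not the program's actual code, and `extract_refs` is a hypothetical name. It uses a simple regular expression rather than a real HTML parser, so it only handles quoted attribute values.

```ruby
require 'uri'

# Hypothetical helper: collect every quoted src/href value in the HTML
# and resolve it against the page URL, so that relative references
# such as "thumbs/cat.jpg" become absolute URLs.
def extract_refs(html, page_url)
  refs = html.scan(/\b(?:src|href)\s*=\s*(["'])(.*?)\1/im).map do |_quote, ref|
    begin
      URI.join(page_url, ref.strip).to_s
    rescue URI::Error
      nil # skip references that are not valid URIs
    end
  end
  refs.compact.uniq
end
```

For a thumbnail wrapped in a link, such as `<a href="full/cat.jpg"><img src="thumbs/cat.jpg"></a>`, this returns both the full-size and the thumbnail URL.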
A neat feature is that the program takes special HTML codes into account and should have no problem with ‘coded’ URLs or foreign-language image names.
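As an illustration of what handling those codes can look like, the sketch below decodes HTML entities first and percent-escapes second. This is only a guess at the technique, not the program's real code: `decode_image_ref` is a hypothetical name, and the sketch assumes the page and its file names are UTF-8.

```ruby
require 'cgi'

# Hypothetical helper: turn a raw attribute value into a readable name.
def decode_image_ref(raw)
  # Step 1: HTML entities, e.g. "&amp;" becomes "&".
  unescaped = CGI.unescapeHTML(raw)
  # Step 2: percent-escapes, decoded byte by byte on a binary copy so
  # that multi-byte UTF-8 sequences such as %C3%A9 are reassembled.
  unescaped.b.gsub(/%(\h\h)/) { $1.hex.chr }.force_encoding('UTF-8')
end
```

With this, `decode_image_ref('caf%C3%A9%20menu.jpg')` gives a name with a real "é" and a real space, which is friendlier as a local file name.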
Normally I’d put the source code up here, but the program contains special ASCII characters and the code formatting isn’t displaying them properly. I think it’s worth your time to download the program and give it a shot. ;)
Click to directly download grabimages.rb
More Information:
It grabs these image types:
.jpg
.jpeg
.png
.bmp
.gif
.tif
.tiff
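A check against the list above could look like the following sketch. It is not the program's actual code: `image_url?` is a hypothetical name, and matching case-insensitively is an assumption. It tests only the path part of the URL, which is also why a reference like “show.php?image_file_name.jpg” is not recognized as an image.

```ruby
require 'uri'

# The extensions listed in this post.
IMAGE_EXTENSIONS = %w[.jpg .jpeg .png .bmp .gif .tif .tiff].freeze

# Hypothetical helper: true when the URL's path ends in a known image
# extension. The query string is ignored, and case is ignored too
# (an assumption, so that "Photo.JPG" counts as an image).
def image_url?(url)
  path = URI.parse(url).path.to_s
  IMAGE_EXTENSIONS.include?(File.extname(path).downcase)
rescue URI::InvalidURIError
  false
end
```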
Usage: ruby grabimages.rb [URL] [Download Path] [Option: Minimum Picture Size, in kB]
Usage Example:
~/test> ruby grabimages.rb www.yahoo.com download/
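For readers curious how a command line like this might be read, here is a minimal sketch. It is not taken from grabimages.rb: `parse_args` is a hypothetical name, and two behaviours are assumptions: a bare host such as www.yahoo.com is treated as plain HTTP, and a missing size option means no minimum.

```ruby
USAGE = 'Usage: ruby grabimages.rb [URL] [Download Path] ' \
        '[Option: Minimum Picture Size, in kB]'.freeze

# Hypothetical helper: turn the argument list into a small options hash.
def parse_args(argv)
  url, download_path, min_kb = argv
  raise ArgumentError, USAGE if url.nil? || download_path.nil?

  # Assumption: a URL without a scheme is meant as plain HTTP.
  url = "http://#{url}" unless url =~ %r{\A[a-z][a-z0-9+.\-]*://}i

  # Assumption: no third argument means "download every size".
  # Integer() raises ArgumentError for input that is not a whole number.
  { url: url, path: download_path, min_kb: min_kb.nil? ? 0 : Integer(min_kb) }
end
```

In a script this would be called as `options = parse_args(ARGV)`.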