Using Ruby’s http library - download and process web pages - I
Ruby has excellent networking support. At the low level it offers sockets and TCP/IP, and at the high level it provides APIs for protocols such as HTTP and FTP. In this post we will look at Ruby's HTTP library, net/http, and at how it can be used to download and process web pages.
1. Downloading a web page using Ruby
The following code uses the net/http library to download Google’s home page. You should see the HTML of the Google home page in the console!
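A minimal sketch of such a download, assuming plain HTTP and a single GET request. The helper name `fetch_page` is illustrative and not part of net/http.

```ruby
require 'net/http'
require 'uri'

# Send one GET request to the given URL and return the Net::HTTPResponse.
# Redirects are NOT followed here; the caller sees whatever the server sent.
def fetch_page(url)
  Net::HTTP.get_response(URI.parse(url))
end
```

Calling `puts fetch_page('http://www.google.com/').body` prints the response body to the console.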
On my machine, this returns text that says “document has moved”. This is because Google sends a redirect to www.google.co.in, and the code above does not follow it. The following code shows how we can handle an HTTP redirect.
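One way to handle the redirect is to check the response class and request the URL given in the `Location` header. The sketch below assumes a cap of five redirects so that a redirect loop cannot run forever; the helper name `fetch_following_redirects` and the default limit are illustrative choices.

```ruby
require 'net/http'
require 'uri'

# Fetch a URL, following at most `limit` HTTP redirects.
# Raises ArgumentError once the redirect budget is used up.
def fetch_following_redirects(url, limit = 5)
  raise ArgumentError, 'too many HTTP redirects' if limit <= 0

  response = Net::HTTP.get_response(URI.parse(url))

  case response
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against the current URL.
    next_url = URI.join(url, response['location']).to_s
    fetch_following_redirects(next_url, limit - 1)
  else
    response
  end
end
```

With this version, `puts fetch_following_redirects('http://www.google.com/').body` should print the page the redirect points to rather than the “document has moved” notice.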
How do we rewrite this if we need to go through a proxy server to reach the internet? In Ruby it is pretty simple. Check out the new version below.
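A sketch of a proxy-aware version, assuming an unauthenticated HTTP proxy and plain HTTP targets. It relies on `Net::HTTP.new` accepting the proxy address and port as its third and fourth arguments. The helper name `fetch_via_proxy` is illustrative, and the proxy host and port are values you supply for your own network.

```ruby
require 'net/http'
require 'uri'

# Fetch a URL through an HTTP proxy, following at most `limit` redirects.
# proxy_host and proxy_port are supplied by the caller.
def fetch_via_proxy(url, proxy_host, proxy_port, limit = 5)
  raise ArgumentError, 'too many HTTP redirects' if limit <= 0

  uri = URI.parse(url)
  # The third and fourth arguments tell Net::HTTP to connect via the proxy.
  http = Net::HTTP.new(uri.host, uri.port, proxy_host, proxy_port)
  response = http.start { |conn| conn.get(uri.request_uri) }

  case response
  when Net::HTTPRedirection
    # Resolve a possibly relative Location header and retry via the same proxy.
    next_url = URI.join(url, response['location']).to_s
    fetch_via_proxy(next_url, proxy_host, proxy_port, limit - 1)
  else
    response
  end
end
```

For example, `fetch_via_proxy('http://www.google.com/', 'proxy.example.com', 8080)` would route the request through a proxy at the placeholder address `proxy.example.com:8080`. A proxy that requires a login would also need the user name and password, which `Net::HTTP.new` takes as its fifth and sixth arguments.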