Mechanize is the obvious choice if you need to scrape websites in Ruby, but it can be confusing to use, particularly if you're new to web scraping or new to Ruby.
Documentation which shows the methods, but not what they do or how to use them. Different layers of abstraction… is that a Mechanize method, or a Nokogiri method? Where are these Net::HTTP::Persistent errors coming from?
Wasting your time figuring out the code… how to use meta_refresh, locate elements, or load cookies. Spending days muddling your way through creating a simple scraper.
Or just how to find a specific link and click it—you'd think that would be really easy!
Web scraping is a useful skill to have as a Ruby developer
“If you can write a scraper which performs against any real world data, your new rate is $100 per hour.” — Patrick McKenzie (patio11)
Web scraping is one of those practical development skills that has a variety of real-world applications. If a business (read: your employer or your client) can use it to save time or generate revenue, then that makes it an in-demand skill worth knowing about.
Shipping a Ruby project that uses Mechanize demonstrates you have that skill.
Wasting hours scouring the web for outdated tutorials and broken examples doesn’t demonstrate anything to anyone. Neither does getting bogged down in obscure errors.
What if… you knew what the different methods do and how to use them?
What if… you could build a basic web crawler, and test your code?
What if… you could get your new web scraping project finished so you could spend more time developing your Ruby skills and more time polishing your side projects?
Get the Ruby Mechanize Handbook: an easy-to-follow introduction to scraping websites with Mechanize
With the Ruby Mechanize Handbook you’ll learn…
- How to build a basic web crawler
- How to deal with navigation errors like Mechanize::ResponseCodeError
- How to follow links and handle international domain names
- How to extract links, text, and microdata from the web page content
- How to download files and save all the images on a page
- How to export CSV files, PDF documents, and XLSX spreadsheets
- How to automate forms, upload files, select dropdowns, and toggle the right checkboxes
- How to set up logging and debug Mechanize request headers
- How to debug SSL requests with Charles Proxy, without OpenSSL::SSL::SSLError exceptions
- How to automatically detect DOM changes when your scraper inevitably breaks
- How to save, load, and manually add cookies
- How to run your scraper code in background jobs with Sidekiq
- How to throttle your scraper to support rate-limited requests
- How to test your code without hitting the network
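To give a flavour of one topic from the list, throttling, here is a minimal sketch. It is not taken from the handbook, and the `Throttle` class, the 0.2 second interval, and the example.com URLs are all hypothetical. It uses only plain Ruby and enforces a minimum gap between calls; the block stands in for a real request such as `agent.get(url)` on a Mechanize agent.

```ruby
# Hypothetical helper: enforces a minimum gap between successive calls.
class Throttle
  def initialize(min_interval)
    @min_interval = min_interval # minimum seconds between calls
    @last_call = nil
  end

  # Sleeps just long enough to keep calls at least min_interval apart,
  # then runs the block and returns its result.
  def call
    if @last_call
      elapsed = now - @last_call
      sleep(@min_interval - elapsed) if elapsed < @min_interval
    end
    @last_call = now
    yield
  end

  private

  # Monotonic clock, so system clock adjustments don't affect the gap.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

throttle = Throttle.new(0.2)
urls = %w[https://example.com/a https://example.com/b https://example.com/c]

# Stand-in for a real request; with Mechanize this might be agent.get(url).
pages = urls.map { |url| throttle.call { "fetched #{url}" } }
```

Because the throttle only wraps a block, it doesn't care what makes the request, which also makes it easy to exercise without hitting the network.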
Get up to speed quickly, and unlock the power of web scraping with Ruby. Get your copy:
Nice things people have said…
“I’ve also read your handbook, thanks for writing it! It’s indeed handy and right to the point; a brisk info dump without filler. It’s appreciated!” — Kasper Timm Hansen
“Well worth the $39, it’s cut my web scraping project down considerably. Cheers!” — Jonathan Korty
“It is really informative and a great start for beginners! I like how you expand the code base gradually and each line of code really taught me something!” — Faye Fang
“The book is a great way to quickly get up and running with web scraping for junior to experienced developers.” — William Kennedy
“Being completely new to web scraping and Ruby reading your book was easier to grasp than digesting the documentation(s). Simply put, being overwhelmed with course work and a regular job it’s just been super handy to have your reference manual on hand so I can get to the meat without skimming!” — John Mayo
Questions you might have…
Who is this handbook for?
Ruby developers who want to learn about crawling and scraping websites with Ruby Mechanize. If you’ve been struggling to get your Mechanize code to work, this is for you.
What format is the handbook?
DRM-free PDF, ideal for on-screen reading on your desktop, your laptop, or your tablet. (There are no plans to offer print, ePub, or Kindle versions for the foreseeable future.)
Can I get the handbook for my entire team?
Sure! You can get a site license (25 seat maximum) for $299, with alternative invoicing and payment options available. Email me to buy a site license.
Is the payment process secure?
Yep! Credit card payments are processed securely through Stripe, with everything transmitted securely over TLS/SSL as you’d expect.
What if I’m not happy with the handbook?
Not a problem. If you’re not sure you received your money’s worth from the handbook, just email me and I’ll refund your purchase in full.