Using Ruby and Nokogiri to select links based on part of the URL


I have a document containing anchor (a href) links that I want to extract. The links I want can be identified by part of the URL they point to. There are other, similar links that I want to discard.

The URLs of the links I want are of the format


I want to find the links whose URL contains h1=. Is this possible?

Best Answer

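You can filter the document's set of a tags with an ordinary Enumerable method. Use select to get every matching link (find would return only the first match), and call to_s on the href so that anchors without an href attribute do not raise NoMethodError.

document.search('a').select { |link| link['href'].to_s.include?('h1=') }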