Javascript – Nokogiri: Sort Array of IDs according to order in HTML document

dom javascript nokogiri ruby ruby-on-rails

I have an unsorted Array holding the following IDs:

@un_array = ['bar', 'para-3', 'para-2', 'para-7']

Is there a smart way to use Nokogiri (or plain JavaScript) to sort the array by the order in which the IDs appear in the example HTML document below?

require 'rubygems'
require 'nokogiri'

value = Nokogiri::HTML.parse(<<-HTML_END)
  <html>
    <head>
    </head>
    <body>
      <p id='para-1'>A</p>
      <div id='foo'>
        <p id='para-2'>B</p>
        <p id='para-3'>C</p>
        <div id='bar'>
          <p id='para-4'>D</p>
          <p id='para-5'>E</p>
          <p id='para-6'>F</p>
        </div>
        <p id='para-7'>G</p>
      </div>
      <p id='para-8'>H</p>
    </body>
  </html>
HTML_END

In this case the resulting, sorted array should be:

['para-2', 'para-3', 'bar', 'para-7']

Best Answer

I don't know Nokogiri, but if you have the HTML code as a string, then you can recover the order with regexp matching, for example:

var str = '<html>...</html>'; // the HTML code to check
var ids = ['bar', 'para-3', 'para-2', 'para-7']; // the array with all IDs to check
var reg = new RegExp('(?:id=[\'"])(' + ids.join('|') + ')(?:[\'"])', 'g'); // the regexp
var result = [], tmp; // array holding the result and a temporary variable
while ((tmp = reg.exec(str)) !== null) result.push(tmp[1]); // matching the IDs
console.log(result); // ['para-2', 'para-3', 'bar', 'para-7']

When using this code, be careful with IDs that contain regexp meta-characters (such as '.' or '+'). They must be escaped before they are joined into the pattern.
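
The escaping step could look roughly like the sketch below. The names escapeRegExp and sortIdsByDocumentOrder are hypothetical helpers introduced here for illustration, not part of any library. The sketch also assumes that every id attribute is written as id='...' or id="...", as in the example document.

```javascript
// Escape characters that have a special meaning inside a regular expression.
// (Hypothetical helper name; the character list follows the common escaping idiom.)
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the given IDs in the order in which their id attributes occur in the HTML string.
// (Hypothetical helper; assumes quoted id attributes. IDs not found in the HTML are omitted.)
function sortIdsByDocumentOrder(html, ids) {
  var pattern = new RegExp(
    '\\bid=([\'"])(' + ids.map(escapeRegExp).join('|') + ')\\1', // \1 requires the same closing quote
    'g'
  );
  var result = [], match;
  while ((match = pattern.exec(html)) !== null) {
    result.push(match[2]); // group 2 holds the ID itself
  }
  return result;
}

var html = "<div id='foo'><p id='para-2'>B</p><p id='para-3'>C</p>" +
           "<div id='bar'></div><p id='para-7'>G</p></div>";
console.log(sortIdsByDocumentOrder(html, ['bar', 'para-3', 'para-2', 'para-7']));
// logs the IDs in document order: para-2, para-3, bar, para-7
```

Since this only scans text and does not parse the HTML, it would also pick up id-like text inside comments or attributes such as data-id, so it is best suited to markup you control.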