Javascript – Parse an HTML string with JS

domhtmlhtml-parsingjavascript

I want to parse a string which contains HTML text. I want to do it in JavaScript.

I tried the Pure JavaScript HTML Parser library but it seems that it parses the HTML of my current page, not from a string. Because when I try the code below, it changes the title of my page:

var parser = new HTMLtoDOM("<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>", document);

My goal is to extract links from an HTML external page that I read just like a string.

Do you know an API to do it?

Best Answer

Create a dummy DOM element and add the string to it. Then, you can manipulate it like any DOM element.

var el = document.createElement( 'html' );
el.innerHTML = "<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>";

el.getElementsByTagName( 'a' ); // Live NodeList of your anchor elements

Edit: adding a jQuery answer to please the fans!

var el = $( '<div></div>' );
el.html("<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>");

$('a', el) // All the anchor elements