
The answer is: a tree data structure is generated by parsing an HTML document.

Parsing means dividing a document into its components and checking their syntactic roles; input that does not follow the syntax rules produces what is known as a “parse error”. The job of the HTML parser is to turn HTML markup into a parse tree. The parse tree is a tree of DOM (Document Object Model) element and attribute nodes; it is essentially an object representation of the HTML document. The root of the tree is the “Document” node. Below it, the html element has head and body as child nodes. The head node has meta and title nodes as children. The body node has child nodes of its own, such as a div node, which in turn has h1 and p nodes as children.

You can see this structure in the example below.


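<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
  </head>
  <body>
    <div class="box"><h1>Heading</h1><p>Paragraph</p></div>
  </body>
</html>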
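To make the tree shape concrete, here is a minimal sketch that builds a simplified element tree and prints it as an indented outline. It uses Python's standard-library `html.parser`, which is not a spec-compliant HTML5 parser, so treat it as an illustration only. The names `Node`, `TreeBuilder`, `outline`, `VOID` and the `SAMPLE` markup are assumptions made for this sketch.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a subset of the HTML void elements).
VOID = {"meta", "link", "br", "img", "input", "hr"}


class Node:
    """One node of the simplified tree: a name plus child nodes."""

    def __init__(self, name):
        self.name = name
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simplified element tree (illustrative, not HTML5-compliant)."""

    def __init__(self):
        super().__init__()
        self.root = Node("Document")
        self.stack = [self.root]  # stack of currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)  # stays open until its end tag arrives

    def handle_endtag(self, tag):
        # Close everything down to the matching open element, if one exists.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].name == tag:
                del self.stack[i:]
                break


def outline(node, depth=0):
    """Return the tree as a list of indented lines."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical sample markup with the structure described in the text.
SAMPLE = """<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><title>Demo</title></head>
<body><div class="box"><h1>Hello</h1><p>Text</p></div></body>
</html>"""

builder = TreeBuilder()
builder.feed(SAMPLE)
builder.close()
tree_lines = outline(builder.root)
print("\n".join(tree_lines))
```

The printed outline starts at `Document`, followed by `html`, with `head` (containing `meta` and `title`) and `body` (containing `div`, which contains `h1` and `p`) nested beneath it.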

The parsing algorithm defined by HTML5 has two stages:

  1. Tokenization
  2. Tree construction

Tokenization splits the input into tokens. The tokenizer reads the markup from the start and recognizes each token, such as a start tag with its attributes, an end tag, or a run of text, and emits it as soon as it is complete. It does not search ahead for the matching close tag; pairing start and end tags is the job of the tree constructor. Each token is handed to the tree constructor, which adds the corresponding nodes to the tree and so builds the DOM discussed above. Malformed markup does not stop the process: the HTML5 algorithm defines how the parser recovers from parse errors. If a script runs during parsing, it can modify the document, and the parser has to take that into account before it continues. Once the DOM is built, the browser uses it to display the document on the web page.
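The tokenization stage can be sketched by logging the events that a parser reports while it reads markup. This again uses Python's standard-library `html.parser` as a stand-in; the real HTML5 tokenizer emits one character token per character and also produces DOCTYPE, comment and end-of-file tokens. The class name `TokenLogger`, the token labels and the sample markup are assumptions made for this sketch.

```python
from html.parser import HTMLParser


class TokenLogger(HTMLParser):
    """Records a simplified token stream (illustrative, not HTML5-compliant)."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # A start tag token carries the tag name and its attributes.
        self.tokens.append(("StartTag", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.tokens.append(("EndTag", tag))

    def handle_data(self, data):
        # Text between tags; whitespace-only runs are skipped for brevity.
        if data.strip():
            self.tokens.append(("Character", data))


logger = TokenLogger()
logger.feed('<div class="box"><h1>Title</h1><p>Some text</p></div>')
logger.close()

for token in logger.tokens:
    print(token)
```

The tokens come out in document order, and the end tag for `div` only appears after everything nested inside it. A tree constructor consuming this stream would keep a stack of open elements to decide where each new node belongs.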

This is how the parsing of an HTML document works at a high level. There is much more to the parsing process if we dive deeper into the topic, but for now this answer should give you enough understanding and motivation to work with HTML.