How browsers work

Introduction

  • Web browsers are probably among the most widely used software.
  • The main function of a web browser is to present the web resource you choose by requesting it from a server and displaying it in the browser window.
  • The location of the resource is specified using a Uniform Resource Identifier (URI).
  • The way the browser interprets and displays HTML files is specified in the HTML and CSS specs, which are maintained by the W3C (the HTML standard is now maintained by the WHATWG).
  • The specs do not define the UI elements a browser must have, but they list some common elements, e.g. address bar, status bar, tool bar.

Browser High Level Structure

  • Main components:
    • The user interface:
      • includes the address bar, back/forward buttons, bookmarking menu, etc.: every part of the browser display except the window where you see the requested page.
    • The browser engine
      • marshals actions between UI and rendering engine.
    • The rendering engine
      • responsible for displaying the requested content, e.g. parsing HTML and CSS and displaying the parsed content on the screen.
    • Networking
      • for network calls such as HTTP requests, using different implementations for different platforms behind a platform-independent interface.
    • UI backend
      • used for drawing basic widgets like combo boxes and windows.
      • exposes a generic interface that is not platform specific.
      • underneath, it uses operating system user interface methods.
    • JavaScript interpreter
      • parses and executes JavaScript code.
    • Data storage
      • persistence layer
      • the browser may need to save all sorts of data locally, e.g. cookies.
      • browsers also support storage mechanisms such as localStorage, IndexedDB, WebSQL and FileSystem
  • Chrome runs multiple instances of the rendering engine: one for each tab.

Rendering Engines

  • Internet Explorer uses Trident, Firefox uses Gecko, Safari uses WebKit, and Chrome and Opera (from version 15) use Blink, a fork of WebKit.
  • WebKit is an open source rendering engine which started as an engine for the Linux platform and was modified by Apple to support Mac and Windows.

The main flow

  • The rendering engine starts by getting the contents of the requested document from the networking layer, usually in 8 kB chunks.
  • After that, the basic flow of the rendering engine is:
    • Parsing HTML to construct the DOM tree --> Render tree construction --> Layout of the render tree --> Painting render tree
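  • This is a gradual process: for a better user experience, the rendering engine tries to display contents on the screen as soon as possible, without waiting for all of the HTML to be parsed before it starts building and laying out the render tree.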
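  • Parsing: grammars and the parser-lexer combination
  • Document --> Lexical Analysis --> Syntax Analysis --> Parse Tree
  • The parsing process is iterative: the parser asks the lexer for the next token and tries to match it against a syntax rule; if a rule matches, a node for the token is added to the parse tree and the parser requests another token.
  • If no rule can be matched, the parser raises an exception, meaning the document is invalid and contains syntax errors.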
  • Translation
  • A context-free grammar is a grammar that can be expressed entirely in BNF (Backus-Naur Form).
  • Types of parsers
    • Top-down parser
      • examines the high-level structure of the syntax and tries to find a matching rule.
    • Bottom-up parser
      • starts with the input and gradually transforms it into syntax rules, starting from the low-level rules until the high-level rules are met.
      • also called a shift-reduce parser
  • Generating parsers
    • parser generators are tools that take the grammar of your language (its vocabulary and syntax rules) and generate a working parser.
    • WebKit uses Flex for creating a lexer and Bison for creating a parser.
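
To make the lexer and parser combination above more concrete, here is a minimal sketch of a hand-written top-down (recursive descent) parser. The toy arithmetic grammar, the token kinds and the names `tokenize`, `Parser` and `parse` are assumptions made for this illustration; they are not taken from any browser.

```python
import re


def tokenize(text):
    """Lexical analysis: split the input into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if lexeme[0] in "0123456789":
            tokens.append(("NUM", int(lexeme)))
        elif lexeme in ("+", "-", "(", ")"):
            tokens.append((lexeme, lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    return tokens


class Parser:
    """Syntax analysis for the toy grammar:

    expression := term (("+" | "-") term)*
    term       := NUM | "(" expression ")"
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Kind of the next token, or "EOF" when the input is exhausted.
        if self.pos < len(self.tokens):
            return self.tokens[self.pos][0]
        return "EOF"

    def next_token(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, kind):
        # Raise an exception when the token does not match the rule.
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        return self.next_token()

    def expression(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op, _ = self.next_token()
            node = (op, node, self.term())  # grow the parse tree
        return node

    def term(self):
        if self.peek() == "NUM":
            return self.next_token()[1]
        self.expect("(")
        node = self.expression()
        self.expect(")")
        return node


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expression()
    if parser.peek() != "EOF":
        raise SyntaxError(f"unexpected token {parser.peek()}")
    return tree
```

With these definitions, `parse("2 + 3 - 1")` returns the nested tuple `("-", ("+", 2, 3), 1)`, and `parse("2 +")` raises `SyntaxError` because no rule matches the end of the input.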
  • HTML parser
    • HTML cannot easily be defined by a context-free grammar
    • the formal format for defining HTML is the Document Type Definition (DTD)
    • unlike XML, HTML is forgiving: it tolerates omitted or malformed tags, which is why it lacks a strict formal grammar
    • DOM
      • the output tree is a tree of DOM element and attribute nodes
      • it is the object representation of the HTML document and the interface of HTML elements to the outside world, e.g. JavaScript.
      • it has an almost one-to-one relation to the markup.
    • HTML parsing is reentrant: the source may change during parsing, e.g. when a script calls document.write
    • the parsing algorithm has two stages: tokenization and tree construction
    • The tokenization algorithm is expressed as a state machine: each state consumes one or more characters of the input stream and updates the next state according to those characters.
    • This means the same character may yield different results for the correct next state, depending on the current state.
    • Browser error tolerance: browsers fix up invalid HTML instead of reporting a syntax error.
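
The state machine idea can be sketched in a few lines. The following is a heavily simplified illustration, not the tokenization algorithm from the HTML specification: it knows only three states, does not handle attributes, comments or character references, and the function name `tokenize_html` and the token tuples are assumptions made for this example.

```python
def tokenize_html(source):
    """Split markup into ("start_tag", name), ("end_tag", name) and ("text", data)."""
    tokens = []
    state = "data"
    text = ""        # character data collected so far
    tag_name = ""
    is_end_tag = False

    for ch in source:
        if state == "data":
            if ch == "<":
                state = "tag_open"
            else:
                text += ch  # ">" and "/" are plain text in this state
        elif state == "tag_open":
            if ch == "/" or ch.isalpha():
                if text:
                    tokens.append(("text", text))
                    text = ""
                is_end_tag = ch == "/"
                tag_name = "" if is_end_tag else ch.lower()
                state = "tag_name"
            else:
                # Tolerate a stray "<": treat it as text instead of failing.
                text += "<" + ch
                state = "data"
        else:  # state == "tag_name"
            if ch == ">":
                kind = "end_tag" if is_end_tag else "start_tag"
                tokens.append((kind, tag_name))
                state = "data"
            else:
                tag_name += ch.lower()

    if state == "tag_open":
        text += "<"  # input ended right after a "<"
    if text:
        tokens.append(("text", text))
    # A tag left unfinished at the end of the input is dropped in this sketch.
    return tokens
```

For example, `tokenize_html("<p>Hi</p>")` returns `[("start_tag", "p"), ("text", "Hi"), ("end_tag", "p")]`, while `tokenize_html("1 > 0")` returns `[("text", "1 > 0")]`: the character `>` closes a tag in the "tag_name" state but is ordinary text in the "data" state.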
  • CSS parsing
    • CSS is a context-free grammar (CFG) language, so it can be parsed with the types of parsers described above.
    • Webkit uses Flex and Bison parser generators to create parsers automatically from CSS grammar files.
    • Firefox uses a top down manually written parser.
  • Script processing
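    • originally synchronous: a script is parsed and executed as soon as the parser reaches its script tag, and document parsing halts until the script has run.
    • the defer attribute lets a script execute after the document has been parsed instead.
    • speculative parsing: while scripts execute, another thread parses the rest of the document only to find and load external resources (scripts, style sheets, images); it does not modify the DOM tree.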
    • style sheets
  • Render Tree construction
    • while DOM tree is being constructed, browser constructs another tree, the render tree.
    • This tree is of visual elements in the order in which they will be displayed
    • It is the visual representation of the document
    • Purpose of this tree is to enable painting the contents in their correct order.
    • Firefox calls the elements of the render tree "frames"; WebKit uses the term "renderer" or "render object".
    • A renderer knows how to lay out and paint itself and its children. It usually represents a rectangular area corresponding to a node's CSS box, as described by the CSS2 spec.
    • The render tree's relation to the DOM tree is not one-to-one. Non-visual elements such as head are not inserted into the render tree, and neither are elements whose display value is none (elements with visibility: hidden do appear in the tree). Some DOM elements correspond to several visual objects, e.g. select elements.
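
A rough sketch of how a render tree can be derived from a DOM tree follows. It covers only the two simplest exclusion cases (non-visual tags and `display: none`); the classes `DomNode` and `RenderObject`, the tag set `NON_VISUAL_TAGS` and the function `build_render_tree` are hypothetical names chosen for this example, not browser APIs.

```python
from dataclasses import dataclass, field

# Assumed list of tags that never produce a renderer in this sketch.
NON_VISUAL_TAGS = {"head", "script", "meta", "title"}


@dataclass
class DomNode:
    tag: str
    style: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class RenderObject:
    node: DomNode
    children: list = field(default_factory=list)


def build_render_tree(node):
    """Return a RenderObject for node, or None if it is not rendered."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None  # the whole subtree is skipped
    renderer = RenderObject(node)
    for child in node.children:
        child_renderer = build_render_tree(child)
        if child_renderer is not None:
            renderer.children.append(child_renderer)
    return renderer
```

Given a DOM of `html` containing `head` and `body`, where `body` holds a `p` and a `div` styled with `display: none`, the resulting render tree contains only `html`, `body` and `p`.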
  • The flow of constructing the tree.
    • In Firefox, the presentation is registered as a listener for DOM updates.
    • The presentation delegates frame creation to the FrameConstructor, which resolves style and creates a frame.
    • In WebKit, the process of resolving the style and creating a renderer is called "attachment". Every DOM node has an "attach" method.
    • Attachment is synchronous: inserting a node into the DOM tree calls the new node's "attach" method.
  • Style computation
    • calculating the style properties of each element, visual properties of each render object.
  • Sharing style data
  • Using the rule tree to compute style contexts
  • Style sheet cascade order
    • cascade order, from low to high precedence:
      • Browser declarations
      • User normal declarations
      • Author normal declarations
      • Author important declarations
      • User important declarations
    • declarations with the same precedence are sorted by specificity and then by the order in which they are specified.
      • Specificity
        • count 1 if the declaration comes from a style attribute rather than a rule with a selector, 0 otherwise (= a)
        • count the number of ID attributes in the selector (= b)
        • count the number of other attributes and pseudo-classes in the selector (= c)
        • count the number of element names and pseudo-elements in the selector (= d)
      • concatenating the four numbers a-b-c-d (in a number system with a large base) gives the specificity
      • after rules are matched they are sorted according to the cascade rules; WebKit uses bubble sort for small lists and merge sort for larger ones.
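
The specificity count can be approximated with a few regular expressions. This is a simplified sketch, not how any browser computes it: the function name `specificity` and its `from_style_attribute` parameter are assumptions for this example, functional pseudo-classes such as `:not(...)` are not handled, and single-colon pseudo-elements such as `:first-line` are counted as pseudo-classes.

```python
import re


def _count_and_strip(pattern, selector):
    """Count matches of pattern and return the selector without them."""
    matches = re.findall(pattern, selector)
    return len(matches), re.sub(pattern, "", selector)


def specificity(selector, from_style_attribute=False):
    """Return the specificity as a tuple (a, b, c, d)."""
    if from_style_attribute:
        return (1, 0, 0, 0)
    # Strip attribute selectors first so their contents are not counted.
    attributes, rest = _count_and_strip(r"\[[^\]]*\]", selector)
    pseudo_elements, rest = _count_and_strip(r"::[\w-]+", rest)
    pseudo_classes, rest = _count_and_strip(r":[\w-]+", rest)
    ids, rest = _count_and_strip(r"#[\w-]+", rest)
    classes, rest = _count_and_strip(r"\.[\w-]+", rest)
    # Whatever names remain are element names; "*" and combinators count for nothing.
    elements = len(re.findall(r"[A-Za-z][\w-]*", rest))
    return (0, ids, attributes + classes + pseudo_classes, elements + pseudo_elements)
```

For example, `specificity("li.red.level")` returns `(0, 0, 2, 1)` and `specificity("#x34y")` returns `(0, 1, 0, 0)`. Python compares tuples element by element, which has the same effect as concatenating the four numbers, so `sorted(selectors, key=specificity)` orders selectors from least to most specific.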
  • Webkit uses a flag to mark whether all top level style sheets have been loaded.
  • Layout
    • when a renderer is created and added to the tree, it does not have a position and size; calculating these values is called layout or reflow.
    • HTML uses a flow-based layout model, meaning that most of the time it is possible to compute the geometry in a single pass.
    • Elements later in the flow typically don't affect the geometry of elements earlier in the flow.
    • The coordinate system is relative to the root frame. Top and left coordinates are used.
    • Layout is a recursive process: it begins at the root renderer and continues through some or all of the frame hierarchy, computing geometric information for each renderer that requires it.
  • Dirty bit system
    • to avoid a full layout for every small change, browsers use a dirty bit system: a renderer that is changed or added marks itself and its children as "dirty", i.e. needing layout.
    • there are two flags: "dirty", and "children are dirty", which means that the renderer itself may be okay but at least one of its children needs layout.
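
The two flags can be sketched as follows. This is an illustrative model only: the class `Renderer`, its `mark_dirty` and `layout` methods and the `log` list (which stands in for the real geometry computation) are assumptions made for this example.

```python
class Renderer:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.dirty = False        # this renderer needs layout
        self.child_dirty = False  # at least one descendant needs layout
        if parent is not None:
            parent.children.append(self)
        self.mark_dirty()  # a newly added renderer always needs layout

    def mark_dirty(self):
        # Mark this renderer and its children as dirty...
        self._mark_subtree_dirty()
        # ...and tell every ancestor that something below it changed.
        ancestor = self.parent
        while ancestor is not None:
            ancestor.child_dirty = True
            ancestor = ancestor.parent

    def _mark_subtree_dirty(self):
        self.dirty = True
        for child in self.children:
            child._mark_subtree_dirty()

    def layout(self, log):
        if not (self.dirty or self.child_dirty):
            return  # clean subtree: nothing to do
        if self.dirty:
            log.append(self.name)  # stand-in for recomputing geometry
        for child in self.children:
            child.layout(log)
        self.dirty = False
        self.child_dirty = False
```

After a first full layout of a tree `root -> (a -> a1, b)`, calling `a1.mark_dirty()` and then `root.layout(log)` records only `"a1"`: the walk passes through `root` and `a` because of their "children are dirty" flag, and `b` is skipped entirely.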
  • Global and incremental layout
    • layout can be triggered globally, on the entire render tree, by:
      • a screen resize
      • a global style change that affects all renderers, e.g. a font size change
    • incremental layout is triggered when renderers are dirty, e.g. when new renderers are appended to the render tree after extra content arrives from the network.
  • Asynchronous and synchronous layout
    • incremental layout is done asynchronously
    • global layout is usually triggered synchronously
  • Optimizations
    • when layout is triggered by a resize or by a change in a renderer's position (but not its size), renderer sizes are taken from a cache instead of being recalculated
  • Layout process
    • the parent renderer determines its own width.
    • the parent goes over its children and:
      • places the child renderer (sets its x and y)
      • calls the child's layout if needed, which calculates the child's height
    • the parent uses the children's accumulated heights, plus the heights of margins and padding, to set its own height; this is in turn used by the parent renderer's parent.
    • the parent sets its dirty bit to false.
    • width calculation
      • a renderer's width is calculated from the container block's width, the renderer's style "width" property, and the margins and borders.
    • line breaking
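
The layout steps above can be sketched for the simplest case of vertically stacked block boxes. This is a minimal model under several assumptions: the class `Box` and its fields are hypothetical, margins and padding are uniform on all sides, and borders, explicit widths, margin collapsing, line breaking and the "if needed" check are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    margin: float = 0
    padding: float = 0
    content_height: float = 0  # only used by boxes without children
    children: list = field(default_factory=list)
    x: float = 0
    y: float = 0
    width: float = 0
    height: float = 0
    dirty: bool = True

    def layout(self, container_width):
        # 1. The renderer determines its own width from its container.
        self.width = container_width - 2 * self.margin
        inner_width = self.width - 2 * self.padding

        # 2. Go over the children: place each one, then lay it out.
        cursor_y = self.y + self.padding
        for child in self.children:
            child.x = self.x + self.padding + child.margin
            child.y = cursor_y + child.margin
            child.layout(inner_width)
            cursor_y = child.y + child.height + child.margin

        # 3. Use the accumulated child heights to set the own height.
        if self.children:
            self.height = cursor_y - self.y + self.padding
        else:
            self.height = self.content_height + 2 * self.padding

        # 4. Layout is done for this renderer.
        self.dirty = False
```

For a root box with padding 10 that contains one child of content height 20 with margin 5 and one child of content height 30, `root.layout(200)` places the children at (15, 15) and (10, 40), gives them widths 170 and 180, and sets the root height to 80.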
  • Painting
    • the render tree is traversed and each renderer's paint() method is called to display its content on the screen; painting uses the UI infrastructure component.
    • like layout, painting can be global or incremental
    • Painting order
      • background color
      • background image
      • border
      • children
      • outline
  • The rendering engine is single threaded.
  • Almost everything except network operations happens in a single thread. In Firefox and Safari this is the main browser thread; in Chrome it is the tab process main thread.
  • Network operations can be performed by several parallel threads; the number of parallel connections is limited (usually 2-6).
  • The browser main thread is an event loop: an infinite loop that keeps the process alive, waits for events and processes them.
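
An event loop of this kind can be sketched with a thread-safe queue. This is only an illustration of the pattern: the function `run_event_loop`, the `(name, handler)` event tuples and the `None` shutdown sentinel are assumptions made for the example, and a real browser loop does not normally terminate.

```python
import queue


def run_event_loop(events):
    """Process (name, handler) events until a None sentinel arrives."""
    results = []
    while True:
        event = events.get()  # blocks until an event is available
        if event is None:     # sentinel used only to end this example
            break
        name, handler = event
        results.append(handler(name))  # events are handled one at a time
    return results
```

Other threads, such as network threads, can hand work to the loop by calling `events.put(...)`; because a single thread takes events off the queue, the handlers never run concurrently with each other.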
  • CSS2 visual model
    • canvas: the space where the formatting structure is rendered.
    • box model
    • positioning schemes: normal, float, absolute
    • box types: block, inline
    • layered representation: z-index