How a Browser Works: A Beginner-Friendly Guide

You probably use a browser every day to access web content, but have you ever wondered how it works internally?
You type a website's domain or URL in the address bar
You hit enter
and boom, a webpage is displayed
In less than a second, the browser has performed dozens of tasks to display that single webpage.
Think of a browser as a factory: you say what you need, and many workers and machines cooperate to build the product for you.
A browser works the same way.
What happens when you type a URL and press Enter?
When Enter is pressed:
Cache check:
The browser checks whether it has visited this site before. The purpose of this check is to find the IP address of the server that hosts the site.
DNS lookup:
If the IP address is not in the cache, the browser asks DNS (the Domain Name System) for it.
TCP connection:
Once DNS returns the IP address, the browser establishes a connection with the server.
The server responds with files.
The browser processes those files and displays the webpage content they describe.
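The cache check and DNS lookup steps above can be sketched in a few lines of Python. This is only an illustration, not how a real browser is implemented: the names `resolve_host`, `system_resolver`, and `dns_cache` are made up for this example, and the cache is a plain dictionary with no expiry.

```python
import socket
from urllib.parse import urlsplit


def system_resolver(host: str) -> str:
    """Ask the operating system's resolver (this performs a real DNS lookup)."""
    return socket.gethostbyname(host)


def resolve_host(url: str, dns_cache: dict, resolver=system_resolver):
    """Return (host, ip, source), where source is 'cache' or 'dns'.

    dns_cache is a hypothetical stand-in for the browser's own DNS cache.
    """
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host found in URL: {url!r}")

    # Step 1: cache check. Skip the lookup if we already know the IP.
    if host in dns_cache:
        return host, dns_cache[host], "cache"

    # Step 2: DNS lookup. Ask the resolver and remember the answer.
    ip = resolver(host)
    dns_cache[host] = ip
    return host, ip, "dns"
```

Calling `resolve_host` twice for URLs on the same domain with the same `dns_cache` should hit the resolver only once; the second call is answered from the cache. The `resolver` parameter exists so the lookup can be replaced with a fake one when experimenting offline.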

In this blog, we focus mainly on the work the browser does between receiving data from the server and displaying the page to the viewer.
Main parts of a browser (high-level overview)
A browser is not a single program but a team of smaller programs working together. Let’s start with an overview of its components.
User Interface:
This is the part of the browser that you see and interact with directly. It includes everything except the area where the website is displayed. Examples:
Address bar
Back/forward buttons
Bookmark menu
Settings
Tabs

Browser Engine
The browser engine is often called the brain or core of the browser. It acts as a middleman between the User Interface and the Rendering Engine.
It receives commands from the User Interface and instructs the Rendering Engine to carry them out.
Rendering Engine
It is responsible for displaying the requested content on the screen. It parses the HTML (HyperText Markup Language) and CSS (Cascading Style Sheets) and combines them to build the render tree. We will discuss this in detail in later sections.
Networking Layer
This component handles all network communication, including HTTP requests. It manages DNS lookups (translating domain names to IP addresses), handles TCP/IP connections, and fetches resources (HTML, images, CSS) from servers.
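To make "HTTP request" concrete, here is a minimal sketch of the text a networking layer could send over the connection to ask for a page. The function name `build_get_request` and the exact set of headers are assumptions for this example; real browsers send many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Build a bare-bones HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)

    # The request line carries only the path and query, not the domain.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    # The domain travels in the Host header instead.
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"

    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html",
        "Connection: close",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

For `http://example.com/docs?page=2`, the request line becomes `GET /docs?page=2 HTTP/1.1` and the domain appears in the `Host` header.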
JavaScript Interpreter
It is an engine that parses and executes JavaScript code. It interprets, compiles, and runs JavaScript to add interactivity to web pages.
Disk API
As the name suggests, it stores on disk the data that a browser or web page needs. Browser data storage has several parts, such as cookies, localStorage, sessionStorage, IndexedDB, and file system access.
Browser Engine vs Rendering Engine
| Browser Engine | Rendering Engine |
| Handles actions between the UI and the rendering engine. | Parses the requested HTML and CSS content and displays it on the screen. |
| Coordinates tasks such as networking, security, and the overall rendering process. | Builds the DOM, CSSOM, and frame tree, and handles layout (reflow) and painting. |
| Examples: Chromium, Servo | Examples: Blink (Chrome), WebKit (Safari) |
How does a browser download HTML, CSS, and JS files?
The process begins when a user enters a URL in the address bar and presses Enter.
The browser performs a DNS lookup to find the server's IP address.
Once the browser has the IP address, its Networking Layer opens a connection to the server for the data transfer.
Over that connection, the browser sends an HTTP GET request to the server.
As the HTML data starts arriving, the browser engine instructs the rendering engine to begin parsing it and building the DOM (Document Object Model).
As it parses the HTML, it finds tags such as <link> or <script> that reference external CSS and JS files. The browser then sends further HTTP requests to the servers hosting those files.
Once a CSS file is downloaded, the rendering engine parses it and generates the CSSOM (CSS Object Model). Until the CSSOM is ready, the browser holds back rendering, so that it does not draw the page with the wrong styles and then have to repaint it.
Once a JS file is downloaded, the JavaScript Interpreter compiles and executes it. By default, the browser pauses HTML parsing while a script is downloaded and run, but you can change this behaviour with the async or defer attribute on the script tag. Both attributes let the download proceed in parallel with parsing: use defer if the script should run only after the HTML is fully parsed, otherwise use async to run it as soon as it arrives.
When all the necessary files have been downloaded, the rendering engine combines the DOM and CSSOM and, after further processing such as frame construction and reflow, paints and displays the web content in the browser window.
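The discovery step, where the parser spots external CSS and JS files, can be sketched with Python's built-in `html.parser` module. This is a simplified illustration: `ResourceScanner` and `find_resources` are names invented for this example, and the loading labels are shorthand for the default, async, and defer behaviours of scripts.

```python
from html.parser import HTMLParser


class ResourceScanner(HTMLParser):
    """Collect external stylesheets and scripts referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []  # list of (kind, url, loading mode)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # boolean attributes such as defer map to None
        if tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            # Stylesheets hold back rendering until the CSSOM is built.
            self.resources.append(("css", attrs["href"], "render-blocking"))
        elif tag == "script" and "src" in attrs:
            if "async" in attrs:
                mode = "async"            # runs as soon as it is downloaded
            elif "defer" in attrs:
                mode = "defer"            # runs after HTML parsing finishes
            else:
                mode = "parser-blocking"  # default: parsing waits for it
            self.resources.append(("js", attrs["src"], mode))


def find_resources(html: str) -> list:
    scanner = ResourceScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.resources
```

Feeding it a page with one `<link rel="stylesheet">` and a few `<script src>` tags returns one tuple per file, in the order the parser meets them, which is also the order in which a browser would discover them.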
Structural flow of the Rendering Engine

Parsing
HTML parsing (HTML → DOM creation)
When the browser receives an HTML file, it first works out the structure instead of jumping straight to displaying the content on screen.
Let’s first understand what parsing is.
Parsing means reading the code, understanding its structure, and breaking it into pieces.
As a real-world analogy, it is like reading an English sentence from a book and picking out the nouns, pronouns, verbs, and adjectives. An HTML document looks like this:
<body>
<div>
<h1>This is a heading</h1>
<p>This is a text</p>
</div>
<strong>Hey, It's me</strong>
</body>
Here, body contains div and strong, and div in turn contains the h1 and p tags.
After parsing, the browser creates a tree-like structure called the Document Object Model (DOM).
        body
       /    \
     div    strong
    /   \
  h1     p
The browser never draws this tree on screen, but internally it keeps a structure very close to the one above.
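As a rough sketch of how parsing turns tags into a tree, the snippet below builds a tiny DOM-like structure with Python's built-in `html.parser`. The `Node`, `DomBuilder`, `build_dom`, and `outline` names are invented for this example, and the builder ignores attributes and void elements such as `<img>`, which a real HTML parser handles.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


class DomBuilder(HTMLParser):
    """Keep a stack of open elements; the top of the stack is the current parent."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)  # attach to the current parent
        self.stack.append(node)               # it becomes the new parent

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()                  # close the element

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].text += data.strip()


def build_dom(html: str) -> Node:
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node: Node, depth: int = 0) -> list:
    """Return the tree as indented lines, one per element."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Running `outline(build_dom(html))` on the HTML snippet from this section lists body first, then div with h1 and p nested under it, then strong, matching the tree drawn above.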
CSS parsing (CSS → CSSOM creation)
While the HTML builds the structure, the CSS supplies the styling rules.
Example CSS:
h1 { color: blue; }
p { font-size: 16px; }
The browser parses CSS by breaking it into selectors and declarations (property and value pairs).
The result is another tree-like structure called the CSSOM (CSS Object Model); instead of elements, it stores styling instructions.
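Here is a toy version of that splitting step. It is only a sketch: `parse_css` is a name invented for this example, it stores the rules in a flat dictionary rather than a real tree, and it does not handle comments, nested rules, or at-rules such as `@media`.

```python
def parse_css(css: str) -> dict:
    """Map each selector to a dictionary of its property/value pairs."""
    rules = {}
    for block in css.split("}"):
        if "{" not in block:
            continue  # leftover whitespace after the last rule
        selector, body = block.split("{", 1)
        selector = selector.strip()
        if not selector:
            continue
        declarations = rules.setdefault(selector, {})
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
    return rules
```

Given the example CSS above, the result maps `h1` to `color: blue` and `p` to `font-size: 16px`.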
DOM + CSSOM → Render Tree
Now the browser's rendering engine has:
Structure (DOM)
Style rules (CSSOM)
It combines them into a render tree, which contains only the visible elements and their final styles.
So if an element has display: none, it does not appear in the render tree.
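The combination step can be sketched as a walk over the DOM that looks up each element's styles and drops hidden elements. The dictionary shapes and the `build_render_tree` name are assumptions for this example; it matches styles by tag name only and ignores inheritance and the cascade.

```python
def build_render_tree(node: dict, cssom: dict):
    """Return a styled copy of the DOM without display: none elements.

    node looks like {"tag": "div", "children": [...]}.
    cssom looks like {"h1": {"color": "blue"}}.
    """
    style = dict(cssom.get(node["tag"], {}))

    # A hidden element is left out together with everything inside it.
    if style.get("display") == "none":
        return None

    children = []
    for child in node["children"]:
        rendered = build_render_tree(child, cssom)
        if rendered is not None:
            children.append(rendered)

    return {"tag": node["tag"], "style": style, "children": children}
```

If the CSSOM contains `strong { display: none; }`, the strong element from the earlier HTML example is missing from the result, while div, h1, and p remain with their styles attached.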
Frame Tree / Reflow
Now the browser figures out where everything goes.
It calculates:
Width
Height
Position
Margins
Padding
Screen size responsiveness
This step is called Layout or Reflow.
Layout runs again when the window is resized, a font finishes loading, or the DOM changes.
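A very reduced layout pass might look like the sketch below: block boxes are stacked top to bottom, and each box's width is derived from the viewport width. The `Box` class and `layout` function are invented for this example, and the model is deliberately naive (fixed heights, one margin value per box, no margin collapsing, no padding).

```python
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    height: int      # assumed to be known already, in pixels
    margin: int = 0  # same margin on all four sides
    x: int = 0
    y: int = 0
    width: int = 0


def layout(boxes: list, viewport_width: int) -> int:
    """Position boxes vertically and return the total page height."""
    cursor_y = 0
    for box in boxes:
        box.x = box.margin
        box.width = max(0, viewport_width - 2 * box.margin)
        cursor_y += box.margin           # space above the box
        box.y = cursor_y
        cursor_y += box.height + box.margin  # the box itself plus space below
    return cursor_y
```

Calling `layout` again with a smaller `viewport_width` recomputes every width, which is the essence of a reflow after a window resize.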
Painting
After layout, the browser starts adding visuals:
Colors
Fonts
Borders
Shadows
Images
At this stage, the elements are fully drawn but still live on separate layers.
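One way to picture painting is as turning each laid-out box into an ordered list of drawing commands. The sketch below assumes boxes are dictionaries that already carry their position and size; the `paint` function and the command names (`fill_rect`, `stroke_rect`, `draw_text`) are made up for this example and do not correspond to a real browser API.

```python
def paint(boxes: list) -> list:
    """Turn laid-out boxes into an ordered list of drawing commands."""
    commands = []
    for box in boxes:
        rect = (box["x"], box["y"], box["width"], box["height"])
        # Order matters: background first, then border, then text on top.
        if "background" in box:
            commands.append(("fill_rect", rect, box["background"]))
        if "border" in box:
            commands.append(("stroke_rect", rect, box["border"]))
        if "text" in box:
            color = box.get("color", "black")
            commands.append(("draw_text", (box["x"], box["y"]), box["text"], color))
    return commands
```

A heading box with a background and blue text produces a `fill_rect` command followed by a `draw_text` command, so the text ends up drawn over its background.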
Display
In this step, the user actually sees the content on the page:
Layers are merged
Pixels are sent to the screen
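Merging layers can be illustrated with small character grids standing in for pixel buffers. In this sketch, each layer is a list of rows, `None` means "transparent here", and later layers sit on top of earlier ones. The `composite` function is invented for this example and assumes all layers have the same size.

```python
def composite(layers: list) -> list:
    """Merge layers bottom to top into one frame, returned as strings."""
    height = len(layers[0])
    width = len(layers[0][0])
    frame = [[" "] * width for _ in range(height)]

    for layer in layers:  # bottom layer first
        for y, row in enumerate(layer):
            for x, pixel in enumerate(row):
                if pixel is not None:  # None lets lower layers show through
                    frame[y][x] = pixel

    return ["".join(row) for row in frame]
```

With a background layer full of `.` and a top layer that has a single `#`, the final frame shows the `#` in place and the background everywhere else.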
Conclusion
HTML → DOM (Structure)
CSS → CSSOM (Style Rules)
DOM + CSSOM → Render Tree (Visible Elements)
Render Tree → Layout (Measure & Position)
Layout → Paint (Add Colors & Fonts)
Paint → Display (Pixels on Screen)




