What happens when you type a URL in your browser and press ENTER?

Deyber Castañeda
Published in codeburst
11 min read · Jan 5, 2021

No matter what your profession or occupation is, you have probably used your browser to search for something, whether a concept, how to fix something, or simply to check your favorite social network.
But what is happening under the hood?
What is our computer really doing when we type the URL of the site we want to reach?

The answers to those questions are the aim of this article.

Let’s go through the process step by step.
Suppose that you want to access holbertonschool.com from your browser:

Get input from Keyboard

When you type holbertonschool.com in the browser, you type it on your computer’s keyboard, and each keystroke emits an event: it signals the operating system (OS) that a state has changed, and the OS records this change and responds to it. It’s like the moment you touch a hot pan: your brain processes the signal and prepares your body to respond (move your hand away).

Event Handling

Flags and keycodes make it possible to detect when a key is pressed, identify which key it was, and generate a response accordingly. Different keys on your keyboard evoke different responses. When a key is pressed, the keyboard hardware raises an interrupt, signaling the CPU (central processing unit, a.k.a. the brain of the computer) that it needs immediate attention. The CPU responds by suspending its current activity, saving its state and executing the interrupt handler function provided by the OS kernel. The OS then passes the key event to the application that has focus. If you had typed “holbertonschool.com” in a text editor, the OS would have let the text editor handle it. But since you typed in the browser, it lets the browser application handle it.

What is a browser?

Let’s say you want to order food, and you choose to use Rappi (a popular food delivery company in Colombia). While your food is being prepared, the Rappi driver goes to the restaurant, picks up your food and delivers it to you. Voilà! In minutes, you have your food.

But who is actually serving you — the restaurant or Rappi? In this case, Rappi is the middleman, letting the restaurant serve you.

Similarly, the browser is the medium that allows you to make a request and lets a server serve you. It’s software installed and running on your computer that lets you browse the web. It takes your input, creates and sends a request, gets the response and presents it to you.

But wait: how does a Rappi driver know which restaurant to go to and how to find it? (Of course, Google Maps.) Now, how does your browser know which server to send the request to? Yes, you guessed it right, it needs to find the server’s address. So, it queries the DNS (Domain Name System) to find the IP address.

DNS lookup

The DNS is the Internet’s version of Google Maps: it tells you the address of your destination. Your computer or your router knows the address of a DNS server. When you type the URL in a browser for the first time, a request goes to the DNS server, which responds with the IP address of the web server hosting, for example, holbertonschool.com. This value is then usually cached for a limited time, so your browser doesn’t have to do this lookup every time.

DNS (Domain Name System) is a distributed database that maps the domain name of a website (the host part of its URL) to the IP address or addresses it points to. An IP address belongs to a computer that hosts the server of the website we are requesting, although one site may be served from several addresses and one address may host several sites. For example, www.google.com might resolve to an address such as 209.85.227.104 (the exact address varies over time and by location). In principle, you could then reach the site by typing http://209.85.227.104 in your browser. DNS works like a phone book: it is a list of domain names and their IP addresses, just as a phone book is a list of names and their corresponding phone numbers.

The primary purpose of DNS is human-friendly navigation. You could access a website by typing the correct IP address for it in your browser, but imagine having to remember a different set of numbers for every site you regularly visit. It is much easier to remember the name of the website and let DNS do the work for us by mapping it to the correct IP address.

To find the DNS record, the browser checks four caches.

● First, it checks the browser cache. The browser maintains a repository of DNS records for a fixed duration for websites you have previously visited. So, it is the first place to run a DNS query.

● Second, the browser checks the OS cache. If it is not in the browser cache, the browser will make a system call to your underlying computer OS to fetch the record since the OS also maintains a cache of DNS records.

● Third, it checks the router cache. If the record is not on your computer, the browser will communicate with the router, which maintains its own cache of DNS records.

● Fourth, it checks the ISP cache. If all previous steps fail, the browser will move on to the ISP. Your ISP maintains its own DNS server, which includes a cache of DNS records, and the browser checks it as a last hope of finding the record for your requested URL.

You may wonder why there are so many caches maintained at so many levels. Although our information being cached somewhere doesn’t make us feel very comfortable when it comes to privacy, caches are essential for regulating network traffic and improving data transfer times.
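The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve`, the use of plain dictionaries as caches, and the address 203.0.113.10 (taken from a range reserved for documentation) are all hypothetical, and real caches also expire entries after a time-to-live, which is left out here.

```python
from typing import Callable, Dict, List


def resolve(hostname: str,
            caches: List[Dict[str, str]],
            upstream_lookup: Callable[[str], str]) -> str:
    """Check each cache in order (browser, OS, router, ISP).

    On a hit, copy the answer into the closer caches that missed.
    On a miss everywhere, ask the upstream resolver and fill every cache.
    """
    for index, cache in enumerate(caches):
        if hostname in cache:
            ip_address = cache[hostname]
            for closer_cache in caches[:index]:
                closer_cache[hostname] = ip_address
            return ip_address

    ip_address = upstream_lookup(hostname)
    for cache in caches:
        cache[hostname] = ip_address
    return ip_address


if __name__ == "__main__":
    # Hypothetical data: only the ISP cache knows the (made-up) address.
    browser, operating_system, router = {}, {}, {}
    isp = {"holbertonschool.com": "203.0.113.10"}

    def never_called(name: str) -> str:
        raise LookupError(f"no record for {name}")

    print(resolve("holbertonschool.com",
                  [browser, operating_system, router, isp],
                  never_called))
    print("browser cache now holds:", browser)
```

To try it against real DNS, a function such as `socket.gethostbyname` from the standard library could be passed as `upstream_lookup`.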

OSI model

Now that your browser knows the IP address of the server, it needs to find a way to pass this request all the way to the server. When you place the order, it’s not just you interacting with Rappi. There’s another end being managed — with the need to check that the restaurant is ready to accept the order, handle billing and payments, find the most reliable driver, and so on. Similarly, there’s a lot of stuff that needs to be managed for smooth communication between browser and server.

There’s something called the OSI (Open Systems Interconnection) model that standardizes communication between different computing machines. It describes the flow of information from one computer to another. It defines seven layers, and the interplay of these layers brings, for example, holbertonschool.com from the server to your machine. Both ends (client and server) follow these layers, but the order in which the layers kick in differs. When your browser sends the request, communication starts at the application layer and goes down to the physical layer, whereas on the server, receiving the request starts at the physical layer and goes up. Likewise, when the server responds to your browser’s request, the response goes from the application layer down to the physical layer, and when your computer receives it, it travels from the physical layer all the way back up to the application layer.

7. Application layer: consists of the protocols that sit closest to the end user. A protocol defines how different applications across machines communicate with each other. If you are requesting a web page, HTTP (Hypertext Transfer Protocol) will handle it, and if you are sending an email, SMTP (Simple Mail Transfer Protocol) will handle it. So in the case of holbertonschool.com, your browser generates an HTTP request. Don’t mistake the browser itself for part of the application layer: the application layer comes into play when your browser creates an HTTP request, and it is this request that belongs to the application layer.

6. Presentation layer: Depending on the content (image, video, text, GIF, etc.), this layer translates the data into a format the application can use, handling tasks such as character encoding and compression. In the case of holbertonschool.com, when your machine receives the response, this is where the data is decoded so that the browser can render it as an HTML page.

5. Session layer: responsible for establishing, maintaining, and terminating the session between devices. For example, when you are in a video chat, the time from when you enter the chat to when you leave it is one complete session, provided there were no interruptions during that interval. However, in the case of holbertonschool.com, HTTP relies on a lower-layer protocol (TCP) rather than on session layer protocols.

4. Transport layer: takes care of the reliable delivery of data between the two ends of the connection. Here, the transportation, delivery and reassembly of data take place. Your request for holbertonschool.com is small, so the role of this layer is more evident when you receive the response. The data your machine receives comes divided into segments, each carrying a sequence number. This layer makes sure that you have received all of them and reassembles them in order. As mentioned above, HTTP uses the transport layer protocol TCP (Transmission Control Protocol), instead of session layer protocols, to establish and maintain a connection from your machine to the server and ensure reliable delivery. For security, HTTPS adds TLS (Transport Layer Security, the successor to SSL, Secure Sockets Layer) on top of TCP, which encrypts all data passed between the browser and the web server, keeping communications private and protecting their integrity. In HTTP requests, it’s the job of TCP to ensure reliable, in-order delivery. In a similar way, Rappi has to make sure that all the requests are served well and are distributed across drivers.

3. Network layer: This addresses and routes the data. In the case of holbertonschool.com, data is addressed using IP (Internet Protocol), and routers along the way use the destination IP address to choose the path between your machine and the web server.

2. Data link layer: In this layer, data is packaged into frames for transmission over a single network link. So when the server sends holbertonschool.com, the page doesn’t arrive all at once; the upper layers split it into packets, and the data link layer encapsulates each packet in a frame and transmits it through the physical layer. The packets are not necessarily delivered directly to your machine. They may travel from network to network, passing through many machines before reaching you. At each of these hops, the IP address of the next hop is translated to a hardware (MAC) address at the data link layer.

1. Physical layer: The physical layer deals with the actual connectivity between your machine and the server. The hardware and the signaling and encoding mechanisms required to form the actual connection are defined at this layer, and the data received from the server is in the form of raw bits. Try the ifconfig command (or ip addr on newer Linux systems) in your terminal to check the network interface configuration of your system.
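To make the "down the layers on one side, up the layers on the other" idea more concrete, here is a toy sketch of encapsulation. The header labels (`TCP|`, `IP|`, `ETH|`) are made-up placeholders, not real header formats, and only three of the seven layers are modeled; the function names are hypothetical too.

```python
# Toy headers, outermost last when sending. These are placeholder labels
# chosen for illustration, not real protocol header layouts.
HEADERS = (b"TCP|", b"IP|", b"ETH|")


def encapsulate(http_request: bytes) -> bytes:
    """Sender side: each lower layer wraps what it receives from above."""
    data = http_request
    for header in HEADERS:          # transport, then network, then data link
        data = header + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips its own header and passes data up."""
    data = frame
    for header in reversed(HEADERS):  # data link, then network, then transport
        if not data.startswith(header):
            raise ValueError(f"expected header {header!r}")
        data = data[len(header):]
    return data


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\nHost: holbertonschool.com\r\n\r\n"
    frame = encapsulate(request)
    print(frame)
    print(decapsulate(frame) == request)
```

The point of the sketch is only the ordering: the last header added by the sender is the first one removed by the receiver.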

So far, I have mainly talked from the client perspective. It’s time to understand what happens at the server end.

Once the browser receives the correct IP address, it will build a connection with the server at that IP address to transfer information. Browsers use internet protocols to build such connections. There are several different internet protocols that can be used, but TCP is the most common protocol used for many types of HTTP requests.

To transfer data packets between your computer (the client) and the server, a TCP connection has to be established. This connection is established using a process called the TCP three-way handshake. This is a three-step process where the client and the server exchange SYN (synchronize) and ACK (acknowledge) messages to establish a connection.

1. The client machine sends a SYN packet to the server over the internet, asking if it is open for new connections.

2. If the server is listening on the requested port and can accept new connections, it’ll respond with an ACKnowledgment of the SYN packet using a SYN/ACK packet.

3. The client will receive the SYN/ACK packet from the server and will acknowledge it by sending an ACK packet.

Then a TCP connection is established for data transmission!
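The three steps above can be modeled as a small simulation. This is a sketch, not real networking code: the `Segment` class and `three_way_handshake` function are hypothetical names, and the initial sequence numbers are arbitrary values chosen for the example. The one detail added beyond the text is that each side acknowledges the other’s SYN with that sequence number plus one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    flags: tuple                 # e.g. ("SYN",) or ("SYN", "ACK")
    seq: int                     # sender's sequence number
    ack: Optional[int] = None    # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    # Step 1: client -> server, SYN with the client's initial sequence number.
    syn = Segment(("SYN",), seq=client_isn)
    # Step 2: server -> client, SYN/ACK acknowledging the client's SYN.
    syn_ack = Segment(("SYN", "ACK"), seq=server_isn, ack=syn.seq + 1)
    # Step 3: client -> server, ACK acknowledging the server's SYN.
    ack = Segment(("ACK",), seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    # Arbitrary example sequence numbers.
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(segment)
```

In real programs you never build these segments yourself; the operating system performs the handshake when you call something like `socket.create_connection((host, port))`.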

Web server

This might mean a physical machine or software. In the case of holbertonschool.com, both work together to make sure that the website is accessible (up and running). At the hardware level, a web server is a machine (or collection of machines) that stores a website’s component files (e.g. HTML documents, images, videos, stylesheets, and JavaScript files) and delivers them to you. This is called hosting. At the software level, a web server, also known as an HTTP server, controls how users access these hosted files. It processes and answers incoming requests. When you request holbertonschool.com, the HTTP server checks whether the requested URL matches any existing file. If it does, the server sends the file’s content back to the browser; otherwise, it sends a “404 Not Found” error.
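The "look up the file, or answer 404" behavior can be sketched as follows. This is a deliberately simplified model: the hosted files live in a dictionary instead of on disk, and the names `SITE_FILES` and `handle_request`, as well as the file contents, are invented for the example.

```python
from typing import Dict, Tuple

# Hypothetical hosted files, keyed by URL path.
SITE_FILES: Dict[str, bytes] = {
    "/index.html": b"<h1>Welcome</h1>",
    "/style.css": b"h1 { color: navy; }",
}


def handle_request(path: str, files: Dict[str, bytes]) -> Tuple[int, bytes]:
    """Return (status code, body) for a requested path."""
    if path == "/":
        path = "/index.html"          # common convention: serve the index page
    body = files.get(path)
    if body is None:
        return 404, b"404 Not Found"  # nothing hosted under this path
    return 200, body


if __name__ == "__main__":
    print(handle_request("/", SITE_FILES))
    print(handle_request("/missing.html", SITE_FILES))
```

A production HTTP server does much more (access control, content types, caching headers), but the core decision is the same lookup.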

The server response contains the web page you requested as well as the status code, compression type (Content-Encoding), how to cache the page (Cache-Control), any cookies to set, privacy information, etc.

The first line of an HTTP response shows a status code. This is quite important as it tells us the status of the response. There are five classes of status, identified by the first digit of a numerical code.

● 1xx indicates an informational message only

● 2xx indicates success of some kind

● 3xx redirects the client to another URL

● 4xx indicates an error on the client’s part

● 5xx indicates an error on the server’s part

So, if you encounter an error, you can take a look at the HTTP response to check what type of status code you have received.
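As a small illustration of the five classes listed above, the sketch below reads the status line of a raw response and reports which class the code falls into. The response text in the example is hypothetical, and `classify_response` is a name made up for this sketch.

```python
from typing import Tuple

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def classify_response(raw_response: str) -> Tuple[int, str]:
    """Extract the status code from the first line and name its class."""
    status_line = raw_response.split("\r\n", 1)[0]
    # A status line looks like: "HTTP/1.1 404 Not Found"
    parts = status_line.split(" ", 2)
    code = int(parts[1])
    return code, STATUS_CLASSES.get(code // 100, "unknown")


if __name__ == "__main__":
    # Hypothetical response, shortened for the example.
    response = (
        "HTTP/1.1 404 Not Found\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<h1>Not Found</h1>"
    )
    print(classify_response(response))
```

Python’s standard library also ships `http.HTTPStatus`, which maps each code to its official name if you need more than the class.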

Load balancing

Popular websites have to serve several thousands of concurrent requests and return correct text, image and video responses to them. To serve a large number of requests, the content is usually distributed across multiple servers. A load balancer sits in front of these servers and acts as a traffic cop to direct traffic to the right server. It makes sure that no server is overloaded, and provides high availability and reliability by ensuring all requests are served. If a server goes down, it starts redirecting the requests to different servers that are online.
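One simple way a load balancer can spread requests is round robin: hand each new request to the next server in the list, skipping servers that are marked as down. The sketch below assumes that strategy; real load balancers offer several others, and the class name, method names, and server names here are all hypothetical.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Rotate through servers, skipping any that are marked as down."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.servers: List[str] = list(servers)
        self.healthy: Set[str] = set(self.servers)
        self._next = 0

    def mark_down(self, server: str) -> None:
        self.healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self.servers:
            self.healthy.add(server)

    def pick(self) -> str:
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("web-2")      # simulate a server going offline
    print([balancer.pick() for _ in range(3)])
```

In practice the `mark_down` and `mark_up` calls would be driven by periodic health checks rather than called by hand.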

Firewall

Web servers use a firewall to protect the system against breaches and attacks. For example, if a malicious source starts flooding the web server with a large number of concurrent requests, the firewall will detect the problem and block requests from that IP address to keep them from reaching the web server.
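The flooding example can be sketched as a per-address rate limit: count how many requests each IP address sent within a recent time window and block the address once it exceeds a threshold. This is one possible rule, not a description of how any particular firewall works; the class name, the threshold values, and the IP address (from a range reserved for documentation) are assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set


class RateLimitFirewall:
    """Block an IP address once it exceeds max_requests per time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.blocked: Set[str] = set()
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip_address: str, now: float) -> bool:
        """Return True if the request may pass, False if it is dropped."""
        if ip_address in self.blocked:
            return False
        history = self._history[ip_address]
        # Forget requests that fell out of the time window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        history.append(now)
        if len(history) > self.max_requests:
            self.blocked.add(ip_address)
            return False
        return True


if __name__ == "__main__":
    firewall = RateLimitFirewall(max_requests=3, window_seconds=1.0)
    # Hypothetical client sending five requests within half a second.
    for timestamp in (0.0, 0.1, 0.2, 0.3, 0.4):
        print(timestamp, firewall.allow("198.51.100.7", now=timestamp))
```

The timestamp is passed in explicitly to keep the sketch easy to test; a real deployment would read the clock itself and would usually unblock addresses after some time.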

Application Server

An application server is a server specifically designed to run applications. The “server” includes both the hardware and software that provide an environment for programs to run.

A web server is designed — and often optimized — to serve webpages. Therefore, it may not have the resources to run demanding web applications. An application server provides the processing power and memory to run these applications in real-time. It also provides the environment to run specific applications. For example, a cloud service may need to process data on a Windows machine. A Linux-based server may provide the web interface for the cloud service, but it cannot run Windows applications. Therefore, it may send input data to a Windows-based application server. The application server can process the data, then return the result to the web server, which can output the result in a web browser.

Database

A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS). Together, the data and the DBMS, along with the applications that are associated with them, are referred to as a database system, often shortened to just database.
In the case of the Rappi example, the database is like the storage warehouse of the restaurant that is serving you.

Conclusion

I hope this overview has given you more clarity about what happens when you type something in your browser, and that you can now see it is not a magic box.
