What happens when you type https://www.holbertonschool.com in your browser and press Enter
Browsing the web feels simple and streamlined. We type the domain name of the website we want to visit, hit Enter, and that’s it. What goes on under the hood is far more involved. In this post I will try to explain what happens from the moment you hit Enter to the moment your browser displays the website on screen. Before I can do that, though, I need to explain a few concepts.
The web is built so that most interactions on it involve a client and a server. Clients request data, and servers store and serve that data. Your browser (Chrome, Firefox, Safari) is a client. “Server” can refer to the physical hardware hosting the data, to a virtual machine that emulates that hardware, or to the software that actually does the serving. Things get more complicated for bigger websites, since they can be hosted on several servers whose interactions can be rather complex. With that in mind, let’s look at what happens when you type https://www.holbertonschool.com in your browser and press Enter.
The first thing your browser does is translate the domain name into an IP address. Machines on the internet aren’t identified by names but by numbers, and since numbers are harder for us to remember, we created a kind of directory in which we can look up a name and get a number back. That directory is the DNS (Domain Name System). So our browser asks a DNS server for the IP address corresponding to the domain name we typed, here www.holbertonschool.com.
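To make the lookup concrete, here is a minimal sketch in Python of roughly what happens at this step. It is not how any particular browser is implemented: the function names hostname_from_url and resolve are my own, and the code simply hands the question to the operating system’s resolver through the standard socket module.

```python
import socket
from urllib.parse import urlsplit


def hostname_from_url(url: str) -> str:
    """Extract the host part of a URL, which is the name sent to DNS."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no hostname in {url!r}")
    return host


def resolve(hostname: str, port: int = 443) -> list[str]:
    """Ask the operating system's resolver for the addresses of a hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address in text form.
    return sorted({info[4][0] for info in infos})
```

Calling resolve(hostname_from_url("https://www.holbertonschool.com")) on a machine with internet access returns whatever addresses are published for the site at that moment, so the result can change over time and from place to place.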
The next thing our browser’s request meets on its way to the server is the firewall. Most servers have some sort of protection against attacks, and in most cases that protection includes a firewall. It is software (or a dedicated device) that applies a set of rules to incoming connections and decides, for each one, whether it is safe to let it reach the server.
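The idea of “rules of connection” can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the rule set, the addresses (taken from ranges reserved for documentation) and the first-match-wins policy are all hypothetical, and real firewalls work at a much lower level.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    network: str  # source addresses the rule applies to, in CIDR notation
    port: int     # destination port the rule applies to
    allow: bool   # True to accept the connection, False to drop it


# Hypothetical rule set, checked top to bottom: the first match wins.
RULES = [
    Rule("203.0.113.0/24", 443, allow=False),  # block one misbehaving range
    Rule("0.0.0.0/0", 443, allow=True),        # HTTPS is open to everyone else
    Rule("10.0.0.0/8", 22, allow=True),        # SSH only from the internal network
]


def is_allowed(source_ip: str, port: int, rules=RULES) -> bool:
    """Return True if a connection from source_ip to port may pass."""
    addr = ip_address(source_ip)
    for rule in rules:
        if rule.port == port and addr in ip_network(rule.network):
            return rule.allow
    return False  # nothing matched: deny by default
```

With these rules, a visitor at 198.51.100.7 can reach port 443 but not port 22, while anything coming from 203.0.113.0/24 is turned away.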
Another safety measure is HTTPS (HyperText Transfer Protocol Secure). It is HTTP, the protocol browsers and web servers use to exchange pages, sent over a connection encrypted with TLS. If you look at your browser’s address bar, you may have noticed that certain sites display a small padlock next to their domain. This means the information is being transferred over a secure protocol. The traffic could still be intercepted, but since it is encrypted, whoever intercepts it would have a hard time reading it.
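As a rough illustration of the encrypted connection behind the padlock, the sketch below uses Python’s standard ssl module. The helper names make_tls_context and fetch_tls_info are mine, and a browser does considerably more than this; the point is only that the socket is wrapped in TLS and the server’s certificate is checked before any page data is exchanged.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def fetch_tls_info(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and report what was negotiated."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # The handshake happens here; it fails if the certificate
        # is invalid or does not match the requested host.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return {
                "protocol": tls_sock.version(),
                "cipher": tls_sock.cipher()[0],
            }
```

Running fetch_tls_info("www.holbertonschool.com") requires network access, and the protocol version and cipher it reports depend on what the server and your Python installation support.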
The next thing servers generally have, especially those behind big websites, is a load balancer. Imagine that your website has so many clients requesting information at the same time that they saturate the server’s capacity and the website simply goes down. A load balancer lets you avoid putting all your eggs in one basket by serving your site from several servers. It uses algorithms that decide how much of the load to send to each server depending on its capacity and how busy it currently is. If you have only one server, someone who wanted to put your site out of commission would just have to attack that one server; with several servers behind a load balancer this becomes much harder. Likewise, if you need to update your site, you can make the changes on one server while the site stays up on the others. A component whose failure takes the whole site down, like that lone server, is known as a SPOF (single point of failure) and should be avoided.
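One possible balancing algorithm, sketched in Python, is to send each request to the server with the lowest load relative to its capacity. I don’t know which algorithm Holberton’s load balancer uses; the server names and capacities below are made up, and the mark_down method stands in for the health checks a real load balancer would run.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    capacity: int    # relative number of requests it can handle at once
    active: int = 0  # requests currently in flight

    @property
    def load(self) -> float:
        return self.active / self.capacity


class LoadBalancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = {b.name for b in self.backends}

    def mark_down(self, name: str) -> None:
        """Stop sending traffic to a server, e.g. while it is being updated."""
        self.healthy.discard(name)

    def mark_up(self, name: str) -> None:
        self.healthy.add(name)

    def pick(self) -> Backend:
        """Choose the healthy backend with the lowest relative load."""
        candidates = [b for b in self.backends if b.name in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends")
        chosen = min(candidates, key=lambda b: b.load)
        chosen.active += 1
        return chosen

    def release(self, backend: Backend) -> None:
        """Call when a request has finished."""
        backend.active -= 1
```

With a hypothetical web-01 of capacity 2 and web-02 of capacity 1, three simultaneous requests go to web-01, web-02 and web-01 in that order, and if web-01 is marked down all new requests go to web-02.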
We have now reached Holberton’s web server, a program that serves static content, such as HTML pages that don’t change from one visitor to the next. Apache is one of the most widely used. Its job is to find the content corresponding to the requested address and send it back as an HTTP or HTTPS response. This is only enough for basic websites, though. More complex, interactive sites, like Holberton’s, also need an application server: software that runs the site’s logic, can interact with databases and generates responses on the fly. Think of YouTube, for example. You access it with an account, and you can create, modify and delete that account. All of this is taken care of by an application server.
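The difference between static and dynamic content can be shown with a small WSGI application, the standard interface between Python web servers and applications. This is an illustrative sketch, not Holberton’s setup: the paths, the page contents and the greet function are invented, and in a real deployment the static part would usually be handled by the web server itself.

```python
import html
from urllib.parse import parse_qs

# Hypothetical static content: the same bytes for every visitor.
STATIC_FILES = {"/": b"<h1>Welcome</h1>"}


def greet(environ) -> bytes:
    """Dynamic content: the response depends on the request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["stranger"])[0]
    return f"<p>Hello, {html.escape(name)}!</p>".encode("utf-8")


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if path in STATIC_FILES:
        body = STATIC_FILES[path]   # web-server style: look up and return
    elif path == "/hello":
        body = greet(environ)       # app-server style: compute the page
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]
```

To try it locally, the standard library’s wsgiref.simple_server.make_server("", 8000, app).serve_forever() serves it on port 8000, after which /hello?name=Ada returns a page generated for that request.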
If the app server interacts with a database, we can infer that there must also be a database somewhere in Holberton’s infrastructure. A database is an organized collection of data, and a database management system (DBMS) is the software in charge of storing, querying and modifying that data. MySQL is a popular example.
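Here is a sketch of the kind of account operations an application server might run against a database. To keep it self-contained I use SQLite, which ships with Python, instead of MySQL; the table layout and function names are assumptions of mine, though the SQL statements would look much the same on MySQL.

```python
import sqlite3
from typing import Optional


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " username TEXT UNIQUE NOT NULL,"
        " email TEXT NOT NULL)"
    )


def create_account(conn: sqlite3.Connection, username: str, email: str) -> int:
    # Placeholders (?) keep user input separate from the SQL itself.
    cur = conn.execute(
        "INSERT INTO accounts (username, email) VALUES (?, ?)", (username, email)
    )
    conn.commit()
    return cur.lastrowid


def update_email(conn: sqlite3.Connection, username: str, email: str) -> None:
    conn.execute("UPDATE accounts SET email = ? WHERE username = ?", (email, username))
    conn.commit()


def delete_account(conn: sqlite3.Connection, username: str) -> None:
    conn.execute("DELETE FROM accounts WHERE username = ?", (username,))
    conn.commit()


def get_email(conn: sqlite3.Connection, username: str) -> Optional[str]:
    row = conn.execute(
        "SELECT email FROM accounts WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None
```

Opening the database with sqlite3.connect(":memory:") keeps everything in RAM, which is handy for experimenting: create an account, change its email, delete it, and get_email returns None afterwards.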
Finally, there is a set of rules for how the information should be transmitted. This is handled by TCP (Transmission Control Protocol), which sits underneath HTTP and HTTPS and determines how client and server establish a connection and how the data is split into packets, delivered in order and retransmitted if lost.
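To see a TCP exchange in miniature, the sketch below runs a tiny server and a client on the same machine using Python’s standard socket module. It is a simplified illustration with names of my own choosing: the operating system does the real TCP work (the handshake, ordering and retransmission), and the code only opens the connection and moves bytes across it.

```python
import socket
import threading


def read_all(sock: socket.socket) -> bytes:
    """Read until the other side signals it has finished sending."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the message and send a reply."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(b"received: " + read_all(conn))


def round_trip(message: bytes) -> bytes:
    # Port 0 asks the operating system for any free port on the loopback interface.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            reply = read_all(client)
        worker.join()
    return reply
```

Calling round_trip(b"GET / HTTP/1.1") sends those bytes to the local server and gets them back with a "received: " prefix. A browser does the same thing over the internet, with an HTTP request as the message.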
Those are the steps involved between hitting Enter in our browser and having the website displayed. To recap: our browser asks the DNS for the IP address of the site we are trying to reach and uses that address to connect to it, generally over a secure protocol, passing through the firewall and being routed by the load balancer. Once the request arrives at a server, the web server and the application server work together to respond with the requested information, which our browser interprets and renders on screen.
This is a high-level overview, and each of these topics is immensely complex, but I hope you now have a clearer idea of what’s going on under the hood.