How do web applications handle user sessions?

In the early days of the Internet, websites served only static content: the same information was delivered to every visitor, with no regard for who was viewing it, when, or why.

Everything started to change with dynamic websites and pages, and above all with the feature that made the Internet what it is today: the user session.

Let's look more closely at what a session is and how web applications manage user sessions, in general terms, to keep things simple.


When a web application is accessed, whatever technology it is built on, it responds to a request from a browser and delivers content, whether static or dynamic.

Static content is content that does not change per user or change frequently, such as JavaScript files, style sheets (CSS), images, photos, icons, videos, and embedded components (applets, calculators, calendars, Flash, etc.).

Dynamic content, on the other hand, is generated at the moment the user makes the request. Each time you open such a page the content may be different, not because someone updated it, but because it was produced while the request was being handled.

In the early days of dynamic pages, it was quite common to visit a website with a blue theme and, at random, be served a different layout, or a regionalized version with other items, or to have the site detect whether you were in Brazil or abroad and redirect you to a version in another language.

But dynamic pages alone would not have been nearly as useful without the Web's key invention: the user session.

What is a user session? It is nothing more than the identification of a specific first request, the allocation of variables for it, and the recognition of the later requests that continue it. The client behind that first request can be a person or a robot.

Take a person using a web browser to visit the Google website. The user types "www.google.com", and the browser is responsible for making the request. How does it do that?

First, it resolves the name "www.google.com" through DNS servers, which return an IP address. It then opens a TCP connection to that address and sends the request:

GET / HTTP/1.1
Host: www.google.com
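
The steps just described (name resolution, TCP connection, sending the request) can be sketched in a few lines of Python. This is only an illustration, not how any particular browser is implemented: the function names `build_get_request` and `fetch` are made up for this example, and the `Connection: close` header is an addition so that the server closes the connection once the response is complete.

```python
import socket


def build_get_request(host, path="/", cookie=None):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if cookie is not None:
        # A browser adds this header once the server has set a cookie.
        lines.append(f"Cookie: {cookie}")
    # Not in the example above: asks the server to close the connection
    # after replying, so the read loop in fetch() terminates.
    lines.append("Connection: close")
    # Header lines end with CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, port=80, path="/"):
    """Resolve the host name, open a TCP connection, send the request, read the reply."""
    # create_connection performs the DNS lookup and the TCP handshake.
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("www.google.com")` would return the raw response bytes, status line and headers included; `build_get_request` can be inspected on its own without any network access.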

The web server then responds with the content. Whether that content is static or dynamic depends on the server's technology, but the response follows the HTTP standard and looks something like this:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 11

Hello World

And the user will see "Hello World" in the browser. In this example, however, the web server did not create a user session: it delivered the content without being able to track future actions. Now notice what happens next:


GET / HTTP/1.1
Host: www.google.com

This time, the web server's response carries one extra header:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 11
Set-Cookie: abcdefg

Hello World

At this point, the web server has delivered a cookie for the browser to save, and it has recorded that cookie in a new area of the server's RAM, where it will keep application data for this user. From then on, every request the browser makes to the same address carries the cookie:

GET / HTTP/1.1
Host: www.google.com
Cookie: abcdefg

On receiving this cookie, the web server can personalize the response and deliver different content. The area of RAM reserved for this cookie can hold variables whose values change from request to request.
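
As a rough sketch of that link between a cookie and a memory area, the server can be pictured as keeping a table that maps each session ID to the variables of that visitor. The names `sessions` and `get_or_create_session` are hypothetical, and real servers add expiry, locking, and often persistence on top of this idea.

```python
import secrets

# Hypothetical in-memory session table: session ID -> that visitor's variables.
sessions = {}


def get_or_create_session(cookie_value=None):
    """Return (session_id, variables, is_new) for a request.

    cookie_value is the ID the browser sent back, or None on a first visit.
    """
    if cookie_value is not None and cookie_value in sessions:
        return cookie_value, sessions[cookie_value], False
    # Unknown or missing cookie: issue a fresh, hard-to-guess ID and
    # reserve an empty area for this visitor's variables.
    session_id = secrets.token_hex(16)
    sessions[session_id] = {}
    return session_id, sessions[session_id], True


# First request: no cookie, so a session is created.
sid, variables, is_new = get_or_create_session()
variables["language"] = "pt-BR"

# Later request with the same cookie: the same variables come back.
_, same_variables, is_new_again = get_or_create_session(sid)
```

Because the table lives in the process's memory, everything in it disappears when the server restarts, which matches the RAM-based picture described here.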

Suppose that inside this web server there is an application that keeps a variable "counter" and computes "counter = counter + 1" on each access, and that whenever the counter is greater than 1, the text ", you visited it X times!" appears after "Hello world". We will then get the following response:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 36

Hello world, you visited it 2 times!

Note that no new "Set-Cookie" was delivered, because the session has already been established. Notice the following interactions:

GET / HTTP/1.1
Host: www.google.com
Cookie: abcdefg

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 36

Hello world, you visited it 3 times!

GET / HTTP/1.1
Host: www.google.com
Cookie: abcdefg

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 36

Hello world, you visited it 4 times!
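
The whole exchange above can be reproduced with a small sketch. It is a simplified model under stated assumptions, not a real web server: `handle` is a hypothetical function that receives only the value of the `Cookie` header, and the cookie is given the name `session` (the transcripts above show just a bare value) so that Python's standard `http.cookies` module can parse it.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # session ID -> variables (in memory, lost on restart)


def handle(cookie_header=None):
    """Return (headers, body) for one GET / request.

    cookie_header is the value of the request's Cookie header, if any.
    """
    headers = {"Content-Type": "text/html"}
    jar = SimpleCookie(cookie_header or "")
    session_id = jar["session"].value if "session" in jar else None

    if session_id not in sessions:
        # First visit (or unknown ID): open a session and hand out the cookie.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {"counter": 0}
        headers["Set-Cookie"] = f"session={session_id}"

    variables = sessions[session_id]
    variables["counter"] += 1

    body = "Hello World"
    if variables["counter"] > 1:
        body = f"Hello world, you visited it {variables['counter']} times!"
    # Content-Length counts bytes of the encoded body, not characters.
    headers["Content-Length"] = str(len(body.encode("utf-8")))
    return headers, body


first_headers, first_body = handle()
second_headers, second_body = handle(first_headers["Set-Cookie"])
```

The first call returns "Hello World" together with a `Set-Cookie` header; the second call, which sends the cookie back, returns the visit count and no new cookie.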


And so on, until, if nothing keeps it in check, the variable overflows: an integer overflow, in which the value grows past the maximum its type can hold. A 16-bit unsigned integer, for example, can only count up to 65535. The exact limit depends on the data type and the programming language.

Whatever the technology, establishing a user session is essential for building web applications of any kind.

Without a user session, there is no way to tie a sequence of requests to a particular user.
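
To make the limit concrete, here is a small sketch of a counter confined to 16 bits. Python's own integers grow without bound, so the wrap-around is simulated with a bit mask and with a C type from the standard `ctypes` module; `increment_uint16` is a name chosen for this example.

```python
import ctypes

UINT16_MAX = 0xFFFF  # 65535, the largest value a 16-bit unsigned integer holds


def increment_uint16(value):
    """Add 1 the way a 16-bit unsigned counter would, wrapping past 65535."""
    return (value + 1) & UINT16_MAX


print(increment_uint16(65534))           # 65535
print(increment_uint16(65535))           # 0, the counter wrapped around
print(ctypes.c_uint16(65535 + 1).value)  # 0, same wrap-around via a C type
```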

In Europe, websites must explicitly warn users that they use cookies to track them. In practice this warning now appears on nearly every website, because most websites today are full web applications that need a user session, even when the user is not actually "logged in".

This is because of the way web applications were built from the start, which made the session a standard feature. Technologies such as ASP and ASP.NET, for example, create a session for the user even if the server holds only a static file with the extension ".asp" or ".aspx". Simply because the file is processed by a handler (a processor that interprets the file), the cookie is delivered to the user by default, before interpretation even begins.

In many cases it is difficult to disable the delivery of a technology's session cookie so that the user is not tracked, even when only a static file is being served.

However, currently, even a static file can be an application, which passes through a handler to process the request.

A handler is nothing more than something that sits between the "GET" and the "200" of the web server and can modify the content of the response.

With PHP installed on an IIS server, the handler is configured for all files with the extension ".php". When the web server receives a request for a file with this extension, it passes the file to the handler, and the handler's output is sent to the end user.

The same applies to ASP, ASP.NET, Java, etc.
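
The idea of mapping file extensions to handlers can be pictured with a short sketch. It does not reflect how IIS or PHP are actually configured: `HANDLERS`, `php_like_handler`, `static_handler`, and the `{{path}}` placeholder are all invented here to show the dispatch step only.

```python
import os


def php_like_handler(path, raw_content):
    # Stand-in for an interpreter: it transforms the file before it is sent.
    return raw_content.replace("{{path}}", path)


def static_handler(path, raw_content):
    # Static files pass through unchanged.
    return raw_content


# Hypothetical mapping from file extension to handler.
HANDLERS = {".php": php_like_handler}


def respond(path, raw_content):
    """Pick a handler by extension and return (status, body)."""
    extension = os.path.splitext(path)[1].lower()
    handler = HANDLERS.get(extension, static_handler)
    return "200 OK", handler(path, raw_content)
```

A request for "/index.php" goes through the interpreter stand-in, while a request for any other extension falls back to `static_handler` and is returned as stored.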

In the case of Java, the situation is a little different: the handler is called a Servlet, and it takes care of everything routed to it, not just certain files or extensions. Even files that look static can be served by Servlets, which can, for example, write out the contents of a dynamically generated image.

Java applications may or may not use sessions, but session variables cannot be used without establishing the user's session, that is, without allocating space in the application's RAM to hold those variables. Otherwise, whatever is changed in the session is lost with each new request.

Applications in any language need to link the cookie delivered to the end user with a memory area. The difference is that this link can be made in other ways, or the pattern can be skipped entirely when it serves no purpose.

Applications that must serve the end user but cannot use cookies, for example because of a law, can use another method (which ASP.NET provides by default if you disable cookies): placing the user's session ID in the URL, so that every link the person clicks carries the same session ID. This is the simplest way to track a user without cookies, but it is also insecure, because the session ID is easily exposed in the URL.

Let's see below how the web server does this control:


GET / HTTP/1.1
Host: www.google.com

HTTP/1.1 302 Found
Location: /abcdef/


GET /abcdef/ HTTP/1.1
Host: www.google.com

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 11

Hello World

GET /abcdef/ HTTP/1.1
Host: www.google.com

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 36

Hello world, you visited it 2 times!


This is a classic way to track the user without cookies. On the web server side it is handled the same way as before: a unique code such as "abcdef" is generated for each new session and must never be duplicated.

For many technologies, the memory area is by default tied to a specific process, an executable, managed by the handler or servlet, which is responsible for generating the cookie or code that tracks the user and for creating the RAM space associated with it.

Robots on the Internet can also have a user session. In general, though, robots do not store the session cookie or code that the server hands out, so each time a robot comes back, the application creates new cookies and new areas in RAM.

A malicious robot can access a website more than 50 thousand times and fill the RAM with unidentified sessions, for example by never sending back the "cookie" or never following the redirect to the URL containing the session code. In that case the application would open 50 thousand areas in RAM, consuming a great deal of memory, depending on what the handler needs to store for each user.
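
A minimal sketch of this URL-based tracking follows. It assumes, purely for illustration, that the session ID is the first segment of the path and that an unknown ID is answered with a redirect to a freshly generated one; `handle` and `sessions` are hypothetical names, and the ID is produced with Python's `secrets` module so that collisions are extremely unlikely.

```python
import secrets

sessions = {}  # session ID -> variables


def handle(path):
    """Return (status, headers, body) for GET <path> with the session ID in the URL."""
    # The first path segment, if any, is treated as the session ID.
    session_id = path.strip("/").split("/")[0]

    if session_id not in sessions:
        # No known session in the URL: create one and redirect to it.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {"counter": 0}
        return "302 Found", {"Location": f"/{session_id}/"}, ""

    variables = sessions[session_id]
    variables["counter"] += 1
    body = "Hello World"
    if variables["counter"] > 1:
        body = f"Hello world, you visited it {variables['counter']} times!"
    headers = {"Content-Type": "text/html", "Content-Length": str(len(body))}
    return "200 OK", headers, body


status, redirect_headers, _ = handle("/")               # redirect to /<id>/
first = handle(redirect_headers["Location"])            # "Hello World"
second = handle(redirect_headers["Location"])           # visit count appears
```

Anyone who obtains the URL obtains the session, which is the exposure mentioned above.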
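
The effect is easy to simulate. The sketch below uses a toy `handle` function and an in-memory `sessions` table (both hypothetical) to compare a client that returns its cookie with one that never does; it models only the growth of the table, not any real server's memory use.

```python
import secrets

sessions = {}


def handle(cookie=None):
    """Open a new session unless the client returns a known cookie."""
    if cookie not in sessions:
        cookie = secrets.token_hex(16)
        sessions[cookie] = {"counter": 0}
    sessions[cookie]["counter"] += 1
    return cookie


# A browser keeps the cookie, so 50,000 visits share one session.
browser_cookie = handle()
for _ in range(49_999):
    handle(browser_cookie)
sessions_after_browser = len(sessions)

# A robot that never sends the cookie back opens a session on every request.
for _ in range(50_000):
    handle()
sessions_after_robot = len(sessions)

print(sessions_after_browser, sessions_after_robot)  # 1 50001
```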

Enterprise applications, such as those built on Java EE, keep many objects in memory, typically service requests, contracts, and virtual tables, all stored in session objects, and so usually consume a lot of memory per user.

The session discussed so far has nothing to do with whether the user is "logged in" or "authenticated". It concerns only how a session begins and how it is generated and handled as the basis for building web applications.

Authentication is nothing more than saving a session variable in RAM that records whether the user is authorized to access the application. As in the counter example, the application checks on every request whether the user still needs to authenticate and, if so, sends them to the login screen.

Authenticating a user is nothing more than taking advantage of the user session and saving one more variable, among several that may or may not already exist.
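
In code, that "one more variable" might look like the sketch below. Everything here is illustrative: `login`, `handle_private_page`, the `/login` path, and the plain-text `accounts` table are assumptions made for the example, and a real application would store password hashes rather than passwords.

```python
def login(session, username, password, accounts):
    """Mark the session as authenticated if the credentials match."""
    # accounts is a hypothetical {username: password} table used only for
    # illustration; real applications store password hashes, not passwords.
    if username in accounts and accounts[username] == password:
        session["user"] = username
        return True
    return False


def handle_private_page(session):
    """Return (status, headers, body) for a page that requires login."""
    if "user" not in session:
        # No authentication variable yet: point the visitor to the login screen.
        return "302 Found", {"Location": "/login"}, ""
    return "200 OK", {"Content-Type": "text/html"}, f"Welcome, {session['user']}"


# The session already exists and holds other variables before any login.
session = {"cart": ["book"]}
accounts = {"ana": "s3cret"}
before = handle_private_page(session)
logged_in = login(session, "ana", "s3cret", accounts)
after = handle_private_page(session)
```

The cart placed in the session before logging in is still there afterwards, since authentication only adds a variable to the existing session.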

It is thanks to the session that you can shop in online stores, placing products in the cart, and register on the site only after clicking "buy".

You do not need to be authenticated or logged in to choose your products. In fact, you have been in a user session since you first viewed the site's homepage.

Programmers early in their careers often confuse "user session" with "user authentication", but they are different things: the session is the primary identification of a request and the allocation of variables in memory, while authentication is the manipulation of those variables within an existing session.

It is common to find programmers who try to authenticate users and fail because the user session was never created. This is quite common in languages like PHP, which require specific code for the handler to start the session mechanism before variables can be saved; otherwise the data is lost.

This is generally seen as a problem, because older technologies handle the user session transparently, so that when authentication is needed the session is already there, ready for use.

Languages like Python, Java, PHP, and the like require the session mechanism built into the technology to be enabled explicitly. This is because there are frameworks that can provide other forms of user session than the standard one, to meet a business rule or specific government laws.
