Application security is crucial right from the early stages of web application development. Developers and other team members involved in the product design stage should at least be aware of the need for web application security. However, recent studies and discussions show that developers do not actively pay attention to the security of their code unless directed otherwise.
In this article, we focus on research conducted by the University of Bonn into the behavior patterns of students and developers who write code. We then examine some of the methods currently used to secure code and suggest a better, more secure solution.
Password-Storage Field Study on Students
In 2017 and 2018, researchers at the University of Bonn conducted research on two groups, with 20 Computer Science students each, about their habits when writing code to securely store passwords. The students were asked to develop a registration program for the university’s social network. One of the groups was explicitly told to store user passwords securely; the other group was not.
According to the results of the study:
- The participants in the group that was not instructed to store user passwords securely largely made no effort to implement secure code
- Two participants from this group initially attempted to create a secure implementation, but then abandoned their efforts
- Of the 20 participants in the group that was explicitly asked to code a secure program, 12 implemented some level of security
The researchers wondered: how would the results compare if the study were conducted with web application developers instead?
Though the student participants in the study knew that they were part of a research activity, they probably weren’t aware that security was part of the test. In fact, out of the 28 participants in total who completed the task, 15 of them implemented no security measures, probably because they assumed the exercise was just for testing purposes and not for real users.
Conducting the Same Research on Developers
Later, the same research was conducted on freelance developers hired from freelancer.com by the same researchers at the University of Bonn.
- The researchers posted an advertisement on freelancer.com, purporting to be a start-up company. They specifically asked for the assistance of developers on the registration function of a sample social network website.
- The 43 applicants were split into two groups. One of the groups was explicitly told to store the user passwords securely, while the other group was not.
- By the end of the submission deadline, an examination of the code of 18 participants revealed that it failed to include a secure password storage mechanism. Fifteen of these participants, 68%, belonged to the group that was not specifically instructed to implement secure code.
The results of the study conducted on developers from freelancer.com confirm the tendency for developers not to implement security measures for storing user passwords unless directed to do so. In addition, some participants adopted the wrong approach to secure coding or opted for out-of-date implementations.
Developers Must be Directed to Write Secure Code
Both studies demonstrated that specifically requiring developers to write secure code plays a crucial role in the design of the application. The majority of freelance developers who were hired into a low budget program otherwise neglected to implement secure code.
In addition, when the two studies are set side by side, the developers who knew that the program was intended for real use produced results alarmingly similar to those of the participants who did not know this!
Developers Don’t Have Enough Information About Security
The terms 'encryption' and 'hashing' are used interchangeably by developers. However, the code that they write betrays their true understanding of these terms. Eight out of 43 participants in the Bonn study used Base64-encoding as a means of secure password storage. Since Base64-encoding isn’t secure, those developers and students will need to improve and upgrade their knowledge on secure coding as they currently use out-of-date methods. Developers must follow the news and developments in web security. Those who do not will make dangerous mistakes, like the ten participants in the research who employed the weak hashing algorithm, MD5.
Copying and Pasting Code
Similar to the study conducted on students, the results of the study conducted on developers demonstrated that a total of 17 out of 43 developers copied solutions for security implementations from the internet.
Earlier, in 2014, Rebecca Balebako conducted research on The Privacy and Security Behaviors of Smartphone App Developers. The project was based on interviews with 13 developers and a survey of 228 developers. The findings revealed that developers believed that they stored data securely. Yet the University of Bonn research discussed above strongly suggests that this belief does not match what actually happens in the workplace.
Here are the methods used by developers to secure user data, as captured in the research by the University of Bonn:
You can read more about the research in their paper, “If you want, I can store the encrypted password.” A Password-Storage Field Study with Freelance Developers.
Insecure Password Storage Methods
Encoding, encryption and hashing are different actions, yet these words are often used interchangeably. As seen in the research by the University of Bonn, developers often make the mistake of using weak methods – such as using encoding or encryption instead of hashing, or using inadequate hashing algorithms – in the hope of making user input secure. We need to examine these terms to find out the meaning, capability and limit of each.
The Difference Between Encoding and Encryption
Sometimes when a text is encoded, the output seems cryptic. For this reason, encoding is often mistaken for an encryption process. The main difference between encoding and encrypting is that, when encoding, you can directly reverse the output to access the original text. With encryption, however, you usually need a key to decrypt the text.
As observed in the University of Bonn research, and in data leaks, Base64 is one of the most common forms of encoding schemes. Encoding schemes are designed to encode the text with a different character set.
In the Base64 encoding scheme, the encoded output consists of 64 human-readable characters, usually A-Z a-z 0-9 and additional special characters. The purpose of using Base64 encoding is not to store passwords in an encrypted way. Rather, it is to display the data without characters that can't be printed or to avoid characters that have a special meaning within the program that processes the output.
For example, when you encode ‘Netsparker’ with Base64, you get TmV0c3Bhcmtlcg==. The easily recognizable character set of Base64, and sometimes the padding with an equals sign at the end, can easily reveal the encoding scheme in use to an attacker.
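The point that encoding needs no key can be illustrated with a minimal Python sketch using the standard base64 module. The variable names are illustrative only and are not taken from the study.

```python
import base64

original = "Netsparker"

# Encoding: no key or secret is involved at any point.
encoded = base64.b64encode(original.encode("utf-8")).decode("ascii")

# Decoding: anyone who obtains the stored value can reverse it directly.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Because the decode step requires nothing but the stored value, a Base64-encoded password offers an attacker no real obstacle.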
Encryption is another password storage method. It is technically more secure than encoding, yet still completely inadequate for storing passwords securely. Input is encrypted using a specific key and can only be decrypted using that same key. However, if attackers acquire the key, they can decrypt all the passwords back to their cleartext state. So if developers do not need the cleartext versions of the passwords – and usually they absolutely don't – they should be using hashing and salting instead to store user passwords and other valuable data.
What are Hash Algorithms and Salting?
Hashing involves processing and passing the password through a function, which maps the data to a hash sum of a fixed length. Usually, the password cannot be restored back to the original text by reapplying the operations of the hash function in the reverse order.
For example, when you hash ‘Netsparker’ with the SHA-256 hash function you get the following:
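The digest can be reproduced with Python's standard hashlib module. This is a minimal sketch; the variable names are illustrative, and the input is assumed to be encoded as UTF-8 before hashing.

```python
import hashlib

password = "Netsparker"

# SHA-256 maps input of any length to a fixed 256-bit value,
# shown here as 64 hexadecimal characters.
digest = hashlib.sha256(password.encode("utf-8")).hexdigest()

print(digest)
print(len(digest))
```

Running it twice gives the same digest, which is what makes a stored hash usable for login checks.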
Storing passwords in the database after hashing them is one security method. However, although it is safer than storing them in cleartext, it has disadvantages too.
Using a Strong Hashing Algorithm
The strength of a hashing algorithm depends, among other factors, on how hard it is to craft two messages with the same hash sum, known as a ‘collision’. It’s important to point out that collisions are only practical to produce against weak hashing algorithms. In February 2017, Google engineers, working with researchers at CWI Amsterdam, produced the same SHA-1 output from two different PDF files. This was the first publicly demonstrated practical collision for an algorithm that had been widely trusted for years. If you want to know how an attacker can figure out which hashing algorithm you use, you can read more about the Collision Based Hashing Algorithm Disclosure on the Netsparker blog.
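Another weak hash algorithm is MD5. Two different inputs, when hashed with MD5, may produce the same output. For example, the two 128-byte inputs below (eight lines each, shown as hexadecimal bytes) differ in six bytes yet share the same MD5 hash.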
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
The output for both of the inputs above is 79054025255fb1a26e4bc422aef54eb4 (Hex Digest).
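The collision can be checked with Python's standard hashlib module. The sketch below copies the two inputs quoted above. The constant and helper names (HEX_A, HEX_B, to_bytes) are illustrative assumptions, and the SHA-256 comparison is added only to show that a stronger algorithm tells the inputs apart.

```python
import hashlib

# The two 128-byte inputs quoted above, as hexadecimal text.
HEX_A = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
"""

HEX_B = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
"""


def to_bytes(hex_text: str) -> bytes:
    """Strip all whitespace and convert the hex text to raw bytes."""
    return bytes.fromhex("".join(hex_text.split()))


input_a = to_bytes(HEX_A)
input_b = to_bytes(HEX_B)

# Count the byte positions at which the two inputs differ.
differing_bytes = sum(1 for a, b in zip(input_a, input_b) if a != b)

md5_a = hashlib.md5(input_a).hexdigest()
md5_b = hashlib.md5(input_b).hexdigest()

print("inputs identical:", input_a == input_b)
print("differing bytes: ", differing_bytes)
print("MD5 digests equal:", md5_a == md5_b)
print("SHA-256 digests equal:",
      hashlib.sha256(input_a).digest() == hashlib.sha256(input_b).digest())
```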
Strong Algorithms May Not Be Enough
There is another important point to note about hashing, beyond the fact that stronger hash algorithms resist collisions: when the input is the same, the output is always the same. So-called rainbow tables are therefore a problem for users who choose common passwords. Attackers don't have to brute-force each and every password hash of every user. Instead, they only have to generate a list of hashes mapped to their plaintext inputs once, and then look up the leaked hashes in that list.
For example, a password such as 123456qwerty will have the following SHA-256 version in the database:
Attackers can discover this from a pre-made list.
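The lookup can be sketched in a few lines of Python. This is a simplified illustration that uses a plain dictionary of precomputed hashes rather than a true rainbow table, which stores the same information more compactly. The word list, user names and leaked records are all made up for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the unsalted SHA-256 digest of a string as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Built once by the attacker from a list of common passwords (hypothetical list).
common_passwords = ["123456", "password", "qwerty", "123456qwerty"]
precomputed = {sha256_hex(p): p for p in common_passwords}

# Hypothetical leaked table of unsalted hashes.
leaked_hashes = {
    "alice": sha256_hex("123456qwerty"),
    "bob": sha256_hex("123456qwerty"),
    "carol": sha256_hex("x9!Lq#27vT"),
}

# A dictionary lookup per user replaces any per-user brute forcing.
recovered = {user: precomputed.get(h) for user, h in leaked_hashes.items()}

for user, password in recovered.items():
    print(user, "->", password if password else "not in the list")
```

Note that alice and bob share the same stored hash, so cracking one common password exposes every account that uses it.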
How to Store Passwords Securely
The best way to store passwords is to hash and 'salt' them. Salting is the name given to the process by which a random value is generated for each user and then added to their plaintext password before hashing. Since the salt value is random and added to the plaintext value, the same password will result in totally different hash sums, and therefore values on the pre-made hash lists will not match those that are found in the database, in case of a data breach.
Here is the process that should be completed for each user password.
protected_password = hash(password + salt)
A correct implementation of the salt process would ensure that:
- The salt value is unique for each user
- The password is stored in a separate table or database
- The salt value is changed when the user changes their password
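The formula above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production implementation: the function names hash_password and verify_password are hypothetical, SHA-256 stands in for the generic hash() in the formula, and the 16-byte salt length is an arbitrary choice. Real systems typically replace the single hash pass with a deliberately slow password-hashing function.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Generate a fresh random salt and return (salt, hash(password + salt))."""
    salt = secrets.token_bytes(16)  # new random salt for every user and every password change
    digest = hashlib.sha256(password.encode("utf-8") + salt).digest()
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the salted hash and compare it in constant time."""
    candidate = hashlib.sha256(password.encode("utf-8") + salt).digest()
    return hmac.compare_digest(candidate, expected)


# Two users who pick the same password end up with different stored values.
salt_1, stored_1 = hash_password("123456qwerty")
salt_2, stored_2 = hash_password("123456qwerty")

print(stored_1.hex())
print(stored_2.hex())
print(verify_password("123456qwerty", salt_1, stored_1))
print(verify_password("wrong-guess", salt_1, stored_1))
```

Because each salt is random, a pre-made list of hashes for common passwords no longer matches anything in the database.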
As the University of Bonn’s research results show, if we want security to be built into the design of web applications by default, we need to start educating both students and developers now about the latest, best, and most secure solutions. Otherwise, a lackadaisical attitude toward security, combined with a lack of knowledge about handling and securing user data, even among developers, will cause confusion and, worse, open a back door for malicious users.