Cryptography can be defined as the art/discipline or study of methods for secure communications. More specifically, the analysis and design of mathematical algorithms and protocols that allow for clear data to be rendered unintelligible. Though mathematics is a key element in this discipline, cryptography gathers a myriad of other fields: physics, electronics, computer science, ... that combine together to provide solutions to the security requirements of the different components of a communications system.
Nowadays, cryptographic methods are deployed to secure and authenticate almost every electronic transaction that goes through the gargantuan network of connected devices on which modern 21st century life relies: the Internet. Phone communications, bank transactions, web browsers, networking protocols, ... all rely heavily on standardized cryptographic algorithms and protocols that provide a certain level of confidentiality and act as a deterrent for whoever wishes to eavesdrop on a communications channel. Imagine if anybody with a radio antenna could extract the content of banking communications emitted through WiFi or 4G and falsify it. The world would be in chaos! Cryptography is the only thing standing between our fairly ordered modern digital world and such chaos.
Cryptography is a field in constant evolution, in form and in content, and there are many references available on the web for you to try and wrap your intellect around the enormous wealth of notions and concepts that make up this discipline. Every security problem is multi-faceted and requires solid notions in mathematics, software engineering, electronics, and hardware architecture. This said, I would recommend any individual wishing to dive deep into the world of cryptography and security to broaden their spectrum of knowledge as much as possible and to try to adopt a systemic approach.
This article is an attempt at concentrating a certain amount of knowledge around the subject into a single source, with an emphasis on certain technical aspects related to use cases, implementation, weaknesses, and attacks.
Enjoy!
Hashing consists of using a cryptographic one-way hash function to generate a fixed-size, non-invertible cryptographic signature (also called a hash or digest) for a given input.
To be considered secure, cryptographic hash functions must ensure that the generated signature exhibits certain properties regardless of the input.
For example, one of the desired properties of hash functions is the avalanche effect, which ensures that if a single bit is altered in an input, the generated
signature will differ extensively from the original signature of the unaltered input. From the example below, we can easily observe that the two byte streams "Hello" and "Gello",
which differ only in their first character, have very different SHA2-256 signatures.
sha2_256(Hello) --> 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
sha2_256(Gello) --> bb083e1491c3d7fb97fb235b0e1b65e73190730c3293b9ca4d305f631f267833
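The avalanche effect can be checked with a few lines of Python. This is a minimal sketch using only the standard `hashlib` module; the helper `bit_difference` is a hypothetical name introduced here for illustration and is not part of any library.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

h1 = hashlib.sha256(b"Hello").digest()
h2 = hashlib.sha256(b"Gello").digest()

print(h1.hex())
print(h2.hex())
# How far apart are the inputs, and how far apart are the digests?
print(bit_difference(b"Hello", b"Gello"), "input bits differ")
print(bit_difference(h1, h2), "of 256 digest bits differ")
```

For a well-behaved hash function, roughly half of the 256 digest bits are expected to flip, even though the two inputs differ only in the low bits of their first byte.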
Another desired property is collision resistance: it should be computationally infeasible to find two different inputs that produce the same signature. Ideally:
if hash(a) = hash(b) then a = b
Older hash functions such as MD5 no longer satisfy this property. For example, the two 128-byte streams below differ in only a handful of bytes, yet they share the same MD5 signature. The first byte stream:
d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89 55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0 e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70
The second byte stream:
d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89 55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0 e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70
In the case of MD5, not only were colliding byte streams found, but colliding files can be generated at will. This means that an attacker can swap a valid file with a corrupt one that has the same signature. For example, this could be used to swap signed certificates, or packages available for download. Below is a diagram showcasing a round of the MD5 hashing algorithm.
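The collision can be checked directly. The sketch below assumes that the two byte streams listed above are the widely published 128-byte MD5 collision pair; it feeds both to `hashlib.md5` and, for contrast, to `hashlib.sha256`. The variable names are illustrative only.

```python
import hashlib

# bytes.fromhex ignores the spaces between the 8-byte groups.
first = bytes.fromhex(
    "d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89 "
    "55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b "
    "d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0 "
    "e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70"
)
second = bytes.fromhex(
    "d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89 "
    "55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b "
    "d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0 "
    "e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70"
)

same_input = first == second
same_md5 = hashlib.md5(first).digest() == hashlib.md5(second).digest()
same_sha256 = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()

print("inputs identical: ", same_input)
print("MD5 identical:    ", same_md5)
print("SHA-256 identical:", same_sha256)
```

If the streams are copied correctly, the inputs differ while their MD5 digests match, which is exactly what collision resistance is supposed to prevent. The SHA-256 digests of the same two streams remain different.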
Keyed hashing is a variant of hashing where a secret is added to the input byte stream (message), along with other parameters, in order to allow for authentication. The generated signature is therefore dependent on a secret and can only be reproduced by the entities knowing the secret. Keyed hashing is used for example to allow parties to authenticate and verify encrypted messages.
Suppose A wants to transfer a message M to B. Suppose A and B have already decided on an encryption key K.
Let E = encrypt(M, K) be the message M encrypted using the shared secret K.
If A only transfers E, B has no means for authenticating or verifying whether the received data is valid and hasn't been corrupted by an attacker.
Suppose A decides to send E and a hash H = hash(E) in order to allow the receiver B to verify the transferred data.
An attacker can still modify E and recompute H using the altered ciphertext. Therefore, simply attaching an unkeyed signature/hash/digest does not
provide strong authentication or verification.
One solution is to use keyed hashing. In this case, A will compute H = HMAC(E, K) and transfer it to B alongside E.
Now, the signature depends on the shared secret, making it almost impossible to alter the data without detection. Upon receiving the data from A,
B will then compute the keyed hash H' = HMAC(E, K) of the received encrypted data and compare it to the received hash H. If H' = H, the data is accepted; otherwise, it is rejected.
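The exchange between A and B can be sketched with Python's standard `hmac` module. This is a minimal illustration rather than a complete protocol: the ciphertext is a placeholder byte string standing in for E = encrypt(M, K), HMAC-SHA256 is an assumed choice of keyed hash, and the function names `tag` and `verify` are hypothetical. Note that `hmac.new` takes the key first and the message second, the reverse of the HMAC(E, K) notation used above.

```python
import hashlib
import hmac
import os

def tag(ciphertext: bytes, key: bytes) -> bytes:
    """Sender side: compute H = HMAC(E, K) using HMAC-SHA256."""
    return hmac.new(key, ciphertext, hashlib.sha256).digest()

def verify(ciphertext: bytes, received_tag: bytes, key: bytes) -> bool:
    """Receiver side: recompute H' and compare it to H in constant time."""
    return hmac.compare_digest(tag(ciphertext, key), received_tag)

key = os.urandom(32)                              # shared secret K
ciphertext = b"placeholder for E = encrypt(M, K)" # not real ciphertext
h = tag(ciphertext, key)                          # A sends (E, H)

tampered = b"Placeholder for E = encrypt(M, K)"   # attacker alters one byte

print(verify(ciphertext, h, key))   # True: data is accepted
print(verify(tampered, h, key))     # False: modification is detected
```

The comparison uses `hmac.compare_digest` rather than `==` so that the time taken does not reveal how many leading bytes of the tag matched. In practice, a separate key is usually derived for the HMAC instead of reusing the encryption key.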
Symmetric ciphers are a family of ciphers that use the same secret key to encrypt and decrypt a byte stream. The main issue with symmetric ciphers is the key exchange which requires both parties to share the same key prior to any other exchange. This issue is generally resolved by using public key ciphers which will be discussed in a separate section.
Block ciphers are a category of symmetric ciphers that operate on a byte stream by blocks of a fixed size. This means that if the length of the byte stream is not a
multiple of the block size, padding will be applied. Many block ciphers exist, each implementing different data scrambling schemes and mitigations
against known attacks.
For example, AES and Serpent use substitution/lookup tables (S-Boxes), which can be targeted by timing attacks. On the other hand, the designers of Threefish
avoided S-Boxes in order to make the algorithm more resistant against such attacks. Below is a table showing some of the most common block ciphers and their
parameters.
Algorithm |   Block sizes in bits   |   Key sizes in bits   | Number of rounds | Structure |
---|---|---|---|---|
Rijndael (AES) | 128 | 128, 192 or 256 | 10, 12 or 14 (depending on key size) | Substitution-permutation network |
Serpent | 128 | 128, 192 or 256 | 32 | Substitution-permutation network |
Threefish | 256, 512 or 1024 | 256, 512 or 1024 | 72 or 80 (for 1024-bit blocks) | Mix-and-permute |
Twofish | 128 | 128, 192 or 256 | 16 | Feistel network |
Blowfish | 64 | 32 to 448 | 16 | Feistel network |
Kuznyechik | 128 | 256 | 10 | Substitution-permutation network |
Camellia | 128 | 128, 192 or 256 | 18 or 24 | Feistel network |
As you can observe, these algorithms make use of different scrambling structures (Feistel networks, substitution-permutation networks, ...) and can handle different key sizes and, in the case of Threefish, different block sizes as well. None of these algorithms has been completely broken, and all are still considered secure under certain conditions. For example, no effective or substantial attack has been demonstrated on Blowfish, but it is no longer recommended for data larger than 4 GiB, given that its block size of 64 bits is considered too small and might be vulnerable to birthday attacks. Below is a figure showcasing the difference between a Feistel network and a substitution-permutation network.
In a Feistel network, a block is split into two parts: a left part L0 and a right part R0. The right part is fed into a round function Fi, whose output is XORed with the left part L0. The left and right parts are then swapped before being processed in the same manner with a new round function for the next iteration/round.
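The Feistel structure described above can be sketched in a few lines of Python. This is a toy construction for illustration only and is not a secure cipher: the round function (a truncated SHA-256 of the round key and the right half) and the round keys are assumptions made for the example, and all function names are hypothetical.

```python
import hashlib

def round_function(half: bytes, round_key: bytes) -> bytes:
    """Toy round function F_i: hash the round key and the half-block,
    then truncate to the half-block size. Illustrative only."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel_encrypt(block: bytes, round_keys: list) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        # L_{i+1} = R_i ;  R_{i+1} = L_i XOR F_i(R_i)
        left, right = right, xor(left, round_function(right, key))
    return left + right

def feistel_decrypt(block: bytes, round_keys: list) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in reversed(round_keys):
        # Undo one round: R_i = L_{i+1} ;  L_i = R_{i+1} XOR F_i(L_{i+1})
        left, right = xor(right, round_function(left, key)), left
    return left + right

round_keys = [bytes([i]) * 8 for i in range(1, 5)]  # 4 made-up round keys
plaintext = b"16-byte message!"                      # one 128-bit block
ciphertext = feistel_encrypt(plaintext, round_keys)

print(ciphertext.hex())
print(feistel_decrypt(ciphertext, round_keys))
```

A useful consequence of this structure is that the round function never has to be inverted: decryption runs the same rounds with the round keys in reverse order, which is why even a one-way function such as a hash can serve as F in this sketch.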
Stream ciphers are another category of symmetric ciphers that operate directly on the bytes of a byte stream rather than on blocks of bytes. Therefore, these ciphers do not require padding and
are mainly used in cases where the clear message's length cannot be known in advance. Stream ciphers are generally faster and much easier to implement in hardware, and are widely used to secure
communications between small devices (micro-controllers, IoT, ...). For example, ChaCha20 is a stream cipher that provides a security level comparable to (some may argue higher than) that of AES.
It was designed by Daniel J. Bernstein and uses a 256-bit key and a 96-bit (IETF version) or 64-bit (original version) nonce. It has been adopted, in combination with the Poly1305 authenticator,
in TLS and DTLS, and was popularized by its adoption by Google and OpenSSH. Below is a diagram showcasing the inner workings of ChaCha20+Poly1305. This scheme allows for the encryption and authentication
of a given plaintext, as well as the authentication of additional associated data (AEAD - Authenticated Encryption with Associated Data).
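To give an idea of what happens inside ChaCha20, here is a sketch of its basic building block, the quarter round, as specified in RFC 8439. It mixes four 32-bit words using only additions, XORs, and rotations, with no S-Box lookups. The sketch covers this single operation only, not the full cipher or Poly1305, and the helper names are chosen here for illustration. The input and output words are the quarter-round test vector from RFC 8439.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))

def quarter_round(a: int, b: int, c: int, d: int):
    """ChaCha quarter round on four 32-bit words (add, XOR, rotate)."""
    a = (a + b) & MASK32; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 7)
    return a, b, c, d

result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([f"{word:08x}" for word in result])
# ['ea2a92f4', 'cb1cf8ce', '4581472e', '5881c4bb']
```

The full cipher applies this operation repeatedly to the columns and diagonals of a 4x4 state of 32-bit words built from constants, the key, a block counter, and the nonce, for a total of 20 rounds.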