PROJECT3331 is a set of open-source, auditable, and documented cryptographic tools. They are developed with the aim of being secure against known attacks (side-channel and timing attacks, brute-force attacks, ...) and of serving as a resource for learning cryptography and how to implement cryptographic algorithms.

skryb (beta-v1.0) source code	SKRYB, a simple markup language and compiler/translator for writing documentation and technical articles ( GPLv3 license)
skryb (beta-v.10) documentation	SKRYB documentation (WIP)

Basics of cryptography

What is cryptography and why does it matter?

Cryptography can be defined as the art, discipline, or study of methods for secure communications. More specifically, it is the analysis and design of mathematical algorithms and protocols that allow clear data to be rendered unintelligible. Though mathematics is a key element in this discipline, cryptography draws on many other fields (physics, electronics, computer science, ...) that combine to provide solutions to the security requirements of the different components of a communications system.

Nowadays, cryptographic methods are deployed to secure and authenticate almost every electronic transaction that goes through the gargantuan network of connected devices on which modern 21st century life relies: the Internet. Phone communications, bank transactions, web browsers, networking protocols, ... all rely heavily on standardized cryptographic algorithms and protocols that provide a certain level of confidentiality and act as a deterrent for whoever wishes to eavesdrop on a communications channel. Imagine if anybody with a radio antenna could extract the content of banking communications emitted through WiFi or 4G and falsify it. The world would be in chaos! Cryptography is the only thing standing between our fairly ordered modern digital world and such chaos.

Cryptography is a field in constant evolution, in form and in content, and there are many references available on the web for you to try and wrap your intellect around the enormous wealth of notions and concepts that make up this discipline. Every security problem is multi-faceted and requires solid notions in mathematics, software engineering, electronics, and hardware architecture. That said, I would recommend that anyone wishing to dive deep into the world of cryptography and security broaden their spectrum of knowledge as much as possible and try to adopt a systemic approach.

This article is an attempt at concentrating a certain amount of knowledge on the subject into a single source, with an emphasis on certain technical aspects related to use cases, implementation, weaknesses, and attacks.

Enjoy!

Digests

Hashing

Hashing consists in using a cryptographic one-way hash function to generate a fixed-size, non-invertible cryptographic signature (also called a digest) for a given input. To be considered secure, cryptographic hash functions must ensure that the generated signature exhibits certain properties regardless of the input. For example, one of the desired properties of hash functions is the avalanche effect, which ensures that if a single bit is altered in an input, the generated signature will differ extensively from the original signature of the unaltered input. From the example below, we can easily observe that the two byte streams "Hello" and "Gello", which only differ in their first character (four bits, since 'H' is 0x48 and 'G' is 0x47), present very different SHA2-256 signatures.

		
		sha2_256(Hello) --> 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
		sha2_256(Gello) --> bb083e1491c3d7fb97fb235b0e1b65e73190730c3293b9ca4d305f631f267833
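
The comparison above can be reproduced with a short sketch using Python's standard `hashlib` module. The helper names `sha256_hex` and `differing_bits` are illustrative choices and not part of any library; the script simply counts how many bits differ between the two inputs and between the two digests.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA2-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first, second = b"Hello", b"Gello"
digest_first = hashlib.sha256(first).digest()
digest_second = hashlib.sha256(second).digest()

print("sha2_256(Hello) -->", digest_first.hex())
print("sha2_256(Gello) -->", digest_second.hex())
print("input bits that differ: ", differing_bits(first, second))
print("digest bits that differ:", differing_bits(digest_first, digest_second), "of 256")
```

For a well-behaved hash function, roughly half of the 256 digest bits are expected to flip even though only a handful of input bits changed.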

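Another key property of hash functions is signature uniqueness, also known as collision resistance: it must be computationally infeasible to find two distinct inputs that produce the same signature. Since the output has a fixed size, collisions necessarily exist, but a secure hash function must behave, for all practical purposes, as if: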

if hash(a) = hash(b) then a = b

If this condition is not satisfied, that is, if colliding inputs can actually be found, the hash function is no longer considered secure, given that it no longer ensures signature uniqueness for distinct inputs. Collisions have been discovered for multiple hashing algorithms, for instance MD4, SHA-0, MD5, and SHA-1, which have all been deprecated and should by no means be used in a cryptographic protocol. For example, the two byte streams below generate the same MD5 signature: 79054025255fb1a26e4bc422aef54eb4.

The first byte stream:

	      d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89
	      55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b
	      d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0
	      e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70

The second byte stream:

	      d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89
	      55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b
	      d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0
	      e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70

In the case of MD5, not only were colliding byte streams found, but colliding files can be generated at will. This means that an attacker can swap a valid file with a corrupt one that has the same signature. For example, this could be used to swap signed certificates, or packages available for download.
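
The collision listed above can be checked with a few lines of Python using the standard `hashlib` module. The variable names are illustrative; the hexadecimal data is copied from the two byte streams given earlier. Note that some hardened Python builds disable MD5, in which case `hashlib.md5` may refuse to run.

```python
import hashlib

# The two colliding 128-byte streams, copied from the listing above.
first = bytes.fromhex("""
    d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89
    55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b
    d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0
    e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70
""")
second = bytes.fromhex("""
    d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89
    55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b
    d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0
    e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70
""")

md5_first = hashlib.md5(first).hexdigest()
md5_second = hashlib.md5(second).hexdigest()

print("inputs are identical:", first == second)
print("md5(first)  =", md5_first)
print("md5(second) =", md5_second)
print("signatures collide:", md5_first == md5_second)
```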

Keyed hashing (MAC/HMAC)

Keyed hashing is a variant of hashing where a secret is added to the input byte stream (message), along with other parameters, in order to allow for authentication. The generated signature is therefore dependent on a secret and can only be reproduced by the entities knowing the secret. Keyed hashing is used for example to allow parties to authenticate and verify encrypted messages.

Suppose A wants to transfer a message M to B. Suppose A and B have already decided on an encryption key K.
Let E = encrypt(M, K) be the message M encrypted using the shared secret K. If A only transfers E, B has no means for authenticating the exchange or verifying whether the received data is valid and hasn't been corrupted by an attacker.
Now, suppose A decides to transfer E and a hash H = hash(E) in order to allow the receiver B to verify the transferred data. An attacker can still modify E and recompute H using the altered ciphertext. Therefore, simply attaching a signature/hash/digest does not provide strong authentication or verification.
One solution is to use keyed hashing. In this case, A will compute H = HMAC(E, K) and transfer it to B alongside E. Now, the signature is dependent on the shared secret, making it almost impossible to alter the data without detection. Upon receiving the data from A, B will then compute the keyed hash H' = HMAC(E, K) of the received encrypted data and compare it to the received hash, accepting the message only if H' = H.
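
The exchange above can be sketched with Python's standard `hmac` module. This is a minimal illustration under stated assumptions: the function names `make_tag` and `verify_tag` are hypothetical, the ciphertext is a placeholder byte string standing in for E, and HMAC-SHA256 is an arbitrary choice of keyed hash. The text uses the shared secret K for the HMAC; many real protocols instead derive a separate authentication key.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, ciphertext: bytes) -> bytes:
    """Sender side (A): compute H = HMAC(E, K) using HMAC-SHA256."""
    return hmac.new(key, ciphertext, hashlib.sha256).digest()


def verify_tag(key: bytes, ciphertext: bytes, received_tag: bytes) -> bool:
    """Receiver side (B): recompute H' and compare it to H in constant time."""
    expected = make_tag(key, ciphertext)
    return hmac.compare_digest(expected, received_tag)


shared_key = secrets.token_bytes(32)           # K, known only to A and B
ciphertext = b"placeholder for E = encrypt(M, K)"

tag = make_tag(shared_key, ciphertext)         # A sends (E, H)
print("untampered message accepted:", verify_tag(shared_key, ciphertext, tag))

tampered = ciphertext.replace(b"E", b"X")      # attacker alters E in transit
print("tampered message accepted:  ", verify_tag(shared_key, tampered, tag))
```

`hmac.compare_digest` is used for the comparison instead of `==` so that the time taken does not depend on how many leading bytes match, which would otherwise leak information to an attacker.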

Symmetric ciphers

Symmetric ciphers are a family of ciphers that use the same secret key to encrypt and decrypt a byte stream. The main issue with symmetric ciphers is the key exchange which requires both parties to share the same key prior to any other exchange. This issue is generally resolved by using public key ciphers which will be discussed in a separate section.

Block ciphers

Block ciphers are a category of symmetric ciphers that operate on a byte stream by blocks of a fixed size. This means that if the length of the byte stream is not a multiple of the block size, padding must be applied. There are many block ciphers in existence, each implementing different data scrambling schemes and mitigations against known attacks. For example, AES and Serpent use substitution/lookup tables (S-Boxes), which can be targeted by timing attacks if improperly implemented. On the other hand, the designers of Threefish avoided S-Boxes in order to make the algorithm more resistant against such attacks. Below is a table showing some of the most common block ciphers and their parameters.

Algorithm	Block sizes in bits	Key sizes in bits	Number of rounds	Structure
Rijndael (AES)	128	128, 192 or 256	10, 12 or 14 (depending on key size)	Substitution-permutation network
Serpent	128	128, 192 or 256	32	Substitution-permutation network
Threefish	256, 512 or 1024	256, 512 or 1024	72 or 80 (for 1024-bit blocks)	Mix-and-permute
Twofish	128	128, 192 or 256	16	Feistel network
Blowfish	64	32 to 448	16	Feistel network
Kuznyechik	128	256	10	Substitution-permutation network
Camellia	128	128, 192 or 256	18 or 24	Feistel network

As you can observe, these algorithms make use of different scrambling structures (Feistel networks, ...) and can handle different key sizes and, in the case of Threefish, different block sizes as well. None of these algorithms has been completely broken yet, and all are still considered secure under certain conditions. For example, no effective or substantial attacks have been demonstrated on Blowfish, but it is no longer recommended for data larger than 4GiB given that its block size of 64 bits is considered too small and might be vulnerable to birthday attacks. Below is a figure showcasing the difference between a Feistel network and a Substitution-permutation network.

Source: researchgate

In a Feistel network, a block is split into two parts: a left part L₀, and a right part R₀. The right part is then fed into a round function F_i whose output is XORed with the left block L₀. The left and right blocks are then swapped before being used in the same manner with a new round function for the next iteration/round. On the other hand, in an SPN (Substitution Permutation Network), a block from the plaintext is XORed with the round key before going through an S-Box (Substitution box) and a P-Box (Permutation box), whose output is then used as input for the next round until the final round generates the final encrypted form of the block. Generally, the key is never used directly but rather goes through a key expansion step that generates a set of unique subkeys used for each round (K_i in the SPN diagram above).
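
The Feistel structure described above can be sketched in a few lines of Python. This is a toy illustration, not a real cipher: the round function (a truncated SHA2-256 of the subkey and the right half) and the hard-coded subkeys are assumptions made for the example, and all function names are hypothetical. The point it shows is that decryption reuses the same round function with the subkeys in reverse order, so the round function itself never needs to be invertible.

```python
import hashlib


def round_function(half: bytes, subkey: bytes) -> bytes:
    """Toy round function F_i (an assumption for illustration only)."""
    return hashlib.sha256(subkey + half).digest()[:len(half)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(block: bytes, subkeys: list) -> bytes:
    """Apply one Feistel round per subkey to an even-length block."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for subkey in subkeys:
        # XOR F(R, K_i) into the left half, then swap the two halves.
        left, right = right, xor_bytes(left, round_function(right, subkey))
    return left + right


def feistel_decrypt(block: bytes, subkeys: list) -> bytes:
    """Undo the rounds by walking the subkeys in reverse order."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for subkey in reversed(subkeys):
        left, right = xor_bytes(right, round_function(left, subkey)), left
    return left + right


subkeys = [b"k1", b"k2", b"k3", b"k4"]    # stand-in for a real key schedule
plaintext = b"16-byte message!"
ciphertext = feistel_encrypt(plaintext, subkeys)
print("ciphertext:", ciphertext.hex())
print("decrypted: ", feistel_decrypt(ciphertext, subkeys))
```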

Stream ciphers

Stream ciphers are another category of symmetric ciphers that operate directly on the bytes of a byte stream rather than on blocks of bytes. Therefore, these ciphers do not require padding and are mainly used in cases where the clear message's length cannot be known in advance. Stream ciphers are generally faster and much easier to implement in hardware and are widely used to secure communications between small devices (micro-controllers, IoT, ...). For example, ChaCha20 is a stream cipher that provides a security level comparable to (some may argue higher than) that of AES. It was designed by Daniel J. Bernstein and uses a 256-bit key and a 96-bit (IETF version) or 64-bit (original version) nonce. It has been adopted, in combination with Poly1305 authentication, in TLS and DTLS, and was popularized by Google and OpenSSH. Below is a diagram showcasing the inner workings of ChaCha20+Poly1305. This scheme allows for the encryption and authentication of a given plaintext and also of associated data (AEAD - Authenticated Encryption with Associated Data).

Source: wikipedia
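
The general principle of a stream cipher, XORing the message with a keystream derived from a key and a nonce, can be sketched as follows. This is a toy construction and explicitly not ChaCha20: the keystream generator (SHA2-256 applied to the key, the nonce, and a block counter) and the function names are assumptions chosen only to keep the example within the standard library. It illustrates that no padding is needed and that encryption and decryption are the same operation.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: concatenated SHA2-256(key || nonce || counter) blocks."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        block_input = key + nonce + counter.to_bytes(8, "big")
        stream.extend(hashlib.sha256(block_input).digest())
        counter += 1
    return bytes(stream[:length])


def xor_with_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR each byte of data with the keystream."""
    stream = keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(32)      # 256-bit key
nonce = secrets.token_bytes(12)    # 96-bit nonce, never reused with the same key

message = b"a message of arbitrary length"
ciphertext = xor_with_keystream(key, nonce, message)
print("ciphertext length equals message length:", len(ciphertext) == len(message))
print("decrypted:", xor_with_keystream(key, nonce, ciphertext))
```

Reusing a nonce with the same key would produce the same keystream twice, and XORing the two ciphertexts would then cancel it out and reveal the XOR of the two plaintexts.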

Block cipher modes

ECB - Electronic Code Book
CBC - Cipher Block Chaining
CFB - Cipher Feedback
OFB - Output Feedback
CTR - Counter
GCM - Galois/Counter Mode
XEX - Xor-Encrypt-Xor
XTS - XEX with Tweak and ciphertext Stealing

Asymmetric ciphers

Asymmetric ciphers are another category of cryptographic algorithms. They are generally based on algorithms requiring a key pair to encrypt and decrypt messages. One key is kept secret (the private key) and the other key is shared publicly (the public key) to allow for outsiders to securely communicate with the key holder.
For example, suppose a user A with the key pair (Pub_A, Priv_A) wants to communicate a message M to a user B with the key pair (Pub_B, Priv_B). User A will first download user B's public key and use it to encrypt M: E = Encrypt(M, Pub_B), and then proceed by transferring E. On reception, B will use its private key to cancel out the encryption and recover the clear message: M = Decrypt(E, Priv_B).
A more secure scheme, which also allows the sender to be authenticated, is for A to encrypt the message twice, using its own private key first and B's public key after: E = Encrypt(Encrypt(M, Priv_A), Pub_B). This way, only B can remove the outer encryption layer using its private key Priv_B, and then remove the inner layer using A's public key: M = Decrypt(Decrypt(E, Priv_B), Pub_A). The last step allows B to authenticate A, given that only A holds the private key whose effect can be cancelled using the public key Pub_A.
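
Both schemes above can be illustrated with textbook RSA on tiny numbers. This is a sketch for learning purposes only: the primes, the exponent e = 17, and the helper names `make_keypair` and `apply_key` are assumptions made for the example, the keys are far too small to be secure, and real systems add padding and use dedicated signature schemes. The modular inverse `pow(e, -1, phi)` requires Python 3.8 or later.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair (public, private) from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)             # private exponent: inverse of e modulo phi
    return (e, n), (d, n)


def apply_key(value: int, key) -> int:
    """Textbook RSA: raise value to the key's exponent modulo n."""
    exponent, n = key
    return pow(value, exponent, n)


pub_a, priv_a = make_keypair(61, 53)    # user A, n = 3233
pub_b, priv_b = make_keypair(89, 97)    # user B, n = 8633 (larger than A's n)

message = 65                            # M, encoded as an integer smaller than 3233

# Simple scheme: E = Encrypt(M, Pub_B), M = Decrypt(E, Priv_B)
encrypted = apply_key(message, pub_b)
print("simple scheme recovers M:", apply_key(encrypted, priv_b) == message)

# Two-layer scheme: E = Encrypt(Encrypt(M, Priv_A), Pub_B)
layered = apply_key(apply_key(message, priv_a), pub_b)
recovered = apply_key(apply_key(layered, priv_b), pub_a)
print("two-layer scheme recovers M:", recovered == message)
```

In this toy setting the two-layer scheme only works because A's modulus is smaller than B's, so the inner result always fits inside the outer layer.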

Asymmetric ciphers are generally very slow and computationally demanding and are mainly used to perform key exchange and signing. The most

Signatures

Signatures are a cryptographic way to validate data or authenticate entities during an exchange. The signing process generally involves hashing coupled with an asymmetric key scheme.
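
A hash-then-sign process can be sketched as follows, using SHA2-256 and textbook RSA. Everything here is an assumption made for illustration: the key is built from two known Mersenne primes (2^61 - 1 and 2^89 - 1), which is far too small and too structured for real use, the digest is truncated to 128 bits so that it fits below the modulus, and the function names are hypothetical. Real signature schemes add padding or use different constructions altogether.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes (insecure, illustration only).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and keep 128 bits so the value is smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest()[:16], "big")


def sign(message: bytes) -> int:
    """Signer: apply the private key to the digest of the message."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verifier: undo the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


document = b"package-1.0.tar.gz contents"
signature = sign(document)
print("valid signature accepted: ", verify(document, signature))
print("altered document accepted:", verify(b"package-1.0.tar.gz c0ntents", signature))
```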

Standards, best practices, and recommendations

Hashing
Encrypting
Signing

Random numbers
Passwords

Steganography
Attacks and cryptanalysis

Algorithmic cryptanalysis

Brute-force attack
Birthday attack
Related-key attack
Mod-n cryptanalysis
Differential cryptanalysis
Integral cryptanalysis
Linear cryptanalysis
Slide attack
XSL

Implementation attacks

Side-channel attacks
Man-In-The-Middle attacks
Power analysis
Acoustic cryptanalysis

Mitigations

Weakened standards!
Post Quantum Cryptography (PQC)

Related subjects

Computer architecture
Electronics and embedded systems
Cryptography, cryptanalysis, and steganography
Low level programming: amd64, aarch64, riscv, ...
Cybersecurity: pentesting, 0-day hunting, malware analysis, ...
High Performance Computing: parallelization, performance profiling and optimization, ...
