# A private blockchain for files: part 2

Posted on 11 April, 2021

Technical details of the blockchain file system.

This series is split into two parts. Part 1 describes the high levels details of the program I made and what it can do. Part 2 goes into the technical details of the C# program.

Please see my GitHub repository for the full code. This post only gives an overview of the classes.

# Design

The Entity Relationship Diagram for the program is very basic:

As one would expect, there is a single blockchain which can own multiple blocks, and those blocks can own multiple tokens. There are a few extra helper classes: Hasher, Utilities, MerkleTree and PseudoToken. The actual program uses more classes, including several frontend classes and a few classes for loading from JSON. I’m not going to describe those here.

I wrote the code from the bottom up: first PseudoTokens, then Tokens, then Blocks and finally Blockchains. However I am going to described it from the top down, as I think that is better for understanding.

# Classes

## Utilities and Hasher

The Utilities class has two functions for converting to and from Unix dates to C# dates.
It has a Bytes_add_1 function which is used in the proof of work algorithm to increment an array of bytes.

The Hasher class abstracts the SHA-256 algorithm and handles the conversion between bytes, hexadecimal numbers and strings. It itself uses the inbuilt C# System.Security.Cryptography class to abstract the hashing.

A string such as “1a” can either be interpreted as a string or as a hexadecimal number. If it is interpreted as a string, its byte representation is found with ASCII values: {49, 97}. If it is interpreted as a hexadecimal number, it needs to be converted to a decimal number first: $1a=1(16) + 10 = 26$. Hence the byte representation is {26}. It is important to not mix these two situations because they result in different hashes.

## Blockchain

This is a stub of the Blockchain class:

The only fully public attributes are Name and BlockchainDirectory. They are not included in any hashes or data checks. The rest are set privately. MakeBlock() is a helper function to ensure that blocks increment their indices properly and carry over previous hashes. Otherwise blocks can be created indepedently of the blockchain with their own constructors.

CommitBlock() does multiple checks before committing a block:

• Indices must be sequential.
• The previous block hash must match the value stored in the new block.
• The previous block creation time must be before the new block’s creation time.
• The new block’s creation time must be before the system time - it cannot be in the future.
• The blockchain must be verified

After these checks it will call the ProofOfWork() function. Finally it will append the block to the list of blocks.

Verify() works by calling each block’s Verify() function.

## Block

This is a stub of the Block class:

Again the attribute for the directory is public. Proof of work is not essential in this program, so Target and Nonce are both public variables. Otherwise, all other attributes are set privately.

StageToken() adds a token to the Dictionary Tokens. It converts a PseudoToken, which is a struct with information about the Token, to a fully fledged Token which is linked to a valid file path.

The Block hash is calculated by hashing the Block header. This is an 80 byte array which is created by concatenating the following attributes together:

bytes
version 4
previousHash 32
MerkleRoot 32
TimeStamp 4
Target 4
Nonce 4

The MerkleRoot is a single hash which represents all of the Token hashes. Here is a graphical representation of the Merkle Tree:

Each of the Token hashes from L1 to Ln are paired together and hashed, then each of the resultant hashes are paired and hashed, and so on until only one hash remains. The class MerkleTree implements this hashing tree. It does not store the actual tree; it just returns the final hash.

Verify() does checks for the Block as well as calling each Token’s Verify() function.

## Token

This is a stub of the Token class:

The token is linked to a file, but I’ve chosen not to store the file data in it. It only stores the file name. The file data is loaded each time it is needed. This requires knowning the directory where the file resides but I’ve chosen not to store it. It therefore needs to be passed to the Print() and Verify() functions.

If a file is not found during printing, a warning is printed. But if a file is not found during verification, an error is thrown.

The Author and UserName attributes are manual inputs. The TimeStamp is the creation time of the token. I’ve thought about extracting the metadata from the file as well e.g. file creation date and file author. However I don’t think it will add much value.

Serialise() is the equivalent of a block’s GetHeader(). It creates a byte array which is then hashed. This byte array has a variable length and is formed as follows:

bytes
FileName variable
Author variable
FileHash 32
TimeStamp 4

The plain text information is converted to bytes according to ASCII codes, and the FileHash is converted from hexadecimal. As I mentioned before, it’s important to not this mix these two conversions.

## PseudoToken

A PseudoToken is a struct which holds all the data for creating a Token. This is the whole struct:

# Conclusion

Those are all the main classes in my file. I am also proud of my command line interface, but that is enough for its own post.