Share files of any size using reconstructable manifests
Chopper is a CLI application built to provide a simple way to exchange files of any size over the internet.
It relies on generic free storage providers, each of which offers only a small amount of space: Chopper distributes the parts of a file across those storages and builds a manifest that keeps track of all the pieces.
The resulting manifest file can then be exchanged and used to let Chopper reconstruct the original file.
How to use
Usage is simple. If you want to chop a file (upload it and generate an exchangeable file):
If you want to increase chop redundancy:
chopper -r 3 filename.mp4
If you want to rebuild a file starting from its manifest:
chopper build filename.chop
Chopper is written in Python and has been designed to be free of any external plugin, so you won't need anything but its source code to run it.
Installation from repositories is only available to Solus-Project users who have enabled the Theca repository:
Run latest source code
git clone https://github.com/streambinder/chopper.git
make install || python3 setup.py install
Chopper operates in two different modes: chopping and glueing. Since the latter is straightforward and most of the engineering lies in the former, glueing is not covered here.
The first part of the process consists of chopping the file into several chunks, whose size depends on which storage provider is going to be used.
Chopper uses the Knife component to request one chunk after another: each chunk is a byte sequence. For binary files, the chunk is translated to plain text using the base85 algorithm. The one drawback is a bigger overall size: for every 1024 bytes of binary data requested, Knife provides 1280 bytes of base85-encoded text.
In general, for binary files, every X bytes of file data requested for a chunk produce an effective payload of (X / 4) * 5 bytes, that is 25% bigger.
That's because the base85 algorithm needs an additional byte for every 4 bytes it encodes.
This means that chopping a 10MB file will upload at least 12.5MB of data.
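The overhead can be checked with a minimal sketch. It assumes the base85 variant exposed by Python's standard `base64.b85encode`, which may differ from the one Knife actually uses; `encoded_size` is a hypothetical helper written for this example, not part of Chopper.

```python
import base64


def encoded_size(raw_size: int) -> int:
    """Length of the base85 text for raw_size bytes (hypothetical helper).

    Every group of 4 bytes, including a trailing partial group,
    costs one extra character.
    """
    return raw_size + -(-raw_size // 4)  # raw_size + ceil(raw_size / 4)


chunk = bytes(range(256)) * 4          # 1024 bytes of binary data
encoded = base64.b85encode(chunk)      # plain ASCII text, safe to upload

print(len(chunk), len(encoded))
print(encoded_size(10_000_000))

# Decoding gives back the original bytes.
assert base64.b85decode(encoded) == chunk
```

The same arithmetic gives the 12.5MB figure for a 10MB file.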
Uploading and downloading chunks
The chunk upload phase depends on the provider the chunk is being pushed to.
Every storage provider extends a generic (and abstract) Provider class, which requires a number of methods to be defined: most importantly the upload and download() methods, but also several others that let the provider be handled properly by the whole process, such as the following:
enabled(): used to indicate whether the provider is usable or not
nice_name(): used to represent the provider in a human-readable way
is_supporting(): used to ask a provider whether it is able to handle a URI to download content (chunks) from it
max_chunk_size(): used to indicate the maximum size, in bytes, of the sequence allowed by the provider
throttle(): used to throttle requests to the provider when a ThrottlingException gets caught
The upload entry point receives a payload as argument, which is assumed to be no larger than the maximum size allowed by that provider, and packs it into a request performed the way the provider class defines.
Every upload call must return a URI, unless an exception is thrown, that can be passed to the download() method to get the content back.
The download method accepts a URI string as argument and must always return a byte sequence.
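The contract described above could be sketched as follows. This is an illustration only: the method names come from the list above, but every signature, the `upload()` spelling, and the toy `MemoryProvider` (with its `memory://` URI scheme) are assumptions made for this example and are not taken from Chopper's source.

```python
from abc import ABC, abstractmethod


class ThrottlingException(Exception):
    """Raised when a provider asks the client to slow down."""


class Provider(ABC):
    """Abstract base every storage provider extends (assumed signatures)."""

    @abstractmethod
    def enabled(self) -> bool: ...

    @abstractmethod
    def nice_name(self) -> str: ...

    @abstractmethod
    def is_supporting(self, uri: str) -> bool: ...

    @abstractmethod
    def max_chunk_size(self) -> int: ...

    @abstractmethod
    def throttle(self) -> None: ...

    @abstractmethod
    def upload(self, payload: bytes) -> str: ...

    @abstractmethod
    def download(self, uri: str) -> bytes: ...


class MemoryProvider(Provider):
    """Hypothetical provider that keeps chunks in a dict, for demonstration."""

    SCHEME = "memory://"

    def __init__(self) -> None:
        self._chunks = {}

    def enabled(self) -> bool:
        return True

    def nice_name(self) -> str:
        return "In-memory (demo)"

    def is_supporting(self, uri: str) -> bool:
        return uri.startswith(self.SCHEME)

    def max_chunk_size(self) -> int:
        return 1024

    def throttle(self) -> None:
        pass  # nothing to wait for in memory

    def upload(self, payload: bytes) -> str:
        if len(payload) > self.max_chunk_size():
            raise ValueError("payload exceeds max_chunk_size()")
        uri = f"{self.SCHEME}{len(self._chunks)}"
        self._chunks[uri] = payload
        return uri  # the URI is all that download() needs later

    def download(self, uri: str) -> bytes:
        return self._chunks[uri]


provider = MemoryProvider()
uri = provider.upload(b"some chunk")
print(uri, provider.download(uri))
```

Because `Provider` is abstract, a subclass that forgets one of the methods cannot be instantiated, which is one way to enforce the contract.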
Supporting new providers
The whole thing has been designed to be as extensible as possible: any provider can use its very own logic inside every method it must implement or override.
A manifest is base64-encoded content representing a JSON document structured this way:
To provide more redundancy for the data, Chopper can upload every chunk to several storage providers: this increases the amount of data being pushed, but also the probability that the file will be kept safe.
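The idea can be sketched as below. This is not Chopper's actual logic: `upload_redundant`, the plain callables standing in for providers, and the policy of skipping a failing provider are all assumptions made for illustration.

```python
from typing import Callable, List

# Stand-in for a provider's upload method: takes a payload, returns a URI.
Uploader = Callable[[bytes], str]


def upload_redundant(chunk: bytes, uploaders: List[Uploader], redundancy: int) -> List[str]:
    """Push the same chunk to `redundancy` distinct providers (hypothetical)."""
    uris: List[str] = []
    for upload in uploaders:
        if len(uris) == redundancy:
            break
        try:
            uris.append(upload(chunk))
        except Exception:
            continue  # assumed policy: skip a failing provider, try the next
    if len(uris) < redundancy:
        raise RuntimeError("not enough working providers for requested redundancy")
    return uris


def failing(payload: bytes) -> str:
    raise ConnectionError("provider unreachable")


# Fake uploaders that only build a URI, so the sketch runs offline.
uploaders = [
    lambda payload: "a://chunk",
    failing,
    lambda payload: "b://chunk",
    lambda payload: "c://chunk",
]

uris = upload_redundant(b"data", uploaders, redundancy=3)
print(uris)
```

With a redundancy of 3, every chunk is pushed three times, so the uploaded volume triples while any single surviving copy is enough to rebuild that chunk.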