I need remote backup software.
This software must have a client that runs on Windows and allows users to back up their data to a server that runs Linux.
So it must be made of a client (Windows) and a server (Linux).
The Windows client lets the user choose what to back up; then he just clicks Backup and the data gets copied to the server.
The user can connect to the server with his client and restore files to his computer.
The user must be able to encrypt the data he sends to the server. The data must be encrypted BEFORE uploading, so that the person who controls the server has no way to access it.
The client must work with "chunks". This means it should not just check whether a file has been modified and, if so, upload the whole file; it should check the differences between the remote file and the local file and upload only those differences. My idea was to split the local file into "chunks" (like 4k), checksum each one, do the same on the server, and upload a chunk only if its checksum differs; then, when the file has been copied, do an MD5 on the whole file to ensure it was copied properly. I'm not a programmer, so there may be better ideas. This part is by far the trickiest part of this project, and the one I care most about, as I want software that lets users do a backup every day without it taking hours, just by uploading the changes.
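The fixed-chunk comparison described above could be sketched roughly like this (an illustration only, not part of the bid; the function names are mine, and SHA-256 stands in for whatever chunk checksum is finally chosen):

```python
import hashlib

CHUNK_SIZE = 4096  # the 4k chunk size suggested above; tunable


def chunk_checksums(data: bytes) -> list[str]:
    """Checksum every fixed-size chunk of a file's contents."""
    return [hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
            for i in range(0, len(data), CHUNK_SIZE)]


def changed_chunks(local: bytes, remote_sums: list[str]) -> list[int]:
    """Indices of local chunks whose checksum differs from the server's
    copy (or which the server does not have at all)."""
    local_sums = chunk_checksums(local)
    return [i for i, s in enumerate(local_sums)
            if i >= len(remote_sums) or s != remote_sums[i]]


def whole_file_digest(data: bytes) -> str:
    """MD5 over the whole file, used after upload to verify the copy."""
    return hashlib.md5(data).hexdigest()
```

In a real client only `remote_sums` would cross the network, never the server's data; the client then uploads just the chunk indices returned by `changed_chunks`.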
Please answer with your ideas on how to solve this problem
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
* * *This broadcast message was sent to all bidders on Friday Jan 7, 2005 6:22:10 PM:
I'll answer in public to all the questions about online backup software:
- If the user changes his password, this should not affect the files.
=> My solution (if you find a better one, you are welcome):
The server will generate a key that will not be saved in the clear on the server's hard disk. The key will be sent to the client when it requests it, but NEVER saved on the client's hard disk. The first time, the user will be asked to choose a password, which will be transmitted over a secure stream to the server; the server will then encrypt the key with it and save it to disk. The key will be stored on the server's disk only in encrypted form. It will stay decrypted only in the RAM of both server and client, when needed. If the user wants to change his password, the key on the server is decrypted, then re-encrypted with the new password.
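The key-wrapping scheme above could be demonstrated like this (a minimal sketch, not the final design: PBKDF2 derives a key-encryption key from the password, and a plain XOR stands in for the real cipher -- production code should use an authenticated cipher such as AES-GCM):

```python
import hashlib
import os


def _kek(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key-encryption key from the user's password.
    NOTE: XOR-wrapping below is a stand-in for a real authenticated
    cipher; it only illustrates the scheme."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt,
                               200_000, dklen=32)


def wrap_key(master_key: bytes, password: str) -> tuple[bytes, bytes]:
    """Encrypt the 32-byte master key under the password. This wrapped
    blob (plus salt) is all the server ever stores on disk."""
    salt = os.urandom(16)
    kek = _kek(password, salt)
    return salt, bytes(a ^ b for a, b in zip(master_key, kek))


def unwrap_key(salt: bytes, wrapped: bytes, password: str) -> bytes:
    """Recover the master key in RAM; never written to disk."""
    kek = _kek(password, salt)
    return bytes(a ^ b for a, b in zip(wrapped, kek))


def change_password(salt, wrapped, old_pw, new_pw):
    """A password change re-wraps only the key; the backed-up files,
    encrypted under the master key itself, are untouched."""
    return wrap_key(unwrap_key(salt, wrapped, old_pw), new_pw)
```

Because the files are encrypted with the master key and the password only encrypts that key, changing the password costs one small re-encryption instead of a full re-upload.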
- History must be kept
- The number of users must be unlimited. To make the system perfectly scalable, the client will be able to choose which backup server to use. So when one server gets too overloaded, I can simply install all the server software on another server and give the new address to the other users.
- The server must allow complex administration, like user quotas, megabytes transferred, etc.
- The client must be able to recover individual files, entire directories, or the full backup. It must also be possible to recover old file revisions (history).
- The fact that I mention $10,000 as the maximum bid amount does not mean I plan to spend that kind of money. Please bid a fair amount.
- The best idea I have heard so far (IMHO) for incremental backup is making a given number of checksum chunks, where the size of a chunk depends on the size of the file, then going lower and lower, splitting each chunk into smaller chunks. The user must be able to configure:
- Chunk % (chunk size as a % of file size): so if I set 50%, a 1 MB file will have 2 chunks of 500k
- Minimum chunk size: for example 32k
With that configuration, the file will be cut into 2 chunks and a comparison made; if a chunk is the same it is kept, and if it is different it is divided again in 2, then again in 2, etc., until we reach 32k chunks. When we reach 32k chunks and they still differ, we just copy them from client to server.
A file may change its size, or a chunk may be moved, but some patterns may still be present. So what the software should roughly be able to do is take the chunks on the server, take a small part of each (like 50 bytes), and look in the local file to see if it can find those 50 bytes. If it finds them, it must try several sizes in multiples of the minimum chunk size. So, for example, you take 50 bytes from the file on the server and search for them on the client; if you find them, you take a checksum of the 32k that follow those 50 bytes and check whether it matches in the file on the client. If it does, you grow by the split ratio (so 2): you check whether you can match 64k, 128k, 256k, etc., until it doesn't match. When it stops matching, you go back down, dividing the last addition by the split ratio (2) until you reach the minimum chunk size, then you copy the whole chunk.
So, for example, on a file that is 10 MB, you manage to match the 50 bytes:
you try to match the following 32k: it matches
you try 64k: it matches
you try 128k: it matches
you try 256k: it matches
you try 512k: it matches
you try 1024k: it doesn't match
So we know the correct size is between 512 and 1024.
We divide the last increment by 2 (512 -> 256) and try to match 512 + 256 = 768: it doesn't match.
We divide the increment by 2 again (128) and try 512 + 128 = 640: it matches.
We divide by 2 again (64) and try 640 + 64 = 704: it doesn't match.
We divide by 2 once more and the increment is 32, the minimum chunk size; we try 640 + 32 = 672: it doesn't match.
The increment has reached the minimum chunk size, so we stop.
So we take a chunk size of: 640
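The grow-then-shrink search in the example above fits in a tiny routine. This is only an illustration: `matches(n)` is a hypothetical predicate standing for "the n kilobytes after the 50-byte anchor have the same checksum on client and server". With a predicate that matches up to 640k it reproduces the example's result:

```python
MIN_CHUNK = 32  # in kilobytes, as in the example above


def longest_match(matches) -> int:
    """Grow by doubling from the minimum chunk size while the checksum
    still matches, then binary-search back down, halving the increment
    until it reaches the minimum chunk size."""
    if not matches(MIN_CHUNK):
        return 0
    size = MIN_CHUNK
    while matches(size * 2):       # 32, 64, 128, 256, 512, ...
        size *= 2
    step = size // 2               # e.g. 1024 failed -> first step is 256
    while step >= MIN_CHUNK:
        if matches(size + step):   # 768? no. 640? yes. 704? no. 672? no.
            size += step
        step //= 2
    return size
```

For instance, `longest_match(lambda n: n <= 640)` walks exactly the sequence in the example (32, 64, 128, 256, 512, 1024, 768, 640, 704, 672) and returns 640.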
This should be done several times, at different places in the file.
I'm not a programmer nor a mathematician, and I hope someone will come up with a better idea. My ultimate goal is to transfer as little data as possible from client to server.
- If you are able to do it, I would also like a small monitor that stays in the tray bar, checks which files have changed and which changes were made, and puts those changes aside for uploading to the server later. This would be the ultimate compression solution. If this tray monitor could do the backup in real time, that would be even better.
- Compression before upload/download would be great.
- Files on the server must be stored in a proprietary format. A MySQL database is available, but NOT for storing files; only file information (names, sizes, anything you want, but not the binary data).
- NO VISUAL BASIC