Guest
After losing a bit of code in a power cut, I decided I wanted a
version control system that is second to none. SourceSafe doesn't fit my
needs (at home, anyway - it does fine at work) because I have to remember to
check in the files and have to keep all related projects in relative directory
structures, among various other reasons, such as not having a licensed copy
and being too poor to afford one.
The key criterion is that whenever I save a file, it is backed up, without
me having to do ANYTHING else other than make sure the service is running.
And if it's a new version of an existing file, it has to keep a record of
both versions.
I recently posted about using my own home-grown encryption, and after
over 100 replies I can now see why that's not a good idea.
But I think my own home-grown compression algorithm IS a good idea: I've
tested it, and after adding only 3 files my compression ratio was about 0.37.
What's more, unlike with encryption, there's nobody to be "up against"
working in the other direction.
I'm not going to use Huffman compression or RLE compression, because as
usually implemented they work on bytes. I'm only going to be storing my code
files - .cs files, .c/.cpp files, .h files, even .xml files - so I'm going to
use my own type of compression: word compression.
Basically, the FileSystemWatcher in the service watches files, and puts any
text file that gets created or changed into the database.
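The "notice anything created or changed" step can be sketched without the .NET FileSystemWatcher at all. The Python below is only an illustration and swaps in a different technique: it polls the directory tree and compares modification time and size against the previous scan. The function name `scan_changes` and the `seen` dictionary are made up for this example and are not part of the actual service.

```python
import os

def scan_changes(root, seen):
    """Return paths under root that are new or changed since the last scan.

    `seen` maps path -> (mtime_ns, size) and is updated in place, so the
    caller keeps one dictionary alive between scans.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            stamp = (st.st_mtime_ns, st.st_size)
            if seen.get(path) != stamp:
                seen[path] = stamp
                changed.append(path)
    return changed
```

Calling `scan_changes` on a timer gives roughly the same stream of created/changed files, at the cost of walking the tree each time rather than being notified by the operating system.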
A text file is defined as one that doesn't have any zero bytes in it,
so most operating system files will be ruled out straight away.
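As a minimal sketch of that rule (in Python purely for illustration; the name `looks_like_text` is hypothetical), the check is a single scan for a zero byte:

```python
def looks_like_text(data: bytes) -> bool:
    """Apply the 'no zero bytes' rule to the raw contents of a file."""
    return b"\x00" not in data


def file_looks_like_text(path: str) -> bool:
    """Read the whole file and apply the same rule."""
    with open(path, "rb") as f:
        return looks_like_text(f.read())
```

One consequence worth knowing: a source file saved as UTF-16 would count as binary under this rule, because ASCII characters are encoded with a zero byte there.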
The interesting bit is the definition of "put it in the database".
Basically, I use regular expressions to parse the file into words. Then, for
each word, if it's not already in the database, it goes into the word table;
if it's already in there, only a foreign key to the already-existing row is
stored.
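Here is a rough sketch of how the word table and the foreign keys could look. Everything in it is an assumption made for illustration: it uses SQLite through Python's `sqlite3` module, and the table names `word` and `version_word`, the `version_id` numbering and the helper functions are all invented for the example.

```python
import sqlite3

def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS word (
            id   INTEGER PRIMARY KEY,
            text TEXT NOT NULL UNIQUE      -- each distinct word stored once
        );
        CREATE TABLE IF NOT EXISTS version_word (
            version_id INTEGER NOT NULL,   -- one id per saved file version
            position   INTEGER NOT NULL,   -- order of the word in the file
            word_id    INTEGER NOT NULL REFERENCES word(id),
            PRIMARY KEY (version_id, position)
        );
    """)
    return db

def word_id(db, text):
    # Insert the word only if it is new, then look up its row id.
    db.execute("INSERT OR IGNORE INTO word (text) VALUES (?)", (text,))
    return db.execute("SELECT id FROM word WHERE text = ?", (text,)).fetchone()[0]

def store_version(db, version_id, tokens):
    rows = [(version_id, pos, word_id(db, tok)) for pos, tok in enumerate(tokens)]
    db.executemany(
        "INSERT INTO version_word (version_id, position, word_id) VALUES (?, ?, ?)",
        rows,
    )
    db.commit()

def load_version(db, version_id):
    rows = db.execute(
        "SELECT w.text FROM version_word v JOIN word w ON w.id = v.word_id "
        "WHERE v.version_id = ? ORDER BY v.position",
        (version_id,),
    )
    return [r[0] for r in rows]
```

Storing two versions that share most of their words only adds the new words to `word`; the rest of the cost is one small integer row per word occurrence, which is where any saving would have to come from.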
Now, you can imagine that words like "System" would be quite popular, and
hence wouldn't take up much space. Also, the text between words can just be
stored as another word, so "}\n\t" would probably be quite popular as well.
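To show what treating the text between words as just another word means, here is a small Python sketch. The pattern `\w+|\W+` is an assumption chosen for the example, not the regex actually used: it alternates runs of word characters with runs of everything else, so joining the tokens back together reproduces the file exactly.

```python
import re

# A run of word characters, or a run of anything else (spaces, punctuation,
# newlines). Every character matches one side, so nothing is dropped.
TOKEN = re.compile(r"\w+|\W+")

def tokenize(text):
    """Split text into alternating word and separator tokens."""
    return TOKEN.findall(text)

def detokenize(tokens):
    """Rebuild the original text from its tokens."""
    return "".join(tokens)
```

For example, `tokenize("using System;")` gives `['using', ' ', 'System', ';']`, and `tokenize("}\n\treturn")` gives `['}\n\t', 'return']`.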
To keep the db small enough to always be able to add new/changed files, I'd
use some filtering and purging techniques.
Basically, has anyone got any suggestions on this, or opinions on whether it
is likely to work, how useful it would be, and whether anyone would be
interested in using it if it ever came to production?
I don't think it will, but if anyone reckons it would sell, let me know,
'cos I might as well start writing it in unmanaged C right now. (The regex bit
already is.) If not, then I might just release the source code to all and
sundry anyway and see what they think.
Opinions / ideas?