Light database

  • Thread starter Alexander Muylaert

Stefano Lanzavecchia

Also it is thread-safe (but not multi-threaded, nor does it implement a
particularly granular locking strategy), so there is no need to put in
locking yourself.

In Perst there's a "PersistentResource" (check the demo called TestConcur)
which shows how access to the storage can be made thread-safe by adding
resource locks. As I was saying, this only works as long as the access is
done in the same process by different threads. If one needs the database to
be shared from multiple processes or even separate physical machines, then
one's out of luck since Perst opens the file in exclusive mode and won't
allow other processes to access the data. To work round this limitation one
would be forced to write a server (using .NET remoting, probably).
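To illustrate the in-process locking idea, here is a minimal sketch in Python rather than C#. It does not use the Perst API: `LockedStore` and its methods are hypothetical stand-ins, and the storage is a plain in-memory dictionary. It only shows the pattern of one coarse lock serialising every access from threads of the same process.

```python
import threading


class LockedStore:
    """Hypothetical in-memory stand-in for a single-process storage object."""

    def __init__(self):
        # One coarse lock for the whole store: thread-safe, but not granular.
        self._lock = threading.Lock()
        self._data = {}

    def increment(self, key):
        # The read-modify-write happens under a single lock acquisition,
        # so two threads can never interleave inside it.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def hammer(store, key, times):
    for _ in range(times):
        store.increment(key)


store = LockedStore()
threads = [
    threading.Thread(target=hammer, args=(store, "posts", 1000))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.get("posts"))  # 8000
```

A lock like this lives in the memory of one process, which is why the approach stops working as soon as a second process needs the same file.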

As far as speed is concerned, I found it quite good. As a hobby project I
am writing a web-based forum which uses Perst as its storage backend and I
can tell you it easily beats everything else I tried, also considering the
fact that I don't need an intermediate O/R layer since Perst is effectively
an object database.
 
Stefano Lanzavecchia

And I'd say it's faster than COM's structured storage technology... but of
course it won't be as fast as commercial databases, simply because (a) it's
totally managed code and (b) it uses reflection to deserialize rows
(instances), which itself is quite slow.

True. On the other hand: (a) it uses an intelligent caching mechanism; (b) it
does not use reflection directly to marshal the fields of the persisted
classes, but pre-generates classes to do so (have a look at the file
"codegenerator.cs"). The strategy is analogous to the one explained in the
first part of this article:
http://msdn.microsoft.com/msdnmag/issues/04/07/NETMatters/default.aspx
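The general idea of generating the marshalling code once, instead of looking fields up by name on every call, can be sketched roughly as follows. This is Python, not C#, and it is not taken from codegenerator.cs: `Post`, `pack_reflective` and `make_packer` are hypothetical names, and the sketch assumes a non-empty list of field names that are valid identifiers.

```python
class Post:
    def __init__(self, author, title, replies):
        self.author = author
        self.title = title
        self.replies = replies


def pack_reflective(obj, field_names):
    # "Reflection" style: every field is looked up by name on every call.
    return tuple(getattr(obj, name) for name in field_names)


_packer_cache = {}


def make_packer(cls, field_names):
    """Generate, once per class, a function with the field accesses spelled out."""
    if cls not in _packer_cache:
        body = ", ".join(f"obj.{name}" for name in field_names)
        source = f"def pack(obj):\n    return ({body},)\n"
        namespace = {}
        exec(source, namespace)  # compile the generated source a single time
        _packer_cache[cls] = namespace["pack"]
    return _packer_cache[cls]


FIELDS = ("author", "title", "replies")
post = Post("stefano", "Light database", 12)

pack_post = make_packer(Post, FIELDS)
print(pack_post(post))                                   # generated code path
print(pack_post(post) == pack_reflective(post, FIELDS))  # same result
```

The generation cost is paid once per class, which is also why it can become noticeable at startup when there are very many classes.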
 
John Wood

Yeah the code generation stuff is quite recent... I was the one who got him
to generate it in a background thread after startup, because it was taking
minutes to generate all the code for my 1000+ classes!

I have to say Perst is very very cool... he should really be out promoting
it!
 
John Wood

Interesting you're using it for a web-based back end. The biggest problem I
have with Perst is that it doesn't support concurrent index access... you
basically have to cache the result from the index somewhere with a mutex
protecting the operation, you can't traverse it bit-by-bit in parallel with
another thread. To me that hinders the scalability of it somewhat... how
many users have you had running the system in parallel?
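The workaround described here (copy the index result under a mutex, then traverse the copy) might look something like this sketch. It is Python, not C#, and `SnapshotIndex` is a hypothetical sorted index built on two lists, not a Perst class.

```python
import bisect
import threading


class SnapshotIndex:
    """Hypothetical sorted index guarded by a single mutex."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._keys = []
        self._values = []

    def insert(self, key, value):
        with self._mutex:
            pos = bisect.bisect_right(self._keys, key)
            self._keys.insert(pos, key)
            self._values.insert(pos, value)

    def range_snapshot(self, low, high):
        """Copy the matching entries while holding the mutex.

        The caller iterates over the returned list without any lock,
        so a slow traversal never blocks writers.
        """
        with self._mutex:
            start = bisect.bisect_left(self._keys, low)
            stop = bisect.bisect_right(self._keys, high)
            return list(zip(self._keys[start:stop], self._values[start:stop]))


index = SnapshotIndex()
for post_id in (5, 1, 3, 2, 4):
    index.insert(post_id, f"post-{post_id}")

hits = index.range_snapshot(2, 4)
index.insert(3, "late post")  # a later write does not disturb the copy

for key, value in hits:
    print(key, value)
```

The cost is the one the post points at: the whole result has to be materialised up front, so memory and lock hold time both grow with the size of the range.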
 
Klaus H. Probst

Never seen one, sorry. I was going to write one last year but life intruded
=)
 
Stefano Lanzavecchia

John Wood said:
Yeah the code generation stuff is quite recent... I was the one who got him
to generate it in a background thread after startup, because it was taking
minutes to generate all the code for my 1000+ classes!

I see :) Nice!

John Wood said:
I have to say Perst is very very cool... he should really be out promoting
it!

Well, maybe he does not want it to become too popular. After all Perst is
free and he is alone: a large user base would certainly mean a support
nightmare. Anyway, since he's not doing it, here we are doing a little bit of
advertisement for him ;) May Konstantin forgive us if he is, indeed, trying
to keep a low profile.
 
John Wood

Konstantin is a funny (and nice) guy... you'll ask for something, he'll tell
you that it's really difficult or impossible to implement, or that it's a
ridiculous idea, you'll send him a reply agreeing, and retracting your
request... then a few days later he'll release what you asked for, without
even telling you!

He frequently amazes me with the speed at which he develops things. God
knows where he finds the time, seeing as he has kids too!

I believe his next endeavour is to write (yet another) source control system
in Java / C#.
 
Fred Mellender

I don't know if you need the locking and transaction mechanisms of a true
database (you do if you have multiple users going at it).

If not, have you considered just using the SortedList or Hashtable class and
the serialization mechanism in C#? This assumes that the whole dataset will
fit into virtual memory. Given these restrictions, I have found that
serialization of an object containing an instance of SortedList solved a
similar problem I had.
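As a rough analogue of that approach, here is a sketch in Python rather than C#: a plain dictionary stands in for SortedList/Hashtable and `pickle` stands in for .NET serialization. The file name and the `save`/`load` helpers are made up for the example.

```python
import os
import pickle
import tempfile


def save(dataset, path):
    # Write to a temporary file first, then rename it over the target,
    # so a crash in the middle of a save does not leave a truncated file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        pickle.dump(dataset, handle, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_path, path)


def load(path):
    with open(path, "rb") as handle:
        return pickle.load(handle)


path = os.path.join(tempfile.mkdtemp(), "forum.pickle")

dataset = {"zeta": 3, "alpha": 1, "mid": 2}
save(dataset, path)

restored = load(path)
for key in sorted(restored):  # iterate in key order, as a SortedList would
    print(key, restored[key])
```

The whole dataset is read and written in one go, which is exactly the "fits into virtual memory" restriction, and there is no locking or transaction support at all.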
 
Stefano Lanzavecchia

Klaus H. Probst said:
Never seen one, sorry. I was going to write one last year but life intruded
=)

Life is always so troublesome ;)
Anyway, recently I was struck by the idea of using the managed extensions to
C++ to do exactly that. I had the idea when I saw a SQLite wrapper built
that way: just compile the C++ code with the /clr switch and write the
classes you need around it. The managed bits to be written in C++ could be
kept to a minimum by writing very simple wrappers and then
extending/deriving the classes in C#. What kept me from doing it is the
following list of reasons:
- I am a bit lazy;
- I know next to nothing about managed C++;
- I know next to nothing about Berkeley DB which means that my wrappers
might be less than optimal.

By the way, the guys responsible for the ADO.NET data provider for SQLite
have switched back to importing the C DLL, but that is because they plan a
Pocket PC release for which, I guess, there is no managed C++
(http://sourceforge.net/project/shownotes.php?release_id=248431).

I would certainly love to see a Berkeley DB managed wrapper!
 
Stefano Lanzavecchia

John Wood said:
Interesting you're using it for a web-based back end. The biggest problem I
have with Perst is that it doesn't support concurrent index access... you
basically have to cache the result from the index somewhere with a mutex
protecting the operation, you can't traverse it bit-by-bit in parallel with
another thread. To me that hinders the scalability of it somewhat... how
many users have you had running the system in parallel?

As I said, so far it's only a spare time project. I only tested the system
with one user and don't expect to have more than 50 very quiet users. As
long as I can limit the access time to the backend (the bit of logic
that MUST be serialised because Perst is non-concurrent) to between one
hundredth and one tenth of a second, I would still obtain sub-second response
time, which would be good enough for my application. I can see that the
application is inherently non-scalable but when I set off I had this set of
self-imposed requirements:
- only free 3rd-party libraries and tools (VS.NET excluded);
- managed code (I don't mind interop but I don't have time to play with it,
so I could accept a solution using SQLite as long as somebody else did the
wrapping for me);
- zero cost deployment: no servers; in other words, everything had to be
file based.

The second requirement ruled out most RDBMSs and OODBs. I was left pretty
much only with Access, SQLite and Perst. Since I did not want to write the
O/R layer and, with the project in a very fluid state, did not want to use an
O/R tool that required strict discipline, Access and SQLite were
also ruled out. I considered the fact that I couldn't really scale Perst
unless I wrote my own server, but since the expected load is very light I
told myself that I could easily revisit the backend problem later if the
user base grew too much. I tried the .NET bindings of Gigabase (another one
of Konstantin's creations) and found that I could get quite far, but, as I
said, I didn't want to have to install a separate server, even if that
simply meant copying an executable on the web server and firing it up. Plus,
the .NET bindings of Gigabase don't support schema changes, so every time I
added a property to one of my persisted classes I would have needed to
recreate my database.
Goods (yet another one of Konstantin's creations) also has a .NET binding, but
since it relies on transparent persistence built on ContextBound remoting,
it's too slow for my goals.

The way I see it, the perfect solution for my problem, one that would scale
quite well up to a thousand users, would probably be SQLite coupled with a
good O/R tool that would let me play around freely and would easily let me
change my schemas without having to reset the database every time I
whimsically felt like adding a new property to one of my persisted classes.
Alas, it's already quite hard to find a free O/R tool that leaves so much
freedom to the developer, and even harder to find one that supports SQLite.
An earlier version of my forum was based on Access with an O/R layer built
by the excellent DTM
(http://www.evaluant.com/en/solutions/dtm/default.aspx).
I guess DTM is too advanced to work with SQLite though I am still hoping.

Anyway, to make my long story short: at the moment the performance I can get
from Perst fits the requirements of my application perfectly, even if I
serialise every request to the backend (I am using the PersistentResource to
lock the critical sections). If one day I need more, I hope that by then
things will have improved for (file-based) OODBs in the .NET
world.
For instance, I am keeping an eye on
http://www.versant.net/eu_en/products/fastobjects_net/index which is far
from being free but could very well be worth it.
 
Stefano Lanzavecchia

John Wood said:
He frequently amazes me with the speed at which he develops things. God
knows where he finds the time, seeing as he has kids too!

Yup! Recently I pointed out a bug to him on a Saturday morning. He acknowledged
it and released a fixed version a few hours later (release 2.26 with just that
bugfix). I diffed v2.25 and 2.26 to find an interesting sequence of changes
in some quite fundamental parts of the core. Amazing...
 
Stefano Lanzavecchia

Or try SQLite (http://www.sqlite.org/). Wrappers are available, ADO.NET and
non-ADO.NET styles. It's very fast, and easy to use.

I hope those wrappers will be updated as soon as SQLite v3 is released
because BLOB support would certainly tickle my fancy...
 
Klaus H. Probst

Yeah, but that's going to be several orders of magnitude slower than using a
real DB like Berkeley.

BDB is *fast*. I have code that uses it from C++ and Python and it's scary
fast, even if you don't need the locking functionality, transactions and
everything else.

Besides, BDB is just a mondo key-value pair storage system. You can adapt it
to just about anything.
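To give a feel for adapting a bare key-value store to structured data, here is a small Python sketch. It does not use Berkeley DB: the standard library's `dbm.dumb` module stands in for it, and the `user:`/`post:` key prefixes and the JSON encoding are conventions invented for this example.

```python
import dbm.dumb
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "forum_kv")

# Keys and values are plain bytes; any structure is layered on top by the
# application (here: a "kind:id" key prefix and JSON-encoded values).
with dbm.dumb.open(path, "c") as db:
    db[b"user:42"] = json.dumps({"name": "klaus", "posts": 2}).encode("utf-8")
    db[b"post:1"] = json.dumps(
        {"author": 42, "title": "Light database"}
    ).encode("utf-8")

with dbm.dumb.open(path, "r") as db:
    user = json.loads(db[b"user:42"].decode("utf-8"))
    post_keys = sorted(k for k in db.keys() if k.startswith(b"post:"))

print(user["name"])
print(post_keys)
```

Scanning every key to find the posts is fine for a sketch. A store with ordered keys and cursors could do the same prefix lookup without reading the whole key set.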
 
