This one is for the hardcore techies in the audience. Does anybody have any *practical* experience with the NoSQL world? Querki doesn't fit the traditional SQL mold very well, so I'm comparing NoSQL options -- in particular, document-centric data stores, which are broadly the best match for our needs. So far, three options look particularly intriguing:
- MongoDB -- very fast, a tad unreliable architecturally, lots of query power
- CouchDB -- super-reliable by design
- Postgres with hstore -- today's discovery (which prompted this question), which lets you mix traditional columns with key-value bags
So any and all informed opinions about these or similar options would be greatly welcomed. Thanks...
(no subject)
Date: 2012-10-25 02:38 pm (UTC)
Largely I'm a SQL and tongs kind of guy. :-)
(no subject)
Date: 2012-10-25 04:22 pm (UTC)
SQL's a terrible fit for this data, though. Querki's user-level data just plain isn't *square* in the way SQL wants -- it is, quite deliberately, non-tabular. The interesting thing about Postgres' hstore system is that it allows you to fit such irregular data side-by-side with more conventional tables...
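To make the "non-tabular" point concrete, here's a minimal sketch of what that irregular data looks like, and how it flattens into an hstore-style key/value literal. The property names and the `to_hstore` helper are invented for illustration; Querki's real model will differ.

```python
# Two Querki-style "Things" in the same Space with completely different
# property sets -- no fixed column list fits both of them.
things = [
    {"name": "Chocolate Cake", "cook time": "45 min", "servings": "12"},
    {"name": "Garden Shed", "color": "red"},  # no cooking props at all
]

def to_hstore(props):
    """Render a dict as a Postgres hstore literal, e.g. '"a"=>"1", "b"=>"2"'."""
    def quote(s):
        return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'
    return ", ".join(quote(k) + "=>" + quote(v) for k, v in props.items())

print(to_hstore(things[1]))  # "name"=>"Garden Shed", "color"=>"red"
```

The point of hstore is that this bag of pairs sits in a single column, next to ordinary typed columns in the same row.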
(no subject)
Date: 2012-10-25 06:21 pm (UTC)
The big advantage of MongoDB and CouchDB over traditional SQL is speed, speed, and speed. That raises the question: how important is speed to Querki? I may be wrong, but at least in the initial stages, it doesn't seem to be as necessary.
(Also, assuming you put an abstraction layer over the DB backend, switching DBs shouldn't be that hard.)
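For what the abstraction layer might look like, here's a minimal sketch: one interface that backends implement, so swapping the store later only touches one class. All the names (`SpaceStore`, `write_thing`, `load_space`) are invented, not anything Querki actually has.

```python
from abc import ABC, abstractmethod

class SpaceStore(ABC):
    """Hypothetical seam between Querki and whatever DB sits underneath."""

    @abstractmethod
    def write_thing(self, space_id, thing_id, props): ...

    @abstractmethod
    def load_space(self, space_id): ...

class InMemoryStore(SpaceStore):
    """Trivial backend standing in for a real Mongo/Couch/Postgres driver."""

    def __init__(self):
        self._data = {}

    def write_thing(self, space_id, thing_id, props):
        self._data.setdefault(space_id, {})[thing_id] = props

    def load_space(self, space_id):
        return dict(self._data.get(space_id, {}))

store = InMemoryStore()
store.write_thing(1, 42, {"name": "Cake"})
print(store.load_space(1))  # {42: {'name': 'Cake'}}
```

Switching backends then means writing one new `SpaceStore` subclass rather than chasing queries through the codebase.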
(no subject)
Date: 2012-10-25 06:54 pm (UTC)
Oh, sure. It's always been an assumption that there would be a MySQL or Postgres instance to manage all of the critical stuff that needs to be transactional. (Which is really the important part for user data. Relational I could probably live without; transactions, not so much.)
> The big advantage of MongoDB and CouchDB over traditional SQL is speed, speed, and speed. That raises the question: how important is speed to Querki? I may be wrong, but at least in the initial stages, it doesn't seem to be as necessary.
Yes and no. Querki is mostly an in-memory database itself, so the speed requirements are peculiar. Write speed is semi-irrelevant, so long as it's not pathological, since nothing important is going to block on that. More specifically, I don't care about latency in the slightest, and I only care that throughput is good enough to broadly keep up. By the nature of the actor-centric architecture, little speed hiccups on write shouldn't matter much.
OTOH, *read* speed is deathly critical -- when a Space has gone cold and gets freshly touched, I need to be able to sweep it into memory as fast as possible. That's a very large number of objects that I have to read in quickly. And there, I *suspect* that Postgres has the edge. (The few comparisons I've seen suggest so, although I have yet to see a really good benchmark.)
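The cold-Space load path can be sketched in a few lines: the whole Space comes back in one indexed range scan, rather than one query per object. This uses SQLite purely as a stand-in; the schema and names are invented, and the real store (Postgres, Mongo, whatever) would differ.

```python
import sqlite3

# Toy store: every Thing is a (space_id, thing_id, props) row,
# indexed by space_id so a whole Space is one contiguous scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE things (space_id INTEGER, thing_id INTEGER, props TEXT)")
conn.execute("CREATE INDEX idx_space ON things (space_id)")
conn.executemany(
    "INSERT INTO things VALUES (?, ?, ?)",
    [(1, n, "props-%d" % n) for n in range(1000)],
)

def load_space(space_id):
    """One sweep: every object in the Space arrives from a single query."""
    cur = conn.execute(
        "SELECT thing_id, props FROM things WHERE space_id = ?", (space_id,)
    )
    return {thing_id: props for thing_id, props in cur}

space = load_space(1)
print(len(space))  # 1000
```

That single-sweep shape is why bulk read throughput matters so much more here than per-write latency.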
> (Also, assuming you put an abstraction layer over the DB backend, switching DBs shouldn't be that hard.)
True in principle, and that's certainly the plan. Still, it would be a hassle at best, so I'd rather find the most appropriate solution now, and not have to revisit that decision in a year...