My favorite answer is as what the first sentence in this thread suggested. Use an Adjacency List to maintain the hierarchy and use Nested Sets to query the hierarchy.
The problem up until now has been that the coversion method from an Adjacecy List to Nested Sets has been frightfully slow because most people use the extreme RBAR method known as a "Push Stack" to do the conversion and has been considered to be way to expensive to reach the Nirvana of the simplicity of maintenance by the Adjacency List and the awesome performance of Nested Sets. As a result, most people end up having to settle for one or the other especially if there are more than, say, a lousy 100,000 nodes or so. Using the push stack method can take a whole day to do the conversion on what MLM'ers would consider to be a small million node hierarchy.
I thought I'd give Celko a bit of competition by coming up with a method to convert an Adjacency List to Nested sets at speeds that just seem impossible. Here's the performance of the push stack method on my i5 laptop.
Duration for 1,000 Nodes = 00:00:00:870
Duration for 10,000 Nodes = 00:01:01:783 (70 times slower instead of just 10)
Duration for 100,000 Nodes = 00:49:59:730 (3,446 times slower instead of just 100)
Duration for 1,000,000 Nodes = 'Didn't even try this'
And here's the duration for the new method (with the push stack method in parenthesis).
Duration for 1,000 Nodes = 00:00:00:053 (compared to 00:00:00:870)
Duration for 10,000 Nodes = 00:00:00:323 (compared to 00:01:01:783)
Duration for 100,000 Nodes = 00:00:03:867 (compared to 00:49:59:730)
Duration for 1,000,000 Nodes = 00:00:54:283 (compared to something like 2 days!!!)
Yes, that's correct. 1 million nodes converted in less than a minute and 100,000 nodes in under 4 seconds.
You can read about the new method and get a copy of the code at the following URL.
http://www.sqlservercentral.com/articles/Hierarchy/94040/
I also developed a "pre-aggregated" hierarchy using similar methods. MLM'ers and people making bills of materials will be particularly interested in this article.
http://www.sqlservercentral.com/articles/T-SQL/94570/
If you do stop by to take a look at either article, jump into the "Join the discussion" link and let me know what you think.
I use a package.json
for my packages and a config.js
for my configuration, which looks like:
var config = {};
config.twitter = {};
config.redis = {};
config.web = {};
config.default_stuff = ['red','green','blue','apple','yellow','orange','politics'];
config.twitter.user_name = process.env.TWITTER_USER || 'username';
config.twitter.password= process.env.TWITTER_PASSWORD || 'password';
config.redis.uri = process.env.DUOSTACK_DB_REDIS;
config.redis.host = 'hostname';
config.redis.port = 6379;
config.web.port = process.env.WEB_PORT || 9980;
module.exports = config;
I load the config from my project:
var config = require('./config');
and then I can access my things from config.db_host
, config.db_port
, etc... This lets me either use hardcoded parameters, or parameters stored in environmental variables if I don't want to store passwords in source control.
I also generate a package.json
and insert a dependencies section:
"dependencies": {
"cradle": "0.5.5",
"jade": "0.10.4",
"redis": "0.5.11",
"socket.io": "0.6.16",
"twitter-node": "0.0.2",
"express": "2.2.0"
}
When I clone the project to my local machine, I run npm install
to install the packages. More info on that here.
The project is stored in GitHub, with remotes added for my production server.
Best Answer
I've been testing PouchDB, RxDB (which relies on PouchDB with RxJS streams for queries), Realm-JS (native database like sqlite3), FireStore. NeDB doesn't support remote sync.
I won't go into performance metrics details on each database, but PouchDB has been very slow and heavy on memory when querying more than 20.000 items (tried with indexeddb/websql adapter).
RxDB was generally much faster with that many items, especially when subscribing to query changes (tried with indexeddb/websql adapter too). Also schema and migrations are very handy.
FireStore is a good choice and comes with very easy setup of server and client components, but you'll need to be comfortable to run on a google platform. There is some flexibility with firebase functions if you want some control over server logic and there is customizable ACLs for your collections. Speed has been good and on par with RxDB. Comes with a very good auth module if you want it.
If you want to be able to scale much much more on the client side, which you probably don't, or if you want to make complex queries, I'd really recommend using something like realm. You'll need to compile for each platform like you've experienced with sqlite3, but there is sync, offline persistence, rich queries and great performance. There is just no way that javascript based solutions like PouchDB, RxDB or FireStore can compete, even with sqlite backends, since much of the computation will still happen in your precious JS thread. realm is doing much of its heavy lifting on the native library. I've been able to do LIKE "abc" queries on 100.000 items, returning hundreds of results, within less than hundred milliseconds and without noticeably freezing my UI or pumping up memory usage heavily. Supports client migrations too, which is nice.
In the end, there are multiple answers: 1. Want to host everything yourself and don't need massive scale client side (you can sync against subsets of your server data with filters), RxDB ist very good and comes with nice set of features. 2. Want very easy setup, nice modules like auth, server functions etc, and don't need massive scale on the client (can also sync against server data subsets with filters), FireStore is great. 3. Need lots of lots of data on the client, realm is my preference. I personally really dislike the realm sync platform for its pricing model (though technically it's cool), but the database itself is free and maybe you could try and implement a custom sync.
Take my results with a grain of salt, I've had some very specific challenges like large, non-relational collections and fulltext-search, your use-case will probably differ a lot.