it all depends, but you can get a rough estimate with this approach: run iostat and look at the iops on your disk. if you have a typical database, you're most probably limited by the number of random seeks per second, not by bandwidth.
- in a maintenance window, run xtrabackup without throttling and check how many iops your system can generate. say it's x.
- after that, look at how many iops is typical for the system during off-peak hours. say it's y.
based on this, estimate how many iops you can dedicate to the backup job. i would calculate it as x - 2 * y or x - 3 * y to leave some headroom for spikes.
i think xtrabackup's throttle parameter will be linearly proportional to iops but not equal to it, so in the last step use trial and error to tune the throttle value until iostat shows the number of operations per second you want.
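the arithmetic above can be sketched in a few lines of python. this is only an illustration: the function name and the sample numbers are made up, and the result is a target to aim for in iostat, not a value to pass to xtrabackup directly.

```python
def backup_iops_budget(x_max_iops, y_offpeak_iops, headroom_factor=2):
    """Estimate IOPS available for the backup job as x - k * y.

    x_max_iops: IOPS measured while the backup ran unthrottled (x).
    y_offpeak_iops: typical off-peak IOPS of the normal workload (y).
    headroom_factor: k, the multiple of y reserved for spikes (2 or 3).
    """
    budget = x_max_iops - headroom_factor * y_offpeak_iops
    # A negative result means there is no spare capacity at this headroom.
    return max(budget, 0)


# Hypothetical measurements, for illustration only.
x = 1200
y = 250
print(backup_iops_budget(x, y, 2))  # 700
print(backup_iops_budget(x, y, 3))  # 450
```

if the budget comes out at 0, the disk has no room for the backup at that headroom level, so either accept less headroom or pick a quieter window.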
alternatively, use ionice [ a little about it here ]: give your backup job low priority and do not throttle it at all. i'm doing this for rdiff-backup jobs and it works quite nicely. note that ionice [ afaik ] works only with some io schedulers in linux.
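a minimal sketch of the ionice approach, assuming the util-linux ionice command is installed and that the idle class (-c 3) is the priority you want. the helper name and the paths are hypothetical, and the snippet only builds and prints the command rather than running it.

```python
import shlex


def with_idle_io_priority(command):
    """Prefix a command with ionice so it runs in the idle I/O class."""
    # "-c 3" selects the idle class: the job gets disk time only when
    # nothing else is asking for it (scheduler support permitting).
    return ["ionice", "-c", "3"] + list(command)


# Hypothetical backup command and paths, for illustration only.
cmd = with_idle_io_priority(["rdiff-backup", "/srv/data", "/mnt/backup/data"])
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```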
First, you need to know which DBMS features you want and how well the systems do this. For example:
MySQL is fast for certain types of applications, so you can service a high transaction volume on modest hardware.
SQL Server comes with a good set of reporting tools, so you may not need to purchase third-party tooling for this. However, it has historically run only on Windows (Linux support arrived with SQL Server 2017).
Oracle has a JVM built into the server (do you really want to pay per-CPU Oracle licensing to run a Java app?) and has good support for large databases through table partitioning, bitmap indexes and a variety of features that facilitate data warehouse applications.
Various database systems may or may not support the XA protocol for distributed transactions
Postgres has a spatial index and support for extensions and stored procedures in a variety of languages.
Teradata has a shared-nothing architecture with no central bottlenecks so it can scale out to an arbitrarily large data set.
The SQL dialects supported by different systems have greater or lesser feature sets or particular strengths.
Once you know which of the various proprietary or open source DBMS platforms can support your application and how well they do it you can decide which you want to use.
All of the open-source DBMS platforms have credible support offerings available either through the vendor or third parties. Needless to say these support offerings are commercial so they are not free. If you really do not need vendor support you could view an open-source system as free, but this is going to be an unusual situation. There is one corner case of particular interest, which is discussed below.
Open-source systems also de-couple support from the vendor - credible third-party support offerings are available for most if not all major open-source DBMS products.
Various DBMS platforms allow extensions to be developed. In fact, this technology was pioneered on Postgres by Stonebraker et al. and was the main driver for the development of that system. Different platforms have greater or lesser support for this:
Oracle and SQL Server have some limited support.
Postgres has extensive support for extensions right throughout the system.
Informix Online has support for extensions known as 'blades', derived from Illustra (which was itself an early commercialised version of Postgres).
MySQL has a plug-in architecture that supports third-party storage engines.
If you have this particular requirement you may find that the open-source systems offer more flexibility. For example, there are several data warehousing products based on modified versions of Postgres.
Thus, open-source vs. proprietary is not a choice between free (as in beer) and paid for, but rather a matter of features, cost, confidence in the support options and control.
Re: PBXT vs XtraDB (InnoDB)
I posed that question to Paul McCullagh directly. You can read his response here:
http://www.mysqlperformanceblog.com/2009/11/20/paul-mccullagh-answers-your-questions-about-pbxt/
To paraphrase: PBXT is a generic OLTP engine, so it does overlap considerably with InnoDB. Most of the direct 'better use cases' are not known yet.
Re: TokuDB vs XtraDB (InnoDB)
I think these are a little different. While TokuDB does have some properties that may be good for OLTP, where it really shines is:
a) When you're dealing with so much data that your inserts slow down because your indexes no longer fit in memory (a 'classic' B-Tree problem from which TokuDB does not suffer).
b) When you need to have a lot of ad hoc indexes on data.
Vadim talks about this here: http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/
--
From your description, I'm going to assume that XtraDB and PBXT are the most obvious choices. Both will work. In XtraDB's favor is that it has been around for longer.
(Disclaimer: I work for Percona, authors of XtraDB).