Persistency and Performance Guarantees

Latest revision as of 13:20, 9 March 2026

MemCP gives the user several guarantees for persistency and performance. They are described below.

Performance Guarantees and Read Speed

MemCP guarantees that:

  • All read operations are served from RAM and are fast
  • Inserts and updates that need no unique key check or foreign key check scale over shards
  • Aggregate functions such as SUM and AVG scale perfectly with the number of CPU cores

With these guarantees, you can add CPU cores as your data grows and keep query times the same.

Keep in mind that memory bandwidth can still be a bottleneck. Even modern 128-core CPUs do not scale better than 8 cores on uncompressed data (usually 40 GiB/s of bandwidth on the tested machines). MemCP's Columnar Storage scales beyond that limit.
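The bandwidth limit can be illustrated with a little arithmetic. The following is a minimal sketch, not a MemCP benchmark: it assumes that a scan is bound only by memory bandwidth and that compressed storage helps because fewer bytes have to be read. The 40 GiB/s figure comes from the text above; the data size and the compression ratio of 4 are hypothetical, and the cost of decompression is ignored.

```python
def scan_seconds(data_gib: float,
                 bandwidth_gib_s: float = 40.0,
                 compression_ratio: float = 1.0) -> float:
    """Lower bound on the time to stream a data set through memory once.

    data_gib          -- logical (uncompressed) size of the data
    bandwidth_gib_s   -- memory bandwidth; 40 GiB/s is the figure from the text
    compression_ratio -- hypothetical ratio of logical size to stored size
    """
    if bandwidth_gib_s <= 0 or compression_ratio <= 0:
        raise ValueError("bandwidth and compression ratio must be positive")
    stored_gib = data_gib / compression_ratio
    return stored_gib / bandwidth_gib_s


# 400 GiB of logical data, read uncompressed and with a hypothetical 4x ratio.
print(scan_seconds(400))                         # 10.0
print(scan_seconds(400, compression_ratio=4.0))  # 2.5
```

Once the bandwidth is saturated, adding cores does not shorten the uncompressed scan. Only reading fewer bytes does.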

Persistency Guarantees and Write Speed

Each table uses one of five persistency modes:

  • ENGINE = cache
  • ENGINE = memory
  • ENGINE = sloppy
  • ENGINE = logged
  • ENGINE = safe

Unlike other RDBMSs, MemCP lets you choose this setting per table. MySQL/MariaDB, for instance, only allows a global setting, which is sometimes not what you want.
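As an illustration of the per-table setting, the sketch below builds CREATE TABLE statements that differ only in their engine. It assumes that MemCP accepts the MySQL-style ENGINE table option at the end of CREATE TABLE, as the "ENGINE = ..." notation on this page suggests. The helper name, the table names, and the column types are hypothetical, and the helper does no SQL escaping.

```python
# The five persistency modes listed on this page.
ENGINES = ("cache", "memory", "sloppy", "logged", "safe")


def create_table_sql(table: str, columns: dict[str, str], engine: str) -> str:
    """Build a CREATE TABLE statement with a per-table engine (illustrative only)."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    # Assumption: the engine is given as a MySQL-style table option.
    return f"CREATE TABLE {table} ({cols}) ENGINE = {engine}"


# Recreatable session data and accounting data can live in the same database.
print(create_table_sql("sessions", {"id": "INT", "token": "TEXT"}, "cache"))
print(create_table_sql("invoices", {"id": "INT", "amount": "DECIMAL(10,2)"}, "safe"))
```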

In short, these engines guarantee different performance:

  • cache+memory -> RAM speed, no persistency
  • sloppy+logged -> HDD speed, persistency guarantees under certain circumstances
  • safe -> Sync speed (slow) and persistency guarantees even in case of power outage during queries

ENGINE = cache

  • all data is held in memory and only in memory
  • in case of a crash, all data is gone
  • in case the server is running out of RAM, the data can be deleted, too
  • the schema is saved on disk
  • after a recovery, the table starts empty
  • fastest way to store data
  • use it for session data, observer handles, caches and other data that can be recreated by the software

ENGINE = memory

  • all data is held in memory and only in memory
  • in case of a crash, all data is gone
  • while the server is running, the data will be kept alive
  • the schema is saved on disk
  • after a recovery, the table starts empty
  • fastest way to store data
  • use it for session data, observer handles, caches and other data that can be recreated by the software

ENGINE = sloppy

  • all data is held in memory
  • the main storage is mirrored on disk
  • the delta storage is RAM-only
  • a main storage rebuild is triggered every 15 minutes, so data older than 15 minutes is guaranteed to be persistent
  • in case of a crash, the delta storage is gone, the main storage is recovered
  • after a recovery, some datasets or deletions that happened in the last 15 minutes before the crash may be gone
  • extremely fast way to store data without the fear of losing it in normal operation
  • use it for frequently updated tables with unimportant data like usage statistics or sensor data
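The interplay of main storage, delta storage, and rebuild described above can be sketched with a toy model. This is not MemCP's implementation: the class and method names are hypothetical, deletions are left out, and the rebuild is called by hand instead of every 15 minutes.

```python
class SloppyTable:
    """Toy model of a sloppy table: main storage on disk, delta storage in RAM."""

    def __init__(self) -> None:
        self.main: list[str] = []   # stands in for the disk-mirrored main storage
        self.delta: list[str] = []  # stands in for the RAM-only delta storage

    def insert(self, row: str) -> None:
        # New rows only reach the delta storage.
        self.delta.append(row)

    def rebuild(self) -> None:
        # Merge the delta into the main storage; from here on the rows are persistent.
        self.main.extend(self.delta)
        self.delta.clear()

    def rows(self) -> list[str]:
        return self.main + self.delta

    def crash_and_recover(self) -> "SloppyTable":
        # After a crash only the main storage comes back; the delta is lost.
        recovered = SloppyTable()
        recovered.main = list(self.main)
        return recovered


table = SloppyTable()
table.insert("reading-1")
table.rebuild()             # "reading-1" is now in the main storage
table.insert("reading-2")   # still only in the delta storage
print(table.rows())                      # ['reading-1', 'reading-2']
print(table.crash_and_recover().rows())  # ['reading-1']
```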

ENGINE = logged

  • all data is held in memory
  • the main storage is mirrored on disk
  • changes to the delta storage are logged on disk files
  • an operation succeeds even if data is not synced to disk permanently yet
  • in case of a crash, data might be recovered
    • in case of a crash of the process, all data will be recovered
    • in case of a power outage or kernel crash, data might be lost
  • allows buffering of the log file in return for the risk of data loss
  • use it for data where you need a high update performance but cannot afford losing the last few seconds of your work
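The difference between reporting success before and after the data is synced to disk can be shown with plain file I/O. The sketch below is a generic illustration using Python's standard library, not MemCP's log writer. The function name and the sync flag are hypothetical. With sync=False the record is handed to the operating system, which is enough to survive a crash of the process but not a power outage. With sync=True the call additionally waits for os.fsync before returning.

```python
import os
import tempfile


def append_log(path: str, record: bytes, sync: bool) -> None:
    """Append one record to a log file, optionally waiting for the disk."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()                 # pass the bytes on to the operating system
        if sync:
            os.fsync(f.fileno())  # block until the OS reports the data as written


log_path = os.path.join(tempfile.mkdtemp(), "delta.log")
append_log(log_path, b"update 1", sync=False)  # returns without waiting for the disk
append_log(log_path, b"update 2", sync=True)   # returns only after the sync

with open(log_path, "rb") as f:
    print(f.read())  # b'update 1\nupdate 2\n'
```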

ENGINE = safe

  • all data is held in memory
  • the main storage is mirrored on disk
  • changes to the delta storage are logged on disk files
  • an operation only succeeds after the data is synced to disk permanently
  • in case of a crash, all data will be recovered
  • introduces delays for each transaction since the system has to wait for a write fence
  • IO bound (limited to ~1,700 write operations per second)
  • use it for accounting data or any data that must not be lost
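The write limit of the safe engine translates directly into a minimum duration for bulk writes. The following is a back-of-the-envelope sketch that takes the ~1,700 write operations per second from the list above at face value and assumes one synced write operation per transaction. The function name and the workload size are hypothetical.

```python
SAFE_WRITES_PER_SECOND = 1700  # approximate limit quoted on this page


def min_seconds_for_writes(n_writes: int,
                           writes_per_second: float = SAFE_WRITES_PER_SECOND) -> float:
    """Lower bound on the time needed for n_writes synced write operations."""
    if writes_per_second <= 0:
        raise ValueError("writes_per_second must be positive")
    return n_writes / writes_per_second


# One million single-row transactions on a safe table take almost ten minutes.
print(round(min_seconds_for_writes(1_000_000)))  # 588
```

For write-heavy data that can tolerate some loss, this is the cost that the logged and sloppy engines avoid.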