Main Page: Difference between revisions

From MemCP
 
(18 intermediate revisions by the same user not shown)
Line 1: Line 1:
= MemCP – A Modern In-Memory Columnar Database =


'''MemCP is a high-performance, in-memory, column-oriented database designed for modern workloads.''' 
It provides a lightweight, developer-friendly alternative to traditional relational databases such as MySQL, with a focus on speed, compression, and direct API integration.

=== What is MemCP? ===
[[File:Webapps.svg|left|frameless]]
MemCP is an open-source, high-performance, columnar in-memory database that can handle both OLAP and OLTP workloads. It provides an alternative to proprietary analytical databases and aims to bring the benefits of columnar storage to the open-source world.

MemCP is written in Go and is designed to be portable and extensible, so developers can embed the database into their applications with ease. Its focus on scalability and performance also makes it a suitable choice for distributed applications.
----


== Key Features ==

* '''High Performance''': NUMA-aware, parallelized query execution optimized for multicore CPUs, large caches, and NVMe SSDs. Handles both OLTP and OLAP workloads efficiently. 
* '''Columnar Storage''': Data is stored by column for improved compression, reduced memory footprint, and faster analytical queries. 
* '''In-Memory Operation''': Designed to keep data in memory, with configurable persistence backends for durability. 
* '''Built-in APIs''': Exposes RESTful endpoints directly from the database, reducing middleware overhead. 
* '''Compression''': Multiple strategies (bit-packing, dictionary encoding, sequence compression) reduce storage by up to 80% compared to MySQL/MariaDB. 
* '''Simple Deployment''': Start with a single <code>docker run</code> or <code>pm2 start</code> command. Lightweight footprint (~10MB). 
* '''Extensible''': Written in Go, with pluggable storage backends and custom frontend support (SQL, RDF, REST). 

----

== Why MemCP? ==

Traditional relational databases were designed decades ago, optimized for spinning disks and single-core CPUs. 
MemCP rethinks the core design for today’s hardware and workloads:

* Real-time dashboards and analytics 
* Data-heavy SaaS platforms 
* Embedded systems with limited resources 
* High-throughput OLTP/OLAP hybrids 

===Features===

*'''fast:''' MemCP is built with parallelization in mind. Its parallelization pattern is designed for minimal overhead.
*'''efficient:''' The average compression ratio is 1:5 (80% memory saving) compared to MySQL/MariaDB.
*'''modern:''' MemCP is built for modern hardware with caches, NUMA memory, multicore CPUs, and NVMe SSDs.
* '''versatile:''' Use it on large mainframes to gain analytical performance, or in embedded systems to conserve flash lifetime.
* Columnar storage: Stores data column-wise instead of row-wise, which allows for better compression, faster query execution, and more efficient use of memory.
* In-memory database: Stores all data in memory, which allows for extremely fast query execution.
*Build fast REST APIs directly in the database (they are faster because there is no network connection / SQL layer in between)[[File:MemCP Port.png|frameless]]
*OLAP and OLTP support: Can handle both online analytical processing (OLAP) and online transaction processing (OLTP) workloads.
*Compression: Supports multiple compression formats, such as bit-packing and dictionary encoding.
* Scalability: Designed to scale on a single node with huge NUMA memory.
*Adjustable persistency: Decide whether to persist a table, not persist it, or just keep snapshots of a period of time.
<youtube>g29FR4Jwius</youtube>
 
----
 
== Quick Start ==
 
Clone and build MemCP from source:
 
<pre>
git clone https://github.com/launix-de/memcp
cd memcp
go get
make
pm2 start ./memcp ./data/
</pre>
 
Connect with MySQL tooling:
 
<pre>
mysql -h 127.0.0.1 -u root -p -P 3307
Enter password: admin
</pre>
 
----
 
== MemCP vs. MySQL ==
 
{| class="wikitable"
! Feature
! MySQL
! MemCP
|-
| Storage Model
| Row-based
| Column-based (compressed)
|-
| Performance
| Good
| NUMA-optimized, in-memory
|-
| In-Memory Capable
| Limited
| Yes (default)
|-
| REST API Integration
| External
| Built-in
|-
| Installation Footprint
| ~150MB+
| ~10MB
|-
| Open Source
| ✅
| ✅
|}
 
----
 
== Architecture Overview ==
 
* '''Tables, Schemas, Columns''': Familiar SQL-style structures with a columnar physical layout.
* '''Transaction Model''': Supports both OLTP and OLAP semantics with delta + main storage. 
* '''Persistence''': Configurable storage backends (filesystem, S3, Ceph). 
* '''Frontends''': Multiple query interfaces:
  - SQL frontend (MySQL wire protocol + SQL over REST) 
  - RDF/graph query engine 
  - Custom APIs via in-database web apps 
 
----
 
== Documentation ==
 
* [[What is OLTP and OLAP]] 
* [[History of the MemCP project]] 
* [[Hardware Requirements]] 
* [[Persistency and Performance Guarantees]] 
* [[Comparison: MemCP vs. MySQL]] 
* [[Install MemCP with Docker|Install with Docker]]
* [[Compile MemCP from Source|Build from Source]] 
* [[Contributing]] 
* [[SQL over REST]] 
* [[In-Database WebApps|REST & Microservices]] 
 
----


https://www.youtube.com/watch?v=g29FR4Jwius


===Navigation===
Line 32: Line 121:
*[[Persistency and Performance Guarantees]]  
*[[Current Status and Open Issues]]
*[[Comparison: MemCP vs. MySQL]]


====Getting Started====
Line 39: Line 129:
*[[Contributing]]  
*[[Introduction to Scheme]]
*[[Full SCM API documentation]]
====Administration====
* [[Deployment]]
* [[Migration from MySQL and PostgreSQL]]
* [[Settings]]
*[[Process Hibernation]]
*[[Performance Measurement]]
*[[MemCP Console]]


====Frontends====
Line 45: Line 145:
*[[Supported SQL]]
*[[Advanced SQL Tutorial]]
*[[Replace MySQL with MemCP]]
*[[SQL over REST]]
*[[Database Tools compatibility with MemCP|Supported Tooling]]
*[[How SQL Operators are implemented on MemCP]]
*[[Add custom SQL operators to MemCP]]


=====RDF Frontend===== 
Line 58: Line 158:

*[[In-Database WebApps|In-Database WebApps and REST Services]]
*[[MemCP for Microservices]]
*[[Websockets in MemCP]]


====Administration====

* [[Settings]]
*[[Process Hibernation]]
*[[Performance Measurement]]

==== Persistency Backends (= Storage) ====

* [[File System]]
* [[S3 Buckets]]
* [[Ceph/Rados]]
* [[Cluster Monitor]]


====Internals====
Line 74: Line 176:
*[[Columnar Storage]]
*[[Transactions]] 
*[[Full SCM API documentation]]
===== SCM Documentation =====
* [[SCM Builtins]]
* [[Arithmetic / Logic]]
* [[Strings]]
* [[Streams]]
* [[Lists]]
* [[Associative Lists / Dictionaries]]
* [[Date]]
* [[Vectors]]
* [[Parsers]]
* [[Sync]]
* [[IO]]
* [[Storage]]


=====Optimizations=====
Line 82: Line 200:

[[File:Screenshot from htop.png|center|frameless|2490x2490px]]
----

== Further Reading ==

* [https://github.com/launix-de/memcp MemCP on GitHub] 
* [https://www.vldb.org/pvldb/vol13/p2649-boncz.pdf VLDB Research Paper] 
* [https://cs.emis.de/LNI/Proceedings/Proceedings241/383.pdf LNI Proceedings Paper] 
* [https://www.dcs.bbk.ac.uk/~dell/teaching/cc/paper/sigmod10/p135-malewicz.pdf Large Graph Algorithms] 

Additional blog posts on design decisions, compression techniques, and performance optimization are available on the [https://launix.de/launix/ Launix blog].

====Scientific====

*[https://www.vldb.org/pvldb/vol13/p2649-boncz.pdf VLDB Research Paper]
*[https://cs.emis.de/LNI/Proceedings/Proceedings241/383.pdf LNI Proceedings Paper]
*[https://wwwdb.inf.tu-dresden.de/wp-content/uploads/T_2014_Master_Patrick_Damme.pdf TU Dresden Research Paper]
*[https://www.dcs.bbk.ac.uk/~dell/teaching/cc/paper/sigmod10/p135-malewicz.pdf Large Graph Algorithms]
*https://wwwdb.inf.tu-dresden.de/research-projects/eris/

====How MemCP was built====

*[https://launix.de/launix/how-to-balance-a-database-between-olap-and-oltp-workflows/ Balancing OLAP and OLTP Workflows]
*[https://launix.de/launix/designing-a-programming-language-for-distributed-systems-and-highly-parallel-algorithms/ Designing Programming Languages for Distributed Systems]
*[https://launix.de/launix/on-designing-an-interface-for-columnar-in-memory-storage-in-golang/ Columnar Storage Interface in Golang]
*[https://launix.de/launix/how-in-memory-compression-affects-performance/ Impact of In-Memory Compression on Performance]
*[https://launix.de/launix/memory-efficient-indices-for-in-memory-storages/ Memory-Efficient Indices for In-Memory Storages]
*[https://launix.de/launix/on-compressing-null-values-in-bit-compressed-integer-storages/ Compressing Null Values in Bit-Compressed Integer Storages]
*[https://launix.de/launix/when-the-benchmark-is-too-slow-golang-http-server-performance/ Improving Golang HTTP Server Performance]
*[https://launix.de/launix/how-to-benchmark-a-sql-database/ Benchmarking SQL Databases]
*[https://launix.de/launix/writing-a-sql-parser-in-scheme/ Writing a SQL Parser in Scheme]
*[https://launix.de/launix/accessing-memcp-via-scheme/ Accessing memcp via Scheme]
*[https://launix.de/launix/memcp-first-sql-query-is-correctly-executed/ First SQL Query in memcp]
*[https://launix.de/launix/sequence-compression-in-in-memory-database-yields-99-memory-savings-and-a-total-of-13/ Sequence Compression in In-Memory Database]
*[https://launix.de/launix/storing-a-bit-smaller-than-in-one-bit/ Storing Data Smaller Than One Bit]
*[https://www.youtube.com/watch?v=DWg4nx4KVLo memcp Announcement Video]

----

== Community ==

MemCP is an open-source project maintained by developers for developers. 
Contributions are welcome, whether in the form of bug reports, feature requests, or pull requests. 

See: [[Contributing]]

Latest revision as of 12:52, 22 September 2025

MemCP – A Modern In-Memory Columnar Database

MemCP is a high-performance, in-memory, column-oriented database designed for modern workloads. It provides a lightweight, developer-friendly alternative to traditional relational databases such as MySQL, with a focus on speed, compression, and direct API integration.


Key Features

  • High Performance: NUMA-aware, parallelized query execution optimized for multicore CPUs, large caches, and NVMe SSDs. Handles both OLTP and OLAP workloads efficiently.
  • Columnar Storage: Data is stored by column for improved compression, reduced memory footprint, and faster analytical queries.
  • In-Memory Operation: Designed to keep data in memory, with configurable persistence backends for durability.
  • Built-in APIs: Exposes RESTful endpoints directly from the database, reducing middleware overhead.
  • Compression: Multiple strategies (bit-packing, dictionary encoding, sequence compression) reduce storage by up to 80% compared to MySQL/MariaDB.
  • Simple Deployment: Start with a single docker run or pm2 start command. Lightweight footprint (~10MB).
  • Extensible: Written in Go, with pluggable storage backends and custom frontend support (SQL, RDF, REST).
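
As a rough illustration of the Columnar Storage and Compression points above, the following Go sketch dictionary-encodes a single string column and reports how many bits per row a bit-packed layout would need. The DictColumn type and its methods are hypothetical names invented for this example; they are an assumption for illustration and not MemCP's actual storage interface.

```go
package main

import (
	"fmt"
	"math/bits"
)

// DictColumn is a hypothetical dictionary-encoded string column (not MemCP's
// real type): each distinct value is stored once, rows hold integer codes.
type DictColumn struct {
	Dict  []string          // code -> value
	Codes []uint32          // one code per row
	index map[string]uint32 // value -> code, used while appending
}

// NewDictColumn returns an empty column ready for Append.
func NewDictColumn() *DictColumn {
	return &DictColumn{index: make(map[string]uint32)}
}

// Append adds one row, reusing the code of a value that was seen before.
func (c *DictColumn) Append(v string) {
	code, ok := c.index[v]
	if !ok {
		code = uint32(len(c.Dict))
		c.Dict = append(c.Dict, v)
		c.index[v] = code
	}
	c.Codes = append(c.Codes, code)
}

// Get decodes the value stored in the given row.
func (c *DictColumn) Get(row int) string {
	return c.Dict[c.Codes[row]]
}

// BitsPerCode reports how many bits per row a bit-packed layout would need
// to distinguish all dictionary entries (at least 1).
func (c *DictColumn) BitsPerCode() int {
	if len(c.Dict) <= 1 {
		return 1
	}
	return bits.Len(uint(len(c.Dict) - 1))
}

func main() {
	col := NewDictColumn()
	for _, v := range []string{"DE", "US", "DE", "FR", "US", "DE"} {
		col.Append(v)
	}
	// Three distinct values fit into 2 bits per row instead of a full string.
	fmt.Println(len(col.Dict), col.BitsPerCode(), col.Get(3)) // prints "3 2 FR"
}
```

In this sketch the codes are still held in a []uint32; a real bit-packed store would pack BitsPerCode() bits per row into machine words, which is where most of the memory saving would come from.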

Why MemCP?

Traditional relational databases were designed decades ago, optimized for spinning disks and single-core CPUs. MemCP rethinks the core design for today’s hardware and workloads:

  • Real-time dashboards and analytics
  • Data-heavy SaaS platforms
  • Embedded systems with limited resources
  • High-throughput OLTP/OLAP hybrids

Quick Start

Clone and build MemCP from source:

git clone https://github.com/launix-de/memcp
cd memcp
go get
make
pm2 start ./memcp ./data/

Connect with MySQL tooling:

mysql -h 127.0.0.1 -u root -p -P 3307
Enter password: admin

MemCP vs. MySQL

Feature                | MySQL     | MemCP
Storage Model          | Row-based | Column-based (compressed)
Performance            | Good      | NUMA-optimized, in-memory
In-Memory Capable      | Limited   | Yes (default)
REST API Integration   | External  | Built-in
Installation Footprint | ~150MB+   | ~10MB
Open Source            | ✅        | ✅

Architecture Overview

  • Tables, Schemas, Columns: Familiar SQL-style structures with a columnar physical layout.
  • Transaction Model: Supports both OLTP and OLAP semantics with delta + main storage.
  • Persistence: Configurable storage backends (filesystem, S3, Ceph).
  • Frontends: Multiple query interfaces:
 - SQL frontend (MySQL wire protocol + SQL over REST)  
 - RDF/graph query engine  
 - Custom APIs via in-database web apps  
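
The "delta + main storage" idea in the Transaction Model point can be sketched in Go as follows. This is a minimal single-column illustration under stated assumptions: the Table type, its fields, and its methods are hypothetical, and the merge policy shown here is a generic one rather than MemCP's actual implementation.

```go
package main

import "fmt"

// Table is a hypothetical single-column table (not MemCP's real type) split
// into a read-optimized main store and a write-optimized delta store.
type Table struct {
	mainStore  []int64      // compact, rebuilt only by Merge
	deltaStore []int64      // recent inserts, append-only
	deleted    map[int]bool // row ids (main rows first, then delta rows)
}

// NewTable returns an empty table.
func NewTable() *Table {
	return &Table{deleted: make(map[int]bool)}
}

// Insert is cheap (OLTP-style): it only appends to the delta store.
func (t *Table) Insert(v int64) {
	t.deltaStore = append(t.deltaStore, v)
}

// Delete marks a row id as deleted without rewriting stored data.
func (t *Table) Delete(rowID int) {
	t.deleted[rowID] = true
}

// Scan visits every live row: first the main store, then the delta store.
func (t *Table) Scan(visit func(rowID int, v int64)) {
	for i, v := range t.mainStore {
		if !t.deleted[i] {
			visit(i, v)
		}
	}
	for i, v := range t.deltaStore {
		id := len(t.mainStore) + i
		if !t.deleted[id] {
			visit(id, v)
		}
	}
}

// Merge rebuilds the main store from all live rows and empties the delta.
// Row ids are reassigned by the merge.
func (t *Table) Merge() {
	merged := make([]int64, 0, len(t.mainStore)+len(t.deltaStore))
	t.Scan(func(_ int, v int64) { merged = append(merged, v) })
	t.mainStore = merged
	t.deltaStore = nil
	t.deleted = make(map[int]bool)
}

// Sum is an OLAP-style aggregate over all live rows.
func (t *Table) Sum() int64 {
	var s int64
	t.Scan(func(_ int, v int64) { s += v })
	return s
}

func main() {
	tbl := NewTable()
	tbl.Insert(10)
	tbl.Insert(20)
	tbl.Insert(30)
	tbl.Delete(1)          // removes the row holding 20
	fmt.Println(tbl.Sum()) // prints 40
	tbl.Merge()
	tbl.Insert(5)
	fmt.Println(tbl.Sum(), len(tbl.mainStore), len(tbl.deltaStore)) // prints "45 2 1"
}
```

The design choice illustrated here is that writes never touch the compact main store directly, so the main store can stay compressed and scan-friendly while the delta absorbs changes until the next merge.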

Documentation



Navigation

Introduction

Getting Started

Administration

Frontends

SQL Frontend
RDF Frontend
Custom Frontends

Persistency Backends (= Storage)

Internals

How things work in MemCP
SCM Documentation
Optimizations



Further Reading

Additional blog posts on design decisions, compression techniques, and performance optimization are available on the Launix blog.


Community

MemCP is an open-source project maintained by developers for developers. Contributions are welcome — whether in the form of bug reports, feature requests, or pull requests.

See: Contributing