Postgres Meets Analytics: CDC From Neon to ClickHouse Via PeerDB
Combining ClickHouse and Neon for real-time analytics on transactional data
If you’re building a data-driven application that handles large amounts of data, you may need to balance two different types of databases: a purpose-built operational DB for your transactions, and an analytical database for large-scale data analysis. A purpose-built analytical database can handle the fast analytical processing you may need (e.g., for real-time dashboards) without burdening the transactional database, which remains optimized for quick and precise row-level operations.
Combining Neon and ClickHouse: operational efficiency with large-scale analytics
A nice pair to know are Neon and ClickHouse. Neon is a serverless Postgres service that excels at handling transactional workloads with cost-efficiency thanks to features like autoscaling, scale-to-zero, and branching. On the other hand, ClickHouse is a high-performance columnar database optimized for real-time analytics and handling large volumes of data with no compromise on speed and efficiency. Both Neon and ClickHouse evolve from the same ethos of open source, which makes this combination even more compelling.
By combining Neon and ClickHouse, you get the best of both worlds—Neon for building low-latency transactional web applications (OLTP) and ClickHouse for blazing-fast analytics (OLAP). Here are a couple important use cases that this combination enables:
- Real-time customer facing analytics: You can continuously sync data from Neon to ClickHouse using Change Data Capture (CDC) and use ClickHouse to run real-time customer-facing analytics on your transactional data.
- Application source of truth+ Data Warehousing: By sending CDC to a data warehouse like ClickHouse, Neon can serve as the system of record for your web application. This keeps your application code simpler and Postgres lightweight, ensuring super low latency for your transactional applications. Simpler code and you still benefit from the analytical power of ClickHouse.
PeerDB brings Neon and ClickHouse closer together
Now that we understand Neon and ClickHouse can be a great duo for developers, let’s see how we can bring them together. The answer is PeerDB – PeerDB is a leading provider of Change Data Capture (CDC) solutions for Postgres. Their technology enables real-time data replication and synchronization, helping businesses keep their data up-to-date and accessible for analytics and decision-making.
A few months ago, ClickHouse acquired PeerDB to make it seamless for customers to move data from their Postgres databases to ClickHouse and power real-time analytics and data warehousing use cases.
PeerDB natively supports Neon Postgres as a source. This enables customers to replicate their transactional data from Neon to ClickHouse for real-time analytics.
Under the hood, PeerDB relies on Postgres logical replication for CDC. It implements multiple Postgres native optimizations to provide a 10x faster and more reliable replication experience to ClickHouse. The good news is that Neon supports Postgres logical replication out-of-the-box. Hence all the PeerDB optimizations can be extended to Neon out-of-the-box.
PeerDB is available in both open source and cloud offerings, and the ClickHouse team is actively working on integrating PeerDB into ClickPipes, the native ingestion service for ClickHouse. PeerDB will power the Postgres CDC connector in ClickPipes. Neon is planned to be supported as a source in ClickPipes.
Streaming data from Neon to ClickHouse using PeerDB
Now, let’s quickly walk through how easy it is to set up replication from Neon to ClickHouse using PeerDB. Visit the PeerDB docs for a complete step-by-step walkthrough.
Configuring Neon (origin)
Detailed instructions here
Create a user and publication in Neon
In the Neon SQL console, create a user for PeerDB with the necessary permissions and set up a publication to start tracking data changes:
Enable logical replication in Neon
Go to your Neon project settings in the console, and enable logical replication. Also, verify the settings for wal_level
, max_wal_senders
, and max_replication_slots
:
Get connection details
Copy the connection string details from your Neon dashboard:
Creating a Neon peer
In PeerDB’s UI, create a Neon Postgres peer using the connection details. Validate and create the peer:
Configuring ClickHouse (destination)
Detailed instructions here
Create a destination database
Via the ClickHouse SQL Console, create a database for syncing:
Create a ClickHouse user with the necessary permissions
Set up a peerdb_user
with permissions for INSERT
, SELECT
, DROP
, and ALTER
:
Create a ClickHouse peer
In PeerDB UI, configure the peer with your connection details, and click “Create” once validating the connection:
Setting up CDC from Neon to ClickHouse
Detailed instructions here
Once you’ve configured Neon and ClickHouse in PeerDB, the data synchronization follows these steps:
1- Go to the Mirrors tab in the PeerDB UI:
2- Create a new Mirror and select CDC as the mirror type for real-time replication
3- Name the Mirror (e.g., neon_to_clickhouse
)
4- Select the source and destination peers you configured for Neon and ClickHouse
5- Configure sync settings according to the instructions here (e.g., initial snapshot, publication, and replication slot names)
6- Select the Postgres tables you want to replicate
7- Validate and Create the mirror. This will start the continuous sync between Neon and ClickHouse, ensuring real-time data updates.
Getting started
To get started:
- Create accounts with Neon and ClickHouse.
- Sign up PeerDB Cloud for seamless CDC from Neon to ClickHouse.
If you have any questions, we’ll be happy to help in the ClickHouse Slack and the Neon Discord.