Developers have launched OpenDuck, a new open-source project designed to bring distributed DuckDB capabilities to any environment. The project implements two architectural ideas associated with MotherDuck, differential storage and hybrid execution, so that they can be used and extended independently of the commercial service.
OpenDuck allows users to execute queries that split processing between a local machine and a remote worker. Through a custom DuckDB extension, the system identifies which parts of a query plan can run locally and which require remote resources, using 'bridge operators' to move only necessary intermediate results across the network.
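The idea of placing bridge operators at the boundary between local and remote work can be illustrated with a toy plan tree. The sketch below is not OpenDuck's planner, which lives inside a DuckDB extension; the `Node` class, the `site` labels, and the `insert_bridges` function are hypothetical names chosen for illustration. It assumes each operator has already been tagged with where it must run, and it inserts a bridge on every edge that crosses from one side to the other.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical plan-node model, for illustration only.
@dataclass
class Node:
    op: str                      # e.g. "scan", "filter", "join"
    site: str                    # "local" or "remote": where the operator runs
    children: List["Node"] = field(default_factory=list)


def insert_bridges(node: Node) -> Node:
    """Return a copy of the plan with a bridge node on every cross-site edge."""
    new_children = []
    for child in node.children:
        child = insert_bridges(child)
        if child.site != node.site:
            # The bridge sits on the parent's side and receives only the
            # child's (already reduced) output across the network.
            child = Node(
                op=f"bridge({child.site}->{node.site})",
                site=node.site,
                children=[child],
            )
        new_children.append(child)
    return Node(node.op, node.site, new_children)


def render(node: Node, depth: int = 0) -> List[str]:
    """Pretty-print the plan as indented lines."""
    lines = ["  " * depth + f"{node.op} [{node.site}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# A local join whose right side is filtered remotely before crossing the wire.
plan = Node("join", "local", [
    Node("scan local_orders", "local"),
    Node("filter", "remote", [Node("scan remote_events", "remote")]),
])
bridged = insert_bridges(plan)
print("\n".join(render(bridged)))
```

In this toy plan the filter runs next to the remote scan, so only the filtered rows pass through the single bridge into the local join.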
Open protocol and backend flexibility
The project uses a minimal protocol built on gRPC and Arrow IPC to stream results. Because the protocol is open, developers can replace the included Rust gateway with any backend that supports gRPC and Arrow, which avoids vendor lock-in.
OpenDuck stores data as append-only layers: PostgreSQL holds the metadata, and object storage holds the immutable data layers. Snapshots give readers a consistent view of the data, so a single write path can serve many concurrent readers.
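A rough way to picture this storage model is a log of immutable layers in which a snapshot is simply a position in that log. The sketch below is a toy model under that assumption, not OpenDuck's implementation: the `LayeredStore` class and its methods are hypothetical, a dictionary stands in for object storage, and a list stands in for the PostgreSQL metadata.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]


class LayeredStore:
    """Toy append-only store: immutable layers plus an ordered layer log."""

    def __init__(self) -> None:
        # Stand-in for object storage: layer id -> immutable tuple of rows.
        self._objects: Dict[str, Tuple[Row, ...]] = {}
        # Stand-in for the metadata database: ordered, append-only layer ids.
        self._layer_log: List[str] = []

    def commit(self, rows: List[Row]) -> int:
        """Write a new immutable layer and return the resulting snapshot id."""
        layer_id = f"layer-{len(self._layer_log):04d}"
        self._objects[layer_id] = tuple(rows)   # never modified after this
        self._layer_log.append(layer_id)
        return len(self._layer_log)             # snapshot = number of visible layers

    def snapshot(self) -> int:
        """Return the current snapshot id without writing anything."""
        return len(self._layer_log)

    def read(self, snapshot_id: int) -> List[Row]:
        """Read every row visible at the given snapshot."""
        rows: List[Row] = []
        for layer_id in self._layer_log[:snapshot_id]:
            rows.extend(self._objects[layer_id])
        return rows


store = LayeredStore()
first = store.commit([("a", 1)])
second = store.commit([("b", 2)])

# A reader pinned to the first snapshot is unaffected by the later commit.
print(store.read(first))
print(store.read(second))
```

Because layers are never rewritten, a reader holding an older snapshot id keeps seeing the same rows no matter how many commits follow, which is what lets one writer coexist with many readers in this model.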
While inspired by MotherDuck's commercial cloud service, OpenDuck is not wire-compatible with it. The project functions as a standalone, self-hosted alternative with its own 'openduck:' attachment scheme and an open-source implementation of the StorageExtension and Catalog interfaces.
Unlike Arrow Flight SQL, which acts as a generic database protocol, OpenDuck is purpose-built for DuckDB. It integrates directly into the DuckDB catalog, allowing remote tables to participate in joins, CTEs, and query optimization as if they were local.