Data Management at Scale

Piethein Strengholt
19 min read · Jul 11, 2023

Over the last few years, decentralized architectures have emerged as the new paradigm for managing data at scale. They are meant to scale the distribution of data across teams, while aiming for higher value and a faster time to market.

In this article, I would like to unpack how to implement such a federated design. We’ll begin with a short reflection on your data strategy and on whether you should start with a centralized or decentralized approach. Then we’ll go through the phases of implementing a data architecture: setting the strategic direction, laying the foundation, and professionalizing your capabilities.

Note that much of the content comes from the book Data Management at Scale, 2nd edition. What follows is an abridged version; if you would like to learn more or see the full depth, I encourage you to read the book.

A Brief Reflection on Your Data Journey

Before you jump on the data-driven bandwagon, ensure you have a data strategy in place. Whether you’re starting small or have a large set of use cases to implement, without a plan you’re doomed to fail. I see countless enterprises fail because they’re unable to bring everybody on board or to articulate their strategy, because they don’t include business users, or because they lack support from senior leadership. I can’t emphasize this enough: before you start implementing any change, ensure you have a balcony view and a clear map guiding you in the right direction.