> doesn’t end here. BigQuery has background processes that constantly look at all the stored data and check whether it can be optimized even further. Perhaps the data was initially loaded in small chunks, and without seeing all the data, some decisions were not globally optimal. Or perhaps some parameters of the system have changed, and there are new opportunities for storage restructuring. Or perhaps the Capacitor models have become better trained and tuned, making it possible to enhance existing data. Whatever the case, when the system detects an opportunity to improve storage, it kickstarts data conversion tasks. These tasks do not compete with queries for resources; they run completely in parallel and don’t degrade query performance. Once the new, optimized storage is complete, it atomically replaces the old storage data — without interfering with running queries. The old data is garbage-collected later.
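The "build aside, swap atomically, collect garbage later" pattern the quote describes can be sketched in a few lines. This is a minimal illustration, not BigQuery's actual mechanism; the `segment`/`store` types and the sort-as-reencoding step are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// segment is an immutable storage layout (hypothetical names for illustration).
type segment struct {
	layout string
	rows   []int
}

// store publishes the current segment through an atomic pointer, so readers
// never block on the background rewriter.
type store struct {
	current atomic.Pointer[segment]
}

// rewrite builds a fully optimized copy off to the side, then atomically swaps
// it in. In-flight queries keep using whatever pointer they already loaded;
// the old segment becomes garbage once its last reader is done (here Go's GC
// reclaims it, standing in for the deferred file garbage collection described
// in the quote).
func (s *store) rewrite(optimize func(*segment) *segment) {
	old := s.current.Load()
	s.current.Store(optimize(old)) // atomic replacement; readers are unaffected
}

func main() {
	var s store
	s.current.Store(&segment{layout: "row-major", rows: []int{3, 1, 2}})

	// Background task: re-encode into a "better" layout. A simple insertion
	// sort stands in for real re-encoding (re-sorting, re-compressing, etc.).
	s.rewrite(func(old *segment) *segment {
		sorted := append([]int(nil), old.rows...)
		for i := range sorted {
			for j := i; j > 0 && sorted[j-1] > sorted[j]; j-- {
				sorted[j-1], sorted[j] = sorted[j], sorted[j-1]
			}
		}
		return &segment{layout: "sorted", rows: sorted}
	})

	cur := s.current.Load()
	fmt.Println(cur.layout, cur.rows) // sorted [1 2 3]
}
```

The key property is that the swap is a single pointer store: queries either see the old layout or the new one, never a half-rewritten mix.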
I wonder if they could share more details on how this is handled.
There are a few ways to do it. It is not that difficult in principle: you need to collect selectivity statistics for both writes and queries, and have a storage engine flexible enough to rewrite layouts on the fly. The mechanics are pretty simple, since rewriting a shard can be viewed as a trivial subset of splitting or replicating a shard under load. Some closed-source databases also do this to one extent or another, adapting the layout to the load.
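A minimal sketch of the stats-driven trigger described above: track which columns queries filter on, and suggest a shard rewrite when one column dominates the workload. All names and the threshold are hypothetical, not taken from any particular engine.

```go
package main

import "fmt"

// shardStats accumulates per-column filter frequency observed from scans
// (hypothetical structure for illustration).
type shardStats struct {
	scans    int            // total scans against the shard
	filtered map[string]int // scans that filtered on a given column
}

// observeScan records one scan and the columns it filtered on.
func (s *shardStats) observeScan(filterCols ...string) {
	s.scans++
	for _, c := range filterCols {
		if s.filtered == nil {
			s.filtered = map[string]int{}
		}
		s.filtered[c]++
	}
}

// suggestSortKey returns a column worth re-sorting the shard by, if any single
// column appears in at least `threshold` fraction of scans.
func (s *shardStats) suggestSortKey(threshold float64) (string, bool) {
	for col, n := range s.filtered {
		if float64(n)/float64(s.scans) >= threshold {
			return col, true
		}
	}
	return "", false
}

func main() {
	var st shardStats
	for i := 0; i < 9; i++ {
		st.observeScan("user_id") // 90% of scans filter on user_id
	}
	st.observeScan("ts")

	if col, ok := st.suggestSortKey(0.8); ok {
		// A background task would now rewrite the shard sorted/clustered by col.
		fmt.Println("rewrite shard sorted by", col)
	}
}
```

In a real engine the rewrite itself would reuse the same machinery as shard splits or replication, as the comment notes; the decision layer above is the only new piece.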
You can do adaptive layout rewriting at either the page or the shard level, depending on the design. There are advantages and disadvantages to both models. Some designs can do the layout conversion in place without the need for garbage collection, but that is much trickier to get right.