Also radix is a pretty special case because it assumes you want to sort by some relatively uninteresting criteria (be honest, how often are you sorting things by a number and only a number?). What happens in the real world is that the size of fields you want to sort on tends to grow in log n. If you had half a billion John Smiths using your service you’d use some other identifier that is unique, and unique values grow in length faster than logn.
I’m glad other people are having this conversation now and not just me.
Here’s a different take: if you find yourself needing to sort a lot of data all the time, maybe you should be storing it sorted in the first place.
Stop faffing about with college text book solutions to problems like you’re filling in a bingo card and use a real database. Otherwise known as Architecture.
I haven’t touched a radix sort for almost twenty years. In that time, hardware has sprouted an entire other layer of caching. I bet you’ll find that on real hardware, with complex objects, a production-ready Timsort is competitive with your hand written radix sort.
I develop programs that take unsorted data as input, and sort them as part of the processing... I haven't touched a database in 10 years, because I do graphics programming, nothing that needs a database. Also, I don't actually use radix sort but a hand written CUDA counting sort that beats the crap out of radix sort for my given use cases, since counting sort is often used as part of radix sort, but simple counting sort is all I need.
> and use a real database.
I'm not going to use a database to store tens of billions of 3D vertices. And I'm not going to use a database to sort them, because it's multiple times, probably orders of magnitude faster to sort them yourself.
It's weird to impose completely out-of-place suggestions onto someone who does something completely different to what you're thinking of.
Radix sort works for numbers and so also for characters. Then it also works for lexicographic ordering of finite lists of any type it supports. So it can sort strings. But also tuples like (int, string, float). So it can actually sort all plain old data types.
MSB Radix sort is a pretty good fit for this John Smiths input and will definitely outperform a comparison-based sort that has to check that "John Smith" equals itself n log n times.
I’m glad other people are having this conversation now and not just me.