
The downside is then you have many, many DBs to fight with, to monitor, to tune, etc.

This is rarely a problem when things are small, but as they grow, the bad schema decisions made by empowering DBA-less teams to run their own infra become glaringly obvious.



Not a downside to me. Each team maintains their own DB and pays for their own choices.

In the kitchen sink model all teams are tied together for performance and scalability, and some bad apple applications can ruin the party for everyone.

Seen this countless times doing due diligence on startups. The universal kitchen sink DB is almost always one of the major tech debt items.


> Not a downside to me. Each team maintains their own DB and pays for their own choices.

This is how you end up with the infamous "jira and confluence have two different markdown flavors" issue.


I don't think Jira and Confluence having different markdown setups is due to them not sharing a database. It is just poor product management from Atlassian.


My point is that forcing these arbitrary decisions is poor product management.


I’m a DBRE, which means it’s somehow always my fault until proven otherwise. And even then, it’s usually on me to work around the insane schema dreamt up by the devs.

Multi-tenant DBs can work fine as long as every app has its own users, everyone goes through a connection pooler / load balancer, and every user has rate limits. You want to write shitty queries that time out? Not my problem. Your GraphQL BFF bullshit is trying to make 10,000 QPS? Nope, sorry, try again later.

EDIT: I say “not my problem,” but as mentioned, it inevitably becomes my problem. Because “just unblock them so the site is functional” is far more attractive to the C-Suite than “slow down velocity to ensure the dev teams are doing things right.”
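To make the per-user rate limit idea above a bit more concrete, here is a minimal token-bucket sketch in Python. It is only an illustration under stated assumptions: `TokenBucket`, `PerUserLimiter`, and the example limits are made-up names and numbers, and in a real setup this logic would more likely sit in the pooler or proxy layer than in application code.

```python
import time


class TokenBucket:
    """Allows `rate` queries per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerUserLimiter:
    """One bucket per database user, so one noisy app cannot starve the rest."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, user, now=None):
        # `now` is injectable so the behaviour can be tested deterministically.
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(user)
        if bucket is None:
            bucket = self.buckets[user] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

Because each app connects as its own user, a query that gets `False` back can be rejected or queued without touching anyone else's budget.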


Or you just avoid doing multi-tenant from the start, and none of those become your problem to unblock. What’s the downside?


Done that as well; it still becomes my problem because teams without RDBMS knowledge eventually break it, and… then I get paged.

Full Stack is a lie, and the sooner companies accept that and allow people to specialize again, and to pay for the extra headcount, the better off everyone will be.


I disagree, I guess. Multiple companies I’ve worked at have broken up their shared DB into many DBs that individual teams own the operations of, and it works just fine, at significant scale in both traffic and # of eng. No central DBAs needed: smaller databases require much less skill to manage. The teams that own them learn enough.


I agree. My gripe was everybody in the same schema with a global “app” user.


You forgot the modern mantra - dev team is always right!


Bad schema decisions are made regardless of whether you’re one database or 50. At least with many databases the problems are localized.


But then the DB Team – if you have one – is responsible for 50 databases, each full of their own unique problems.

This will undoubtedly go over poorly, but honestly I think every data decision should be gated through the DB Team (again, if you have them). Your proposed schema isn’t normalized? Straight to jail. You don’t want to learn SQL? Also straight to jail. You want to use a UUIDv4 as a primary key? Believe it or not, jail.

The most performant and referentially sound app in the world, because of jail.


No single team should be responsible for all databases. If such a team exists, they will either become a bottleneck for every other team (by carefully auditing each schema change), or become bloated and sit unused 90% of the time, or (most commonly) become nearly useless or even harmful. In that last case they will not really be responsible and will act as a dumb proxy: they will add latency to schema updates, but they will not bother to check them very well (why would they? they are not responsible for the whole product, just for the database), and some DB refactorings/migrations will be abandoned entirely because the DB team makes them too painful.

DB team could act as an auditor and expert support, but they should never be fully responsible for DB layer.


> If such a team exists, they will either become a bottleneck for every other team (by carefully auditing each schema change)

That’s the point. Would you send a backend code review to a frontend team? Why do DBs not deserve domain expertise, especially when the entire company depends on them?

> they are not responsible for the whole product, just for the database

I assure you, that’s a lot to be responsible for at scale.

> DB team could act as an auditor and expert support, but they should never be fully responsible for DB layer.

Again, the issue here is when the DB gets borked enough that a SME is required to fix it, they effectively do become responsible, because no CTO is going to accept, “sorry, we’ll be down for a couple of days because our team doesn’t really know how this thing works.”

And if your answer is, “AWS Premium Support,” they’ll just tell you to upsize the instance. Every time. That is not a long-term strategy.


What's the best non-serial option for PKs in your view? Or do you prefer a dual-PK approach?


What’s wrong with uuidv4 as PK?


Serial integers always work better than any uuid as PKs, but the thing with uuid4 is that it disrupts any kind of index or physical ordering you decide to put on your data.

Uuids are really for external communication, not in-system organization.


FWIW this isn’t true anymore with newer uuid schemes like v7 that are roughly time sortable.
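A rough sketch of why v7-style IDs are time sortable: the top 48 bits hold a Unix millisecond timestamp, so IDs generated in a later millisecond compare greater. `uuid7_like` is a hypothetical helper written for illustration, not a library function, and this sketch makes no ordering promise for IDs created within the same millisecond.

```python
import os
import uuid
from typing import Optional


def uuid7_like(timestamp_ms: int, rand: Optional[bytes] = None) -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit ms timestamp, then random bits."""
    if rand is None:
        rand = os.urandom(10)
    if len(rand) != 10:
        raise ValueError("rand must be exactly 10 bytes")
    # Timestamp occupies the most significant 48 of the 128 bits.
    value = (timestamp_ms & ((1 << 48) - 1)) << 80
    value |= int.from_bytes(rand, "big")
    # Overwrite the version nibble (bits 76-79) with 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Overwrite the variant bits (bits 62-63) with binary 10.
    value &= ~(0x3 << 62)
    value |= 0x2 << 62
    return uuid.UUID(int=value)
```

Since the timestamp leads, sorting these values (as integers or as hex strings) groups them by creation time, which is what keeps new rows landing near the end of an index instead of at random positions the way v4 does.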


Serial index forces a synchronisation point on every entity that can create records. If this is only ever a single database that’s fine but plenty of apps can’t scale this way.


They don't. Clustered databases deal with parallel generation of them just fine.

They require periodic synchronization, which isn't a big deal at all and is required by many other database features.


If you have a sharded DB, each instance can get its own range of ints, which are periodically refreshed.

PlanetScale uses int PKs [0], and they seem to have scaled just fine.

[0]: https://github.com/planetscale/discussion/discussions/366
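The range-per-instance idea can be sketched roughly like this. `RangeAuthority` and `ShardIdGenerator` are hypothetical names, the block size is arbitrary, and this is not a description of how PlanetScale or any particular database actually implements it.

```python
import threading


class RangeAuthority:
    """Stands in for whatever central coordinator hands out ID blocks."""

    def __init__(self, block_size=1000, start=1):
        self._next = start
        self._block_size = block_size
        self._lock = threading.Lock()

    def lease(self):
        # Hand out a half-open block [lo, hi) that no other shard will get.
        with self._lock:
            lo = self._next
            self._next += self._block_size
            return lo, lo + self._block_size


class ShardIdGenerator:
    """Issues integer IDs locally from the shard's currently leased block."""

    def __init__(self, authority):
        self._authority = authority
        self._next, self._end = authority.lease()

    def next_id(self):
        if self._next >= self._end:
            # The only point where a shard has to talk to the coordinator.
            self._next, self._end = self._authority.lease()
        value = self._next
        self._next += 1
        return value
```

Each shard only synchronizes once per block rather than once per row, at the cost of IDs being unique but not globally ordered across shards.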


Anything non-k-sortable in a B[+,-]tree will cause a ton of page splits. This is a more noticeable performance impact in RDBMS with a clustered index (MySQL's InnoDB, MS SQL Server) [0], but it also impacts Postgres [1] in multiple [2] ways.

[0]: https://www.percona.com/blog/uuids-are-popular-but-bad-for-p...

[1]: https://www.cybertec-postgresql.com/en/unexpected-downsides-...

[2]: https://www.2ndquadrant.com/en/blog/on-the-impact-of-full-pa...
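A toy way to see the locality difference, under heavy assumptions: the sketch below does not model a real B-tree or count actual page splits. It only measures what fraction of inserts land after every existing key, as a crude proxy for "this insert hits the rightmost leaf" versus "this insert lands somewhere in the middle of the index". `tail_insert_fraction` is a made-up helper name.

```python
import bisect


def tail_insert_fraction(keys):
    """Fraction of inserts that land after every key already present."""
    index = []  # sorted list standing in for the ordered index
    tail_hits = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            tail_hits += 1  # appended at the right edge
        index.insert(pos, key)
    if not index:
        return 0.0
    return tail_hits / len(index)
```

Feeding it monotonically increasing keys gives 1.0, while random 128-bit keys (a stand-in for UUIDv4) give a fraction close to zero, meaning almost every insert goes into the middle of already-written data.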


It's because I hate databases and programming separately. I would rather have slow code than have to dig into some database procedure. It's just another level of separation that's too mentally hard to manage. It's like... my queries go into a VM and now I have to worry about how the VM is performing.

I wish there were (and maybe there is) a programming language with first-class database support. I mean really first class not just let me run queries but almost like embedded into the language in a primal way where I can deal with both my database programming fanciness and my general development together.

Sincerely, someone who inherited a project from a DBA.


The language you’re talking about is APEX. I believe it comes from Oracle and is the backend language for Salesforce development. You’ll like the first class database support but that’s about it.


> I mean really first class not just let me run queries but almost like embedded into the language

Not quite embedded into the OS, but Django is a damn good ORM. I say that as a DBRE, and someone obsessed with performance (inherent issues with interpreted languages aside).


The closest thing to what you're describing is Prisma in Node. It generates a TypeScript file from your schema so you get code completion on your data. And it sits somewhere between a query builder and a traditional ORM.

I have worked in many languages with many ORMs and this has been my personal favorite.


Until Prisma can manage JOINs [0] there is no way I can recommend it.

[0]: https://github.com/prisma/prisma/discussions/12715


The support for JOINs is coming, currently under a feature flag [0]

[0]: https://github.com/prisma/prisma/issues/5184#issuecomment-18...


But the migration stuff is a horrible joke. No way to just roll back a broken migration. https://www.prisma.io/docs/orm/prisma-migrate/workflows/gene...



