Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's unfortunate to namesquat on 'boolean' if your elements have three possible values. Just call it 'ternary logic' and let individuals decide which system to use.


Who's name squatting boolean? The bool column is exactly what it claims to be, you just have the option of introducing unknowability if you define it allow nulls.


The distinction is that not all formal logic systems are Boolean. Meaning that it is nonsensical and confusing to use "Boolean" as a generic synonym for "truth value" in the same way that it's nonsensical to use "Pantone" as a generic synonym for "color value", including when the specific kind of color value you're talking about is CMYK or HSV and definitely not Pantone.


There are two values, TRUE and FALSE. Null is not a value, it the the lack of a value.

You have a list of people and you ask if they own a car. You didn't get around to asking George, so that, somehow means he owns a car because you are using boolean logic? Or does it mean he doesn't own a car, because you are using boolean logic?

No, it means you haven't gathered this data point and don't know.


If there are exactly two possible values, TRUE and FALSE, you're working with Boolean logic.

If there are three possible values, TRUE, FALSE and NULL (unknown), then you're probably working with something like Kleene logic. You can't truly be working with Boolean logic, though, any more than you can be doing integer arithmetic when 15.37 is a member of your domain.

To put it another way, if we're talking about the mathematical definition of boolean algebra and not just some programming language's quirky implementation that happens to be called "bool", then boolean values would by definition be non-nullable. That logic that allows nulls has been implemented using the same unified set of keywords and operator names is a pragmatic decision that simplifies the language implementation and spec, not a principled one that tries to be pedantic about mathematical terminology.


> That logic that allows nulls has been implemented using the same unified set of keywords and operator names is a pragmatic decision

That's why it's name-squatting. Rather than introduce a 'kleene' datatype & operations, and let the user opt-in, they decided that in our 'bool' world, the following is not a truism:

  a = a or a <> a


> No, it means you haven't gathered this data point and don't know.

This is how it should be.

> Somehow means he owns a car because you are using boolean logic?

This is how it unfortunately is. There are 3 people, and there are 3 people who don't have a NULL car. Therefore George has a car.

  CREATE TABLE people(name text, carId uuid);
  INSERT INTO people values('Bill', '40c8a2d7-1eb9-40a9-b064-da358d6cee2b');
  INSERT INTO people values('Fred', '3446364a-e4a5-400f-bb67-cbcac5dc2254');
  INSERT INTO people values('George', NULL);

  SELECT Count(*) FROM people WHERE name NOT IN (
    SELECT name FROM people WHERE carId = NULL
  );
Elsewhere people have argued that NULL propagates, so that your small unknowns infect larger queries. I could get behind that line of thinking, but the query above confidently returns 3.


> There are 3 people, and there are 3 people who don't have a NULL car.

This is not what you are asking with your query: as someone else stated, NULL is meant to be "UNKNOWN", or "it could be any valid value".

So nothing is ever equal to something that can be anything, because even another NULL (i.e. unknown) value is in general different.

So in the line

    SELECT name FROM people WHERE carId = NULL
the condition will always be false. Now if instead if meant to search for the rows where carId is actually unknown you have to write

    SELECT name FROM people WHERE carId is NULL
And your query will return as one may expect 2.


> So nothing is ever equal to something that can be anything

It's worse than that. It's something is neither equal nor not-equal to something else.

Whether you can implement something differently as a workaround is immaterial. It's weird, per the title.


But it is a boolean value, there's only two possible values TRUE and FALSE. But because it's SQL you can define any column as TYPE | NULL.

You could say that a boolean column with a NULL value is FALSE like how a lot of programming languages coerce it but if you wanted that you would just make a default of FALSE. The meaning of NULL in general being "value not specified" lends itself pretty nicely to "either true or false."


What I want is for e.g. "x OR y" where y is NULL (and/or of nullable type) to be an error rather than silently giving surprising results. Just like in a decent programming language I can't do x || y where x and y are of type boolean?, I have to explicitly handle the case where one or other of them is null (or e.g. write x!! || y!! - and that will at still error if either is null rather than silently evaluating to a funny result).


The meaning of NULL in general being "value not specified" lends itself pretty nicely to "either true or false."

You mean neither true or false?


I think I mean either. So yes NULL is a distinct value from true and false so I think it's also right to say it's neither true nor false. But the value NULL does represent is roughly "could be true or false, we don't know yet."


It could mean anything or nothing depending on context, which is part of the problem. (Plenty of people think the meaning of NULL is clear and obvious and consistent, unfortunately they give different answers about what that meaning is)


It could also be neither. It's whatever you define it to be. Null could mean you don't know if it's true or if it's false, or it could mean you know it's neither true nor false.


This is the map territory relation problem.

We bring in the 'Trinity of thought' a priori and forgot about the advantages and costs.

You have the principal of excluded middle, principal of identity, and the principal of non-contradiction.

If your problems territory fits things are convenient.

Once you introduce known, unknown, or unknowable unknowns, the classic model falls down.

Unfortunately you have to choose what to sacrifice based on the context of the problem at hand.

This is exactly where Rice's theorm, the halting problem etc.. arise.

AC from ZF(C) and IID from statistics bring PEM, which gives or forces the classical behavior.

The 'non-trivial' properties of Rice's theorm are exactly those properties where PEM doesn't hold.

The frame problem in machine learning is also related. As is Gödels (in) completeness theories.

Sometimes you can use failure as negation, other times you can use methods like accepting that NULL is an unknown.

But you have to decide what you can sacrifice and still solve real problems.

I think thinking of a problem and it's complement is better.

E.G P is the truthy T and co-P is the truthy F. Obviously P=co-P means that the traditional method works, but jump to NP vs co-NP and ot doesn't.

NP chained ORs, co-NP is chained ANDs is another way of thinking about it.


It could be true or false, but it’s unknown. For example. a user doing a survey is yet to fill in the answer. That doesn’t mean there is no answer, it’s just unrecorded.


Maybe GP was edited, but it doesn't use the word "boolean" anywhere.


Correct, I edited "boolean" out prior to ^^P's comment. My apologies.


A null boolean good very well mean true, if the default value is true (in the code if not in the DB).


Is this just being pedantic wrt uninitialized values?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: