Incorrect. There is still plenty to discuss about the current state of self-driving tech and how it compares to a trained human operator's abilities in identical situations. There's also the evergreen topic of liability in the case of accidents, which has yet to be resolved to my satisfaction.
No, actually, they probably can't. There is no verifiable way to remove the data from the model apart from completely removing all instances of the information from the training data and retraining. The project you linked only describes a selective fine-tuning approach.
Until you get models with completely disentangled feature spaces such that you know that the influence of a piece of data is completely removed (at the limit this is something like an embedding DB), there is absolutely no way you can claim you’ve removed the data from the model.
At most, these efforts will amount to data laundering where it will be impossible to prove that a piece of data was used to train the model, not provide conclusive proof that it was removed.
Libgen / Scihub or not, if the model can provide details about the book beyond high-level info like a summary, and no explicit deal has been made with the publisher, you can make a strong argument that it is plagiarism.
Even if bits and pieces of the book text are distributed across the internet and you end up picking up portions of the book, you still read the book.
It is extremely sad but ChatGPT will be taken down by the end of this year and replaced by a highly neutered model next year.
I'm not a lawyer and obviously we won't get any definite answer unless it actually goes to court, all of this is just hand waving and guessing.
But I think that unless GPT starts reciting large parts outside of the context of learning/education/research, reciting smaller snippets would fall into "fair use" and not be illegal.
I think you can. It is a separate "crime". You would get two cases: one for fair use (which, if you are quoting, commenting, reviewing, or generally repurposing content, may in fact be fair) and a second for license/terms breach and/or illegally obtaining the work (for example, if you stole it from a bookstore).
If you recite enough small snippets, you make a large one.
Especially with ChatGPT you can probe the model by asking certain questions about the material at hand to see if it has seen the entire book.
Also, you don't have to be able to recite the book verbatim for it to have been in your training set. The snippets I am referring to are on the training-data side.
If I read a book and then write a summary, is that plagiarism? What's the difference? I am legitimately not familiar with copyright law, but real lawyers seem to think it is unclear whether training on copyrighted data is illegal (in Japan it's definitely not).
Yea you can create the nicest office of all time, still wouldn't want to live there or trade it for my own private personal space.
Way happier to work hard and contribute to the company when I don't feel forced into an office that isn't conducive to my life.
Yea the CEO's response was tone deaf and not tactful at all.
However, it seems insane that people are complaining about this, for the following reasons:
1. Reddit is not profitable, it is literally bleeding money.
2. No Reddit = No 3P apps to access Reddit.
3. The discussion that should be had is whether it is sustainable for Reddit to keep running its servers and whether the recent decisions were made in favor of additional growth or of survival.
reddit Premium costs $6/month and eliminates ads. That's an upper bound on how much reddit values serving ads to users. Include an API key with reddit Premium that users can plug into the app of their choice. reddit gets the value per user, gets the analytics per-user, and it's hard to complain about price gouging given this was the previous price for ad-free service.
Reddit makes money from ads having reach and they're hoping that only a small percentage get premium so that they can continue to make money off of ads.
Why would they reduce the reach their ads are having by making it easy for people to opt-out?
Their API pricing probably reflects the money they are losing by not getting the users on their own platform and showing them ads.
Their hope is that people would switch over to the main app so that their advertisers can pay them more.
It's this perception that CEOs can do no right by the people whenever there is a financial interest in doing otherwise.
Whether you believe this is the case or not, you have to agree that all nuance goes out the window and the mob is only satisfied when it gets what it asks for, not what is in its best interests.
They certainly have a poor track record. Most of the CEOs and managers I talk to think that it is a noble pursuit for them to be profitable because the free market determines what is moral and in the public interest. It's a pretty commonly held belief in the business world - it's very similar to a religious faith. One manager even believed that scientific discovery only occurred to fulfill the objectives of business managers. Most scientific research in the US is publicly funded, so it isn't even funded by business. But he didn't accept that explanation.
I was half joking when I was talking about nationalizing social media, but we should ask ourselves why we as a society are destroying these tools that we all used and enjoyed using because a bunch of money changed hands. Twitter is undeniably worse off, so is Facebook. Surely we have proved that we can have these things if we wanted to. At the very least, we should have net neutrality.
I don’t think they’re mutually exclusive. Next word prediction IS reasoning. It cannot do arbitrarily complex reasoning but many people have used the next word prediction mechanism to chain together multiple outputs to produce something akin to reasoning.
What definition of reasoning are you operating on?
I can write a program in less than 100 lines that can do next word prediction and I guarantee you it's not going to be reasoning.
Note that I'm not saying LLMs are or are not reasoning. I'm saying "next word prediction" is not anywhere near sufficient to determine if something is able to reason or not.
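The under-100-lines claim is easy to make concrete. Here's a toy sketch (corpus and word choices are made up for illustration): a bigram model that does "next word prediction" by pure frequency counting, with no plausible claim to reasoning.

```python
# Toy bigram next-word predictor: counts which word most often follows
# each word in a training corpus, then predicts the most frequent follower.
from collections import defaultdict, Counter

def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for w, nxt in zip(words, words[1:]):
        counts[w][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny illustrative corpus; "cat" follows "the" twice, "mat" only once.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # prints "cat"
```

Whether something trained the same way at vastly larger scale crosses into "reasoning" is exactly the open question in this thread; this sketch just shows that the mechanism alone proves nothing.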
Semantic reasoning, being able to understand what a symbol means and ascertain truth from expressions (which can also mean manipulating expressions in order to derive that truth). As far as I understand tensors and transformers that's... not what they're doing.
If you understand transformers, you’d know that they’re doing precisely that.
They’re taking a sequence of tokens (symbols), manipulating them (matrix multiplication is ultimately just moving things around and re-weighting - the same operations that you call symbol manipulations can be encoded or at least approximated there) and output a sequence of other tokens (symbols) that make sense to humans.
You use the term “ascertain truth” lightly. Unless you’re operating in an axiomatic system or otherwise have access to equipment to query the real world, you can’t really “ascertain truth”.
Try using ChatGPT with GPT-4 enabled and present it with a novel scenario with well-defined rules. That scenario surely isn't present in its training data, but it will be able to show signs of making inferences and breaking the problem down. It isn't just regurgitating memorized text.
Oh cool, so we can ask it to give us a proof of the Erdős–Gyárfás conjecture?
I’ve seen it confidently regurgitate incorrect proofs of linear algebra theorems. I’m just not confident it’s doing the kind of reasoning needed for us to trust that it can prove theorems formally.
Just because it makes mistakes in a domain that may not be part of its data and/or architectural capabilities doesn't mean it can't do what humans consider "reasoning".
Once again, I implore you to come up with a working definition of "reasoning" so that we can have a real discussion about this.
Many undergraduates also confidently regurgitate incorrect proofs of linear algebra theorems, do you consider them completely lacking in reasoning ability?
> Many undergraduates also confidently regurgitate incorrect proofs of linear algebra theorems, do you consider them completely lacking in reasoning ability?
No. Because I can ask them questions about their proof, they understand what it means, and can correct it on their own.
I've seen LLMs correct their answers after receiving prompts that point out the errors in prior outputs. However, I've also seen them give more wrong answers. That tells me they don't "understand" what it means for an expression to be true or how to derive expressions.
For that we'd need some form of deductive reasoning; not generating the next likely token based off a model trained on some input corpus. That's not how most mathematicians seem to do their work.
However, I think it seems plausible we will have a machine learning algorithm that can do simple inductive proofs, and that will be nice. As for the original article, it seems like they're taking a first step toward this.
In the meantime, why should anyone believe that an LLM is capable of deductive reasoning? Is a tensor enough to represent semantics, such that I can dispatch a theorem to an LLM and have it write a proof? Or do I need to train it on enough proofs first before it can start inferring proof-like text?
I suspect you have adopted the speech patterns of people you respect criticizing LLMs for lacking “reasoning” and “understanding” capabilities without thinking about it carefully yourself.
1. How would you define these concepts so that incontrovertible evidence is even possible? Is “reasoning” or “understanding” even possible to measure? Or are we just inferring, by proxy of certain signals, that an underlying understanding exists?
2. Is it an existence proof? I.e., we have shown one domain where it can reason, therefore reasoning is possible. Or do we have to show that it can reason in all domains that humans can reason in?
3. If you posit that it’s a qualitative evaluation akin to the Turing test, specify something concrete here and we can talk once that’s solved too.
Ok I actually thought about this a fair bit a few days ago and I think I have a good answer for this.
You’ve probably heard of the cheap bar trick that goes something like: “And what does a cow drink? Milk!”.
Irrespective of intelligence, humans tend to make silly cognitive errors like this because we are fundamentally pattern matchers.
In order to become a forerunner in a field, you necessarily have to be good at abstract pattern matching.
What happens as you age is that you no longer have the need to question assumptions, because you know what's real and what's not. There's also the decrease in white matter and increase in grey matter, which doesn't help.
As time goes on, certain assumptions change, essentially deprecating certain chunks of your crystallized learnings.
Some chunks of your thinking are still valid, so when you think something can be done, it most likely can be done.
However, if something falls outside your crystallized learning, you get a strong sense it’s wrong, when it might be because of your outdated assumptions.
You can try to hotswap the assumptions you have, but it becomes like Jenga the more years of experience you have in your field.
You either have to start from scratch and rebuild your lifetime's worth of learnings from the ground up, or be super careful in reassessing everything you know.
I wonder how much of this startup playbook is just a way to rationalize luck.
Companies that succeed may not necessarily do all of these things.
Companies that fail may do all of these things.
A person who'd be inclined to click this link is likely new to startups and wants some kind of structure, knowing how unpredictable things can be. In some sense, this is exactly "what people want", à la YC advice.
As long as they say a few milquetoast things that other established entrepreneurs cannot argue with, they have essentially used the (possibly random) initial successes they've had to build credibility in the eyes of potential founders.
There is no doubt that a large part of YC's successes are due to its network effects but you first need to get lucky in order to build it. That's the part they seem to leave out.
I don't mean to be too cynical. I think a lot of the advice in this article actually rings true and aligns with my personal experience. No advice on something as vague as success can be comprehensive and OpenAI may just be an exception. Especially as it's not a successful business yet and given their enormous raises, it'll be years before anyone can tell if they're actually able to swim on their own.
For example, when I was young I worried that ideas were precious and had to be protected in case I told someone and they stole it. Now I recognize the fundamental truth Sam expresses in the article: the best ideas sound bad, and the majority of people will just roll their eyes if you tell them what you want to do. I've directly had that experience with several successful projects so this isn't just nodding along to something that sounds vaguely aspirational.
Whilst OpenAI seems to violate a lot of what's in the article, in that way it's an example of it. They're in their current position because groupthink within the AI space held that scaling up language models wasn't worth doing and was academically uninteresting compared to developing more sophisticated neural architectures. OpenAI said, no, let's just try spending shittons of money and engineering on brute-forcing enormous model sizes with an ordinary-ish transformer network and see what happens. Right up until they started getting these amazing results, they were pretty much out on their own in being so committed to that approach.
The best ideas sound bad.
Really bad ideas also sound bad.
Many good sounding ideas are also hugely successful businesses.
Maybe the etiology of success had nothing to do with how bad sounding the idea is?
Also, in the case of OpenAI, being in the field, I'd strongly argue that it wasn't as crazy-sounding an idea as people might conjure in retrospect.
Google had been throwing resources at LLMs and using them in production for several years before OpenAI started doing so. Many papers had been written about scaling laws and about how we hadn't yet reached the limits of scaling data and compute.
OpenAI is only relevant because they iterated quickly on releasing a product. Absolutely nothing visionary about the technology, honestly. The question that should be asked here is why OpenAI was the first to release a product a few other companies were capable of creating.
Everything OpenAI has done is low-hanging fruit. The logical progression of LLMs is chat, zero-shot learning, multimodal, connecting it to the web / other knowledge bases, planning, personal customization, multi-agent systems, etc. There is a vast amount of research along several vectors in this area.