I often find this kind of advice too vague to really be useful. “Have taste” in the problems you work on isn’t very actionable. (Unless perhaps you list examples of good and bad taste.)
I’ll admit that I may just be immature at research as almost all my experience has either been attempting to replicate research or to put it into practice in production systems.
"Having taste" is mostly about predicting the future. Which problems are worth studying, which problems you are capable of solving, and which solutions turn out to be important, in retrospect. If there was an actionable way of developing taste in something, the activity itself would probably be so predictable that it would not be a particularly good research topic.
Taste is mostly about having a good intuition on the topics where your intuition is worth following. It tends to develop with experience. But if you want to develop the kind of taste that helps you pick good research topics, you need the right kind of experience for that field of research. Experience that turns out to be of the right kind, in retrospect. If your experiences and interests align (again, in retrospect), you will probably develop a good taste for research problems in your field of interest. But that requires some amount of luck, in addition to everything else.
That seems even less actionable, and somewhat misaligned with the OP article. “Taste” implies an ability to distinguish between a good example and a bad one. If it’s only recognizable in retrospect then it’s just another name for survivorship bias.
If you need actionable guidelines, you may not be the right person to do research. At least not now.
Research is all about studying topics of uncertain value. You have to commit to a project long before you can say if it's actually worth doing.
Taste comes with deliberate effort and experience. It doesn't tell you that a topic is definitely worth studying, but it increases the likelihood that you will guess right.
What is the point of writing the prescription to “have good taste” then?
Either the reader already has it, in which case there’s no point in being told that. Or the reader doesn’t, in which case you have declared that good taste cannot be taught.
Perhaps the author’s next article should be How to win the lottery: be lucky which is just about as actionable.
It's helpful to tell people that they're in uncharted territory and can't run on autopilot, even if you don't have a new map to give them. Whether they can make their way or not is unclear, but the first step is making sure they understand they're now in a place where they have to find their own way and can't fully rely on existing maps. Otherwise they might not even realize they need to start asking "am I doing the right thing right now?" by themselves.
You can cultivate good taste by intentionally taking in a lot of information about what's in the field, and what you like and what you don't like about it. This could be commenting on elements of film, fashion, photography, but it can also be having a sense of what you like to see stylistically in a contract, in a framework, or in corporate culture.
I recall reading an interview with a legendary developer, and the majority of the interview was focused not on his coding decisions or the structures he built, but on a notebook he kept with voluminous notes about what was good and what wasn't. That notebook is a materialized version of 'taste', and it's certainly something almost anyone could put together with enough effort and time.
> But if I had to summarize it in one sentence, it would be that taste comes from practicing the skill of research, keeping your focus always on identifying what works and what doesn't.
Instead of following general guidelines, focus on figuring out what works and what doesn't in each specific situation. Keep doing that for many years, and your taste will develop. Remember that you are training your intuition, not developing a set of exact rules.
Eh. I think my point is that the OP is presented as a “how to” (literally: “how to do important research”) and then it immediately dodges the question by saying “have good taste”. That does not help anyone do important research or improve the quality of the research they do; it’s a cop out.
If I wrote about “how to paint great art” or “how to cook great meals” or “how to build great things” then it would be silly to say “have good taste”—even if that’s part of the answer. It won’t help anyone else to improve in any of those endeavors.
Unrelated, but I see the use of the phrase "taste" as having a strong Twitter / e/acc smell (in a negative way).
I tend to associate it with folks who are prepared to victim-blame researchers for not adapting to the "new economy", labeling them as having "bad taste" or "low agency", maybe as a way to rationalize/justify the upcoming inequality that AI will create.
Basically a recycling of the way "IQ"/smarts/hard-work has historically been used to justify disproportionate rewards for the upper class.
(Obviously a gigantic stretch on my part, and not saying the author is in this camp, but just wanted to vent somewhere)
Yes, this does not work as well for math or physics. The biggest problem in math is arguably the Riemann Hypothesis. Good luck getting up to speed on the literature on that. You can invest a lifetime trying to solve the biggest problems in physics or math and get nowhere. You may have to choose more modest goals.
Any notes on the problems with MLX caching? I've experimented with local models on my MacBook and there's usually a good speedup from MLX, but I wasn't aware there was an issue with prompt caching. Is it from MLX itself, or from LM Studio/mlx-lm/etc.?
It is the buffer implementation.
[u1 10kTok]->[a1]->[u2]->[a2]. If you branch between the a1 assistant turn and the u2 user turn, then MLX reprocesses the u1 prompt of, let's say, 10k tokens, while llama.cpp does not.
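To make the difference concrete, here is a toy Python sketch (my own illustration, not either project's actual code; the MLX behaviour is as reported above):

    def shared_prefix_len(cached, new):
        """Length of the longest common prefix of two token sequences."""
        n = 0
        for a, b in zip(cached, new):
            if a != b:
                break
            n += 1
        return n

    # Stand-ins for the [u1 10kTok]->[a1]->[u2]->[a2] conversation.
    u1 = list(range(10_000))       # the 10k-token first user prompt
    a1, u2, a2 = [1, 2], [3, 4], [5, 6]
    u2_alt = [7, 8]                # the branch: a different second user turn

    cached = u1 + a1 + u2 + a2
    branch = u1 + a1 + u2_alt

    # Prefix reuse (llama.cpp behaviour): only re-encode tokens past the
    # divergence point -- here just the 2 tokens of u2_alt.
    print(len(branch) - shared_prefix_len(cached, branch))   # -> 2

    # Whole-buffer invalidation (MLX behaviour as reported): a mismatch
    # anywhere discards the cache, so all 10k+ tokens are re-processed.
    print(len(branch))                                       # -> 10004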
I just tested both the GGUF and MLX versions of Qwen3-Coder-Next, with llama.cpp and now with LM Studio. As I branch very often, this is highly annoying for me, to the point of being unusable. That makes Q3-30B much more usable on Mac, but it's not nearly as powerful.
I think this is the actual “bitter lesson”—the scalable solution (letting LLMs bang against the problem nonstop) will eventually far outperform human effort. There will come a point—whether sooner or later—where this’ll be the expected norm for handling such problems. I think the only question is whether there is any distinction between problems like this (clearly defined with a verifiable outcome) vs the space of all interesting computer programs. (At the moment I think there’s space between them. TBD.)
So…great for prototyping (where velocity rules) but somewhere between mixed to negative for critical projects. Seems like this just puts some mildly quantitative numbers behind the consensus & trends I see emerging.
I'm seeing parallels between this and factory-assembled houses.
Input costs are lower and velocity is higher. You get a finished product out the door quicker, though maintenance is more expensive. Largely because the product is no longer a collection of individual parts made to be interfaced by a human. It is instead a machine-assembled good that requires a machine to perform "the work". Therefore, because the machine is only designed to assemble the good, your main recourse is to have the machine assemble a full replacement.
With that framing, there seems to be a tradeoff to bear in mind when considering fit for the problem we're meaning to solve. It also explains the widespread success of LLMs generating small scripts and MVPs. Which are largely disposable.
I know approximately nothing about approximately everything. Claude seems pretty good at those things. But in every case I’ve used Claude Code for something I do know about it’s been unsatisfactory as a solo operator. It’s not useless, but it is basically useless for anything serious unless you’re very actively guiding it.
I think it has a lot of potential value and will become more useful over time, but it’ll be most useful when we can confidently understand the limitations.
I know a lot about Typescript and its ecosystem. I’ve taught it to students, and worked on it at companies whose names you’d recognize. Claude Code is better than I am at some things that I know deeply, in some cases. It does stupid things on occasion (like use global mutable state), but it is still more useful than not. So, I guess it depends on how you define “better”, but I’ve learned things I didn’t know, and it allows me to do projects and experiments that I’d otherwise be too lazy to do.
I had been working in civil service for the US Navy for about 10 years in operations research & systems engineering. It was very hard to break out of that role into private industry—especially for the ML roles I wanted, which I think was partially because my undergrad degree was MechEng.
OMSCS allowed me to add MSCS to my resume, with additional networking and work-experience details as a TA for the Algorithms and Computational Photography courses. Suddenly I started getting a lot more calls back. About 6 months after graduation I moved to the SF Bay Area (to work for Udacity), and within 2 years I was an ML engineer at Apple, where I remain today. I don't think any of that would've happened without OMSCS.
Whoa! Incredible! Talk about an OMSCS success story. Thank you for sharing – this is seriously going to serve as motivation fuel for me to get back into it.
I was the head of enterprise curriculum in 2018 and an OMSCS grad in 2016. This was a weird time to work for Udacity, and the company went through a major shakeup in 2019. The "breakup" with GT happened before the focus on enterprise, and the enterprise focus was somewhat short-lived, as the CEO was replaced just as enterprise was ascending to become the primary revenue stream. COVID was rough for Udacity, and content production was commoditized.
In 2013-2016 Udacity was very actively collaborating with GT and had in-house content production. The projects were designed by highly experienced instructors in direct partnerships with real companies to make them realistic and relevant, and there was a small army of hand-picked mentors and graders to review and provide feedback.
Unemployment was _relatively_ high at that time, so individual consumers were eager to invest their own time & money to upskill and differentiate themselves. By 2018 unemployment hit record lows and suddenly it was _employers_ who were struggling to attract talent and wanted to differentiate themselves by offering upskill training as a benefit along with highly intentional training programs to organically grow the hard-to-hire talent from their existing workforce. This precipitated a shift from huge growth in the consumer side to growth in the enterprise business.
Contemporaneously, platforms like Udemy and Pluralsight commoditized content creation. Pluralsight bragged that it cost them $15k to launch a new course—orders of magnitude less than it cost us in-house. Udacity pivoted away from high-quality in-house production toward more partnerships with external content creators, and identified project grading and mentorship services as the largest drivers of ongoing course-support costs.
As growth wasn’t tracking fast enough, Udacity closed most of the international offices—except India—then had two rounds of layoffs where the remaining content production was practically eliminated, and the mentorship and grading were commoditized by transferring the programs to the Udacity India office to administrate. All the hand-picked and trained graders and mentors were eliminated.
Then COVID hit. (I was gone by then.) I heard Udacity raised a debt round, but I think they were stuck against headwinds from the past few years. Eventually they were acquired for an “undisclosed sum”.
So what could have brought in more business? IMO, focusing on what was working for us, not trying to pivot into what worked for someone else. The problem I think is that we weren’t on track to make a reasonable return on all the money raised. We were trying to swing for the fences, even if it meant eventually striking out.
I imagine what it means is basically, "Before COVID, universities had to collaborate with Udacity to produce these courses and manage course credits/online degrees. Now they realized that they can easily do it themselves (perhaps at the institution level)"
Nah. There was some of that as the tools available to unis improved alongside Udacity, but it was a very intentional choice. The business with GT made $X/year while the consumer & enterprise businesses brought in $20X/year. It seemed like we could maybe double the OMSCS or scale linearly with effort by making more partnerships, meanwhile the other lines scaled faster with much less effort. Terminating this partnership was just one of the business lines that got cut off to focus everything on the lines that were growing much faster.
When I say something like this it usually means “I don’t want to dictate your job to you. You’re here because you’re smart, ambitious, and capable. We’ve talked at length in team settings and 1:1 about our goals. What do you think are the problems that need attention, and what solutions do you propose?”
The anti-pattern I’ve seen from some folks is that they never want to propose solutions because then it’s someone else’s fault if those fail. These folks often demonstrate minimal ownership of any decisions, so they don’t feel bad complaining about all the problems they see. Not only is that unhelpful, it can actually be very toxic for the team. (As you mentioned.)
So when I’m saying “bring solutions” what I’m really asking for is some shared ownership of the choices and consequences—I’m asking folks to act like the main character in the story. And don’t worry, I own the consequences of the mistakes in my team to my leadership—this isn’t about throwing them under the bus. (Getting this to work well requires a lot of trust both ways.)
> When I say something like this it usually means…
Yes, exactly. This isn’t “do my job for me”, this is “do the job you have, and solve the problems you should be able to solve”. It’s also, at times, “pointing at fires is junior shit - find a fire extinguisher while you call 911.”
So DSA means a lightweight indexing model evaluated over the entire context window, plus a top-k attention evaluation. There's no softmax in the indexing model, so it can run blazingly fast in parallel.
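For anyone who wants the mechanics, here is a rough numpy sketch of that idea as I understand it (names, shapes, and the per-head/batching details are my own, not DeepSeek's reference implementation):

    import numpy as np

    def dsa_attend(q, K, V, iq, iK, k_top=64):
        """Cheap indexer scores every cached position; full softmax
        attention then runs over only the top-k positions.
        Shapes: q (d,), K/V (T, d), iq (d_i,), iK (T, d_i), d_i << d."""
        # Indexer: plain dot products over the whole window -- no
        # softmax, so scoring is cheap and embarrassingly parallel.
        scores = iK @ iq                                # (T,)

        # Select the k_top highest-scoring positions.
        k_top = min(k_top, len(scores))
        top = np.argpartition(scores, -k_top)[-k_top:]

        # Ordinary softmax attention, restricted to those positions.
        logits = K[top] @ q / np.sqrt(K.shape[1])
        w = np.exp(logits - logits.max())
        w /= w.sum()
        return w @ V[top]                               # (d,)

The only work that scales with the window length T is the cheap dot-product scoring; the expensive softmax attention stays O(k_top) regardless of context size.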
I’m surprised that a fixed-size k doesn’t suffer degrading performance in long context windows, though. That’s a _lot_ of responsibility to push into that indexing function. How could such a simple model achieve high enough precision and recall with a fixed-size k for long context windows?
It’s mind-boggling that image generators can solve physics and chem problems like this—but I will note that there are a few slight mistakes in both (an extra i term on the LHS, a few of the chemical names look wrong, etc.). Unbelievable that we’re here, but it remains essential to check the work.