Vesuvius Challenge

Matt_Cutts · on March 15, 2023

For what it's worth, I worked with Dr. Seales while getting my undergrad degree. Happy to vouch that he and his team are great humans.

This is such a fascinating problem, and could have real benefits for society. Imagine uncovering ancient works that would otherwise be lost.

janpaul123 · on March 15, 2023

Awesome!! They're so great; it's been such a joy to be welcomed into their project.

And totally — especially if there is indeed a larger library still waiting to be excavated. Who knows how many completely unique texts are waiting to be read, if we just have the right method to do so?!

ccooffee · on March 15, 2023

The images/animations in this page are fantastic at visually explaining something quite complicated. I would not have been able to understand the difficulty without them.

natfriedman · on March 15, 2023

Credit for those goes to Jonny Hyman, who also does animations for Veritasium; Dejan Gotić, who did the fancy 3D animations; and JP Posma, who directed the entire project!

mkaic · on March 15, 2023

Man, there's probably a dozen different loosely-formed ideas floating in my head right now — kudos on the exciting presentation, you really make the problem seem interesting. I may just have to give this a shot, though I think the odds of me figuring it out are exceptionally low. Still, working on cutting-edge problems is motivating, even if they're above my pay grade :)

You've nerd-sniped me good and proper. May the best team win!

janpaul123 · on March 15, 2023

Haha yesss, nerd-sniping is the goal!

I believe there are quite a few different ways this could get solved. The potential solution space is huge, so you might just stumble upon something interesting if you wander places where no one else is looking.

Good luck!!

janpaul123 · on March 15, 2023

One of the organizers here! Would love to welcome everyone's questions, ideas, etc! :)

all2 · on March 15, 2023

Another note: the demo data [0] is behind an HTTP link. Consider getting it behind HTTPS. My browser was complaining about downloading the data from plain HTTP.

[0] https://gist.github.com/janpaul123/280262ebce904f7366fe4cc15...

janpaul123 · on March 15, 2023

Good point, we'll look into it!

all2 · on March 15, 2023

More notes: your registration form to access the data requires a Google account. While this isn't an issue for most, I'm not comfortable doing anything with Google anymore and I have as little to do with them as possible.

all2 · on March 15, 2023

How do I get my hands on the data set? There's no indication on the site about how to make an attempt.

all2 · on March 15, 2023

Never mind. There's a hamburger menu in the top right corner. It wasn't immediately obvious to me.

jawns · on March 15, 2023

Is this the most cost-effective way to achieve the desired outcome?

janpaul123 · on March 15, 2023

We don't know. But it's the most fun!

SubiculumCode · on March 15, 2023

Does the prize include a PhD? /s Because this sounds like a dissertation-level project, and that prize money is less than the cost of supporting a grad student or a postdoc for a couple of years. That said, I think this is so freaking cool.

fortenforge · on March 16, 2023

This was the outcome of https://nat.org/puzzle

HN discussion: https://news.ycombinator.com/item?id=33735503

Some did guess correctly that it was about decoding the Herculaneum papyri.

janpaul123 · on March 16, 2023

Haha, indeed!

glfharris · on March 15, 2023

Fantastic project. Amazing to think that there's an entire first-century library just waiting for the technology to be read.

janpaul123 · on March 15, 2023

Right?! Who knows what could be waiting in a library owned by leaders of the Roman Empire?

SubiculumCode · on March 15, 2023

I do a lot of brain image segmentation in my research using multi-atlas image segmentation, which involves diffeomorphic image registration from multiple labeled atlases... but the way these layered sheets curl in on themselves seems a daunting problem for a fully automated pipeline.
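
For readers unfamiliar with the term: in multi-atlas segmentation, several labeled atlases are each registered (warped) onto the target image, and the warped labels are then fused into one segmentation. Here is a minimal sketch of only the fusion step, using simple majority voting. It assumes registration has already happened elsewhere, and the function name and the flattened-list data layout are hypothetical choices for illustration, not anything used by the commenter or the Vesuvius Challenge team.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote_fusion(warped_labels: Sequence[Sequence[int]]) -> List[int]:
    """Fuse per-voxel labels from several atlases by majority vote.

    Each inner sequence holds one atlas's integer labels after that atlas
    has been warped onto the target image, flattened to 1-D.
    Registration itself is assumed to have been done elsewhere.
    """
    if not warped_labels:
        raise ValueError("need at least one atlas")
    n_voxels = len(warped_labels[0])
    if any(len(atlas) != n_voxels for atlas in warped_labels):
        raise ValueError("all atlases must cover the same voxels")

    fused = []
    # zip(*...) yields, for each voxel, the tuple of labels proposed by every atlas.
    for votes in zip(*warped_labels):
        # Pick the most frequent label; ties go to the label seen first.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Three hypothetical atlases voting on three voxels.
atlases = [
    [0, 1, 1],
    [0, 1, 2],
    [1, 1, 2],
]
print(majority_vote_fusion(atlases))
```

Real pipelines usually weight each atlas by how well it registered locally rather than counting votes equally, and the registration step is the hard part. That is exactly where tightly curled, layered sheets would cause trouble.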

irrational · on March 15, 2023

People spent many, many decades laboriously putting tiny little Dead Sea Scrolls fragments back together like the world's worst jigsaw puzzle. I think that shows that if there is a way to do this that takes a lot of tedious manual labor over many decades, there are people who will be willing to do it. They just need the tools to do the work without destroying the scrolls.

natfriedman · on March 15, 2023

It seems quite possible that the solution isn't fully automated. N is in the hundreds. And modern AI does, in fact, involve quite a lot of hand-crafted data...

thih9 · on March 16, 2023

Easter egg: if you click on the “days remaining” at the bottom of the page, it changes to Roman numerals :)

SubiculumCode · on March 15, 2023

This needs to be solved ASAP. ChatGPT needs more training data!

But seriously, way cool project.

janpaul123 · on March 15, 2023

More data!!

shrx · on March 16, 2023

Why has the team not tried terahertz tomography, if X-rays gave such a poor signal-to-noise ratio?

dreamcompiler · on March 16, 2023

Or gamma rays?

I'd guess terahertz might not provide sufficient resolution or penetrate deeply enough. Or it might not even provide better discrimination than X-rays.

johnnyo · on March 16, 2023

I guess this is the answer to this post from 3 months ago

https://news.ycombinator.com/item?id=33735503

jcuenod · on March 27, 2023

Someone called it: https://news.ycombinator.com/item?id=33736675

> Maybe decoding Herculaneum scrolls?

thih9 · on March 16, 2023

Confirmed here: https://news.ycombinator.com/item?id=35176585

localplume · on March 15, 2023

Reminds me of Kuzushiji recognition with ML, transcribing historical Japanese documents. Both are my favorite applications of ML: deciphering the past. This is really damn cool.

janpaul123 · on March 15, 2023