Vesuvius Challenge (scrollprize.org)
275 points by razin on March 15, 2023 | 32 comments


For what it's worth, I worked with Dr. Seales while getting my undergrad degree. Happy to vouch that he and his team are great humans.

This is such a fascinating problem, and could have real benefits for society. Imagine uncovering ancient works that would otherwise be lost.


Awesome!! They're so great; it's been such a joy to be welcomed into their project.

And totally — especially if there is indeed a larger library still waiting to be excavated. Who knows how many completely unique texts are waiting to be read, if we just have the right method to do so?!


The images/animations in this page are fantastic at visually explaining something quite complicated. I would not have been able to understand the difficulty without them.


Credit for those goes to Jonny Hyman, who also does animations for Veritasium; Dejan Gotić, who did the fancy 3D animations; and JP Posma, who directed the entire project!


Man, there's probably a dozen different loosely-formed ideas floating in my head right now — kudos on the exciting presentation, you really make the problem seem interesting. I may just have to give this a shot, though I think the odds of me figuring it out are exceptionally low. Still, working on cutting-edge problems is motivating, even if they're above my pay grade :)

You've nerd-sniped me good and proper. May the best team win!


Haha yesss, nerd-sniping is the goal!

I believe there could be quite a few different ways to solve this. The potential solution space is huge, so you might just stumble upon something interesting if you wander into places where no one else is looking.

Good luck!!


One of the organizers here! Would love to welcome everyone's questions, ideas, etc! :)


Another note: the demo data [0] is behind an HTTP link. Consider getting it behind HTTPS. My browser was complaining about downloading the data from plain HTTP.

[0] https://gist.github.com/janpaul123/280262ebce904f7366fe4cc15...


Good point, we'll look into it!


More notes: your registration form for accessing the data requires a Google account. While this isn't an issue for most, I'm not comfortable doing anything with Google anymore and I have as little to do with them as possible.


How do I get my hands on the data set? There's no indication on the site about how to make an attempt.


Never mind. There's a hamburger menu in the top right corner. It wasn't immediately obvious to me.


Is this the most cost-effective way to achieve the desired outcome?


We don't know. But it's the most fun!


Does the prize include a PhD? /s Because this sounds like a dissertation-level project, and that prize money is less than the cost of supporting a grad student or a postdoc for a couple of years. That said, I think this is soo freaking cool.


This was the outcome of https://nat.org/puzzle

HN discussion: https://news.ycombinator.com/item?id=33735503

Some did guess correctly that it was about decoding the Herculaneum papyri.


Haha, indeed!


Fantastic project. Amazing to think that there's an entire first-century library just waiting for the technology to be read.


Right?! Who knows what could be waiting in a library owned by leaders of the Roman Empire?


I do a lot of brain image segmentation in my research using multi-atlas image segmentation, which involves diffeomorphic image registration from multiple labeled atlases... but the way these layered sheets curl in on themselves seems a daunting problem for a fully automated pipeline.


People spent many, many decades laboriously putting tiny little Dead Sea Scroll fragments back together like the world's worst jigsaw puzzle. I think that shows that if there is a way to do this that takes a lot of tedious manual labor over many decades, there are people who will be willing to do it. They just need the tools to do the work without destroying the scrolls.


It seems quite possible that the solution isn't fully automated. N is in the hundreds. And modern AI does, in fact, involve quite a lot of hand-crafted data...


Easter egg: if you click on the “days remaining” at the bottom of the page, it changes to Roman numerals :)


This needs to be solved ASAP. ChatGPT needs more training data!

But seriously, way cool project.


More data!!


Why has the team not tried terahertz tomography, if X-rays gave such a poor signal-to-noise ratio?


Or gamma rays?

I'd guess terahertz might not provide sufficient resolution or penetrate deeply enough. Or maybe it wouldn't even provide better discrimination than X-rays.


I guess this is the answer to this post from 3 months ago

https://news.ycombinator.com/item?id=33735503


Someone called it: https://news.ycombinator.com/item?id=33736675

> Maybe decoding Herculaneum scrolls?



Reminds me of Kuzushiji recognition with ML, transcribing historical Japanese documents. Both are my favorite applications of ML: deciphering the past. This is really damn cool.


<3



