It's possible that at this point I (and others) have trained ourselves to know/do the kind of queries that work on Google, without even realizing it, which would be another thing making it hard to switch. Although in this case... I'm actually a bit surprised mojeek doesn't manage it. Just `github hanami` doesn't get it either. Is it just not matching on sitename at all?
Hello, mojeek dev here. Cheers for the feedback, appreciated.
In this case we simply don't have the page in our index, though we do have others mentioning hanami. Our bot is permitted to crawl github.com and has a good number of pages in from that host. We'll evaluate whether we can increase crawling for github and similar large sites, and hopefully before long that page will enter the index.
I tried some development-related queries that I have recently done in Google.
Queries related to Go seem to mainly work only if using Go and not if using Golang (unless "go" + term is popular outside of Go as well). Usually people use Golang in search (to avoid confusion with the verb & game), but pages generally refer to Go. "go package XXXX" seems to work better in many cases.
With somewhat lesser-known technologies, it was hard to find a query that would get me to the actual site, e.g. the Python SDK for OCI. Various queries (python oci, python oci sdk, python oci api) returned lots of links to examples, but not really any direct link to GitHub or the official documentation.
Hello, mojeek dev here. Thanks for the feedback; it is always appreciated.
I think two ideas come up from your feedback. One is index size: our index is small but growing, and a larger index increases the chance of having pages that satisfy your query.
The other aspect is boolean search versus something more akin to the vector space model. We've found a lot of people that are dissatisfied with Bing/Google searches tend to be unhappy that the search they actually enter is somehow modified to include what the search engine believes are relevant synonyms. In some cases, those synonyms may help in producing a better result when use of language is split between two terms used interchangeably, like go and golang. It's something we're looking into. We do value searches being based on what is actually searched for but also accept there are cases where assuming synonyms may be advantageous to the end results.
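To make the trade-off concrete, here is a minimal sketch of boolean AND search over a toy inverted index, with synonym expansion as an opt-in switch. It is only an illustration, not how Mojeek works: the documents, the `SYNONYMS` table, and the `expand_synonyms` parameter are all made up for the example.

```python
# Hypothetical synonym table; a real engine would not hard-code this.
SYNONYMS = {"go": {"golang"}, "golang": {"go"}}

# Toy corpus, invented for the example.
DOCS = {
    1: "go package for parsing yaml",
    2: "golang tutorial on parsing json",
    3: "how to go camping",
}


def tokenize(text):
    """Lowercase and split on whitespace; deliberately naive."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, expand_synonyms=False):
    """Boolean AND over the query terms.

    With expand_synonyms=False only the literal terms are matched.
    With expand_synonyms=True each term also matches its listed synonyms.
    """
    result = None
    for term in tokenize(query):
        variants = {term}
        if expand_synonyms:
            variants |= SYNONYMS.get(term, set())
        matches = set()
        for variant in variants:
            matches |= index.get(variant, set())
        result = matches if result is None else result & matches
    return sorted(result or [])


INDEX = build_index(DOCS)
print(search(INDEX, "golang parsing"))                        # literal terms only
print(search(INDEX, "golang parsing", expand_synonyms=True))  # also matches "go"
```

In this toy corpus the literal query finds only the page that says "golang", while the expanded query also picks up the page that says "go". Keeping expansion behind an explicit flag means the searcher decides when the query is rewritten, rather than the engine.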
If you do this, it'd be great to have a way to select the 'mode' of search (exact query vs 'smart' terms). I'm not sure what the user interface ought to be, but "literally" "anything" "would" "be" "better" than the contortions you have to go through to force search engines to search for the terms you've actually provided.
Great job so far, by the way, keep up the good work!
- Our search engine works and has been doing so for >15 years. Our search quality needs improving, and it is improving gradually. And it's independent: our own crawler, index and infrastructure.
- We just introduced non-tracking search ads: https://www.mojeek.com/support/ads/
- We use "No Tracking, Just Search" and "Search without Surveillance"
We've been building for 15 years; here's our founder story: https://news.ycombinator.com/item?id=26502140