A little project I have been working on is
https://node.xxx
It's a search engine that indexes only adult websites and aggressively filters out spam sites so they never reach the index.
It will also only index canonical sites, so no white labels get into the search index.
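For anyone wondering how a crawler might tell a white label from its canonical site, here is a rough sketch of one common approach. It is an assumption for illustration, not necessarily how node.xxx does it: it reads the page's `<link rel="canonical">` tag and rejects the page when that tag points at a different host. The names `CanonicalLinkParser` and `is_canonical_host` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def is_canonical_host(page_url, html):
    """Return False when the page declares a canonical URL on a different host.

    Assumption: a white label points its canonical link at the parent site.
    Pages with no canonical link, or a relative one, are treated as canonical.
    """
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return True
    canonical_host = urlparse(parser.canonical).hostname
    if canonical_host is None:  # relative href, so same host
        return True
    return canonical_host == urlparse(page_url).hostname
```

A crawler could call `is_canonical_host(url, body)` before adding a page to the index and skip anything that returns `False`. White labels that do not declare a canonical link would need some other signal, such as content fingerprinting.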
At the moment it's a little slow to respond to queries, but that will improve as new caching servers are deployed.
It still has a way to go in development, but it is live and the current index is around 100 million pages. It supports complex queries, which are documented here.
The search engine is infinitely scalable, and while it's currently crawling HTML, PDF, JSON, XML, RSS, video and images, it's only returning text-based results right now.
Image search will be rolled out later this year once I've perfected it, with video search to follow around March 2017.
Have a look and let me know what you think at
https://node.xxx