Have you considered some sort of "crowdsourcing" / voluntary botnet type approach?
The ArchiveTeam[1] have a simple VM image that anyone can use to schedule and coordinate large site-archival jobs, which might already address some of the issues.
Might be tricky to find people willing to provide resources, but even with a smallish group it could work out. To guard against abuse you may need to run the same query on several hosts and compare the results, which adds to the overall request cost (rough sketch below).
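A minimal sketch of that quorum idea, assuming each volunteer host reports back a content hash for the same query; the names and threshold here are placeholders, not anything ArchiveTeam actually ships:

    from collections import Counter

    def quorum_result(results, min_agree=2):
        # Accept a volunteer-submitted answer only if at least
        # min_agree independent hosts returned the same payload.
        # `results` maps host id -> hashable response (e.g. a hash).
        if not results:
            return None
        payload, votes = Counter(results.values()).most_common(1)[0]
        return payload if votes >= min_agree else None

    # Three volunteers fetch the same page; one is stale or lying.
    submissions = {
        "host-a": "sha256:9f2c...",
        "host-b": "sha256:9f2c...",
        "host-c": "sha256:0e41...",
    }
    print(quorum_result(submissions))  # -> "sha256:9f2c..." (2 of 3 agree)

A no-quorum result just goes back into the queue for more hosts, at the cost of yet more requests.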
My approach is sufficiently fluid that this would mean pushing pretty crude code to a bunch of hosts frequently and on an irregular basis. The runs themselves are fairly ad hoc.
Being able to directly query a corpus (IA, DDG, Bing, etc.) is another option.
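For IA at least that's already doable today via the Wayback CDX server; the endpoint below is real, though I'm only showing the common parameters (url, output, limit):

    import json
    import urllib.request

    # Ask the Wayback Machine's CDX index for captures of a URL.
    api = ("http://web.archive.org/cdx/search/cdx"
           "?url=example.com&output=json&limit=5")
    with urllib.request.urlopen(api) as resp:
        rows = json.load(resp)

    # First row is the field names; the rest are captures.
    header, captures = rows[0], rows[1:]
    for cap in captures:
        rec = dict(zip(header, cap))
        print(rec["timestamp"], rec["original"], rec["statuscode"])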
Search across large corpora remains fairly expensive, so I can understand the hesitancy here.
The lack of standardisation across sites' search APIs is another frustration.
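The usual workaround is a thin adapter per backend behind one normalised interface. Everything below is hypothetical scaffolding (the field names are a guess and _fetch is a stub), just to show the shape:

    from dataclasses import dataclass
    from typing import Iterable, Protocol

    @dataclass
    class Hit:
        # One normalised search result, whatever the backend.
        url: str
        title: str
        snippet: str

    class SearchBackend(Protocol):
        def search(self, query: str, limit: int = 10) -> Iterable[Hit]: ...

    class DDGBackend:
        # Hypothetical shim: map one backend's response fields
        # onto the common Hit shape.
        def search(self, query: str, limit: int = 10) -> Iterable[Hit]:
            raw = self._fetch(query)[:limit]
            return [Hit(r["FirstURL"], r["Text"], r["Result"]) for r in raw]

        def _fetch(self, query):
            return []  # stub so the sketch runs; real code hits the API

    def federated_search(backends, query):
        # Fan one query out across backends and pool the results.
        hits = []
        for b in backends:
            hits.extend(b.search(query))
        return hits

Each new corpus then costs one shim, rather than changes everywhere the results get consumed.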
[1] https://www.archiveteam.org/