
I would recommend using Apache Tika to extract the text from the PDFs and using Solr (or Elasticsearch) to index and search them.
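For what it's worth, here is a rough sketch of that pipeline using only the Python standard library. It assumes a Tika server is running on its default port (9998) and a Solr core is reachable locally; the core name "pdfs", the field name "content_txt" (which relies on the `*_txt` dynamic field from Solr's default configset), and the helper names are all made up for illustration. Treat it as a starting point, not a finished indexer.

```python
import json
import urllib.request

# Assumed endpoints: adjust to your own setup.
TIKA_URL = "http://localhost:9998/tika"  # tika-server text extraction endpoint
SOLR_UPDATE_URL = "http://localhost:8983/solr/pdfs/update?commit=true"  # "pdfs" is a hypothetical core


def build_tika_request(pdf_bytes, tika_url=TIKA_URL):
    """Build a PUT request asking Tika to return the PDF's plain text."""
    return urllib.request.Request(
        tika_url,
        data=pdf_bytes,
        method="PUT",
        headers={"Content-Type": "application/pdf", "Accept": "text/plain"},
    )


def build_solr_request(doc_id, text, solr_url=SOLR_UPDATE_URL):
    """Build a POST request that adds one document to Solr as JSON."""
    # "content_txt" is a hypothetical field name; it must exist in your schema.
    body = json.dumps([{"id": doc_id, "content_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        solr_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def index_pdf(path):
    """Extract text from one PDF via Tika, then send it to Solr."""
    with open(path, "rb") as f:
        pdf_bytes = f.read()
    with urllib.request.urlopen(build_tika_request(pdf_bytes)) as resp:
        text = resp.read().decode("utf-8")
    with urllib.request.urlopen(build_solr_request(path, text)) as resp:
        return resp.status
```

With both services up, calling index_pdf("report.pdf") should make the document searchable through Solr's /select handler. Swapping in Elasticsearch would mostly mean changing the second request to its document indexing endpoint.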

