I love, love this extension. I am working on an app to turn it into a single-click bookmark system on Linux: an inotify service watches your downloads folder, processes any SingleFile downloads into a database, and updates a browsable index.
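A rough sketch of the watching step, in Python. To stay within the standard library it polls the folder instead of using inotify itself (which needs a third-party binding), so treat it as a stand-in for the real watcher. `find_new_html` and the `seen` set are hypothetical names, not part of the app described above.

```python
import os


def find_new_html(download_dir, seen):
    """Return .html/.htm files in download_dir that are not yet in `seen`.

    `seen` is a set of paths that gets updated in place, so calling this
    repeatedly (e.g. every few seconds) only reports each file once.
    """
    new_files = []
    for entry in sorted(os.listdir(download_dir)):
        path = os.path.join(download_dir, entry)
        is_html = entry.lower().endswith((".html", ".htm"))
        if is_html and os.path.isfile(path) and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

A real inotify watcher would react to file-close events instead of rescanning, which also avoids picking up a download that is still being written.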
I think I basically get the idea. What kind of database are you using? Recoll sounds like a good idea, but I'm also thinking about how I might make this public-ish.
(i.e. I teach in college and would love to have a centralized way to store and search all my assigned readings, which are most often webpages)
Each HTML page is processed by (1) getting the URL, title, and time saved (this is underrated, as the approximate time of saving is useful if you want to rediscover something), then (2) taking a screenshot, and finally (3) extracting text with readability.js and hopefully doing some keyword analysis.
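Step (1) might look something like the Python sketch below. It assumes the saved file starts with a header comment containing `url:` and `saved date:` lines; that format is an assumption here, so check it against your own saved files. `PageMetadataParser` and `extract_metadata` are hypothetical names, and the screenshot and readability.js steps are not covered.

```python
import re
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collects the first <title> and the url/saved-date header comment."""

    def __init__(self):
        super().__init__()
        self.url = None
        self.saved = None
        self.title = None
        self._in_title = False
        self._title_parts = []

    def handle_comment(self, data):
        # Assumed header format: lines like "url: ..." and "saved date: ..."
        if self.url is None:
            m = re.search(r"^\s*url:\s*(\S+)", data, re.MULTILINE)
            if m:
                self.url = m.group(1)
        if self.saved is None:
            m = re.search(r"^\s*saved date:\s*(.+?)\s*$", data, re.MULTILINE)
            if m:
                self.saved = m.group(1)

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts; later ones (e.g. inside SVG) are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self._title_parts = []

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self.title = "".join(self._title_parts).strip()
            self._in_title = False


def extract_metadata(html_text):
    """Return a dict with url, title and saved (None where not found)."""
    parser = PageMetadataParser()
    parser.feed(html_text)
    parser.close()
    return {"url": parser.url, "title": parser.title, "saved": parser.saved}
```

If the header comment is missing, the file's modification time would be a reasonable fallback for the save time.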
Right now it is stored in a local SQLite database, although the article content is stored in text files. For search, I can use ripgrep to look through the associated text files.
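To make the layout concrete, here is a minimal Python sketch of that split: metadata rows in SQLite, article text in one file per bookmark. The `bookmarks` table, its columns, and the function names are all made up for illustration, and `search` is a plain substring scan standing in for ripgrep.

```python
import sqlite3
from pathlib import Path


def open_db(db_path):
    """Open (or create) the bookmark database with a hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bookmarks (
               id INTEGER PRIMARY KEY,
               url TEXT NOT NULL,
               title TEXT,
               saved TEXT,
               text_path TEXT NOT NULL
           )"""
    )
    return conn


def add_bookmark(conn, text_dir, url, title, saved, article_text):
    """Insert a metadata row and write the article text to <id>.txt."""
    text_dir = Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    cur = conn.execute(
        "INSERT INTO bookmarks (url, title, saved, text_path) VALUES (?, ?, ?, '')",
        (url, title, saved),
    )
    bookmark_id = cur.lastrowid
    text_path = text_dir / f"{bookmark_id}.txt"
    text_path.write_text(article_text, encoding="utf-8")
    conn.execute(
        "UPDATE bookmarks SET text_path = ? WHERE id = ?",
        (str(text_path), bookmark_id),
    )
    conn.commit()
    return bookmark_id


def search(conn, term):
    """Case-insensitive substring search over the stored text files."""
    needle = term.lower()
    rows = conn.execute(
        "SELECT id, url, title, text_path FROM bookmarks ORDER BY id"
    ).fetchall()
    hits = []
    for bookmark_id, url, title, text_path in rows:
        if needle in Path(text_path).read_text(encoding="utf-8").lower():
            hits.append((bookmark_id, url, title))
    return hits
```

Keeping the text in plain files is what lets a tool like ripgrep search it directly, with the database only needed to map a matching file back to its URL and title.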
The eventual goal is to create a Flask app that will allow interactive management of the bookmarks (tagging, searching). I've already got static generation of bookmarks working.
I used it to privately archive some documentation pages from some of our vendors that were behind a login page, just in case they became inaccessible at a critical time for us.