Each HTML page is processed in three steps: (1) recording the URL, title, and time saved (this last one is underrated, since even an approximate save time is useful when you want to rediscover something), (2) taking a screenshot, and (3) extracting the text with readability.js, hopefully followed by some keyword analysis.
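To make step (1) and the keyword idea concrete, here is a minimal Python sketch. It is only an illustration, not the actual code: `TitleParser`, `Bookmark`, `top_keywords`, `process_page` and the tiny stopword list are all made-up names, the screenshot step is left out, and `article_text` is assumed to be whatever readability.js already extracted. The "keyword analysis" here is just a word-frequency count.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from html.parser import HTMLParser
import re


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


@dataclass
class Bookmark:
    url: str
    title: str
    saved_at: str  # ISO 8601 timestamp in UTC
    keywords: list = field(default_factory=list)


# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "and", "for", "that", "with", "this", "are", "was"}


def top_keywords(text, n=5):
    """Naive keyword analysis: the n most frequent non-stopword words."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]


def process_page(url, html, article_text):
    """Build a Bookmark from the raw HTML and the already-extracted text."""
    parser = TitleParser()
    parser.feed(html)
    title = "".join(parser.parts).strip()
    saved_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return Bookmark(url, title, saved_at, top_keywords(article_text))
```

Calling `process_page(url, html, article_text)` returns one record per page; storing the timestamp as ISO 8601 text keeps it sortable and easy to filter by date later.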
Right now the bookmark metadata is stored in a local SQLite database, while the article content is stored in text files. For search, I can use ripgrep to look through those text files.
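A rough sketch of how that split could look in Python, under assumptions: the `bookmarks` table layout, the `<id>.txt` file naming, and the function names `init_db`, `save_bookmark`, `matching_ids` and `search` are all hypothetical. It shells out to `rg` when it is on the PATH and falls back to a plain Python scan otherwise.

```python
import shutil
import sqlite3
import subprocess
from pathlib import Path


def init_db(conn):
    """Create the (hypothetical) metadata table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks ("
        "id INTEGER PRIMARY KEY, url TEXT, title TEXT, saved_at TEXT)"
    )


def save_bookmark(conn, text_dir, url, title, saved_at, article_text):
    """Insert the metadata row, then write the article text to <id>.txt."""
    cur = conn.execute(
        "INSERT INTO bookmarks (url, title, saved_at) VALUES (?, ?, ?)",
        (url, title, saved_at),
    )
    conn.commit()
    bookmark_id = cur.lastrowid
    path = Path(text_dir) / f"{bookmark_id}.txt"
    path.write_text(article_text, encoding="utf-8")
    return bookmark_id


def matching_ids(text_dir, query):
    """Return ids of bookmarks whose text contains query (case-insensitive)."""
    if shutil.which("rg"):
        # rg exits with status 1 when nothing matches, so no check=True here.
        result = subprocess.run(
            ["rg", "--files-with-matches", "--ignore-case", "--fixed-strings",
             "--no-ignore", "--glob", "*.txt", "--", query, str(text_dir)],
            capture_output=True,
            text=True,
        )
        paths = [Path(line) for line in result.stdout.splitlines() if line]
    else:
        # Fallback when ripgrep is not installed: scan the files in Python.
        paths = [
            p for p in Path(text_dir).glob("*.txt")
            if query.lower() in p.read_text(encoding="utf-8").lower()
        ]
    return sorted(int(p.stem) for p in paths)


def search(conn, text_dir, query):
    """Full-text search over the files, joined back to the SQLite metadata."""
    ids = matching_ids(text_dir, query)
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        "SELECT id, url, title FROM bookmarks "
        f"WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()
```

Naming each text file after the row id is one simple way to map a ripgrep hit back to its bookmark; any other stable key would work as well.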
The eventual goal is to create a Flask app that allows interactive management of the bookmarks (tagging, searching). I've already got static generation of the bookmarks working.
Here's a screenshot: https://imgur.com/5YP4sP5