Hacker News

I think I understand your concern but if I miss the point, please follow up!

Regarding access to read knowledge from the different tools: it varies tool by tool, but a lot of them offer API keys or app integrations in the free tier (GitHub, Google Drive, and Confluence come to mind). Other tools don't have a free tier, and you just get access to the API keys as part of paying for the service. There are probably tools that charge a premium fee for integration access, but I'm not aware of any personally.

For the SlackBot, it can add itself to public channels but for private channels someone needs to add it. It is what it is sadly.

About search being available for most SaaS products: SaaS tools are definitely improving their own search. But I still think a single place to search and aggregate data has significant value. For example, as an engineer by training, I often find that getting the full picture of a customer escalation means reading Slack threads, Confluence design docs, and old pull requests on GitHub. It would be nice to get it all in one place.



> It is what it is sadly.

This is what I mean -- I previously built a similar search engine on top of Slack, Notion, etc., but didn't launch the product because I thought that requiring users to constantly add bots to private channels would be a subpar experience. I saw this as a blocker for good UX, so I didn't go further, but maybe you'll find a nice solution!

Searching over public internal data is addressed by a few existing tools, but it's the private aspect that is pretty difficult to handle and disastrous to get wrong when managed ad hoc - e.g. someone accidentally adds the bot to a private Slack channel called #layoffs :) so you'd want this handled properly and centrally.

I guess you'll also need to handle privacy well, ~maybe it's OK when run as a SaaS for db admins to have access to ingested data, but if it's OSS then the people that run it probably shouldn't be able to read the private data that's ingested, so now you need to handle search over encrypted data, which is a fun problem :D


Access control is a non-glamorous but critical piece of what we're building. We're currently implementing automatic access syncing for a few sources to start: Google Drive, Confluence, Jira, and Notion. By matching document access in the source to users and groups, and then to emails, we can ultimately map Danswer users to document-level access. So someone searching in Danswer will only get results from the set of documents they have access to in the source tool.
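To make the mapping concrete, here is a minimal sketch, assuming the source tool's permissions have already been pulled into plain Python structures. Every name in it (`DocumentAccess`, `resolve_allowed_emails`, `filter_results`) is hypothetical and only for illustration; it is not Danswer's actual code or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DocumentAccess:
    """Access info for one document as reported by the source tool (hypothetical shape)."""
    doc_id: str
    user_emails: Set[str] = field(default_factory=set)  # users granted access directly
    group_names: Set[str] = field(default_factory=set)  # groups granted access
    is_public: bool = False


def resolve_allowed_emails(access: DocumentAccess,
                           group_members: Dict[str, Set[str]]) -> Set[str]:
    """Expand direct users plus group memberships into one set of lowercase emails."""
    emails = {e.lower() for e in access.user_emails}
    for group in access.group_names:
        emails |= {e.lower() for e in group_members.get(group, set())}
    return emails


def filter_results(results: List[str], user_email: str,
                   acl: Dict[str, DocumentAccess],
                   group_members: Dict[str, Set[str]]) -> List[str]:
    """Keep only the document ids the given user can see in the source tool."""
    user_email = user_email.lower()
    visible = []
    for doc_id in results:
        access = acl.get(doc_id)
        if access is None:
            continue  # no access info synced for this document: fail closed
        if access.is_public or user_email in resolve_allowed_emails(access, group_members):
            visible.append(doc_id)
    return visible
```

Failing closed on documents with no synced access info is a design choice in this sketch: it hides a result rather than risk leaking one.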

For Slack it would look something like: get the users in the Slack channel, map those Slack users to users in Danswer. Then only those users in Danswer will be able to get results from that channel.
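Roughly, as a sketch: the function below takes the channel's member ids, a Slack-id-to-email lookup, and an email-to-Danswer-user lookup, all as plain inputs. In a real integration the first two would presumably come from the Slack Web API (`conversations.members` and `users.info`, the latter needing the `users:read.email` scope), but that wiring is left out here. The function names and data shapes are assumptions for illustration, not Danswer's actual implementation.

```python
from typing import Dict, Iterable, Set


def danswer_users_for_channel(channel_member_ids: Iterable[str],
                              slack_id_to_email: Dict[str, str],
                              danswer_email_to_user_id: Dict[str, str]) -> Set[str]:
    """Map Slack channel members to Danswer user ids by matching on email.

    Assumes the keys of danswer_email_to_user_id are already lowercase.
    """
    allowed = set()
    for slack_id in channel_member_ids:
        email = slack_id_to_email.get(slack_id)
        if email is None:
            continue  # e.g. bots or accounts without a visible email
        danswer_id = danswer_email_to_user_id.get(email.lower())
        if danswer_id is not None:
            allowed.add(danswer_id)
    return allowed


def can_see_channel(danswer_user_id: str, allowed_users: Set[str]) -> bool:
    """Only users mapped from the channel's members get results from that channel."""
    return danswer_user_id in allowed_users
```

Slack members with no matching Danswer account are simply skipped, so nobody gains access by default.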


> ~maybe it's OK when run as a SaaS for db admins to have access to ingested data, but if it's OSS then the people that run it probably shouldn't be able to read the private data that's ingested

I don't understand the distinction here. If Danswer runs a SaaS version then yes, I agree they can have a license agreement that lets their DB admins see data in some cases, which is fine. That seems orthogonal to the case where a company runs the OSS version internally, in which case presumably their administrator can see all docs (but software administrators can usually do this anyway).


Yep, this is exactly correct! For our SaaS version, we do have an agreement which allows us to look at data if needed to debug issues and/or improve search performance.

For self-hosted deployments, usually a select few admins who have set up the plumbing on AWS do have access (but as nl has mentioned, these people usually have superuser access on the tools we connect to anyway, so this is a no-op).



