Show HN: I built haystack – Google for the workplace https://ift.tt/DW1UZYa
Hi, Yuval here! Historically a security researcher, I more recently entered the NLP space.

I've started working on haystack recently because I feel modern workplaces are in dire need of a good workplace search product that is free to use, just like Google. Information is scattered across too many communication channels: we communicate with our peers through Slack and email, share docs and specs on Confluence, work with tickets on Jira, commit code and have discussions on GitHub, not to mention all the .docx, .ppt and .pdf files that fly around the organization.

On top of that, existing search features just plain suck. If you've tried using Confluence search you know what I mean: keyword search is terrible. Even when you find relevant-looking results, they require you to commit to entering the page and scrolling through it to get to the relevant paragraph.

What does haystack do?

- Enables you to search all your workplace applications from one place (Slack, Confluence, Notion, Jira, GitHub, Outlook, Gmail, etc.).
- Natural language queries ("How to do X", "Do we support Y", "How do I connect to Z").
- Helps you decide if a result is relevant without entering the page.
- Goes directly: search result->relevant paragraph inside page (no extra scrolling).
- No download, all the magic happens in the browser.
- Local browser storage option (you don't need to trust me with your internal communications to use haystack).
- Code references embedded in search results.

For example, "How to connect to integ2 machine" on haystack could give you:

ssh -i private.pem ubuntu@ec2-integration2.eu-west-1.compute.amazonaws.com

aggregated from a Slack conversation you had a while ago.

It was quite a challenge to get it up and running in the browser, but here's what I ended up using: the IndexedDB browser API for storage, and a fine-tuned TinyBERT-based bi-encoder for indexing and searching.
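To make the bi-encoder idea concrete, here is a minimal Python sketch of the general pattern: paragraphs are encoded once at indexing time, the query is encoded separately at search time, and results are ranked by cosine similarity. This is not haystack's code. Everything in it is an assumption for illustration: `toy_encode` is a hashed bag-of-words stand-in for the fine-tuned TinyBERT encoder, the plain dict stands in for IndexedDB storage, and the names `build_index`, `search`, and `PARAGRAPHS` are hypothetical.

```python
import hashlib
import math
import re

DIM = 256  # toy embedding size (assumption); a neural encoder has its own fixed size


def toy_encode(text: str) -> list[float]:
    """Stand-in for the neural encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_index(paragraphs: dict[str, str]) -> dict[str, list[float]]:
    """Encode every paragraph once; the dict stands in for a persistent store."""
    return {pid: toy_encode(text) for pid, text in paragraphs.items()}


def search(index: dict[str, list[float]], query: str, top_k: int = 3):
    """Encode the query on its own, then rank paragraphs by cosine similarity."""
    q = toy_encode(query)
    # Vectors are unit length, so the dot product equals the cosine similarity.
    scored = [
        (sum(a * b for a, b in zip(q, vec)), pid) for pid, vec in index.items()
    ]
    scored.sort(reverse=True)
    return [(pid, score) for score, pid in scored[:top_k]]


# Hypothetical sample data, keyed by a made-up "source-id" scheme.
PARAGRAPHS = {
    "slack-1": "to connect to the integ2 machine use ssh with the private key",
    "jira-7": "the quarterly report is due on friday",
    "conf-3": "deploy notes for the billing service",
}

if __name__ == "__main__":
    index = build_index(PARAGRAPHS)
    for pid, score in search(index, "how to connect to integ2 machine", top_k=2):
        print(f"{pid}: {score:.2f}")
```

The point of the split is that the expensive step (encoding documents) happens once and is stored, so a search only needs one query encoding plus cheap vector comparisons. A real encoder would also match paraphrases, which this word-overlap stand-in cannot do.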
Search results are built using a fine-tuned t5-small model, along with some Node.js adaptations and WASM rewrites in Rust for performance.

Next steps: fine-tuning haystack for lower-end laptops with no dedicated GPU. I expect it to run smoothly on lower-end hardware by Feb/March, so that's the current public release date.

If you would like early access and you have dedicated graphics, there's a button on our landing page, along with my email address.

I'll be here in the comment section!

https://ift.tt/nhSJAxy
December 28, 2022 at 07:39PM