Semantic Search with Phoenix, Axon, Bumblebee, and ExFaiss (dockyard.com)
65 points by clessg on Feb 15, 2023 | 12 comments


I've been learning Elixir for the past few months for personal projects and it's been a delight. Happy to see the ecosystem growing. For those unaware, Elixir came 2nd among the most loved languages in the Stack Overflow Developer Survey last year (after Rust, of course), and Phoenix was the most loved web framework.

https://survey.stackoverflow.co/2022/#technology-most-loved-...


Have been a fan of Elixir and its ecosystem for web dev. However, I haven't wrapped my head around the core value proposition behind Elixir's recent "pivot" to AI/numeric computing. Can someone shed more light on "why Elixir" for AI/numeric computing?


It shouldn't be seen as a pivot to AI or as trying to compete with, say, Python. It should probably be seen more as letting existing Elixir developers/projects use AI in their projects without having to bring in another language like Python, thus avoiding the need to learn Python or deal with the complexities of having multiple languages in your app.


Anyone who has fallen in love with Ruby or Elixir, but who knows enough Python to want to avoid it, is disappointed that Python got "picked" as the ML scripting language (or the bioinformatics scripting language, but I digress). I'm glad Elixir chose to go this route, as it's much more deserving of this role IMHO.


This recent ElixirConf video by Chris Grainger does an excellent job articulating the benefits he saw in switching to an Elixir-based AI stack.

https://www.youtube.com/watch?v=Y2Nr4dNu6hI


I want so bad to see an article like this where somebody does some tests to see if the search results are any good.


I’ve played around with semantic search tools and the results were not great. This article [0] compares the model used in the above post with OpenAI’s model.

[0] - https://medium.com/@nils_reimers/openai-gpt-3-text-embedding...


It's good, but these models are general-purpose starting points: their authors expect and recommend fine-tuning to get excellent results.


Is there ever going to be a way to distribute training? I would think this is the only way open models will eventually be able to exist and not just be owned by Microsoft, Google, Facebook and AWS.


The BigScience team (a working group of researchers that trained the BLOOM-176B LLM last year) released Petals [0][1], which allows distributed inference and fine-tuning of BLOOM, with the option to pick a custom model + private swarm. SWARM [2][3] is a WIP from Yandex and UW that shares some of the same codebase, but is for distributed training.

[0] https://petals.ml/
[1] https://github.com/bigscience-workshop/petals
[2] https://github.com/yandex-research/swarm
[3] https://twitter.com/m_ryabinin/status/1625175933492641814


There almost surely will, and Elixir/OTP is going to do it best.


Training is distributed already, but over a big cluster of machines in a data center.

I've always wanted there to be something like BOINC/Gridcoin for fitting these giant neural networks.



