Semantic Search with Phoenix, Axon, Bumblebee, and ExFaiss

karolist · on Feb 15, 2023

I've been learning Elixir for the past few months for personal projects and it's been a delight. Happy to see the ecosystem growing. For those unaware, Elixir came 2nd as the most loved language in the Stack Overflow developer survey last year (after Rust, of course), and Phoenix was the most loved web framework.

https://survey.stackoverflow.co/2022/#technology-most-loved-...

fredliu · on Feb 15, 2023

Have been a fan of Elixir and its ecosystem for web dev. However, I haven't wrapped my head around the core value proposition behind Elixir's recent "pivot" to AI/numeric computing. Can someone shed more light on "why Elixir" for AI/numeric computing?

victorbjorklund · on Feb 15, 2023

It shouldn't be seen as a pivot to AI or an attempt to compete with, say, Python. It should probably be seen more as letting existing Elixir developers/projects use AI in their projects without having to bring in another language like Python, thus avoiding the need to learn Python or deal with the complexities of having multiple languages in your app.

pmarreck · on Feb 15, 2023

Anyone who has fallen in love with Ruby or Elixir, but who knows enough Python to want to avoid it, is disappointed that Python got "picked" as the ML scripting language (or the bioinformatics scripting language, but I digress). I'm glad Elixir chose to go this route, as it's much more deserving of this role IMHO.

billchristian · on Feb 15, 2023

This recent ElixirConf video by Chris Grainger does an excellent job articulating the benefits he saw in switching to an Elixir-based AI stack.

https://www.youtube.com/watch?v=Y2Nr4dNu6hI

PaulHoule · on Feb 15, 2023

I want so bad to see an article like this where somebody does some tests to see if the search results are any good.

itake · on Feb 15, 2023

I’ve played around with semantic search tools and the results were not great. This article [0] compares the model used in the above post with OpenAI’s model.

[0] - https://medium.com/@nils_reimers/openai-gpt-3-text-embedding...

mrdoops · on Feb 15, 2023

It's good, but these models are general-purpose starting points: they expect and recommend fine-tuning to get excellent results.

andy_ppp · on Feb 15, 2023

Is there ever going to be a way to distribute training? I would think this is the only way open models will eventually be able to exist and not just be owned by Microsoft, Google, Facebook and AWS.

jessfyi · on Feb 15, 2023

The BigScience team (a working group of researchers that trained the BLOOM-176B LLM last year) released Petals [0][1], which allows distributed inference and fine-tuning of BLOOM, with the option to pick a custom model + private swarm. SWARM [2][3] is a WIP from Yandex and UW that shares some of the same codebase, but is for distributed training.

[0] https://petals.ml/

[1] https://github.com/bigscience-workshop/petals

[2] https://github.com/yandex-research/swarm

[3] https://twitter.com/m_ryabinin/status/1625175933492641814

mrdoops · on Feb 15, 2023

There almost surely will, and Elixir/OTP is going to do it best.

nerdponx · on Feb 15, 2023

Training is distributed already, but over a big cluster of machines in a data center.

I've always wanted there to be something like BOINC/Gridcoin for fitting these giant neural networks.