Interesting. How long does it take to build the index for a movie like Spider-Man (on whatever hardware you have at hand)? And how big will its index be?
The images all get scaled down to 256x256 before the embeddings are generated, to save space. I don't have an exact number on how large the index will be, but I'm happy to run some tests and get back to you. The embeddings are stored as float32 arrays of length 512.
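For a rough sense of scale (the frame rate and runtime below are assumptions for illustration, not measurements from the tool): a 512-dimension float32 embedding is 2 KiB, so an index sized per sampled frame works out to:

```python
# Back-of-envelope index size: one 512-dim float32 embedding per frame.
# The sampling rate and movie length are assumed, not from the tool.
BYTES_PER_FLOAT32 = 4
EMBEDDING_DIM = 512
bytes_per_frame = EMBEDDING_DIM * BYTES_PER_FLOAT32  # 2048 B = 2 KiB

movie_seconds = 2 * 60 * 60   # a ~2 hour film
fps_sampled = 1               # assume one sampled frame per second
frames = movie_seconds * fps_sampled

index_bytes = frames * bytes_per_frame
print(f"{frames} frames -> {index_bytes / 1024 / 1024:.1f} MiB")
```

Under those assumptions the raw embeddings come to about 14 MiB for a feature film, before any index overhead or compression.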
Are the images compressed losslessly, or do they lose quality? I've played around with image compression in Electron JS and haven't found a good solution, so any resources are greatly appreciated.
The resizing process doesn't introduce compression artifacts, but there is some quality loss when the images are processed by CLIP, depending on their dimensions. CLIP expects square inputs, so that's a limitation of the model.
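To illustrate where that loss comes from (a sketch with an assumed source resolution, not the tool's actual pipeline): squashing a widescreen frame into a square downsamples the two axes unevenly, so horizontal detail is thrown away faster than vertical detail.

```python
# Squashing a 16:9 frame (assumed 1920x1080) into a 256x256 square:
# the width is downsampled ~7.5x but the height only ~4.2x,
# which distorts aspect ratio and discards more horizontal detail.
src_w, src_h = 1920, 1080
dst = 256
scale_w = src_w / dst
scale_h = src_h / dst
print(round(scale_w, 2), round(scale_h, 2))  # 7.5 vs 4.22
```

Alternatives like center-cropping or letterboxing to a square trade that distortion for lost edges or wasted pixels instead.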
In terms of general compression, are you familiar with FFmpeg? It supports lossless encoding into a bunch of different image formats.
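For example (a sketch; the file names are placeholders), a single ffmpeg invocation can dump frames at one per second as PNGs, which are losslessly compressed. The snippet just builds the command so it stays self-contained; you'd run it with `subprocess.run(cmd)` or paste the printed line into a shell.

```python
# Hypothetical ffmpeg invocation: extract one frame per second from a
# video as lossless PNGs ("-vf fps=1" is ffmpeg's frame-rate filter,
# and the PNG encoder is lossless by nature).
cmd = ["ffmpeg", "-i", "movie.mp4", "-vf", "fps=1", "frame_%04d.png"]
print(" ".join(cmd))
```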