agavra's comments

This is the biggest gap in the 0.2.1 release. We have a pretty naive query execution engine because we've spent most of the time on core data structures and ingestion.

I have some prototypes of vectorized compute that take that same query from 2s -> ~800ms, and it's still early days. If you want to contribute, the query engine is begging for help!


also super cool to see you on here valyala! we took a bunch of inspiration from your work at VM. kudos to all you've done :)

The other solution is to aggressively size your disk cache and keep effectively the full working set on disk, using object storage just as a durability layer. Then the main benefit is operational simplicity because you have a true shared-nothing architecture between the read replicas (there's no quorum or hash ring to maintain and no deduplication on read). Obviously you'll have a more expensive deployment topology if you do so, but it's still compelling IMO because you have the knobs to tune whether you want to cache on disk or not.

anecdotally I've heard confirmations of the challenge of running VictoriaMetrics clusters at scale. they're way better than Cortex/Thanos and they've built a pretty awesome product but still are a pretty significant operational burden.

Good point, a tl;dr is probably worthwhile.

It's definitely not quite turnkey just yet, but we've been dogfooding it in production against a moderate metrics use case (~30k samples/s) and have it hooked up to Grafana (you just configure a Prometheus source and point it to your deployed URL). We run it on a single node with no replicas ;)


Check out https://github.com/agavra/tuicr - it's built exactly for this purpose (reviewing code in your terminal and then adding comments and exporting them to an agent to fix).


This is really nice! I like the ability to add comments to "send it back" for another pass.


I just built a version of this a month ago that also allows you to add review comments so you can export them back to an Agent to fix: https://github.com/agavra/tuicr

Great work on deff, would love to brainstorm here :)


the compression algorithm you select for your data is quite dependent on the dataset you have. the equations in this blog post don't help you choose which compression to use, but rather "how much" and when to compress. I would be curious to formalize the math for different compression algorithms though... might be a good follow up post!


I was calculating timings and compression ratio for each array with each algorithm. Then I would save the “best” one to use for next chunks of data.
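A minimal sketch of that measure-then-remember loop, using only Python's standard-library codecs. The function names are hypothetical, and treating "best" as "highest compression ratio" is an assumption made for illustration; the comment above does not say how "best" was scored.

```python
import bz2
import lzma
import time
import zlib

# Candidate codecs; all three ship with the Python standard library.
CODECS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

def measure(data: bytes) -> dict:
    """Compress `data` with every codec and record ratio and wall time."""
    results = {}
    for name, compress in CODECS.items():
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {"ratio": len(data) / len(out), "seconds": elapsed}
    return results

def pick_best(results: dict) -> str:
    """Assumption: 'best' means the highest compression ratio."""
    return max(results, key=lambda name: results[name]["ratio"])

# Measure on one chunk, then reuse the winner for the following chunks.
sample_chunk = b"timestamp=1700000000 value=42\n" * 2000
best = pick_best(measure(sample_chunk))
next_chunk_compressed = CODECS[best](sample_chunk)
```

Swapping `pick_best` for a function that also weighs `seconds` is where the CPU vs disk/network judgment would have to go.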

But it is hard to decide how to judge the cpu vs disk/network tradeoff like you explain in the article.

I was a bit curious whether I could make an API so that, at the top level, the user enters some parameters and the system adjusts this calculation accordingly.

But I had some issues with this because the hardware budget is used by all parts of the system, not only by the compression code.

As an example network is mega fast in data center but can be slow and expensive when connecting to a user. The application can know which case it is executing but it is hard to connect that part of the code into the compression selection stuff cleanly.

Also, in the network case, it might make sense to keep data large and CPU time low until I hit the bandwidth limit, but once I hit the limit nothing else matters.

Would be cool to have a mathematical framework to put some numbers in and be able to reason about the whole picture
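One very rough way to put numbers on this is to model the cost of shipping a chunk as compression time plus transfer time and pick whichever option minimizes it. This is only a sketch under stated assumptions: the function names, the candidate ratios and timings, and the bandwidth figures are all made up for illustration, and the model ignores decompression cost, money, and contention with the rest of the system.

```python
def total_seconds(raw_bytes, ratio, compress_seconds, bandwidth_bytes_per_s):
    """Time to compress a chunk and then send the compressed bytes."""
    return compress_seconds + (raw_bytes / ratio) / bandwidth_bytes_per_s

def choose(raw_bytes, candidates, bandwidth_bytes_per_s):
    """candidates maps name -> (compression ratio, compress seconds)."""
    return min(
        candidates,
        key=lambda name: total_seconds(
            raw_bytes, *candidates[name], bandwidth_bytes_per_s
        ),
    )

# Hypothetical numbers for a 100 MB chunk.
RAW = 100_000_000
CANDIDATES = {
    "none": (1.0, 0.0),   # no compression
    "fast": (2.0, 0.1),   # light codec
    "slow": (4.0, 2.0),   # heavy codec
}

in_datacenter = choose(RAW, CANDIDATES, 10_000_000_000)  # ~10 GB/s link
to_end_user = choose(RAW, CANDIDATES, 10_000_000)        # ~10 MB/s link
print(in_datacenter, to_end_user)  # prints "none slow"
```

With these made-up inputs the fast link favors sending raw data and the slow link favors the heavy codec, which matches the intuition that the right choice depends on which case the application is executing in.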


This is spot on, I understand very little about how terminal rendering works and was able to build github.com/agavra/tuicr (Terminal UI for Code Review) in an evening. The initial TUI design was done via Claude.


RocksDB actually does something somewhat similar with its prefix compression. It prefix-compresses texts and then "resets" the prefix compression every N records, so it stores a mapping of reset point -> offset that lets you skip across compressed records. It's pretty neat.
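A simplified sketch of the idea, not RocksDB's actual on-disk format: each sorted key is stored as (shared prefix length, suffix), every N-th record is stored in full, and a list of reset points lets a lookup jump to the right group before decoding. The function names and the in-memory tuple layout are hypothetical; list indices stand in for the byte offsets a real block would store.

```python
from bisect import bisect_right

def common_prefix_len(a: str, b: str) -> int:
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i

def encode(sorted_keys, restart_interval=4):
    """Return (entries, restarts) for an already sorted list of keys."""
    entries, restarts = [], []
    prev = ""
    for i, key in enumerate(sorted_keys):
        if i % restart_interval == 0:
            restarts.append(i)   # reset point: key is stored in full
            shared = 0
        else:
            shared = common_prefix_len(prev, key)
        entries.append((shared, key[shared:]))
        prev = key
    return entries, restarts

def contains(entries, restarts, target: str) -> bool:
    # Keys at reset points are complete, so they can be binary searched
    # without decoding anything. (Rebuilt per call here for brevity.)
    restart_keys = [entries[r][1] for r in restarts]
    pos = bisect_right(restart_keys, target) - 1
    if pos < 0:
        return False
    start = restarts[pos]
    end = restarts[pos + 1] if pos + 1 < len(restarts) else len(entries)
    key = ""
    for shared, suffix in entries[start:end]:
        key = key[:shared] + suffix  # rebuild the key from the previous one
        if key == target:
            return True
        if key > target:
            return False
    return False

keys = sorted(["apple", "applet", "apply", "banana", "band",
               "bandit", "can", "candle", "candy"])
entries, restarts = encode(keys, restart_interval=4)
```

A lookup only decodes the records between two reset points, so the reset interval trades compression (larger N) against scan length (smaller N).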

