Hacker News | munch117's comments

Making a plan that works for the general case, but is also efficient, is rather trivial. Here's pseudocode from spending two minutes on the problem:

    # INPUT: lookfor: unicode
    var lower, upper: ascii
    lower = ascii_lower_bound(lookfor)
    upper = ascii_upper_bound(lookfor)
    for candidate:ascii in index_lookup(lower .. upper):
        if expensive_correct_compare_equal(candidate.field, lookfor):
            yield candidate
The magic is in the functions ascii_lower_bound and ascii_upper_bound, which compute an ASCII string such that any ASCII string comparing smaller (greater) cannot be equal to the input. Those functions are not hard to write. You might have to implement versions for each supported locale-dependent text comparison algorithm, but still, not a big deal.

Worst case, 'lower' and 'upper' span the whole table - could happen if you have some really gnarly string comparison rules to deal with. But then you're no worse off than before. And most of the time you'll have lower==upper and excellent performance.
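A rough Python sketch of the bound functions, assuming the comparison is merely case- and accent-insensitive (a real locale collation would need its own folding rules; ascii_bounds, index_lookup and expensive_compare_equal are names I made up for illustration, not an existing API):

```python
import unicodedata

def ascii_bounds(lookfor: str) -> tuple[str, str]:
    # Fold to an ASCII skeleton: decompose accented characters and
    # drop everything non-ASCII (a simplification; each supported
    # collation would need its own folding).
    folded = unicodedata.normalize("NFKD", lookfor)
    skeleton = "".join(c for c in folded if ord(c) < 128)
    # In ASCII, uppercase sorts before lowercase, so the all-uppercase
    # form is the smallest case variant and all-lowercase the largest.
    return skeleton.upper(), skeleton.lower()

def find_matches(lookfor, index_lookup, expensive_compare_equal):
    lower, upper = ascii_bounds(lookfor)
    for candidate in index_lookup(lower, upper):  # inclusive index scan
        if expensive_compare_equal(candidate, lookfor):
            yield candidate
```

Any case variant of the skeleton sorts between the two bounds, so the index scan can't miss a match; the expensive comparison then discards the non-matches the range let through.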


> Just clearly state your requirements.

Nothing new here. Getting users to clearly state their requirements has always been like pulling teeth. Incomplete sentences and all.

If the people you are teaching are developers, they should know better. But I'm not all that surprised if many of them don't. People will be people.


You're right, they should know better, but I think a lot of them have gotten away with it because most of them are not expected to produce written material setting out missing assumptions etc. and breaking down the task into more detail before proceeding to work, so a lot have never gotten the practice.

Once people have had the experience of being a lead and having to pass tasks to other developers a few times, most seem to develop this skill at least to a basic level, but even then it's often informal and they don't get enough practice documenting the details in one go, say by improving a ticket.


TNG, by a country mile. B5 has "writer identifies too much with the main character" written all over it. It's the story of how Our Great Leader does the right thing and saves the world, over and over again.


I'm slightly taken aback by the telnetd fix: The solution to the username "-f root" being interpreted as two arguments to /usr/bin/login is to add a "sanitize" function, really? I'm not seeing the sense in that. Surely in any case where the sanitize function changes something, the login will fail. Better to error out early than to sanitize and try to hobble along.

What I'd like to know is how the arguments get interpreted like that in the first place. If I try giving that kind of argument to /usr/bin/login directly, its argument parser chides me:

  $ login '-f root'
  login: illegal option --  
What's telnetd doing differently? Is it invoking login via a shell?


You passed '-f root' to login as a single string. telnetd is likely passing '-f' 'root' to login, i.e. two arguments instead of one; whether that's because it builds two arguments itself or because it goes through the shell (which re-splits), I don't know.

But '-f' is a valid option to login (man login):

    login [-p] [-h host] [-H] [-f username|username]

    ...

    -f      Used to skip a login authentication. This option is usually
            used by the getty(8) autologin feature.


I was reading https://www.offsec.com/blog/cve-2026-24061/, which implies that precisely that single long string passes through getenv("USER") in the attack. The mystery is how that single long string in telnetd becomes two separate arguments for login. execv or posix_spawn certainly won't do that. So either there's a version of /usr/bin/login that parses arguments in some very sus way, or there's a shell involved somehow.
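That difference is easy to demonstrate. A small Python sketch (my illustration using subprocess on a POSIX system, not telnetd's actual code path): passing the string through an exec-style argv keeps it as one argument, while handing the command line to a shell splits it in two.

```python
import subprocess, sys

user = "-f root"  # the hostile "username", as one string
script = "import sys; print(len(sys.argv) - 1)"

# exec-style (like execv/posix_spawn): the string stays one argv element.
n_exec = subprocess.run(
    [sys.executable, "-c", script, user],
    capture_output=True, text=True,
).stdout.strip()

# shell-style: /bin/sh re-parses the line and splits on whitespace.
n_shell = subprocess.run(
    f'{sys.executable} -c "{script}" {user}',
    shell=True, capture_output=True, text=True,
).stdout.strip()

print(n_exec, n_shell)  # exec keeps one argument; the shell makes two
```

So if login really receives '-f' and 'root' separately, somewhere along the way something is doing shell-style word splitting.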


This article is just about as un-AI written as anything I've ever read. The headings are clearly just the outline that he started with. An outline with a clear concept for the story that he's trying to tell.

I'm beginning to wonder how many of the "This was written by AI!" comments are AI-generated.


It's strange to see folks here speculate about something you've written.

And if you only knew how much those headings and the structure of this post changed as I wrote it out and got internal feedback on it ^^_


I struggled a bit with what to point to as signs that it's not an LLM conception. Someone else had commented on the headlines as something that was AI-like, and since I could easily imagine a writing process that would lead to headlines like that, that's what I chose. A little too confidently perhaps, sorry.

But actually, I think I shouldn't have needed to identify any signs. It's the people claiming something's the work of an LLM based on little more than gut feelings, that should be asked to provide more substance. The length of sentences? Number of bullet points? That's really thin.


I don't think people should be obligated to spend time and effort justifying their reasoning on this. Firstly, it's highly asymmetrical: you can generate AI content with little effort, whereas composing a detailed analysis requires a lot more work. It's also not easily articulable.

However there is evidence that writers who have experience using LLMs are highly accurate at detecting AI generated text.

> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization. Qualitative analysis of the experts’ free-form explanations shows that while they rely heavily on specific lexical clues, they also pick up on more complex phenomena within the text that are challenging to assess for automatic detectors. [0]

Like the paper says, it's easy to point to specific clues in AI-generated text, like the overuse of em dashes, overuse of inline lists, unusual emoji usage, title case, frequent use of specific vocab, the rule of three, negative parallelisms, elegant variation, false ranges, etc. But harder to articulate, and perhaps more important to recognition, is the overall flow, sentence structure and length, and various stylistic choices that scream AI.

Also worth noting that the author never actually stated that they did not use generative AI for this article. Saying that their hands were on the keyboard or that they reworked sentences and got feedback from coworkers doesn't mean AI wasn't used. That they haven't straight up said "No AI was used to write this article" is another indication.

0: https://arxiv.org/html/2501.15654v2


> Also worth noting that the author never actually stated that they did not use generative AI for this article.

I expect that they did in some small way, especially considering the source.

But not to an extent where it was anywhere near as relevant as the actual points being made. "Please don't complain about tangential annoyances," the guidelines say.

I don't mind at all that it's pointed out when an article is nothing more than AI ponderings. Sure, call out AI fluff, and in particular, call out an article that might contain incorrect confabulated information. This just wasn't that.


A __del__ that does any kind of real work is asking for trouble. Use it to print a diagnostic reminding you to call .close() or .join() or use a with statement, and nothing else. For example:

    def __init__(self):
        self._closed = False
    def close(self):
        self._closed = True
        self.do_interesting_finalisation_stuff()
    def __del__(self):
        if not self._closed:
            print("Programming error! Forgot to .close()", self)
If you do anything the slightest bit more interesting than that in your __del__, then you are likely to regret it.

Every time I've written a __del__ that did more, it has been trouble, and I've ended up whittling it down to a simple diagnostic. With one notable exception: a __del__ that put a termination notification into a queue.Queue that a different thread was listening to. That one worked great: if the other thread was still alive and listening, it would get the message. If not, the message would just get garbage-collected along with the Queue, but it would have been redundant anyway, so that was fine.
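The queue exception, sketched out (class and sentinel names are my own illustration):

```python
import queue

class Resource:
    # __del__ does nothing except drop a sentinel into a Queue that a
    # reaper thread may be draining; no locks, no I/O, no other objects.
    def __init__(self, notify: queue.Queue):
        self._notify = notify

    def __del__(self):
        # put_nowait never blocks. If nobody is listening any more,
        # the sentinel just gets collected along with the queue.
        try:
            self._notify.put_nowait("terminated")
        except queue.Full:
            pass
```

The safety comes from put_nowait being both non-blocking and harmless when the consumer is gone, which is exactly the property most __del__ bodies lack.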


Yep, a __del__ in the redis client code caused almost-random deadlocks at my job for several years. Manual intervention was required to restart stuck Celery jobs. It took me about 2-3 weeks to find the culprit (I had to deploy a Python interpreter compiled with debug info to production, wait for the deadlock to happen again, attach with gdb and find where it happened). One of the most difficult production issues I've had to solve in my life, because it happened randomly and it was impossible to even remotely guess what was causing it.


One helpful rule is: if you use `__del__`, it should be on a separate class which doesn't contain any methods or data except the native handle.

You can't call inappropriate functions if you don't have any way to reach them!
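A sketch of that rule, using a file descriptor as the native handle (Connection and its attributes are illustrative, not a real API):

```python
import os

class _Handle:
    # The finaliser lives on a tiny class holding only the raw
    # resource, so __del__ physically cannot reach locks, caches,
    # or any other rich state.
    __slots__ = ("fd",)
    def __init__(self, fd):
        self.fd = fd
    def __del__(self):
        os.close(self.fd)

class Connection:
    # The rich object carries no __del__ of its own; cleanup is
    # delegated entirely to the handle.
    def __init__(self, fd):
        self._handle = _Handle(fd)
        self.cache = {}  # out of the finaliser's reach
```

With __slots__ limited to the handle itself, there is simply nothing inappropriate for the finaliser to call.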


> With one notable exception: A __del__ that put a termination notification into a queue.

Yeah, at some point, I was working on a prototype of finalization for JavaScript, and that was also my conclusion.


> If you think you disagree with him (as I once did), please consider the possibility that you've only been exposed to an ersatz characterization of his argument.

My first exposure was a video of Searle himself explaining the Chinese room argument.

It came across as a claim that a whole can never be more than its parts. It made as much sense as claiming that a car cannot possibly drive, as it consists of parts that separately cannot drive.


This https://youtu.be/6tzjcnPsZ_w maybe? It's Searle explaining it.


I'm not that concerned with bugs in sqlite. sqlite is high quality software, and the application that uses it is a more likely source of vulnerabilities.

But I do see a problem if you really need to use a sqlite that's compiled with particular non-default options.

Say I design a file format and implement it, and my implementation uses an sqlite library that's compiled with all the right options. Then I evangelize my file format, telling everyone that it's really just an sqlite database and sooo easy to work with.

First thing that happens is that someone writes a neat little utility for working with the files, written in language X, which comes with a handy sqlite3 library. But that library is not compiled with the right options, and boom, you have a vulnerable utility.


Most of the recommended [1] settings are available on a per-connection basis, through PRAGMAs, sqlite3_db_config, sqlite3_limit, etc.; some are global settings, like sqlite3_hard_heap_limit64.

A binding can expose those settings. It's not a given that a third-party utility will use them, but it can.

1: https://www.sqlite.org/security.html
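For instance, in Python's sqlite3 module a few of those recommendations can be applied per connection via PRAGMAs. A sketch with an illustrative selection (not the complete list from the security page, and the compile-time options and global C-API settings can't be set this way):

```python
import sqlite3

def open_untrusted(path: str) -> sqlite3.Connection:
    # Harden a connection before reading a possibly hostile file.
    conn = sqlite3.connect(path)
    # Don't extend extra trust to SQL embedded in the schema.
    conn.execute("PRAGMA trusted_schema = OFF")
    # Catch corrupt b-tree cells early instead of misbehaving later.
    conn.execute("PRAGMA cell_size_check = ON")
    # Avoid memory-mapped I/O when reading a possibly corrupt file.
    conn.execute("PRAGMA mmap_size = 0")
    return conn
```

A binding or utility that routes all opens of untrusted files through a helper like this gets most of the benefit without needing a specially compiled sqlite.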


Ah, I missed that 9.a-c were alternatives. And that, in the absence of custom tables or functions, they are merely defense in depth for something that is already secure, barring bugs. I withdraw my concern.


You have just reinvented the slab allocator.


Sure, but I was specifically thinking in the context of this article.


Was intermittent fasting even a thing 15 years ago? I wonder how many of the people eating within an 8-hour window did it not because of a diet, but because of an eating disorder or some other disease.


Anecdotally, yes, I was doing it and there was a bunch of stuff online.

