Hacker News | chrischen's comments

A photo is easy to take but hard to reproduce.

As is randomly splattering paint on a canvas, even with no artistic vision or skill.

The whole concept of intellectual property rights is a social and legal construct designed to promote innovation in an economy. If you don't care about that, then there really isn't any moral or immoral aspect to it. The immorality of it and associating it with stealing was just MPAA propaganda to try to shame people into paying for stuff.

If I found some DVD lying on the ground and watched it without paying for it, it's really up to me to decide if I want to pay the creator so they can continue to produce content. If I don't pay, then obviously it doesn't help them produce more content... but the consumption of the content itself is neither felt nor heard by the creators.


The bedrock of the argument is that you give for what you take. This is very fundamental, not just some capitalist drivel. You'd be hard pressed to find a single level headed individual who could form a coherent argument against it (generally speaking, not just protracted edge cases). Even your most hippie communist commune requires giving in order to receive.

People act (many even think) like this doesn't apply to digital goods, since copying has no material cost. But producing that digital good costs time and money (anyone on HN care to disagree?). So then you have to decide who pays and who gets free copies. Conveniently, everyone who is getting a free copy thinks they have a rightful stake to it for free. And because nothing is actually free (see the first line), the ones paying are also covering the cost for those who get it free.

I wouldn't expect teenagers to grasp this, after all we were the teenagers who devised this "piracy as a moral crusade" back in the 90's/00's (how convenient that a side effect of this moral crusade was all the free content your dead broke ass could imagine). But now, if you are in your 30's or older and still haven't logic'ed this out, it's time to catch up.


Simple: people who want it to exist can fund its creation. People who are indifferent or don't want it to exist can choose not to, and once it exists, there's obviously no moral question either way. We already have lifetimes of media available. It costs nothing to replicate infinitely. Do we need to specifically incentivize more?

I think the world would also be a lot better off if software could all be freely distributed and if warranty law required software sales to come with source as well. If you need the computer to do something, you pay a programmer to make it so. You or that programmer can then share the solution with others. The goal is to solve more problems and build a wealthier society for our children, not create rent extraction machines.

Likewise with things like the textbook racket. The government should just commission updates for k-12 books (including AP, so basic uni) every ~15 years or so. Most of this stuff is not changing. It should be "done".


> But producing that digital good costs time and money (anyone on HN care to disagree?)

Not disagree, but it is more nuanced than this I think. I spend a fair amount of money going to movie theaters, usually independent movie theaters but sometimes big ones, to see new releases. As I understand it, the production and funding model relies almost entirely on the box office numbers. I think when dealing with older releases, the waters are much murkier.

I end up seeing new things in person and paying a huge premium to do so. I won't pretend I do it for moral reasons or even strictly to support the creators (although I do it in part to support the independent theater itself). It does keep me from feeling bad for also running a media server, on which maybe 1% of the content is newer than 5 years old, though.

I have almost never bought a physical copy of a movie -- and in my mind the IP holders are usually terrible curators of their own content. Physical media comes in a horribly limited and anti-consumer format, tied to ephemeral standards and technology, often embedded with advertisements and offering few subtitle options.

Digital products are, somehow, worse: tied to a walled garden, with no true 'ownership'. Platforms like Amazon Video will even make their own edits to movies, removing crucial parts for no apparent reason (The Wicker Man, Avatar) and without marking them as abridged. They often make decisions that scream 'cash grab' (e.g. years ago when TNG came to Netflix, I went to stream it and was shocked at the potato quality; later re-releases were an un-cropped widescreen that included things like boom mics, because of the show's original intended aspect ratio). DRM is a nightmare.

The product I want -- a file containing the media and only the media, which I can view however I want without logging into anybody's servers -- does not exist. And if it did exist, well, I do also take issue with paying full price for a file of a 40-year-old movie, for example. I know there are costs associated with remasters, etc., but most of these are not remasters (and those costs are also much, much lower than outright movie production).

A notable exception is outfits like Vinegar Syndrome, who as a labor of love dig up lost media and often re-cut, remaster, and distribute it, and due to the low scale and lack of demand likely make little if any profit off it. I do often see showings of Vinegar Syndrome releases at my indie theater, though, or rent them from the one remaining video rental place (I'm unsure whether that benefits the production company).

It probably gets more hairy for people who watch a lot of new serialized media, which I do not.

I kind of wish people would think critically about the gradient of potential consumption habits when making their media choices rather than separating into pro / anti piracy stances, because it's an interesting and multi-faceted topic with a lot of considerations to be made.


Yesterday I was trying to figure out if my expired nacho dip would be safe to eat and wanted to know how much botulinum toxin would be dangerous if I ate it, so I asked Claude. It refused to answer the question, so I can see how the current safeguards can be limiting.

It beeps at you if you stop paying attention, which is superior. Hands-on-wheel is an arbitrary design decision, more likely meant to placate what a layman would think is necessary to ensure safe AI steering.


It's an option in openpilot, but not one that defaults to on.


This doesn't mean it's over for SO. It just means we'll probably trend towards quality over quantity. Measuring SO's success by the number of questions asked is like measuring code quality by lines of code. Eventually SO would trend down anyway, simply from advancements in search technology helping users find existing answers rather than ask new ones. It just so happened that AI advances made this even better (in terms of not needing to ask redundant questions).


With coding agents, I almost never manually type code anymore. It would be great to have a code editor that runs on my phone so I can do voice prompts and let the coding agents type for me.


That sounds awful


Similar to a product or engineering manager giving directions on a call from the golf course.


Golf course isn't bad; I witnessed a CEO join meetings from the subway and packed airport concourses lol


Early in my career I drew the short straw to fetch a C level exec who was running a critical incident from a strip club and too drunk to drive.

I had to pay the $90 three drink minimum to get in. Getting that reimbursed was fun.


But hey, look how productive they are with their time! :)


Funny enough, that sounds awful, too.


To be fair, with the languages I use there are only a finite number of ways a particular line or even function can be implemented, due to high-level algebraic data types and strict type checking. Business logic is encoded as data requirements, which are encoded into types, which are enforced by the type checker. Even a non-AI system could technically be made to fill in the code, but an AI system lets this be generalized across many languages that never implemented such auto-completion.
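As a sketch of what "business logic encoded as types" can look like (the order states and fields here are invented for illustration, not from any real codebase), a discriminated union makes invalid states unrepresentable, so anything filling in the blanks -- AI or otherwise -- has very few valid options:

```typescript
// Hypothetical order states: the type itself encodes the business rule
// that only a shipped order carries a tracking number.
type Order =
  | { status: "pending"; items: string[] }
  | { status: "shipped"; items: string[]; tracking: string }
  | { status: "cancelled"; reason: string };

// The compiler forces every case to be handled; an implementation that
// forgets a case, or reads `tracking` off a pending order, won't type-check.
function describe(order: Order): string {
  switch (order.status) {
    case "pending":
      return `Pending (${order.items.length} items)`;
    case "shipped":
      return `Shipped, tracking ${order.tracking}`;
    case "cancelled":
      return `Cancelled: ${order.reason}`;
  }
}
```

The same idea is what OCaml variants and Haskell ADTs give you natively; the compiler, not the reviewer, rejects the implementations that violate the encoded rules.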


I have been doing this with GitHub Copilot's agent web interface on my phone; a word-vomit voice prompt, plus instructions to always run the tests or take screenshots so I can evaluate the change, works really well.


Was it the GitHub Issues Copilot integration? I found that to be slow compared to running Copilot natively in the IDE.


Claude does this, at least on an iPhone. They added Code to the app about a month ago. I used it to get a Pebble Watch project started.


We already have verification layers: high-level strictly typed languages like Haskell, OCaml, ReScript/Melange (JS ecosystem), PureScript (JS), Elm, Gleam (Erlang), F# (.NET ecosystem).

These aren't just strict type systems; the languages also allow algebraic data types, nominal types, etc., which let you encode higher-level invariants enforced by the compiler.

The AI essentially becomes a glorified blank-filler. Basic syntax or type errors, while common, are automatically caught by the compiler as part of the vibe-coding feedback loop.


Interestingly, coding models often struggle with complex type systems, e.g. in Haskell or Rust. Of course, part of this has to do with the relative paucity of relevant training data, but there are also "cognitive" factors that mirror what humans tend to struggle with in those languages.

One big factor behind this is the fact that you're no longer just writing programs and debugging them incrementally, iteratively dealing with simple concrete errors. Instead, you're writing non-trivial proofs about all possible runs of the program. There are obviously benefits to the outcome of this, but the process is more challenging.


Actually, I found the coding models work really well with these languages. And the type systems are not actually complex. OCaml's type system is really simple, which is probably why the compiler can be so fast. Even back in the "beta" days of Copilot, despite it being marketed as Python-only, I found it handled OCaml syntax just as well.

The coding models work really well with esoteric syntaxes, so if the biggest hurdle to Haskell adoption was syntax, that's definitely less of a hurdle now.

> Instead, you're writing non-trivial proofs about all possible runs of the program.

All possible runs of a program is exactly what HM type systems type check for. Fed into the coding model, this lets it iterate automatically until it finds a solution that doesn't violate any possible run of the program.


There's a reason I mentioned Haskell and Rust specifically. You're right, OCaml's type system is simpler in some relevant respects, and may avoid the issues that I was alluding to. I haven't worked with OCaml for a number of years, since before the LLM boom.

The presence of type classes in Haskell and traits in Rust, and of course the memory lifetime types in Rust, are a big part of the complexity I mentioned.

(Edit: I like type classes and traits. They're a big reason I eventually settled on Haskell over OCaml, and one of the reasons I like Rust. I'm also not such a fan of the "O" in OCaml.)

> All possible runs of a program is exactly what HM type systems type check for.

Yes, my point was this can be a more difficult goal to achieve.

> Fed into the coding model, this lets it iterate automatically until it finds a solution that doesn't violate any possible run of the program.

Only if the model is able to make progress effectively. I have some amusing transcripts of the opposite situation.


I also do verbose type classes using OCaml's module system, and it's been handling these patterns pretty well. My guess is there's probably good documentation / training data for these patterns, since they're well documented. I haven't actually used coding agents with Haskell yet, so it's possible OCaml's verbosity helps the agent.


Real-time translation is a really good use case. The problem is that most implementations, such as AirPods Live Translation, are not great.


I've been in many situations where I wanted translations, and I can't think of one where I'd actually want to rely on either glasses or the airpods working like they do in the demos.

The crux of it for me:

- if it's not a person it will be out of sync, you'll be stopping it every 10 sec to get the translation. One could as well use their phone, it would be the same, and there's a strong chance the media is already playing from there so having the translation embedded would be an option.

- with a person, the other person needs to understand when your translation is going on and when it's over, so they know when to give an answer or when they can go on. Having a phone in plain sight is actually great for that.

- the other person has no way to check if your translation is completely out of whack. Most of the time they have some vague understanding, even if they can't really speak. Having the translation in the glasses removes any possible control.

There are a ton of smaller points, but all in all the barrier for a translation device to become magic and just work plugged in your ear or glasses is so high I don't expect anything beating a smartphone within my lifetime.


Some of your points are already addressed by current implementations. AirPods Live Translation uses your phone to display what you say to the other person, and the other person's speech is played to your AirPods. I think the main issues are that there is a massive delay and Apple's translation models are inferior to ChatGPT's. The other thing is that the AirPods don't really add much: it works the same as if you had the translation app open and both people were talking to it.

Aircaps demos show it to be pretty fast and almost real time. Meta's live captioning works really fast and is supposed to be able to pick out who is talking in a noisy environment by having you look at the person.

I think most of your issues are just a matter of the models improving and running faster. I've found translations tend not to be out of whack, but that can only really be solved by better translation models. In the case of AirPods Live Translation, the app shows both people's text.


That's understating the lag. Faster will always be better, but even "real time" still requires the other person to complete their sentence before you get a translation (there is the edge case of the other language having similar grammatical structure and word order, but IMHO that's rare), and you catch up from there. That's enough lag to warrant putting the whole translation process literally on the table.

I see the real improvements in the models, for IRL translation I just think phones are very good at this and improving from there will be exponentially difficult.

IMHO it's the same for "bots" intervening (commenting/reacting on exchanges, etc.) in meetings. Interfacing multiple humans in the same scene is always a delicate problem.


I have the G1 glasses and unfortunately the microphones are terrible, so the live translation feature barely works. Even if you sit in a quiet room and try to make conditions perfect, the accuracy of transcription is very low. If you try to use it out on the street it rarely gets even a single word correct.


This is the sad reality of most of these AI products: they're just tacking poor feature implementations onto the hardware. It seems like if they picked just one of these features and did it well, the glasses would be useful.

Meta has a model just for isolating speech in noisy environments (the “live captioning feature”) and it seems that’s also the main feature of the Aircaps glasses. Translation is a relatively solved problem. The issue is isolating the conversation.

I've found Meta is pretty good about not over-promising features, and as a result, even though they probably have the best hardware and software stack of any glasses, what you can do with the Ray-Ban Display glasses is extremely limited.


Is it even possible to translate in real time? In many languages and sentences the meaning and translation needs to completely change all thanks to one additional word at the very end. Any accurate translation would need to either wait for the end of a sentence or correct itself after the fact.


Live translation is a well-solved problem by this point -- the translation updates as it goes, so while you may see a mistranslation mid-sentence, it corrects when the last word is spoken. The user does need to be aware of this, but in my experience it works well.

Bear in mind that simultaneous interpretation by humans (eg with a headset at a meeting of an international organisation) has been a thing for decades.


I guess the translation can always update itself in real time if the model is fast enough.


But these models are more like generalists no? Couldn’t they simply be hooked up to more specialized models and just defer to them the way coding agents now use tools to assist?


There would be no point in going via an LLM then, if I had a specialist model ready I'd just invoke it on the images directly. I don't particularly need or want a chatbot for this.


Current LLMs are doing this for coding, and it's very effective. They delegate to tool calls, and a specialized model can just be thought of as another tool. The LLM can be weak at tasks handled by simple shell scripts or utilities, but strong at knowing which scripts/commands to call. For example, doing math natively in the model may be inaccurate, but the model may know to write the code that does the math. An LLM can automate a higher level of abstraction, in the same way a manager or CEO delegates tasks to specialists.
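A toy sketch of that delegation pattern (the tool registry and routing here are invented for illustration; real agents use structured tool-call APIs, not this shape): the generalist's job is only to pick the specialist, not to do the work itself.

```typescript
// Hypothetical tool registry: each "tool" is just a function the agent can invoke.
type Tool = (input: string) => string;

const tools: Record<string, Tool> = {
  // A specialist: exact arithmetic, which a generalist model is weak at.
  math: (expr) => String(Function(`"use strict"; return (${expr})`)()),
  // A stand-in for any other specialized model or shell utility.
  echo: (text) => text,
};

// Stand-in for the LLM's routing decision: it only decides *which*
// specialist handles the request, then passes the work along verbatim.
function agent(request: { tool: string; input: string }): string {
  const tool = tools[request.tool];
  if (!tool) throw new Error(`unknown tool: ${request.tool}`);
  return tool(request.input);
}
```

The point is that the registry can hold anything -- a calculator, a shell command, or a whole specialized vision model -- and the generalist only needs to know which one to reach for.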


In this case I'm building a batch workflow: images come in, images get analyzed through a pipeline, images go into a GUI for review. The idea of using a VLM was just to avoid hand-building a solution, not because I actually want to use it in a chatbot. It's just interesting that a generalist model that has expert-level handwriting recognition completely falls apart on a different, but much easier, task.


I think a bigger problem is the HN reader mind-reading what the rest of the world wants. At least when an HN reader tells us what they want, it's a primary source; a comment from an HN reader postulating what the rest of the world wants is simply noisier than even an unrepresentative sample of what the world may want.


Point taken. However, would you say HN readers are an accurate average cross-section of broader society? Including interests and biases?


I would guess HN readers are not an average cross-section of broader society, and precisely because of that, I would also guess HN readers are pretty bad at understanding what broader society is thinking.

