What's state without mutation? If something is immutable then you can't tell the difference between a value and a copy of that value, so the question of whether it's "state" is rather meaningless.
I think you can define it. You just need to be willing to pin down some definitions so there's something you can formally talk about.
For example, let's say a "program" (in whatever abstract sense, whether you want to represent it as DAGs of function compositions or as a sequence of instructions or whatever else) takes, by definition, an infinite sequence of input symbols, and produces an infinite sequence of output symbols. Both sequences start (WLOG) at index 0. Note that finite sequences can be represented by appending an infinite tail of symbols you designate as null.
Let's say that "running" the program "until step k" (in whatever abstract sense you want to define, whether it's evaluating functions or following a sequence of operations or whatever) corresponds to "producing all outputs at indices <= k". The "state" of that program at step k might then be defined as the amount of information (measured in bits or whatever) that that particular program requires to eventually produce the outputs at indices > k, aside from the information needed to represent the program itself.
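As a minimal sketch of that definition (all names here are hypothetical, and Python stands in for whatever abstract model you prefer), take a running-sum program: it consumes an input stream and emits prefix sums. To produce the outputs at indices > k, it needs only the current total, so its "state" at step k is just the information in that one number, independent of k:

```python
from itertools import count, islice

def running_sum(inputs):
    # A "program" in the sense above: consumes an infinite input stream,
    # produces an infinite output stream (output i = sum of inputs 0..i).
    total = 0
    for x in inputs:
        total += x
        yield total

def resume(total, remaining_inputs):
    # To produce all outputs at indices > k, the program needs only
    # `total` (the state at step k) plus the not-yet-consumed inputs.
    for x in remaining_inputs:
        total += x
        yield total

full = list(islice(running_sum(count(1)), 6))     # outputs 0..5
# Run to step k = 2, snapshot the state (sum of inputs 0..2), resume:
resumed = list(islice(resume(1 + 2 + 3, count(4)), 3))
assert full[3:] == resumed  # the snapshot reproduces the tail exactly
```

The point of the sketch is that "state" falls out of the definition: whatever minimal information lets you cut the run at step k and still produce every later output.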
As long as you formally define a way to measure information and a way to represent your program, I think you should be able to come up with a definition as I did above to define "program state" formally. You just need to be willing to do that, is all.
I don't think that captures the aspect people care about. Most of the "program state" that people care about is unrelated to I/O. In Haskell in particular, people care about programs that are essentially "evaluate this large expression", where there's only one piece of output at the end but possibly a lot of intermediate computation. And the details of what's stored in memory aren't what matters: after all, Haskell very deliberately makes no distinction between a plain value like 3 and a large, complicated thunk that will eventually evaluate to that value. But if you just measure the minimum number of bits required to store something equivalent, then you've assumed away the whole evaluation of the program.
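A rough illustration of that last point (a Python analogy for Haskell's thunks, not a claim about how GHC represents them): a deferred computation and the plain value it eventually evaluates to are "equivalent", so any minimum-bits measure assigns them the same size and never sees the work in between.

```python
plain = 3

# A "thunk": deferred work that will eventually evaluate to the same 3,
# but only after scanning a million candidates.
thunk = lambda: sum(1 for x in range(1, 1_000_000) if x % 333_333 == 0)

assert thunk() == plain
# A measure based on "minimum bits to store something equivalent"
# gives both the same ~2 bits, ignoring the million-step evaluation
# hiding inside the thunk.
```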
At least from the functional perspective you're coming at this from the wrong end in the first place - you're looking at it operationally rather than denotationally. The point of saying that there's "no state" or "no mutation" in a functional program is that there is no such concept at the semantic, code level: when reading any given piece of code, you don't need to think about state to understand what it's doing.
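To make that concrete (a toy Python sketch, not Haskell; both functions are hypothetical): with a pure function, any call can be replaced by its value without knowing what ran before, while a stateful counterpart forces you to track hidden history just to read a call site.

```python
def pure_scale(x):
    # Pure: the result depends only on the argument.
    return x * 2

calls_so_far = 0
def stateful_scale(x):
    # Stateful: the result also depends on hidden, mutable history.
    global calls_so_far
    calls_so_far += 1
    return x * 2 + calls_so_far

# Referential transparency: pure_scale(5) means 10, always, so you
# can substitute one for the other anywhere in the code.
assert pure_scale(5) == pure_scale(5)
# The stateful version returns different values for the same input,
# so no call site can be understood without the full execution history.
assert stateful_scale(5) != stateful_scale(5)
```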
You're muddying the waters here. If you're saying "I/we don't care about a notion of state in Haskell", that's fine; I can't force you to care about something you don't care about. I'm just saying if you did care to, you very well could find a pretty meaningful and useful definition for "state".
The thing is, though, the fact that you don't care doesn't imply state is meaningless or useless. After all, you're still going to face the fundamental limits on computation that are out there, regardless of the model of computation. It's not like switching to Haskell and beta-reductions suddenly lets you solve every problem instantaneously, right? You're still going to have just as hard a time (if not harder!) solving a problem like TSP efficiently in Haskell as you would in Python or Prolog.
Look at it this way: the only reason you wouldn't "care" about state would be that you already know how things are going to turn out, not because the whole notion is itself meaningless. If tomorrow you wrote a Haskell program that factored integers in time and space linear in the input size, you wouldn't toss it into the trash thinking "whatever, time and space are meaningless concepts in FP". You'd absolutely care. Not only would you care, but you'd safely assume everyone else would too. You'd either find a way to profit off those results, or you'd publish them with formal definitions and tell everyone and their grandmother how awesome and meaningful they are. You'd find them as far from meaningless as anything could be. Of course, you're probably not going to get that lucky, but your lack of luck or care for the concepts doesn't render the concepts themselves meaningless.
> I'm just saying if you did care to, you very well could find a pretty meaningful and useful definition for "state".
I'm not at all convinced. The thing that actually determines the performance of real-world programs these days (after algorithm choice) is cache efficiency, and no mainstream programming model gives you a "state" model that actually lets you reason about that.
> Look at it this way: the only reason you wouldn't "care" about state would be because you already know how things are going to turn out, not because the whole notion is itself meaningless. [...]
Today's programmers put way too much focus on performance given that we as an industry haven't even mastered producing the correct output for any given input. Do I care about how much memory my program will use at runtime? Sure, at some level. Is it something I want front-and-centre in my programming language? No; language features that help me write correct programs are a higher priority.
> The thing that actually determines the performance of real-world programs these days (after algorithm choice) is cache efficiency
> Today's programmers put way too much focus on performance
So statefulness is a meaningless concept in functional programming because... programmers put too much focus on performance? And cache efficiency is what matters?
Is there a logical chain of implications I'm missing here? I feel like we're not even in the same conversation at this point.
Well, I got lost trying to follow you. My position is: functional programming more or less means expressing (denotational) computation without using a concept of (hidden, mutable) state. This is a good approach because it makes it easier to write correct programs. I disagree with the argument that a programming language should expose state at the language level (rather than making it a detail of the implementation) for the sake of performance, on two levels: first, performance is less important than correctness, so it's a bad idea to sacrifice correctness for performance in language design; second, exposed state doesn't even help you improve performance, because cache efficiency is what matters.