## Wednesday, 18 September 2013

### The Space of Languages

From time to time, I post things on the concept of language. What got me thinking about this originally was the modal status of T-sentences, and I gave a few talks on it over the last six or seven years, including a talk "Cognizing a Language" at the linguistics society in Edinburgh.

Since semantic concepts are language-dependent ("true-in-L", "refers-in-L", "implies-in-L", etc.), there's a quick argument that T-sentences are necessities. But this point is by no means restricted to T-sentences. The same holds when we consider any description of the syntactic, phonological, semantic, pragmatic properties of a language L. And, consequently, we need to distinguish:
(i) the syntactic, phonological, semantic, pragmatic properties of a language L
(ii) the cognizing relation that holds between an agent A and a language/idiolect L that A speaks, implements, realizes, etc.
because the modal status of the relevant facts is entirely different. (The orthodox view is that semantic facts are contingent. See, e.g., here.)

The basic argument was given by Field (1986) and Putnam (1985). Putnam, in particular, applied modus tollens: he took it to be some kind of reductio of Tarskian semantic theory that it yields the conclusion that semantic facts are necessities. But I apply modus ponens: they are necessities. What is contingent is the cognizing relation between agents and languages. Analogously, the properties of some Turing machine program are necessary, while it is contingent whether some physical machine "realizes" or "implements" that program.

The agents in a community $G$ will, in general, cognize different idiolects, $L_1, L_2, \dots$, even if these idiolects are very similar to each other. The point is that, strictly speaking, $L_i \neq L_j$ (for $i \neq j$). And a single agent may cognize multiple idiolects, or "micro-idiolects", which may be changing all the time. So, Humpty Dumptyism is true (... despite the protests of many philosophers!).

On this view, suppose we define $\Omega$ to be the space of all languages. So,
$\Omega$ = the collection of all $L$ such that $L$ is a language.
There may well be a Russell-style paradox, connected to largeness and self-reference, lurking here; maybe I'll mention it at some later point. So, $\Omega$ might have to be a somehow regulated space; e.g., the space of all set-sized, or maybe well-founded, languages.

$\Omega$ contains the idiolects spoken by each and every cognitive system, each human, old and young, each non-human creature, any non-terrestrial creature, or cognitive system there might be; as well as all the languages that, for feasibility reasons, cannot be spoken/cognized. $\Omega$ contains the idiolect you speak right now, and all other idiolects you may have spoken as your language state evolved to its present one. It contains all theoretically defined languages, finite and infinite, etc. It contains uninterpreted languages and it contains interpreted languages. It contains the Guitar Language, which is an odd language with no syntax at all.

[If so-called natural languages are languages (I think they aren't), then $\Omega$ contains all natural languages. But I think "natural languages" are idealized entities of some sort, as there is no individual that actually speaks or cognizes such a language. Strictly speaking, so-called natural languages, such as English, French, Hindi, etc., do not exist, in the sense of there being a community all speaking the same language. For example, what is the exact number of words in English? What is the exact pronunciation of "ouch"? Speakers exist and so do their idiolects, which may be changing in very complicated ways. But the concept of a natural language seems to be some kind of Hegelian myth, akin to "races".]

If what is said above is right (... I am plowing a very lonely furrow here), the sub-discipline within the philosophy of language that's now called "metasemantics" then has two main tasks. But these have a fundamentally different character modally and scientifically:
(i) What are the properties of the space $\Omega$ of languages? What are the individuation conditions for the elements $L \in \Omega$; what are the various relations amongst the $L \in \Omega$; etc.
(ii) How does the cognitive state of an agent evolve through $\Omega$? What is the nature of the cognizing relation, "$A$ cognizes $L$ (at time $t$)", which specifies the language state of a cognitive system $A$? How might it be constrained in terms of other cognitive states (memory, conceptual competence, perceptual input and action output, genetic factors, mental representation of strings and linguistic symbols, etc.)?
The first problem belongs to applied mathematics, and this seems to be well reflected by the actual practice of workers in this field. Languages $L \in \Omega$ are specified (usually by an explicit definition of their syntax and sometimes the meaning functions: referential, intensional and pragmatic) and their properties are examined, usually by proving theorems. The Chomsky Hierarchy is an example, but there are literally countless examples: uninterpreted formal languages; simple propositional languages; predicate logic languages; languages with all kinds of extra gadgets and operators (modal, epistemic, temporal, etc.); typed languages; higher-order ones; infinitary languages; highly finitary languages; languages with no syntax (cf. the Guitar Language); etc.

Let us say that those who work on the first problem are studying $\Omega$, the space of all languages. Modally speaking, the properties of languages $L \in \Omega$ established are essential. Relations amongst languages $L_1, L_2 \in \Omega$ hold of necessity. For example, true claims of the form
$L^{+}$ is an extension of $L$ such that there is a relation definable in $L^{+}$ but not in $L$.
There is no intension-preserving translation $t: L_1 \to L_2$.
The string $\sigma$ is true in $L$ if and only if snow is white.
will hold of necessity.

Some work in theoretical linguistics, formal semantics, computational linguistics, mathematical logic, etc., belongs to this area: it studies $\Omega$. Its theorems are about $\Omega$, and they hold of necessity. The semantic description of a language $L \in \Omega$ holds of necessity. For the semantic properties of $L$ are intrinsic to it. If $L^{\ast}$ has different semantic properties, then $L^{\ast} \neq L$.
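
As a toy illustration of this individuation claim, here is a minimal Python sketch. It rests on a strong simplifying assumption: that a language is nothing more than a finite set of strings together with a meaning assignment. The names `Language`, `strings` and `meaning` are hypothetical, introduced only for this example, and are not a proposal about what the elements of $\Omega$ really are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Toy model (an assumption): a language is its strings plus their meanings."""
    strings: frozenset  # the syntax, here just a finite set of strings
    meaning: frozenset  # the semantics, as a set of (string, meaning) pairs


# Two toy languages with the same strings but different semantics.
L = Language(
    strings=frozenset({"s"}),
    meaning=frozenset({("s", "snow is white")}),
)
L_star = Language(
    strings=frozenset({"s"}),
    meaning=frozenset({("s", "grass is green")}),
)

# Equality is structural, so differing semantic properties give distinct languages.
print(L == L_star)                          # False
print(L == Language(L.strings, L.meaning))  # True
```

Nothing here models an agent: whether anyone cognizes `L` or `L_star` is a separate, contingent question that the data structure deliberately leaves out.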

On the other hand, the question:
Does $L \in \Omega$ have one, many or no agents that speak/cognize $L$?
is a contingent matter. Compare with, say, the questions:
Does the large natural number $10^{10^{10^{10}}}$ have a "physical token"?
Is the infinite cardinal $\aleph_0$ "physically realized"?
etc.
These are contingent matters, requiring physical theory and experiment to help answer them.

The second problem, in contrast, belongs to empirical science. But I think the problem(s) here are very difficult, much harder than those confronting the first problem. If we are honest, very little is known about:
• the genetic basis of language cognition,
• how a cognitive language-using system evolves to anything like the mature cognitive state,
• what grounds or constitutes a cognitive system's cognizing $L$ rather than $L^{\ast}$,
• etc.
By analogy with physics, one would like to have some account of a "state-function",
$L_A(t) \in \Omega$
which specifies how the language-cognizing state evolves, over time, through successive idiolects, and in connection with other states of the system. (Cf., in physics, the state of the system is an element of a state space, and the dynamical principles specify its evolution.)
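
To make the shape of such a state-function a little more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that time is discrete, that an idiolect can be modelled as a finite set of strings, and that there is some update rule taking one idiolect to the next. The names `trajectory`, `step` and `add_string` are hypothetical; no actual dynamical principle of this kind is known.

```python
from typing import Callable, List

Idiolect = frozenset  # toy stand-in for an element of Omega


def trajectory(initial: Idiolect,
               step: Callable[[Idiolect, int], Idiolect],
               times: int) -> List[Idiolect]:
    """Return [L_A(0), L_A(1), ..., L_A(times)] under a given update rule."""
    states = [initial]
    for t in range(times):
        states.append(step(states[-1], t))
    return states


# Hypothetical update rule: at each time step the agent adds one new string.
def add_string(idiolect: Idiolect, t: int) -> Idiolect:
    return idiolect | {f"word{t}"}


orbit = trajectory(frozenset(), add_string, 3)
print([len(L) for L in orbit])  # [0, 1, 2, 3]
```

The empirical difficulty lies entirely in `step` and `initial`: the sketch shows only the form an answer would take, not its content.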

The second problem uses (contingent) notions like:
$A$ "uses" string $\sigma$ when in a certain cognitive/affective state
$A$ and $B$ "communicate" with each other
$A$ "acquired" language by "interacting" with $B$
$A$ "copied" a word+meaning from $B$
$A$ introduced a new string $\sigma$.
$A$ "uses" string $n$ to refer to $x$.
etc.
It seems to me that no one properly understands any of this.

For example, how does one explain how agents $A$ and $B$ "communicate"? What is a "language community"? What could a "communal/social language" be? The cognitive states involved in a group of interacting speakers are associated with the idiolects actually spoken, much as in physics one is interested in the states that the system actually is in. What is the exact cognitive/affective state that an agent is in when "using" the strings "ouch" or "it's raining" or "the square root of $2$ is irrational"? What explains the introduction of new strings? How is a string "used" to refer to some object? To the object then, or the object now? Can anyone predict with some reasonable accuracy the evolving sequence of idiolects of a child? There are (combinatorially) countlessly many orbits through $\Omega$. Why one, and not another? What is the initial cognitive state? No one knows.

1. In answer to one of the earliest questions in this post, I think it is possible to imagine that the relationship between agents, rather than the semantics, is trivial. This is possible if there are standard applications for the perceiver, e.g. if the mind requires some referent point (qua language) to have any perspective whatsoever. This is easier to believe if it turns out that what we mean by constructing a meaningful point of view has something to do with language or systems, e.g. if there is no certainty without language. If that is not the case, then people can disagree on emotional grounds. But is that really disagreement? It might be just as easy to say that agents do not even refer to each other, rejecting one of the fundamental premises of society. What I detect is a willingness to outright accept arguments which have not been made, by accepting the view that individuals perceive ABSOLUTELY differently. Indeed, what could we mean by absolute? And where there is not an absolute difference, so the argument goes, there IS some relation.

So I find the argument that agents are perceptually different to be specious, because, indeed, language establishes a similarity. And language is all that could ever be discussed. As soon as there are emotional components of language, there are also emotional similarities. Otherwise, in those cases, the discussion simply fails to be sophisticated.