Commands and Collaboration in the Origin of Human Thinking: A Response to Azeri’s “On Reality of Thinking,” Chris Drain

Siyaves Azeri’s “On Reality of Thinking” (2021) is yet another informative contribution to the philosophical deployment of Vygotskian psychology. I’m grateful to Mr. Azeri for taking the time to reply to my comments and for the opportunity to continue this dialogue here.

Article Citation:

Drain, Chris. 2021. “Commands and Collaboration in the Origin of Human Thinking: A Response to Azeri’s ‘On Reality of Thinking’.” Social Epistemology Review and Reply Collective 10 (3): 6-14.

🔹 The PDF of the article gives specific page numbers.

This article replies to:

❧ Azeri, Siyaves. 2021. “On Reality of Thinking: A Response to Chris Drain’s ‘Ideality and Cognitive Development’.” Social Epistemology Review and Reply Collective 10 (1): 22-28.

Articles in this dialogue:

❦ Drain, Chris. 2020. “Ideality and Cognitive Development: Further Comments on Azeri’s ‘The Match of Ideals’.” Social Epistemology Review and Reply Collective 9 (11): 15-27.

❦ Azeri, Siyaves. 2020. “The Match of ‘Ideals’: The Historical Necessity of the Interconnection Between Mathematics and Physical Sciences.” Social Epistemology 35 (1): 20-36.

I have no general qualms with Azeri’s reading of Vygotsky; his reply carefully elucidated many of the finer points of Vygotsky’s developmental story. However, I wish to dig a little deeper into one of the issues he raises, namely, the role of “joint activity” and language in the origin of human thinking. My hunch is that, despite Azeri’s claims to the contrary, Vygotsky’s conception of the role of language in development remains a “sourly dissonant” one (Jones 2019, 30), residing at a level of competition rather than collaboration. This poses a stark contrast to Tomasello, whose notion of joint intentionality provides a means of overcoming Vygotsky’s recourse to a foundational antagonism in development.

I previously argued that Tomasello’s account of the evolutionary transition from individual to joint and collective intentionality can buttress Vygotsky’s account of cognitive origins (Drain 2020). I still think this is the case, for reasons I’ll point out below. But for now, I’d like to consider the following passage from Azeri:

Language (speech) is the most significant psychological tool, which constitutes and determines consciousness with the use of regulatory [emphasis added] reactions. The formation of consciousness as a social relation and the constitution of social (joint) action that precedes constitution of consciousness as the individualized sociality is mediated by speech that facilitates the emergence of reversible stimuli/reactions as the basis of social (joint) action [emphasis added]…. It is through speech as a specific form of social stimuli that myself, the “I”, is formed and becomes comparable to others; through speech I get to know myself just in the same way that I get to know another…. [T]he constitution of individual consciousness and of social behaviour coincides—they are two facets or forms of existence of one and the same essence, that is, the labour (human activity) process in the widest sense of the term—and thus the formation of joint action is explained. In other words, even what Drain, following Tomasello, calls “joint intentionality” is necessarily preceded by human (productive) activity or labour as the ultimate socializing and humanizing factor (2021, 23).

Azeri brings up two issues here:

1. The regulatory aspects of language, which explain how socialized labor precedes the formation of individual consciousness.

2. That Tomasello’s “joint intentionality” is phylogenetically preceded by a stage where labor becomes the motivating force in human development.

Regarding (2), much hinges on what we consider “social (joint) action” to be. If it is an essentially collaborative activity, then it can’t precede joint intentionality since, as Tomasello points out, joint intentionality is itself the precondition for collaboration. A more generous, though still faulty, reading would be to say that Vygotsky is simply referring to joint intentionality when he speaks of labor, or in Azeri’s phrasing “social (joint) action.”[1] I say this is faulty because, aside from some casual mentions (1987, 356), Vygotsky never explicitly embraces a theory of intentionality.[2] Thus, even if labor consists in “joint activity,” it is not clear that the “jointedness” of such activity resembles Tomasello’s joint intentional action, where conspecifics knowingly collaborate in pursuit of a common goal. If anything, Vygotsky’s account seems to be one less about collaboration and more about subordination.

For Tomasello, human labor is necessarily collaborative in origin and emerges due to changes in the intentional make-up of proto-sapiens. Phylogenetically preceding collaborative activity is a stage where social relations are marked by competition and coercion. Indeed, subordination and domination are the social marks of the individual (not joint) intentional agent who, because of such limits in intentionality, cannot yet labor in the human sense. As we’ll see, Vygotsky never seems to get past this level of competitive, subordinative, intentionality.

This point about subordination turns us back to (1), the issue of regulation. It’s true that Vygotsky explains the origin of human consciousness through the regulative aspect of the sign. In Azeri’s words,

The formation of consciousness and its emancipation from immediate field of activity is possible through deployment of specific artificial devices, that is, psychological tools that are utilized by human beings in order to master their own behaviour just as technical devices are deployed toward mastery of objective processes (23).

However, there is a point not covered here by Azeri: Vygotsky’s “regulative” account hinges on the centralization of “directive” speech acts (commands or imperatives). With directives, one directs the activity of another, and in turn begins to “self-direct” (or self-regulate). It’s my claim that Vygotsky’s reliance on directives de facto keeps his account stuck at the level of individual intentionality.[3] Directive speech acts feature prominently in Tomasello’s developmental story as well. But Tomasello has the benefit of accounting for a functional differentiation in directive communication—i.e., in collaborative activity, the command gives way to both the request and informational assertion. Lacking such differentiation, Vygotsky’s account runs the risk of playing to a rather strident conception of the socius, one more Machiavellian than Marxist.[4]

Mind, Regulation, and Labor

Vygotsky returns to the point of “regulation” often as he develops his cultural-historical approach. For instance, in his unfinished manuscript “Concrete Human Psychology,” he affirms Pierre Janet’s idea that “The relation of psychological functions is genetically linked to real relations between people: regulation of the word, verbalized behavior = power-submission” (Vygotsky 1989). So too in The History of the Development of Higher Mental Functions—there Vygotsky highlights Janet as the progenitor of the “sociogenetic law of development.” In Vygotsky’s gloss, “the essence of this law is that in the process of development, the child begins to apply the same forms of behavior to himself that others initially applied to him” (1997b, 102). Such development, of course, relies on the sign: “initially the sign is always a means of social connection, a means of affecting others, and only later does it become a means of affecting oneself” (103). Again, Vygotsky explains that

According to Janet, the word is always a command because it is a basic means of controlling behavior. For this reason, if we want to explain genetically from what the volitional function of the word is derived, why the word subordinates motor reaction, what the origin of the power of the word over behavior is in both ontogenesis and phylogenesis, we unavoidably arrive at the real function of the command… [T]he relation of mental functions must be genetically attributed to real relations between people. Regulating another’s behavior by means of the word leads gradually to the development of verbalized behavior of the individual himself (104).

Vygotsky, then, takes it as “self-evident” that “the word was initially a command for others” before becoming “a complex story consisting of imitation, changes in function, etc., and was only gradually separated from action” (103).

Vygotsky’s (Janet-inflected) idea that the origin of language is found in the social imperative is traceable at least in part to Ludwig Noiré, who argued that humans are not only the “tool-making animal” and “the gregarious animal”, but also the “cooperative animal” (Noiré 1917, 138). By this account, the earliest words are verbal, rather than nominative, and thus the “earliest meaning… referred to human action” in a cooperative context (139).[5] Vygotsky’s suggestion that social subordination is the primary function of speech unnecessarily distorts this cooperative model into something resembling the master/slave dialectic (Jones 2019, 29-30).

For Vygotsky, the apprehension and utilization of the sign marks a transformational leap between unmediated affective communication as exhibited in apes and the mediated linguistic communication of early humans. Such a phylogenetic transition is illustrated in the movement from a stage of (a) “receptive” [6] self-monitoring (affectively based on situationally bound stimuli) to a stage of (b) cognitive, conceptual, self-monitoring (where the individual can take on the perspective of another in relation to her own cognitive states). As such, the historical division of labor is mirrored in development insofar as the “separation of functions among people is the basic mechanism of modification and transformation of function of the individual himself” (Vygotsky 1997b, 104). Inter-personal verbal regulation thus spurs the development of intra-personal verbal regulation; the social is moved “inside” the individual subject, and she can regulate her own behavior internally in the same manner as the supervisor externally regulated the subordinate. As Vygotsky makes clear, the role of the “director” of activity and “fulfiller” of activity are united in the mature cognitive subject: “An important step in the evolution of work is the following: what the supervisor does and what the underling does is united in one person. This… is the basic mechanism of voluntary attention and work” (1997b, 104).

According to Azeri, “owing to the sociality of psychological and conceptual tools the internalization of which amounts to the formation of individual consciousness, ‘pure’ thinking also emerges as truly social… pure thinking is ‘joint’ thinking” (2021, 25). My concern, however, is whether Vygotsky’s embrace of Janet’s “command account,” insofar as it is predicated on an antagonistic picture of originary social relations, undermines a more cohesive thesis regarding cooperativity as the precondition for cognitive development. It just may be that Vygotsky does not have enough conceptual tools, or the right ones, to “move beyond” individual intentionality. Tomasello’s utilization of speech act theory, along with his focus on intentionality, offers a welcome corrective in this regard.

Joint Intentionality and Communication

According to Tomasello, joint intentionality augmented the communicative potential of early humans. At a previous stage of development, pre-sapiens “individual intentional” agents communicatively engaged only in directive speech acts—using ritualized signals responding to desires and beliefs, though with situational (imagistic and schematic) rather than propositional contents. Such communication relies on the understanding of certain perceptions and goal states, and utilizes a practical reasoning apparatus. But all this is done on the basis of an individually competitive rather than a cooperatively intentional foundation (Tomasello 2008, 105). In other words, our proto-human forebears engaged in social and technical causal reasoning, but only communicated with respect to conspecific behavioral regulation. This changes with the introduction of joint intentionality, which itself inaugurates a communicative shift towards cooperation. As such, joint intentionality allows for a differentiation in the directive speech act: insofar as these early hominins engaged in collaborative activity, they engendered a new communicative motive, that of sharing relevant information with respect to the other’s role in the activity. Thus, Tomasello hypothesizes that at some point in the evolution of joint intentional cognition, purely directive communication was expanded to allow for both informative and requestive communicative acts.

The advent of an informational motive alongside the directive motive had three major consequences for the evolution of human thinking:

1. It brought about truth conditions—if a collaborator wants to be seen as a cooperative partner (ever important in the Homo heidelbergensis foraging ecology), she must commit herself to relaying information honestly. This occurred first and foremost during the collaborative act itself, but gradually spread to the social interface of the band apart from such immediate activity.

2. It created what Tomasello calls a new “relevance inference,” where the recipient of a communicative act may infer that a certain act is intended for her with a purpose relevant to her specific goals (Tomasello 2014, 52). Great apes and those varieties of hominins prior to Homo heidelbergensis can make no such inferences.

3. Most important for us here, Tomasello notes that one major result of collaborative communication is the emergence of the distinction between communicative force and content. With the purely directive gesture, there is little room between the force of the ostensive act itself (pointing to an object I want you to hand me, e.g.) and its content (the object itself). Thus, to the individually-intentional mind, the status of the directive is ambiguous with respect to whether it is a command or request. Because of joint intentionality, certain ostensive gestures can now be differentiated. With the help of intonational specification, the same ostensive gesture (pointing to a stick) can be registered as either a directive (as either a request or command for the stick) or an assertive, where information about the location or existence of a thing is simply being communicated for the benefit of the collaborative partner.

For Tomasello, these three effects entail that “the situational (propositional) content of the communicative act was starting to be conceptualized as independent of the particular intentional states of the communicator” (loc. cit.). The content tied to the situation and goals of the individually intentional agent is now being ‘liberated,’ so to speak, through joint intentionality.

Directives and Cooperation

On their own, commands (and requests and entreaties, etc.) are “directive” speech acts in that they “are attempts by the speaker to get the hearer to do something” (Searle 1979, 13). For Vygotsky to place directives as the first words does satisfy the basic conditions of an “activity-model” of language origins, which stipulates that language fundamentally functions to coordinate social activity and that word meaning is grounded on socially coordinated activity rather than some mechanism of reference (Winograd and Flores 1986, 17; see also Hutchins and Johnson 2009). Such an “activity model” certainly seems sympathetic with the general Marxian thesis that labor begets language (Engels 1946). However, the “directive” focus of Vygotsky’s account omits dealing with any cooperative social relations in the sense that cooperation requires more than just the mutual affirmation of individual intentions. Directive utterances may or may not be part of a joint intentional enterprise, e.g., compare “Move that” with “Pass that to me (since we are building this together).” The first expresses a desire, and the respondent can choose to respond accordingly (obey or suffer the consequences). In a technical sense, this is part of a “social” interaction—but not a joint one. Such social activity, by the Vygotskian account, is thus reducible to the aggregate sum of individual intentions; there is not a “Wir” involved, merely a dyad of “Ichs.” By contrast, the latter expression indicates something more than just an aggregate of individual intentions (Searle 1995; Swindler 1996). In the request, then, is the beginning of a qualitatively novel “we” intentionality (and the same can be said for informational assertions in a collaborative context).

Any sociogenetic explanation of the origin of mind and language should explicitly deal with joint and collective intentionality rather than leave it ambiguous whether individual intentional states are all that are in play in communication. Vygotsky’s Engels-inspired account certainly seems as though it should qualify here. Unfortunately, without a coherent theory of intentionality, there is little to differentiate the social activity and mentality of apes from that of humans. As I pointed out in Drain (2020), the individually intentional hominid can still entertain abstract representations about the social world—only theirs is a world determined by competition rather than cooperation. In other words, “social (joint) action” is not sufficient for human cognitive genesis since non-humans do it too.

The problems of Vygotsky’s “directive origin” of communication—i.e., that it assumes an antagonistic social relationality, that the intentional status of creatures engaging in “joint” activity still might be one of an aggregate of competing individual intentions—can be dissolved if we follow Tomasello in taking human communication as essentially cooperative in its origin, regardless of whether it is later used to achieve particularly selfish ends:

In the beginning skills of cooperative communication were used only in activities that were collaborative all the way down (and so structured by joint goals and attention, which provided the necessary common conceptual ground). Only later was cooperative communication co-opted for use outside of collaborative activities and for noncooperative purposes … (2008, 170).

Communicative forms that preceded human language may have been marked by dominance and coercion, but human language only became as such by evolving as a cooperative activity. By Tomasello’s score, noncollaborative communication can return, but only as parasitic on primarily collaborative forms.

Final Remarks

For Tomasello, the communication of great apes and proto-sapiens hominins is restricted to directive speech acts. By the time of Homo heidelbergensis, joint intentional collaboration expanded the directive to include both requestive and informational components. We have no such nuance in Vygotsky. By his account, the command admits of no further functional differentiation and there is no countenancing that a collaborative component can be derived from it. Tomasello is clear, on the contrary, that early humans were quite adept at circumstantially differentiating their communicative acts. Cognitive development, then, is primordially driven not by the internalization of any labor antagonism but by the necessity of collaborative activity and the resultant perspectival simulations that undergird the communicative repertoire of such intentional agents.

From all this we could suppose that Vygotsky is simply mistaken—but with the caveat that Vygotsky would have been right, if only he had assigned the purely directive stage of communication not to early humans but to great apes and non-human hominins. To be generous, we may conclude that Vygotsky is in the right ballpark regarding many of his speculations. Much of what Vygotsky took as fact about early human communication does accurately correspond to Tomasello’s individual intentional agents, i.e., those apes and proto-heidelbergensis hominins that rely on purely directive forms of communication. Some aspects of Vygotsky’s “directive” account, then, could apply to any number of hominins prior to Homo heidelbergensis. Thus, while it may be too much to hold the cause of human-specific thought to be the internalization of the dominant/subordinative dialectic, Vygotsky’s insistence that the earliest communication consisted in directives is confirmed in Tomasello’s account, if only regarding nonhumans.

Author Information:

Chris Drain, Villanova University


[1] It’s not clear (as I hope to show) that labor is a joint activity by Vygotsky’s account. Such is not the case for A.N. Leontiev, who is close to Tomasello in claiming outright that labor processes are developmentally contingent upon “conditions of joint, collective activity” (Leontiev 2009, 185; see also Drain 2018).

[2] Vygotsky falls short of ever affirming intentionality as a basic property of mind. In “The Problem of Consciousness” (1997a, 129-130), there is mention that “the relation between function and phenomena (the problem of intentionality)” is one of the questions that arises when considering “consciousness as a system of functions” (Carl Stumpf, mentor to Edmund Husserl, is mentioned in parentheses here). But none of these are Vygotsky’s actual words. The document serves as notes for a series of Vygotsky’s private talks with his research group, with insertions added by A. Zaporoshets. So, while intentionality is recognized as a problem by Vygotsky and his interlocutors, it is never explicitly thematized nor is there any attempt to explicate its social nature. This is especially evident when, borrowing from Kurt Lewin, Vygotsky treats the introduction of symbolic operations into the “field” of activity as something specifically human and integral in the formation of “any intentions” whatsoever, i.e., “in creating free action independent of the direct situation” (Vygotsky 1999, 36). While intentions may be a species of intentionality (Searle 1983), Vygotsky is focused neither on the fact of intentionality itself, nor on its social nature, but rather on the emergence of a “planning function” (Jones 2017) that frees the agent from situational determination.

[3] Jones (2009, 2019, 2020) has thoroughly explicated the problems that the “command theory” of development brings about in Vygotsky’s general psychology. My goal here is not to repeat his (extremely helpful) analyses—though see Jones (2009, 173) for a quick summary. Instead, I simply want to defend the proposition that Vygotsky does not utilize a theory of joint intentionality but that he should.

[4] One could just as easily replace “Machiavellian” with “Hobbesian,” though the former may also refer to the “Machiavellian intelligence hypothesis” of Humphrey (1976)—which Tomasello is decidedly against (Moll & Tomasello 2007). A future paper will have to deal with whether Tomasello is correct in framing his own project as “Vygotskian” since, by my reading, Vygotsky fails to deal with collaboration.

[5] Brandist (2007) claims that Vygotsky read Noiré, though I find no references in the Plenum Edition of Vygotsky’s collected works. Leontiev, on the other hand, certainly engaged with Noiré, as Keiler (2008 & 2010) effectively demonstrates.

[6] In distinguishing ape from human consciousness, Vygotsky uses James’ (1892) distinctions between the human facility with concepts, which are abstract (and socially mediated, Vygotsky would add), and the non-human facility with “recepts” (Romanes 1888) or “constructs” (Morgan 1891), which are composite impressions “generally received” (rather than conceived) from sensible percepts (Vygotsky 1997a, 132).

