Is Language a Commodity Google Can Sell?

When you open your Internet browser, Google is there to greet you with its bright, happy-colored, recently redesigned logo, and this…


…in which, seemingly by magic, you can type (or speak) anything—anything at all—and Google will have an answer for you in the form of links to websites. Lots of websites.

Increasingly, that answer is even TAILORED FOR you. But not just for you. (More on that in a bit.)

That’s right, Google is the oracle of the World Wide Web, serving up three quarters of the world’s Internet search queries, responding to something like 3.5 billion searches every single day for some 1.17 billion people. And it’s no wonder, really; Google is a good idea. The Web is huge—4.83 billion pages huge. In practical terms, Google reduces the entire Web to the size of a single screen.

I mean, who clicks through more than one page of search results, anyway?


But here’s the thing. Google isn’t really a magic fortune-telling machine from the Twilight Zone (even though that would be SO perfect). It’s a business. And a mighty fine one, too. Every hour, Google earns several million dollars. And to a great degree, it does so by selling language. Words. Millions of words.

first, a little background

According to Frederic Kaplan in his 2014 paper, “Linguistic Capitalism and Algorithmic Mediation,” Google’s success is the story of two algorithms responsible for doing two things:

  1. associating web pages with searches based on keywords, and
  2. assigning commercial value ($$$) to those keywords.

Kaplan writes that in 1998, search engines could already search the Web for sites based on keywords, but the rankings were easily hackable, so lots of junk climbed to the top of search results.

So Google came in and added another element, calculating not only the presence of keywords in a page, but also taking into account how many other pages on the Web cited it. The more a Web page was cited by others, the more valuable it became to certain keyword searches. Kaplan explains:

Each citation behaved like a vote whose weight was proportional to the number of citations of the citing document. With this voting principle, classification and search results kept improving as the World Wide Web continued to extend: the more documents, the finer the ranking.
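To make that voting idea concrete, here is a minimal sketch in Python. It takes Kaplan's description literally: a page's score is the sum of the citation counts of the pages that cite it. The tiny `links` web and the function names are made up for illustration, and this is a single-pass toy, not Google's actual ranking algorithm.

```python
from collections import defaultdict

# Hypothetical toy web: each page maps to the pages it links to (cites).
links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "e": ["d", "a"],
}

def citation_counts(links):
    """Count how many pages cite each page."""
    counts = defaultdict(int)
    for source, targets in links.items():
        counts[source] += 0  # make sure every page appears, even if uncited
        for target in targets:
            counts[target] += 1
    return dict(counts)

def weighted_votes(links):
    """Score each page by summing the citation counts of the pages citing it."""
    counts = citation_counts(links)
    scores = {page: 0 for page in counts}
    for source, targets in links.items():
        for target in targets:
            # A vote from a well-cited page weighs more than one from an uncited page.
            scores[target] += counts[source]
    return scores

scores = weighted_votes(links)
```

In this toy web, pages "c" and "d" are each cited twice, but "d" ends up with the higher score because one of its votes comes from "c", which is itself well cited. Raw keyword or citation counts would treat the two pages as equal.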

The result?


After a while, Kaplan writes, Google began accumulating what social theorist Pierre Bourdieu calls cultural capital, specifically linguistic capital.

let’s define linguistic capital

Capital is like an asset that can be leveraged for benefit. Usually when we talk of capital we talk about money or wealth that a business owner accumulates over time and then invests in some way in order to ultimately accumulate more capital, or assets, that will benefit the company and spur greater profit. (That’s one nutshell; there are others.)

Cultural capital works in a similar way, except within our social relations. Language is one form of cultural capital that Bourdieu calls embodied capital, that is, something that you have and that you can implement or invest as an asset for some benefit or profit, in this case, cultural and social profit.

There are a ton of ways this can play out in everyday life. Take, for instance, your accent. You’re going to be able to leverage your natural Southern Arkansas accent, an asset of linguistic capital, to much greater social benefit among people of your class within the Southern U.S. than you might if you went up to Vermont, where the same accent has less cultural exchange value (maybe some associate it with being poorer or uneducated). And when you do go to Vermont, say for a job interview, you may downplay that accent or speak in a way that is more likely to raise your chances of getting the job. Furthermore, imagine that you DO get that job; you’ve just converted cultural capital (language mastery) into economic capital (a new job with pay).

Kinda cheesy, but you get the point.

okay, back to Google

So how does your Southern Arkansas accent relate to Google?

When Google realized that it had acquired a certain accumulation of linguistic assets—for instance, knowledge of all the keywords its growing user base was using to search—it decided to cash in. Google transformed linguistic capital into economic capital by creating an algorithmic auction model for selling keywords.

When you search a thing, Google provides a few “sponsored” links at the top. These ad spots are awarded through a complex and automated process in which advertisers select a keyword and bid how much money they’d pay Google if a user clicked their ad when they used that keyword. Google computes the quality of the ad and the site it links to and [ABRA CADABRAH] before awarding the spots to the winners.

This whole process happens EVERY TIME A USER ENTERS A SEARCH QUERY. Millions of times every minute.
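For the curious, here is a toy sketch of what one round of such an auction could look like. The real formula isn't spelled out here, so the rule below (rank each ad by bid times quality) is purely an assumption for illustration, and the `Bid` fields, the 0-to-1 quality scale, and the advertiser names are all hypothetical. It also skips the question of what winners actually pay.

```python
from dataclasses import dataclass

@dataclass
class Bid:
    advertiser: str
    max_cpc: float  # most the advertiser would pay per click (hypothetical field)
    quality: float  # quality estimate for ad and site, 0 to 1 (hypothetical scale)

def award_slots(bids, slots):
    """Rank bids by bid x quality (an assumed rule) and return the top advertisers."""
    ranked = sorted(bids, key=lambda b: b.max_cpc * b.quality, reverse=True)
    return [b.advertiser for b in ranked[:slots]]

# Three made-up advertisers bidding on the keyword "flowers".
bids = [
    Bid("florist_a", max_cpc=2.00, quality=0.4),
    Bid("florist_b", max_cpc=1.50, quality=0.8),
    Bid("florist_c", max_cpc=1.00, quality=0.5),
]

winners = award_slots(bids, slots=2)
```

Under this assumed rule, the highest bidder doesn't automatically get the top spot: "florist_b" bids less than "florist_a" but outranks it on the strength of its quality estimate.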


Behold the global, real-time linguistic market, where, as Kaplan writes, the value of words like “snowboarding” and “bikini” varies seasonally, where the value of the word “gold” rises and falls with perceived financial crises, where words like “flowers,” “hotels,” “vacation,” and “love” are hot commodities, where the names of famous people like “Picasso” and “Freud” are bought and sold, and where “anything that can be named can be associated with a bid.”

from expression to circulation

Kaplan writes that this sort of linguistic capitalism is an example of an “economy of expression.”

This is where Google’s autocomplete function comes in. You know, the little drop-down thingy that shows up when you start typing.


This function is key to Google’s language economy. Here’s what’s going on, according to Kaplan:

When Google’s autocompletion service transforms on the fly a misspelled word, it does more than offer a service. It transforms linguistic material without value (not much bidding on misspelled words) into a potentially profitable economic resource. When Google automatically extends a sentence you have started to type, it does more than save you some time, it transforms your expression into one that is statistically more regular based on the linguistic data it daily gathers.

By nudging and shaping users’ search language based not necessarily on precision or expression but on profitability, Google promotes a homogenized language, a Google-ese lexicon, not only for users, but maybe even for content creators who want their sites to get noticed by the most people (ever heard of Search Engine Optimization/SEO?).

This is the concern of people like political scientist Jodi Dean, who writes in her essay “Communicative Capitalism: Circulation and the Foreclosure of Politics” that within the current model of increasingly abundant, commodified, networked communications that we see on the Web and in broadcast media, all messages (web pages, social media posts, memes, etc.) have gone from actions eliciting some response to mere contributions in an ever-expanding circulation of content. “Differently put,” she writes, “the exchange value of messages [i.e., $$$] overtakes their use value.” In such a system, it doesn’t matter who sends the message or who receives it, only whether it’s been accepted or rejected, and that is determined by how well it circulates with what’s already there: “The popularity, the penetration and duration of a contribution, marks its acceptance or success.”

In a way, this is what’s happening in Google’s language economy. What matters to Google, ultimately, is not necessarily how useful words COULD BE to its users, but how economically effective they HAVE BEEN when it looks to its database (and its moneybags).

But hold on—

is language Google’s to sell?

Fair question: usually whatever you’re selling has to be something you’ve GOT. Is Google selling something it doesn’t actually “own”? Doesn’t language belong to those who use it?

Literary theorists and linguists have thought about this a lot, particularly in terms of how it relates to the “author,” that is, the person who uses language to compose messages, whether those messages are tweets or poems or novels or memes. Two people in particular, Mikhail Bakhtin and Roland Barthes, suggest that language in fact DOES NOT belong to us.

Bakhtin writes in “The Problem of Speech Genres” that when we select words in constructing an utterance or writing a message, we don’t take them from language in its neutral, “dictionary form,” but rather from other people’s utterances. Even though we ascribe words to individuals, “the words of a language belong to nobody.” Any utterance is a “link in the chain of speech” of a particular discourse, Bakhtin argues; “Each utterance refutes, affirms, supplements, and relies upon the others, presupposes them to be known, and somehow takes them into account.”

Barthes agrees, writing in his famous essay “The Death of the Author” that every text is “a tissue of quotations drawn from the innumerable centres of culture,” that, in terms of signs and language, “the writer can only imitate a gesture that is always anterior, never original.” In thinking about language, Barthes even says we should substitute language itself for the person using it: “It is language which speaks, not the author.” For Barthes, language doesn’t need an author—an OWNER, if you will—it just needs some kind of instigator (he calls this the “scriptor”), for in the end, “a text’s unity lies not in its origin but in its destination.”

Roland Barthes teaching (and smoking).

But whereas Barthes saw this destination as the READER in what amounts to a sender-receiver relationship, Dean argues that within our current system, this final destination is not some reader, but the great big data stream in the sky:

A contribution need not be understood; it need only be repeated, reproduced, forwarded. … And, just as the producer…drops out of the picture in commodity exchange, so does the sender (or author) become immaterial to the contribution. The circulation of logos, branded media identities, rumors, catch phrases, even positions and arguments exemplifies this point.

invisible markets

If anything, Google’s model shows us that appearances can be deceiving. What some experience as a kind of service, others experience as a struggle for market share. Particularly when it comes to ad-funded services or products, this process can be largely invisible to us as consumers. We tend to take in a lot of advertising as useful content in its own right, engaging material to read or watch or listen to (and still other times, it’s pretty obnoxious—but I seriously watch Old Spice commercials for the lolz).

Of course, somewhere along the way, money’s got to be made. In modern capitalism, if there’s a will, there’s a market; and things that in the past might never have been considered as sell-able, buy-able objects, become just that. Sometimes that means words become commodities. Other times, it means you and I do.

The most intriguing effect of all this could be on textual expression itself. Kaplan writes that even if Google’s autocompletion isn’t really biased toward economically valuable expressions, it still tends to “transform natural language into more regular, economically exploitable linguistic subsets.” And as the raw material for these Google lexicons increasingly incorporates content generated or modified by bot editors and machine translators for the very purpose of exploiting that system, Kaplan wonders if our “natural” languages could evolve to seamlessly integrate the biases of algorithms and economies. “Should we expect something like a pidgin or a creole to emerge,” he says, “whose syntax and vocabulary would be influenced by the linguistic capacity of machines and economic value of words?”

And, in the end, is that so different from how language has always evolved?

What do you think? Let me know in the comments or on the subreddit.


