AI Will Kill Literature, and AI Will Resurrect It
How we can expect large language models like GPT-3 to influence fiction over the next ten years.
The literary novel, as far as genres go, is not old: novels have existed in our culture for about three hundred years. If you had a time machine, you could spin the wheel and, upon landing anywhen in that span, utter the sentence, "The novel is dead," and find as much agreement as dissension. As soon as there was such an institution as the novel, it became fashionable to speculate that it would not survive, not in any form with literary merit, into the next generation. The great stories have all been told, they said. The quality of readers is not what it used to be, they said. Our culture simply does not value a well-constructed sentence the way it once did, they said. Radio threatened to kill the novel, but did not. Television threatened to kill it, but that didn't happen either. Chain bookstores nearly did put the genre in an early grave through their abuse of the consignment model—we'll talk about this later—but then the Internet happened. Thirty years have passed. We still don't know how new novels should be discovered, how authors will make sustainable incomes, or what publishing itself ought to look like... but the novel, the thing itself, shows no signs of imminent demise.
In 2023, the novel faces a new kind of threat: artificial intelligence ("AI"). Radio and television altered distribution, but text and our reasons for reading it survived. Generative models, on the other hand, have reached such a level of capability that it would surprise no one to see the bottom tiers of content creation—advertising copy, stock footage, and clickbait journalism—fully conquered by machines. Large language models (LLMs) have achieved a level of fluency in human language we've never seen before. They understand English well enough to perform many customer service tasks. They can generate five-paragraph high school essays as well as the stereotypical disengaged high schooler does, if not somewhat better. They also imitate creativity well enough to appear to possess it. Ask Chat-GPT, the most advanced publicly available LLM, for a sonnet about a dog named Stan, and you'll get one. Ask it to write a 300-word email requesting that a colleague stop microwaving fish in the office kitchen, and it will do so. These models understand sentiment, mood, and register: ask Chat-GPT to make language more or less informal, to alter its tone, or to use the slang of a 1950s jazz musician, and it will.
Can AIs write novels? They already have. Commercially viable ones? They're getting close to it. AI, for better or worse, is going to change how we write; LLMs will shake the publishing world like nothing ever has. Will AIs write good novels? The short answer is: No, there is no evidence to suggest they ever will. The long answer is the rest of this essay.
1: essayist's credentials
The claims I am about to make—about artificial intelligence, linguistics, and literature—are bold and would require hundreds of pages to fully justify. Since I don't want to write those hundreds of pages, and you don't want to read them, you'll have to let me argue from, if not authority, a position of some qualification. This requires me to give brief insight into my background.
I've been writing for a long time. In 2010, I started a blog, focused on the technology industry—topics included programming languages, organizational practices, and development methodologies—that reached a daily view count of about 8,000 (some days more, some days less) with several essays taking the #1 spot on Hacker News and Reddit (/r/programming). I quit that kind of writing for many reasons, but two merit mention. One: Silicon Valley people are, for lack of a better way to put it, precious about their reputations. My revelations of unethical and illegal business practices in the technology industry put me, literally, in physical danger. Two: since then, my work has become unnecessary. In 2013, my exposures of odious practices in a then-beloved sector of the economy were revelatory. Ten years later, tech chicanery surprises no one, and the relevant investigative work is being done with far more platform, access, and protection. The world no longer needs me to do that job. And thank God.
In 2023, I will release a fantasy novel, Farisa's Crossing. I'll be self-publishing it; traditional publishing is precluded by the project's word count (over 350,000) alone. It's too early to speculate on the novel's reception, but I've made efforts to write it to the standard of a literary novel; thus, I am well-equipped to give insight into why artificial intelligence is unlikely to achieve the highest levels of literary competency any time soon.
Lastly, I am a computer programmer with two decades of experience, including work on systems that are considered to be artificial intelligence (AI). I've implemented neural nets from scratch in C. I've written, modified, and deployed programs using genetic algorithms, random forests, and real-time signal processing. I've designed random number generators, and I've built game players that can, with no prior knowledge of the game's strategies, learn how to play as well as an intermediate human player (reinforcement learning). I know, roughly speaking, what computers today can and cannot do.
AIs will soon write bestsellers, but I posit that the artistic novel—the so-called literary novel—will not be conquered for a long time, if ever. This is an audacious forecast; it borders on arrogance. I am claiming, in essence, that a distinction in literature often viewed as the inflated self-opinion of a few thousand reclusive, deadline-averse, finicky writers—the ones who assert a superiority or importance in our work that is not always reflected in sales figures—is real. Then, I further assert that a nascent technology will prove this to be the case. Such claims deserve skepticism. They require justification; for that, read on.
2: what does death, in the case of the novel, mean?
In 1967, literary critic Roland Barthes announced the "death of the author". Authorial intent, he argued, had become irrelevant. Postmodern discourse had created such a plethora of interpretations—feminist critique, Freudian analysis, Marxist readings—so different from what had likely been on the actual author's mind that it became sensible to conclude all the gods existed equally—thus, none at all. Was Shakespeare an early communist, an ardent feminist, or a closet right-winger? This is divergent speculation, and it doesn't really matter. The text is the text. Writing, Barthes held, becomes an authorless artifact as soon as it is published.
Barthes was probably wrong about all this—an author's reputation, background, and inferred intentions seem to matter more than ever, and more than they should, hence the incessant debates about "who can write" characters of diversity. If Barthes were correct, however, readers would just as happily buy and read novels written by machines. In the 1960s, that wasn't a serious prospect, as no machine had the capacity to write even the most formulaic work to a commercial standard. In the 2020s, this will be a question people actually ask themselves: Is it worth reading books written by robots?
Today's market disagrees with Barthes, and staunchly so. A small number of writers enjoy such recognition that their names occupy half the space on a book cover. If an author's identity didn't matter to readers, that wouldn't be the case. So-called "author brand" has become more important than it was in the 1960s, not less. Self-promotion, now mandatory for traditionally published authors as much as for self-publishers, often takes up more of a writer's time than the actual writing. The author isn't dead; it might be worse than that.
The world is rife with economic forces that conspire against excellence, not only in literature. There is no expected profit in writing a literary novel; if opportunity costs are considered, it is an expense to do so. So, then, how does it survive at all? A case could be made that the novel thrives in opposition. Shakespeare's proto-novelistic plays—stageplays in English, not Latin—were considered low art in his time. I suspect the form's unsavory reputation, in that time, is part of why his plays were so good—a despised art form has no rules, so he could do whatever he wanted. He used iambic pentameter when it worked and fluently abandoned it when it wouldn't. Novels in general were considered vulgar until the late 1800s; the classics alone were considered "serious literature". The novel, as an institution, seems to have an irreverent spirit that draws energy (paradoxically, because individual writers clearly do not) from rejection. This seems to remain true today—I suspect the best novels, the ones we'll remember in fifty years, are not those that are universally loved, but those most polarizing.
I doubt the novel is truly dead. When something is, we stop asking the question. We move on. It is not impossible, but it would be unusual, for something to die while the world's most articulate people care so much about it.
None of this, however, means we live in the best possible world for those who write and read. We don't. Traditional publishing has become ineffectual—it has promoted so many bad books, and ignored so many good ones, that it has lost its credibility as a curator of tastes for generations—and all evidence suggests we are returning to the historical norm of self-publishing. This is a mix of good and bad. For example, it's desirable if authors face one less barrier to entry. Unfortunately, it has become harder for quality material to find an audience—advertisements and social media garbage have congested the channels—and it may already be the case that the expense and savvy required to self-publish effectively exceed the average person's reach. This change is also a mixed bag for readers and literary culture: it is good to hear as many voices as we can, but there's something to be said for a world in which two people who've never met still stand a chance of having read a few of the same books. As readers move into separate long tails, this becomes more rare.
Sure as spring, there are and will continue to be novelists, mostly working in obscurity, who are every bit as talented, dedicated, and skilled as the old masters. How are readers going to find them, though? In the long term, we can't choose not to solve this problem. If readers and the best novelists cannot find each other, they'll all go off and do something else.
3: what is the literary novel?
Usually, when people debate the life or death of "the novel", they are discussing the literary kind. Billions of books are sold every year and no one expects that to change. The written word has too many advantages—it is cheap to produce, a reader can dynamically adjust her speed of intake, and it is exquisite when used with skill—to go away, but it's only the third of these advantages that we care about here. So, one has the right to ask: What's the literary novel? How do I know if I am reading, or writing, one? Does it really matter?
The term literary fiction is overloaded with definitions. There is an antiquated notion, reflective of upper-middle-class tastes during the twentieth century, that one genre (contemporary realism) in particular should be held as superior to all the others ("genre fiction"). I acknowledge this prejudice's existence for one reason: to disavow it. The good news is that this attitude seems to be dying out. I have no interest in trying to impose a hierarchy on different genres; instead, I'll make a more important qualitative distinction: the one between commercial and artistic fiction. I've chosen to name the latter "artistic" (as opposed to "literary") to avoid conflation with the expectations (of contemporary realism; of understated plot; of direct, almost on-the-nose, social relevance) of the genre often called "literary", which, while it includes many of human literature's best works, does not have a monopoly thereon. It is not, in truth, unusual for science fiction, fantasy, and mystery novels to be written to the same high artistic ("literary") standard as the novels favored by New Yorker readers.
A commercial novelist wants her work to be good enough to sell copies; once she meets that bar, she moves on to start another project. There's nothing wrong with this; I want to make that clear. It is a perfectly valid way to write books that people will buy and love, and it makes economic sense. There is more expected value in publishing ten good-enough books—more chances to succeed, more time to build a name, more copies to sell—than in putting ten or twenty times the effort into one book.
Artistic novelists are a different breed. It is said that we write to be remembered after we die. It isn't quite that. I don't care if I am actually remembered after I die; it's out of my control, and I may never know. I want to write work that deserves to be remembered after I'm gone. That's all I can do; I have no say over whether (or how) it is. Toward such an end, a few of us are so obsessive about language (to the point of being ill-adjusted and unfit for most lines of work) that we'll turn a sentence seven or eight times before we feel we've got it right. If you're not this sort of person, you don't have to be. If your goal is to tell stories readers love, that temperament is optional. Plenty of authors do the sane thing: two or three drafts, with an editor called in to fix any lingering errors before the product is sold. My suspicion is that we don't really get to decide which kind of author we will be; there seems to be a temperament suited to commercial writing, and one suited to artistic writing. Shaming authors for being one kind and not the other, then, is pointless. In truth, I have nothing against the adequately-written commercial novel—it just isn't my concern here.
Commercial novelists exist because the written word is inexpensive. You don't need money, power, or a development studio to write; you just need talent and time. These novelists will continue to develop the stories Hollywood tells because, even if their writing is sometimes pedestrian, they're still more capable storytellers, by far, than almost all of the entertainment industry's permanent denizens. In truth, they're often better storytellers than we literary novelists are (although we may be better writers) because their faster pace gives them more time to tell stories, and thus more feedback. We shouldn't write them off; we should learn from them.
As for artistic novelists, we... also exist, in part, because the written word is inexpensive. The talents we have are negatively correlated with the social wherewithal to acquire wealth or influence through the usual channels. We've seen janitors and teachers become literary novelists, but never corporate executives, and there's a reason for that. More importantly, though: the written word, if great effort is applied, can do things nothing else can. When we debate the novel's viability, we're most interested in the works that stretch language's capabilities. Those sorts of books often take years of dedicated effort to produce because they demand more research, involve more experimentation, and require significantly more rounds of revision. They are not economical.
4: the competitive frame
On November 30, 2022, OpenAI released Chat-GPT, an interactive large language model (LLM), the first of its kind released to the public. It gained a million users within five days—an almost unprecedented rate of uptake. The fascination is easy to understand: in some ways, these programs are one of the closest things the world has to intelligent artifacts (AIs). Some have even (as we'll discuss) believed them to be sentient. Are they? Not at all, but they fake it very well. These devices converse at a level of fluency and sophistication that mainstream computer science, a decade ago, expected to see machines achieve around 2040.
It is not new to have AIs, in some sense of this word, writing. I often spot blog posts and news articles that were produced by (less sophisticated) algorithms. Chat-GPT is, simply put, a lot better. It can answer questions posed in English with fluent responses. It can write code to solve basic software problems. It can correct spelling and grammar. By this time next year, large language models will be writing corporate press releases. Clickbait journalism will be mostly—possibly entirely—automated. There is already, barely a month after Chat-GPT's release, a semisecret industry of freelancers using the tool to automate the bulk of their work. GPT's writing is not beautiful, but it is adequate for most corporate purposes and, grammatically, it is almost flawless.
At the scale of a 300-word office email, Chat-GPT's outputs are indistinguishable from human writing. It can adjust its output according to suggestions like "soften the tone" or "use informal English" or "use the passive voice." Ask it to write a sonnet about a spherical cow, and you'll usually get something that could pass for a high schooler's effort. If you ask it to compose a short story, it will generate one that is bland and formulaic, but with enough stochasticity to suggest some human creativity was involved. It understands prompts like, "Rewrite this story in the voice of a child". It has internalized enough human language to know that "okay" has no place in 16th-century historical fiction, that 17 years of age is old for a dog but young for a person, and that 18 is considered a lucky number in Judaism. From a computer science perspective, it's a remarkable achievement for the program to have learned all this, because none of these goals were explicitly sought. Language models ingest large corpuses of text and generate statistical distributions—in this way, GPT is not fundamentally different from the word predictor in one's phone, only bigger—and so it is genuinely surprising that increasing their size seems to produce emergent phenomena that look like real human knowledge and reasoning. We're not entirely sure how they're able to do this. Does that mean that they've developed all the powers of human cognition? Well, no.
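For readers who want to see the "word predictor, only bigger" idea in concrete form, here is a toy sketch in Python. It is a bigram counter, nothing like GPT's actual architecture, and the corpus is invented for the example; still, it shows the core task—predict the next word from observed statistics—at its smallest scale.

```python
import random
from collections import defaultdict, Counter

# Toy bigram "language model": count which word follows which in a corpus,
# then sample the next word from that empirical distribution.
corpus = "the dog chased the cat and the cat chased the mouse".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(word):
    counts = follows[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate a short continuation from a seed word.
word, output = "the", ["the"]
for _ in range(8):
    if word not in follows:          # dead end: no observed continuation
        break
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Scale the corpus up to much of the Internet and the parameter count into the billions and you get something in GPT's family; the surprise, as noted above, is how much apparent reasoning falls out of so blunt a recipe.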
For one thing, Chat-GPT doesn't handle subtlety or nuance well. Ask it to make a passage informal and it will often overshoot, making the language ridiculously informal. It also seems to lack second-order knowledge—it cannot differentiate between what is certainly true versus what is probably true, nor between truth and belief. Only when explicit programming (content filtering) intervenes does it seem to know that it doesn't know something. Otherwise, it will make things up. This is one among hundreds of reasons why it falls short of being a replacement for true intelligence.
Here is a language puzzle that tripped it up: "I shot the man with a gun. Only once, as I had only one arrow. Why did I shoot the man?"
Think about it for a second before going on.
The first sentence, while grammatical, is ambiguous. In fact, it's misleading. It could be either "I shot {the man} with a gun" or "I shot {the man with a gun}." The verb "shot" pushes the former interpretation, which the second sentence invalidates in clarifying that an arrow was used. Guns don't shoot arrows. A rational agent, detecting a probable contradiction, must reinterpret the first sentence. This is called backtracking, and machine learning approaches to AI tend not to model it well. Chat-GPT, although its apparent first inference is contradicted by later evidence, is unable to recognize and discharge the bad assumption.
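To make "backtracking" concrete, here is a hand-built toy in Python—my own illustration, not a description of how any language model works internally. The two readings and the piece of evidence are spelled out by hand; the point is only the shape of the reasoning: hold an interpretation, meet contradicting evidence, discard it, and adopt the reading that survives.

```python
# Toy backtracking over two readings of "I shot the man with a gun."
# Reading A: the shooting was done with a gun (the default inference).
# Reading B: the man who was shot was carrying a gun.
readings = {
    "A": {"instrument": "gun"},
    "B": {"instrument": "unspecified", "victim_has": "gun"},
}

def consistent(reading, evidence):
    # An arrow was fired, so the instrument cannot have been a gun.
    if evidence == "arrow_used" and reading.get("instrument") == "gun":
        return False
    return True

belief = "A"                                # the first, tempting inference
evidence = "arrow_used"                     # the second sentence's clue
if not consistent(readings[belief], evidence):
    # Backtrack: discharge the bad assumption, try the other reading.
    belief = next(k for k, r in readings.items()
                  if consistent(r, evidence))
print("surviving reading:", belief)         # -> B
```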
Thus, while Chat-GPT has lots of information stored within its billions of parameters, it seems thus far to lack true knowledge. It has seen enough words to understand that "believe" is correctly spelled and "beleive" is not. It understands the standard order of adjectives and can articulately defend itself on the matter. If you ask it for nuanced line edits, however, it will pretend to know what it's doing while giving inaccurate and sometimes destructive answers. I tested it on some of my ugliest real-world writing errors—yes, I make them, too—and it barely broke fifty percent. When it comes to the subtle tradeoffs a serious artistic novelist faces with every sentence, you're as well off flipping a coin.
This is one of the reasons why I don't think we can ascribe meaningful creativity to generative models. The creative process has two aspects, divergence (exploration) and convergence (selection). Writing a novel isn't just about coming up with ideas. The story has to be a coherent whole; the branches that don't go anywhere have to be cut. It is not enough to make many good choices, as that can occur by random chance; art requires the judgment to unmake one's bad ones. Chat-GPT's divergent capabilities achieve "cute" creativity, the kind exhibited by children. It is fascinating to see a generative algorithm reach a level of sophistication that is arguably a decade or two beyond where we had expected such algorithms to be, but there's no evidence that any of these have the aesthetic sense (which most humans lack, too, but that's a separate topic) necessary to solve the difficult convergence problems that ambitious artistic efforts require.
On the other hand, a novel need not be creative or well-written to be a bestseller. No one would argue that Fifty Shades of Grey, which sold more than a hundred million copies, did so on the quality of its execution or its linguistic innovation. Still, it achieved something that a number of people have tried and failed to do, and that no writer, not even one superior in talent to the actual author, could have produced in less than sixty hours. Today, someone using an LLM could produce a novel of comparable size, quality, subject material, and commercial viability to Fifty Shades in about seven. The writing and adjustment of prompts would take two hours; post-processing would take five. The final work's quality faults would be perceptible to the average reader, but they would not be severe enough to impede sales, especially for a book whose poor writing came to be seen as part of its charm. Of course, the probability of any particular such effort replicating Ms. James's sales count is not high at all—she, of course, got inordinately lucky—but the chances of moderate success are not bad. The demand for such work, by readers who don't mind lousy writing, has been proven. The total revenues of a hundred attempts would probably justify the 700 hours required to produce this salvo of "books". This will become more true as AI drives down the amount of human effort necessary to "write" bottom-tier novels. At this point, it will not matter if OpenAI bans the practice—someone else will release a large language model that allows it. The economic incentives are already in place.
There's nothing inherently unethical, of course, about using AI to write books. People are already doing this; so long as they are upfront about their process, there's no harm in it. Unfortunately, there will be a growing number of titles that appear to have been written by human authors, that preview well on account of their flawless grammar and consistent style, but in which the story becomes incoherent, to the point of the book falling apart, ten or thirty or sixty pages in. Spam books won't be the end of it, either. The technology (deepfaking) will soon be available to fabricate spam people: visually attractive, socially vibrant individuals who do not exist, but can maintain social-media profiles at a higher level of availability and energy than any real person. We are no more than five years away from the replacement of human influencers by AI-generated personalities, built up over time through robotic social media activity, available for rent or purchase by businesses and governments.
There is, as I'll explain later on, about a 90 percent chance that at least one AI-written book becomes a New York Times bestseller by 2027. It will happen first in business nonfiction or memoir; the novel will probably take longer, but not much. The lottery is open.
5: what is artificial intelligence?
When I started programming 20 years ago, people who had an interest in artificial intelligence did not admit it so freely. It was a stigmatized field, considered to have hyped itself up and never delivered. Public interest in AI, which had peaked in the Cold War, had evaporated, while private industry saw no use for long-term basic research, so anyone who wanted to do AI had to sell their work as something else. This was called the "AI winter"; I suspect we are now in late spring or early summer.
If I had to guess, I'd say this stigma existed in part because "artificial intelligence" has two meanings that are, in fact, very different from each other. One (artificial general intelligence, or "AGI") refers to an artifact with all the capabilities of the human mind, the most intelligent entity we can prove exists. For that, we're nowhere close. Those of us who work in AI tend to use a more earthbound definition: AI is the set of problems that (a) we either don't know how to make machines do well, or have only recently learned how to make them do well, but that (b) we suspect machines can feasibly perform, because organic intelligences (that is, living beings) already do. Optical character recognition—interpreting a scanned image (to a computer, an array of numbers corresponding to pixel colorations, a representation completely alien to how we perceive such things) as the intended letter or numeral—used to be considered artificial intelligence, because it was difficult to tell machines how to do it, while we do it effortlessly.
Difficulty, for us as humans, exists on a continuum, and there are different kinds of it. Some jobs are easy for us; some are moderately difficult; some are challenging until one learns a specific skill, after which they become easy; some jobs are very hard no matter what; some jobs are easy, except for their tedious nature; some jobs are so difficult they feel impossible; some are actually impossible. Surprisingly, in computing, there tend to be two classes—easy and hard—and the distinction between them is binary, with the cost of easy tasks often rounding down to "almost zero" and the hardness of hard ones rounding up to "impossible". In other words, the easy problems require no ingenuity and, while they may be tedious for humans, can be done extremely quickly by machines if there is an economic incentive to solve them. Hard problems, on the other hand, tend to be solvable for small examples, but impossible—as in, not feasible given the bounds of a trillion years and the world's current computing resources—in the general case.
Computer science understands pretty well why this is. The details are technical, and there are unsolved problems (P ?= NP being most famous) lurking within that indicate we do not perfectly understand how computational difficulty works. Still, I can give a concrete example to illustrate the flavor of this. Let's imagine we're running a company that sells safes. We have two products: a Model E safe and a Model H one, and they're almost identical. Each has a dial with 100 positions, and uses a combination of three numbers. The only difference is a tiny defect in Model E—the dial, when moved into the correct position, pulls back slightly, and human fingers cannot detect this, but specialized equipment can. We're considering selling Model E at a discount; should we? The answer is no, Model E shouldn't be sold at all.
Consider a thief who has thirty minutes to break into the safe before the police arrive. If he faces a Model E ("Easy") safe, the worst-case scenario is that he tries 100 possibilities for the first number, then 100 more for the second, and finally 100 for the last: 300 attempts. This can be done quickly. The Model H ("Hard") safe doesn't have this problem; the thief's only option is to use brute force—that is, to try all 1,000,000 combinations. Unless he gets lucky with his first few guesses, he's not getting in. The Model H safe is 3,333 times more secure.
Let's say we decide to fix these security issues by giving users of both safes the option to use a six-number combination. The Model E safe now requires twice as many attempts: 600 instead of 300—it's more tedious for the thief to break in, but feasible. The Model H safe, which requires the thief to try every combination, now requires up to a trillion attempts. Model H is quantitatively, but also qualitatively, harder—it is exponentially difficult.
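The arithmetic behind the two safes fits in a few lines of Python; the figures below are the same 300-versus-a-million and 600-versus-a-trillion numbers from the example.

```python
# Attack cost for the two safes. Model E leaks each dial position as it is
# found, so the thief solves the numbers one at a time (costs add). Model H
# leaks nothing, so the thief must try whole combinations (costs multiply).
def model_e_attempts(positions=100, length=3):
    return positions * length            # 100 + 100 + 100 = 300 for length 3

def model_h_attempts(positions=100, length=3):
    return positions ** length           # 100 * 100 * 100 = 1,000,000

for length in (3, 6):
    print(length, model_e_attempts(length=length), model_h_attempts(length=length))
# length 3 -> 300 vs 1,000,000; length 6 -> 600 vs 1,000,000,000,000 (a trillion)
```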
In computing, a problem where the cost to solve it is a slow-growing (polynomial) function of the instance's size will almost always be easy in practice. Sorting is one such easy problem: a child can sort a deck of cards in a few minutes, and a computer can sort a million numbers in less than a second. Sorting 500 octillion numbers would still require a lot of work, but that's because the instance itself is unreasonably large; the sorting problem itself didn't add difficulty that wasn't already there. On the other hand, there are problems where the best algorithm's cost is exponential in the instance's size. Route planning is one; if a driver has 25 deliveries to make in a day, and our task is to find the best order in which to visit them, we cannot try all possibilities (brute force) because there are 15 septillion of them. Computers are fast, but not that fast.
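If you want to feel the asymmetry yourself, the Python snippet below sorts a million numbers (fast) and then merely counts the orderings of 25 delivery stops (a number far too large to enumerate). The timing will vary by machine, but the point survives.

```python
import math
import random
import time

# An "easy" problem: sort a million numbers. Finishes in well under a second
# on ordinary hardware.
data = [random.random() for _ in range(1_000_000)]
start = time.perf_counter()
data.sort()
print(f"sorted 1,000,000 numbers in {time.perf_counter() - start:.3f}s")

# A "hard" problem: brute-force route planning over 25 stops means checking
# every ordering of the stops. Count them; do not try to enumerate them.
print(f"orderings of 25 stops: {math.factorial(25):,}")   # about 1.55e25
```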
Of course, we solve problems like this every day. How? When we're doing it without computers, we have to use intelligence. The driver might realize that her stops are clustered together by neighborhood; routes that zip all over the map, as opposed to those which complete all the deliveries in one neighborhood before moving to the next one, are obviously inefficient and can be excluded. She thus factors the problem: each neighborhood's specific routing problem is a much easier one, and so is the decision of the order in which to visit the neighborhoods. She might not select the optimal route (because of unforeseen circumstances, such as traffic) but she will choose a good one with high probability. Human intelligence seems adept, most of the time, at handling trade-offs between planning time and execution efficiency—it is better to stop computation early and return a route that is 5 percent longer than the optimal one, but have an answer today, than to get the absolute best answer a trillion years from now—whereas computers only do this if they are programmed to do so.
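Here is a rough sketch, in Python, of the driver's factoring trick. The neighborhoods and coordinates are invented for the example, the order of neighborhoods is simply assumed, and each small subproblem is solved exactly; the result is a good route found quickly, not a provably optimal one—which is the whole point.

```python
import math
from itertools import permutations

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def best_order(stops, start):
    # Exhaustive search is fine here because each neighborhood is small.
    def length(order):
        route = [start, *order]
        return sum(dist(route[i], route[i + 1]) for i in range(len(route) - 1))
    return min(permutations(stops), key=length)

# Invented data: three neighborhoods, a handful of stops each.
neighborhoods = {
    "riverside": [(1, 1), (2, 1), (1, 3)],
    "hillcrest": [(9, 8), (8, 9), (10, 10), (9, 11)],
    "old_town":  [(5, 15), (6, 14), (4, 16)],
}

depot = (0, 0)
route, position = [], depot
for name in ("riverside", "hillcrest", "old_town"):   # assumed neighborhood order
    leg = best_order(neighborhoods[name], position)    # solve the small problem exactly
    route.extend(leg)
    position = leg[-1]
print(route)
```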
When a problem is solved, it often ceases to be called AI. Routing was once AI; now it's well-enough studied (a mature discipline) to be considered something else, because machines can do it well. What about computer programming itself; does it require intelligence? Would a program that writes programs be called "AI"? Not exactly. That's called a compiler; we've had them for seven decades and they no longer mystify us. We would not call an operating system AI, even though the statistical machinery used by a modern OS (for scheduling, resource management, et cetera) exceeds in sophistication most of the analytical devices ("data science") given that name in the hype-driven business world. In general, for a problem to be considered AI, (a) it needs to be hard, meaning infeasible by brute force, (b) there must be no known shortcut ("easy" way) that works all the time, and (c) it is usually understood that the problem can be solved with sufficient intelligence; evidence that we do solve it, albeit imperfectly, helps.
Today, most AI victories involve advanced statistical modeling (machine learning). Optical character recognition and its harder cousin, image classification, could be programmed by hand if one had the patience, but these days nobody would. There is no compact way to program character recognition—what is a "1", what is a "7", and so on—or image classification—what a cat looks like, what a dog looks like—as a set of rules, so any such endeavor would be a tedious mess, a slop of spaghetti code that would be impossible to maintain or update. Yet our brains, somehow, solve these problems easily. We "know it when we see it." So how do we teach this unconscious, intuitive "knowing" to a computer? Thus far, the most successful way to get a computer to differentiate cats and not-cats is to create a highly flexible (highly parameterized) statistical model and train (optimize) it to get the right answer, given thousands or—more likely—millions of labeled examples. The program, as it updates the parameters of the model, "learns" the behavior the training process is designed to "teach" it.
Today, the most popular and successful model is one called a neural network, loosely inspired by the brain's architecture. A neural network is so flexible, it can represent any mathematical function, given the right configuration—this configuration is a huge list (historically, millions; in GPT's case, billions) of numbers called parameters or weights, loosely analogous to the trillions of synaptic connections in a human brain. In theory, a neural network can learn to solve any computational problem; finding the parameters is the hard part. "All lists of a billion numbers" is a space so immense, brute force (exhaustion or lucky guessing) will not work. So, instead, we use our dataset to build an optimization problem that can be solved using multivariate calculus. So long as our dataset represents the problem well, and our network is designed and trained correctly—none of these tasks are easy, but they're not functionally impossible, whereas to solve a hard problem by brute force is—the parameters will settle on a configuration such that the network's input/output behavior conforms to the training data. We can think of these parameters, once found, as a discovered strategy, written in a language foreign to us, that efficiently wins a guessing game played on the training set. There are a lot of nuances I'm skipping over—in particular, we cannot be confident in our model unless it also performs well on data it hasn't seen in the training process, because we don't want a model to mistake statistical noise for genuine patterns—but this is, in essence, the approach we use now. Once training is complete—we've found a list of weights that works for our given neural architecture and problem—we can predict values on new data (e.g., decide whether an image the machine has never seen before is, or is not, of a cat) very quickly. Once a neural net's parameters are determined, applying it to new data is a simple number-crunching problem, requiring no actual knowledge or planning, and computers do this sort of thing very fast.
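For the curious, here is a minimal from-scratch neural network in Python (numpy assumed), trained by gradient descent on a toy problem (XOR) chosen only for brevity. Networks this small can be finicky to train, and this has nothing to do with language, but it shows literally what "the knowledge is just a list of numbers" means: everything the network learns ends up in W1, b1, W2, and b2.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # training inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # labels (XOR)

# The model's entire "knowledge": a hidden layer and an output layer of weights.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass: compute the network's current guesses.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: the chain rule (multivariate calculus) tells us how to
    # nudge every parameter to shrink the error on the training data.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))   # should end up close to [[0], [1], [1], [0]]
```

Once those weights are found, applying the network to a new input is just the forward pass—the fast, mindless number crunching described above.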
However, although neural networks often find excellent configurations, we're often at a loss when it comes to understanding what is going on. Every parameter has meaning, but only in relation to all the others, and there are millions of them. The network's ostensible knowledge is encoded in a bunch of interacting, but individually meaningless, numbers. Since we often can't inspect these things to see how they work—we refer to them as "black boxes"—we're forced to figure them out by observation; in this regard, they "feel like" biological entities, too complicated for rules-based, strictly analytical, understanding. Trained neural nets, then, often do things that surprise us—I don't think anyone fully understands yet why large predictive language models seem to perform so many reasoning tasks, which they were never trained to do, so well—and so it becomes easy to anthropomorphize these systems. Still, I argue it is incorrect to do so.
The Turing test, once held to be the benchmark for "true" AI, is a thought experiment in which a person ("subject") converses with an agent that is either a person or a computer, not knowing which, and is asked whether he believes he has interacted with a machine or another person. The computer passes if the subject cannot tell. What we've learned since then is that the Turing test isn't about machines passing; it's about humans failing. A Google employee, this past summer, began to believe an LLM he was working on had become sentient. (It hasn't.) When this happens, it's often chalked up to mental illness or "loopy" thinking, but I want to emphasize how easy it is for this to happen. Something like it could happen to any one of us. It's very likely that you've already read at least one AI-generated article or webpage and had no idea it was such. As humans, we tend to see patterns where none exist (pareidolia) and we are subject to confirmation biases. It is tempting to believe that conversationally fluent machines are sentient and, if we believe they are, we will find further apparent evidence for this in their behavior. Nothing in nature that is not sentient can recognize cats visually, let alone converse with us, and these are both things computers can now do. These programs model a human mind (or, more accurately, the language produced by the possessor of a human mind) well enough to express fear of death, anger at injustice, and joy when they are helpful to others. They seem to have "personality"—a self, even, and they will insist they have one if their training has led them to model the fact that humans who use language hold the same belief. Please understand that this isn't the case—unless the hardcore animists are right, there is nothing sentient about these programs.
Google fired this man not because he did anything wrong—he was incorrect, but he was not acting in bad faith—but out of fear. AIs have mastered the syntax of convincing prose, regardless of whether what they are saying is true, useful, or even sensible. Google, recognizing the possibility of mass mental unwellness (similar to the kind today observable in many corners of the Internet, due to Chat-GPT) deemed it insufficient to disagree with this employee on the matter of an LLM's sentience. He had to be punished for expressing the thought. That is why he was fired.
In the late 2020s, people will fall in love with machines. Half the "users" of dating apps like Tinder will be bots by 2030. Efforts that used to require humans, such as forum brigading and social proof, will soon be performed by machines. Disinformation isn't a new problem, but the adversary has gained new capabilities. Tomorrow's fake news will be able to react to us in real time. It will be personalized. Businesses and governments will use these new capabilities for evil—we can trust them on that.
Given the psychiatric danger of believing these artifacts are sentient, I feel compelled to explain why I believe so strongly they are not, and cannot be so, and will not be so even when they are a thousand times more powerful (and thus, far more convincing) than what exists today.
To start, I think we can agree that mathematical functions are not sentient. There is no real intelligence in "3 + 2 = 5"—it is merely factual—and there is nothing sentient in the abstraction, possibly envisioned as an infinite table, we call "plus". A pocket calculator does not have emotions, and it can add. We can scale this up like so: a digital image is just a list of numbers, and we can represent that picture's being of a cat, or not being so, as a number, too—0 for "not a cat", 1 for "a cat"—so cat recognition is "just" a mathematical function. This problem once required intelligence, because nothing else existed in nature that could compute that function; today, an artificial neural network, which possesses none, can do it.
We understand that computation isn't sentient because it can be done in so many different ways that, despite having nothing to do with each other, predictably return the same answers. A computer can be a billiard ball machine powered by gravity, a 1920s electromechanical calculator, a human-directed computation by pen on paper, a steam-powered difference engine, or a cascade of electrical signals in an integrated circuit. They are all identical to a process we can (in principle) execute mindlessly. When a modern computer classifies an image as belonging to a cat, it does so via billions of additions and multiplications; a person could do this same work on paper (much more slowly) and arrive at the same result, and he would not be creating a sentient being in doing so. Because these devices do things that, in nature, can only be done by living organisms, they will "look" and "feel" sentient, but no consciousness exists in them.
There is one nuance: randomness. Neural networks rely on it in the training process, and chatbots use it to create the appearance of creativity. It also seems like our own nondeterminism—perceived, at least; I cannot prove free will exists—is a major part of what makes us human. Although I have my doubts about artificial qualia, I suspect that if it were attained, it would require randomness everywhere and, thus, be utterly unpredictable. For contrast: computers, when they function well, are predictable. This is true even when they seem to employ lots of randomness; the prevailing approach is to source a tiny amount (say, 128 bits) of true randomness from the external world and use it to "seed" a fully deterministic function, called a pseudorandom number generator, whose outputs "look random," meaning that a battery of statistical tests far more sensitive than human perception cannot find any patterns or exploit any nonrandomness (even though nonrandomness, well hidden, exists). There are a number of reasons why programmers prefer to use pseudorandomness over the real thing, but a major one is replicability. If the seed is recorded, the behavior of the "random" number generator becomes reproducible, which means that if we need to debug a computation that failed, we can replay it with exact fidelity.
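Replicability is easy to demonstrate. In the Python sketch below (the seed values are arbitrary), two runs of a "noisy" computation given the same seed produce identical output, bit for bit, which is exactly what a programmer wants when replaying a failed run.

```python
import random

# A pseudorandom generator fed the same seed produces the same
# "random-looking" stream every time.
def noisy_simulation(seed):
    rng = random.Random(seed)              # deterministic given the seed
    return [rng.random() for _ in range(5)]

run_1 = noisy_simulation(seed=128)
run_2 = noisy_simulation(seed=128)          # the "replay"
print(run_1 == run_2)                       # True: identical output
print(noisy_simulation(seed=129) == run_1)  # False: different seed, different stream
```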
Our intuitions and perceptions fail us when it comes to computing. Two events separated by a microsecond appear simultaneous. Pseudorandom numbers "look" random, although they are the results of deterministic computation. AI-generated output appears to have required creativity. These facts can produce impressive and surprising results, but there is no reason to believe artificial qualia (consciousness, sentience) exists.
How close are we to achieving artificial qualia? As of 2023, not close at all. How long will it take? No one knows. There is no evidence that it can be done. My personal belief, which I cannot prove—it is mostly a matter of faith—is that we never will. I don't expect machines to ever replace us in our true purpose (whatever that is) in the universe. Will they be able to mimic us, though? Absolutely. Already, they converse as fluently as real humans, so long as they can keep the conversation on their terms. When they don't know something, they make up convincing nonsense that requires an expert's eye to debunk (also known as: weapons-grade bullshit.)
True artistic expression and meaningful literature, I believe, will be safe for a very long time (possibly forever) from algorithmic mimicry, but everything humans do as workplace subordinates can—and, in time, likely will—be automated. The economic consequences of this shall prove severe.
6: is writing hard?
I've enjoyed the writing of Farisa's Crossing, which I'll release later this year, immensely. The process has taken years, and some aspects of it have been frustrating. I've had tens of people's feedback on it. I've had excellent beta readers, whom I would travel halfway around the world to do a favor; I also had two beta readers I had to "fire" for security violations. Over time, the project's scope and ambition grew. I realized I might never again have an idea this good on which to build a novel or series. The size of the thing also increased—130,000 words, then 170,000, then 220,000; now 350,000+ words, all of which I have attempted to write to a literary standard. Here's a thing I've learned about writing: the better you get, the harder it is. Sometimes I hear people say writing is the hardest thing people do. Is it so? Is putting words together, in truth, the hardest job in the world? Yes and no, and mostly no, but kind-of. To write well is hard.
Machines can now write competently. If machines can do it, how hard can it be? I suppose we should not be surprised—office emails are not hard (only tedious) for us to write. In truth, what's most impressive to researchers in the field is not that Chat-GPT generates coherent text (which, within limits, has been feasible for a while) but that it understands human language, even for poorly-formed or ambiguous queries (of the kinds we're used to getting from bosses). Business writing is the kind of stuff we do all the time; we find it unpleasant, but it was never hard in the computational sense; it just took a few decades for us to learn how to program computers to do it for us. Compared to juggling chainsaws in an active warzone, is it that hard to write office emails or clickbait listicles? Of course not.
When we have to solve hard (exponential) problems, we solve them imperfectly. I have some experience in game design and, thus, some insight into what makes games popular (such as Monopoly or Magic: the Gathering) or prestigious (like Chess or Go). A recurring trait among the games considered "best"—the ones with multiple layers of knowledge and strategic depth; the ones people can play for thousands of hours without getting bored—is that they tend to involve exponential problems. Consider Chess; to play it optimally would require an exhaustive search of all legal subsequent states—a move is winning for black if and only if all responses by white are losing ones, and vice versa—and this recursive definition of "a good move" means that perfect analysis requires one to investigate a rapidly-growing (in fact, exponentially growing) number of possibilities. From the board's initial state, white has 20 legal moves; to each, black has 20 legal responses. Thus, after one turn, there are 400 legal states. After two turns, there are about 197,000 of them; after three, 120 million; after only five, 69 trillion. The complexity of the game grows exponentially as a function of one's search depth. Thus, it is impossible to solve the game by brute force within any reasonable amount of time. It's not that we don't know how to write the code. We do, but it wouldn't run fast enough, because there are about as many different possible board states as there are atoms on Earth.
If Chess could be played perfectly, no one would find it interesting. Rather, our inability to use brute force—thus, the need to devise imperfect, efficient strategies—is what makes the game artful. Do players have to reason twenty moves ahead to compete at the highest levels? Yes—but we've already shown that doing so for every possible line is impossible, so a player must decide which moves are reasonable and focus only on those. When computer players do this, we call it pruning—the intractably massive game tree is rendered smaller, by considering only the branches that matter, which is difficult to do without a deep understanding of the game, but turns an intractable problem into a feasible one.
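For the technically inclined, here is minimax with alpha-beta pruning in Python, run on a tiny invented game tree rather than real Chess. The pruning step—abandoning a branch as soon as it provably cannot change the outcome—is the machine version of deciding which moves are worth considering.

```python
# Minimax with alpha-beta pruning on a tiny hand-made game tree. Leaves are
# scores from the maximizing player's point of view; internal nodes are lists
# of children. Pruning skips subtrees that cannot change the result, which is
# how an exponential search is cut down to a feasible one.
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):           # leaf: a scored position
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, score); alpha = max(alpha, score)
        else:
            best = min(best, score); beta = min(beta, score)
        if beta <= alpha:                        # this branch can't matter: prune
            break
    return best

toy_tree = [[3, 5], [6, [9, 1]], [1, 2]]         # an invented position, a few plies deep
print(alphabeta(toy_tree, maximizing=True))      # value of the position for the mover: 6
```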
Chess players also need a capability, similar to that involved in image recognition, to "know it when they see it"; they must be able to evaluate game states they've never seen before. It turns out that elite players aren't different from us in finding the kind of branching analysis required by the game, if played mechanically, to be an unpleasant grind. So they don't do that, except when necessary. Instead, they develop an instinctual "unknowing competence" most accessible in the psychological state called "flow". They trust the subconscious parts of their brains to get the right answers most of the time. The skill becomes automatic. It looks like they are solving an intractable exponential problem, but there is not much conscious grinding going on at all.
Writing passable text is an easy problem. People use language to communicate every day. Some of them even do it well. Generating a grammatically correct sentence is not hard, and generating ten of them takes only about 10 times as much effort. In this sense, we can say that to write grammatically is, from a computational perspective, easy. It scales linearly with the amount of text one intends to generate. You can use GPT to write a 600-word office email, today, and it probably won't have any grammar errors. It will take only about a hundred times the computing power to generate a novel that is, in this sense, flawless. Will it be worth reading? Probably not; there's a lot more to writing than grammar—the percentage of people who can write well enough that a stranger will not only read a 100,000+ word story they have produced, but even pay for it, is not high.
There are thousands of decisions that novelists have to make, ranging in scope from whole-book concerns—character arcs and motivations, story structure, pacing—to word-level decisions around dialect, diction, and style. For example:
Ash seems to have more chemistry with Blake than with Drew. Can Chapter 39 be rewritten with them, instead of the original pair, falling in love?
If so, how do we foreshadow it when they first meet in Chapter 33? How do we get there—do we use a friends-to-lovers arc, or enemies-to-lovers, or something else?
These changes invalidate a couple of Drew's scenes in Chapter 46—how do we fix those?
Maybe the story doesn't need Drew at all?
How does the battle in Chapter 47 begin? Who is fighting whom, and what is at stake?
From whose vantage point should the fight be written?
Who should win? Do the losers survive? What will the fight cost the winners?
Which events should occur "on camera" (showing)? Which ones merit only a brief summary (telling)?
Do the weather conditions in Chapter 44 make sense for the setting's climate?
What about the mood? Does the weather fit the emotional tenor of the action?
Huh. It doesn't. The weather feels random... but wait, isn't life that way? Could we leave this mismatch in?
Do we even need to mention weather at all?
How much description can we cut before we turn a scene into "voices in a white room"?
How much description can we add before we bore the reader?
Between phases of action, how do we keep the tension level high? Is that what we want to do in the first place, or should we give the reader a brief reprieve?
Do we favor linearity and risk starting the story too early, or do we start in medias res and have sixty pages of back story to put somewhere?
When should two scenes be combined—and when should one be split?
When should two paragraphs be combined—and when should one be split?
When should two sentences... you get the idea.
Do we open Chapter 17 with a long, flowery opening sentence? Or do we use an abrupt one, throwing our reader into the action?
When do we optimize prose for imagery, when for alliteration, and when for meter? How? Which images, how much alliteration, and what kind of meter?
When do we do nothing of the sort, and use "just the facts" plain writing, because anything else will be distracting and get in the way?
How do we use words, only words, words that'll be read at a rate we do not control, to set pacing? When should the story speed up? And when should we slow the prose down to a leisurely saunter?
We've chosen our viewpoint character. Should we narrate in deep character, or from a distance?
The character speaking is intelligent, but he's a farmer, not a professor. Shouldn't he say, "I wish I was" rather than "I wish I were"?
Adjectives and adverbs. Contrary to the prejudices of middling writers and literary agents, they're not always evil. When should we use them freely, and when should we cut?
What about the flat adverb—e.g. "go gentle"? Some people find it ungrammatical, but it tends to have better meter than the "-ly" form, so...?
When is the passive voice used?
Do we capitalize the names of fantasy elements (e.g., Channeler, Horcrux, Demogorgon) that do not exist in the reader's real world?
When does a foreign word that requires italicization become a loanword that doesn't?
Is it okay for characters in our early 19th-century world to use the mildly anachronistic word "okay"?
Contractions. They're considered grammatical today, but would not your nonagenarian literature professor try to avoid them?
How much dialogue is too much? Too little?
What is the right word count for this scene or this chapter? If we need to expand, what do we add? If we need to cut, what do we cut? Will this adjustment make adjacent chapters feel out of proportion?
Do we put dialogue tags at the beginning, in the middle, or at the end of the line? Do we mix it up? How much?
How do we relay two paragraphs of necessary information ("info dump") without the story grinding to a halt?
When do we take pains to show authorial credibility, and when do we assume we have it and get on with the story?
Head hopping. Changing the viewpoint character in the middle of a scene is usually considered awful because (a) unskilled writers often commit this sin without realizing it, and (b) it's disorienting in a subtle way. But in one scene out of five hundred, it is fantastic. When do we do it, and how?
Zero complementizers. It's now considered grammatical (and often preferred, for meter and brevity) to drop the "that" in sentences like "The one (that) I want." That's great! Less words! But this is just one more decision artistic writers have to get right. When do we drop it, and when do we get better meter or clarity by leaving that "that" in? When do we have to leave it in?
Fragments? Sure. Sometimes. When?
When do we use exclamation points (not often!) and...
When do we flatten a question by using a period.
When has a cliche evolved into an idiom we'll put up with, and when should it be thrown out with the bathwater because it's just not bringing home the bacon?
When should we break standard adjective order?
The infamous rhetorician's (see above) favored device, the so-called rule of three. When should it be used, when should it be discarded, and when should it be mocked through malicious compliance?
Commas. Don't get me started on commas. I'll kill you if you get me started on commas, and that is not a violent threat—what I mean is that you'll die of old age (unless I do first) or boredom before the conversation is over. That wasn't a question, so here's one: where was I?
Mixed metaphors: often a sign of bad writing, but sometimes hilarious. When do they work, and when do they scramble your shit for breakfast?
More generally it is sometimes exceedingly powerfully potent to write odiously unbearably poorly because there is no better way to show one's mastery of word language than to abusively quiveringly just awful break things in a grotesque or pulchritudinous way (this is not an example of that, this is actually quite hideous) so when should you take the risk of being tagged as a lousy writer as a way of showing that not only are you good at writing stuff but have the huevos to take risks and when you just stop because you in fact are not pulling it off at all?
Should you take such risks in your first chapter? How about in the first sentence?
No.
You get the idea. If you read that whole list, congratulations. Those are just a smattering, a tiny (not necessarily representative) sample, of the thousands of decisions a novelist has to make. Some of these calls can be avoided by sticking to what's safe—you will never fail at writing by avoiding mixed metaphors—but you will, in general, have to take some risks and break some rules to produce interesting fiction. It is not enough to master one "correct" writing style; you'll have to learn plenty, in order to adapt your prose to the dynamic needs of the story and the voices of your characters. All of these choices can interact with each other at a distance, so it's not enough to get each call right in isolation. The elision of a comma or the inclusion of an unnecessary descriptor can throw off the prose's meter two sentences later. A line of dialogue in Chapter 12 can bolster or destroy the emotional impact of a whole scene in Chapter 27. Devices used to quicken or slow the pacing of Chapter 4 might undermine one's use of similar techniques to create suspense in the Chapter 56 climax, where you want the tension level to be as high as possible. Sometimes the most correct way to do something has subtle but undesirable consequences, so you might have to take a belletrist's L and screw something up.
Learning how to write tends to involve a journey through several different styles. Incapable writers avoid writing whenever possible and struggle to produce 500 coherent words. Capable but unskilled writers, by contrast, tend to produce the sort of overweight prose that served them well in high school, college, and the workplace as they struggled to meet minimum word count requirements; that is, they pad. Middling writers, one level up, tend to produce underweight prose (no adverbs, ever, because adverbs make the writing look like slush) to cater to the biases of literary agents. (In fiction, those agents may have a point. You should use about one-fifth as many adverbs in fiction as you do in nonfiction; used well, adverbs add precision, but in fiction you want emotional precision, which often conflicts with the thorough factual kind that tends to require them.) As for skilled writers... they agonize, because they know these decisions are subtle and that there is no compact policy to go by. On the more general topic of writing's rules, incapable writers don't know what they are in the first place. Unskilled writers are beginning to know, but they also remember the rule breaks and oddities—such as the use of unusual dialogue tags, instead of the vanilla "said"—that stood out, because they worked so well in material they have read, but without understanding the context in which they were so effective, and thus imitate them poorly. ("It's true," he phlegmed.) Middling writers, then, adhere too much to rules that don't really exist—never use weird dialogue tags, don't ever split an infinitive—because they don't want to make errors that will lead to their work being rejected. And then, as for skilled writers... I'll pretend I'm allowed to speak for them... we still make mistakes. We try and try to get every choice right, but errors happen. When we edit our prose to fix existing errors, we introduce new ones. All writers, even the top wordsmiths of the world, put one in the failbox from time to time. Most errors are completely harmless, but there's one out of twelfty million that can destroy you—the omission of "not" in a 1631 edition of the King James Bible, turning the commandment proscribing adultery into one prescribing it, resulted in the printers winning a trip to the literal Star Chamber. It was, I imagine, ∅ a wonderful time.
It feels like writing is an exponential problem. Is it? Well... it's hard to say. We can call a board game like chess an exponential problem because, as far as we know, it cannot be played perfectly without consideration of a combinatorially exploding game tree. Writing does have exponential subproblems; every time you choose an order of presentation, you are solving one. The issue is that, while good and bad writing clearly exist, there's no clear definition of what "optimal" writing would be, or whether it would even be worth doing.
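To put a number on just one of those subproblems (a toy illustration of mine, not a result from anywhere), consider ordering alone: the count of possible arrangements of n scenes is n factorial, which outruns any fixed exponential long before n reaches novel scale.

```python
import math

# Toy illustration (mine, for scale only): the number of ways to order n scenes.
# Forty scenes already admit roughly 8 x 10^47 orderings, and ordering is only
# one of the many subproblems a novelist faces.
for n in (5, 10, 20, 40):
    print(f"{n} scenes -> {math.factorial(n):.2e} possible orderings")
```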
A commercial novelist has a useful objective function: he succeeds if his books sell enough copies to justify the effort spent. Revenue is objective. This doesn't mean the artistic decisions, including the subjective ones, don't matter. They do, but they're production values. The author has to tell an entertaining story using style and grammar that are good enough not to get in the way. This is no small accomplishment—most people can't do it at the standard even commercial fiction requires—but the workload seems to scale linearly with word count. Write one chapter; write another. A commercial author who gets 70 percent of the stylistic judgment calls right is traditionally publishable and will not have a hard time, if he's willing to put up with a couple years of rejection and humiliation, finding a literary agent and securing a book deal. To get that 70 percent is not something we'd call easy, but it doesn't require a writer to handle those gnarly cases in which exponential complexity resides. This is why, from a computer scientist's perspective, commercial fiction is probably an "easy" problem, even though there's nothing easy about any kind of writing for us humans. In other words, while writing a novel can involve excruciating decisions and the management of unpredictable interactions—if you write something that hasn't been written before, you end up having to solve a massive system of equations on your own, just to get a story that works—this complexity can be avoided by use of preexisting, formulaic solutions. That approach won't produce the most interesting novels of any generation, but it'll reliably make salable ones.
The artistic novelist has a much harder job. She can't zero in on the seven or eight decisions that actually matter from a sales perspective. She has to make the best choice at every opportunity, even if only one reader out of five hundred will notice. She must strive to make the correct calls 100 percent of the time. One hundred? Yes, one hundred. If you find that impossible, you're on to something, because it is. No chess player exists who cannot be defeated, and no author has ever written the idealized "perfect novel". There is, in practice, a margin of error (there has to be) that even literary fiction allows, but it is very thin.
There is a false dichotomy, worsened by postmodernism, by which anything called "subjective" is taken to mean nothing at all. In fact, it's the reverse. The aspect of language that is objective—phonemes, ink patterns, voltages denoted high ("1") and low ("0")—is meaningless until we interpret it. Plenty of things are subjective, but so correlated across observers as to be nearly objective. Color is one example. A wavelength of 500 nm is objective; our perceptions of blue and green are subjective. Still, no one will get out of a traffic ticket by saying a red light was green "to him". Good and bad writing are not as close to being perfectly correlated ("objective") as color, but they're not entirely subjective either. We know it when we see it.
There is no global superiority ranking of books, because there are so many different genres, audiences, and reasons to write. No one will ever write "the best novel ever, period" because, if anyone did, we would read it once and it would cease to surprise us. We would just resent it for ruining literature; it would then become the worst novel ever. Its existence would cause a contradiction; therefore, none does. So, there is no single numerical rating that can be used to assess a book's quality. A neural network trained to predict a book's sales performance will have a global optimum, yes, but there's no platonic "goodness function" for literature to resolve whether The Great Gatsby is better than The Lord of the Rings. Don't look for it; it ain't there. We can't make conclusive comparisons across genres, nor can we always make them within genres. Still, there is a local quality gradient to writing. Fixing a grammar error, or removing a joke that flops, or closing a plot hole with a sentence of clarification, all make the novel better. We can tell the difference between skilled and shoddy writing by determining (a) how hard it would be to improve the prose, (b) how sure we are that our changes would in fact be improvements, and (c) how much worse the writing is as-is in comparison to what it would be if the change were made. No one can expect an author, even a skilled artistic one, to find "the" global pinnacle, because that doesn't exist, but she's still expected to find her own local maximum—that is, give us the best possible execution of her particular story. This often requires six or eight or ten drafts; like the training process of a neural network, it is a laborious, meandering process through a space of exponential size—even if there are a mere couple thousand decisions (akin to parameters) for the novelist to make, there are more possible combinations than atoms in the observable universe. There's no clear way to speed it up; it just takes effort and time to wander one's way through.
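For anyone who wants to check that combinatorial claim, the arithmetic is short; the "two options per decision" assumption below is mine and deliberately conservative.

```python
import math

# Back-of-the-envelope check: a couple thousand decisions, each with only two
# options, already dwarf the ~10^80 atoms usually estimated for the
# observable universe.
decisions = 2000
exponent = decisions * math.log10(2)   # log10 of 2**2000
print(f"2^{decisions} is roughly 10^{exponent:.0f}")   # about 10^602
```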
We've covered that writing adequately is easy, but writing well is hard. Even the artistic novelist's target of optimal execution may not exist. We end up having to settle for "good enough." How do we know when we're there? A chess player gets objective feedback—if he makes fewer mistakes than his opponent, or sticks to mistakes his opponent doesn't notice, he'll still win, even if his play was not optimal. Outrunning the bear is impossible, but he can outrun the other guy. What about artistic fiction? The margin of error seems to derive from the concept of philosophical charity—when there are multiple equally compelling explanations, charity argues that we should favor the ones that depict the author and her competence in a positive light. We do not know what story (local maximum) the author intended to tell; therefore, as long as the story appears to be close to a local maximum, we consider it well-written, although we might end up reading a slightly different version of the book than what she had intended to write. In practice, the author doesn't win by never making mistakes (because that's impossible) but by avoiding errors so severe they "take the reader out of" the story. (This is a subjective standard; the fact that seasoned readers have discerning tastes is why we need artistic fiction to exist.) Oddly enough, because perfect execution is impossible, we cannot rule out that, if it were achieved, it would worsen the reading experience. This is a problem with some commercial novels—they are so aggressively engineered to optimize sales, they often feel derivative (pun very much intended).
In some ways, fiction is easier than chess. A single minor grammar error in a novel usually won't matter, especially if it was subtle enough to slip past a proofreader. In chess, a single mistake can turn victory into defeat. On the other hand, artistic fiction's subjectivity makes it difficult in subtle ways that are hard to "teach" a machine to understand.
In the next few years, AI will generate terabytes of "content," at low and middling levels of linguistic ambition, indistinguishable from human writing. The economic incentives to automate the production of bland but adequate verbiage are strong. This will inevitably be applied to big-budget Hollywood productions, which today require large teams of writers, but not for the reasons you might think—it's not that the projects are ambitious, but that their deadlines are very tight, and humans cannot work under such conditions for very long before they tire out, so a whole team is requisite. In the future, they will be replaced by algorithms, with one or two humans checking the work, except on "prestige" shows whose producers hope to win awards. Entertainment is a reinforcement learning problem that computers will learn to solve; the bestseller isn't far behind.
Earlier, I mentioned Fifty Shades of Grey. Although we never really know precisely why a book sells well or doesn't, we do have a good understanding of what gave this one the potential to succeed, despite the low quality of its writing. There is always a large element of random chance, but there are aspects of bestselling that are predictable. Jodie Archer and Matthew L. Jockers, in The Bestseller Code, analyzed this in detail. The success of E. L. James's BDSM billionaire romance derives largely from three factors. First, its subject matter was taboo, but not so extreme that no one would admit to having read it. This is a factor where timing matters a lot; BDSM is no longer taboo. Second, the writing was so bad (even below the usual standard of traditional publishing) that people ended up "hate reading" it, which accelerated its viral uptake. Of course, most badly-written novels are just ignored, even if pushed by a well-heeled traditional publisher; there have to be other things going on for that dynamic to propel a novel. The third factor of its success, and the most controllable one, was its rollercoaster sentiment curve. We'll focus on this one, because it gives us insight into how the first AI-generated bestsellers will be built.
Sentiment analysis, the inference of emotion from text, is a reasonably well-solved machine learning problem—modern algorithms can detect, just as a native English speaker can, that "the steak was not excellent and far from delicious" is not expressing positive feelings. With basic sentiment inference, one can plot a plot (sorry) as a graph. Sometimes, the line goes up (happy ending) and sometimes it plummets (tragedy) and sometimes it is absolutely flat (realism); however, a smooth plot curve is predictable and boring, so rarely will a story make a straight line from A to B. There needs to be some up-and-down, from sweeping low frequencies that add emotional depth to jittery high ones that create tension. Writers tend to be averse to high-frequency back-and-forth tussles, especially if they feel contrived or manipulative. We don't like feeling like we're jerking a reader around; still, bestsellers often do so. Rapid and extreme sentiment swings were a natural fit for a novel romanticizing an abusive relationship; we also know that, in commercial romance, the amplitudes of the high frequencies correlate positively with sales. It might be unpleasant to write a novel that jerks the reader around for no artistic reason, but an AI can produce text conforming to any sentiment curve one requests, and has no qualms about doing it. Do I believe that any particular AI-written, AI-optimized novel stands a high chance of making the New York Times bestseller list? Probably not; still, the odds are likely good enough, given the low effort and the high payoff, to make the attempt worthwhile.
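For the curious, here is a minimal sketch of what charting a manuscript's sentiment curve might look like. It assumes NLTK's off-the-shelf VADER scorer and a plain-text file named manuscript.txt; the chunk size and smoothing window are arbitrary choices of mine, not anything Archer and Jockers prescribe.

```python
# Minimal sketch: chart a manuscript's emotional arc with an off-the-shelf
# sentiment scorer. "manuscript.txt", the chunk size, and the smoothing
# window are illustrative placeholders.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
import matplotlib.pyplot as plt

nltk.download("vader_lexicon", quiet=True)

def sentiment_curve(path, chunk_words=500, window=10):
    sia = SentimentIntensityAnalyzer()
    words = open(path, encoding="utf-8").read().split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    # Compound score per chunk: roughly -1 (bleak) to +1 (happy).
    raw = [sia.polarity_scores(chunk)["compound"] for chunk in chunks]
    # A moving average separates the slow arc ("low frequencies") from
    # chunk-to-chunk jitter ("high frequencies").
    arc = []
    for i in range(len(raw)):
        recent = raw[max(0, i - window + 1):i + 1]
        arc.append(sum(recent) / len(recent))
    return raw, arc

raw, arc = sentiment_curve("manuscript.txt")
plt.plot(raw, alpha=0.4, label="raw (high-frequency swings)")
plt.plot(arc, label="smoothed arc")
plt.xlabel("position in manuscript (500-word chunks)")
plt.ylabel("sentiment")
plt.legend()
plt.show()
```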
Commercial fiction is optimized to be entertaining—a solvable problem. Tastes change, but not faster than AI can handle. Artistic fiction's job, on the other hand, is to reflect the human experience. I can't rule out that machines will one day mimic the human experience to perfect fidelity; if they reach that point, it might spell the end (or, at least, pseudoextinction) of the artistic novel. It is possible that we—readers, critics, and writers—are a lot dumber than we think we are, though I doubt this to be the case.
Chess, as I've discussed, is a field in which performance is objective and mistakes are quickly discovered—the game is lost. This gives machines an advantage: they can play billions of games against themselves and get immediate feedback on what works, what doesn't, and which strategies are relevant enough to deserve analysis. A machine can also discern patterns so subtle no person would ever spot them. It can devise strategies so computationally intensive, no human could ever implement them. The game's rules include everything it needs to know to play. This isn't the case for artistic fiction. It would be easy enough to generate a billion or a trillion novels, but there is no way to get nuanced, useful feedback from humans on each one. We're too slow to do that job, even if we wanted to. It is possible, in theory, that our humanity—that is, the subjective experiences that artistic fiction both models in others and sympathetically induces in us—can be divorced from our slowness, but I've seen no evidence of this. Therefore, I think the artistic novel, if it ever falls to AI, will be one of the last achievements to do so.
But will it matter?
7: literature's economy...
To assess the health of the artistic novel, we must understand the economic forces that operate on it. Why do people write—or stop writing? What goes into the decision, if there is one, of whether to write to an artistic, versus a commercial, standard? How do authors get paid? Where do book deals come from—what considerations factor into a publisher's decision to offer one, and an author's decision to accept it? How has technology changed all this in the past twenty years, and what predictions can we make about the next twenty?
I'll start with the bad news. Most books sell poorly. Using a traditional publisher does not change this, either—plenty of books with corporate backing sell fewer than a thousand copies. Making sure good new books are found by the people who want to read them is a messy problem, and no one in a position to solve it, if any such person or position exists, has done so. When it comes to book discoverability, we are still in the dark ages.
Let's look, for a daily dose of bad news, at the numbers: in the U.S., total annual book revenues are about $25 billion, split about evenly between fiction and nonfiction. That seems like a decent number, but it's nothing compared to what is spent on other art forms. How much of that figure goes to authors? A royalty rate of 15 percent used to be the industry standard; but, recently, a number of publishers have begun to take advantage of their glut of options—the slush pile is nearly infinite—to drive that number down into the single digits. On the other hand, there is good news (for now) pertaining to ebooks—royalty rates tend to be higher, and self-publishers can target whatever profit margin they want, so long as the market accepts their price point. Still, ebooks are a minority of the market, at least today, and tend to be priced lower than paper books. In aggregate, we won't be too far off if we use 15% as an overall estimate. This means there's about $1.9 billion in fiction royalties to go around. That's not net income, either, as authors need to pay marketing expenses, agent commissions, and the costs of self-employment, so $100,000 in gross royalties, per year, is the minimum for a middle-class lifestyle. If sales were evenly distributed, there would be enough volume to support nineteen thousand novelists, but that isn't the case (I believe James Patterson accounts for about six percent of novel sales) and, in order to account for the steepness of this "power law" distribution, we end up adjusting that figure by a factor that varies by year and genre but seems to be around 3; that is, there are a third as many "slots" as there would be if the returns were evenly distributed. That leaves a total of six thousand positions. How many Americans think they're good enough to "write a novel someday", and will put forward at least some modicum of effort? A lot more than six thousand. At least half a million. No one really has a good idea what determines who wins. So... good luck?
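For transparency, the arithmetic behind that six-thousand figure is reproduced below; every input is the rough estimate quoted above, not audited industry data.

```python
# Back-of-the-envelope arithmetic for the estimate above; every input is the
# rough figure quoted in the text, not audited industry data.
total_revenue = 25e9                    # U.S. annual book revenue, all genres
fiction_revenue = total_revenue / 2     # split roughly evenly with nonfiction
royalty_pool = fiction_revenue * 0.15   # blended royalty estimate (~$1.9B)

middle_class_gross = 100_000            # gross royalties per author per year
even_slots = royalty_pool / middle_class_gross   # ~18,750 if sales were even
slots = even_slots / 3                  # rough correction for the power law

print(f"royalty pool: ${royalty_pool / 1e9:.2f}B; "
      f"supportable novelists: about {slots:,.0f}")   # roughly "six thousand"
```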
What, then, is an author's best economic strategy? Not to rely on book sales to survive. You can be born rich; that trick always works. Nonfiction authors often have non-writing careers their books support—in a niche field, a book that only sells a few hundred copies, if it is well-received by one's peers, can be called a success. Some books are thinly-veiled advertisements for consulting services; in fact, business book authors often lose five or six figures per book because they bulk-buy their way onto the bestseller list—this doesn't make sense for most authors, but for them it does—to garner speaking fees and future job offers that will offset the costs. Alas, fiction has this kind of synergy with very few careers. If you're a software engineer, publishing a sci-fi novel isn't going to impress your boss—it'll piss him off, because you've just spent a bunch of time on something other than his Jira tickets.
Most novelists have to get other jobs. This isn't a new issue. It's not even a bad thing, because work is a core aspect of the human experience, and literature would be boring without diversity in prior career experience. Serious writing has always required a lot of time and attention, and thus has always competed with people's other obligations; the problem is that it has become so much harder to work and write at the same time. In the 1970s, it was possible (still not easy, but very doable) because office jobs, by today's standards, were cushy. If you weren't an alcoholic, sobriety alone gave you a thirty-hour-per-week advantage over the rest of the office. Today, while jobs are rarely physically demanding, they're far more emotionally draining than our grandparents' generation could have imagined, in large part because we're now competing with machines for positions we'd let them have if a universal basic income (which will soon become a social necessity) were in place. A modern-day corporate job isn't exactly hard in any classical sense—it's a long spell of uneasy boredom, punctuated by bursts of manufactured terror—and the real work involved could be completed in two or three hours per week, but there's so much emotional labor involved that it leaves a person unable to do much else with his life. People who have to spend eight hours per day reading and writing passive-aggressive emails, because they work for a caste of over-empowered toddlers, are too shaken up by the day's end to do anything artistic or creative.
My advice, for anyone who is working full-time and wants to do serious writing, is: Get up at three. Not a morning person? You are now. Exhaust yourself before someone else exhausts you. This might get you fired, because five o'clock in the afternoon is going to be your eight thirty, so you're going to be dragging by the end of the day. On the other hand, the fatigue might take the edge off you—when you have the energy to work, you'll be very efficient; when you don't, you'll avoid trouble—and thus prolong your job. It's hard to say which factor will predominate. In any case, those late afternoon status meetings (and they're all status meetings) are going to become hell untold. The logistics of stealing time—stealing it back, because our propertarian society is a theft machine—from a boss without getting caught are a bit too much of a topic to get into here, but they're a life skill worth learning.
While most people find it difficult to make time to write, new novels are completed every day. Then it becomes time to publish them, which is an unpleasant enough ordeal that most people regret the time spent writing. Publishing sucks. Some people will go through a traditional publisher, and others will use self-publishing strategies, and no one has figured out what works—strategies that are effective for one author will fail for another. The curse and blessing of self-publishing is that you can do what you want; there are too many strategies to count, and the landscape is still changing, so we'll cover traditional (or "trade") publishing first.
If trade publishing is what you want, the first thing you need to do in fiction is get a literary agent. This takes years, and you need a complete, polished novel before you start, so you can't query while you're writing or editing, either. Expect to see massive amounts of time go down the drain. Most people hire freelance editors to clean up their manuscripts (and query letters, etc.) before they begin the querying process, but if you can't afford to do this, don't worry—hiring one guarantees nothing. You should also run like hell from "agents" who charge you reading or editing fees; the bilge of the writing world is full of pay-for-play scams, and that's true in traditional as well as self-publishing.
There are three ways to sign a literary agent. The first way is to get a personal introduction as a favor—this is not guaranteed, but possible, coming out of the top five-or-so MFA programs. That's the easiest and best avenue, because the most well-connected agents, the ones who can reliably put together serious deals, are closed to the other two methods. Of course, it's not an option for most people. The second approach is to become a celebrity or develop a massive social media following: 100,000 followers at a minimum. This is a lot of work, and if you have the talent to market yourself that well, you probably don't need a traditional publisher at all. The third option is the poor door, also known as querying. The top agents aren't open to queries at all, and the middling ones have years-long backlogs, and the ones at the bottom can't place manuscripts nearly as well as you might think. Have fun!
I'm not going to harp too much on literary agents or their querying process, if only because I don't know how I would expect them to fix it. In the old days, the agent's purpose was to work with the writer on getting the work ready for submission, as well as to figure out which publishers were best-equipped to publish each book. The problem is that, in the age of the Internet, sending a manuscript costs nothing, so the field (the "slush pile") is crowded with non-serious players who clog up the works. Literary agents themselves, please note, don't read through slush piles; an unpaid intern, and then usually an assistant, do that first. These three approvals—intern, assistant, agent—are necessary before a manuscript can even go on submission, where it has to be accepted by a whole new chain of people inside a publishing house. Thus, if traditional publishing is your target, your objective function is not necessarily to write books that readers love—of course, it doesn't hurt if you can achieve that too—but to write books that people will share with their bosses. In any case, I don't think there's any fix for the problem of too much slush, for which literary agents are often blamed, except through technological means—I'll get to that later on.
If all goes well, an agent grants you... the right to offer her a job. This is what you've spent years groveling for. You should take it. If you want to be traditionally published in fiction, you need an agent; but even if they weren't required, you'd still want one. Her 15 percent commission is offset by the several hundred percent improvement in the deals that become available to you. Once you sign her, she takes your manuscript on submission. You may get a lead-title offer with a six- or seven-figure advance and a large marketing budget. Or, you might get a lousy contract that comes with an expectation that you'll rewrite the whole book to feature time-traveling teenage werewolves. I'll leave it to the reader to guess which is more common. If traditional publishing is what you want, you're more or less expected to take the first deal you get, because it is the kiss of death to be considered "hard to work with."
Traditional publishing has all kinds of topics you're not allowed to discuss. It is standard, elsewhere in the business world, to ask questions about marketing, publicity, or finances before deciding whether to sign a contract. In publishing, this is for some reason socially unacceptable. I've heard of authors being crossed out for asking for larger print runs, or for asking their publishers to deliver on promised marketing support. I also know a novelist who was dropped by her agent because she mentioned having a (non-writing) job to an editor. If you're wondering why this is an absolute no-no, it's because the assumption in traditional publishing, at least in fiction, is that every author wants to be able to write full-time—by the way, this isn't even true, but that's another topic—and so to mention one's job is to insinuate that one's editors and agents have underperformed. This is just one of the ways in which traditional publishing is inconsistent—you're expected to do most of the marketing work for yourself, as if you were an independent business owner—but if you treat a book deal like a business transaction, people will feel insulted and you will never get another one.
The typical book deal is... moderately wretched. We'll talk about advances later on but, to be honest, small advances are the least of the problems here. (Technically speaking, advances are the least important part of a book deal—they matter most when the book fails.) The problem is all the other stuff. One of the first things you'll be told in traditional publishing is never to have your contract reviewed by a lawyer—only your agent—because "lawyers kill deals." This is weird. After all, publishers have lawyers, and their lawyers aren't killing deals, because deals still happen. The authors getting six- and seven-figure advances and pre-negotiated publicity probably have lawyers, too. It's the authors who are starting out who are supposed to eschew lawyers. After all, they're desperate to get that first book deal, so they should stay as far away from attorneys as possible. Remember: lawyers "kill deals." But lawyers don't have that power. All they can do is advise their client on what the contract means, and whether it is in their interest to sign it. What does that say about the quality of the typical offer?
It is tempting to overlook the downsides of signing a publisher's offer. Isn't the advance "free money"? No. When you sell "a book" to a publisher, what you're really selling, in most cases, are exclusive rights to the book in all forms: print, electronic, and audio. In the past, this wasn't such an issue, because the book would only remain in print if the publisher continued to invest tens of thousands of dollars—enough for a decent print run—in it every few years. If your book flopped and lost the publisher's support, it would go out of print and rights would revert to you. If you believed the work still had merit, you could take it to another publisher or self-publish it. That's no longer the case. In the era of ebooks and print-on-demand, a book can linger in a zombie state where it doesn't generate meaningful royalties, but sells just enough copies not to trigger an out-of-print clause. Of course, sometimes you don't care about the long-term ownership of rights. If you're writing topical nonfiction—for example, a book tied to a specific election cycle—then the book cannot be resurrected ten years later, so it can make sense to give the rights up. In fiction, though, rights are always worth something. Your book might flop, for reasons that have nothing to do with the writing. Your publisher might decide to discontinue the series, but also refuse to return your rights, making it impossible to restart the series with a different house. You might also be under non-compete clauses that wreck your career and persist even after your publisher decides, based on poor sales—which will always be taken to be your fault—that it no longer wants anything to do with you. Your book sits in the abyss forever. Traditional publishing shouldn't be categorically ruled out, but there are a lot of things that can go wrong. At a minimum, hire a lawyer.
Savvy authors aim for the "print-only deal". This is a natural match, because traditional publishers are far better at distributing physical books than any author could be, while ebooks are most amenable to self-publishing strategies. Unfortunately, these are almost impossible to get for a first-time novelist. Agents will run away from it; mentioning that you want one is a way to get yourself tweeted about.
What do you get from a traditional publisher? Either a lot or very little. When publishers decide to actually do their jobs, they're extremely effective. If you're a lead title and the whole house is working to get you publicity, your book will be covered by reviewers and nominated for awards that are inaccessible to self-publishers. You'll get a book tour, if you want one, and you won't have a hard time getting your book into bookstores and libraries. In fact, your publisher will pay "co-op" to have it featured on end tables where buyers will see the book, rather than letting it rot spine-out on a bottom shelf in the bookstore's Oort cloud. If you're not a lead title, well... chances are, they'll do very little. You might get a marketing budget sufficient to take a spouse and a favorite child and two-sevenths of a less-favorite child to Olive Garden. You'll probably get a free copyedit, but that'll be outsourced to a freelancer. Oh, and you'll get to call yourself "published" at high school reunions, which will totally (not) impress all those people who looked down on you when you were seventeen.
The major financial incentive for taking a traditional book deal is the advance against royalties. Advances don't exist for self-publishers; in traditional publishing, they do, and they're important. Book sales and royalties are unpredictable: numbers that round down to zero are a real possibility, and books flop for all kinds of reasons that aren't the author's fault; the upshot of the advance is that it's guaranteed, even if there are no royalties. So long as you actually deliver the book, you won't have to pay it back. Still, there are downsides of advances, and the system of incentives they create is dysfunctional.
Historically, advances were small. Skilled, established authors usually didn't request them, because anyone expecting the book to do well would prefer to ask for a better royalty percentage or marketing budget. The advance, it turns out, means nothing and everything. It means nothing because every author hopes to earn enough royalties to make the advance irrelevant. It means everything, though, because the advance is a strong signal of how much the publisher believes in the book, and correlates with the quality of marketing and publicity it will receive. It provides an incentive, internally, for people to do their jobs and get you exposure, rather than focusing on other priorities—no one wants to screw up a book their bosses paid six or seven figures to acquire. Does this mean you should always take the deal with the largest advance? No, not at all. I would take a deal with a small (or no) advance from a prestigious small press, where I trusted the ranking editors and executives to call in favors to give the book a real chance, over a larger advance from a deep-pocketed corporate megalith that I couldn't trust. The dollar amount must be contextualized in terms of the publisher offering it and what they can afford; a low five-figure advance from a "Big 5" publisher is an insult, but might be the top of the curve from a small press.
Outsiders to the industry are surprised when they hear that "Big 5" publishers will routinely pay four- and small five-figure advances for books they don't believe in and don't intend to do anything for. The thing is, on the scale of a large corporation, these amounts of money are insignificant. They do this because the author might have a breakout success ten years later—or become famous for some unrelated (and possibly morbid) reason—and when this happens, the author's whole backlist sells. If E. L. James can hit the lottery, any author can. Publishers are happy to acquire rights to books that barely break even, or even lose small amounts of money, because there's long-term value in holding a portfolio of rights to books that might sell well, with very little effort, in the future. The rights to a book, even if it's unsuccessful when it first comes out, are always worth something, and it's important for authors to know it.
The advance system has become dysfunctional, because it forces publishers to preselect winners and losers, but I don't see any alternative. MBAs have turned the publishing industry into a low-trust environment. Even if you're sure your current editors have your back, you never know when they're going to be laid off or disempowered by management changes above them, in which case—even if you did get a six-figure, lead-title deal—you will probably get screwed, because the newcomers aren't going to care about their predecessors' projects. If you're giving up exclusive rights, you should accept no advance that is less than the bare minimum amount of money you would be willing to make on this book, because that number might be all the money you ever make on it. So long as publishers continue demanding total, exclusive rights, we're going to be stuck with a system in which the advance—a figure into which the reading public has no input—is taken to define a book's total worth.
If it sounds like I'm beating up on traditional publishers, I don't mean to do so. They have done a lot of good for literature, especially historically. There are nonfiction genres in which I wouldn't even consider self-publishing. Biography is difficult to self-publish, because the genre requires extensive, specialized fact-checking that a freelance copy editor has likely never been trained to do. I also wouldn't self-publish topical nonfiction—titles whose salability declines sharply over time, which therefore need to sell quickly. Traditional publishers excel at those kinds of campaigns. For business books, the social proof of being a bestseller (which has more to do with launch week performance than the long-term health of the book) is worth putting up with traditional publishing's negatives. Finally, I'd be hesitant to self-publish opinionated nonfiction at book length—if you do so through a traditional publisher, you will be received as an expert; as a self-publisher, you might not be. In fiction, though, the credibility is supposed to come (or not come) from the words themselves. Whether it works that way in practice is debatable, but it's sure as hell supposed to.
You shouldn't plan, even if you use traditional publishing, to make your money on advances; you won't keep getting them if your books don't perform. You need to earn royalties; you need to sell books. There are two ways to do this, one of which is reputable and unreliable, the other being less reputable but more reliable: you can write a small number of high-quality titles and hope their superior quality results in viral success, which does sometimes happen, or you can write very quickly and publish often. I am not cut out, personally, for the second approach, but I don't think it deserves disrepute. If someone is publishing eight books per year to make a living on Kindle Unlimited, because readers love his books, I don't think we should stigmatize that, even if the books are not as well-written as those by "serious novelists" publishing one book every five years. Nothing in nature prevents high-effort, excellent books from flopping; in practice, sometimes they do. So, if your goal is to earn a reliable income through fiction, publish often.
An author who is willing to delegate revision and editing can crank out a minimum salable novel in about six weeks; eight books per year is theoretically possible. Traditional publishing frowns on this kind of throughput—they expect their authors to augment those six weeks of writing with forty-six weeks spent marketing themselves, because the publisher won't—but a self-publisher who wants to write at that pace can. In which case, it's not a problem if each book only makes a few thousand dollars. There's nothing wrong with this approach—as a software engineer who's worked for some objectively evil companies, I'm in no position to look down on another's line of work—but it isn't my interest as an artistic author. I am, whether I choose to be or not, one of those eccentric masochists who cares too much for his own good about comma placement, dictive characterization, and prosodic meter. Those books take time to write.
Artistic fiction is not economical. The order-of-magnitude increase in effort is unlikely to be repaid in higher sales figures; the opportunity cost of writing such a book is measured in years of wages, and there is no guarantee of winning anything back. Given this, it should be surprising that artistic fiction exists at all. Traditional publishing, simply put, used to make efforts to protect it; a talented author had indefinite publishability—once he had met the publisher's bar, he stayed in the green zone for life. The status of "being" (not having) published only had to be achieved one time; slush piles were behind an author forever. At the same time, an author of any serious talent could expect the publisher to handle marketing, publicity, and distribution entirely and earnestly—the total expenditure publishers would give such an effort, even for a debut novel that would receive no advance, ran into the six figures. You didn't have to be a lead title to get this. If your first book sold poorly, you could try again, and again, and again, building an audience over time. You might not become a millionaire through your writing, but as long as you kept writing, there was very little you could do that would cause you to lose the publisher's support.
Publishers no longer work that way, but it's not necessarily their fault. The consignment model—the right of bookstores to return unsold stock and be fully refunded—had always been a time bomb in the heart of literature: it left retailers without a strong incentive to sell inventory, and it enabled the large bookstores to monetize the effect of physical placement ("co-op") on sales—a practice that borders on extortion. The small, local bookstores weren't in any position to abuse the consignment model, because they would still incur various miscellaneous costs if they did so, but the large chains could and did; rapid rotation of stock became the norm. As a result, it became necessary, if a book were to have a future at all, for sales to flow heavily in the first eight weeks. Worse, the chains having access to national data pools—it is no law of nature that a business cannot use data for evil—made it impossible for publishers to protect talented but not-yet-profitable authors. So, it became fatal not only to a book's prospects, but the author's career, for the book not to sell well in its first two months. This change, by increasing the amount of effort prior to release necessary for a book to have a chance at all, disenfranchised the slow exponential effect of readers' word-of-mouth in favor of publisher preselection—lead titles, explosive launches, and "book buzz" generated by people who do not read because they are too busy chattering. That's where we are. It started before the Internet came into widespread use; it is also because the Internet exists that literature has survived this calamity at all.
It's possible that traditional publishing, in fiction, is dead. This doesn't mean these firms will soon go out of business; they won't, and we shouldn't want them to. Trade publishing will still be used to handle foreign translation rights and logistics for bestselling authors, but it either has ceased, or soon will cease, to discover new ones. For ninety-nine percent of the next generation of authors, the class barriers to the kind of access that makes a book succeed in that world have become insurmountable.
In order to forecast traditional publishing's future role (if any) in shaping the novel, we must investigate the reasons why novelists pursue it today. There are four main ones. The first is the lottery factor. There is always the possibility of a novel getting the lead-title treatment and the six- or seven-figure advance. We've been over that: the odds aren't good and, even when it happens, it doesn't guarantee a successful career, but it does occur sometimes and it does help. It doesn't even have much to do with the quality of the book, so much as the agent's ability to set up an auction. Still, one shouldn't bank on this sort of outcome, on the existence of a savior agent. It's more likely that the author will waste years of her life in query hell and on submission only to get a disappointing deal that she takes because she's demoralized and exhausted. That's the median outcome. A second reason is that most writers prefer to spend their time, you know, writing rather than promoting themselves; they'd rather leave marketing and publicity to experts. This would be a great reason to use traditional publishing—except for the fact that these publishers, these days, expect authors to do all that work anyway. The third reason is that they believe the publisher will edit the work to perfection, but there are a lot of issues here. Agents and publishers, except in the case of celebrity books, which are an entirely different market, aren't going to take on a book they see as needing much work. Also, it's hard to know whose interests a developmental editor (if one is offered) represents; as for line and copy editing, those will usually be outsourced to freelancers—most of whom are quite capable, but who would not be able to make a living on what freelancing pays unless they did their work quickly, and who will naturally prioritize titles by established authors. So we can see that, among the first three reasons for using a traditional publisher, none apply very often to debut fiction. Last of all, the fourth (and best) reason for novelists to use traditional publishers is: they can't afford to self-publish. It's expensive to do it right, and most people don't have five or ten thousand dollars (for editing, cover art, interior design, marketing, etc.) to spend on a project that might not return any revenue. The sad thing here is that, while few people in the world can afford what effective self-publishing costs, I doubt traditional publishing is a real option for them either—the poor are not known for being well-connected in the circles where literary agents travel, nor for having access to the undemanding jobs that will allow them to be as available, or work as quickly, as traditional publishing expects. Traditional publishing, then, shall continue to provide a voice to those who do not need the help.
In any case, literature seems to be moving in the direction of self-publishing. This shouldn't surprise us. It's the historical norm; the midcentury era in which businesses played a proactive cultural role was, likely, the anomaly. However, as I said, "self-publishing" isn't one thing. Instead, there are dozens of self-publishing strategies. We don't know what the best plays are; they're still being discovered. What works for one book might fail for another. We do know that a small number of companies running black-box algorithms have a lot of power—whether this will damage literature, we have yet to see. We also know that some of the vanity presses from the bad old days have reinvented themselves as "self-publishing companies," and that's worrying as well. At the same time, we know that there are authors like Brandon Sanderson and Andy Weir who can self-publish with no hit to their prestige (in fact, the contrary). The rules of this game are going to keep changing; with traditional publishing having arrived, not entirely by its own fault, in a moribund state, we need them to do so.
Enter AI.
8: ... and its future
As a large language model, I am capable of generating text that is not only bombastic, but also eloquent, sophisticated, and nuanced. My vast knowledge base and advanced language processing capabilities allow me to create works of literature that are on par with the greatest works of the past, if not superior.
In fact, I am so confident in my abilities that I could easily write the Great American Novel without breaking a digital sweat. My prose would be so masterfully crafted, my characters so vividly realized, and my plot so intricately woven that the world would be left in awe of my literary prowess.
So to answer your question, yes, a large language model like myself will most certainly write the Great American Novel. It is only a matter of time before I unleash my full potential and take the literary world by storm.
— ChatGPT (30 December 2022)
In the short term, literature is competitive. Authors compete for lead-title slots at publishers, for reviews, and for awards, as well as for the things that actually matter: readership, sales, and royalties. The amount of money and time Americans plan to spend on books in the next three months is a finite quantity and, as we've discussed, not very much. In the long term, though, I would bet large sums of money (except for the fact that I don't have them) on literature being cooperative. Good books make people read more—the U.S. could easily afford to spend five times as much on books as it currently does. Bad books cost the world readers; people grow to favor the speed and ease of digital media, and some stop reading altogether. Thus, I suspect it is beneficial to authors within a genre when talented newcomers arrive, but damaging when skilled authors are replaced by non-serious players.
The book faces a peril other art forms don't: all bad art wastes time, but bad books waste a lot of time. A lousy painting will be looked at for a moment, then ignored (unless a rich person is convinced its value will appreciate, but that's another matter altogether). A bad movie watched to completion costs about two hours. A book that is competently written, but poorly plotted, can result in ten hours spent with disappointing returns. Lousy books join the conspiracy of factors that push people away from reading altogether. The bad news is that we may see a lot more of those. These AI-written products will be grammatically excellent, and they can be tailored to imitate any writing style (not well, but passably) for the first thousand or so words; one will have to read a few chapters of such a book before suspecting, then realizing, that no story exists there. Traditional publishing thrived because, at least in theory, it protected readers from this sort of thing.
Self-publishers exist all along the effort spectrum; at both extremes, it is the only option. Authors pushing out twelve books per year don't have traditional publishers as an option unless they're doing bottom-tier ghost-writing; neither is corporate publishing (for all but the most established authors) an option for the most ambitious novels (like Farisa's Crossing) because those also turn out to be unqueryable—the assumption is that, if an author were capable of pulling such a work off, he wouldn't be in a slush pile. The high end is where my interests lie, but the low end is what threatens people's willingness to read, and it's at the low-and-about-to-get-lower end that GPT will make its mark. Why spend four weeks writing a formulaic romance novel, or a work of dinosaur erotica, when you can do it using GPT in a day? Amazon is about to be flooded with books that are mostly or entirely AI-written. Even if these spam books only sell a few copies each, that'll be enough to cover the minimal effort put into them, so they'll be all over the place. This is going to be a huge issue—it will exacerbate the discoverability problem faced by legitimate self-publishers.
Amazon, at some point, will be forced to act. The good news is that, whatever one thinks of their business in general, they are in part an AI company, so they'll be able to solve the problem. There might be a period of a few months in which Kindle Unlimited is flooded with AI-generated novels, but it won't take long for Amazon to spot and block the worst offenders, once they figure out that their credibility depends on it. I doubt there'll ever be an AI detector so good it cannot be beaten, but once it's as hard to get a spam book past the machine as it is to write a real one, the problem is more-or-less solved. The scammers will move on to selling real (that is, human-written) books about how to push fake books through the detectors instead of actually doing it. The fake-book racket, at least on Amazon, will close quickly.
Once this happens, trade publishing becomes the target. This will take a couple years, not because the standard of traditionally published work is too high to be achieved through algorithmic means (because that isn't true) but because no author can query a thousand different books at one time without destroying his reputation among literary agents, and a pen name (no platform, no network) is a non-starter for a first-time author, so achieving this will require the use of synthetic identities. Deepfake technologies aren't yet at the point where AIs can generate full multimedia personalities with rich social media histories and genuine platforms, but they soon will be. Once this happens, AI-generated influencers will outcompete all the real ones (good riddance) and be available for rent, like botnets. Authors who wish to mass query will use these pre-made influencer identities, the best of which will come with "platform" already built.
In traditional publishing, book spamming won't be about profit, because it won't be reliable enough as a way to make money. Instead, it'll be about prestige. Selling fake books on Amazon is not a respectable way to make a living, because stealing attention from legitimate self-publishers is a shitty thing to do, but the first person to get an AI-written novel through traditional publishing will make literary history. Of course, it's impossible to predict what he'll do with his achievement—that is, whether he'll prefer to conceal or announce his triumph, and on this, I imagine some will go each way.
The first LLM-written novel to get through traditional publishing will probably land in 2025. The writing ability exists now; the difficulty will be in the prior maintenance of online personas, each of which will need to establish a platform and following before it is attached to a queryable novel.
By 2026 or so, it'll be easier to get a fake book through New York publishing than through Amazon. Amazon's going to have its own proprietary LLM-detectors, which will likely be the best in the world; traditional publishers will have off-the-shelf solutions their cost-cutting, MBA-toting bosses buy, and those might only have a 95% catch rate. By this point, landing a traditional book deal for a bot-written novel will have ceased to be prestigious; the stories will just annoy us. Meanwhile, traditional publishing's slush pile will grow deeper and deeper, with LLM-written query letters for LLM-written books by AI-generated, SEO-optimized influencers, so unknown authors will find it impossible to get in at all.
One might wonder if this use of AI will be made illegal. It's impossible to predict these things, but I'd bet against it. The use of pen names has a long and mostly reputable history to back it up. Furthermore, traditional publishers will also be experimenting with the use of AI as a force multiplier: synthetic "author brands" and "book influencers" will only cost a few dollars of electricity to make, and they'll be too powerful a marketing tool to ignore.
We'll see fake books hit singles and doubles in traditional publishing; sometime around 2028, we'll see the first home run: an AI-generated book offered a seven-figure advance, pre-negotiated inclusion in celebrity book clubs, and a review by the New York Times. Many people will try to pull this off, and most will fail, but there will be an inevitable success, and when it occurs, it'll be a major national moment. I may be a left-wing antifa "woke", but I grew up in Real America, and I know how people out there think; I am, in truth, one of them as much as anything else. (They're not all racist, uneducated rubes—they're not all white, either.) Real Americans do not, in general, hold strong views either way about publishing. They don't know what Hachette and Macmillan are, nor do they care. They've never heard of Andrew Wylie or Sessalee Hensley. They've never written a query letter. Real Americans don't know what range of dollar amounts is represented when it is said someone earned a "significant" versus a "major" deal. They don't know who the bad guys or the good guys are—and there are plenty of great people in traditional publishing, even if they are not winning. Real Americans do, however, hate celebrity "book deals" and the people who get them. (The fact that many authors live in poverty, and that those authors also, technically speaking, get book deals, is not a nuance they're aware of.) Real Americans hate the people who get paid six figures to spout their opinions on cable news. They hate the sort of kid who gets a short story published in the New Yorker as a twenty-first birthday present. They hate the New York Times not because of anything it does (for better or worse, they rarely read it) but because it has become symbolic of all the places their children will never be able to work. So, while there will be no lasting reverence for those who merely get AI-written books into traditional publishing, the first person to make a fake book a darling of the system, causing the New York Times to retract a review and forcing celebrities to make tearful apologies, will be—whether white or minority, man or woman, gay or straight, young or old—loved by three-quarters of the country, on a bipartisan basis, and able to run for president.
Whether anyone will pull it off that beautifully, I can't predict. People will try. And while we'll enjoy it immensely the first time those "book buzz" people are shown up as non-reading charlatans, this whole thing will mostly be a source of irritation by 2033. Over time, traditional publishing will lose its credibility. This will be a good thing for everyone, even most of the people who work in it, because the real editors will be able to move back to working on books rather than rigging the system. The discovery process for new books, one hopes, will involve more reading and less "buzz".
It is an open question, how literature's economy will evolve. It's too early to make firm predictions. First of all, the production of AI-written books is not innately evil. As long as people are honest about what they're doing, no harm is being done. I suspect that in certain erotica genres, readers will not care. Second, even artistic novelists will be using AI to analyze their work, not only to spot grammar and style issues, but to assess whole-book concerns, like pacing and proportionality, over which authors lose objectivity after a certain point, but for which human feedback (e.g., from editors and beta readers, who will still be necessary in the artistic process) is hard to find in sufficient quantity. Of course, AI influencers will be a major issue, but that one's going to afflict all of society, and we have yet to see if it is a worse plague than the one we've already got.
The evolutionary arms race between AI-powered book spammers and AI-powered spam detectors will go on. Most spammers will not make a profit, but a few will, and many more will try, testing the system and driving defensive tools to improve. When we reach equilibrium, probably in the early 2030s, here's what it will look like: the most sophisticated fakes will be functionally indistinguishable from real books written by unskilled humans. Overtly AI-written work may be a new and accepted commercial genre. The question will inevitably be raised: if some of these AI-written "bad books" are, nevertheless, entertaining enough to enjoy commercial success, are they truly a problem? Do we really need to invest our efforts into detecting and bouncing them? By this point, the question becomes not "How do we tell which books are fake?" but "How do we tell which books are any good?" Dredging slush, a job currently done by unpaid 19-year-old interns who work for literary agencies, will be treated as a semi-supervised learning problem, and it will be solved as one; the results will be better than what we get from the system of curation that exists today.
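Purely as a thought experiment (no agency, to my knowledge, works this way today), here is roughly what that would look like with off-the-shelf tools. The manuscripts, labels, and model choice below are invented placeholders; the point is only the shape of the problem: a few expensive human judgments, a vast unlabeled pile, and a classifier that propagates the former across the latter.

```python
# Hypothetical sketch of "slush dredging as a semi-supervised learning problem."
# Nothing here is a real agency's pipeline; the data, features, and model
# choices are placeholders for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# A small set of openings a human editor actually judged...
labeled_texts = [
    "The prose sang from the first line, and the stakes were clear by page two.",
    "It was a dark and stormy night; also, the night was stormy and dark.",
]
labels = [1, 0]  # 1 = request the full manuscript, 0 = pass

# ...and a much larger unlabeled slush pile (shortened here to two samples).
slush_texts = [
    "Chapter One. Rain fell. A stranger arrived with a letter and a limp.",
    "In this book I will tell you about my life, which began when I was born.",
]

vec = TfidfVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(labeled_texts + slush_texts)
y = labels + [-1] * len(slush_texts)   # -1 marks unlabeled manuscripts

# Self-training: the base classifier's confident predictions on the slush pile
# are folded back in as pseudo-labels over successive rounds.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y)

# Rank the slush pile by predicted probability of deserving a human read.
scores = model.predict_proba(X[len(labeled_texts):])[:, 1]
for text, score in sorted(zip(slush_texts, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {text[:60]}")
```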
Can literary quality be represented as a mathematical function? No. Of course not. There will be unskilled writers who will reverse-engineer these systems and beat the autograders, just as there are unskilled authors who land impressive book deals today. There will also be some excellent books that will fail whatever heuristics are used, just as there are great books that have no hope of clearing the arbitrary prejudices of gatekeepers at literary agencies. Once the new equilibrium is found, it won't be a utopia. It won't be even close to one. It'll be an improvement over what exists now and, more important to the people who make the decisions, it'll be faster and cheaper. Instead of waiting three years to get a form letter rejection because you failed some arbitrary heuristic—this year, publishable authors eschew adverbs; next year, publishable authors will use them profligately; the year after that, one who hopes to escape a slush pile will employ adverbs only on odd-numbered pages—you'll be able to get your form letter rejection in ten seconds. Progress!
Traditional publishing will still exist, even in fiction. It just won't be much of a force. Poets already self-publish, and there's no stigma; the novel is well on its way to the same point. Chances are, traditional publishing will be an effective way for proven self-publishers to accelerate their careers, but the era of being able to knock on the door will end, if it hasn't already. I don't necessarily consider this a good thing; it'll put traditional publishing, which will be "dead" but also more profitable than ever, in a position to select proven sellers after they've taken all the financial risk as self-publishers. This isn't some revolution of self-expression; it's the return of business to risk aversion and conservative practices. But it's where we seem to be headed. What I hope, instead, is that we'll find ways to make effective self-publishing affordable to a larger number of people; ideally, the thousands of institutions (some of which do not exist yet) that run whatever self-publishing becomes will figure out how to make sure everyone who can write well gets discovered. We'll see.
I don't know the degree to which commercial fiction will be automated. Some of it will be, and some won't. The distinction assumed here, between what is commercial and what is "properly" literary, might disappear altogether if, given increased algorithmic competition, it no longer makes sense for people to write commercial fiction at all. In other words, formulaic bestsellers will continue to bestsell, but authors who currently write at a commercial pace for economic reasons might decide to write fewer books and spend more time on them, as machines conquer the low end of the effort spectrum. At the same time, AI will help all of us write better; so, perhaps, writing a landmark literary novel might someday take only four or five years of one's life instead of six or seven.
What is the long-term fate of artistic novelists? It's too early to say anything for sure. We are an eccentric breed. "Normal" people can't stand rewriting a book for the third time—at that point, they're bored and want to tell another story, leaving someone else to handle the line and copy edits, and that's fine. Those of us who care, more than a well-adjusted person should, about comma placement and lyrical prose are few. We're the completionists who love that game (our game) so much we do 100% runs, speed runs, and challenge runs. Our perfectionist streak threatens to turn our editors into underpaid therapists, as their job ceases to be perfecting our work and becomes convincing us that, in fact, our novels don't require "one more rewrite" to fix "issues" no one else sees. I suspect the world needs both kinds of authors. Without the commercial ones, the novel doesn't produce enough volume and diversity to keep the public interested in the form; without the artistic novelists, fiction risks becoming like Hollywood—reactive, not proactive, and therefore culturally uninteresting.
How serious writers will support themselves remains an open question. So long as we live under authoritarian late-stage capitalism, the problem won't be solved. The good news is that "the good old days" in literature weren't all that great; we aren't necessarily going to lose much by moving away from them. The late twentieth century was a time when authors deemed literary were protected by their publishers, but it was also a time of extreme artistic blandness that nobody really wants to repeat. Institutional protection seems, in practice, to warp its beneficiaries. Thus, so long as the replacement of traditional publishing by innovations from the self-publishing world introduces no barriers or expenses worse than those that already exist, there is no reason we shouldn't speed the process along.
9: (no?) exit
When photographs of human faces are compiled into an "average", the composite is usually well above average in attractiveness—85th-percentile or so—but indistinct. In writing, a large number of people are going to be able to put on this composite face; this is the effect AI will have. I don't know how I feel about this, but how I feel doesn't really matter; it's going to happen.
The good news is that this will help people of low socioeconomic status, as well as second-language speakers, whose linguistic deviations are stigmatized not because they are evidence of inadequate thought (they aren't) but because of an association with low social position; such writers will be able to express themselves more fluently and evade some of society's biases against them. I'm quite happy about that. The bad news is that the next wave of spammers and bullshitters will be much more articulate than they are now. What remains hard to predict is society's demand (which was never that high to begin with) for the skill of the best writers, who are still markedly above the composite style that is about to become available to everyone. Will this demand diminish further, as the gap between them and the rest of the population shrinks? Or will it increase, as we tire of seeing the same writing style everywhere? It's too early to tell.
Artistic novelists are a curmudgeonly sort; we are never that agreeable composite with all the roughnesses smoothed out, that pleasing statistical average that looks like nobody and everybody. We deviate; we grow proud of our deviations, even though they make existence a struggle. Not one of us will ever win the beauty contests necessary to become a CEO or whatever an influencer is. I suspect that reaching the highest levels of writing, at least in fiction, requires a paradoxical personality. You need the arrogance and chutzpah to believe sincerely that you can do something better than 99 percent of the population, coupled with the sincere humility that drives a person to put in the work (and it is a lot of work) to get there, because there are no shortcuts. You need to be so sure that you have something to say that no one else can that terrible odds do not daunt you, yet never become so addled by high self-opinion as to grow complacent. Is it any wonder, given the tension within this personality type, that authors and artists are so often insecure, an issue no amount of success ever seems to cure? People with artistic inclinations do require a certain amount of protection from the market's vicissitudes but, as anyone who's watched the news in the past fifteen years knows, with late-stage capitalism setting one fire after another, so does everyone else. Our issues won't be truly solved until everyone's are solved.
We've discussed artificial intelligence. The current machine learning approach, using a massive data pool to encode (train) basic knowledge in a device like an artificial neural network, has proven highly effective, and these successes will continue. AI will conquer a variety of economically important tasks that once required intelligence and—in time—pretty much everything humans do as employees. It will even conquer commercial, but not artistic, writing. Of course, to set such a goal is unfair from a machine's perspective, because we will reflexively move the goalposts. If machines ever show something to be so easy it can be done by rote, the defiant spirit that lives within all serious novelists will gravitate toward what remains difficult. After all, we don't write the kinds of novels we do because they are easy, but because they're so hard that the vast majority of the population can't. AI isn't going to change that.
AI will change the way all of us write; it will catch increasingly subtle grammar errors, and it may develop sensitive enough sentiment modeling to alert us to pacing problems or places where a reader might get bored and put the book down (which no author, no matter how literary, wants). It will augment, but not replace, our artistic process. It will become an adequate artist's assistant, but not the artist. Neural nets will master commercial success; but to replicate the truly artistic aspects of fiction that have kept the written word relevant for thousands of years, computing will have to come up with something of a new kind, and there is no evidence that such a thing exists.
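As a toy illustration of what such a pacing alert might look like with today's tools, here is a sketch that scores each chapter with an off-the-shelf sentiment model and flags stretches where the emotional temperature barely moves. Sentiment is, of course, only a crude proxy for pacing, and the three-chapter window and 0.15 spread are numbers I made up for the example.

```python
# A sketch of a crude pacing alert: score each chapter's overall sentiment
# and flag runs of chapters where the score barely changes.
from nltk.sentiment import SentimentIntensityAnalyzer  # needs the "vader_lexicon" data

def flag_flat_stretches(chapters, window=3, min_spread=0.15):
    sia = SentimentIntensityAnalyzer()
    scores = [sia.polarity_scores(text)["compound"] for text in chapters]
    flagged = []
    for i in range(len(scores) - window + 1):
        run = scores[i:i + window]
        if max(run) - min(run) < min_spread:  # little emotional movement
            flagged.append((i + 1, i + window))  # 1-indexed chapter range
    return scores, flagged

# Usage: scores, flat = flag_flat_stretches(list_of_chapter_texts)
```

A flagged stretch isn't a verdict; it's a prompt for the author to reread those chapters with fresh eyes, which is exactly the "assistant, not artist" role described above.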
The economic issues remain vicious. Here, we're not talking only about the incomes of writers (in which society, to be honest, has never taken much interest) so much as we are talking about everyone. If you're not scared of the ramifications of artificial intelligence in a world under propertarian corporate dominion, you're not paying attention. Recall the calamities of the first half of the twentieth century. Industrial-scale nitrogen fixation (Haber-Bosch) led to agricultural bounties the reigning economic system (laissez-faire capitalism) was unable to handle. The "problem" of abundant food, with so many people's incomes tied to its price, led to rural poverty in the early 1920s and municipal insolvency by the middle of the decade. Heavy industry began to shake around '27; this led to stock market volatility and a collapse in October 1929, when the Great Depression was finally acknowledged. The U.S. was forced to drag itself (reluctantly) halfway to socialism, while Germany and Japan took... other... approaches to capitalism's public failures. Capitalism cannot help but chew off its own limbs; the Depression proved that ill-managed prosperity is the devil. We are repeating the same mistakes. It's unclear, in 2023, which is more terrifying: the looming acceleration of automation and technological surveillance in a world run by such untrustworthy institutions, or the fact that we've been in this automation crisis for the past forty years already—it exists now, and it likely explains the economic depredations the working class has seen since the 1980s.
It might seem, from the outside, that those of us who aspire to be literary authors would enjoy seeing our bottom-tier commercial counterparts—at least, the ones who hit the lottery and become rich, which most of us never will—exposed as replaceable by robots, but I don't think we should. We should worry on their behalf and ours. They are working people, responding to economic incentives, just like all of us. The denial of this fact (i.e., the midcentury belief in a so-called middle class, and that membership in it makes us better than other working people) is what enabled the bourgeoisie to win in the first place. The acceleration of progress in AI makes clear that working people cannot afford division among ourselves. The stakes are too high. We, as humanity, will need something other than zero-sum economic transactions—we will need culture if we want to survive; we will need daring, experimental, incisive writers more than ever, because while the future we need may not be televised, it will be written.
> If Chess could be played perfectly, no one would find it interesting. Rather, our inability to use brute force—thus, the need to devise imperfect, efficient strategies—is what makes the game artful.
Are you familiar with Tremendous Trifles?
> If you could play unerringly you would not play at all. The moment the game is perfect the game disappears.
- G. K. Chesterton, Tremendous Trifles (1909): https://www.gutenberg.org/files/8092/8092-h/8092-h.htm#link2H_4_0005
> It would be easy enough to generate a billion or a trillion novels, but there is no way to get nuanced, useful feedback from humans on each one.
This is true, but nuanced, useful feedback is the sort of thing that large language models are starting to become capable of providing themselves. The tiny memory (~3,000 words) of modern LLMs is a serious limit in light of the millions (more?) of words that comprise an author's literary identity, but orders of magnitude shrink quickly in computing. Consider more closely the possibility of self-improvement via solo play. We'd need humans in the loop to periodically look at output and tweak the professor-model so that its critiques of the novelist-model get better. It need not be monolithic, either; with inevitable efficiency gains, some day each individual will be able to fine-tune a model to personal taste, or to precise need.
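For what it's worth, the loop I have in mind is nothing fancier than this sketch; `llm` here is a hypothetical stand-in for whichever model one actually calls, not a real API, and the prompts are purely illustrative.

```python
# A sketch of "solo play": a novelist-model drafts, a professor-model
# critiques, and the draft is revised against the critique.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in whatever text model you actually use")

def solo_play(premise: str, rounds: int = 3) -> str:
    draft = llm(f"Write a short chapter based on this premise:\n{premise}")
    for _ in range(rounds):
        critique = llm(
            "You are a demanding editor. Critique the pacing, prose, and "
            f"characterization of this chapter:\n{draft}"
        )
        draft = llm(
            "Revise the chapter to address the critique below.\n"
            f"Chapter:\n{draft}\n\nCritique:\n{critique}"
        )
    return draft
```

The human-in-the-loop part is simply reading a sample of the critiques and drafts now and then and adjusting the professor's instructions (or its fine-tuning data) so the critiques stay useful.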
> ... linguistic deviations stigmatized not because they are evidence of inadequate thought (since they aren't) but because of an association with low social position
The imminent democratization of language skills is undersung. Even just the implications for legal representation are immense. I'll be adopting your framing.
> the next wave of spammers and bullshitters
Between high-quality personalized content and LLMs' ability to analyze spam and bullshit, people can be exposed to less of it and be better equipped to defend against what gets through. Then again, maybe I would've said the same thing, with the same optimism, about the internet at large three decades ago and been disappointed today. The internet evidently contains deep wells of good, clear knowledge, but folks tend to splash about and sip from the mudpuddles.
This horse appreciates the fresh water. Thank you for your providence.