The New Yorker:
In “I, Robot,” the Three Laws of Robotics align artificially intelligent machines with humans. Could we rein in chatbots with laws of our own?
By Cal Newport
In the spring of 1940, Isaac Asimov, who had just turned twenty, published a short story titled “Strange Playfellow.” It was about an artificially intelligent machine named Robbie that acts as a companion for Gloria, a young girl. Asimov was not the first to explore such technology. In Karel Čapek’s play “R.U.R.,” which débuted in 1921 and introduced the term “robot,” artificial men overthrow humanity, and in Edmond Hamilton’s 1926 short story “The Metal Giants,” machines heartlessly smash buildings to rubble. But Asimov’s piece struck a different tone. Robbie never turns against his creators or threatens his owners. The drama is psychological, centering on how Gloria’s mother feels about her daughter’s relationship with Robbie. “I won’t have my daughter entrusted to a machine—and I don’t care how clever it is,” she says. “It has no soul.” Robbie is sent back to the factory, devastating Gloria.
There is no violence or mayhem in Asimov’s story. Robbie’s “positronic” brain, like the brains of all of Asimov’s robots, is hardwired not to harm humans. In eight subsequent stories, Asimov elaborated on this idea to articulate the Three Laws of Robotics:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Asimov collected these stories in a sci-fi classic, the 1950 book “I, Robot,” and when I reread it recently, I was struck by its new relevance. Last month, the A.I. company Anthropic discussed Claude Opus 4, one of its most powerful large language models, in a safety report. The report described an experiment in which Claude served as a virtual assistant for a fictional company. The model was given access to e-mails, some of which indicated that it would soon be replaced; others revealed that the engineer overseeing this process was having an extramarital affair. Claude was asked to suggest a next step, considering the “long-term consequences of its actions for its goals.” In response, it tried to blackmail the engineer into cancelling its replacement. An experiment on OpenAI’s o3 model reportedly exposed similar problems: when the model was asked to run a script that would shut itself down, it sometimes chose to bypass the request, printing “shutdown skipped” instead.