ChatGPT Considered As A “Skill Leveler”

Large language models might reduce the spread between higher- and lower-performing office workers

Clive Thompson


Photo by Possessed Photography on Unsplash

Do large language models help people do better work?

Many employees in white-collar cubicle-land think so. In a recent Fishbowl study, 30% of officefolk said they used ChatGPT to help do their work. But does one actually see a real benefit from using LLMs on the job? If so, how much of a benefit?

Recently, a group of scholars decided to probe that question. So they went to the Boston Consulting Group and got 758 consultants to participate in an intriguing experiment. The researchers created 18 tasks the subjects would do for an imaginary shoe company …

There were creative tasks (“Propose at least 10 ideas for a new shoe targeting an underserved market or sport.”), analytical tasks (“Segment the footwear industry market based on users.”), writing and marketing tasks (“Draft a press release marketing copy for your product.”), and persuasiveness tasks (“Pen an inspirational memo to employees detailing why your product would outshine competitors.”).

The researchers checked these with an actual shoe company to confirm that they were realistic tasks a consultant might actually do. These tasks were also things ChatGPT seems reasonably good at. Spitballing new ideas and synthesizing broad trends doesn’t rely on precise factual accuracy, so these tasks were less likely to run afoul of ChatGPT’s clause-completion propensity to issue smooth-but-wrong bullshit.

Then the subjects were told to work on the assigned tasks. They were split into three groups: One that used ChatGPT and was given some training in how to issue prompts. One that used ChatGPT with no such training. And one that didn’t use ChatGPT at all. (You can read their original paper here, BTW, unpaywalled.)

The results? It turned out that the ones who used ChatGPT, both trained and untrained, worked faster and better than those who didn’t. The ChatGPT groups finished the work more quickly and completed more tasks, and independent judges rated their ideas as more creative.

As Ethan Mollick, one of the paper’s coauthors, puts it ...



Clive Thompson

I write 2X a week on tech, science, culture — and how those collide. Writer at NYT mag/Wired; author, “Coders”.