Clive Thompson
Aug 16, 2021


"Log likelihood" is a way of figuring out whether a word is used more or less frequently in one big bag o' text than in another! The two bags of text you're comparing could be whatever you want. You could compare the occurrence of words in a Lovecraft short story to the occurrence of words in all the issues of the New York Times; you could compare the occurrence of words in a sonnet to all the words in War and Peace; etc. So it's not a metric of English words themselves, but a metric of how two pieces of text compare … if that makes any more sense? I may not be explaining it very clearly ...
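If it helps to see it in code, here is a rough sketch of one common way to compute this kind of score: the two-term log-likelihood (G2) statistic often used in corpus linguistics. That choice of formula is an assumption on my part, as are the function names and the naive whitespace tokenizing; other variants exist.

```python
import math


def log_likelihood_counts(a, b, c, d):
    """G2 score for one word.

    a, b: how often the word occurs in text A and text B
    c, d: total number of words in text A and text B
    (Assumed formulation: the two-term G2 statistic.)
    """
    if c + d == 0:
        return 0.0
    # Expected counts if the word were equally common in both texts.
    expected_a = c * (a + b) / (c + d)
    expected_b = d * (a + b) / (c + d)
    total = 0.0
    for observed, expected in ((a, expected_a), (b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            total += observed * math.log(observed / expected)
    return 2 * total


def log_likelihood(word, text_a, text_b):
    """Compare one word across two bags of text (naive whitespace split)."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return log_likelihood_counts(
        words_a.count(word.lower()),
        words_b.count(word.lower()),
        len(words_a),
        len(words_b),
    )
```

The score is zero when the word is equally common in both texts and grows as the usage diverges. It does not say which text uses the word more; for that, compare the relative frequencies (a / c versus b / d) directly.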

