Melania Trump speech: The odds of a word match

What are the odds that the appearance of identical passages in two political speeches eight years apart is entirely coincidental?

In the case of Melania Trump's GOP convention keynote Monday night, it turns out they're pretty long: at least 1 in 4,835,703,278,458,516,698,824,704, according to one expert in text analysis.

The potential first lady's high-profile convention debut was the most talked-about event so far at the GOP gathering in Cleveland, but not for the reasons the Trump campaign had hoped. Though well-received by delegates, the speech has been marred by two passages that closely echo a comparable address by Michelle Obama in 2008.

The Trump campaign dismissed charges of plagiarism, saying the words were "not unique."

"In writing her beautiful speech, Melania's team of writers took notes on her life's inspirations, and in some instances included fragments that reflected her own thinking," Trump spokesman Jason Miller told The Associated Press.

To be sure, the matching passages aren't exact; a comparison of the two speeches shows that Melania Trump's version substituted or dropped a few words from the extended passages in question.


And political speeches are often filled with commonly used phrases, from "my fellow Americans" to "God bless the United States of America." So it's hardly surprising when they show up verbatim in multiple speeches.

Linguistic experts call these "lexical bundles" — strings of words that commonly appear in everyday speech. But once those strings reach a certain length, it's a red flag when comparing two texts for possible plagiarism.

"The passages (in Melania Trump's speech) are simply too long to have occurred by chance," said Robert Leonard, a professor of forensic linguistics at Hofstra University. "Sure, it's possible. But which is the better hypothesis — that they were copied or not?"

In fact, the odds against these extended passages matching entirely by coincidence are "astronomically high," according to professor Patrick Juola, an expert in text analysis at Duquesne University.

"You almost never see these stereotypical phrases longer than seven words," he said.

Beyond seven words, the odds become exceedingly small that two speakers will choose the identical string of words by coincidence.

Calculating those odds, said Juola, depends on a number of factors. But much like the odds against an unbroken run of identical coin flips, the odds against a chance match grow exponentially as the string of matching words gets longer.

A rough rule of thumb, he said, is to count the number of characters in the matching phrases — and then raise two to that power.

The longest section of exact duplication is 82 characters long.

By that rule of thumb, two raised to the 82nd power works out to a 1 in 4,835,703,278,458,516,698,824,704 chance that the similarity was coincidental.

That's about 1 in 5 septillion.
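
For readers who want to check the arithmetic, the rule of thumb can be sketched in a few lines of Python. This is only an illustration of the "two to the power of the character count" shortcut described above, not Juola's actual method, which he said depends on a number of factors. The function name chance_match_odds is made up for this example.

```python
def chance_match_odds(num_chars: int) -> int:
    """Return N, where the rule of thumb puts the odds of a chance match at 1 in N.

    Assumption for illustration: each matching character doubles the odds
    against coincidence, so N is simply 2 raised to the character count.
    """
    if num_chars < 0:
        raise ValueError("character count cannot be negative")
    return 2 ** num_chars


# The article's figure: the longest exact duplication is 82 characters long.
odds = chance_match_odds(82)
print(f"1 in {odds:,}")  # prints "1 in 4,835,703,278,458,516,698,824,704"
```

Because each added character doubles the figure, a match only 10 characters longer would push the odds against coincidence up by a factor of 1,024.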