Henry Livingston, Jr.

Background

Auckland University Professor Emeritus Mac Jackson spent more than two years conducting statistical analysis of the writing of Clement Moore and Henry Livingston to determine which poet was the more likely author of "The Night Before Christmas." His investigation gave the nod to Henry Livingston.

I worked with Mac, and with Don Foster before him, to provide data for Mac to analyze. My computer background is in computer languages, and I took early retirement from IBM Research in 1993, along with my husband, Paul Kosinski, who provided the programs that collected the data. I used to chair the computer group SIGPLAN under the ACM, and have been on the ACM's SIGBOARD, run two newsletters, and chaired and co-chaired five computer language conferences. Paul's PhD is in computer science from MIT. My friend Lyn Bates was president of the Association for Computational Linguistics. Her PhD is from Harvard.

Lyn suggested to Mac that he use the online pronouncing dictionary from Carnegie-Mellon, on which she had worked, to analyze the poetry at the level of phonemes. Mac knew this approach and had demurred only because of the amount of work it would lay on us. We were up to it and transcribed every word of every poem by both poets into phonemes - the sounds of the words. Phonemes track how the tongue and lips move while reciting the poetry aloud. This was one of the important new approaches Mac used for his authorship attribution analysis.

Mac's Data:
"And " versus "And,"

Moore starts a line with "And" 307 times out of 2873 lines. In 40 of those instances, the word is followed by a comma:

    "And,"

Livingston starts 212 out of 1892 lines with "And" and the word is never followed by a comma.

"And" starts 12 out of 56 lines in "Night Before Christmas" and is never followed by a comma.

Let's look at the data.

    Initial Ands for Visit and Moore
    Initial Ands for Visit and Henry
    Ands for All, including Plus Henry

The data linked above includes the text of all poems examined. Plus Henry is not included in the details. The set was a rough idea of poems that might be Henry's and was tested to see how they individually fit within the body of Henry's work. We are currently working on a methodology to test whether a particular poem can be identified as being by Henry or by a sample of poets writing in the same timeframe and publications as Henry.
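The counting behind these figures is simple enough to sketch. What follows is only a minimal illustration, not the program Paul wrote: the function name initial_and_counts and the sample lines are made up for the example, and it assumes a line-initial "And" is always capitalized. The last loop just turns the counts quoted above into percentages.

```python
import re

def initial_and_counts(lines):
    """Return (lines starting with 'And', how many of those have 'And,')."""
    starts = 0
    with_comma = 0
    for line in lines:
        # \b keeps words such as "Android" from matching.
        match = re.match(r"\s*And\b(,?)", line)
        if match:
            starts += 1
            if match.group(1):
                with_comma += 1
    return starts, with_comma

# Hypothetical sample lines, not taken from either poet's canon.
sample = [
    "And first line of the sample begins with the word",
    "Had this line begun another way, it would not count,",
    "And, here the opening word is followed by a comma,",
    "Android is not the word And",
]
print(initial_and_counts(sample))

# Rates computed from the counts reported above.
for poet, starts, commas in [("Moore", 307, 40), ("Livingston", 212, 0), ("Visit", 12, 0)]:
    print(f"{poet}: {100 * commas / starts:.1f}% of initial Ands take a comma")
```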

Mac's Data:
Henry-Favored and Moore-Favored Phoneme Pairs

Mac's Chapter on "Individual Phoneme Pairs More Favored by Moore or Livingston"
All phoneme pairs falling within either Moore's or Livingston's top hundred, in terms of frequency of use, were tested by chi-square to determine whether they were used significantly more often within the overall corpus of one or other poet. This significance testing uncovered, neatly though coincidentally, ten phoneme pairs more favored by Moore and ten more favored by Livingston.
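For readers who have not met chi-square before, here is a rough sketch of one way such a test can be set up for a single phoneme pair. The 2x2 layout (this pair versus all other pairs, for each poet) is my assumption about the arrangement, not a description of Mac's exact procedure, and the counts are invented. The value 3.841 is the standard cut-off for one degree of freedom at the 5% level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    a = poet 1 uses of the phoneme pair, b = poet 1 uses of all other pairs,
    c = poet 2 uses of the phoneme pair, d = poet 2 uses of all other pairs.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

CRITICAL_5_PERCENT = 3.841  # chi-square cut-off, 1 degree of freedom

# Invented counts, for illustration only.
statistic = chi_square_2x2(20, 80, 40, 60)
print(round(statistic, 3), statistic > CRITICAL_5_PERCENT)
```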

Moore's were:

T/DH   T/F   T/S   Z/W   S/W   Z/T   IY/T   D/P   S/S   Z/CH

Livingston's were:

AH/N   AH/F   AH/S   AH/B   AH/K   AH/L   AH/P   N/AH   Z/AO   Z/IH

Mac's phoneme pair lists would mean more if you could listen to their sounds. A phoneme pair is the sound at the end of one word followed by the sound at the beginning of the next word. The tables below take a single poem for each poet and show a word pair that produces each favored phoneme pair.

Moore Favored Phoneme Pairs in "Saratoga"
Phoneme Pair    Word Pair
T/DH            at the
T/F             not for
T/S             That spoke
Z/W             comes we'll
S/W             chance were
Z/T             waters to
IY/T            we take
D/P             and plenteous
S/S             fierce soe'er
Z/CH            his children

Henry Favored Phoneme Pairs in "Invitation to the Country"
Phoneme Pair    Word Pair
AH/N            The nightingale
AH/F            The flimsy
AH/S            the side
AH/B            a bush
AH/K            The copses
AH/L            the lawn
AH/P            the plains
N/AH            season of
Z/AO            flits o'er
Z/IH            gambols in
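To make the idea of a phoneme pair concrete, here is a small sketch of how the pairs at word boundaries can be pulled out once every word has a transcription. The tiny PRON dictionary is hand-typed in the style of the Carnegie-Mellon dictionary and holds only the few words needed for the example; the function names are mine, and this is not the program used for Mac's data.

```python
# Illustrative pronunciations only; trailing digits are stress marks.
PRON = {
    "at": ["AE1", "T"],
    "the": ["DH", "AH0"],
    "his": ["HH", "IH1", "Z"],
    "children": ["CH", "IH1", "L", "D", "R", "AH0", "N"],
    "nightingale": ["N", "AY1", "T", "IH0", "NG", "G", "EY2", "L"],
}

def strip_stress(phoneme):
    """Drop the stress digit so that AH0 and AH1 both become AH."""
    return phoneme.rstrip("012")

def boundary_pairs(words, pron=PRON):
    """Return last-sound/first-sound pairs for each adjacent pair of words."""
    pairs = []
    for first, second in zip(words, words[1:]):
        a = pron.get(first.lower())
        b = pron.get(second.lower())
        if a and b:  # skip words missing from the dictionary
            pairs.append(f"{strip_stress(a[-1])}/{strip_stress(b[0])}")
    return pairs

print(boundary_pairs(["at", "the"]))
print(boundary_pairs(["The", "nightingale"]))
print(boundary_pairs(["his", "children"]))
```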

Let's look at the data.

    Favored Phoneme Pairs for Individual Poems - Moore
    Favored Phoneme Pairs Summary - Moore
    Favored Phoneme Pairs for Individual Poems - Henry
    Favored Phoneme Pairs Summary - Henry
    Favored Phoneme Pairs - "Night Before Christmas"
    Favored Phoneme Pairs Summary - "Night Before Christmas"

In the table below, Mac was able to discriminate between Henry's poetry and Moore's by dividing Henry-favored pairs by the sum of Henry-favored and Moore-favored pairs.

Because poems with too few favored phoneme pairs would yield random results, poems with fewer than 12 favored phoneme pairs in total were eliminated from each poet's set. Since Henry wrote shorter poems on average, this meant 25 poems were removed from his set, while Moore lost only 6 poems. For the poems remaining, the tests could be trusted to be statistically meaningful. What is being analyzed is 1483 lines by Henry and 2750 lines by Moore.

Poet     Mean of Percentages for Individual Poems
         H-fav/(H-fav + M-fav)
Henry    66.654
Moore    42.912

Visit    64.912

So, for phoneme pairs, a completely unconscious writing characteristic, Mac's calculations showed that "Account of a Visit from St. Nicholas" sat firmly in Henry's camp at 64.912.
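The percentage in the table is easy to compute for any one poem. The sketch below assumes the cut-off of 12 applies to the combined count of Henry-favored and Moore-favored pairs in the poem. The function names are mine, and the counts 37 and 20 are invented: they are not taken from Mac's tables and were chosen only because they give a percentage matching the figure reported for Visit.

```python
def henry_share(h_fav, m_fav, minimum=12):
    """Percentage H-fav/(H-fav + M-fav), or None if the poem is too short."""
    total = h_fav + m_fav
    if total < minimum:
        return None  # too few favored pairs to give a stable percentage
    return 100.0 * h_fav / total

def mean_share(poems, minimum=12):
    """Mean of the per-poem percentages, skipping poems under the cut-off."""
    shares = [henry_share(h, m, minimum) for h, m in poems]
    kept = [s for s in shares if s is not None]
    return sum(kept) / len(kept) if kept else None

# Invented counts for a single poem.
print(round(henry_share(37, 20), 3))
# A poem with only 11 favored pairs is dropped.
print(henry_share(5, 6))
```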

Mac's Data:
Frequency of Common Words

Common Words That Discriminate
Despite the context-sensitive character of many pronouns and verbs, they have been used effectively in dozens of authorship studies, along with other high-frequency words. Very common words that, unlike "that," are ineffective as stand-alone discriminators may have value as members of a substantial group of words, each with some discriminatory power. So, as an initial trial, from word lists, ordered by frequency, for Moore and for Livingston, there were extracted each poet's top fifty words.


    All Words in All Poems
    Word Frequencies in All
    Word Frequencies in Moore
    Word Frequencies in Henry
    Word Frequencies in Visit

Mac pulled from the frequency listing twenty-six words that were in both Moore and Henry's poetry, and which appeared twice in "The Night Before Christmas." These he placed in rank order. Mac then applied Spearman's rank-order correlation, a simple statistical test, to determine whether the rank order for Visit of these twenty-six words more closely matches the rank order for Henry or the rank order for Moore.

From this data, Mac found the correlation between Visit and Henry to be .7638. The correlation between Visit and Moore was .6633. This means that the way the words are used in Visit is closer to the way they're used in Henry's poetry than to the way they're used in Moore's.
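Spearman's test ranks each list of counts and then asks how well the two sets of ranks line up. Here is a minimal sketch in plain Python, with tied counts sharing an average rank. The word counts are invented for the example and are not Mac's twenty-six words; only the method is being illustrated.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) down; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Invented counts of the same five words in a poem and in a poet's canon.
poem_counts = [9, 7, 4, 4, 2]
canon_counts = [310, 150, 220, 90, 60]
print(round(spearman(poem_counts, canon_counts), 4))
```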

Next Mac identified words favored by Henry more than Moore (Henry Favored Words), and by Moore more than Henry (Moore Favored Words). After dropping words that had been evaluated in other tests, so as to keep the tests independent, Mac was left with

Henry Favored Words:
I   his   my   her   on   as   is   was   at    thy   will    day   When   me   Where   While

Moore Favored Words:
to   from   your   for   they   be   With   this   our   not   which   so   would   For   it    heart   Of   are   we

    Henry and Moore Favored Words in Poems

Using a t-test, Mac found it unlikely that Henry's poems and Moore's poems fit within a single population. So he had a differentiator. Looking at "The Night Before Christmas," Mac found that it fit neatly within Henry's percentages but was an outlier for Moore; that is, it was at the extreme end of Moore's percentages.
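A t-test compares the means of two groups of numbers against the spread within each group. The text here does not say which form of the test Mac used, so the sketch below uses Welch's unequal-variance version as an assumption. The per-poem percentages are invented for the example.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented per-poem percentages of Henry-favored words.
henry_poems = [62.0, 71.5, 55.0, 66.7, 58.3]
moore_poems = [41.0, 38.5, 47.2, 35.0, 44.4]
t, df = welch_t(henry_poems, moore_poems)
print(round(t, 3), round(df, 1))
```

A large t value for the degrees of freedom means the two sets of poems are unlikely to be samples from one population.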

Mac's Data:
Frequency of Less Common Words

Words of Medium-High Frequency
The success of high-frequency (top 50) words in discriminating between poems by Moore and poems by Livingston is an encouragement to experiment with words of medium-high frequency - the sixty next most highly ranked in either poet's body of verse. From lists of these were extracted the words that were used at rates at least 1.2 times higher by Moore than by Livingston, and vice versa. It turns out that only two were between 1.2 and 1.3 times as frequent, and both of these were very close to a more demanding 1.3 cut-off point. Sixty words were checked, rather than the fifty of the previous test, because, being of lower frequency, items in this category naturally provided fewer data, in terms of total occurrences.
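The 1.2 cut-off described above amounts to comparing each word's rate of use in one poet with its rate in the other. Here is a rough sketch of that selection. The function name, the placeholder words, the counts and the corpus sizes are all invented for the example; none of them come from Mac's lists.

```python
def favored_words(counts_a, total_a, counts_b, total_b, threshold=1.2):
    """Split words into those favored by poet A and those favored by poet B.

    A word is favored by a poet when its rate (count divided by the poet's
    total words) is at least `threshold` times the other poet's rate.
    """
    favored_a, favored_b = [], []
    for word in sorted(set(counts_a) | set(counts_b)):
        rate_a = counts_a.get(word, 0) / total_a
        rate_b = counts_b.get(word, 0) / total_b
        if rate_a > 0 and rate_a >= threshold * rate_b:
            favored_a.append(word)
        elif rate_b > 0 and rate_b >= threshold * rate_a:
            favored_b.append(word)
    return favored_a, favored_b

# Placeholder words and invented counts per 1000 words of verse.
counts_h = {"word1": 30, "word2": 20, "word3": 10}
counts_m = {"word1": 10, "word2": 20, "word3": 30}
print(favored_words(counts_h, 1000, counts_m, 1000))
```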


    Henry and Moore Favored Words in Poems

After examining the sixty words, Mac found thirty-four Henry Favored Words and thirty-seven Moore Favored Words. He chose to analyze only those poems that contained at least ten of the favored words.

This time there was no subtlety to the separation of the two bodies of poetry. Mac explained that "Livingston's mean of 60.814 for individual poems with at least ten test words is more than twice Moore's of 29.489. The percentage of 53.704 for 'The Night Before Christmas' lies just outside Moore's actual range of 14.583-53.333 for such poems but well within Livingston's of 30.000-89.474."

Mac's Data:
Repetition

Sometimes examining grammatical constructions to uncover potential statistically analyzable data is subjective, which doesn't mean the technique shouldn't be used. Rather, it means that you have to have safeguards that your analysis is consistent over a single body of work, as well as between separate bodies of work.

For Moore and Livingston, the difficulty is that Livingston's poetry is so pleasant to read, and Moore's so boring, that it takes extra effort to be sure that Moore isn't being shortchanged by the desire to get through his analysis as quickly as possible.

We approach the problem of achieving uniform analysis in two ways. One way is to perform the entire analysis several times, over both canons. Painful, yes. Necessary, also yes.

The other protection is to make the categorization as obvious as possible so that Mac can come in and quickly skim the results for any mistakes I've made in categorizations. My husband Paul automated the process so that each repetition pattern appears on a page with the category criteria showing at the top, the patterns in Night Before Christmas following, and then all the poems of the poet being analyzed. I enter the first pass of categorization, then Mac, in New Zealand, looks over the results (shown large and bold) and makes any changes he deems necessary. Each change is instantly reflected in the gathered statistics. Blessed Paul.

    Repetition Categories
    Identical Initial Words on 3 Consecutive Lines   His/His/His   And/And/And
    Multiple Identical Words   His eyes how they twinkled/His dimples how merry
    Identical Middle Word   Tore open the shutters/threw up the sash
    Identical End Words   to all, and to all
    Word And Word And Word   whistled, and shouted, and call'd
    Simple Repetition   dash away/dash away/dash away;
    Inner Identical, Outer Same Parts of Speech   wink of his eye/twist of his head
    Repeated Words with Connector   His cheeks were like roses/his nose like a cherry
    Initial Pivot Words   in their beds/in their heads;

    Repetition in All
    Repetition in Henry
    Repetition in Moore
    Repetition in Visit


Copyright © 2012, Mary S. Van Deusen