Thursday, March 7, 2013

Refining the Question: Bilabial Stops in Milton's Areopagitica. (and some fricatives)


Consecration of Hermagoras by Peter, Aquileia Basilica

I have maintained that the examination of bi-labials in the Areopagitica is a wild-goose chase. I apologize for the thoughtless affront to wild-geese and their chasers and for the careless use of English.

It turns out that the concept wild-goose chase is quite complex and subtle, although it has lost some subtlety in modern usage - as illustrated by my own thoughtless use. I used the term to indicate an essentially pointless endeavor involving considerable effort with no tangible result, i.e. the goose proved uncatchable.

This is not quite true to the history of the term. From wiki we learn that it originally referred to a type of horse race where the racers had to follow the tracks of the lead horse (at intervals, as I have read elsewhere). The "wild-goose" reference is to the unpredictable course of the leader, from the perspective of the followers, and to the imperative for the followers to follow the leader precisely. It is possible that in the 16th c. people were still close enough to nature, to ponds and geese, that we could expect the behavior of wild-geese to be used metaphorically in a trustworthy manner.

In Shakespeare, the term seems to be used to indicate a path difficult to follow, e.g. a complex argument.

Wild Geese Descending to a Sandbank
All the wild-geese I have ever seen, the ones at the pond behind my house in North Carolina and the ones flying over the swamps around Princeton, have always displayed very predictable paths. Generally they seem to fly in a straight line with the followers arranged geometrically behind the leader in a nice V-formation. Landings tend to be very graceful curving maneuvers into open water. I have yet to see a wild-goose engage in erratic flight behavior. Of course, at a distance, it is difficult for a layperson to differentiate ducks and geese.

So where does the metaphor originate? Did people in the 16c. surprise wild-geese on the ground and try to catch them only to have them fly off in all directions? Pigeons or even chickens might do the same thing. Did wild-geese, on the ground, being chased, change directions while starting the run to their flight path, which must surely be a straight line given the effort to attain height and speed? Would a wild-rabbit chase be more to the point? Of course what human would risk humiliation in chasing rabbits? Hopeless. Perhaps the modern usage represents several centuries of experiments by humanity in chasing wild-geese, all of which failed abjectly due to a fast run up and a predictable flight path, up, up and away; hence the adjustment in meaning.

Did riders in the 16c., lacking beagles and a handy fox on occasion, actually chase flying flocks of geese, simply for the sport of the chase and for practice when the beagles would be brought out? So it was never really about a goose, it was about exercising the horses.

Perhaps the real metaphor should be wild goose-chase, i.e. the chase of a domesticated goose that turned "wild" because the goose did its best not to be caught, again behaving very much as a chicken would. Flying is not an option since the wings had been clipped. That does not fit the metaphor since the goose invariably lands in the roasting pan, though the human effort may have been considerable.

Given the general lack of experience of the literate populace with geese, wild or otherwise, perhaps the thought of geese wildly careening across the evening sky is a product of fancy in its 17c. meaning, spun from no observed data, like the ideas of so many lit. crit. graduate students and their mentors in their discussion of Milton. It is based on lack of experience with the real thing: geese in the former case and the actual referentially ambiguous words in the text before us in the latter.

However that may be, let me try to focus on the examination of bi-labials in the Areopagitica. This would also serve to differentiate the approach of a world famous literary critic, august scholar, if you please, from the efforts of a retired Digital Humanities perl programmer writing scripts extracting patterns of words from a text.

For Professor Fish, living his discrete situation, being surrounded by a vast collection of interpretive mechanisms and conceptual building blocks suited for interpretation, the metaphor should be the wholesale slaughter of geese. Let me remind you of his methodological snippet:
The direction of my inferences is critical: first the interpretive hypothesis and then the formal pattern, which attains the status of noticeability only because an interpretation already in place is picking it out. [Fish]
The professor has both barrels loaded and on a hair-trigger, the first with an intimate knowledge of rhetorical forms such as chiasmus and the second with intimate knowledge of the history of Milton's time, specifically the evolution of church hierarchies. Of course a mild dyspeptic spasm could unleash an incidental interpretation. To be more accurate, the professor has countless guns at the ready, sitting in his blind waiting for the ducks to pass over. The action is lightning quick, the "Bishop-Presbyter" ducks appear, bam bam, and the ducks fall lifeless into the water. An interpretation has been formed joining church politics with rhetorical forms. The critic is habitually crouched in the interpretive pose. The interpretation simply pops forth. Explanation and justification follow, metaphorically, talking to the game wardens, some of whom question the validity of the hunting license or assert the expiration of the season or the overstepping of the bag limit or the unsporting use of an automatic weapon without rational controls.

William Laud, Archbishop
We have looked at the interpretation. Now let us look at the "interpretive hypothesis" [from above]. The "formal pattern" [above], e.g. the BP's, attains noticeability because an interpretation is in place. From my very incomplete grasp of Milton, this interpretation, cryptic though it is, cannot get full marks, C+ on the American scale. Why? You ask? Is your task to isolate a common, even secondary or tertiary theme? Were Bishops and Presbyters active in pre-publication censorship? Duh. Is that your interpretive hypothesis? Are you reading the Areopagitica looking for evidence of religious strife manifesting in censorship? Have you spotted the formal pattern of the bi-labial chiasmus to nail 17c. religious strife to the wall? In my view, you have picked a commonplace of 17c. history and unearthed an extremely unlikely "formal pattern" to prove something no one would deny. C+. Too easy, too obscure, too peripheral to the Areopagitica; in short, a rewrite.

In any case, an interpretation has been put into the world. Prof. Fish interprets easily; the only question is which of the dozens of interpretations that offer themselves every day gets written down. There can be no real mysteries in the world of Fish; if there are, the public persona does not show it. Everything exists to be explained. The voice is practically bursting forth, be it the Areopagitica or the Academy Awards winner for the best movie. The world wants his opinions. The judgments are absolute: this is that, a connection has been made, read and learn. I see my function as giving this mechanism a much needed service, a tweaking.

Bishop Laud's Trial
What if, upon opening the game-bag, the reader of an interpretation finds not a goose but a cuckoo bird? What if, the greatness of the critic notwithstanding, the interpretation appears nonsensical and, in addition, is in service of undermining the reader's field of work, computer work with humanities texts? The emotions that swept through the Digital Humanities community last New Year's were hurt, betrayal, bewilderment, abashment, confusion. The temptation is to ignore, as literally a hundred things are ignored in the course of a single day, every day, starting with the fact that it may be cold and raining.

The digital humanist, in general, is a less public figure than the super-star literary critic. There are loaded weapons at the ready in the digital world, but they are not designed to slay geese. The act of killing something and having Rover go fetch is a fairly swift action. The eye sees, the finger squeezes, Rover heads for the splash and the interpretation is in the bag. The copy editors come running. The great ones have this capacity of turning a life of experience into gems of interpretation.

The digital humanist has no instant access to such treasure. The facility is more along the lines of a cartographer, mapping the lay of the land, finding out where the ducks may be and what their flight patterns are. There are months of meetings to lay out data-base structures. There is no assumption that ducks will be put into the bag tomorrow or in a month.

There is a possibility that, after the map has been drawn, an interpretation will arise. It may be possible that someone else, not involved in fetching forth the data, may hit upon something interpretable. In my case, in chasing bi-labials in a fairly non-metaphoric linear fashion, the result has been lists of bilabials with various labels attached: sentence number, sequential position in the text, distance to the next bi-labial and a few more.

The creation of lists involves an inherent progression from a most pedestrian beginning, a sequential list of words, to a final display, at present, which shows the words with the position in the text and the distance behind and ahead to the next bilabial that can be sorted by the gap each bi-labial straddles. Thus it is easy to identify big gaps and small gaps. One would assume that the sonorous prosody of BP's could not survive a gap of 40 to 60 words. Gaps of three, four, even eight words in sequence, on the other hand, could certainly be read to emphasize a pattern of sound. Perhaps.

There is a system in taking a sequence of 18,000 lexical items and extracting numerical data on the interrelations of the words with specific content. It is even convenient that this exercise is empty of meaning; I can concentrate on the mechanics. Textbooks have been written on this field and entry is possible at various levels of virtuosity, perl being one of the easier.

The last redesign of the output was caused by my recognition that I had concentrated exclusively on following bi-labials from one to the next to the next. In other words, I had accepted the forward motion of text, concentrating on the distance from the previous bi-labial to the next. The algorithmic logic that does that is also easier since no values have to be passed backwards. What would be more important to analysis, assuming there is something to analyze, would be the gap which each word straddles. That requires holding the data of the previous BP in stasis, while the next TWO are collected and printed out with the middle one along the lines: previous BP, BP to be printed, next BP.

For example, for starters, the first sentence contains 11 bi-labials.

[NOTE: there may well be some fricatives hiding among the plosives. But you can recognize them.]

12 - Speech,
16 - Parlament,
23 - private
33 - publick
36 - suppose
41 - beginning
58 - doubt
62 - be
71 - be
76 - hope,
85 - speake.
SEN# 001 BP 11 STOT 085 PCT BP/STOT 0.13

"Speech" is the first BP. It is the 12th word of the actual text, the first sentence of the oration as such, dismissing the front matter for now. The first sentence contains 11 BP's out of a total of 85 words, a ratio of 0.13 (11/85), i.e. 13%.

The sentence in question is below.

|p1
They who to States and Governours of the Commonwealth direct their Speech, High Court of Parlament, or wanting such accesse in a private condition, write that which they foresee may advance the publick good; I suppose them as at the beginning of no meane endeavour, not a little alter'd and mov'd inwardly in their mindes: Some with doubt of what will be the successe, others with fear of what will be the censure; some with hope, others with confidence of what they have to speake.
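The first list can be reproduced with a few lines of code. What follows is a minimal sketch in Python, not the perl script behind this post. It assumes that a "BP" is simply any word containing the letter b or p (which is also why a stray fricative can slip in), and that words are whatever is left after splitting on white space. The function name bp_positions is made up for the illustration.

```python
import re

SENTENCE_1 = (
    "They who to States and Governours of the Commonwealth direct their Speech, "
    "High Court of Parlament, or wanting such accesse in a private condition, "
    "write that which they foresee may advance the publick good; I suppose them "
    "as at the beginning of no meane endeavour, not a little alter'd and mov'd "
    "inwardly in their mindes: Some with doubt of what will be the successe, "
    "others with fear of what will be the censure; some with hope, others with "
    "confidence of what they have to speake."
)


def bp_positions(text, start=1):
    """Return (position, word) pairs for every word containing b or p.

    Assumption: matching on the letters b/p stands in for 'bilabial stop'.
    """
    hits = []
    for offset, word in enumerate(text.split()):
        if re.search(r"[bp]", word, flags=re.IGNORECASE):
            hits.append((start + offset, word))
    return hits


words = SENTENCE_1.split()
hits = bp_positions(SENTENCE_1)
for position, word in hits:
    print(f"{position} - {word}")

# Summary line in the same layout as the list above.
pct = len(hits) / len(words)
print(f"SEN# 001 BP {len(hits):02d} STOT {len(words):03d} PCT BP/STOT {pct:.2f}")
```

The start parameter is there so that a later sentence can be numbered from its first word's position in the whole text rather than from 1.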

You can see that digital humanities as I practice it is quite tedious. All that seems to be happening is that one is asked to read and appreciate lists.

A more interesting list is:

diff 12 12 - Speech,
diff 4 16 - Parlament,
diff 7 23 - private
diff 10 33 - publick
diff 3 36 - suppose
diff 5 41 - beginning
diff 17 58 - doubt
diff 4 62 - be
diff 9 71 - be
diff 5 76 - hope,
diff 9 85 - speake.
SEN 001 BP 11 STOT 085 PCT BP/STOT

Here we can see the numerical relation (difference) to the previous BP. "Speech" is 12 words from the beginning - four more to the next BP. The last line summarizes the data for the sentence:

1. number of sentence (1),
2. BP's (11),
3. total words in sentence (85),
4. percentage (13).
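The "diff" column is nothing more than each position minus the one before it. A short Python sketch (again a stand-in, not the perl original; the function name gaps_behind is hypothetical) using the positions from the list above:

```python
positions = [12, 16, 23, 33, 36, 41, 58, 62, 71, 76, 85]
bp_words = ["Speech,", "Parlament,", "private", "publick", "suppose",
            "beginning", "doubt", "be", "be", "hope,", "speake."]


def gaps_behind(positions, start=0):
    """Distance from each BP back to the previous one (or to `start`)."""
    gaps = []
    previous = start
    for position in positions:
        gaps.append(position - previous)
        previous = position
    return gaps


diffs = gaps_behind(positions)
for diff, position, word in zip(diffs, positions, bp_words):
    print(f"diff {diff} {position} - {word}")
```

Because only the previous position has to be remembered, this forward-looking version is the easy one to program.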

In sentence #4 we have 52 words, 9 BP's, 17%. In addition we can see some fairly close proximities of BP's after a single gap of 17. The gap of 17 is large also because the next BP is the next word. Thus the gap is completely behind the word. Looking at the small gaps from "expect - 234" to "liberty - 253" we get a percentage of 37. The cluster 234 to 244 reaches 50%. Since I don't really know if such clustering of bi-labials is unusual, in Milton's time or in our time, I will just assert that there ARE clusters of bi-labials. They can be clearly pinpointed by browsing the list. The list is around 2500 items, easy to sort, easy to scroll, easy to find the sentence in question - assuming some minor virtuosity and willingness - kazoo, not violin.

diff 6 213 - liberty
diff 4 217 - hope,
diff 17 234 - expect;
diff 1 235 - but
diff 2 237 - complaints
diff 4 241 - deeply
diff 3 244 - speedily
diff 6 250 - bound
diff 3 253 - liberty
SEN004 BP 09 STOT 052 PCT BP/STOT 0.17
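The two cluster percentages quoted for sentence four can be checked with a small Python sketch. One assumption has to be stated loudly: the divisor used here is the distance between the first and last position (253 - 234 = 19), because that is the convention that reproduces the 37% and 50% figures. Counting the words inclusively (20 and 11) would give 35% and 45% instead. The function name cluster_density is hypothetical.

```python
sentence_4 = [213, 217, 234, 235, 237, 241, 244, 250, 253]


def cluster_density(positions, first, last):
    """BP's between `first` and `last` (inclusive) per word of distance.

    Assumption: the divisor is last - first, not the inclusive word count.
    """
    inside = [p for p in positions if first <= p <= last]
    return len(inside) / (last - first)


wide = cluster_density(sentence_4, 234, 253)    # "expect" to "liberty"
narrow = cluster_density(sentence_4, 234, 244)  # "expect" to "speedily"
print(f"234-253: {wide:.0%}")
print(f"234-244: {narrow:.0%}")
```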

Below is the fourth sentence, for reference.

|p4
For this is not the liberty which wee can hope, that no grievance ever should arise in the Commonwealth, that let no man in this World expect; but when complaints are freely heard, deeply consider'd and speedily reform'd, then is the utmost bound of civill liberty attain'd, that wise men looke for.

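The latest, and probably last, view calculates the span between BP's. The columns are: the span each word straddles (gap behind plus gap ahead), the gap behind, the position in the text, the gap ahead, and the word.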

16, 012, 000012, 004, Speech,
11, 004, 000016, 007, Parlament,
17, 007, 000023, 010, private
13, 010, 000033, 003, publick
08, 003, 000036, 005, suppose
22, 005, 000041, 017, beginning
21, 017, 000058, 004, doubt
13, 004, 000062, 009, be
14, 009, 000071, 005, be
14, 005, 000076, 009, hope,
12, 009, 000085, 003, speake.

In sentence four (below) you can see the sequence of single digit spans.

10, 006, 000213, 004, liberty
21, 004, 000217, 017, hope,
18, 017, 000234, 001, expect;
03, 001, 000235, 002, but
06, 002, 000237, 004, complaints
07, 004, 000241, 003, deeply
09, 003, 000244, 006, speedily
09, 006, 000250, 003, bound
16, 003, 000253, 013, liberty
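The span view needs the neighbor on both sides of each BP: previous BP, BP to be printed, next BP. A minimal Python sketch of that three-position window follows. It is not the perl script; it works on a finished list of positions instead of holding values back while reading the text. The positions 207 and 266 for the BP's just outside sentence four are not quoted anywhere in this post; they are inferred from the 006 and 013 gaps in the table above. The function name span_rows is hypothetical.

```python
def span_rows(positions, bp_words, previous_bp, next_bp):
    """Format one row per BP: span, gap behind, position, gap ahead, word."""
    padded = [previous_bp] + list(positions) + [next_bp]
    rows = []
    for i, word in enumerate(bp_words, start=1):
        behind = padded[i] - padded[i - 1]
        ahead = padded[i + 1] - padded[i]
        rows.append(
            f"{behind + ahead:02d}, {behind:03d}, {padded[i]:06d}, {ahead:03d}, {word}"
        )
    return rows


sentence_4 = [213, 217, 234, 235, 237, 241, 244, 250, 253]
words_4 = ["liberty", "hope,", "expect;", "but", "complaints",
           "deeply", "speedily", "bound", "liberty"]

rows = span_rows(sentence_4, words_4, previous_bp=207, next_bp=266)
for row in rows:
    print(row)

# Pick out the single digit spans by reading the first column back in.
single_digit = [row for row in rows if int(row.split(",")[0]) < 10]
```

Sorting or filtering on the first column is what makes the big gaps and the small gaps easy to separate.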

One last topic has to be covered: graphical output.

I am no great fan of graphical output in text research. The temptation is to show a graph with the assumption that spikes mean something more than a grotesque hair-do. I prefer to look at the low gap numbers in sentence four (table directly above) and immediately go to the sentence. Spikes and troughs are fine as long as they lead to an examination of the sentences forming the features.

On some level of visionary blue sky, I do wish we could run all our text through some cross between Ngram viewer, SAS and Mathematica. Btw., the Ngram results for bishop, prelate and presbyter show that bishop completely wipes the other two off the graph. There is a spike in the Bishop line around 1590 that begs for an explanation from real experts on 16c. publications.

Often, graphs of very high quality and statistical expertise are lavished on texts, where the graphs and the attendant statistics not only go over the head of the scholars in the field, but have lost the connection to the reading of a text. Alas, in projects working on up to 80 manuscripts of a tradition, the temptation is to test the outer limits, and I accept that.

In the meanwhile, graphs play a minor role in the BP chase. In the tables above, (only excerpts shown) there are some 2500 data points of BP instances. It is possible to graph 5 or 10 sentences. The graphs show nothing that you cannot see from the data tables. My reaction upon fashioning graphs is: Oh yea, and a quick click to the data tables and the text.

I did make one list of all 18000 data points of BP's and non-BP's, just to be able to make a quick and dirty Excel graph - all it shows is a fairly consistent oscillation between short gaps and long gaps.

1,0,
2,0,
3,0,
4,0,
5,0,
6,0,
7,0,
8,0,
9,0,
10,0,
11,0,
12,16
13,0,
14,0,
15,0,
16,11
17,0,
18,0,
19,0,
20,0,
21,0,
22,0,
23,17
24,0,
25,0,
26,0,
27,0,
28,0,
29,0,
30,0,
31,0,
32,0,
33,13
34,0,
35,0,
36,08
37,0,
38,0,
39,0,
40,0,
41,22
42,0,
43,0,
44,0,
45,0,
46,0,
47,0,
48,0,
49,0,
50,0,
51,0,
52,0,
53,0,
54,0,
55,0,
56,0,
57,0,
58,21
...
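A list of this kind, one row per word with the span at every BP and zero everywhere else, takes only a loop. The Python sketch below is an illustration rather than the script that produced the rows above; it copies the layout of the excerpt, including the trailing comma on the zero rows, and uses the spans of the first seven BP's. The function name dense_series is hypothetical.

```python
# Position -> span for the first seven BP's, taken from the span table.
spans = {12: 16, 16: 11, 23: 17, 33: 13, 36: 8, 41: 22, 58: 21}


def dense_series(spans, total_words):
    """One 'position,value' row per word: the span at a BP, 0 otherwise."""
    lines = []
    for position in range(1, total_words + 1):
        if position in spans:
            lines.append(f"{position},{spans[position]:02d}")
        else:
            lines.append(f"{position},0,")
    return lines


lines = dense_series(spans, 58)
print("\n".join(lines))
```

The output can be pasted into a spreadsheet as two columns for a quick graph.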

I am not yet ready to draw any conclusions from the inescapable fact of large gaps and small gaps. The graphs below will allow you to make up your own minds. Everything here can be repeated, so the warning "Don't try this at home, girls and boys" does not apply here.

18,000 Data Points of All BP's
The graph of all the data points only shows that there are considerable gaps in the distribution; by sorting the data tables it is quite easy to separate out the big gaps and the little gaps.

The graph below focuses on a smaller context.

This graph focuses on the first 6 sentences, 405 data points. The last two data points, 16 and 16 are the last two BP's in sentence 6. The arrows point to sentence cusps.


The graph below covers the first 46 data points.


106 data points below.


210 data points below. Note that the points represent gaps. Low values point to dense patterns of BP's and large spikes, the absence of BP's.
Even the fine grained graphs do not really tell a story. There is no real connection between the act of reading a text and inspecting the graph. Perhaps it would be possible to create an interface where clicking on a data point would lead into the text.

The same can be achieved with a simple three window text display (cited before). The point is to have easy access to sentences of the Areopagitica. The printed editions do not provide that. The often extremely long sentences are presented in extremely long paragraphs. As I see the task at hand, the logistics of Milton studies need to be improved. The interface below concentrates on bi-labial plosives, but any number of more valuable features could be extracted from the text, put into an abstract form with the links back into the text. We must help the human brain gain easier access to our textual tradition. Prof. Fish is one of our great athletes on the court of text. But reading texts and understanding our heritage cannot be left to virtuosi in subjective expression, objective though it may seem. Our knowledge of nature began in our civilization with the questions Aristotle presented. The systematic work over two centuries has forced nature to yield many erstwhile secrets. In the 20th c. we have made some great strides to a more universal understanding of the texts of discrete cultures. How can we coexist on the planet with very similar physiological processes, very similar existential challenges, yet with such opposed cultural expressions?

We must make access to text easier. The point is not just to increase access to schools and universities, we must improve the logistics of bringing to texts all that is required to work through them. Many Digital Humanists are convinced the answer lies in automatic processes that can quantify vast amounts of text. Perhaps, likely even. Google has astonished me in the last ten years. Yet, it is not uninteresting to work in depth on a single text of 18,000 words.

Ngram has shown me a blip in the uses of the word bishop in 1590 continuing for some twenty years. The thought will haunt me for the next couple of days. I suspect that it may be merely an accident of what has been scanned. Until we get a more complete and more accurate record of our texts such blips will be little more than phantom images on our still relatively primitive machines.

One of the guiding lights, quite peripheral to what I am doing here, but still a guide into the future of pedagogical work with old texts is the work of Jonathan F. Bennett.

Professor Bennett has had a long career at various universities  starting at Cambridge and continuing to universities in Canada and the US.  He has gleaned the insight from decades of teaching that very pedestrian language problems are blocking access to ideas from the 17th c. for modern students. In philosophy the problem is not the ideas, but the archaic language in which the ideas are presented. I understand that a student of Milton might be required to deal not only with Milton's ideas but also with his language. The study of Descartes might not operate under the same imperatives. Prof. Bennett does not work with Milton and he is fully aware of the controversial aspect of his recent work and the need not to leave the early modern period completely. He believes that the benefits for students outweigh the imperatives of faithful reproductions of old editions. Prof. Bennett concentrates on philosophy texts:
When students are introduced to the great philosophical works of the early modern period, it is usually in the hope that they will engage with the thoughts and arguments that the texts present. The teaching experience of many of us suggests that most students simply cannot understand these texts. The increasing rate of change in the English language ensures that fewer and fewer of today’s readers can cope with the writings of the 16th-18th centuries. There are difficulties of syntax, length and complexity of sentences, words that are no longer current, still-familiar words used in meanings that they now do not have, arcane references to other philosophers which today’s students will seldom understand or be required to follow up; these and other factors create forbidding obstacles to engaging with these early modern texts. I reduce the obstacles so that students can more easily come to grips with the philosophical thoughts the texts express. Once they do that, they still won’t have an easy time, because the material itself is hard; but their efforts will go into getting philosophical understanding, not decoding old prose. http://www.earlymoderntexts.com/f_why.html
The same thing can be said for the Areopagitica. We read that text not for the poetry of it, Prof. Fish is here the exception, we read it for the ideas. We could start speculating with smoothing the language. I am not completely convinced that it is impossible to separate the language of the 17th c. from the ideas of the Areopagitica.

The argument here, the point of the effort, is to encourage text workers to use the resource of the windowed laptop. I am not concerned with the data-miners. I am concerned with specialists on small areas of the text tradition who should find ways to use computers and algorithms to map their field with greater precision. I have explained enough that the text in the window below should be comprehensible. I have no illusion that this work is easy; neither was the path from Lachmann to Cladistics. It may be that the text miners will force us to forget who transmitted what to whom, which we have done now for 200 years with complete philological rigor, and concentrate instead on what Milton is telling us in the first place. Or they may do both. To use modern scientific methods to revisit questions that lost relevance a hundred years ago seems atavistic nostalgia. Much of what has survived in our academies of textual positivism must be rethought in terms of opening the tradition to readers, not in perusing ever more esoteric provenance studies.

As such, the problem of what is a sentence in Milton becomes less important. For example, the question of whether a phrase ending in a question mark is a sentence, even if the following phrase is not capitalized, largely becomes irrelevant. It becomes an easily understandable example of irrelevance. In the past our academic methodologies have tried to reproduce type-setting conventions of the past. In our new electronic editions, we can ignore the conventions of the past and try to recapture the communication. Is there any reason to carry Milton's spelling "wee" into the present? The loss of meaning due to lack of familiarity with 17c. prose is greater than the loss from some daring maverick replacing all the "wee" with "we" and capitalizing the first word after a question mark, just to make parsing text easier for perl programs. To some that would be irresponsible vandalism, endangering the transmission, gnawing at the foundations. I say, let's save what we can for the students of today. To this end, I plan a few more posts, principally on and-pairs and the use of apostrophe.
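The two smoothing steps named above, "wee" to "we" and a capital after a question mark, amount to two substitutions. Here is a minimal Python sketch (the post's own tools are perl; the function name smooth is hypothetical). The first sample is the opening of sentence four; the second is an invented phrase, not a quotation from Milton, used only to show the capitalization.

```python
import re


def smooth(text):
    """Modernize 'wee' and capitalize the word following a question mark."""
    # Whole-word match only, so 'sweet' or 'weed' are left alone.
    text = re.sub(r"\bwee\b", "we", text)
    text = re.sub(r"\bWee\b", "We", text)
    # A lower-case letter after '?' and white space becomes upper case.
    text = re.sub(
        r"(\?\s+)([a-z])",
        lambda match: match.group(1) + match.group(2).upper(),
        text,
    )
    return text


milton = smooth("For this is not the liberty which wee can hope,")
invented = smooth("shall wee be silent? then wee are lost.")
print(milton)
print(invented)
```

Running such a step on the source text before every test run keeps the correction in one place instead of in every derived list.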

Three Windows: 1. percentages, 2. sentence profiles, 3. the text.
So what has all this programming yielded? The answer is not much really. For me, the exercise was not completely uninteresting; I exercised my perl programs. I got a chance to practice some feature extraction I had not tried before, on a text I had not touched since undergraduate days. In addition, programs are living things. When they are awoken and applied to data, they execute logic that works in harmony with the human mind. The program gives me the percentage of BP's in every sentence in the Areopagitica - and sorts them ascending or descending. As such, the program has a life, a script life, a symbiotic life designed around the deficiencies of the human brain in assimilating streams of words. As such, it has intrinsic value. Its intrinsic value also requires that it be perfected, optimized, and extended as new questions present themselves. There are some quasi-parental obligations we have towards our programs.

I have resisted interpretation. What about the single digit spans? What about the fact that "liberty" is the first and last BP in sentence four? My weapons have the safety on; these geese are safe for now.

However, there are collateral benefits. In chasing BP's, I have had to pore over the text in some detail. I have checked individual lexical items and tracked down a few chiasmi. I must admit that my attempt to get profit from a sequential reading has yielded sparse results. I have outlined and parsed the first 26 sentences, the introduction. I have followed Milton's history of censorship and gained some insight into the last phases of censorship from the inquisition to the Church of England to the Presbyters. I have followed Arber's outline in his 1868 reprint and secured a lifeline through the imparsable. It has been curious how library vandals have marked up the library books scanned for Google. There are lines and arrows to track down hidden sentence parts. So I am not the only one who is having problems. However, it is not necessary to disfigure electronic texts. I cannot stress enough how important it is to have the physical aspect of a text well in hand in a multi-window text processor. Milton editions, under the guise of historical bling-bling, err on the side of the textual brier-patch.

 



Milton's argument in the Areopagitica does careen from one unfamiliar reference to the next. I do feel vindicated in my emphasis of Milton's conciliatory mission. He is trying to convince his "Parlament" of the greatness of England and the contribution of its learned men. At the beginning of the final appeal for tolerance in the last 20 sentences, Milton rises above his argument.

Before I return to the programming I would like to share two quotes in sentences relatively easy to parse.

|p321
What else is all that rank of things indifferent, wherein Truth may be on this side, or on the other, without being unlike her self.

Sentence 321 exhibits a healthy perspectivism in a time of absolutism, when few would share this thought: it may be possible for truth to be on opposing sides and still be truth in each case.

Sentence 324 pleads for peace and withholding of judgement.

|p324
How many other things might be tolerated in peace, and left to conscience, had we but charity, and were it not the chief strong hold of our hypocrisie to be ever judging one another.

In this context it is difficult to render my final judgement on the tools Prof. Fish uses to smite Digital Humanities. I continue to maintain that he is rushing down a doctrinal debate that is not an essential, only a peripheral theme in the Areopagitica. Whether he is just caught in a subjective moment, or lampooning, or just stuck with an unreflected argument, I cannot say. In any case, sharp reflexes that are so helpful in sport should be restrained in hermeneutics.

I do appreciate having been goaded through this extended tour through Milton; wish I had some standing to interpret in the 17th c.

Before I close this project I plan some more posts:

1. to present the latest version of the script which traces a path from unnumbered sentences to ever more focused lists. The point is to attain efficiencies by starting with the text in each of the literally hundreds of test runs. That way one can fix a typo in the Milton text and not interfere with tests on the percent calculations. The corrected word, or a faux sentence cusp, will automatically be carried to the latest list. The production of analysis tools goes hand in hand with cleaning the text. One chief task of the "cleaning" is to smooth out the arbitrary and unsystematic printing conventions of yore.
2. to extract all and-pairs and find several views that shed some light on equivalences in Milton's thought at this time (and to recommend the extraction of conjunctions and their arguments as a general methodology with texts);
3. to extract the various words with apostrophe to indicate missing words.
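The idea in the first item, starting from the text on every run so that a correction flows through to every list, can be sketched briefly. The Python below is an illustration under assumptions, not the script promised for the next post: it assumes the text is already marked with |p lines as in the excerpts here, it assumes a BP is any word containing b or p, and the function names read_sentences and profile are made up. The two sample sentences are 321 and 324 as quoted in this post.

```python
import re

MARKED_TEXT = """|p321
What else is all that rank of things indifferent, wherein Truth may be on this side, or on the other, without being unlike her self.

|p324
How many other things might be tolerated in peace, and left to conscience, had we but charity, and were it not the chief strong hold of our hypocrisie to be ever judging one another.
"""


def read_sentences(marked):
    """Map sentence number -> sentence text from '|pN' marked input."""
    sentences = {}
    number = None
    for line in marked.splitlines():
        line = line.strip()
        marker = re.fullmatch(r"\|p(\d+)", line)
        if marker:
            number = int(marker.group(1))
            sentences[number] = ""
        elif line and number is not None:
            sentences[number] = (sentences[number] + " " + line).strip()
    return sentences


def profile(text):
    """Return (BP count, total words, ratio) for one sentence."""
    words = text.split()
    bp = sum(1 for w in words if re.search(r"[bp]", w, flags=re.IGNORECASE))
    return bp, len(words), bp / len(words)


profiles = [(n, *profile(t)) for n, t in read_sentences(MARKED_TEXT).items()]
profiles.sort(key=lambda row: row[3], reverse=True)  # densest sentence first
for n, bp, total, pct in profiles:
    print(f"SEN# {n:03d} BP {bp:02d} STOT {total:03d} PCT BP/STOT {pct:.2f}")
```

Since every list is rebuilt from MARKED_TEXT, fixing a typo or moving a sentence cusp there changes all the derived numbers on the next run.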

More on this later.

For now I would like to return to the BP chase and discuss the latest version of the script. Please go to the next, newer post.