If you are unfamiliar with Word Mutagenation,

The meaning of words
Though this be madness, yet there is method in ’t.

So how do we determine, for the purposes of Phrasenation, what constitutes a valid word or phrase? Well, we will take a minuscule sample of English literature and compare our phrase to that sample! If the phrase is found in the sample, then it will be considered a valid phrase. One proviso: all words must be complete. No half words. (As a technical matter, we will count certain symbols, such as "!" and "?", as separate words.)
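The validity test just described can be sketched in a few lines of Python. This is only an illustration, not the Phrasenator's own code: the names `tokenize` and `is_valid` are hypothetical, and the sketch assumes that "!" and "?" are the only symbols counted as words, with all other punctuation simply dropped.

```python
import re

# Assumption: a word is a run of letters or apostrophes; "!" and "?" count
# as separate words. All other punctuation is dropped in this sketch.
TOKEN = re.compile(r"[a-z'’]+|[!?]")

def tokenize(text):
    """Split text into lowercase words, keeping "!" and "?" as words."""
    return TOKEN.findall(text.lower())

def is_valid(phrase, sample_tokens):
    """True if the phrase occurs as a run of complete words in the sample."""
    words = tokenize(phrase)
    n = len(words)
    if n == 0:
        return False
    return any(sample_tokens[i:i + n] == words
               for i in range(len(sample_tokens) - n + 1))

sample = tokenize("To be, or not to be- that is the question!")
print(is_valid("that is the question", sample))  # True
print(is_valid("that is the quest", sample))     # False: "quest" is a half word
```

Comparing whole tokens rather than raw substrings is what enforces the "no half words" proviso.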
Our Phrase Book

For our Phrase Book, we will use The Tragedy of Hamlet, Prince of Denmark by William Shakespeare. I'm sure most everyone can agree that just about anything the Bard said is in some sense meaningful. After all, he practically invented modern English single-handedly! Keep in mind that this will exclude the vast majority of valid phrases, including even most of Shakespeare. However, you can add phrases to the Phrase Book, if you choose.
Finally, all valid phrasenations are ranked by number of characters. Longer is "better".
Hamletations

When generating a phrasagen (a mutant phrase), we will use random mutation and recombination. Starting from just two words, "the" and "question", some not-so-valid phrasagens might look like these:
Phrasenation allows one to adjust the relative frequency of each type of mutation. Having generated a phrasagen, we must compare it to our Phrase Book in order to determine its meaningfulness. If it is not found, we will ruthlessly eliminate it.

To be honest, as this world goes, is to be one man pick'd out of ten thousand.
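A minimal sketch of one generate-and-eliminate round follows. It is not the Phrasenator's actual implementation: the three operators, the `weights` settings, and the one-line toy `BOOK` are all assumptions chosen to keep the example small. The weights play the role of the adjustable relative frequencies, and survivors are ranked by number of characters.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "

# Toy Phrase Book (assumption: one line stands in for the whole play).
BOOK = "to be or not to be that is the question".split()

def in_book(phrase):
    """True if phrase is a run of complete words found in BOOK."""
    words = phrase.split(" ")
    n = len(words)
    return "" not in words and any(
        BOOK[i:i + n] == words for i in range(len(BOOK) - n + 1))

def point_mutation(phrase, rng):
    """Replace one randomly chosen character."""
    i = rng.randrange(len(phrase))
    return phrase[:i] + rng.choice(ALPHABET) + phrase[i + 1:]

def insertion(phrase, rng):
    """Insert one random character at a random position."""
    i = rng.randrange(len(phrase) + 1)
    return phrase[:i] + rng.choice(ALPHABET) + phrase[i:]

def concatenation(phrase, other):
    """Recombine two phrases by joining them with a space."""
    return phrase + " " + other

def phrasagen(population, weights, rng):
    """Draw one mutant; weights set each operator's relative frequency."""
    kind = rng.choices(list(weights), weights=list(weights.values()))[0]
    parent = rng.choice(population)
    if kind == "point":
        return point_mutation(parent, rng)
    if kind == "insert":
        return insertion(parent, rng)
    return concatenation(parent, rng.choice(population))

def generation(population, weights, rng, tries=1000):
    """Keep parents plus valid mutants, ranked longest first."""
    survivors = set(population)
    for _ in range(tries):
        candidate = phrasagen(population, weights, rng)
        if in_book(candidate):      # anything not in the book is eliminated
            survivors.add(candidate)
    return sorted(survivors, key=lambda p: (-len(p), p))

weights = {"point": 1, "insert": 1, "concat": 2}   # hypothetical settings
ranked = generation(["the", "question"], weights, random.Random(0))
print(ranked)
```

With this toy book, concatenating the two starting words yields "the question", which survives and ranks first. Nearly every other mutant is eliminated.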
Indexing the Bard

To make this a practical matter, we must index every valid phrase in our Phrase Book. But it isn't enough to index just the first word in a phrase. We must index every single word.
But what if the first words are the same? Well, then we will compare the word that follows and, if necessary, each succeeding word until we find a word which is different. For instance, "to be" is not a unique phrase, as it could be found as "to be, or not" or as "to be- that is the question". In fact, the phrase "to be" shows up 34 times as first words, including "to be your valentine", but "to be," (note the comma) only once, in "to be, or not to be". The index includes 36,176 words, including symbols.
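One way to picture such an index is as a list of every word position, sorted by the words that follow. The sketch below is an assumption about the general technique, not the Phrasenator's own data structure. The names `build_index` and `count_as_first_words` are hypothetical, and the `depth` cap on how many succeeding words are compared is added here only to keep the sort keys small.

```python
from bisect import bisect_left, bisect_right

def build_index(tokens, depth=16):
    """List every word position, sorted by the words that follow it.

    depth caps how many succeeding words are compared (an assumption made
    here to keep the sort keys small).
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:i + depth])

def count_as_first_words(phrase_words, tokens, index):
    """Count index entries that begin with phrase_words.

    Assumes len(phrase_words) does not exceed the depth used to build index.
    """
    n = len(phrase_words)
    # Prefixes of sorted entries are themselves sorted, so bisect applies.
    prefixes = [tokens[i:i + n] for i in index]
    return bisect_right(prefixes, phrase_words) - bisect_left(prefixes, phrase_words)

tokens = "to be or not to be that is the question".split()
index = build_index(tokens)
print(count_as_first_words(["to", "be"], tokens, index))        # 2
print(count_as_first_words(["to", "be", "or"], tokens, index))  # 1
```

A phrasagen is valid exactly when its count is greater than zero. Because entries with the same first words sit next to each other, two binary searches are enough to find them.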
A Most Interesting Result

O day and night, but this is wondrous strange!

When testing the index, the Phrasenator output every indexed word followed by a specified number of words. Then the Phrasenator counted the number of unique phrases. For one-word phrases, there were 4,801 unique phrases. But what about other numbers of words? For large numbers of words, the answer is surely 36,176, but what is a "large" number? Somewhat surprisingly, if you select any four words in series, the vast majority will constitute a unique phrase! And of the small percentage which are not unique, nearly all are purposefully repeated phrases, such as "a pit of clay for to be made" from the singing Clown's refrain.
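The unique-phrase count described above can be sketched as follows. The function name `unique_phrases` is hypothetical. The sketch assumes that every indexed position contributes a phrase, even near the end of the text where fewer words remain, which is why the count reaches the total number of indexed words once the phrases are long enough.

```python
def unique_phrases(tokens, n):
    """Count distinct phrases formed by each word plus the n - 1 words after it."""
    return len({tuple(tokens[i:i + n]) for i in range(len(tokens))})

tokens = "to be or not to be that is the question".split()
for n in (1, 2, 3):
    print(n, unique_phrases(tokens, n))
# 1 8
# 2 9
# 3 10
```

Even in this ten-word toy sample, three words in series are already enough to make every phrase unique.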
Definitions:
With "O Sean Pitman", we introduced our modest project with simple concatenation and point mutation. Then, in accordance with Dr. Pitman's wishes, we added Insertion. Now we add Exchange and Complex Recombination. Each of these categories is approximately related to powers of L.
That he is mad, ’t is true: ’t is true ’t is pity;
Pitman's Assertions

There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy.
As L increases, Dr. Pitman claims that the ratio of valid phrases to the totality of sequence space approaches zero. Valid sequences get lost in the vastness of sequence space. He concludes that it is impossible to evolve sequences beyond the "lowest level of complexity". However, Dr. Pitman has failed to provide a method of calculating N, much less a map of how valid sequences are distributed in sequence space. Generally, any collection of valid phrases and sentences has some validity. Language can ramble somewhat and still be valid. We could start by talking about Pitman's handwaving and suddenly, for no particular reason, change the subject to ghosts and the murder of kings. Imagine that!
Results Matter

However, even using a tiny sliver of the English language (just one play by one playwright), it can be shown that phrases of substantial length can be easily evolved. Our experiments have shown that words and phrases appear to have some underlying connection related to their own evolution. As such, words and phrases make ideal subjects for evolutionary algorithms.
You can join the discussion on Phrasenation at talk.origins. The thread can be found here.
Phrasenation-Genesis Zip format ~2MB.
(Requires VBA6, which is included in Office 2000 and Excel 2000.)
©2004 Zachriel