If you look closer at "michiton oladabas"[0], you will notice a smudge above the first "i" and a spot above the middle "a". So let us read it as "míchiton oladábas". Now reverse it, treating "ch" as one letter, as in Czech spelling. You get "sabádalo notichím" or "notichím sabádalo". Both mean "was investigated by 'no' Tichý" in Early Modern Czech. In 2003, I posited that a Mr Tichý, an early researcher of the Voynich Manuscript, scribbled this Czech "cipher" on its last page.
Pros:
* The manuscript has been traced back to the library of Georg Baresch (1585–1662) from Prague. It may have belonged to the Holy Roman Emperor Rudolf II (1552–1612) or to his physician Jakub Horčický (1575–1622), both living in Prague.
* The accents are exactly where they are supposed to be in Czech.
* Tichý is a popular Czech surname. "Tichím" is now spelled "Tichým". AFAIK, "sa" is a dialectal variation of "se".
* What would you write on a book you have investigated?
Con:
* What the heck is 'no'?
The hypothesis was met with cold reception on the VMs mailing list and I never cared about it too much but I still find it more credible than the reading in this article.
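For anyone who wants to check the reversal at the start of this comment, here is a minimal sketch. The function name `reverse_with_digraphs` is made up for illustration, and the only assumption is the one stated above: "ch" counts as a single letter, everything else is reversed character by character.

```python
import unicodedata

def reverse_with_digraphs(text, digraphs=("ch",)):
    """Reverse text while keeping each listed digraph intact as one letter."""
    # Normalise so that accented letters such as "í" are single code points;
    # otherwise a combining accent would end up on the wrong letter.
    text = unicodedata.normalize("NFC", text)
    tokens = []
    i = 0
    while i < len(text):
        for d in digraphs:
            if text.startswith(d, i):
                tokens.append(d)      # keep the digraph as one token
                i += len(d)
                break
        else:
            tokens.append(text[i])    # ordinary single character
            i += 1
    return "".join(reversed(tokens))

print(reverse_with_digraphs("míchiton oladábas"))  # sabádalo notichím
```

A plain character-by-character reversal would give "notihcím" instead of "notichím", which is why the digraph handling matters here.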
I was thinking about it (my guess was Norbert) but why would you abbreviate your first name and merge it with the surname?
Another option is a corruption of "netichým" (by the unquiet one; there is no surname Netichý), "nad tichým" (over quiet), or "na tichém" (on quiet). Both "nad" and "na" are unaccented like "sa", which nicely explains merging them with "tichím". As for the last option, I know nothing about the probability of confusing the instrumental "tichým" with the locative "tichém" in Czech. Most importantly, though, these expressions (unless they are some obscure, antiquated idioms) really need a following noun, just like in English. Neither "multos" nor "sotlum" is a noun.
> why would you abbreviate your first name and merge it with the surname?
One might also ask "why would you write your name backwards in a book that's written in a language no one can read?". Perhaps the same thought process that led the writer to reverse the order of the letters also encouraged them to remove punctuation and superscripting (which were often used when abbreviating names), as well as remove the space between first and last name.
Abbreviation of the first name (and not the surname) doesn't seem that odd, as evidenced by this 1640 collection of poems by "WIL. SHAKESPEARE":
Readers may also find the Copiale cipher[1] interesting. It had been around since the 1700s, unsolved, until 2011, when a team figured out (with the help of a computer or two) that it was a homophonic German cipher, and decoded the whole thing.
> an initiation ritual in which the candidate is asked to read a blank piece of paper and, on confessing inability to do so, is given eyeglasses and asked to try again, and then again after washing the eyes with a cloth, followed by an "operation" in which a single eyebrow hair is plucked.
For the amount of time and effort it took to extract this secret, that's a hilarious letdown.
> "Although certainly not as complex or secure as modern computer operated stream ciphers or block ciphers, in practice messages protected by it resisted all attempts at cryptanalysis by at least the NSA from its discovery in 1953 until Häyhänen's defection in 1957."
It's not so easy to find a Unicode transcription, but I would think that would be the first step in 2020. There is some info here [1] about such a transcription but it does not feel "definitive". Does anyone know if there is a copy-pasteable transcription of the manuscript anywhere?
When Linear B was decoded, the first thing many researchers did was to transcribe every word they could get their hands on, then sort by beginnings and endings of words [2]. I suppose that is what the poster has done, since they say there are no inflected word endings, but perhaps a cipher hides these, or they do not have enough data?
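As a rough illustration of that "sort by endings" step, here is a small sketch. The function names are hypothetical, and the sample tokens are just a handful of EVA-style words picked for illustration, not a real transcription.

```python
from collections import defaultdict

def sort_by_ending(words):
    """Sort unique words by their reversed spelling, so shared endings line up."""
    return sorted(set(words), key=lambda w: w[::-1])

def group_by_ending(words, n=2):
    """Group unique words by their last n characters."""
    groups = defaultdict(list)
    for w in sorted(set(words)):
        groups[w[-n:]].append(w)
    return dict(groups)

# Illustrative tokens only (assumed sample, not taken from a specific folio).
sample = ["daiin", "chedy", "okaiin", "shedy"]
print(sort_by_ending(sample))
print(group_by_ending(sample))
```

Sorting by the plain spelling gives the same view for beginnings. Whether the groups that fall out are inflections or something else is exactly the open question.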
It's difficult if not impossible to transcribe something in an unknown writing system, since you can't even be sure which glyphs are the same. Wikipedia has a few attempts:
There is a transliteration of the Voynich text at [1]. I have no way to judge how complete or accurate it is, but it appears to be a useful piece of work.
The most widely used transliteration, by Takahashi into EVA, is mentioned on that page, which is on Rene Zandbergen's web site. There was no intention of making EVA phonetically accurate, as we don't know how, if at all, the Voynich Manuscript text was supposed to be pronounced. However, it is reasonably pronounceable. You can find Takahashi's EVA transliteration at http://www.voynich.com/pages/PagesH.txt
For what it's worth, there has been some work on parsing Voynich words into syllables, which might be a more helpful level to work at.
"Taking into account words which couldn’t be syllabified (2.6%), words with more than three syllables (0.4%) and words with rank errors (1%) we can see that the Body Rank Order theory accounts for 96% of all word types with more than four tokens."
It's well-known that there's a grammar of glyphs within "words" (i.e. groups of glyphs separated by spaces), but it's premature to identify syllables. Most words seem to have prefixes and suffixes, without anything in between.
> the illustrations of women bathing strongly suggest that the work is concerned with distinctly European traditions of balearic treatment of illnesses
I assume the author means "balnearic", but I certainly prefer to imagine that he is referring to the foam party at Amnesia.
>I've been reading the so-called secondary literature for about a year. What compels me to come out is the discovery over this past year that for the most part commentators really do not know what they are doing.
...few things are certain, but that the Voynich Manuscript is NOT in Latin is sure: it would have been translated centuries ago (i.e. when it was written). This person, by his own admission, has only been studying the problem "online" for a year, using "so-called secondary literature" and "hi-res scans", so writing things like:
> What compels me to come out is the discovery over this past year that for the most part commentators really do not know what they are doing.
is simply ridiculous.
This man is arrogant and full of himself and like all arrogant people he ignores hundreds of papers on the topic written by hundreds of people before him.
It's quite possible it is a hoax intended to get an interested buyer to buy the book. If Rudolf II owned it, it's likely he paid a lot to get it. It wouldn't be the first forgery/hoax.
Bummer, he pretty much proved he was on the right path with his translation theories. Maybe this was done to protect his work? As in, he's now working to complete the translation and did not want others to beat him to it. One can hope...
His video explaining how the astronomical diagrams depicted a specific known eclipse was extremely compelling (if I am remembering it correctly).
Another intriguing video is [0] by Sukhwant Singh. I think he over-estimates how much of the VM he has decoded, but some of his observations are hard to ignore. For example:
* 08:30 interpretation of a diagram in the VM being a map of the Middle East
* 12:46 & 13:26 explanations of the distinctive headgear
* 14:00 & 14:35 & 15:00 observation that the buildings look like mosques and minarets
* 19:46 explanation of the unusual cylindrical containers
I always found the style and imagery strongly reminiscent of some of the works of artists included in the Prinzhorn collection (Artistry of the Mentally Ill was his major work).
It has that atmosphere of a self contained mental world.
I have a similar feeling with no rational basis. The fact that something so accessible exists and which no one can decode is a bit inspiring. I've certainly enjoyed exploring the world's languages trying to find a match for what I see in the manuscript, however unlikely it is that a non-expert could decode it. It does feel like a treasure hunt anyone can join.
Once we can read the text, it will most likely describe something mundane. Perhaps it is a text on herbology, like it appears. For me the mystery is worth more than whatever the manuscript contains. I don't think it will open the door to understanding history and a culture like the Rosetta stone did.
Alternatively, there is an AI technique for translating between languages without needing a bilingual dictionary or parallel texts[0].
There may not be enough data in the Voynich manuscript to generate an accurate translation of any word in it, but the model generated by the AI might be enough to decide which language is closest to Voynichese, or which transcriptions of the manuscript are the least "noisy".
For this to succeed, the text would have to be meaningful in some language, which is by no means certain. All languages have grammars, but nobody has yet discovered any grammar for the Voynich Manuscript operating outside of words.
If it is not meaningful in any language, then we would not expect it to obey Zipf's Law, however it has been shown that it does (as mentioned by an earlier commenter).
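For reference, the usual way to check this is to rank word types by frequency and look at the slope of log frequency against log rank, which should be close to -1 for a Zipfian text. A minimal sketch, with hypothetical function names and no claim that this is how the published analyses did it:

```python
import math
from collections import Counter

def rank_frequency(words):
    """Return word-type frequencies ordered from most to least common."""
    return [count for _, count in Counter(words).most_common()]

def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two frequencies; a value near -1 is the classic Zipf shape.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Sanity check on an idealised distribution where frequency = 1000 / rank.
ideal = [1000 / rank for rank in range(1, 51)]
print(zipf_slope(ideal))
```

To run it on the manuscript you would feed `rank_frequency` the space-separated "words" of whichever transliteration you trust.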
If it didn't obey Zipf's Law, we could safely conclude it wasn't in an unencrypted language, but you can't make the reverse inference, as lots of phenomena obey Zipf's Law which are nothing to do with language. I've generated a fake Voynich Manuscript which I can assure you is entirely meaningless, but it obeys Zipf's Law. Here it is: http://www.fmjlang.co.uk/voynich/generated-voynich-manuscrip...
“Zipf’s law was discovered centuries after the accepted date of creation of the Voynich text. Thus, proposed solutions like the use of sixteenth-century cipher methods, although not impossible, can hardly account for the presence of Zipf’s law in the Voynich text.”
The constraint the author of the Voynich Manuscript would have been working to is that it had to look at least superficially like a natural language. The bare minimum for that is that the choice of the current glyph depends on the previous glyph, and I think that would have been obvious even at the time the Voynich Manuscript was written. I think that's a necessary (but not a sufficient) property of all alphabetic languages, and suspect it's enough to make the distribution Zipfian. It would be nice to have a mathematical proof or an experimental verification that it's always or nearly always Zipfian.
Generating text in that way doesn't require a prior knowledge of Zipf's Law, and I don't think it was unthinkable in the 15th Century.
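To make "the current glyph depends on the previous glyph" concrete, here is a minimal first-order Markov sketch. It is only an illustration: the function names and the seed words are assumptions, each Latin character stands in for one glyph (a simplification), and it is not claimed to be the method behind the generated manuscript mentioned earlier in the thread.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical markers for word start and word end

def train_bigram_model(words):
    """Record, for each glyph, every glyph that followed it in the training words."""
    model = defaultdict(list)
    for word in words:
        prev = START
        for glyph in word:
            model[prev].append(glyph)
            prev = glyph
        model[prev].append(END)
    return model

def generate_word(model, rng, max_len=12):
    """Build one word by repeatedly picking a successor of the previous glyph."""
    out = []
    prev = START
    while len(out) < max_len:
        nxt = rng.choice(model[prev])
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)

# Illustrative seed words only (assumed sample).
seed_words = ["daiin", "chedy", "qokeedy", "shedy", "okaiin"]
model = train_bigram_model(seed_words)
rng = random.Random(0)
print(" ".join(generate_word(model, rng) for _ in range(10)))
```

Generating a long stream of such words and measuring its rank-frequency curve would be the experimental check asked for above. A 15th-century scribe could do the equivalent with a table of allowed successors and dice.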
It's not unthinkable that it could be the result of a hoax, and I admit that there aren't a lot of examples of documents even remotely like the VM, but I still think that an Occam's Razor approach would rule out a hoax.
How likely is it that someone (or even some group) trying to perpetrate a hoax would come up with a method which was not only undetectable by analytical tools available at the time but also by tools that wouldn't be invented even centuries later? I suppose we should compare it to the invention of the Vigenère cipher, in the 16th century, which was only broken in the 19th century.
Conversely, we have examples of texts written in forgotten languages which were later translated / remain untranslated, and we don't / didn't assume them to be hoaxes (assuming they obey Zipf's Law). We also have examples of people inventing new written languages for spoken languages that didn't have a (known) written form, so I think Occam's Razor would support the hypothesis that this is what happened with the VM.
The Voynich Manuscript is a unique document. Nothing else is written in the same script. A few others from the period are encrypted, notably Bellicorum Instrumentorum Liber by Giovanni Fontana, but they're trivial to decipher. Only one other I'm aware of has any naked women in baths.
For other scripts, there either are numerous texts for each (e.g. Egyptian, Sumerian, Mycenaean Greek, Khitan, Etruscan, Rongorongo), or very few short texts (e.g. Pictish). Not a stand-alone 240-page book.
But there are examples of forged documents from the middle ages, e.g. the Gospel of Barnabas and the Donation of Constantine.
I tried to decipher the Voynich Manuscript, before discovering that there were other, simpler explanations for its apparent meaningfulness (e.g. the observations described in the Montemurro & Zanette paper): http://www.fmjlang.co.uk/voynich/Voynich.html . It's possible it's some kind of verbose cipher devised by someone ahead of his time. It's even possible, though very unlikely, that it's written in a Northwest Caucasian language. I've ruled out other language groups.
The approximate adherence to Zipf's Law is telling us something about how the text was produced. When I have time, I might look into whether state machine generated output follows Zipf's Law. I suspect it does generally, and know it does for the text I generated, but I'd like solid proof and it would be another nail in the coffin of the idea it's unencrypted or lightly encrypted text. Or it might not.
Thank you for that excellent overview. It's interesting that your Principal Component Analysis[0] pointed to a Caucasian language, as did a more recent Hypervector Analysis[1]. How likely is it that a 15th century process for generating meaningless text would not only follow Zipf's Law but also give consistent and plausible results under those analyses?
I'm of the opinion that it's not supposed to make sense. I think it's a sample: a demo herbal void of any valuable content. Its purpose is to showcase the abilities of the author/illustrator and nothing more.
If the sequence of symbols was completely random (biased only by a human desire to make it look like a language, and human inability to be perfectly random) wouldn't that show up in a statistical analysis?
According to [0], "A long succession of (actually pretty good) past statistical studies has revealed that Voynichese has an abundance of mechanisms that give it internal structure, not only in terms of letter adjacency and within words generally, but also within lines, paragraphs, and pages."
Whatever process generated that structure, it seems like it required more work than is necessary to achieve the aim you suggest the writer(s) had.
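One simple statistic for the letter-adjacency part of that structure is the conditional entropy of a glyph given the one before it: low values mean the next glyph is highly predictable. A minimal sketch, assuming one character per glyph and using a hypothetical function name:

```python
import math
from collections import Counter

def conditional_entropy(text):
    """Entropy in bits of the next character given the previous one."""
    pairs = Counter(zip(text, text[1:]))   # counts of (previous, next)
    firsts = Counter(text[:-1])            # counts of each previous character
    total = sum(pairs.values())
    h = 0.0
    for (prev, _nxt), n in pairs.items():
        # weight by the pair probability, take log of P(next | previous)
        h -= (n / total) * math.log2(n / firsts[prev])
    return h

print(conditional_entropy("abababab"))  # fully predictable sequence
print(conditional_entropy("aab"))       # after "a" comes "a" or "b" equally often
```

This says nothing about meaning on its own. It only quantifies how constrained the glyph sequence is, which is the kind of structure the quoted passage describes.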
I didn't say it was random. On the contrary, I'd expect it not to be. As you say, humans are bad at random, so if you told a human to sit down and write nonsense content in a nonsense script, you could probably expect some structure to emerge despite the lack of meaning. The writer might even accidentally inject unintentional interpretations by semi-consciously templating their work off of familiar texts.
Why not just copy sections of the Bible if it's just a sample of work? Why go through the extra effort of creating a new script and possibly a new language just to show how well you can write? Latin would have been the predominant language for manuscripts at this time, so writing in Latin would have been a better example of the writer's skills. These kinds of manuscripts took a lot of effort and resources to put together, so if someone was just trying to make a portfolio of their work, do you think they would need to produce such a large volume when several pages would have sufficed?
We use Latin in "Lorem Ipsum" because most people are unable to read it today. This makes sure that the viewer focuses on the page layout and images and not the text itself.
If it was written in Latin during an era where everyone could read Latin, it wouldn't be the same concept.
> Why not just copy sections of the Bible if it's just a sample of work?
It would probably be a bad idea to juxtapose religious text with the borderline alchemical pictures in the manuscript. People are/were a superstitious lot. Also, you wouldn't want to risk angering the church. Aesthetically it would also be a bad fit, because this (probably?) isn't a Bible. For the type of content on display here, it probably wouldn't do to remind the client of Sin during the pitch.
> [why] produce such a large volume when several pages would have sufficed?
Here's a similar volume of pages with no meaning but to showcase the capability for presenting a particular type of content: https://www.squarespace.com/templates Only instead of Photography, Online Stores, Blogs the categories are Herbal, Astronomical, Balneological, etc.
> Why?
I think we can be fairly certain that whoever produced the Voynich Manuscript was at least a little hmmm... eccentric? We should probably be moderate about how much sense we expect to make from it.
Maybe they were feeling creative? AFAIK many if not all of the plants shown are fictional, so the document being the product of a creative mind seems likely.
It wouldn't be hard to construct a scheme for using the manuscript as an oracle. A single "word" or "sentence" chosen by chance could only be interpreted since it would have no translation.
[0] http://inamidst.com/voynich/michitonese