Proceedings of the
1st International Conference on
Historical Cryptology
HistoCrypt 2018
Editor
Beáta Megyesi
Published by
cussion among the reviewers to synchronize recommendations, the final selection of
the papers was made by the program committee. The decision was not an easy task
due to many submissions with high scores and overall positive reviews, and the time
and space constraints of the two-day main conference. Our goal was to include
papers dealing with a wide variety of topics from various scientific areas of relevance
to historical cryptology. We aimed at achieving balance between regular and short
papers, and chose to accept papers—being regular or short—with high scores as oral
talks, while we were more lenient with poster presentations. 66% of the regular papers
and 33% of the short papers were accepted for oral presentations, while 22% of the
regular papers and 50% of the short papers were accepted as posters and/or demos. In
the final program, there are 16 regular and 5 short papers, all collected in this volume,
thematically structured, in the same order as they are presented during the conference.
In addition to the accepted papers, we are proud to present six invited keynote
speakers, distinguished researchers from France, Germany, Israel, and the US. They
cover different areas of the conference. Craig Bauer (York College of Pennsylvania)
presents highlights from his book on the world’s greatest ciphers. Katherine Ellison
(Illinois State University) talks about the central role of cryptology in the history of
reading. Rémi Géraud tells us about his and David Naccache’s work (École normale
supérieure) on a French code from the late 19th century. Ephraim Lapid (Bar-Ilan
University and IDC Herzliya) presents the thrilling story behind the British “Israeli
Enigma”.
Given the location of HistoCrypt at Uppsala University, special attention is
given to the heritage of Prof. Arne Beurling and his role in breaking the German
teletype ciphers. Therefore, we invited Kjell-Ove Widman, professor emeritus in ap-
plied mathematics (University of Linköping) to talk about Arne Beurling as a math-
ematician and code breaker, and George Lasry (University of Kassel) to tell us about
his very new methods to solve the “completely hopeless” T52, which Arne Beurling
worked on and which can be found in the archives of the Swedish National Defence
Radio Establishment (FRA). In connection with the poster and demo session, we
organize an exhibition to show four Enigma machines, usually hidden in the FRA
archive.
Lastly, the conference program also includes two workshops, organized by their
own committees. The workshop on (Automated) Cryptanalysis of Classical Ciphers
with CrypTool 2 demonstrates the open-source e-learning tool consisting of several
classical and modern cryptographic algorithms, where participants can learn and practice
how to use CrypTool. The workshop “Solving codes rather than ciphers. Is there
a software challenge?” focuses on codes, in which messages are encrypted word by
word, and aims at finding ways to break them.
These proceedings will provide a permanent record of the program. The conference
proceedings are published in the Northern European Association for Language
Technology (NEALT) Proceedings Series by Linköping University Electronic Press,
as freely available Gold Open Access. The proceedings are indexed in the DBLP
computer science bibliography and are also published in parallel in the anthology of the
Association for Computational Linguistics (ACL Anthology).
Organizing a conference with an interesting and diverse program in a highly cross-
disciplinary field is far from easy and relies on the goodwill of many researchers in-
volved in the various scientific areas, all with their own traditions. I would like to
express my gratitude and appreciation to my colleagues on the program committee
for their invaluable work, for fruitful discussions, and for sharing the effort of creating
the program. Special thanks go to the steering committee, especially Arno Wacker,
for his support and generous advice. I would also like to record my appreciation for
the work of the 23 reviewers and 4 subreviewers for their time and effort to contribute
to the reviewing, give constructive and collegial feedback, and help the program com-
mittee in the selection of papers. Wholehearted thanks go to the six keynote speakers,
and the workshop organizers, as well as Anders H. Wik and Åsa Ljungqvist for bring-
ing to light the Enigma machines and arranging the exhibition in connection to the
demo session. I would also like to thank all authors without whom this conference
would not have taken place! Nils Blomqvist deserves huge and special thanks for
professionally serving as the proceedings co-manager. My greatest debt is to the
local organizers: Eva Pettersson, for carrying the burden of the local organization,
and Bengt Dahlqvist, for helping out with the on-line registration and the conference
website. We are also extremely pleased to have received generous sponsorship from
the Swedish Foundation for Humanities and Social Sciences, which allowed free
registration and covered accommodation and travel costs for many conference participants.
Lastly, I am grateful to my nearest and dearest—my twins, bonus kids, and partner—
for generously giving me the space to disappear into our world of hidden secrets from
the past.
I wish you all a fruitful conference and hope you will enjoy HistoCrypt 2018!
Program Committee
• Beáta Megyesi (Program Chair), Uppsala University, Sweden
• Bernhard Esslinger, University of Siegen, Germany
• Otokar Grošek, Slovak University of Technology, Slovakia
• Benedek Láng, Budapest University of Technology and Economics, Hungary
• Mark Phythian, University of Leicester, UK
• Anne-Simone Rous, Saxon Academy of Sciences and Humanities, Germany
• Gerhard F. Straßer, Emeritus, Pennsylvania State University, USA
Steering Committee
• Arno Wacker, University of Kassel, Germany
• Joachim von zur Gathen, Emeritus, Bonn-Aachen International Center for In-
formation Technology, Germany
• Marek Grajek, Poland
• Klaus Schmeh, Private researcher, Germany
• Camille Desenclos, Université de Haute-Alsace, France
• Mans Hulden, University of Colorado Boulder, USA
Subreviewers
• Bradley Hauer, University of Alberta, Canada
• Saeed Najafi, University of Alberta, Canada
INVITED TALK:
Abstract
Craig’s book, Unsolved! The History and Mystery of the World’s Greatest Ciphers from An-
cient Egypt to Online Secret Societies, saw print a little over a year ago. Many updates can
now be made. The talk includes highlights from the book, progress that has been made on sev-
eral ciphers contained therein, and images of more historic unsolved ciphers, as challenges for
conference attendees.
Bio
Craig P. Bauer is professor of mathematics at York College of Pennsylvania and the editor-
in-chief of Cryptologia. He was the 2011-2012 Scholar-in-Residence at the National Security
Agency (NSA) Center for Cryptologic History and is the author of two books: Secret History:
The Story of Cryptology and Unsolved!: The History and Mystery of the World’s Greatest
Ciphers from Ancient Egypt to Online Secret Societies. His television appearances include the
mini-series The Hunt for the Zodiac Killer and two episodes of Codes and Conspiracies.
INVITED TALK:
Abstract
This presentation will explore the central role of cryptology in the history of reading, when lit-
eracy became a goal of the masses rather than a special skill reserved only for the educated elite.
Beginning in the seventeenth century, instructional cryptography manuals established the foun-
dational terms and methodologies of literacy training. Cryptologers including John Wilkins,
Gustavus Selenus, Gasparis Schotti, Noah Bridges, and John Falconer sought not only to edu-
cate the public in ciphering and deciphering but to establish multimodal habits of everyday lit-
eracy; they had a vision of the future of citizen literacy that resisted the dominance of alphabetic
reading and insisted that literacy must encompass alphabets as well as mathematics, algorithms,
scientific symbols, musical notation, visual images, and digital technologies (and they did use
the term “digital”, as in requiring the use of the digits). Cryptology also provided the framework
for teaching audiences how to see the ways in which the habits of printing, page layout, and the
physical materiality of books and paper all make meaning in relation to the symbols on the
page. Though their methods did not heavily influence eighteenth- and nineteenth-century edu-
cational theorists, the revival of cryptologic curiosity during World War I, in particular, brought
the seventeenth-century methods to the attention of figures like John Matthews Manly, Edith
Rickert, the Friedmans, and others. Riverbank Laboratory even began publishing primers for
teaching kindergarteners how to read – by teaching them the biliteral cipher of Francis Bacon.
Bio
Katherine Ellison is co-editor of A Material History of Medieval and Early Modern Ciphers:
Cryptography and the History of Literacy (2017) and author of A Cultural History of Early Mod-
ern English Cryptography Manuals (2016) and Fatal News: Reading and Information Overload
in Early Eighteenth-Century Literature (2006). Professor of English at Illinois State Univer-
sity, she has published widely on cryptology, media history, and literacy in Games and War,
Early Modern Trauma, Literature Compass, the Journal for Early Modern Cultural Studies,
the Journal of the Northern Renaissance, Book History, Eighteenth-Century Fiction, Educa-
tional Research, Academic Exchange Quarterly, Maternal Pedagogies, and Sex and Death in
Eighteenth-Century Literature. She is beginning a new collection with Medievalist Dr. Su-
san Kim on John Matthews Manly and Edith Rickert and a monograph on Fop Intelligence, an
investigation of cryptology and gender identity.
INVITED TALK:
Abstract
The Franco-Prussian war (1870-1871) was the first major European conflict during which ex-
tensive telegraph use enabled fast communication across large distances. Field officers would
therefore have to learn how to use secret codes. But training officers also raises the probability
that defectors would reveal these codes to the enemy. Practically all known secret codes at the
time could be broken if the enemy knew how they worked.
Under Kerckhoffs’ impulsion, the French military thus developed new codes, meant to
resist even if the adversary knows the encoding and decoding algorithms, but simple enough to
be explained and taught to military personnel.
Many of these codes were lost to history. One of the designs however, due to Major H.
D. Josse, has been recovered and this article describes the features, history, and role of this
particular construction. Josse’s code was considered for field deployment and underwent some
experimental tests in the late 1800s, the results of which were condensed in a short handwritten
report. During World War II, German forces got hold of documents describing Josse’s work,
and brought them to Berlin to be analysed. A few years later these documents moved to Russia,
where they have resided since.
Bio
David Naccache heads the ENS’ ISG. His research areas are code security, forensics, and the
automated and manual detection of vulnerabilities. Before joining ENS Paris (PSL) he was
a professor for 10 years at the Université Paris 2 (Sorbonne Universités). He previously
worked for 15 years for Gemplus (now Gemalto), Philips (now Oberthur) and Thomson (now
Technicolor). He is a forensic expert for several courts, and the incumbent of the Law and IT
forensics chair at EOGN. David is the inventor of 170 patent families and the author of 200
publications in information security and cryptography.
Dr. Rémi Géraud is a cryptologist, security researcher, and member of the Information Security
Group of École normale supérieure. His research interests include the mathematics of public-
key cryptographic protocols, information security, physical and network intrusion, defensive
design, and on a broader scale the economics and geopolitics of information.
INVITED TALK:
Abstract
From its early days, the Israeli military has developed a Signal Corps to provide effective and
secure communication for the needs of the defense establishment. From the start, there was good
cooperation between the functions of cyphering and deciphering, although they were conducted
by different organizations, to assure the security and reliability of military communications.
As soon as Israel gained its independence, it became a key target for British intelligence
collection. British espionage activities against Israel were coordinated from the Security Intelligence
Middle East (SIME) headquarters near Cairo, and later from Cyprus. The nascent Israeli cryp-
tography was of special interest for British intelligence, as Britain still maintained a substantial
military presence in the region, especially at the Suez Canal in Egypt and in Jordan. In the
early 1950s, British intelligence embarked on a covert operation aimed at giving them access to
Israel’s most secret communications. The Israeli Defense Forces (IDF) looked for an advanced
cypher machine to replace the hand-cypher, which caused bottlenecks of huge numbers of mes-
sages. Israel succeeded in purchasing from the United Kingdom 50 Enigmas in good order, and
believing in the Enigma’s invincibility, invested substantial effort and cost to transform them in
great secrecy into Hebrew. However, before these Enigmas were put to operational use, a warn-
ing was received from several Israelis, who were former members of Bletchley Park’s staff, on
British successes in cracking the Enigma during WWII. A decision was made to abandon the
Israeli Enigmas. Most of the Hebrew Enigmas were sadly destroyed; only one example was
retained and is today on display at the IDF heritage center.
In the British Bletchley Park team were several persons who later immigrated to Israel,
including Prof. Joseph Gilis, who founded the Department of Mathematics at the Weizmann
Institute and Dr. Walter Eytan, the First Director-General of Israel’s Ministry of Foreign Affairs.
Two other Jewish experts from South Africa, Shaul Bar-Levav and Meir Shapira, were the
founders of the IDF units of ciphering at the Signal Corps and deciphering in the Sigint unit.
Colonel Shaul Shamai, a prodigious decoder of Arabic codes, was the only soldier to be
decorated by an IDF Chief of Staff without having fought on the battlefield, a testimony to his
crucial contribution to deciphering key Arab cyphers.
Bio
Brigadier General (Res.) Dr. Ephraim Lapid is a lecturer at Bar-Ilan University and IDC
Herzliya, Israel. He served as a Senior Intelligence officer in the Israel Defense Forces (I.D.F.)
and was the I.D.F. Spokesperson and Instructor in the Israeli National Defense College. After
retiring from the Israeli Military, he was a senior official in the Jewish Agency.
INVITED TALK:
SPECIAL SESSION ON ARNE BEURLING
Arne Beurling:
Mathematician and Code Breaker
Kjell-Ove Widman
Professor Emeritus, Sweden
Abstract
Arne Beurling was a 34-year-old professor of mathematics at Uppsala University when the Second
World War broke out in 1939. He reported immediately to the Swedish SIGINT service
and was first entrusted with Soviet military codes which he helped solve, partly in cooperation
with Finnish colleagues. After the occupation of Norway in 1940, a hitherto unknown type
of encrypted traffic was picked up from telegraph cables running from Norway to Germany
through Sweden. Given the task of analysing the traffic, Beurling took the collected material
from two days in May and retreated to his office. Two weeks later he reappeared, having diag-
nosed the type of the transmission, deduced the ciphering algorithm, and found a way to attack
it. Special machines were built, and over a three-year period, more than 250 000 messages
sent between Berlin and the occupying forces in Norway were deciphered and forwarded to the
relevant Swedish authorities.
Beurling’s achievement is surely one of the more remarkable ones in the history of cryptography,
in particular since he worked with ciphered messages only and had no a priori knowledge
of the system. This talk will try to give a hint of his cryptanalytic work and the Swedish code
breaking effort during the war, as well as touch on his personality and his career as a mathe-
matician.
Bio
Kjell-Ove Widman has been professor of applied mathematics at the University of Linköping,
director of The Mittag-Leffler Institute of the Royal Swedish Academy of Sciences, and guest
professor at universities in Germany, Italy, Poland and the US. He became interested in cryp-
tology while doing his national service, and has worked on and off in the field since then,
consulting for governmental and private organisations and companies. He has also translated
books in mathematics and related fields.
INVITED TALK:
SPECIAL SESSION ON ARNE BEURLING
Abstract
The Siemens and Halske T52 is a family of teleprinter encryption systems, used in WWII by
the Luftwaffe, the German Navy and Army, and German diplomatic services. Codenamed
“Sturgeon” by the Allies, it was designed to provide enhanced security, compared to the other
German teleprinter encryption system, the Lorenz SZ42 (“Tunny”). In one of the most im-
pressive feats of cryptographic genius, the first model, the T52a/b, was reconstructed by Arne
Beurling only from encrypted traffic. It was also reconstructed at Bletchley Park. Until the end
of 1942, Sweden was able to read current T52 traffic that passed through its teleprinter lines,
taking advantage of errors by German operators (e.g., messages sent in depth). At the beginning
of 1943, Germany increased their security measures, also introducing a new model, the T52d.
The T52d was a much more secure system, featuring an irregular movement of the wheels, and
a “Klartext” (autokey) function. Sweden could not read its traffic, and a Bletchley Park report
from 1944 considers the T52d problem to be “completely hopeless”.
The T52 problem (when no depth is available) is still daunting today, even with modern
computing. Since WWII, no new methods for the cryptanalysis of the T52 have been published.
The machine complexity, and its huge keyspace size, 10^27, prohibit any brute-force attack. In
this presentation, George will describe how he applied a novel statistical approach, to decipher
rare original telegrams from 1942, encrypted using the T52a/b, and found in FRA archives.
Also, he will present a first-ever practical attack on the T52d and its successor, the T52e, which
takes advantage of a subtle weakness in the design of their stepping mechanism.
Bio
George Lasry specializes in the codebreaking of historical ciphers using modern optimization
techniques. He has developed state-of-the-art attacks for a series of challenging cipher machines
and systems. In 2013, he deciphered a collection of 600 original ADFGVX ciphertexts from
1918, which provide new insights into key events on the Eastern Front of WWI. In 2017, he also
reconstructed German diplomatic and naval codebooks and deciphered hundreds of encoded
messages from 1910 to 1915. Also, George has solved several public challenges, including
the Double Transposition challenge, Chaocipher Exhibit 6, the M-209 Challenge and the 2015
Enigma Challenge. George Lasry regularly writes about his findings in Cryptologia. The sub-
ject of his Ph.D. thesis is the Cryptanalysis of Classical Ciphers with Search Metaheuristics.
Contents
Preface
INVITED TALKS
Updates on the World’s Greatest Unsolved Ciphers
Craig Bauer
Cryptology and the Fantasy of Reading
Katherine Ellison
A French Code from the Late 19th Century
David Naccache and Rémi Géraud
The Israeli Enigma
Ephraim Lapid
Solving Classical Ciphers with CrypTool 2
Nils Kopal
Hidden Markov Models for Vigenère Cryptanalysis
Mark Stamp, Fabio Di Troia, Miles Stamp and Jasper Huang
WORLD WAR I
The Solving of a Fleissner Grille during an Exercise by the Royal Netherlands Army in 1913
Karl de Leeuw
Deciphering German Diplomatic and Naval Attaché Messages from 1914-1915
George Lasry
Learning Cryptanalysis the Hard Way: A Study on German Culture of Cryptology in World War I
Ingo Niebel
New Findings in a WWI Notebook of Luigi Sacco
Paolo Bonavoglia
WORLD WAR II
The First Classical Enigmas. Swedish Views on Enigma Development 1924-1930
Anders Wik
An Inventory of Early Inter-Allied Enigma Cooperation
Marek Grajek
The Poles and Enigma after 1940: le voile se lève-t-il?
Dermot Turing
US Navy Cryptanalytic Bombe – A Theory of Operation and Computer Simulation
Magnus Ekhall and Fredrik Hallenberg
What We Know About Cipher Device “Schlüsselgerät SG-41” so Far
Carola Dahlke
The Application of Hierarchical Clustering to Homophonic Ciphers
Anna Lehofer
Teaching and Promoting Cryptology at Faculty of Science, University of Hradec Králové
Michael Musílek and Štepán Hubálovský
Examining The Dorabella Cipher with Three Lesser-Known Cryptanalysis Methods
Klaus Schmeh
Design and Strength of a Feasible Electronic Ciphermachine from the 1970s
Jaap van Tuyll
CONFERENCE PROGRAM
Welcome Reception 18:00-21:00 at the Linnaeus Garden1, Sweden's oldest botanic garden.
Craig Bauer
Updates on the World’s Greatest Unsolved Ciphers
Ekaterina Domnina
Nicodemo Tranchedini's Diplomatic Cipher: New Evidence
Camille Desenclos
Unsealing the Secret: Rebuilding the Renaissance French Cryptographic Sources (1530-1630)
Juan José Cabezas, Joachim von zur Gathen, and Jorge Tiscornia
Uruguayan Cryptographic Carpet
Nils Kopal
Solving Classical Ciphers with CrypTool 2
1 http://www.botan.uu.se/our-gardens/the-linnaeus-garden/visit-the-garden/
12:30-13:30 Lunch
13:30-14:15 Keynote
Chair: Bernhard Esslinger
14:45-16:15 WW1
Chair: Marek Grajek
Karl de Leeuw
The Solving of a Fleissner Grille during an Exercise by the Royal Netherlands Army in 1913
George Lasry
Deciphering German Diplomatic and Naval Attaché Messages from 1914-1915
Ingo Niebel
Learning Cryptanalysis the Hard Way: A Study on German Culture of Cryptology in World War I
Paolo Bonavoglia
New Findings in a WWI Notebook of Luigi Sacco
16:45-17:30 Keynote
Chair: Benedek Láng
Katherine Ellison
Cryptology and the Fantasy of Reading
2 http://www.norrlandsnation.se/
Tuesday, June 19, 2018 MAIN CONFERENCE
09:00-09:45 Keynote
Chair: Otokar Grošek
Ephraim Lapid
The Israeli Enigma
10:15-12:00 WW2
Chair: Karl de Leeuw
Anders Wik
The First Classical Enigmas. Swedish Views on Enigma Development 1924-1930
Marek Grajek
An Inventory of Early Inter-Allied Enigma Cooperation
Dermot Turing
The Poles and Enigma after 1940: le voile se lève-t-il?
Carola Dahlke
What We Know about Cipher Device "Schlüsselgerät SG-41" so far
12:00-13:00 Lunch
Kjell-Ove Widman
Arne Beurling: Mathematician and Code Breaker
George Lasry
Modern codebreaking of T52
16:30-18:00 Poster and demo session with exhibition and coffee
Posters
Niels O. Faurholt
Willard's System
Anna Lehofer
The Application of Hierarchical Clustering to Homophonic Ciphers
Klaus Schmeh
Examining the Dorabella Cipher with Three Lesser-Known Cryptanalysis Methods
Demos
Nils Kopal
Solving Classical Ciphers with CrypTool 2
Extra demo by the organizers: Beáta Megyesi, Nils Blomqvist and Eva Pettersson
The DECODE database
18:00-18:15 Closing
3 http://www.fra.se/snabblankar/english.10.html
4 https://www.borgenuppsala.se/
Wednesday, June 20, 2018 WORKSHOPS
12:00-13:00 Lunch
Ekaterina Domnina
Department of Area Studies
Faculty of Foreign Languages and Area Studies
Moscow State Lomonosov University
ekaterina.domnina@ffl.msu.ru
Abstract
This paper discusses a newly identified letter, written by Francesco Sforza’s diplomatic
agent Nicodemo Tranchedini da Pontremoli (1413-1481). The aim of this paper is to
establish the date and purpose of this document and to offer a partial reconstruction
of the code Tranchedini used for it.
1 Introduction
In 1902 Nikolay Petrovich Likhachev (1862–1936), a Russian historian and antiquary, bought
an encrypted fragment (a postscript) of a diplomatic letter. Judging by the signature, a
certain Nicodemus wrote it from Florence on 23 February of an unknown year (The Scientific and
Historical Archive of the Russian Institute of History, Russian Academy of Sciences, Saint
Petersburg, Coll. 48, Box 585, no. 35, f. 1). Thus, Likhachev added one more artefact to his large
assemblage of cuneiforms, papyri and paper documents, which was one of the largest private
collections in Russia at the time (Figure 1).
The sellers of this particular letter, the Charavet family from Paris or some other
auction catalogue contributor, advertised it as a rare find to cast an additional light on the late
medieval diplomatic practice. Since the document bore only the day and month (23
February), but not the year, they dated it to the reign of the French king Louis XI (1423–1483).
They alleged that it reported on the diplomatic congress in Mantua, which was organised in
1460 by the pope Pius II to promote the idea of a new crusade (Figure 2).
Figure 1. The Scientific and Historical Archive of the Russian Institute of History, Russian Academy
of Sciences, Saint Petersburg, Coll. 48, Box 585, no. 35, f. 1r (with permission)
1450s he primarily resided at the papal court, and thus the letter should be attributed to the 1440s,
when he visited Florence as Sforza’s agent.
3 Tranchedini’s diplomatic cipher
In order to prove this hypothesis one should look at the code itself and search for its key. The most
obvious starting point would be the cipher collection of his son Francesco Tranchedini
(c.1441–c.1496), preserved in several copies. Mentored by Cicco Simonetta (1410-1480), the
ducal secretary, Francesco served the Sforza family alongside his father. He listed
Nicodemo’s cipher on fol. 3r of his treatise (Cerioni, 1970, II; Hoeflechner, 1970). As L.
Cerioni established (1970, I:6-7), it was employed from about 1471 until 1478. This
particular nomenclator consisted of 253 signs: 55 for letters, 12 for double letters, 8 for nulls,
65 for syllables, 113 for words.
When compared to the code of the letter in question, it did not match. This means that the
encoded postscript belonged to an earlier date. After that I could only hope for some good luck,
combined with a thorough check of Nicodemo’s letters from the 1440s-1450s. The
search through Tranchedini’s correspondence from the 1440s returned neither similar encoded
letters of his, nor the original letter to which this postscript belonged.
However, the search through Nicodemo’s letters from the 1450s bore some fruit. Not only did I
find a document with a cipher identical to the letter in question but also a partial deciphering
of the code (Archivio di Stato di Milano, Carteggio Visconteo Sforzesco, 41, no. 106, fol.
1, 11 March 1454, Rome). Judging by the handwriting, it is evident that Cicco Simonetta
deciphered this passage himself. His addition was then glued with wax over the ciphered
text (Figure 4). Taking this fragment as a starting point and using simple substitution
analysis, I was able to reconstruct the code (Figure 5).
This nomenclator consisted of 81 signs: 36 for letters, 4 for double letters, 1 for nulls, 30 for
syllables, 11 for words. It is incomplete since no other extant examples of this code seem to have
survived in Tranchedini’s correspondence from the 1450s in the State Archives of Milan. It is
also important to underline that certain signs from the 1453 letter had a different meaning
compared to that in the postscript. This means that the code was evolving over time.
However, since no other pieces of this code are available now, it is not possible to establish how
often Simonetta changed Tranchedini’s nomenclator.
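The reconstruction step described in this section, in which a passage already deciphered by Simonetta serves as known plaintext for a second text in the same cipher, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the sign labels (s03, s07, ...), the toy plaintext, and the helper functions build_key and apply_key are invented for illustration and reproduce neither Tranchedini's actual signs nor the author's working method. The sketch only shows the general idea of deriving a partial substitution table from an aligned ciphertext/plaintext pair, applying it to another text, and flagging signs whose meaning differs between passages.

```python
from typing import Dict, List, Tuple


def build_key(cipher_signs: List[str], plain_units: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Derive a partial sign -> plaintext table from an aligned known passage.

    Returns the table and a list of signs that occur with conflicting meanings
    (the first meaning seen is kept in the table).
    """
    if len(cipher_signs) != len(plain_units):
        raise ValueError("aligned passage must have equal lengths")
    key: Dict[str, str] = {}
    conflicts: List[str] = []
    for sign, unit in zip(cipher_signs, plain_units):
        if sign in key and key[sign] != unit:
            conflicts.append(sign)
        else:
            key[sign] = unit
    return key, conflicts


def apply_key(key: Dict[str, str], cipher_signs: List[str], unknown: str = "?") -> str:
    """Decode a sign sequence with a partial table; unknown signs are marked."""
    return "".join(key.get(sign, unknown) for sign in cipher_signs)


# Hypothetical known passage: five signs aligned with their plaintext units.
# A nomenclator sign may stand for a letter or a whole syllable ("zia").
crib_cipher = ["s07", "s12", "s03", "s12", "s21"]
crib_plain = ["v", "e", "n", "e", "zia"]
key, conflicts = build_key(crib_cipher, crib_plain)

# Hypothetical second text in the same cipher; s40 is not covered by the crib.
target = ["s03", "s12", "s40", "s07", "s12"]
partial = apply_key(key, target)
print(partial)  # ne?ve
```

Unknown signs are left marked rather than guessed, so each gap can be filled later from context, and the conflicts list gives a simple way to notice signs that carry different meanings in different letters.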
Figure 4. Archivio di Stato di Milano, Carteggio Visconteo Sforzesco, 41, no. 106, fol. 1v, 11 March
1454, Rome, a fragment (permission no. 4218/28.13.11/13, 24/2017 issued on 18.07.2017)
Figure 5. A reconstructed nomenclator for Tranchedini’s cipher from the Likhachev’s collection
4 Contents and dating of the document
When analysing the reconstructed contents of this postscript one could easily conclude that the
key figure to understand and interpret it is condottiere Francesco Piccinino (or Pitticino)
(c.1407–16 October 1449) (see Appendix 1).
When on 13 August 1447 Filippo Maria Visconti, the Duke of Milan, died and the
Ambrosian Republic was proclaimed, it continued its rivalry with Venice for the control
of the river Po valley. Condottieri Jacopo and Francesco Piccinini, as well as others, used these
adversities to enrich themselves by constantly switching sides in return for lucrative payments
from Milan and Venice. Francesco Sforza also participated in this worrisome diplomacy,
awaiting an opportunity to seize power from the republicans in Milan. On 18 October 1448 he
and Venice concluded an alliance at Rivoltella, which upset Francesco Piccinino’s plan to get
employment from Venice. Instead, in late autumn 1448 Piccinino allied himself with
Sforza to make him pay for his troops’ winter expenses, but then, in the following spring he
defected to Milan (Ferente, 2005:28).
It is most probable that this postscript belonged to the letter that Tranchedini wrote to
Simonetta in February of 1449 as he witnessed these events. At least, there is firm evidence that
in January of this year he negotiated with the Florentines to win their support for Sforza and
succeeded in this endeavour, providing him with 20,000 florins on their behalf
(Zaccaria, 2015:33-34, 371).
5 Conclusions
Firstly, the letter should be dated 23 February 1449 and was almost certainly intended for
Simonetta, who stood behind Tranchedini’s mission to Florence in January 1449. Secondly, it
is not yet clear where the main part of this letter is, if indeed it is preserved. Thirdly, the
document adds new data to the discussion about how Sforza prepared to conquer Milan, which he
did in early 1450. Finally, Tranchedini’s cipher in this postscript differs from the sophisticated
ciphers presented by his son in his cryptologic collection, which could indicate that there might
have been a significant difference between cryptologic theory and its practical
implementation in Renaissance Italy. The study of these differences could help to establish the
way the cryptographic methodology evolved not only in Italian states, but also in other European
countries, which followed their example.
Acknowledgments

I would like to thank the wonderful staff of the Scientific and Historical Archive of the Russian Institute of History, Russian Academy of Sciences, Saint Petersburg and Archivio di Stato di Milano for helping me with my research and for granting me permission to reproduce images from their collections. I am also grateful to the anonymous reviewers of the HistoCrypt 2018 team for their comments on this paper and to Dr Justine Roehmel for her continuous support of my research work.

References

Archivio di Stato di Milano, Carteggio Visconteo Sforzesco, 41, no. 106.

The Scientific and Historical Archive of the Russian Institute of History, Russian Academy of Sciences, Saint Petersburg, Coll. 48, Box 585, no. 35.

Augusto Buonafalce. 2008. Cicco Simonetta’s cipher-breaking rules. Cryptologia, 32(1):62-70.

Lydia Cerioni. 1970. La Diplomazia sforzesca nella seconda metà del Quattrocento e i suoi cifrari segreti. Vol. I-II. Il centro di ricerca, Roma.

Serena Ferente. 2005. La sfortuna di Jacopo Piccinino. Storia dei bracceschi in Italia (1423-1465). Olschki, Firenze.

Walter Hoeflechner. 1970. Hg. Diplomatische Geheimschriften. Codex Vindobonensis 2398 der Oesterreichischen Nationalbibliothek. Akad. Druck- u. Verl.-Anst., Graz.

Aloys Meister. 1902. Die Anfaenge der modernen diplomatischen Geheimschrift: Beitraege zur Geschichte der italienischen Kryptographie des XV. Jahrhunderts. F. Schöningh, Paderborn.

Paola Sverzellati. 1998. Per la biografia di Nicodemo Tranchedini da Pontremoli, ambasciatore sforzesco. Aevum, 72(2):485-557.

Raffaella Maria Zaccaria. 2015. Ed. Il carteggio della Signoria fiorentina all’epoca del cancellierato di Carlo Marsuppini (1444-1453). Inventario e regesti. Pubblicazioni degli archivi di Stato di Firenze. Strumenti CXCIX. Edifir-Edizioni, Firenze.

Appendix 1. [Nicodemo Tranchedini a Cicco Simonetta]

Post datuz come de missere Bossi?¹ in questa hora partendossi questo messo me, ha chiamato in piaza et dictomi, che fo pronte a la sua partita da Venezia, quando forono gelusi li capitoli fra veneziani et Frncesco Pitticino et che ebe Francesco deve havere quatro milla etc et non scrivere?, ne fare mostrare novanta dua millia d’oro o de provisone et de essere socio, che aquista de qua delta, excepta Piacenza, et deve essere con? quello ha et che aquista pe[r] di da Malatesta? adherente de venetiani. Et più me dice, che may hanno altro in boca conte? venetiani lo o non invetatessero[sic] fare avenenare o morire di qualche altro modi che non gli toglie per? altro che Sforza, ill. conte, ala qual semper me racommando. Florentiae, 23 febr. in mane ides? Nicodemus

The Scientific and Historical Archive of the Russian Institute of History, Russian Academy of Sciences, Saint Petersburg, Coll. 48, Box 585, no. 35. f. 1r

Translation

[Nicodemo Tranchedini to Cicco Simonetta]

P.S. From mister Bossi?, leaving at this hour, he sent for me, called me in the square and told me he was ready to leave Venice when the agreement between the Venetians and Francesco Pitticino was stalled and that Francesco had and should have four thousand etc. In addition, neither he should sign? [an agreement], nor demonstrate ninety two thousand in gold or under condition to be an ally, that he obtained at this delta [of Po], save for Piacenza, and he should keep those [lands] he already has, which he has just conquered from Malatesta?, the Venetians’ ally. In addition, he told me that never did the Venetians plot to poison or kill the count? by any other means and that they would not change Sforza, the illustrious count, for anyone else, to whom I always commend myself. Florence, 23 Febr. In the hand of Nicodemus

1 A question mark indicates signs whose meaning is not 100 per cent certain.
Unsealing the Secret: Rebuilding the Renaissance French
Cryptographic Sources (1530-1630)
Camille Desenclos
CRÉSAT (EA-3436)
Université de Haute-Alsace
camille.desenclos@uha.fr
part of the art of diplomacy, this information needed to remain secret. Otherwise it would help other countries to break the French ciphers. That can easily, though only partly, explain the silence of these theoretical works.

Cryptographic treatises could therefore be useful sources both to learn how Renaissance people understood the encryption process and how and why they used it. The first such work written in French was nevertheless published only at the very end of the 16th century (Vigenère, 1586). Blaise de Vigenère, like François Viète a few years after him, was in the service of the French King for several years. Their proximity to centers of power suggests an influence on, or even a participation in, the conception of ciphering tables. However, these treatises seem to have remained strictly theoretical, even though some rare implementations could be observed, at least in the 17th century (De Leeuw 2015). They conceived complex cryptographic systems, but they cannot be considered practical encryption manuals. Vigenère's work did indeed describe theoretical cryptographic mechanisms and tried to conceive of a perfect, unbreakable, and thus almost ideal cipher². Vigenère's work was clearly intended not for regular users but only for other scholars or scientists; noblemen and diplomats could not have used his proposals anyway, because of their lack of mathematical skills and their restricted writing time. Yet even if these works remained strictly theoretical and had no influence on cryptographic practice, Vigenère and Viète did know real-life cryptography, not as authors of cryptographic treatises but by working directly with the regular creators of ciphers while decrypting enciphered letters for the Duke of Nevers (Vigenère) or for the French King (Viète).

Several technical diplomatic treatises, however, such as Traicté des chiffres by Charles Brulart de Léon (circa 1630)³, intended to provide practical solutions to the issues of daily cryptographic writing, which needed to be fast and simple, after all. Through its many examples⁴, Brulart de Léon's treatise describes cryptographic mechanisms, recommended cipher-text characters, and so on. Following Cicco Simonetta's work, this treatise went further. It presents practical encryption processes which should enhance the protection of information, such as not leaving any space within the cipher-text; frequently using cipher-text characters without any value (so-called nulls) so that rarely used characters would not betray their value or nature; disguising the frequency of cipher-text characters, and so on. Brulart de Léon thus proposed concrete rules for encryption. By following these recommendations, the writer could hide the origin of his letter and prevent any interception. Brulart de Léon, like Cicco Simonetta before him, probably dedicated his work to the state office. As a former diplomat, Brulart de Léon⁵ claimed to take advantage of his own diplomatic experience and to propose various solutions to the main issues of the daily encryption practice that he himself had faced. But even if Brulart de Léon's recommendations paid better heed to concrete diplomatic needs, they remained complex, constraining and hardly compatible with the speed required by diplomatic writing.

Whatever its initial or real goal, Brulart de Léon's work has stayed off the record. Only the original handwritten version has been preserved, and no written or printed copy has apparently been produced. Furthermore, its form looks more like a personal memorandum: there is no introduction and no inscription; the work has been preserved in the same manuscript along with other personal notes and memorandums. Had Brulart de Léon's work been used by the state office, it would have been preserved with the state office archives. Anyway, just like any other theoretical cryptographic treatise, Brulart de Léon's manuscript highlights only one aspect of cryptographic practice. It describes the technical aspects (how to choose cipher-text characters

2 Blaise de Vigenère explained it quite clearly in his dedication to Antoine Séguier: “Ce traicté donques sera de semblables usages de chiffres, diversifiez en plusieurs manieres; tant pour incidemment parcourir ce qui se presentera à propos de ces beaux et cachez mysteres, adombrez sous l'escorce de l'escriture; que pour à l'imitation de cela en trasser beaucoup de rares et à peu de gens divulguez artifices […] et la plus grand' part provenans de nostre forge et meditation; non encore que nous scachions touchez jusques icy d'aucun” (Vigenère, 1586, page 4).

3 French National Library, fr. 17538, fol. 48sq.

4 Brulart de Léon's treatise, however, does not only present standard methods but also rare systems like a ciphering wheel.

5 Brulart de Léon was ambassador to the Republic of Venice from 1611 to 1620, then extraordinary ambassador to the city of Avignon (1625) and to Switzerland (1628-1630).
while conceiving ciphering tables, how to write cipher-text while drafting a letter) and tried to improve them in order to increase the protection of information. But it never questions general aspects: what kind of information has to be enciphered? Why? According to which principles? How were the ciphering tables used and how were encryption and decipherment operated?

3 What About Primary Sources?

Unlike contemporary documentation about cryptographic uses and practices, a substantial number of Renaissance French enciphered letters have been preserved. Fortunately, most of them have survived with their deciphered text (in margins, between the lines or on a separate sheet). However, even if letters, like no other sources, reflect perfectly the Renaissance cryptographic culture and practices, enciphered letters present historians with a major issue. Not all letters contain a decipherment or, at least, their separate deciphered text has been lost. That can prevent the reading and understanding of the content of such sources.

Upon receipt the state office systematically wrote the decipherment on the letter so that the state secretary could more easily read the whole piece. But if the recipient deciphered the letter himself, there was no need to rewrite the deciphered text on the original letter. The separate sheet on which the recipient processed the decipherment could easily be lost, destroyed or even integrated into another set of documents. The original enciphered letter thus becomes unreadable for historians without the ciphering table or cryptanalytic skills. Many letters have still kept their secrets⁶.

For several letters, however, their related ciphering tables still exist. Although they should have been destroyed at the end of each embassy or long-term correspondence, they have often been preserved, sometimes by the state office itself, and are now one of the most reliable sources on cryptographic uses and mechanisms. More than enciphered letters, ciphering tables make the understanding of the cryptographic systems and their contextual or structural adaptations easier. However, the ciphering tables have faced different fates and their identification can be tricky. They are rarely preserved within the same manuscript (or box) along with the related enciphered letters. For example, the cipher of Jean Hotman, the French resident to the Holy Roman Empire between 1609 and 1614, can be found in the French National Library within the manuscript fr. 4030. On the other hand, his enciphered correspondence with the French King is now kept in manuscripts fr. 15924 to fr. 15930. Moreover, the manifold Renaissance denominations for the ciphering tables (“jargon”⁷, “cipher”⁸, “key”⁹...) and the absence of any name and/or date on the verso or on the top of the ciphering tables seem to prevent historians from identifying the origins and usages of these tables. Even more, the Renaissance designations often mix tables and enciphered letters. Both can be designated as ciphers (“chiffres”)¹⁰. Thus identifying the right typology of the cryptographic sources and matching enciphered letters to their related ciphering tables are real issues for historians.

We have counted the cryptographic sources which were already described and identified at the French National Library. In 2014, the catalog mentioned only 60 ciphering tables, a great deal fewer than the actual holdings. In fact, only one fifth of the manuscripts are fully described in the catalog. In addition to cursory descriptions of the other manuscripts, some mistakes and omissions (some bibliographic records were written in the 19th century) have distorted these results, which do not represent the diversity of political and diplomatic sources. Most of the diplomatic correspondences, for example, are only described in a few words¹¹. As usual, primary

6 No letter from Henri IV to François Savary de Brèves, French ambassador to the Ottoman Empire, can be read, as the decipherment has not been written directly on the enciphered letters (French National Library, fr. 3541).

7 French National Library, Cinq-Cent Colbert 474, fol. 1: “Jargon au deschifre” [Deciphering jargon].

8 French National Library, fr. 4053, fol. 57: “Chiffre de monseigneur le marechal” [Cipher of M. the marshal].

9 French National Library, fr. 3629, fol. 42: “Clef pour deschiffrer les lettres de Madame de Raiz” [Deciphering key for the letters of Ms. de Raiz].

10 French National Library, fr. 3634, fol. 5: “Chiffre reçu le dernier octobre à Meun par le duc de Nevers” [Cipher which was received the last day of October in Meun by the duke of Nevers]. This cipher is thus a fully enciphered letter written to the Duke of Nevers.

11 The manuscript fr. 16113, for example, is labelled “Dépêches originales adressées à la Cour par divers ambassadeurs et agents français en Espagne” [Original letters from several French ambassadors and agents in Spain
to the French Court]. However, it contains a ciphering table of André de Cochefilet, baron of Vaucelas, ambassador to Spain from 1609 to 1615.

cryptographic sources are widely dispersed, poorly described or identified, or even completely missing.

Because of the need to access and analyze cryptographic primary sources materially and intellectually, a long-term research project is currently being conducted in collaboration with the French National Library in order to re-establish a direct contact with Renaissance French cryptographic sources. The French National Library counts as the main repository for political and diplomatic sources (until the mid-1620's). According to the Renaissance archival practices, almost all diplomatic correspondences and reports before 1626 have made their way to the French National Library, along with several political correspondences from the second half of the 16th century. Both kinds of sources are now preserved in the collections of “manuscrits français” [French manuscripts] and “nouvelles acquisitions françaises” [French new acquisitions]. The “collection d'érudits” [scholars' collection] presents exceptional documents, too¹². In fact, a major part of the French cryptographic sources (before the 1630's) is kept at the French National Library and forms a vast and representative corpus of Renaissance cryptographic uses and practices, even if further research in the French National Archives and in the French Diplomatic Archives will be mandatory. By studying the remaining ciphering tables, by observing actual encryption practices, by analyzing additions on the verso or top of ciphering tables, by comparing cryptographic systems, we hope to rebuild, at least partly, the Renaissance French cryptographic practices, uses and cultures. In that perspective, comparisons with other European cryptographic practices through case studies or similar projects, such as the one conducted by Benedek Lang (2018) on Hungarian Early Modern cryptographic practices, could lead to a useful, if not essential, distinction between European, “national” and contextual cryptographic patterns.

As a first step in this ongoing project, we are locating, identifying and dating every preserved ciphering table and enciphered letter. Every manuscript whose description suggests cryptographic documents (mention of original correspondences or reports) is systematically checked. So far, 179 ciphering tables and more than 2,100 enciphered letters (including circa 200 non-deciphered letters) have been found and described. At this stage, we presume that 50 non-deciphered letters, and probably more in the future, could be deciphered. Wherever possible, this identification work leads to the correction of some incomplete bibliographic records. In addition we mention the presence of cipher-text and/or decipherment, add dates and names if they can be restored, and so on. Nevertheless, only the bibliographic records which already present a full description will be corrected. The aim of this project is not to completely describe the political and diplomatic collections at the French National Library but to re-connect the cryptographic sources to each other.

Thanks to this identification work, we should be able to cross-check ciphering tables and enciphered letters and link the letters to their related tables. Each enciphered letter whose ciphering table has not yet been found will be compared to the anonymous ciphering tables. The cross-checking (through cryptographic systems and no longer by names) will not be successful for each enciphered letter. More enciphered letters from different writers than ciphering tables have been preserved. Nevertheless, some ciphering tables, if still missing, could be reconstructed thanks to the decipherment in the letters. By comparing the cipher-text and the deciphered text, the cryptographic patterns could be understood and restored to a great extent.

4 First Results

Our first results, though still incomplete, have confirmed the real necessity to embrace cryptographic sources as material objects and to look at them in a broader perspective. They are not only the implementation of cryptographic patterns, which could interest the history of sciences or technology, but a true testimony of a culture of political information. Facing only the technical mechanisms is not enough. Of course, both the history of technology and the history of sciences are essential to the understanding of the technical mechanisms and their evolutions. The cryptographic sources, however, deserve to be subjected to different approaches and methodologies in order to go beyond mere political history or the history of technology. More than any other political source,

12 The manuscript Clairambault 360, for example, preserves the ciphering table of Henri IV and Maurice of Hesse.
cryptographic ones do indeed involve political, diplomatic, scientific, social and cultural history.

Identifying cryptographic sources at the French National Library has required prior research. In order to prevent hypotheses based on better known, but modern, practices and uses, we first needed to reassess the Renaissance patterns: which words referred to ciphers and cryptographic practices? Did these denominations possess any specific value? Specific words can already be highlighted: “jargon”, “chiffre” [cipher], “clef” [key], “deschiffre” [deciphered text], “table”. The word “jargon” especially referred systematically to the same object and practice: a ciphering table using a substitution system, by words and not by characters (for example: the word rose for the French King). Such tables were never called anything else but “jargon”. On the contrary, the word “chiffre” had many uses. If the main use, in line with our modern practice, concerned ciphering tables, fully enciphered reports or anonymous letters were sometimes designated (on the verso) as “chiffres”, too. The origin of this confusion could be related to the use, by diplomats mostly, of the expression “en chiffre” [with cipher]. Moreover, if “deschiffre” is an early modern word for both decipherment and deciphered text, the cipher-text was hardly ever designated by “chiffre” but by “en chiffre” [with cipher/enciphered]. The denominations of cryptographic tools and productions were not yet standardized: marginal mentions rarely described the typology of documents in detail but provided names or dates¹³. These mentions aimed to make the identification of the document easier and quicker for the state office, the diplomats or more generally its recipient. Ciphering tables were often sent as attachments or handed over in person; there was then no need for any additional mentions. Finally, the expression “ciphering table” comes from modern usage. Renaissance cryptography was not yet practiced as an applied science. It still relied on a spontaneous approach, as shown by the alphabetical and thematic organization within ciphering tables. Everyone had to be able to use such tables, even without any cryptographic or algorithmic knowledge. Thus the distinction between ciphering tables and deciphering tables was probably spontaneous; both were indeed designated as “chiffre” only. In the future, however, we must find out if both ciphering and deciphering tables were conceived and written systematically, as they would be from Rossignol onward, or if the existence of one or the other relied on specific uses or users.

Beyond the denominations, defining similarities between the cryptographic processes is essential for the upcoming cross-checks. If the French cryptographic systems were mostly based on substitution, they became more complex and rational during the 16th century. At the beginning of the 16th century, ciphers presented only a few cipher-text characters (simple substitution and limited nomenclator). In the second half of the 16th century, though, especially from the 1580's, homophonic substitution was introduced (two to five cipher-text characters for one single plain-text letter), nomenclators were extended (around one hundred words on average) and new cipher-text characters appeared: characters without any value, canceling characters, repeating characters¹⁴. These improvements, however, did not go as far as the theoretical recommendations of cryptographers. A large majority of ciphers, even in the 1620's, still relied on homophonic substitution, much simpler for diplomats and noblemen.

The fast technical evolution of French cryptographic practices makes the identification of ciphering tables easier and lends our study an extra historical perspective. For each ciphering table, its main features are highlighted and analyzed: enciphering pattern (simple substitution, homophonic substitution, jargon, nomenclators), type of cipher-text characters (Latin and/or Greek alphabet and/or numbers and/or symbols). Most of the time, a careful analysis can lead to a precise dating (to within one or two decades at most). In addition to this first analysis, we take a closer look at the cipher-text characters as they give us the best clues for the cross-checking stage. In the same way that encryption patterns have been improved, the form of the cipher-text characters has become more and more rational and easy to generate so that they can be written and read faster. From symbols or highly-modified Latin characters¹⁵,

13 French National Library, fr. 3462, n.f.: “Chiffre reformé pour Levant duquel a esté envoyé un double à Monsieur de Breves ambassadeur en avril 1604” [Modified cipher for the Levant whose duplicate has been sent to M. de Breves, ambassador in April 1604].

14 See, for example, French National Library, fr. 3668, fol. 72: Cipher of the count of Tillières and the French King, 1625.

15 See, for example, French National Library, fr. 3329, fol. 2: Cipher of Jacques d'Humières and François de Balzac.
cipher-text characters were increasingly transformed into numbers. Symbolic characters, which are easy to spot, are thus a specific feature of the first half of the 16th century, even if, until the 1580's, some examples, mostly in political ciphers, can still be found. From the 1560's on, however, symbols gradually disappeared and were replaced by numbers or Latin or Greek characters. Therefore the presence of symbols within a cipher is a significant clue about the date or, for ciphers after 1590, a real specific feature. Nomenclators, finally, help to date the ciphering tables. The names within the nomenclators were indeed those of the main noblemen, ministers, clergymen or diplomats of a specific time. For example, the anonymous ciphering table in the manuscript fr. 3329 can be dated to 1574-1577 as the nomenclator includes the marshal of Montluc. Blaise de Montluc was appointed marshal in 1574 and died in 1577. Such a precise dating is of course not always possible, especially for the oldest ciphers which did not use large nomenclators. A date range can still be defined in those cases. Lastly, nomenclators, as well as the improvements of the cryptographic patterns and characters, shed light on the circumstances of their usage. A high proportion of foreign names reveals a diplomatic use, and if a country, for instance Spain, is more represented, it is highly likely the cipher was used by a French ambassador to Spain.

But whatever its rise, cryptography was not used in every Renaissance correspondence. The operation remained arduous both for writers and readers and was limited to what was considered as crucial or secret information. Thus the presence and amount of cipher-text reveal the significant political value of the text. However, this was not representative of a specific time. Fully enciphered letters were already written in the first half of the 16th century, and integral encryption never became a standard. In addition to being long and arduous to write, ciphers did not need to be systematic but, on the contrary, had to adapt as much as possible to the evolving contextual needs: diplomatic conflict, war, insecure postal routes, and so on. Information was not by nature secret; only the collecting of information and/or its use within a given context made its veiling inevitable. In the future, these usage hypotheses will require broader statistics, but the persistent general writing patterns seem logical for now. Moreover, the amount of cipher-text within a given letter (a few lines, one page or the whole letter) will help to better understand the needs for writing secret information.

Diplomacy was obviously the main, and incidentally the first, user of ciphers. The oldest enciphered letter which we have so far found in the French National Library dates from 1526 and comes from a French diplomatic agent in Rome¹⁶. The vast majority of enciphered letters from the first half of the 16th century and, beyond that, from the first decades of the 17th century come from diplomatic practice. However, if ciphers were an essential tool for diplomacy, they were not used systematically and on a daily basis. Not every diplomatic agent was provided with a ciphering table. Only the high-ranking diplomats possessed one or several such instruments. In fact, ciphering tables replicated the diplomatic hierarchy. On the contrary, the political use of ciphers was not based on hierarchy but only on needs, as it was not linked to professional or temporary tenure. Although ciphers were only used by French diplomacy during the first half of the 16th century, the French nobility also started to use ciphers during the second half of the 16th century. Noblemen did not yet use ciphers for personal matters (Lang, 2014) but only for political purposes. Far from being anecdotal, this use increased from the end of the 1570's and became more diversified in its practices and the origin of its users. That reveals how deeply ciphers were interwoven with the custom of political writing. The agitated political context in France substantially explains this evolution: noblemen were watched by the royal power, French or foreign factions were watching each other… Thus ciphers became essential to the political correspondence: they protected information, reputation and sometimes physical integrity. Further counts and identifications will place these examples from the 1580's in a broader perspective. The corpus that is finally expected should provide some more information about the proportion of each use (political or diplomatic). We hope thereby to be able to determine more precisely the motivations for the political use of ciphers and confirm, or invalidate, the current example of the 1580's. The political use of ciphers could be permanent or strictly limited to momentary needs (mostly during disorders).

16 French National Library, fr. 2984: letters from Nicolas Raincé to Anne de Montmorency. Nicolas Raincé was the secretary of Jean du Bellay, cardinal and French representative in Rome.
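
The nomenclator-based dating discussed in this section (the anonymous table in fr. 3329, dated 1574-1577 because it names the marshal of Montluc) can be read as an intersection of date ranges: each datable name restricts the period in which the table could have been drawn up. The short Python sketch below only illustrates that reasoning and is not the procedure used in the project. The function name `date_range_from_names` and the second range are hypothetical; only the 1574-1577 range comes from the text.

```python
from typing import Iterable, Optional, Tuple

# Inclusive range of years: (first_year, last_year).
YearRange = Tuple[int, int]


def date_range_from_names(ranges: Iterable[YearRange]) -> Optional[YearRange]:
    """Intersect the year ranges attached to the names found in a nomenclator.

    Each range is the period during which a person could be designated by
    the title written in the table. The result is the period compatible
    with every name, or None when the ranges do not overlap (or when no
    range is supplied).
    """
    ranges = list(ranges)
    if not ranges:
        return None
    start = max(first for first, _ in ranges)  # latest beginning
    end = min(last for _, last in ranges)      # earliest end
    return (start, end) if start <= end else None


# From the text: Blaise de Montluc was marshal from 1574 until his death in 1577.
montluc_as_marshal = (1574, 1577)
# Hypothetical second office-holder, invented purely for illustration.
hypothetical_office_holder = (1570, 1576)

print(date_range_from_names([montluc_as_marshal]))
# (1574, 1577)
print(date_range_from_names([montluc_as_marshal, hypothetical_office_holder]))
# (1574, 1576)
```

A table with only one datable name keeps that name's full range, while every additional datable name can only narrow it. An empty result would suggest a misidentified name or a table that was updated over time.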
Anyway, political ciphers were not as advanced as their diplomatic counterparts. The needs were very different: users were not “professional” agents; they had not been trained and this political use was still rather new. Above all the main need remained speed, before safety. Except in some rare cases like Vigenère who served the Duke of Nevers in the 1580's, cryptographers served the French King, not other noblemen. The reduced complexity of their ciphering tables was completely logical: there were more symbols; nomenclators were shorter, and homophonic substitution was less advanced. The differences between political ciphers, mostly the ones during the Catholic League, and diplomatic ciphers reveal how French diplomacy mastered the cryptographic practice and did its best to meet the agents' daily needs by constantly improving the protection of information and facilitating the encryption and decipherment operations. Renaissance French diplomacy acted like a laboratory in which ciphers and their implementation were constantly tested and improved. Its practices and patterns were then reused in wider circles, a few years or decades after their conception by French diplomacy.

5 Conclusion

Two essential elements have been highlighted during these first years of our research project: the need for a methodology which is adapted to the cryptographic features (in order to proceed successfully to the identification, analysis and cross-checking of our ongoing corpus) as well as the need for studying not only the ciphers but the general context in which they were employed. Although this research project is far from completion, some hypotheses can be stated about the general uses and issues of cryptography within the political and diplomatic society of the French Renaissance. Building on these first results and future findings, we aim to study the encryption mechanisms and their improvements until cryptography became an applied science. We will thus be able to observe the intellectual evolution of French cryptography: its increased use in political and diplomatic correspondences; the dichotomy between the practiced cryptography and its theory; the variations between the diplomatic or political cryptographic uses. We hope to highlight the adaptation of ciphers to geographical locations and/or to political and diplomatic context and finally to understand the refinement and complexity of cryptography as practiced in Renaissance France. We aim to bridge a substantial divide between the production and collection of information and the decision-making process: the material process of the writing of political information. From there we could rebuild a history of Renaissance French cryptography, not only in the perspective of the history of sciences but also as part of a global history of information.

Acknowledgments

Part of the project has been supported by a Mark Pigott Research Grant (2015). We would like to thank Gerhard F. Straßer and the anonymous reviewers for their valuable comments, suggestions and help with grammatical and lexical issues.

References

François de Callières. 1716. De la manière de négocier avec les souverains. La Compagnie, Amsterdam.

Jean-Pierre Devos. 1950. Les chiffres de Philippe II (1555-1598) et du Despacho universal durant le XVIIe siècle. Académie royale de Belgique, Bruxelles.

Alain Hugon. 2004. Au service du Roi Catholique, “honorables ambassadeurs” et “divins espions”: représentations diplomatiques et service secret dans les relations hispano-françaises de 1598 à 1635. Casa de Velázquez, Madrid.

David Kahn. 1996. The Codebreakers. Simon and Schuster, New York.

Benedek Lang. 2014. “People's Secrets: Towards a Social History of Early Modern Cryptography”. In The Sixteenth Century Journal. 45.2: 291-308.

Benedek Lang. 2018. “Real-Life Cryptology: Enciphering Practice in Early Modern Hungary”. In Katherine Ellison and Susan Kim (eds), A Material History of Medieval and Early Modern Ciphers: Cryptography and the History of Literacy. Routledge, New York/London, pages 223-240.

Karl de Leeuw. 2014. “Books, Science, and the Rise of the Black Chambers in Early Modern Europe”. In Anne-Simone Rous and Martin Mulsow (dir.), Geheime Post: Kryptologie und Steganographie der diplomatischen Korrespondenz europäischer Höfe während der Frühen Neuzeit. Duncker & Humblot, Berlin, pages 87-99.

Claire Martin. 2010. Mémoires de Benjamin Aubéry du Maurier, ambassadeur protestant de Louis XIII (1566-1636). Droz, Genève.
16
Jacques de Monts-de-Savasse, “Les chiffres de la Jean-Michel Ribera. 2007. Diplomatie et espionnage:
correspondance diplomatique des ambassadeurs les ambassadeurs du roi de France auprès de
d'Henri IV en l'année 1590”. In Pierre Albert (dir.). Philippe II du traité du Cateau-Cambrésis (1559)
1997. Correspondance jadis et naguère: congrès à la mort de Henri III (1589). Honoré Champion,
national des sociétés historiques et scientifiques. Paris.
Comité des travaux historiques et scientifiques.
Alain Tallon. 2010. L'Europe au XVIe siècle: États et
Paris, pages 219-228.
relations internationales. Presses universitaires de
Valérie Nachef, Jacques Patarin and Armel Dubois- France, Paris.
Nayt. 2016. “Mary of Guise's Enciphered Letters”.
Blaise de Vigenère. 1586. Traicté des chiffres ou
In Peter Y. A. Ryan, David Naccache and Jean-
secretes manieres d'escrire. Abel L'Angelier, Paris.
Jacques Quisquater (eds.). The New Codebreakers:
Essays Dedicated to David Kahn on the Occasion
of His 85th Birthday. Springer, Berlin, pages 3-24.
17
CRYPTANALYSIS OF CLASSICAL ALGORITHMS

Uruguayan cryptographic carpet

Juan José Cabezas, Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay, jcabezas@fing.edu.uy
Joachim von zur Gathen, B-IT, Universität Bonn, Germany, gathen@bit.uni-bonn.de
Jorge Tiscornia, Presidencia Uruguay, Montevideo, Uruguay, jtiscornia@presidencia.gub.uy
testimony of its type concerning this historic period.

The only other encryption methods by weaving or knotting that we are aware of are the Inca quipus and the encrypted quilts on the Underground Railroad for black slaves fleeing from the USA to Canada in the 19th century (see Tobin (1999)).

The Tupamaros were inspired by Dickens' A Tale of Two Cities (Dickens, 1859). It distills the atmosphere in pre-revolutionary France around 1789. A tough tavern owner, a central character in the book, works as the eyes and ears of the pre-revolution. She diligently knits accounts of evil persons, their deeds, and of spies into her knitwear, dreaming of future vengeance. “Knitted, in her own stitches and her own symbols, it will always be as plain to her as the sun.” For any malefactor, it would be easier “to erase himself from existence, than to erase one letter of his name or crimes from the knitted register of Madame Defarge.” Dickens says nothing about her encryption method, but these words were enough to fire the penitentiary inmates' imagination and inspire the idea of their carpet. Dickens' poetic introductory sentence refers to the French Revolution, but it describes the Tupamaros' situation as well: “It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way.”

2 The encryption method

In the following, we distinguish typographically between ciphertext and plaintext.

The coding method, communicated to us by Ricardo García, uses a simple substitution on 18 letters. Each letter is encoded by a pair of horizontally adjacent colors, allowing six colors for the first item and three for the second one. The encoding goes as follows:

first color:   O  G  B  L  W  P   | second color
               M  A  R  K  O  S   | O
               D  I  N  T  E  L   | G
               J  U  P  V  H  F   | B
Table 1: The code.
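The lookup in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `FIRST_COLORS`, `ROWS`, `encode`, and `decode` are hypothetical, and the test input is the paper's own OLAKETAL example from this section.

```python
# Table 1 as a lookup: columns are indexed by the first color of a pair,
# rows by the second color (only O, G, B occur in second position).
FIRST_COLORS = "OGBLWP"
ROWS = {"O": "MARKOS", "G": "DINTEL", "B": "JUPVHF"}

PAIR_TO_LETTER = {
    (first, second): letters[i]
    for second, letters in ROWS.items()
    for i, first in enumerate(FIRST_COLORS)
}
LETTER_TO_PAIR = {letter: pair for pair, letter in PAIR_TO_LETTER.items()}


def encode(text):
    """Map letters of the reduced 18-letter alphabet to color pairs."""
    return [LETTER_TO_PAIR[ch] for ch in text.upper()]


def decode(pairs):
    """Map (first color, second color) pairs back to letters."""
    return "".join(PAIR_TO_LETTER[pair] for pair in pairs)


# The color pairs of the paper's example phrase.
example = [("W", "O"), ("P", "G"), ("G", "O"), ("L", "O"),
           ("W", "G"), ("L", "G"), ("G", "O"), ("P", "G")]
print(decode(example))
```

Letters outside the 18-letter alphabet raise a `KeyError` here; the phonetic merging described in the paper (for example c to K or S) would have to happen before `encode` is called.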
The code uses the following six colors:

color                  abbr
orange, yellow         O
dark green             G
beige, pink, ochre     B
light green            L
white, light gray      W
purple                 P
Table 2: The colors.

The lines stand for Marcos, (door) lintel, and JUP (see above). VHF presumably is a filler with the remaining letters and unlikely to refer to very high frequency. The system thus employs a reduced alphabet of 18 letters: A, D, E, F, H, I, J, K, L, M, N, O, P, R, S, T, U, V. Taking into account the phonetics and orthography of the Latin American Spanish language, some of them represent more than one letter:

• K represents the letter k, and qu, and also c when pronounced like k.
• V represents v and b.
• S represents s, and also c when pronounced like s.
• J represents j and g.
• I represents i and y.

It was not meant as a simplified orthography of Spanish, but can be taken as such. (In today's text messages in Spanish, K is often used for qu.) For example, the list of color pairs

(W,O),(P,G),(G,O),(L,O),(W,G),(L,G),(G,O),(P,G)

encrypts the text OLAKETAL. Interword spaces, the (silent) initial H, and punctuation marks are not present, and the list represents the phrase Hola que tal (Hi, how are you?).

The absence of spaces, accents, and punctuation marks means that a string of colors may represent more than one grammatically and semantically valid text. In our decipherment, such ambiguities were an obstacle, and this might be worse if this method was used elsewhere without the a priori knowledge we had about the context.

3 The current state of the carpet

The carpet measures 55.3 × 36.8 cm and contains almost 13 000 knots for about 6400 encrypted letters, arranged in a matrix of 67 rows and 96 columns. Moths have totally or partially damaged about 10% of the color pairs, and the borders are frayed, as is visible in the figures.

We distilled from a high-resolution digital image of the carpet a machine-readable matrix of colors. Our software then transformed the resulting RGB values to one of the six colors. This had to be done in a robust way so that similar colors were transformed to the same value, but distinct colors were properly distinguished. Then adjacent pairs of knots were deciphered as individual letters, proper word separations introduced, and the final decipherment produced. This was less than straightforward, and we encountered the following problems.

1. Damage to the carpet.
2. About 5% of the colors ran into adjacent knots, affecting their (automatic) legibility.
3. After 35 years, colors have degraded. In some parts, it is difficult to distinguish between white and ochre, and between light and dark green.
4. The reading ambiguities mentioned above sometimes make interpretation difficult. For example, the string AKNTPERONOA is to be read as a CNT pero no a . . . (to CNT but not to . . . ) where CNT is the Convención Nacional de Trabajadores (National Convention of Workers).

4 Decryption

The digitized image is presented in Postscript, which is then converted to plaintext. We illustrate the process on the carpet's second line, magnified in Figures 3 through 5 and its decryption in Figure 6.

Each dot of color is given as an RGB (Newman and Sproull, 1983) triple of red, green, and blue values, each ranging from 0 to 25.

We took samples from various sections of the carpet and determined the range of RGB values for each color, and also the fractions of these values, in order to be able to account for dark or light sections. That is, 6 7 8 and 7 8 9 represent the same hue, the latter slightly lighter than the former. Overlap of these values and fractions occurred mainly for G (dark green) and L (light green), and for B (beige) and W (white). Since G and B occur more frequently in the carpet than L
Figure 2: The bottom left shows moth damage and lost material on the fringes.
Figure 3: The second line, left part. It starts at top left and ends at bottom right.
GO BO GO PB GO PG WG BO WO PG GB WG OB WO GG OG
a r a f a l e r o l u e j o i d
GO LB GG WG OB WO PO PO WG BB BO WO OG GB PO WG
a v i e j o s s e p r o d u s e
GO BG G? ?? LO GG GO OG GB BO GO BG LG WG
a n a r k i a d u r a n t e
Figure 6: Part of the second line transcribed (upper line) and decrypted (lower line).
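The transcription in Figure 6 can be reproduced with a small sketch that also handles unreadable knots. It assumes the "?" marker used in the figure for a damaged knot and borrows the candidate-set notation of Section 5 (three candidates when the second color is missing, six when the first is missing, "*" when both are). The function names are hypothetical, and this is not the authors' software.

```python
FIRST_COLORS = "OGBLWP"                      # column order of Table 1
ROWS = {"O": "MARKOS", "G": "DINTEL", "B": "JUPVHF"}


def decode_pair(pair):
    """Decode one two-character color pair; '?' marks an unreadable knot."""
    first, second = pair
    if first == "?" and second == "?":
        return "*"                           # nothing recoverable
    if second == "?":
        col = FIRST_COLORS.index(first)
        # First color known: one candidate from each row of Table 1.
        return "(" + "".join(ROWS[r][col] for r in "OGB") + ")"
    if first == "?":
        return "(" + ROWS[second] + ")"      # second color known: a whole row
    return ROWS[second][FIRST_COLORS.index(first)]


def decode_line(line):
    """Decode a whitespace-separated string of color pairs."""
    return "".join(decode_pair(p) for p in line.split())


third_row = "GO BG G? ?? LO GG GO OG GB BO GO BG LG WG"
print(decode_line(third_row))
```

A second color other than O, G, or B raises a `KeyError`, which corresponds to the "invalid dominant color" case that the paper reports separately.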
and W, respectively, we opted to use the former in ambiguous cases.

The next step was to convert these color values into text. We covered each knot in the carpet by small horizontal rectangles, whose width was about that of the whole knot. Our matrix algorithm (Foley et al., 1990) scanned each rectangle and looked for a dominant color among the rectangles of each knot. If a dominant color was found, it was considered the color of the knot. In a pair of adjacent knots starting in an “even” position, there are six possible colors for its first member and three for its second one. Invalid dominant colors, damaged areas, and the absence of a dominant color are also reported.

In cases of doubt, we also employed a vector search using vertical vectors in the middle of the knot. If two or more occurrences of a valid color were detected, then four vertical vectors determined its dominance in the knot.

In the end, we obtained a sequence of knot colors, with numerous unresolved cases.

Bypassing the intermediate steps explained below, we take the carpet's second line as an example in Figures 3 through 6.

5 Recovering the plaintext

In another example, from the carpet's third line, we illustrate some of the steps from the raw identification of letters, with many unclear positions, into plaintext. This was no easy task and required substantial manual intervention. We first discovered some obvious plaintext snippets, then searched for valid words before and after such pieces, and in the case of damaged knots, had to visually inspect the carpet.

Letters without parenthesis or bracket are considered correct by the software. * is an unrecognized letter, (MARKOS), (DINTEL), and (JUPVHF) show six possible choices, and similarly (MDJ), (AIU), (RNP), (KTV), (OEH), and (SLF) correspond to three possibilities. These indicate that the first or second color of a pair, respectively, may be damaged. [c1|c2 means that letter c1 seems more likely than letter c2 in this place, and < c > indicates the letter c, but with low probability.

So here are the steps from the original letters to plaintext. Lower case letters in the fourth text present guessed corrections of the automatic color readings.

N[U|*L(SLF)ARD(DINTEL)SARROLLOIDAR
ESPLIKASION[I|*K[D|*EP*OTA
TRESTENDENSIAAN<I>TIM<D>
LN[a|*(KTV)J[N|*FALSO*

N[U|*L(SLF)AR DESARROLLO I DAR
ESPLIKASION [I|*K[D|*EP*OTA
TRES TENDENSIA AN<I>TIM<D>LN[a|
(KTV)J[N|* FALSO

N[U|*L(SLF)AR DESARROLLO I DAR
ESPLIKASION [I|*K[D|*EP*OTA
TRES TENDENSIA ANTI MLN [a|
(KTV)J[N|* FALSO

pULSAR DESARROLLO I DAR
ESPLIKASION DErrOTA
TRES TENDENSIA ANTI MLN
KoN FALSO

. . . pulsar desarrollo y dar explicación derrota. Tres - tendencia anti-MLN con falso . . . (. . . further the development and give an explanation of [our] defeat. Third: anti-MLN tendency with false . . . )

A translation into plaintext would have been hard without the personal acquaintance of Tiscornia with the situation and political context of the MLN and the penitentiary at Libertad around 1980.

6 Conclusions

We have recovered the plaintext of about 95% of the carpet; only small parts of it are damaged beyond recognition. The carpet is now becoming a valuable testimony of Uruguayan and Latin American history.

From a cryptographical point of view, the following aspects are particularly interesting:

• The inmates have succeeded in encrypting a substantial amount of information by means of material accessible to prisoners in the penitentiary.
• The construction of the carpet presumably involved a large number of hours, but then, time is the one thing that is abundant in prison.
• The prisoners used a simple coding mechanism in a clever way which even fooled the exit checks at the prison.
In terms of cryptanalytic techniques, our task was trivial once the sequence of colors was established. However, given just that sequence with its many errors and ambiguities, it is not clear how easy this task would have been without the knowledge of the encryption in Tables 1 and 2.

In synthesis, we have an original piece of cryptography, well conceived and well implemented. It was secure, efficient, and economic under the dire circumstances of the penitentiary. The fact that we could decipher it 35 years later shows the success of their method.

7 The first lines of the computer-generated transcription.

The first four lines of the transcription read as follows:

(DINTEL)MD[A|*FA(DINTEL)*E*OLUEJO
IDAVIEJO(MARKOS)SP<M>RODUSE
ANUJKIAD(AIU)RA(DINTEL)TETRESA(RNP)
(DINTEL)OSPORKAUSASUNO(SLF)

[*|*E(MARKOS)II<*>DI(KTV)
(JUPVHF)(DINTEL)FIATS
APO<*>LITIKAI<P>UER
SONA(DINTEL)AT(JUPVHF)
DONI(KTV)E(DINTEL)
I[I|*O<*>SINKANASDI<A>RIJE<
T>ITESN<*>EIM(SLF)

N[U|*L(SLF)ARD(DINTEL)SARROLLOIDAR
ESPLIKASION[I|*K[D|*EP*OTATR
ESTENDENSIAAN<I>TIM<D>L
N[a|*(KTV)J[N|*FALSO*

DP[R|*SLE<I>NINDESTRUIENDO
ENVE(MARKOS)
DEELEV[N|*V[R|*T<*>ODOESTOK
ON(JUPVHF)OKAKOM<*>UNIA<I>D
[A|*SIONITE(DINTEL)SIO*

Distilling cleartext from this is not always obvious.

8 Parts of the carpet's cleartext.

We present the initial part of the cleartext. The opinions and political points of view expressed in this document do not necessarily reflect, in any way, those of the authors or their institutions. Comments between brackets are the authors'.

A complete version of the cleartext (in Spanish) can be found at https://www.fing.edu.uy/~jcabezas/papers/ElTapizMLN2015.pdf.

[First line.] ...ona: sólo tu debe conocer vía y forma. Traduce esto, al final te aclaro. Hazlo llevar ...ama Falero.

[Introduction.] Luego ida viejos se produce anarquía durante tres años por causas:

1. pérdida confianza política y personal a todo nivel,
2. incapacidad dirigentes de impulsar desarrollo y dar explicación derrota y
3. tendencia anti-MLN con falso marxismo-leninismo, destruyendo en vez de elevar, todo esto con poca comunicación y tensión represiva.

Arriba hay confianza y crece, fierreros y divisionistas retroceden.

El correcto marxismo-leninismo va mas lento, dirección autocrítica MLN, un paso necesario y defectuoso.

[Previous events.] Luego del año 55, la izquierda marxista será determinada por dos hechos:

1. lucha de clases desatada por crisis económica
2. discusión ideológica internacional entre vía violenta o pacífica al socialismo.1

1 [M]ona: only you must know the method and form. Translate this, at the end I explain it. Take it [. . . ama] Falero.
After the older leaders left, we had anarchy during three years for various reasons: 1. loss of political and personal confidence at all levels, 2. inability of the leaders to further development and explain our defeat, 3. anti-MLN tendency with false marxism-leninism, destroying rather than elevating, all this with little communication and repressive tension.
On the higher floors [where the leaders were housed, floors 3 to 5] we have growing confidence, [but] warriors [who want to continue the armed struggle] and divisionists [who prefer a political party for the struggle] retreat. The correct marxism-leninism goes more slowly, in the direction of MLN self-criticism, a necessary step that is missing.
Since 1955, the marxist left has been determined by two facts: 1. class struggle unleashed by the economic crisis, 2. international ideological discussion between violent and peaceful road to socialism.

Acknowledgements

Alfredo “Tuba” Viola brought the authors together during a course given by the second author in Montevideo. Without his support, this paper would not have come into being, and we thank him for it.

About the authors. JJC is a professor of computer science at the Instituto de Computación in the Universidad de la República, Uruguay. He was
severely injured in 1970 while manufacturing a
bomb in his workshop and fled the country, hidden
in the trunk of a car. JvzG is an emeritus professor
of computer science at the Universität Bonn, Ger-
many, and has no experience in building bombs.
JT (Jorge Carlos Tiscornia Bazzi) is a Uruguayan
writer and was a member of the Tupamaro Columna
15, together with JJC. He now works at the Presidencia
Uruguay. During his 4646 days in the penitentiary
of Libertad, from 1972 to 1985, he kept
a secret diary on small slips of paper, normally
used to roll cigarettes. He hid them in wooden
clogs that he used in the shower. They are now
published (Tiscornia (2012)) and provide moving
insights into the (in)human conditions in prison,
see also Tiscornia (2014). They were turned into a
documentary movie (Charlo (2014)).
References
José Pedro Charlo. 2014. El almanaque. Documentary
movie. Argentina, Spain, Uruguay.
Charles Dickens. 1859. A Tale of Two Cities. Chap-
man & Hall, London.
James D. Foley, Andries van Dam, Steven K. Feiner, and
John F. Hughes. 1990. Computer Graphics: Principles
and Practice. Addison-Wesley.
Solving Classical Ciphers with CrypTool 2
Nils Kopal
Applied Information Security – University of Kassel
Pfannkuchstr. 1, 34121 Kassel, Germany
nils.kopal@uni-kassel.de
to break a substitution cipher aims at recovering the original letter distribution.

Homophone substitutions as well as polyalphabetic substitutions flatten the distribution of letters and thereby aim to destroy the possibility of breaking the cipher with statistics. Nevertheless, with enough ciphertext and sophisticated algorithms, e.g. hill climbing and simulated annealing, it is still possible to break them.

Transposition ciphers can also be attacked with the help of statistics. Since transposition ciphers do not change the letters, the frequencies of the unigrams in plaintext and ciphertext are exactly the same. Thus, text statistics of higher orders (bigrams, trigrams, tetragrams, or n-grams in general) are used to break them. Besides that, similar sophisticated algorithms, e.g. hill climbing and simulated annealing, are used to break transposition ciphers.

For breaking a classical cipher, it is useful to know the language of the plaintext. It is possible to break a cipher using a “wrong” language, but the correct one yields a higher chance of success. For cryptanalysis, most of the algorithms implemented in CT2 contain a set of multiple languages, e.g. English, German, French, Spanish, Italian, Latin, and Greek. In many cases, the language of an encrypted book is known to the cryptanalyst or can be guessed from its (historical) context.

To identify the type of the cipher, whether it is a substitution cipher or a transposition cipher, cryptanalysts use the Index of Coincidence (IC) (Friedman, 1987). The IC, invented by William Friedman, is the probability that two letters drawn at random from a text are identical. For English texts the IC is about 6.6% and for German texts about 7.8%. Simple monoalphabetic encryption, where a single letter is replaced by another letter, does not change the IC of the text. The same applies to all transposition ciphers, since these do not change the letter frequencies. Polyalphabetic substitution aims at changing the letter distribution of a text to become the uniform distribution. Thus, the IC is about 1/26 ≈ 3.8% (where 26 is the length of the ciphertext alphabet and all letters are used equally often). Homophone substitution also aims at changing the letter distribution of a text to become the uniform distribution, but here the IC is about 1/n, where n is the number of different symbols in the text.

Thus, an IC close to 6.6% (English) or 7.8% (German) indicates that we have either a plaintext, a monoalphabetic substituted text, or a transposed text, and the value hints at the language. On the other hand, an IC close to 3.8% indicates that we have a polyalphabetic encrypted text. Clearly, the IC is more accurate for long ciphertexts. Homophone ciphers can be identified by counting the number of different letters or symbols used. If the number is above the expected alphabet size, it is probably a homophone substitution.

The state of the art for breaking classical ciphers are search metaheuristics (Lasry, 2018). Because with classical ciphers a “better guessed key” often yields a “better decryption” of a ciphertext, such algorithms are able to “improve” a key to come close to the correct key and often finally reveal the correct key. “Better” in this context means that the putative plaintext obtained by decrypting a given ciphertext is rated higher by a so-called cost or fitness function. An example of such a function is the aforementioned IC, which comes close to a value indicating natural language when the key comes closer to the original one. A common and very successfully used search metaheuristic is hill climbing. A hill climbing algorithm first randomly guesses a putative “start key”. Then, it rates its cost value using a cost function. After that, it tries to “improve” the key by randomly changing elements of the key. With the Vigenère cipher, for example, it would change the first letter of the keyword. After changing the letter, it again computes the cost function. If the result is higher than for the previous key, the new key is accepted. Otherwise, the new key is discarded and another modified one is tested. The algorithm performs these steps until no new modified key can be found that yields a higher cost value, i.e. the hill (= local maximum) of the fitness score is reached. Most of our classical cryptanalytic implementations in CT2 are based on such a hill climbing approach.

3 An Introduction to CrypTool 2

CrypTool 2 (CT2) is an open-source tool for e-learning cryptology. The CrypTool community aims to integrate into CT2 the best known and most powerful algorithms to automatically break (classical and modern) ciphers. Additionally, our goal is to make CT2 a tool that can be used by everyone who needs to break a classical cipher. Another well-known Windows analyzer for classical ciphers is CryptoCrack (Pilcrow, 2018).
CT2 consists of a set of six main components:
the Startcenter, the Wizard, the WorkspaceMan-
ager, the Online Help, the templates, and the Cryp-
Cloud, which we present in detail in the following.
The Startcenter is the first screen appearing
when CT2 starts. From here, a user can come to
every other component by just clicking an icon.
The Wizard is intended for CT2 users that are
not yet very familiar with the topics cryptography
or cryptanalysis. The user just selects step by step
what he wants to do. The wizard displays at each
step a small set of choices for the user.
The WorkspaceManager is the heart of CT2
since it enables the user to create arbitrary cas-
cades of ciphers and cryptanalysis methods us-
ing graphical icons (components) that can be
connected. To create a cascade, the user may
drag&drop components (ciphers, analysis meth-
ods, and tools) onto the so-called workspace. Af-
ter that, he has to connect the components using
the connectors of each component. This can be
done by dragging connection lines between the inputs (small triangles) and outputs (also small triangles) using the mouse. Data in CT2 can be of different types, e.g. text, numbers, binary data. The type of data is indicated by a unique color. A simple rule is that connections between the same colors are always possible. Connections between different colors (data types) may also be possible, but then data has to be converted. CT2 can do this automatically in many cases, but sometimes special data converters are needed.

Figure 1 shows a sample workspace containing a so-called Caesar cipher (very simple monoalphabetic substitution) component, a TextInput component enabling the user to enter text, and a TextOutput component displaying the final encrypted text. The connectors are the small colored triangles. The connections are the lines between the triangles. The color of the connectors and connections indicates the data types (here text). When the user wants to execute the flow, he has to start it by hitting the Play button in the top menu of CT2.

Figure 1: CT2 Workspace with Caesar Cipher

Currently, CT2 contains more than 160 different components for encryption, decryption, cryptanalysis, etc. Many components that can be put onto the workspace have a special visualization that can be viewed when opening the component by double clicking on it. Figure 2 shows such a maximized visualization of a standard component.

CT2 contains a huge Online Help describing each component. By pressing F1 on a selected component of the WorkspaceManager, CT2 automatically opens the online help of the corresponding component.

CT2 also contains a huge set of more than 200 so-called Templates. A template shows how to create a specific cipher or a cryptanalytic scenario using the graphical programming language and is ready to use. The Startcenter contains a search field that enables the user to search for specific templates using keywords.

Finally, the CrypCloud (Kopal, 2018) is a cloud framework built in CT2. We developed it as a real-world prototype for evaluating distribution algorithms for distributed cryptanalysis using a multitude of computers.

4 A Step-by-Step Approach for Analyzing Classical Ciphers in CrypTool 2

In this section, we show a step-by-step approach for analyzing classical ciphers in CT2. The first step is to make the cipher processable for CT2, so we create a digital transcription of the ciphertext. Then, we identify the type of the cipher. The third step then finally breaks the cipher with CT2.
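The second step, identifying the cipher type, can be illustrated with the Index of Coincidence from Section 2. The following is a minimal sketch, not CT2 code: the function names are hypothetical, and the 5.2% cut-off (roughly midway between the 3.8% and 6.6% quoted in the paper) is an assumption of this example.

```python
from collections import Counter


def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are identical."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def guess_cipher_type(text, alphabet_size=26):
    """Rough heuristic built on the thresholds quoted in the text."""
    symbols = {c for c in text.upper() if not c.isspace()}
    if len(symbols) > alphabet_size:
        # More distinct symbols than the alphabet has letters.
        return "homophonic substitution?"
    # Assumed cut-off between ~3.8% (uniform) and ~6.6% (English).
    if index_of_coincidence(text) >= 0.052:
        return "plaintext, monoalphabetic substitution, or transposition?"
    return "polyalphabetic substitution?"


print(round(index_of_coincidence("ATTACK AT DAWN ATTACK AT DAWN"), 4))
```

As the paper notes, the estimate is only reliable for long ciphertexts; on short inputs such as the one printed here the value is far from the language average.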
component which automatically solved a Vigenère
cipher (“The Declaration of Independence” of the
US, encrypted with a Vigenère cipher). The solver
automatically tested every key length between 5
and 20 using hill climbing. The component needed
only about ten seconds to break the cipher. The
decrypted text is output automatically by the
component and can be displayed by a TextOutput
component.
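The hill climbing idea behind such a solver can be sketched as follows. This is an illustration under several assumptions, not the CT2 implementation: the key length is taken as known (CT2 tests a range of lengths), the fitness is a simple monogram log-likelihood with approximate English letter frequencies instead of n-gram statistics, and key letters are changed in a systematic sweep rather than at random so that the run is deterministic apart from the start key.

```python
import math
import random
import string

A = string.ascii_uppercase
# Approximate English letter frequencies in percent, A..Z (an assumption of
# this sketch; real solvers typically score with n-gram statistics).
FREQ = dict(zip(A, [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.8,
                    4.0, 2.4, 6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0,
                    2.4, 0.15, 2.0, 0.07]))
LOGP = {c: math.log(f / 100) for c, f in FREQ.items()}


def vigenere(text, key, sign):
    """Shift each letter by its key letter (sign=+1 encrypts, -1 decrypts)."""
    return "".join(
        A[(A.index(c) + sign * A.index(key[i % len(key)])) % 26]
        for i, c in enumerate(text)
    )


def fitness(plaintext):
    """Higher is better: log-likelihood under the monogram model."""
    return sum(LOGP[c] for c in plaintext)


def hill_climb(ciphertext, key_length, rng):
    key = [rng.choice(A) for _ in range(key_length)]   # random start key
    best = fitness(vigenere(ciphertext, key, -1))
    improved = True
    while improved:                 # stop when no single change helps
        improved = False
        for pos in range(key_length):
            for letter in A:
                candidate = key[:pos] + [letter] + key[pos + 1:]
                score = fitness(vigenere(ciphertext, candidate, -1))
                if score > best:    # accept only strict improvements
                    key, best, improved = candidate, score, True
    return "".join(key), best


plain = ("WHENINTHECOURSEOFHUMANEVENTSITBECOMESNECESSARYFORONEPEOPLE"
         "TODISSOLVETHEPOLITICALBANDSWHICHHAVECONNECTEDTHEMWITHANOTHER")
cipher = vigenere(plain, "CAT", +1)
key, score = hill_climb(cipher, 3, random.Random(0))
print(key, vigenere(cipher, key, -1)[:30])
```

With a monogram fitness each key letter can be optimized independently, so the climb cannot get stuck in a poor local maximum; with n-gram fitness functions, restarts from several random keys are the usual remedy.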
Vigenère Analyzer component to break it. We automatically test all key lengths between 1 and 20. Figure 12 shows the final result of the Vigenère Analyzer component. The component displays a toplist of “best” decryptions based on a cost function that rates the quality of the decrypted texts. The higher the cost value (sum of n-gram probabilities of the English language), the higher the place in the toplist. Furthermore, the component shows the keyword or passphrase used. With “MANCHESTERBLUFF” (15 letters), the message can be broken. The analysis run took 5 seconds on a standard desktop computer with 2.4 GHz. We present the final plaintext in Figure 13.

Figure 8: Encrypted Message in a Bottle Sent by General Johnston
Figure 12: Breaking the Encrypted Message in a
Bottle with the Vigenère Analyzer
Figure 15: Letter Frequency Analysis of the Borg
Cipher
Figure 18: Borg Cipher – Revealed Plaintext by
Monoalphabetic Substitution Analyzer
bets. Currently, the monoalphabetic substitution analyzer needs (for the transcription) a specific input alphabet consisting of Latin letters. By the end of 2018, all kinds of symbols a computer can process will be possible (e.g. support for UTF-8 characters). Furthermore, new kinds of classical ciphers and cryptanalytic methods will be added. Examples are grilles and codebooks, which were extensively used in history.

The CT2 team highly welcomes suggestions, wishes, and ideas of historians, cryptanalysts, and everybody else for additional ciphers and automated cryptanalysis methods which should be included in CT2 in the future. The list shown in Table 1 is open for new entries proposed by everyone. Since CT2 is open-source software, we welcome everyone to contribute to the CT2 project (programmers, testers, etc). Finally, everyone interested in CT2 may download the software for free from https://www.cryptool.org/.

References

Nada Aldarrab, Kevin Knight, and Beata Megyesi. 2018. The Borg.lat.898 Cipher. http://stp.lingfil.uu.se/~bea/borg/.

Michael J Cowan. 2008. Breaking short playfair ciphers with the simulated annealing algorithm. Cryptologia, 32(1):71–83.

Daily Mail Reporter. 2010. CIA codebreaker reveals 147-year-old Civil War message about the Confederate army's desperation. https://dailym.ai/2JkVFCu.

Amrapali Dhavare, Richard M Low, and Mark Stamp. 2013. Efficient cryptanalysis of homophonic substitution ciphers. Cryptologia, 37(3):250–281.

William S Forsyth and Reihaneh Safavi-Naini. 1993. Automated cryptanalysis of substitution ciphers. Cryptologia, 17(4):407–418.

William Frederick Friedman. 1987. The index of coincidence and its applications in cryptanalysis. Aegean Park Press California.

James J Gillogly. 1995. Ciphertext-Only Cryptanalysis of Enigma. Cryptologia, 19(4):405–413.

Nils Kopal, Olga Kieselmann, Arno Wacker, and Bernhard Esslinger. 2014. CrypTool 2.0. Datenschutz und Datensicherheit-DuD, 38(10):701–708.

Nils Kopal. 2018. Secure Volunteer Computing for Distributed Cryptanalysis. http://www.upress.uni-kassel.de/katalog/abstract.php?978-3-7376-0426-0.

George Lasry, Nils Kopal, and Arno Wacker. 2016a. Automated Known-Plaintext Cryptanalysis of Short Hagelin M-209 Messages. Cryptologia, 40(1):49–69.

George Lasry, Nils Kopal, and Arno Wacker. 2016b. Ciphertext-only cryptanalysis of Hagelin M-209 pins and lugs. Cryptologia, 40(2):141–176.

George Lasry, Nils Kopal, and Arno Wacker. 2016c. Cryptanalysis of columnar transposition cipher with long keys. Cryptologia, 40(4):374–398.

George Lasry, Ingo Niebel, Nils Kopal, and Arno Wacker. 2017. Deciphering ADFGVX messages from the Eastern Front of World War I. Cryptologia, 41(2):101–136.

George Lasry. 2018. A Methodology for the Cryptanalysis of Classical Ciphers with Search Metaheuristics. kassel university press GmbH.

Thomas G Mahon and James Gillogly. 2008. Decoding the IRA. Mercier Press Ltd.

Beata Megyesi, Kevin Knight, and Nada Aldarrab. 2017. DECODE – Automatic Decryption of Historical Manuscripts. http://stp.lingfil.uu.se/~bea/decode/.

Phil Pilcrow. 2018. CryptoCrack. http://www.cryptoprograms.com/.

Tobias Schrödel. 2008. Breaking Short Vigenere Ciphers. Cryptologia, 32(4):334–347.
Hidden Markov Models for Vigenère Cryptanalysis

Mark Stamp∗ and Fabio Di Troia†, Department of Computer Science, San Jose State University, San Jose, California
Miles Stamp, Los Gatos High School, Los Gatos, California, milez000782@gmail.com
Jasper Huang, Lynbrook High School, San Jose, California, jhuang821@student.fuhsd.org
∗ mark.stamp@sjsu.edu
† fabioditroia@msn.com
That is, each C in the keyword specifies a shift monograph statistics. The monograph statistics
by 2, each A represents a shift by 0, and each T for standard English appear in Table 1.
is a shift by 19, and the keyword is repeated as Let κe denote the IC for English text. If we com-
many times as needed. Of course, if the keyword pute the IC for a large selection of English text,
is known, it is trivial to decrypt a Vigenère cipher- then based on Table 1, we would expect to find
text.
\[ \kappa_e = 0.082^2 + 0.015^2 + 0.028^2 + \cdots + 0.020^2 + 0.001^2 \approx 0.0656. \]
2.2 Friedman Test
When attempting to break a Vigenère ciphertext,
the first step is to determine the length of the key- On the other hand, if we have random text drawn
word. The Friedman test (Friedman, 1987), which from the 26 letter English alphabet, we would ex-
is based on the index of coincidence (IC), is a pect to find that the IC is
well-known method for determining the length of
the keyword, provided that sufficient ciphertext is κr = (1/26)2 + (1/26)2 + · · · + (1/26)2 ≈ 0.0385.
available. An alternative method of finding the
For a simple substitution cipher, we relabel the
keyword length is the Kasiski test (Kasiski, 1863);
letters, which has no effect on the IC. That is,
here we focus on the Friedman test. In any case,
when a monoalphabetic substitution is applied to
once the keyword is known, the Vigenère cipher
English plaintext, the IC of the ciphertext is the
consists of a sequence of shift ciphers, and the
same as that of the plaintext. Friedman noted that
shifts can be determined by a variety of means.
for a polyalphabetic substitution, the larger the
The IC measures the “repeat rate,” i.e., the prob-
number of alphabets, the closer the IC is to κr .
ability that two randomly selected letters from a
Hence, for a polyalphabetic substitution, we can
given string are identical. This test relies on the
use the observed IC to estimate the number of al-
non-uniformity of letter frequencies in the under-
phabets and, in particular, the length of the key-
lying plaintext.
word in a Vigenère cipher.
Suppose that we have a string of text of length N
Let L be the length of the Vigenère keyword,
with na > 0 occurrences of A and nb > 0 occur-
and assume that the ciphertext is of length N. Then
rences of B, and so on. If we randomly select two
we have L Caesar’s ciphers. To simplify the nota-
letters (without replacement) from this string, the
tion, we assume that each of these L ciphers has
probability that the letters match is given by
exactly N/L letters. Under this assumption, the
na (na − 1) nb (nb − 1) nz (nz − 1) probability of selecting two letters from the same
+ +···+ Caesar’s cipher is given by
N(N − 1) N(N − 1) N(N − 1)
N − N/L
where c is the size of the alphabet, ni is the fre- .
N −1
quency of the ith symbol, and N is the length of the
string. For English text (without spaces, punctua- In the former case, the letters are derived from the
tion, or case), we have c = 26, and the the expected same simple substitution (in fact, Caesar’s cipher),
frequency of each ni is known from the language so the chance that they match is κe , while in the
latter case, the letters are from different Caesar’s ciphers, so the chance that they match is about $\kappa_r$.

Let $\kappa_c$ be the computed IC for a given Vigenère ciphertext. Then $\kappa_c$ is the probability of selecting two letters at random that match and, evidently, this probability is given by

\[ \kappa_c = \frac{N/L - 1}{N - 1}\,\kappa_e + \frac{N - N/L}{N - 1}\,\kappa_r. \tag{2} \]

Solving equation (2) for L, we obtain

\[ L = \frac{N(\kappa_e - \kappa_r)}{N(\kappa_c - \kappa_r) + (\kappa_e - \kappa_c)}. \]

Since N is large relative to $\kappa_e$, $\kappa_r$, and $\kappa_c$, we can approximate the keyword length by

\[ L \approx \frac{\kappa_e - \kappa_r}{\kappa_c - \kappa_r}. \tag{3} \]

For the case of English text, the expected IC is $\kappa_e \approx 0.0656$, while for the random case (and under the assumption that we have 26 symbols), the IC is $\kappa_r \approx 0.0385$. Recall that $\kappa_c$ is the IC for the ciphertext, which is computed as in (1). Thus, we can approximate the Vigenère keyword length using (3). In practice, when attempting to break a Vigenère ciphertext message, we would need to test various keyword lengths near the value given by (3).

In Section 3, we compare an HMM-based technique to the results obtained using the standard approach to Vigenère cryptanalysis, as discussed in this section. For our test cases, we find that the HMM outperforms the Friedman test, in the sense of giving us a more precise result for the keyword length. In addition, the HMM simultaneously recovers the shifts, so that the entire Vigenère key is determined.

2.3 Hidden Markov Models

True to its name, a hidden Markov model (HMM) includes a Markov process that is “hidden,” in the sense that it is not directly observable. Along with this hidden Markov process, an HMM includes a sequence of observations that are probabilistically related to the (hidden) states. An HMM can be viewed as a machine learning technique that relies on a discrete hill climb algorithm for training.

A generic HMM is illustrated in Figure 1, where A is an N × N matrix that defines the state transitions in the underlying (hidden) Markov process, and the matrix B contains discrete probability distributions that relate each hidden state $X_i$ to the corresponding observation $O_i$. That is, row i of the B matrix contains a discrete probability distribution that gives the probabilities of the various observation symbols when the hidden Markov process is in state i. As we show below, the component matrices of an HMM can reveal information about the underlying data that is not otherwise readily apparent to a human analyst. This could be considered an advantage of an HMM over other more opaque forms of machine learning, such as neural networks.

The following notation (Stamp, 2004) is commonly used for HMMs:

    T = length of the observation sequence
    N = number of states in the model
    M = number of observation symbols
    Q = $\{q_0, q_1, \ldots, q_{N-1}\}$ = distinct states of the Markov process
    V = $\{0, 1, \ldots, M - 1\}$ = set of possible observations
    A = state transition probabilities
    B = observation probability matrix
    π = initial state distribution
    O = $(O_0, O_1, \ldots, O_{T-1})$ = observation sequence.

Note that the observations are associated with the integers 0, 1, ..., M − 1, since this simplifies the notation with no loss of generality. Consequently, we have $O_i \in V$ for i = 0, 1, ..., T − 1.

If we are given a sequence of observations of length T, denoted $(O_0, O_1, \ldots, O_{T-1})$, we can train an HMM, that is, we can determine matrices A and B in Figure 1 that maximize the probability of this training sequence. The HMM training process can be viewed as a discrete hill climb on the high dimensional parameter space of the matrices A and B, and an initial state distribution matrix that is denoted as π. Once we have trained an HMM, we can use the resulting model, denoted λ = (A, B, π), to compute a score for a given observation sequence: the higher the score, the more closely the scored sequence matches the training sequence.

The HMM matrix A is N × N, while B is N × M and π is 1 × N. Here, N is the number of hidden states and M is the number of distinct observation symbols. All three of these matrices are row stochastic, that is, each row satisfies the conditions of a discrete probability distribution. To train an HMM, we specify N, the number of hidden states,
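[Figure 1: A generic hidden Markov model. The hidden states $X_0, X_1, X_2, \ldots, X_{T-1}$ are linked by the transition matrix A, and each state $X_i$ is related to its observation $O_i$ through the matrix B.]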
while M, the number of distinct observation symbols, is determined from the data.

Typically, the matrices that define the HMM, i.e., λ = (A, B, π), are initialized so that they are approximately uniform. That is, each element of A and π is initialized to approximately 1/N, while each element of B is initialized to approximately 1/M. In addition, each row is subject to the row stochastic condition. Also, we cannot use an exact uniform initialization as this represents a peak in the hill climb from which the model is unable to climb.

On the other hand, if we know something specific about the problem, we can sometimes use this knowledge when initializing the matrices, which can serve to speed convergence and reduce the data requirements. For example, in (Vobbilisetty et al., 2017) it is shown that an HMM can be used to recover the key in a simple substitution ciphertext, where the underlying language is English. In this case, the A matrix corresponds to English language digraph statistics, and hence we can initialize the A matrix based on such statistics, and there is no need to re-estimate A when training the model.

An HMM is a machine learning technique in the sense that very little is required of the human analyst. Specifically, we need to specify the number of hidden states N, but all other initial parameters are derived from the data, or can be generated at random. During training, we rely entirely on the “machine” (specifically, the HMM training algorithm) to generate the model. Surprisingly often, the HMM training algorithm succeeds in automatically extracting relevant and useful information from the data.

For additional information on HMMs, the standard reference is (Rabiner, 1989). The notation and description here closely follows that in the tutorial (Stamp, 2004).

In a classic illustration of the strengths of the HMM technique, (Cave and Neuwirth, 1980) show that HMMs can be successfully applied to English text analysis. In (Stamp, 2004), the specific English text example in Table 2 is given. In this case, the observations consist of the 26 letters and word space, for a total of M = 27 symbols, and the analyst chose to use N = 2 hidden states. The B matrix is initialized so that each element is approximately 1/27, subject to the row stochastic condition; the precise initial values used in this example are given in the first 2 columns of Table 2. After training the HMM using 50,000 observations, the resulting transpose of the B matrix is given in the final 2 columns of Table 2.

From the example in Table 2, we see that when the Markov process is in (hidden) state 1, the probability that the observed symbol is a is 0.13845, the probability that the observed symbol is b is 0.00000, the probability that the observed symbol is c is 0.00062, the probability that the observed symbol is d is 0.00000, the probability that the observed symbol is e is 0.21404, and so on. On the other hand, if the model is in (hidden) state 2, then the probability that the observed symbol is a is 0.00075, the probability that the observed symbol is b is 0.02311, the probability that the observed symbol is c is 0.05614, the probability that the observed symbol is d is 0.06937, the probability that the observed symbol is e is 0.00000, and so on. In this case, we can clearly see that the 2 hidden states correspond to vowels (state 1) and consonants (state 2). Since no a priori assumption was made about the letters, this simple example nicely illustrates the “machine learning” aspect of an HMM.
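The near-uniform, row stochastic initialization described above can be sketched as follows. This is a minimal illustration, not the initialization used in (Stamp, 2004): the function names `near_uniform_row` and `init_hmm`, the 10% jitter, and the fixed seed are all assumptions made for the example.

```python
import random


def near_uniform_row(size, rng, jitter=0.1):
    """Return `size` probabilities that are close to, but not exactly, 1/size."""
    # Perturb the uniform value 1/size by a relative amount in [-jitter, +jitter].
    # (The jitter size is an arbitrary choice for this sketch.)
    raw = [(1.0 / size) * (1.0 + rng.uniform(-jitter, jitter)) for _ in range(size)]
    total = sum(raw)
    # Renormalize so the row sums to 1, i.e., the row stochastic condition holds.
    return [x / total for x in raw]


def init_hmm(n_states, n_symbols, seed=0):
    """Build approximately uniform A (N x N), B (N x M) and pi (1 x N)."""
    rng = random.Random(seed)
    A = [near_uniform_row(n_states, rng) for _ in range(n_states)]
    B = [near_uniform_row(n_symbols, rng) for _ in range(n_states)]
    pi = near_uniform_row(n_states, rng)
    return A, B, pi


# Same shape as the English text example: N = 2 hidden states, M = 27 symbols.
A, B, pi = init_hmm(n_states=2, n_symbols=27)
print(len(A), len(B[0]), round(sum(B[0]), 6))
```

Each entry of B lands near 1/27, as in the initial columns of Table 2, while the random perturbation avoids the exactly uniform starting point from which the hill climb cannot move.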
              Initial               Final
          state 1   state 2   state 1   state 2
    a     0.03735   0.03909   0.13845   0.00075
    b     0.03408   0.03537   0.00000   0.02311
    c     0.03455   0.03537   0.00062   0.05614
    d     0.03828   0.03909   0.00000   0.06937
    e     0.03782   0.03583   0.21404   0.00000
    f     0.03922   0.03630   0.00000   0.03559
    g     0.03688   0.04048   0.00081   0.02724
    h     0.03408   0.03537   0.00066   0.07278
    i     0.03875   0.03816   0.12275   0.00000
    j     0.04062   0.03909   0.00000   0.00365
    k     0.03735   0.03490   0.00182   0.00703
    l     0.03968   0.03723   0.00049   0.07231
    m     0.03548   0.03537   0.00000   0.03889
    n     0.03735   0.03909   0.00000   0.11461
    o     0.04062   0.03397   0.13156   0.00000
    p     0.03595   0.03397   0.00040   0.03674
    q     0.03641   0.03816   0.00000   0.00153
    r     0.03408   0.03676   0.00000   0.10225
    s     0.04062   0.04048   0.00000   0.11042
    t     0.03548   0.03443   0.01102   0.14392
    u     0.03922   0.03537   0.04508   0.00000
    v     0.04062   0.03955   0.00000   0.01621
    w     0.03455   0.03816   0.00000   0.02303
    x     0.03595   0.03723   0.00000   0.00447
    y     0.03408   0.03769   0.00019   0.02587
    z     0.03408   0.03955   0.00000   0.00110
    space 0.03688   0.03397   0.33211   0.01298
    sum   1.00000   1.00000   1.00000   1.00000

Table 2: Initial and final $B^T$ for English plaintext

For the example in Table 2, the converged A matrix as given in (Stamp, 2004) is

\[ A = \begin{pmatrix} 0.25596 & 0.74404 \\ 0.71571 & 0.28429 \end{pmatrix} \]

This A matrix tells us that when the Markov process is in (hidden) state 1, the probability that it transitions to state 1 is 0.25596, while the probability that it transitions to state 2 is 0.74404. Similarly, if the Markov process is in state 2, it next transitions to state 1 with probability 0.71571, and it stays in state 2 with probability 0.28429. In this case, the A matrix is not particularly interesting, as this matrix simply gives the probability of transitioning from a consonant to a vowel, a vowel to a consonant, and so on.

As mentioned above, an HMM also includes an initial state distribution denoted as π, which for the example above converges (Stamp, 2004) to

\[ \pi = \begin{pmatrix} 0.00000 & 1.00000 \end{pmatrix} \]

This tells us that the model started in the second hidden state which, according to the converged B matrix in Table 2, corresponds to the consonant state. Again, this is not particularly enlightening. For this English text example, we see that the B matrix contains the interesting information.

Again, an HMM is defined by the 3 matrices, A, B and π, and it is standard practice to denote an HMM as λ = (A, B, π). We also want to emphasize that each of these matrices is row stochastic, with each row representing a discrete probability distribution.

Now, suppose that we train an HMM with 2 hidden states on simple substitution ciphertext, where the plaintext is English. The resulting model will partition the ciphertext letters into those corresponding to consonants and vowels. On the other hand, if we set the number of hidden states N to equal the number of symbols (i.e., either N = 26 or N = 27, depending on whether we include word spaces), the simple substitution key can be easily determined from a converged B matrix of an HMM (Vobbilisetty et al., 2017). Furthermore, in this latter case, the A matrix contains digraph probabilities of the English plaintext.

An analogous HMM-based attack applies to homophonic substitution ciphers. However, in the homophonic substitution case, the key recovery from the B matrix is slightly more complex as the number of symbols mapping to each plaintext letter is typically unknown (Vobbilisetty et al., 2017).

For these HMM-based cryptanalytic models to converge, we generally require large amounts of ciphertext, making such attacks impractical for most classic cryptanalysis problems. However, since HMM training is a hill climb technique, random restarts can be used in an attempt to generate an improved solution. It is shown in (Berg-Kirkpatrick and Klein, 2013), and from a slightly different perspective in (Vobbilisetty et al., 2017), that by using large numbers of random restarts, the performance of HMM-based attacks can surpass other techniques, in the sense of requiring less ciphertext. For example, it is shown in (Vobbilisetty et al., 2017) that HMMs can outperform Jakobsen’s algorithm (Jakobsen, 1995), which is a well-known general-purpose simple substitution solving technique that is based on digraph statistics.

In this paper, we consider HMM-based cryptanalysis of the classic Vigenère cipher. For the Vigenère cryptanalysis problem considered here, we train an HMM and then show that by examining the resulting matrices A, B, and π of a converged model, we can easily determine the Vigenère key.
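For concreteness, the Vigenère cipher under attack can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the choice to drop everything except the 26 letters and fold to lower case is an assumption of the example.

```python
import string

ALPHABET = string.ascii_lowercase


def keyword_shifts(keyword):
    """Map each keyword letter to its shift: A -> 0, B -> 1, ..., Z -> 25."""
    return [ALPHABET.index(ch) for ch in keyword.lower()]


def vigenere_encrypt(plaintext, keyword):
    """Shift the i-th letter by the (i mod keyword length)-th keyword shift."""
    shifts = keyword_shifts(keyword)
    # Assumption of this sketch: keep only the 26 letters, in lower case.
    letters = [ch for ch in plaintext.lower() if ch in ALPHABET]
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shifts[i % len(shifts)]) % 26]
        for i, ch in enumerate(letters)
    )


def vigenere_decrypt(ciphertext, keyword):
    """Undo vigenere_encrypt by subtracting the same shifts."""
    shifts = keyword_shifts(keyword)
    letters = [ch for ch in ciphertext.lower() if ch in ALPHABET]
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shifts[i % len(shifts)]) % 26]
        for i, ch in enumerate(letters)
    )


print(keyword_shifts("CAT"))
ciphertext = vigenere_encrypt("Attack at dawn", "CAT")
print(ciphertext, vigenere_decrypt(ciphertext, "CAT"))
```

With the keyword CAT, every third letter is enciphered with the same shift, so the ciphertext splits into three interleaved shift ciphers; this is the periodic structure that an HMM with N = 3 hidden states can pick up.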
2.4 Related Work

In (Berg-Kirkpatrick and Klein, 2013) an expectation maximization (EM) technique is applied to homophonic substitutions, with the goal of analyzing the unsolved Zodiac 340 cipher. The EM technique in (Berg-Kirkpatrick and Klein, 2013) is analogous to the HMM process discussed in the previous section. A novelty of this work is the use of an extremely large number of random restarts to improve on the hill climb results.

The paper (Lee, 2002) appears to be the first to explicitly apply HMMs (or similar) to substitution ciphers. However, the work in (Cave and Neuwirth, 1980), which focused on English text analysis, anticipates later cipher-based studies.

In (Vobbilisetty et al., 2017), HMMs are applied to simple and homophonic substitutions, and a careful comparison is made to other automated cryptanalysis techniques. This work shows that HMMs can achieve superior results in many cases, although the computational expense can also be quite high.

The work presented here is motivated in part by the recent paper (Gomez et al., 2018), where it is shown that a generative adversarial network (GAN), which is a type of neural network, can be used to successfully break a Vigenère cipher. However, this GAN-based Vigenère attack assumes unlimited ciphertext, which is unrealistic in any classic cryptanalysis context. In addition, in (Gomez et al., 2018) it is claimed that a strength of the GAN technique is its ability to handle a large vocabulary (up to 200 symbols), which seems to be of somewhat dubious value in the context of Vigenère cryptanalysis. Finally, as is generally true of neural networks, the resulting GAN is opaque, leaving the authors to make statements such as the following (Gomez et al., 2018):

    For both ciphers, the first mappings to be correctly determined were those of the most frequently occurring vocabulary elements, suggesting that the network does indeed perform some form of frequency analysis to distinguish outlier frequencies in the two banks of text.

The implication here is that the authors are forced to conjecture as to the relative importance of the various features in the GAN, since such basic information is not at all clear from an examination of the model itself.

In the next section, we give experimental results for an HMM-based attack on a Vigenère cipher. We also provide some discussion of our results, and we compare our technique to the GAN-based approach mentioned above.

3 Experimental Results

First, we train an HMM with N = 3 hidden states on a Vigenère ciphertext that was generated using the keyword CAT. Note that in this experiment we have selected the number of hidden states N to be equal to the keyword length. Also, we have used an observation sequence (i.e., English text) of length 1,000 extracted from the Brown Corpus (Francis and Kucera, 1969). In all of our experiments, we have removed all special characters and word space, and all letters have been converted to lower case, resulting in M = 26 distinct observation symbols.

For this experiment, the converged A matrix is given by

\[ A = \begin{pmatrix} 0.00000 & 0.00000 & 1.00000 \\ 1.00000 & 0.00000 & 0.00000 \\ 0.00000 & 1.00000 & 0.00000 \end{pmatrix} \]

In contrast to the English text and simple substitution examples discussed in Section 2, here the A matrix is very informative. For one thing, this A matrix tells us that the transition between the N = 3 hidden states is actually deterministic. From the nature of the Vigenère cipher, it is clear that these states correspond to individual column shifts, and hence this is a result that we would expect for a keyword of length 3.

The corresponding B matrix appears in Table 3, which reveals that the first hidden state corresponds to a shift of 0 (i.e., keyword letter A), as the probabilities approximately match the expected letter frequencies of English. We also see that the second hidden state corresponds to a shift by 2 (i.e., keyword letter C) since the letter frequencies in this column are offset by 2 from those of English, while the final column corresponds to a shift by 19 (i.e., keyword letter T).

From the converged B matrix and the state transitions in the converged A matrix, we deduce that the keyword must be either ATC, TCA, or CAT. In this specific example, we also find that the initial state distribution matrix π converges to

\[ \pi = \begin{pmatrix} 0.00000 & 1.00000 & 0.00000 \end{pmatrix} \]
This implies that we start in the second hidden state, which corresponds to C, and hence we have determined that the keyword is CAT.

          state 1   state 2   state 3
    a     0.08761   0.01290   0.04950
    b     0.01560   0.00000   0.06811
    c     0.03540   0.07411   0.00480
    d     0.04290   0.01470   0.00450
    e     0.13171   0.03030   0.04320
    f     0.02100   0.04740   0.02520
    g     0.02190   0.12181   0.06661
    h     0.04170   0.02160   0.07291
    i     0.06841   0.01470   0.02610
    j     0.00180   0.04470   0.00060
    k     0.00600   0.08101   0.06541
    l     0.04080   0.00360   0.06541
    m     0.02340   0.00300   0.09871
    n     0.06001   0.04800   0.02520
    o     0.08131   0.02280   0.01050
    p     0.02430   0.06721   0.01680
    q     0.00090   0.07711   0.00300
    r     0.06601   0.02160   0.02130
    s     0.06151   0.00090   0.00030
    t     0.09481   0.06451   0.07951
    u     0.02910   0.06541   0.01500
    v     0.01020   0.09781   0.03090
    w     0.01500   0.03180   0.03870
    x     0.00180   0.00960   0.13171
    y     0.01590   0.02040   0.02310
    z     0.00090   0.00300   0.01290

Table 3: Final $B^T$ for Vigenère ciphertext with keyword CAT

Suppose that instead of using N = 3 hidden states, we train an HMM with N = 2 hidden states using the same Vigenère encrypted data as in the previous example. In this case, we find that the A matrix converges to

\[ A = \begin{pmatrix} 0.75236 & 0.24764 \\ 0.34235 & 0.65765 \end{pmatrix} \]

which tells us that we do not have deterministic transitions between the states, and hence the keyword length is greater than 2.

If, on the other hand, we attempt to train a model with N = 4 hidden states, we obtain

\[ A = \begin{pmatrix} 0.00000 & 1.00000 & 0.00000 & 0.00000 \\ 0.00000 & 0.00000 & 0.99884 & 0.00116 \\ 1.00000 & 0.00000 & 0.00000 & 0.00000 \\ 0.00000 & 0.15595 & 0.00000 & 0.84405 \end{pmatrix} \]

Since some state transitions are deterministic, we suspect that the keyword length is less than 4 in this case. Similarly, an HMM with N = 5 hidden states yields

\[ A = \begin{pmatrix} 0.00 & 0.00 & 0.00 & 1.00 & 0.00 \\ 0.00 & 0.00 & 1.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 1.00 & 0.00 \\ 0.00 & 0.47 & 0.00 & 0.00 & 0.53 \\ 1.00 & 0.00 & 0.00 & 0.00 & 0.00 \end{pmatrix} \]

which, again, implies that the keyword length is likely less than 5. Finally, we point out that multiples of the keyword length behave similarly. For this example, with N = 6 hidden states we obtain

\[ A = \begin{pmatrix} 0.00 & 0.00 & 0.54 & 0.00 & 0.46 & 0.00 \\ 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 1.00 \\ 0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 \\ 0.49 & 0.51 & 0.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 \end{pmatrix} \]

From these results, we conclude that the A matrix in a converged HMM will enable us to precisely determine the keyword length used to encrypt a Vigenère ciphertext. Furthermore, if sufficient ciphertext is available so that English letter distributions are (roughly) apparent, the B matrix, together with the initial state matrix π, will completely determine the keyword. That is, we simply train HMMs with different values of N until we obtain a deterministic A matrix, and then we use the corresponding B and π matrices to determine the Vigenère key. Because HMM training is a hill climb, to obtain a converged model, we might need to train each HMM multiple times with different randomly-selected starting values.

Next, we consider the amount of ciphertext needed to determine the Vigenère key using this HMM-based attack. Of course, the amount of ciphertext will depend on the length of the keyword.

We tested a few small keyword lengths until we found an initialization that yielded a solution. Then we reduced the amount of ciphertext until the HMM was unable to solve the problem. This gives us an upper bound on the amount of ciphertext needed, at least in these selected cases. In these experiments, we define a “solution” as a trained HMM where the average of the maximum value in each row of the A matrix is at least 0.99. Our results are given in Table 4, based on 100 random restarts of the HMM for each test case.
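The “solution” criterion just stated is easy to express in code. The sketch below applies it to the converged A matrices reported above for N = 3 and N = 4; the function names `average_row_max` and `is_solution` are hypothetical and only the 0.99 threshold comes from the text.

```python
def average_row_max(A):
    """Average, over the rows of A, of the largest entry in each row."""
    return sum(max(row) for row in A) / len(A)


def is_solution(A, threshold=0.99):
    """True if the transitions are (nearly) deterministic, per the 0.99 criterion."""
    return average_row_max(A) >= threshold


# Converged A matrices for the keyword CAT, copied from the text.
A3 = [
    [0.00000, 0.00000, 1.00000],
    [1.00000, 0.00000, 0.00000],
    [0.00000, 1.00000, 0.00000],
]
A4 = [
    [0.00000, 1.00000, 0.00000, 0.00000],
    [0.00000, 0.00000, 0.99884, 0.00116],
    [1.00000, 0.00000, 0.00000, 0.00000],
    [0.00000, 0.15595, 0.00000, 0.84405],
]

for name, A in (("N = 3", A3), ("N = 4", A4)):
    print(name, round(average_row_max(A), 5), is_solution(A))
```

Under this criterion the N = 3 model counts as a solution, while the N = 4 model falls short because its last row is far from deterministic.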
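Table 4 lists a Friedman test value next to each HMM result. As a point of reference, the Friedman estimate of Section 2.2 can be sketched as follows: compute the IC of the ciphertext as the probability that two letters drawn without replacement match, then apply approximation (3). The constants are the values of κe and κr quoted in the text; the function names and the letters-only preprocessing are assumptions of this sketch, and it is not claimed to reproduce the exact figures in Table 4.

```python
from collections import Counter

KAPPA_E = 0.0656  # expected IC of English text, as quoted in Section 2.2
KAPPA_R = 0.0385  # IC of uniformly random text over 26 letters


def index_of_coincidence(text):
    """Sum of n_i * (n_i - 1) over all letters, divided by N * (N - 1)."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters to compute the IC")
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def friedman_keyword_length(ciphertext):
    """Approximate keyword length (kappa_e - kappa_r) / (kappa_c - kappa_r)."""
    kappa_c = index_of_coincidence(ciphertext)
    # Note: the estimate is unreliable (it can even be negative) when kappa_c
    # is close to or below KAPPA_R, which can happen for short ciphertexts.
    return (KAPPA_E - KAPPA_R) / (kappa_c - KAPPA_R)


print(round(index_of_coincidence("aabb"), 4))
```

The result is a real number rather than an integer, so in practice one would test several keyword lengths near the returned value, as noted in Section 2.2.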
    Keyword   Keyword   Minimum      Friedman
              length    ciphertext   test
    IT        2         175          1.4235
    DOG       3         250          3.7209
    MORE      4         450          3.8208
    NEVER     5         1200         3.6467
    SECURE    6         1400         4.5545
    ZOMBIES   7         1300         9.9334

Table 4: HMM attack (100 random restarts)

References

Taylor Berg-Kirkpatrick and Dan Klein. 2013. Decipherment with a million random restarts. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 874–878.

Robert L. Cave and Lee P. Neuwirth. 1980. Hidden Markov models for English. In J. D. Ferguson, editor, Hidden Markov Models for Speech, IDA-CRD, Princeton, NJ, October 1980, pages 16–56. https://www.cs.sjsu.edu/~stamp/RUA/CaveNeuwirth/index.html.
WORLD WAR I
The Solving of a Fleissner Grille during an Exercise by the Royal
Netherlands Army in 1913
Karl de Leeuw
University of Amsterdam / Informatics Institute
Science Park 904, 1098 XH Amsterdam
karl.de.leeuw@xs4all.nl
Figure 2: the supposedly intercepted message. Source: Nationaal Archief Den Haag
guidelines did mention this possibility, but only as expected the message to contain information about
a complicating measure, to be applied at will. A troop movements, not yet known to the field com-
second complicating measure mentioned was the mander, orders, or reconnaissance about the en-
filling of the mask before rotation with nulls and emy.
starting writing the actual message after turning The first row of the cryptogram contained the
the backside up. This procedure could only be in- letters ’i’, ’c’. ’h’, and ’t’, likely constituting the
dicated if the center for rotation was filled with last syllable of the word of ’bericht’; the last row
two letters: one to indicate the original position of contained ’b’, ’e’ and ’r’, constituting the first syl-
the mask and one to indicate how it had been laid lable of the same word. These letters had to cor-
after the backside had been turned up.6 respond with three punch holes of the mask. Con-
sequently, these squares had to remain black when
3 Reconstructing the grille in use the grille is turned. The drawing of the mask could
The approach taken by Captain Van der Harst to now begin. The last row contained one more prob-
solve the cryptogram can be derived from his per- able word: ’g’, ’r’, ’y’, ’p’, (attack). This word
sonal notes, handed over after the exercise to the is likely to occur in a sentence like this: ’gryp
commander in chief Lieutenant-General Snijders.7 morgen vyand aan’ (attack the enemy tomorrow).
The captain started his analysis by stating that the These words can be constituted from letters also to
size of the letter square, consisting 15 columns and be found in the first and second row, indicating the
30 rows, probably indicated a plain use of the turn- position of the punch holes when side II is put on
ing grille twice, without columns added. This im- top. Another probable grouping of words would
plied that the letter square had to be divided into be: ’met uwe geheele macht’ (with your entire
two halves of 15 rows each and that the punch hole force). Detecting of these words makes the unveil-
in the exact middle of each letter square would ing of the second cipher block almost complete.
have to contain a letter indicating which side of The text occurring in the punch holes when side
the grille had to be put on top first. IV is put on top can now be reconstructed: ’or u
vastgestelde stations zijn uitgeladen kondschaps-
The square in the middle of the first cipher block
ber’ (Railway stations allocated by you reconnais-
contained two letters, however, ’cg’, the square in
sance mess...). This implicates that the square is
the middle of the second cipher block only one let-
to be used first with side IV being put on top, be-
ter: ’d’. Therefore, Van der Harst decided to pro-
fore side I, II and III are moved to this position,
ceed with the second cipher block.
correponding with the letter ’d’ in the middle of
The captain subsequently asked which words
the second cipher block. The remaining letters oc-
were likely to emerge in the message, words that
curring when side III is put on top constitute rub-
could be detected easily, because of their spelling
bish. The entire text emerging in this cipher block
that is to say. He mentioned several: ’vyand’ (en-
is now clear:
emy), because the ’y’ does not occur very often
in Dutch; ’bericht’ (message), because the trigram ’door u vastgestelde stations zijn uitgeladen vol-
’cht’ is rare; and ’opperbevelhebber’ (commander gens kondschapsbericht heeft vyand te helder min-
in chief), because this word contained two dou- stens drie divisies ontscheept gryp morgen vyand
blings of consonants: ’pp’ and ’bb’ which is rare aan met uwe geheele macht en werp hem terug op-
in Dutch also. Generally speaking, Van der Harst perbevelhebber slot’
(allocated railwaystations are unloaded ac-
6 Ibid., inv. nr. 82: Aanwijzing voor het gebruik van
cording to reconnaissance message the enemy has
geheimschrift (Clues for the use of Secret Writing). It is un-
clear to me how this recommendation was to be put into prac- disembarked at least three divisions in helder at-
tice, if a cable gram was actually sent. After all, all of this de- tack the enemy tomorrow with your entire force
pended on the neatly reorganizing the cryptogram in groups
of four letters. Clearly, in one way or the other, it had to be
and throw him back commander in chief end).
indicated that the telegram contained two letters that had to With help of the reconstructed grille, but only
be placed in one square, but how? after moving the mask in various positions in a
7 Ibid., inv.nr. 305: | Methode van ontcijfering van
het geheimschrift (Method of Deciphering of Cryptogram). process of trial and error, the following message
Annex to the letter sent by Lieutenant-General C.J. Sni- emerged:
jders to staff captains P. Huizer, E.F. Insinger and P.J. Van
Munnekrede, s’Gravenhage, 7 November 1913, GS no 138, Derde divisie, korps RA en vliegafd zullen
Geheim. hedenavond en nacht worden aangevoerd en be-
52
have the advantage that Army and Navy could ex-
change secret messages without additional effort.8
He also wanted them to take notice of a system,
described recently in a French journal.9
Unfortunately, this committee lacked all code
breaking experience. It proved, however, to be
well versed in the cryptologic literature of the
day. It cited among other titles Les chiffres secrets
dévoilées by E. Bazeries (1901); Etude sur la cryp-
tographie by A. Collon (1906); Kryptographik
by L. Kluber (1809); Die Geheimschriften im di-
enste des Geschäfts- und Verkeherslebens, by H.
Schneikert (1905); not to mention of course the
well-known Handbuch der Kryptographie by E.
Fleissner von Wostrowitz (1881).10 This sufficed,
however, to discourage adoption of the cipher sys-
tem used by the navy, because this consisted of
a simple Caesar alphabet to encipher the existing
optical signal register whenever needed, offering
no genuine protection at all.11 What is more, ac-
cording to the committee, a common cipher sys-
tem for army and navy was unnecessary and even
Figure 3: The grille as teconstructed by Van der dangerous: unnecessary because army and navy
Harst. Source: Nationaal Archief units were not be in direct contact, orders be-
ing always given top down; and dangerous, be-
halve verpleging en veldhosp afd morgenochtend cause the distribution of ciphers would become too
om zes uur aan de door u vastgestelde stations zijn widespread to offer security any longer. Encryp-
uitgeladen volgens kondschapsbericht heeft vyand tion had to remain limited to messages exchanged
te helder minstens drie divisies ontscheept gryp between the GHQ and the field commanders.12
morgen vyand aan met uwe geheele macht en werp Surprisingly, a careful examination of the liter-
hem terug opperbevelhebber slot’ ature had lead the committee to believe that the
(third division, royal horse artillery and air- turning grille was one of the strongest encryp-
craft unit will be conveyed this evening and night tion devices available, as no convincing cases were
and except for hospital staff and hospital equip- presented of its solution. It did believe, however,
ment tomorrow morning at six o’ clock unloaded that the way in which the system was used in the
at the designated railwaystations according to re- Netherlands, was ready for improvement.13 In the
connaissance message the enemy has disembarked view of his colleagues, Captain van der Harst was
at least three divisions in helder attack enemy to- able to break the cipher, only because he had a
morrow with your entire force and throw him back some idea what the messages was about; because
commander in chief end) he was well aware what probable words to look
8 Ibid., inv.nr. 305: Lieutenant-General C.J. Snijders
4 The staff report
to staff captains P. Huizer, E.F. Insinger and P.J. Van
Munnekrede.s’Gravenhage, 7 November 1913, GS no 138,
On 7 November 1913 General C.J. Snijders de- Geheim.
cided to put the matter before a committee of three 9 Génie Civil,XXIII (26), 420.
10 The Hague, Nationaal Archief, Departement van Oorlog,
staff officers: the captains P. Huizer, E.F. Insinger
Generale Staf, inv. nr. 305: Beschouwingen en voorstellen
and P.J. Van Munnekrede. He asked whether the in verband met het bij den Generale Staf in gebruik zijnde
turning grille could be improved or had to be re- geheimschrift. d.d. 30 May 1914. (Reflections and Proposi-
placed altogether. If the latter was to be the case, he tions with regard to Secret Writing as practiced by the Gen-
eral Staff.)
demanded to pay attention to the question whether 11 Ibid., 4.
the system in use by the Royal Netherlands Navy 12 Ibid., 5-6.
for; and, last but not least, because no complicating measures, such as the adding of columns
to hide the real rotating center of the cipher block, were ever taken.14 Much in line with the
original suggestion made by Captain Van Mens in 1911, the committee recommended the enciphering
of the original message before putting it under a turning grille by way of a Vigenère,
carefully explaining how a Vigenère worked.15

The committee did not go into the actual cryptanalysis of the message. It was well aware that
it lacked the hands-on experience needed in an actual war. Therefore, it recommended the
appointment and training of an additional staff officer to gain expertise in this particular
field. It doubted, however, that this job was suited for a career officer, who had to rotate
jobs on a regular basis. The mindset needed was one of patience, perseverance and wisdom: with
the possible exception of perseverance, attributes difficult to find among people who joined
the army, in most cases, because they wanted to see action. The committee believed that a
reserve officer would be better suited for this task, because he would lack the ambition to
make a career in the army to start with. Descent was irrelevant in this particular case.

5 Conclusion

Less than a month after the committee had completed its report, Archduke Franz Ferdinand and
his spouse were murdered and less than two months later war broke out, changing the face of
the continent. In this context it should not surprise us that the committee’s advice was
followed. Henri Koot, a young lieutenant from the colonial army who happened to be in the
country to follow a training program, possessed all the required qualities and proved to be
able to lay the groundwork for the institution of modern cryptology in the Netherlands, as
Karl de Leeuw (2015) has shown. Koot – recognizably of mixed descent – was highly intelligent,
but also modest and obedient to the extreme and he had no career expectations outside the
colonial army whatsoever. Nor should it, after all that has been said, surprise us that Van
der Harst – who clearly had demonstrated his talent as a cryptologist – wasn’t called upon to
do the job. He was to rise high in the Royal Netherlands Army, ending his career as a Major
General and governor in charge of the Royal Military Academy.

14 Ibid., 8.
15 Ibid., Bijlage B.

References

Edouard Fleissner von Wostrowitz. 1881. Handbuch der Kryptographie. Seidl & Sohn, Wien,
Austria.

Carl Friedrich von Hindenburg. 1796. Fragen eines Ungenannten über die Art durch Gitter geheim
zu schreiben. Archiv der reinen und angewandten Mathematik III: 347–351, V: 81-99.

David Kahn. 1967. The Codebreakers. The Story of Secret Writing. Macmillan Publishing Company,
New York, USA.

Wim Klinkert. 2017. ’Espionage Is Practised Here on a Vast Scale’. The Neutral Netherlands,
1914-1940. Floribert Baudet et al., Perspectives on Military Intelligence from the First World
War to Mali. Between Learning and Law. T.M.C. Asser Press, The Hague, The Netherlands, 23-54.

Karl de Leeuw and Hans van der Meer. 1995. A Turning Grille from the Ancestral Castle of the
Dutch Stadtholders. Cryptologia, XIX(2), 153-164.

Karl de Leeuw. 2015. ’The Institution of Modern Cryptology in the Netherlands and the
Netherlands East Indies, 1914-1935.’ Intelligence and National Security, 30: 26-46.
Deciphering German Diplomatic
and Naval Attaché Messages from 1914-1915
George Lasry
University of Kassel
Germany
george.lasry@gmail.com
pecially to embassies in countries without a border with Germany, such as Spain. To increase
the security of existing codebooks without replacing them, German cryptographers often applied
a superencipherment (an additional encipherment) on the numerical codes. Methods of
superencipherment either consisted of transpositions (changing the order of the digits in the
numerical code), substitutions (replacing a digit with another digit), or additives (some
number mathematically added to the numerical codes). At first, superencipherment methods were
simple or used over a long time span, allowing Room 40, the section in the British Admiralty
responsible for cryptanalysis, to recover the keys on a regular basis.2

Toward the end of the war, the Germans introduced more sophisticated methods, but often made
the severe mistake of communicating the details of a new superencipherment method in a message
encoded with a previous version, already known to Room 40.3

2.4 German Diplomatic Code Books

Germany started the war with several families of diplomatic codebooks in place, mainly the
13040 and the 18470 families. A family of codebooks includes several codebooks derived from
one another. Those first German diplomatic codebooks usually consisted of the following
sections:

• Dreinummerheft (3-digit code): This section was common to all codebooks in the 18470 and
13040 families. The prewar 18102 codebook also used the same Dreinummerheft codes. The
Dreinummerheft consists of 3-digit codes, from 000 to 999. They represent numbers (000 to 500,
00 to 99) and dates (January 1 to December 31). The mapping follows an almost predictable
pattern. As a result, to fully reconstitute the Dreinummerheft, an adversarial code-breaking
organization such as Room 40 needed to know the meaning of only a few of the Dreinummerheft
codes.

• Places: A set of pages (randomly numbered) dedicated to names of cities, countries,
nationalities, and foreign government institutions. Each page contains 100 entries.

• Words and Expressions: A set of pages (randomly numbered) dedicated to words, expressions
and some full sentences. Each page contains 100 entries.

• Persons: A set of pages dedicated to names of persons, and entities such as banks and
commercial shipping lines. Those pages, randomly numbered, were sparsely populated (usually
only ten names out of the possible 100 in a page).

• Supplement: A set of pages with additional names and places, with numerical codes randomly
assigned.

Except for the Dreinummerheft, which consists of 3 digits, the numerical codes have 4 or 5
digits. The three leftmost digits (for a 5-digit code) or the two leftmost digits (for a
4-digit code) represent the page number. The second digit from the right represents the block
number. Each page has ten blocks (from 0 to 9), each block containing ten words. For example,
code 10275 corresponds to the word Dampfer (steamer), and it is the sixth word (the last digit
is 5, we start counting from 0) in block 7 of page 102. The order of the pages (the page
numbers) does not correspond to the alphabetical order of the words they contain. Furthermore,
the order of the 10-word blocks inside a page does not correspond to the alphabetical order of
their contained words. However, all the 100 words inside a page are always relatively close
alphabetically. Also, the ten words inside each block are in alphabetical order. While those
codebooks are nominally two-part codebooks, it is still possible to deduce the meaning of one
numerical code based on the meaning of another code on the same page or block.

The 18470 and its derivatives, such as the 12444, the 1777, and the 2310, were fully recovered
by Room 40, aided by the capture of codebook 3512 in Persia in 1915. Interestingly, it seems
that Room 40 never shared their copy of the captured codebook 3512 with their US counterparts,
despite closely cooperating in various domains. Room 40 was able to analytically reconstruct
most parts of the 13040 codebook (which was used to encipher the famous Zimmermann Telegram),
as well as its derivatives, the 5950 and the 26040 (the 13040 superenciphered using

2 (Gannon, 2010), p. 130.
3 (Gannon, 2010), p. 261, footnote 20.
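
The page, block and position structure described in Section 2.4 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `split_code` and `alphabetical_bounds` are hypothetical, and apart from 10275 (Dampfer) the codebook entries used with it would be invented examples, not recovered values.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class CodeAddress(NamedTuple):
    page: str      # two or three leftmost digits, kept as text to preserve leading zeros
    block: int     # second digit from the right, 0 to 9
    position: int  # last digit, 0 to 9 (position 5 is the sixth word of the block)


def split_code(code: str) -> CodeAddress:
    """Split a 4- or 5-digit numerical code into page, block and position."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (4, 5):
        raise ValueError("expected a 4- or 5-digit code, got %r" % code)
    return CodeAddress(code[:-2], int(code[-2]), int(code[-1]))


def alphabetical_bounds(
    code: str, known: Dict[str, str]
) -> Tuple[Optional[str], Optional[str]]:
    """Bound the unknown word for `code` by its nearest known neighbours.

    Only codes in the same page and block are used, because the text states
    that the ten words inside a block are in alphabetical order.
    """
    target = split_code(code)
    lower: Optional[str] = None
    upper: Optional[str] = None
    lower_pos, upper_pos = -1, 10
    for other, word in known.items():
        addr = split_code(other)
        if (addr.page, addr.block) != (target.page, target.block):
            continue
        if lower_pos < addr.position < target.position:
            lower, lower_pos = word, addr.position
        elif target.position < addr.position < upper_pos:
            upper, upper_pos = word, addr.position
    return lower, upper
```

For instance, `split_code("10275")` gives page "102", block 7, position 5, matching the Dampfer example. If a second entry in the same block were known (say a hypothetical 10279), `alphabetical_bounds("10277", ...)` would return the two words between which the meaning of 10277 must fall alphabetically.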
a constant additive).4,5

The German diplomatic services also developed a series of two-part hat codes, such as the
5300, 6400, 7500, 8600, and 9700 codebooks. Room 40 was able to recover large parts of those
codes, and in particular, codebook 7500, used to encipher one of the versions of the
Zimmermann Telegram.6

Interestingly, despite the capture of a copy of the 3512 codebook, and the publication of the
Zimmermann Telegram, the German diplomatic cryptographers never realized that both the 18470
and 13040, and their relatives, had been compromised. In a report from April 1917, Herman
Stützel, a German Navy cryptographer, describes how he was able to decipher messages encoded
with the 18470 codebook, only from intercepted communications. He was also able to decipher
messages encoded with the 5300 hat code with various superencipherment methods (Stützel,
1969). Ironically, Room 40 intercepted and deciphered a message containing the report. The
reaction of the German diplomatic services to the report is unknown. The Imperial Navy swiftly
reacted, implementing a series of new complications on top of their naval attaché codes (see
Section 2.5).

2.5 Naval Code Books

At the outset of the war, the Navy had several codebooks in use for various purposes,
including the Signalbuch der Kaiserlichen Marine (SKM), used mainly for signaling and
communications between ships, and the Handelsverkehrsbuch (HVB), for communications with
merchant ships. For communicating with naval attachés, the Imperial Navy also employed the
Satzbuch (SB), as well as the Verkehrsbuch (VB). The SKM, HVB, SB, and VB were all one-part
codebooks. The VB and the SB were usually superenciphered, but at the beginning of the war,
the keys were not frequently changed. The German Navy was slow to realize that copies of its
books had fallen into enemy hands early in the war. Later on, the Navy implemented various
methods for superencipherment, and also introduced new codebooks such as the
Flottenfunkspruchbuch (FFB), which replaced the SKM in 1917.

The main codebook for communicating with naval attachés, the Verkehrsbuch, maps words and
entities into groups of 5 digits. It consists of several sections, for words and expressions,
for names of places and ships, as well as for indicating positions of ships on maps. The
Stützel report (see Section 2.4) caused great alarm at the German Admiralty, and the Navy
introduced new, more complex superencipherment methods. Based on the large number of
transcripts in the British National Archives at Kew, which also mention the types of code and
superencipherment, those probably did not pose serious problems to Room 40’s codebreakers.
Using decrypts from the traffic between Berlin and the naval attaché in Madrid, Room 40 was
able to unravel and prevent various plots and espionage activities conducted from the German
embassy in Madrid.7

3 Deciphering the Genoa Cryptograms

In this section, we present the step-by-step process of deciphering the majority of the
cryptograms in the Genoa collection. We describe the processes of classifying the various
types of cryptograms, of reconstructing a diplomatic codebook, of identifying the
superencipherment method for a naval attaché code, and of recovering its key. This detective
work also required the retrieval and survey of a multitude of documents from archives in
Germany, the UK, and the US, with the assistance of leading experts. The work continued with
building a computerized database of the cryptograms, successfully deciphering most of them,
and validating the decryptions with newly found documents.

3.1 Classifying the Cryptograms

At first, we obtained six files from the RAV Genua collection at the PA AA, containing both
plaintexts and cryptograms.8,9,10,11,12,13

After analyzing the structure of the cryptograms, we were able to divide them into four
categories:

4 (Gannon, 2010), p. 130.
5 (Gannon, 2010), p. 205.
6 (Gannon, 2010), p. 131.
7 (Gannon, 2010), Chapter 13 - The Spanish Interception.
8 PA AA - RAV Genua 09, Acten betreffend Ziffern 1867-1908.
9 PA AA - RAV Genua 10, Chiffrierwesen 1898-1913.
10 PA AA - RAV Genua 11, Sammlung der Chiffres 1889-1908.
11 PA AA - RAV Genua 12, Sammlung der Chiffres mit Ausschluss der Korrespondenz mit den
Marinebehörden, Bd. 2, 1904-1914.
12 PA AA - RAV Genua 13, Chiffrierte Telegramme 1914-1915.
13 PA AA - RAV Genua 14, Telegramme in Chiffre. 1914-1915.
• Sequences of letters: Two messages from 1897 and 1898, each composed of series of letters,
from a to z. After a quick analysis, we identified the encryption method to be Vigenère, and
we deciphered the two cryptograms. The German plaintexts contain references to another cipher
system, as well as new keywords for that system, for which there are no corresponding
cryptograms in the Genoa collection.

• 5-digit codes with indicator 1847X: A set of messages composed of groups of 3, 4, or 5
digits, from December 1913 to mid-1915. Those cryptograms have an indicator of the form 1847X
(18470 to 18479, usually 18470) as one of the first groups.

• 5-digit codes with indicator 1810X: A set of messages composed of groups of 3, 4, or 5
digits, from 1898 to November 1913. Those cryptograms have an indicator of the form 1810X
(18100 to 18109, usually 18102) as one of the first groups.

• 10-letter codes: A set of nine messages from August 1914, composed of groups of 10 letters
each, sent between the Kaiserliche Marine Admiralsstab (German Imperial Navy Admiralty),
German warships Goeben and Breslau, and the German consulate in Genoa.

3.2 Deciphering Diplomatic Codebook 18470 Cryptograms

Following the successful decryption of the Vigenère messages, we first analyzed the
cryptograms with the 1847X indicators. In the archive records, a few hundred of them are
available. Although plaintexts also appear in the original records, we could not match any of
them to a corresponding cryptogram with a 1847X indicator.

We found a key document on the subject, Studies in German Diplomatic Codes Employed during the
World War, written by Charles J. Mendelsohn, and compiled into a War Department report in 1937
(Mendelsohn, 1937). The first of its three sections is named Code 18470 and Its Derivatives.
It describes the structure of codebook 18470, based on a 1918-19 study by Mendelsohn and a
team of cryptographers at the Military Intelligence Division of the General Staff in
Washington. The study also includes a few original messages encoded using 18470, as well as
the German meaning for the codes in those cryptograms. With those, we were able to reconstruct
about 10% of the 18470 codebook and to produce fragmentary decryptions for some of the
messages in the Genoa files.

In codebook 18470, while the pages are scrambled, the words inside each page (such as the
words with codes between 12100 and 12199) are alphabetically close. Based on this, we tried to
guess assignments for unknown numerical codes in pages for which we had other known
assignments. A team of linguists investing time on the problem would probably have been able
to reconstruct large parts of the codebook and decipher most of the cryptograms, given the
availability of hundreds of them. However, such resources were not available to the author. To
progress, either a copy of the codebook, or some plaintexts matching the cryptograms were
required. A search for matching plaintexts in archives produced only a single message, dated
August 1, 1914, sent by the German consul in Genoa, von Herff, to the German Foreign Office.14
It reads as follows:

    Nummber 7. Im hiesigen Hafen
    liegende englische Dampfer der White
    Star Line und British India Company
    ‘Celtic’ und ‘Malda’ sind von ihren
    Gesellschaften angewiesen möglichst
    rasch auslaufen und westlisch.15

The plaintext is a report about British ships leaving Genoa westbound. Using the date, the
message length, and the correspondents, we were able to locate the original ciphertext in the
Genoa files.16

The code corresponding to the word Dampfer (steamer), 10275, also appears in Mendelsohn’s
study and has the same meaning. Other codes correspond to words or expressions located in
alphabetical positions as expected from Mendelsohn’s interpretation of the 18470 code. Based
on this, we were able to conclude not only that the messages with the 1847X indicators were
indeed encoded with the 18470 codebook, but that they

14 PA AA, R 19875, Bl. 31. Generalkonsul von Herff an das Auswärtige Amt.
15 ‘In local port anchored steamers of the White Star Line and British India Company ‘Celtic’
and ‘Malda’ have been instructed by their companies to leave port as soon as possible and
(sail) westbound.’
16 PA AA - RAV Genua 13, Chiffrierte Telegramme 1914-1915, p. 6.
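
The four categories listed in Section 3.1 are distinguished by purely structural features, so the sorting step can be sketched in a few lines of Python. This is a minimal sketch, not the procedure actually used: the function name `classify`, the category labels, and the reading of "one of the first groups" as "one of the first three groups" are all assumptions made for the illustration.

```python
import re
from typing import List

# Assumption: "one of the first groups" is read as "one of the first three groups".
INDICATOR_GROUPS = 3


def classify(groups: List[str]) -> str:
    """Assign a cryptogram, given as a list of groups, to one of four categories."""
    if not groups:
        return "unknown"
    if all(g.isalpha() for g in groups):
        # Navy traffic: every group has exactly ten letters.
        if all(len(g) == 10 for g in groups):
            return "10-letter codes"
        # Otherwise a plain run of letters (the Vigenère candidates).
        return "sequences of letters"
    if all(g.isdigit() and 3 <= len(g) <= 5 for g in groups):
        head = groups[:INDICATOR_GROUPS]
        if any(re.fullmatch(r"1847\d", g) for g in head):
            return "indicator 1847X"
        if any(re.fullmatch(r"1810\d", g) for g in head):
            return "indicator 1810X"
    return "unknown"
```

With this sketch, a message beginning with the groups 123, 18470, 10275 is labelled "indicator 1847X", while DUMOSEPIRE CLYHMUIMUS is labelled "10-letter codes". Anything that fits none of the patterns is returned as "unknown" rather than forced into a category.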
were encoded without any additional encipherment. This finding was an important step. But
while the plaintext also provided the meaning for a few additional codes, this was not enough
to progress with the decryption of other 18470 messages in the collection.

We started to look for copies of original codebooks. Copies of various WW1 German codebooks
are available at the British National Archives at Kew, including naval codes such as the SKM,
captured in 1914 from the German warship Magdeburg. The archives also include a version of
code 13040, reconstructed via cryptanalysis by Room 40, and used to encode the (in)famous
Zimmermann Telegram in 1917. The successful decipherment of the Zimmermann Telegram, along
with German pursuance of unrestricted submarine warfare, contributed to the entry of the
United States into the war. However, neither the 18470 codebook nor the 18102 appears in
British, German, or US archives.

In his study, Mendelsohn described how the 18470 codebook was part of a larger family of
codes, including the 12444, the 1777, and the 2310 codebooks, all derived from the same
division of words and expressions into pages, the pages being reshuffled differently. We could
not find any of the 18470 derivatives listed by Mendelsohn in US, British, or German archives.
Further research in the British National Archives at Kew produced another document, The
Political Branch of Room 40, which mentions two other codebooks, 89374 and 3512, captured by
the British in Persia in 1915. According to this report, an analysis by the Political Branch
led to the conclusion that those two codes stem from the same source, albeit reordered
differently.17

Several recent papers also link those two codebooks to the 18470 family, and this assumption
was strengthened by the fact that Mendelsohn mentions there existed at least one member of
that family, unknown to him (Freeman, 2006; Kelly, 2013). Fortunately, a copy of the 3512
codebook is available at Kew.18

After obtaining a photocopy of the 3512 codebook, we needed to establish the precise
relationship between the 3512 and the 18470, using the known numerical codes from Mendelsohn.
Eventually, and after an extensive trial-and-error process, we were able to reconstruct almost
the entire mapping between the codebooks. While some random elements of the mapping created
some challenges, the compilers of the 18470 derivatives (including the 3512) had applied
several regular patterns in the process, which helped us significantly (and also weakened the
security of the codebook).

After the mapping was established, we wrote special software and used it to decrypt all the
messages encoded with 18470, except for a few names which appear in a special supplement of
the codebooks (and for which there is no conversion formula or pattern).

3.3 Diplomatic Codebook 18102 Cryptograms

After successfully reconstructing codebook 18470, we turned to the 1810X cryptograms. Several
plaintexts from October and November 1913 announce the transition from the 18102 codebook to
the 18470 codebook, and include an order to destroy all physical copies of the 18102.
Unfortunately, we were unable to find copies of the 18102 codebook in any of the relevant
British, German or US archives. An analysis of the ranges of pages showed that the 18102 code
could not have been a derivative of the 18470. Codebook 18102 might still be a derivative of
the 13040 codebook, as the 13040 was also in use before the war, but there is no evidence in
that direction, and further work is needed to check this hypothesis. Lacking a corresponding
plaintext for any of the 18102 messages, or a derivative of this code, we were neither able to
reconstruct the codebook, nor to decipher any of the cryptograms. Since the 18102 and the
18470 share the encoding of numbers and dates (Dreinummerheft), it might be possible to look
for matching plaintext-ciphertext pairs in the files based on the message serial numbers.

3.4 Deciphering Naval Codebook Cryptograms

The last category comprises only nine cryptograms, sent in August 1914, and involving Navy
recipients or senders. They consist of 10-letter codes, such as DUMOSEPIRE or CLYHMUIMUS, with
the prefix (the first five letters) of one of the 10-letter codes often used as the prefix in
another 10-letter code, or the suffix (the last 5 letters) of one code often used as the
suffix of another code. A likely codebook candidate appeared to be the

17 (ADM, 223) ADM 223/773, George Young, Political Branch of Room 40, Section ‘89374 and 3512’.
18 (HW, 7) HW 7/26 German Codebook Number 3512.
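
The idea behind relating two codebooks of the same family, as described in Section 3.2, can be sketched as follows. This is a deliberately simplified illustration and not the mapping that was actually reconstructed: it assumes that only the page numbers are reshuffled while the two rightmost digits (block and position) are preserved, it ignores the random elements and the supplement mentioned in the text, and the names `learn_page_map` and `convert_code`, as well as any page numbers fed to them, are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def learn_page_map(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Derive a page-to-page table from known (source code, target code) pairs.

    Assumption: two codes for the same word differ only in their page digits.
    Pairs that break this assumption are skipped instead of being guessed at.
    """
    page_map: Dict[str, str] = {}
    for source, target in pairs:
        if source[-2:] == target[-2:]:
            page_map[source[:-2]] = target[:-2]
    return page_map


def convert_code(code: str, page_map: Dict[str, str]) -> Optional[str]:
    """Translate a code into the other codebook, or return None if the page is unknown."""
    new_page = page_map.get(code[:-2])
    if new_page is None:
        return None
    return new_page + code[-2:]
```

Under these assumptions, a single matched pair on a page is enough to translate the other 99 codes of that page, which is one way to see why regular patterns in the derivation weaken the whole family.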
HVB, used for communicating with German merchant ships. While the HVB is primarily a 4-letter
code, each code also has a 10-letter equivalent, composed of a combination of a 5-letter
prefix and a 5-letter suffix. However, none of the HVB prefixes or suffixes seemed to match
any of those found in the Genoa cryptograms. The HVB also had an optional substitution
superencipherment.19 This substitution preserves the vowel-consonant structure of the original
ten letters, and since this characteristic can be used to validate possible outputs, we were
able to rule out the possibility that the cryptograms were encoded with HVB with substitution.

The next obvious candidate was the VB, intended for naval and military attaché communications.
The VB consists of 5-digit codes. With the assistance of other scholars, the author was able
to obtain a photocopy of the VB, as well as a copy of a VB supplement.20,21

The supplement describes a mapping of 5-digit VB codes to 5-letter prefixes (representing the
first three digits) and 5-letter suffixes (representing the last two digits). Those prefixes
matched those found in the Genoa files. Therefore, we were able to map all the 10-letter codes
in the collection into their 5-digit equivalents. However, none of these 5-digit codes would
map to words or expressions in VB which have a logical or relevant meaning, indicating that
some form of superencipherment had been employed. After all, naval communications were deemed
to be more sensitive than regular diplomatic communications. There was no clue, however, about
the specific type of superencipherment employed here. At this stage, the research had reached
a dead end regarding the 10-letter cryptograms.

After several months of extensive research, we found in the British National Archives at Kew a
message sent on August 3, 1914, to the Goeben warship by the Admiralty in enciphered VB. The
file consists of a log of English transcripts (translations) of VB messages from 1914,
intercepted and deciphered by Room 40. The message from August 3, 1914, is the only one in the
file for which the cryptogram is also available. The German plaintext was also available from
other sources (Lorey, 1928). Unfortunately, we could not (yet) draw any conclusions from this
sample alone.22

A breakthrough came from a review of the messages sent in the 18470 codebook, which by then we
were already able to decipher. A message from Berlin was sent in 18470 code to Genoa on August
1, 1914, with the following instructions:

    Nummer 9 unter Bezugnahme auf Telegr.
    Nummer 10. Schlüsselzahlen zu Marine
    Chiffres lauten: Schlüssel B: 469,
    reserve B: 718. Auswärtig. Amt.23

In a serious breach of security, this message specifies the primary key (469) as well as the
reserve key (718) for the Navy’s cipher. We hypothesized that these could be keys for some
superencipherment. The next step was to look for references to either of the two keys, hoping
this might help to identify the type of superencipherment. We were unable to find any
reference to key 469. However, the author vaguely remembered a mention of key 718, in the
multitude of archive files already reviewed. Luckily, an extensive survey of all the material
gathered so far resulted in the (re)discovery of a reference to key 718, in Mendelsohn’s study
(Mendelsohn, 1937). The third chapter lists several methods for the superencipherment of
codes. One of them is based on sliders (Schieber in German), which consist of a set of three
substitution slides. Those slides map some of the digits of the 5-digit codes to other digits
according to some random pattern. A 3-digit key specifies the starting position of each one of
the three sliders. Mendelsohn provides the ordering for a set of 3 sliders used before the war
and until 1917, described in Table 1 (Mendelsohn, 1937). In this example, the sliders are set
to key 718, and are to be applied on the second, third, and fourth digits (the first and last
digits are kept unchanged).

Interestingly, the example given by Mendelsohn uses key 718, which happens to be the reserve
naval key mentioned in the 18470 message. This was a clear indication that the 10-letter
cryptograms might have been superenciphered using

19 (ADM, 137) ADM 137/4320, Chiffresschlüssel H.V.B. 1913.
20 (ADM, 137) ADM 137/4374, Verkehrsbuch (VB) 1908.
21 (ADM, 137) ADM 137/4314, Verkehrsbuch Supplement.
22 (ADM, 137) ADM 137/4065, Log of intercepted German signals in Verkehrsbuch code from various
sources 1914-1915, entry 113.
23 ‘Number 9 with reference to telegram number 10. The keys for Marine cipher are: Key B: 469,
reserve B: 718. Foreign Office.’
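
The vowel-consonant argument used in Section 3.4 to rule out the HVB substitution can be made concrete with a small Python sketch. It is an illustration only: the names `cv_pattern` and `compatible_with_substitution` are hypothetical, treating Y as a vowel is an assumption, and any "codebook" groups passed to the second function would be stand-ins, since the HVB 10-letter equivalents are not reproduced in the text.

```python
from typing import Iterable

# Assumption: Y is counted as a vowel; the text does not say how it was treated.
VOWELS = set("AEIOUY")


def cv_pattern(group: str) -> str:
    """Return the vowel/consonant skeleton of a letter group, e.g. 'CVCV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in group.upper())


def compatible_with_substitution(
    observed: Iterable[str], codebook_groups: Iterable[str]
) -> bool:
    """Check whether every observed group has a skeleton that occurs in the codebook.

    A substitution that preserves the vowel-consonant structure cannot create
    a skeleton that the unenciphered codebook groups do not already have.
    """
    allowed = {cv_pattern(g) for g in codebook_groups}
    return all(cv_pattern(g) in allowed for g in observed)
```

If even one intercepted group has a skeleton that never occurs among the codebook's 10-letter equivalents, the hypothesis "HVB with substitution" can be discarded without knowing the substitution key.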
Original   Second digit   Third digit   Fourth digit
digit      becomes        becomes       becomes
0          7              1             8
1          0              9             3
2          9              4             4
3          2              6             6
4          6              2             5
5          3              7             2
6          5              3             7
7          8              5             1
8          1              0             9
9          4              8             0

Table 1: Slider for VB

sliders. Next, we tried to decode some of the 10-letter cryptograms using the sliders with key
718, but this failed to produce any plausible plaintext. Another option was key 469. We tested
that slider key on one of the cryptograms and obtained a few German words related to Kohle
(coal), a topic highly relevant to the escape of the Goeben. When applying the sliders with
key 469 to other cryptograms, we could finally recover plausible plaintexts. To further
validate those findings, we tried to apply the same slider method to the message from August
3, 1914, sent from the Admiralty to the Goeben. While this message could not be deciphered
using key 469, further analysis showed that another key was applied, namely 5288, with the 3rd
slider (at key position 8) also being applied to the fifth digit (in addition to being applied
to the fourth digit). This message reads as follows:

    August 3 Bündnis geschlossen mit
    Türkei Goeben Breslau sofort gehen
    nach Konstantinopel bescheinigen.24

We had thus achieved a complete solution for the elusive 10-letter naval cryptograms in the
Genoa collection. We were now certain that those consisted of VB codes superenciphered with
sliders, using key 469.

3.5 New Genoa Files

Our project did not end here. One year after successfully deciphering the cryptograms in the
first six files, we were able to obtain three new files from the Genoa collection in the PA
AA. Two of them included plaintexts, many of which could be matched to original 18470
cryptograms based on their serial numbers. The matching could not be done before as the serial
numbers appear encoded in the cryptograms. Further analysis showed that those new files
include plaintexts for about 40% of the 18470 cryptograms, and it was possible to validate
that they had been (mostly) correctly deciphered.25,26

A third file contained messages from 1910 encoded using VB with sliders. Surprisingly, those
could be decrypted using slider key 469, which indicates that this key was in effect for
several years and until the war broke out, highlighting a severe breach of security.27

Those decryptions further confirmed the correctness of our solutions for the 1914 naval
messages in the collection.

4 The Contents of the Cryptograms

The “RAV Genua - Generalkonsulat Genua” collection at the PA AA covers the period from 1867 to
May 24, 1915, when the German consulate was closed after Italy entered the war on the side of
the Entente powers. It also covers the period from 1921, after the consulate reopened, until
the end of World War 2. Our research focuses on the first period, and especially on the years
1913, 1914, and 1915. The records cover a wide range of topics, including administrative and
legal matters (such as passports and visas), protocol, local politics, naval intelligence,
economy, trade, and shipping.

Of particular interest are the decryptions related to three subjects, namely the declarations
of war in summer 1914, the role played by the consulate in gathering naval intelligence, and
its role in assisting the Goeben and Breslau warships to escape to the Dardanelles. The latter
event had a significant impact on the war in the Mediterranean Sea and the Middle East.

4.1 No War Without Declaration

World War I was one of the last modern, major military conflicts in Europe which started with
formal declarations of war, by all parties involved. Countries felt obliged to formally
declare war, as part of an official international protocol defined

24 ‘August 3: Alliance with Turkey concluded. Goeben and Breslau should at once sail to
Constantinople.’
25 PA AA - RAV Genua 74, Kriegsgefahr 1914-1915.
26 PA AA - RAV Genua 77, Krieg, Militärsachen 1914-1915.
27 PA AA - RAV Genua 68, Chiffres nach d. Marine 1907-1914.
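
The slider substitution of Table 1 can be written out as a short Python sketch. It covers only the setting shown in the table (key 718, applied to the second, third and fourth digits). How the slides shift for other keys such as 469 is not specified in the text, so that part is deliberately not modelled here, and the names `SLIDERS_KEY_718`, `encipher` and `decipher` are hypothetical.

```python
# Each string lists, for original digits 0..9, the digit it becomes (Table 1, key 718).
SLIDERS_KEY_718 = (
    "7092635814",  # applied to the second digit
    "1946273508",  # applied to the third digit
    "8346527190",  # applied to the fourth digit
)


def _check(code: str) -> None:
    if not (code.isascii() and code.isdigit()) or len(code) != 5:
        raise ValueError("expected a 5-digit code, got %r" % code)


def encipher(code: str) -> str:
    """Apply the three sliders to digits 2-4; the first and last digits are kept."""
    _check(code)
    middle = "".join(SLIDERS_KEY_718[i][int(code[i + 1])] for i in range(3))
    return code[0] + middle + code[4]


def decipher(code: str) -> str:
    """Invert the sliders: find which original digit produces each observed digit."""
    _check(code)
    middle = "".join(str(SLIDERS_KEY_718[i].index(code[i + 1])) for i in range(3))
    return code[0] + middle + code[4]
```

As a purely arithmetical example, the group 10275 becomes 17415 under these sliders: the leading 1 and trailing 5 are untouched, while 0, 2 and 7 are replaced by 7, 4 and 1 according to the three columns of Table 1. Applying `decipher` to 17415 restores 10275.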
at The Hague Peace Conference of 1907, and for to the region one of her most modern warships,
internal legal and political reasons. With a for- the battle cruiser Goeben, together with the light
mal declaration, a country could start mobilizing cruiser Breslau, under the command of Rear Ad-
its army. Also, military and merchant navy ships miral Souchon. Given the vast superiority of the
had to be informed that they should leave hostile British and French fleets in the Mediterranean Sea,
ports, to avoid being seized. As this was usually those two lone ships were at risk of being iso-
done before issuing the formal declaration of war, lated, captured or destroyed as the war broke out.
any signs of movement of ships in times of cri- Souchon was first ordered to escape via the Gibral-
sis might indicate an upcoming declaration of war. tar straits but instead decided to attack French
The Genoa collection includes a series of mes- facilities in North Africa. After the attack, with
sages informing the consulate of the various dec- the westbound route being blocked, he was or-
larations of war, and of their impact such as the dered to reach the Dardanelles, following the sign-
freedom of movement of German nationals. From ing of an alliance between the Ottoman Empire
the first declaration of war between Russia and the and the German Empire in the beginning of Au-
Austro-Hungarian Empire, and throughout August gust 1914. To successfully escape vastly superior
1914, the tensions escalate, and this is reflected in enemy forces, the Goeben needed large quantities
the communications. For example, on August 2, of coal, required to reach higher speeds. Sup-
1914, the following message is sent from the Ger- ply of coal in sufficient quantities could only be
man Foreign Office to Genoa: found in Italy or obtained from German merchant
ships. For that purpose, the German Foreign Of-
Nummer 8. Durch allerhöchst fice instructed its local representations to assist the
kabinettsorder ist Mobilmachung Goeben and Breslau to secure large quantities of
angeordnet. Bitte deutsche Schiffe coal. This effort is reflected in several messages,
im dortig Amtsbezirk ohne rücksicht encrypted with the 18470 codebook, as well as
auf Geheimhaltung weiter warnen with the VB with superencipherment. For exam-
und Dienstpflicht zur Rückkehr ple, the following message was sent on August 1,
auffordern. Jagow.28 1914, from von Herff, the consul in Genoa, to Rear
Admiral Souchon:
4.2 Naval Intelligence
A large number of cryptograms relate to naval in- Goeben - Messina. Auf Ersuchen von
telligence collected mainly from public sources, Breslau: Kohlendampfer ist nicht
such as newspapers, or German nationals return- vorhanden. Deutsches Kohlendepot
ing from British and French colonies. Movements ist bemüht, möglichst viele Kohle
of ships, including warships as well as merchant kaufen, hoffen Montag 2000 Tonnen
ships transporting troops, are routinely reported. gemischte gut Kohle zu sammeln und
An example of such a report is given in Sec- Bescheid zu geben. Welche Menge von
tion 3.2. Kohle gebraucht und wohin zu liefern?
Herff29
4.3 Assistance to the Goeben and Breslau
Warships Other records describe the requisition of Ger-
man merchant ships and their coal, the securing
The most interesting findings in the decrypted of funds for transactions, and negotiations with
records are about the extensive assistance given by Italian authorities. Eventually, the Goeben and
the German consulate in Genoa (as well as other Breslau were able to obtain significant quantities
German representations in the region), to the Ger- of coal, allowing them to escape the British and
man warships Goeben and Breslau in their escape French fleets, and to reach the Dardanelles. They
to the Dardanelles, in August 1914. To extend joined the Ottoman fleet under the Ottoman flag.
its presence and influence in the Mediterranean Their attack on Russian facilities, carried indepen-
Sea, the German Empire had before the war sent
29 ‘Goeben - Messina. At the request of Breslau: Coal
28 ‘Number 8. By highest cabinet decision mobilization has steamer is not available. German coal depot working hard
been ordered. Please continue warning German ships in the to buy as much coal as possible and expects to collect 2000
local district, regardless of confidentiality, and request those tons of mixed, good quality coal on Monday, and will report
liable for [military] service to return. Jagow.’ on it. How much coal is needed and where to deliver? Herff’
63
dently of their Turkish counterparts, later precip- References
itated the entry of the Ottoman Empire into the ADM. 137. Admiralty: Historical Section: Records
war (Van der Vat, 2000). As a result, the Entente used for Official History, First World War. The Na-
powers had to divert significant resources to the tional Archives.
Mediterranean Sea and the Middle East, including ADM. 223. Admiralty: Naval Intelligence Division
for the catastrophic Dardanelles offensive in 1915. and Operational Intelligence Centre: Intelligence
The critical role played by the consulate in Genoa Reports and Papers. The National Archives.
is for the first time exposed in the decrypted mes-
Peter Freeman. 2006. The Zimmermann Telegram Re-
sages from the Genoa collection. visited: A Reconciliation of the Primary Sources.
Cryptologia, 30(2):98–150.
5 Conclusion
Paul Gannon. 2010. Inside Room 40: The Codebreak-
This research highlights inherent weaknesses in ers of World War 1. Ian Allan Publishing Ltd.
German cryptographic methods and procedures HW. 7. Room 40 and successors: World War I Official
for diplomatic and naval communications at the Histories. The National Archives.
beginning of WW1, as follows:
Saul Kelly. 2013. Room 47: The Persian Prelude to the
Zimmermann Telegram. Cryptologia, 37(1):11–50.
• Most of the confidential diplomatic commu-
nications relied on codebooks, which were in George Lasry, Ingo Niebel, Nils Kopal, and Arno
use for long periods of time. Also, the com- Wacker. 2017. Deciphering ADFGVX messages
pilers of codebooks often used regular pat- from the Eastern Front of World War I. Cryptolo-
gia, 41(2):101–136.
terns, rather than fully random patterns, to
map certain elements of the codebook to their Hermann Lorey. 1928. Der Krieg in den türkischen
equivalent numerical codes, thus facilitating Gewässern: Bd. Die Mittelmeer-Division, volume 1.
ES Mittler.
the work of adversarial codebreakers.
Charles J. Mendelsohn. 1937. Studies in German
• Instead of issuing entirely new codebooks, Diplomatic Codes Employed during the World War.
the German cryptographic services created War Department, Office of the Chief Signal Officer,
Government Printing Office, Washington, DC. Reg-
new variants of existing codebooks by only
ister 191.
modifying the order of their pages. As a re-
sult, the capture of one codebook was often PAAA. c. 1915. RAV Genua - Records from the
enough in order to reconstruct other related German General Consulate in Genoa. Politisches
Archiv des Auswärtigen Amtes.
codebooks.
Hermann Stützel. 1969. Geheimschrift und Entzif-
• The key for the superencipherment of ferung im Ersten Weltkrieg. Truppenpraxis, 7:541–
one codebook was often transmitted using 545.
another, possibly compromised codebook. Geoff Sullivan and Frode Weierud. 2005. Breaking
Moreover, the superencipherment methods, German Army Ciphers. Cryptologia, 29(3):193–
as well as the keys, were infrequently mod- 232.
ified. Dan Van der Vat. 2000. The Ship that Changed the
World. The Escape of the Goeben to the Dardanelles
As a result of those weaknesses, the author was in 1914. Edinburgh.
able to decipher the vast majority of the Genoa en-
coded traffic, using methods which are very sim-
ilar to those employed by Room 40 and other
WW1 codebreaking agencies. The decipherment
of the cryptograms in the Genoa collection also
exposes new historical material related to key de-
velopments and events in 1914-1915. Further re-
search is underway to analyze the contents of the
messages, and their historical context and signifi-
cance.
Learning Cryptanalysis the Hard Way:
A Study on German Culture of Cryptology in World War I
Dr. Ingo Niebel
Historian and journalist
Kasparstr. 10
50670 Köln, FR Germany
ingo.niebel@berriak-news.de
results, but also to link it with other fields and disciplines. That approach will be explained in the following three parts.

In the following section, I will present the assumptions on which I have based this investigation. In two subsections, my aim is to define my understanding of intelligence and why it is still a "missing dimension" in historiography. This in turn leads to the second subsection, which encroaches upon the German "culture of cryptology". The third section focuses on telecommunication and its impact on cryptology, because in the early 20th century telecommunication was a relatively new field with unknown advantages and disadvantages. Finally, I shall refer to the sources, with a focus on the problems that historians encounter when using records obtained from the intelligence services.

Following that, I shall present some of my first results in chronological order. The four subsections describe different aspects of the German cryptology culture. The first deals with the specific terms that German encyclopedias used for cryptology. The second subsection summarizes the case in which a German citizen publicly accused the Foreign Office of having plagiarized his code-system. The quarrel reveals that the Foreign Office showed no concern for security. The subsection on the Crypto-Crisis of 1917 serves two purposes: on the one hand, it discusses the problems a historian faces when he or she has to rely on intelligence records; on the other, it also indicates how such a source can push the investigation forward. The last subsection provides a firsthand explanation as to why there is still no comprehensive study on German cryptology.

The fourth and last section brings us back from the past to the present. It provides some hints as to why the German cryptology culture of 1914/1918 is linked in some way to today's "information security culture". It also provides some proposals for further investigations.

Due to time and space restrictions, and to the fact that this article summarizes the status of a current investigation, it raises no claim to completeness.

2 Methods

Since the first decade of the 21st century, we have had access to declassified records and to information recovered from encrypted radiograms. The US foreign secret service, the Central Intelligence Agency (CIA), and its technological partner, the National Security Agency (NSA), published historical documents related to cryptology, including the names of persons, on their webpages. In parallel, the community of non-governmental researchers who dedicate themselves to historical cryptology has been seeking unsolved messages from both World Wars. So, on the one hand, historians and cryptologists have access to new sources; on the other, cryptanalysts provide "new" records and insights by recovering and solving forgotten cryptograms (Lasry et al. 2017, Sullivan and Weierud 2005). All these new sources need to be put in a greater academic context.

The ongoing investigation is based on two suppositions. Firstly, every state creates the intelligence community it considers necessary. Therefore, the organizational charts of its ministries and armed forces can reveal the importance given to the secret and cryptologic services. Investigating these structures also sheds light on how the government allowed private individuals to handle cryptology.

Secondly, the saying “once an agent, always an agent” defines the other main line of research. It focuses on the persons who worked for one or various secret and/or cryptologic institutions. Both research fields are connected by seeking the interactions between institutions and their personnel. That implies that one must follow the organizational change in the departments. It would be interesting to know the social, professional, and cryptologic background of the personnel.

Today it is common to talk about the intelligence community by referring to all governmental, military, and police institutions dealing with intelligence. In parallel, we have the cryptology community, composed also of officials, private individuals, and their departments or firms. In contrast to their British and US counterparts, the German cryptologists remain relatively unknown. This fact makes both cryptology and intelligence a part of a "missing dimension" in historiography.
2.1 Intelligence, a "Missing Dimension"

Spies were additional pawns on the great chessboard where the European powers played the tragedy of World War I. As previously mentioned, some crucial moves performed by the decision-makers of one side or the other were based on the intelligence gathered by their radio stations and cryptanalysts. The question lies in understanding to what extent this kind of intelligence influenced military and political decision-making.

In the 1980s some German and British authors had already mentioned intelligence as being the "missing dimension" in political and military historiography (Höhne 1993:7). After analyzing several dozen international publications on the topic, Larsen (2014:282) concludes that this military conflict "remains in many ways underexplored by intelligence scholars." In fact, he found fewer than a handful of German works on that subject.

This problem is caused partly by the intelligence agencies themselves, because it is part of their nature to act secretly, without always acting in a legal or morally correct manner. So, for the sake of security, the intelligence services have good reason not to share their records with historians who, on the other hand, without these documents cannot evaluate how large the "missing dimension" really is.

Although intelligence services release their records from time to time, historians cannot expect to receive complete files. Because deception and cover-ups belong to the working tools of secret services and their assets, scholars are forced to crosscheck every piece of disclosed information, which makes their investigations more complicated. On the other hand, Paul Gannon (2010) showed, referring to the British interception log books, that His Majesty’s codebreakers could read enciphered German naval messages months before the war broke out and Room 40 was established. In consequence, his finding contradicts the official version and requires a review of the prewar history of British cryptological efforts.

Another complication derives from the necessity to define what intelligence really means. "I define intelligence in the broadest sense as information," concluded Kahn (2001). In my mind, information becomes intelligence according to the importance that is given to, e.g., things, individuals, organizations, data, or messages at a concrete time and for a specific aim. 1

1 This is another very Anglo-Saxon definition of "intelligence"; it differs from how German secret service officers used to interpret "Information" which, according to them, becomes "Nachricht" (intelligence) when it is confirmed by other sources.

For delimiting intelligence as a "missing dimension", I consider very helpful the three principles which, according to Kahn (2001), describe its function. First, it helps to optimize one’s resources. Second, intelligence "is an auxiliary, not a primary, element in war". Third, it is "essential to the defense but not the offense". The already mentioned battles of the Marne and Tannenberg seem to confirm Kahn’s theory. At this point we have the intersection between intelligence and cryptology, but it is not necessarily the only one.

In 2016 the German Historical Institute London (GHIL) held a conference on "Cultures of Intelligence". In that context, the GHIL stated that "Culture was understood to include the role of intelligence services in society and/or the state, the representation of intelligence in the public sphere and among the members of the military/intelligence community itself, as well as the interests, assumptions, and operating procedures of intelligence." (Sassmann, Schmidt 2016:135) This definition can be used to define the German culture of cryptology.

2.2 About a German Culture of Cryptology

It is difficult to answer whether it is "a" or "the" German culture of cryptology because it depends on the epoch. When we refer to a time before 1870/71, we should preferably use "a", because the culture in question may be linked to a specific kingdom on German soil. For example, Rous (2011) analyzed the Saxon culture of cryptology in the 17th and 18th century. But we unfortunately lack a similar investigation on the Prussian culture prior to 1870.

When the Germans created their second empire, the Prussian king also became their emperor. As a result, all key areas, such as military, foreign, economic and home policy, became centralized in the Prussian capital,
Berlin. Given the lack of documents, we have to assume that the overwhelming presence of cryptography and the absence of cryptology reflects the Prussian culture of cryptology.

A characteristic of the new Reich was that the ruling aristocracy managed to integrate the bourgeoisie in the new project. Instead of democratizing the state by getting rid of the aristocrats, the bourgeois supported the monarchy. So entrepreneurs and bankers pushed Germany’s industrialization and implementation of new technologies. Their crème de la crème were further ennobled. “Nonetheless the Wilhelminian Germany was still an authoritarian society with a static social order of considerable stickiness”, states the historian Wolfgang Mommsen (1995:71).

This and further investigations on the social order should be taken into consideration, as they might explain the absence of an intelligence and cryptologic community, and also the inability of the armed services and the Foreign Office to develop intercepting and codebreaking capabilities, as had occurred in the United Kingdom. The known facts indicate that listening to foreign conversations and reading confidential messages could have put the above-mentioned rigid order and separation of powers at risk.

From this point of view, the use of cryptography seems to have come into place as a measure to guarantee the established order. In fact, only officers and high-ranking civil servants were allowed to cipher and decipher encoded messages. Following that logic, another measure was to avoid the promotion of cryptanalytic skills.

Germany’s oldest cryptographic institution was the Chiffrierbureau of the Foreign Office. It was established in 1814 and belonged to the ministry's Zentraldepartement (PAAA 1936), which puts the Chiffrierbureau and its personnel in the focus of the current investigation. During that time, the head of the government, the chancellor, was responsible for foreign policy. But his role was limited, as he was only a counselor to the monarch.

The Kaiser also acted as the commander-in-chief of the armed services. It is known that he used cryptography, but the extent to which it was used remains unknown. He supposedly got the codes and keys from the cipher-bureau. The Foreign Office provided the naval attachés with codes, too.

One research line focuses on the individuals within the Army and the Navy structures who dealt with cryptography. The other research line concentrates on the "Abteilung für Nachrichtenmittel" (department of communication means)2 of the Royal Prussian War Ministry, another on the "Reichsmarineamt" (Empire's Navy Office). Both produced codes and ciphers, and delivered them to the troops and the civil authorities. They primarily had administrative functions, not operative ones. Therefore, another research line looks for the importance the armed services gave to cryptography in the education of their officers.

2 The German "Nachrichten" can mean "news", "intelligence", "signals" or "communications". That makes it complicated to decide whether a "Nachrichtenoffizier" is an "intelligence" or a "signals/communications officer".

The common denominator of these three governmental institutions was that they favored cryptography, reducing cryptanalysis to a "guessing" or buying codes and keys on the black market. This German credo of cryptography expressed itself in the main code books such as the Handelsschiffsverkehrsbuch (HVB), the Signalbuch der Kaiserlichen Marine (SKM) and the Verkehrsbuch (VB). These became one of the essential parts of the very specific German culture of cryptology.

In contrast to the governmental cryptographic structures, we find a considerable number of non-governmental cryptologists who, throughout the 19th and at the beginning of the 20th century, published a certain number of articles and books on secret scripts and their decipherment. This allowed people interested in the specific field to access information without major problems.

2.3 Telecommunication and cryptology

Telecommunication is in many ways essential for understanding the development of the German culture of cryptology. On the one hand, the new technology changed how people handled communication. Telegraph, radio and telephone replaced the traditional royal messenger services, as the dispatches were delivered faster by wire,
wave and cable. This mode of communication seemed secure to everyone who believed that his or her codes were unbreakable.

At the very beginning of World War I, the British destroyed the German transatlantic telegraph cables. So they forced the Germans to communicate with their colonies and embassies via radio or by telegraph connections. These connections could be monitored by British telecommunication companies. In both cases, London could intercept the communication and try to read the encoded messages.

For this project it is necessary to take this fact into account because there are several German theses in which lawyers addressed the legal and strategic issues of such a violation of the postal secrecy long before the Royal Navy made their worries real in 1914.

Another important aspect is that the importance of SIGINT can only be understood if we know the technical equipment of the signals troops and its limitations. Because of the technology, climate and geography, messages very often had to be repeated. This increased the possibility of a radiogram being intercepted. In this way, experienced cryptologists and analysts could complete garbled messages.

In this context it is also necessary to refer to the technical efforts to mechanize the encoding and decoding process. Although the German Army would purchase the Enigma only in the 1920s, some documents indicate that, at least, its theoretical development might have already started before World War I.

2.4 "Ad fontes" - To the Sources

As I mentioned above, Kahn’s publications on cryptology are essential because they frame the investigation. Articles such as those written by Stützel (1969), Brückner (2005), and Samuels (2016) give further information on facts and sources regarding German cryptology. In contrast, selected monographs on French, British, and US cryptology and SIGINT describe the “hostile environment” in which the German culture of cryptology started to grow in the summer of 1914.

The investigation takes into account how the military commanders integrated cryptology, SIGINT, and this kind of intelligence gathering into their decision-making. This in itself constituted another learning process because in 1914 they had still not changed their plan of attack, which had been drawn up in 1905 under very different circumstances. Nine years later, they still believed they could win the war in the west in the same manner as in 1870/71. They thought that once again infantry, artillery, and cavalry plus modern weapons would bring victory, and not the less regarded signals troops.

The information that is not included in the publications has to be found in the archives. This makes the project difficult because the principal Prussian-German military archives vanished during World War II. The records of the different cryptologic departments were either destroyed or captured by the victors, who delivered them to their cryptologic or intelligence services. As mentioned before, the CIA and NSA declassified such documents, as did the British services. In consequence, the respective holdings could aid in recovering information that is lacking in the German archives.

A first look into the Political Archive of the German Foreign Ministry (PAAA) showed that the entire holdings of the Chiffrierbüro have disappeared. The existence of the cipher-bureau is only confirmed because its name appears in the organizational charts of the ministry, and on several documents which can be found in other holdings. If there was once a correspondence, for example, between the cipher-bureau, the Army and the Navy on codes, it no longer exists, at least not in this archive. The unpublished memoirs of cryptologists such as those of Adolf Paschke shed some light on this gloomy situation.

The situation in the German Federal Archive, the Bundesarchiv, is slightly different. On the one hand, there are only a few sources related to cryptography in the Army; on the other hand, there is much more information on the cryptological work done by the Navy before and during World War I. The first impression after a stay in the Military Archive of the Bundesarchiv at Freiburg is that there is more information than I expected.

Because the archives of the Prussian Army and the War Ministry were destroyed, there is some hope that the corresponding holdings of the Bavarian State Archive could close this gap in some way. The
research in the regional archive of North Rhine-Westphalia provided some information on how the Prussian Interior Ministry introduced cryptography in its communication with the regional military institutions. 3

In this context, the "William F. Friedman Collection of Official Papers", as called by the NSA, is of particular interest. It contains more than 7,600 documents spanning over 52,000 pages. The collection can be searched and downloaded as a PDF via the Internet. 4 Due to the close relationship between the cryptologic and intelligence communities of the US and the UK, the NSA collection must be seen in connection with the respective holdings in the British National Archives at Kew, as some German-related documents of supposed US origin were in fact gathered by their English "cousins".

In this context, and from a purely academic point of view, the decrypts of intercepted German radiograms published by Lasry et al. (2017) present a special kind of document. To some extent, they are "retranslations" from an original text which was encoded and sent by radio. Although the cryptanalysts broke the code and obtained a plaintext again, the latter should be compared with the original message, if possible.

In any case, researchers need an organizational chart of the institution in question. This is essential for two reasons. First, an organizational chart helps to identify the departments concerned with cryptology inside a ministry, which can be helpful if the search using keywords was not successful. Second, an organizational chart uncovers the position of a cryptologic section in the respective structure. It makes a difference if it is attached directly to the minister's bureau, if it is a department, or if it is positioned on a lower level, being only a section or a subsection. So, the archives and their holdings themselves generate valuable "intelligence" on the German culture of cryptology.

3 How Germans learnt cryptology

3.1 Ignoring cryptanalysis and SIGINT

In search of reasons to explain the German fixation on cryptography, I consulted several editions of the popular encyclopedias such as Meyer’s Konversations-Lexikon and the Brockhaus. Between the 19th and 20th centuries, both publications not only ignored the existence of the word "Kryptologie", but also indicated that the term should be replaced by "Geheimschrift" or "Chiffre". Since the beginning of the 19th century, the cryptologic horizon seemed to be limited to cryptography.

This limitation is curious because it was a retired officer who published a classic on cryptology in 1863. Major Friedrich Wilhelm Kasiski titled his book "Die Geheimschriften und die Dechiffrir-Kunst" (Secret scripts and the art of decipherment). As the title indicates, it reflects our modern understanding of cryptology as based on both cryptography and cryptanalysis.

The facts collected on Kasiski indicate that he had nothing to do with cryptology while in the military. Though he dedicated his book to the acting war minister Albrecht von Roon, the author addresses him only as his former commander. It seems that the military hierarchy decided to ignore both Kasiski’s cryptological efforts and SIGINT.

“In Germany to be sure, the General Staff thought of such possibilities, but down to the outbreak of World War I had undertaken practically nothing. Even in the Foreign Office nothing had been done in this direction which was worthy of mention” states the signals officer Wilhelm Flicke.5 Only during the battle of Tannenberg in 1914 did the high command discover the advantages of SIGINT and cryptanalysis. It took several months until the new possibilities were included into its military organization.
Chiffrierbureau underestimated cryptanalysis and the interception of foreign messages by technical means. Security did not seem to feature high on their list of priorities.

It seems strange, at first, that until the end of World War I the Auswärtiges Amt published in the "Handbook for the German Reich" the identities of all the officials who worked for its Chiffrierbureau. This extent of governmental transparency included the names of individuals who were civil servants and all the medals they had been awarded. Any foreign intelligence serviceman would have been grateful to get his hands on the list of potential targets who had access to classified material.

Secondly, neither the Ministry nor the cipher-bureau seemed to be concerned when in 1872 the printer M. Niethe accused both of having stolen his code-system. The fact that several editions of his book can be found in various German public libraries proves his enthusiasm but also that neither the Reich government nor the Auswärtiges Amt tried to silence him using censorship, albeit at one point during the conflict Niethe was summoned by the police.

The background information on the cipher-bureau personnel mentioned in the Handbook, and the Niethe case are the basis for further investigation on the culture of cryptology followed by the Auswärtiges Amt.

3.3 The Crypto-Crisis of 1917

The publication of the Zimmermann-telegram in spring of 1917 not only brought the US into the war but also exposed the opinion of the Auswärtiges Amt about the security of its codes. The incident caused a major discussion regarding cryptography between the Foreign Office, High Army Command, the Army and the Navy. The “crypto-crisis” can also be considered as the start of German cryptology because from that point on, cryptanalysis was used as a means to test the strength or weakness of German codes.

On 23 March 1917, the secretary of State, Arthur Zimmermann, wrote to the representative of his Foreign Office at the General Headquarters, Baron Kurt von Lersner: "Decipherment of these telegrams is simply impossible even for the most clever specialists. It can only result if the entire cipher is betrayed or essential parts and keys come to the knowledge of a foreign government. Of course, there is no absolute security against betrayal and the only aid is the frequent change of cipher and of keys, which is abundantly provided for here." 6

6 The PDF’s filename is 41716799075610.pdf

This information on the Foreign Office's culture of cryptology is provided by a document kept in the above-mentioned Friedman Collection. The NSA labeled it "CRYPTOGRAPHIC SYSTEMS USED BY GERMAN FOREIGN OFFICE; THE ZIMMERMAN [sic] TELEGRAM." A handwritten remark on the first page of the PDF indicates that it belonged to a folder where Friedman stored various information on the Zimmermann-telegram. This leads to the beginning of the problem.

The NSA, like the CIA, does not scan entire folders but only documents. At this point, we see again the primacy of intelligence and information over the archival context of a document, as described in section 2.4. From the historical point of view we are not dealing with an original but with a copy, meaning an English translation.

Though the translator seems to have been a professional (he or she even reproduces the layout of the German original), we are unaware of how Friedman came into possession of the document. Nor do we have further information on the remaining original German texts. Despite all these questions, Zimmermann's statement and other correspondence scanned into the PDF seem to be genuine because they are supported by the article that Stützel (1969) published in a West German military publication.

In 1917, Stützel was a "lieutenant of the reserve", as we can read in one of the translated letters. He proved that, in terms of the strength and security of its codes, the quoted assumption of the Foreign Office was inaccurate. Stützel intercepted and solved the encrypted messages sent between the Auswärtiges Amt and the German Embassy at Madrid. His discovery generated the mentioned discussion on insecure codes.

This incident is important because it uncovers different aspects of the German cryptology culture. First, it stresses the role of the
Chiffrierbureau as the unique provider of diplomatic codes. Second, to believe that its codes are unbreakable can be considered as ignorance but it also expresses the inflexibility that was characteristic of imperial Germany. Third, this inflexibility made it impossible for the governmental structures to react quickly to changes in their structures and codes. Finally, Stützel's cryptanalysis of the diplomatic codes questioned not only the expertise of the Chiffrierbureau but also the competence of the entire division of the armed services in the Foreign Office.

3.4 The imposed silence

In the 1920s, the former Austrian captain, Andreas Figl, planned to publish his memoirs and experiences as the head of the cryptologic section of the Austro-Hungarian army intelligence service, Evidenzbüro. This publication reveals reasons why German cryptology of World War I was never treated by its protagonists the way the British treated theirs.

After the first of three volumes was published, Figl was pressured to step back from his project. "The action against me came from the [Austrian] Federal Army and - as I ascertained later - from the German General Staff", Figl recognized in his unpublished memoirs.7 He states that the intelligence and cryptologic communities held opposite opinions on whether he and his colleagues were still bound by the duty of secrecy or not. Figl thought he was no longer bound as the state he swore to - the Austro-Hungarian Monarchy - was no longer in existence since 1918, when it broke into several independent republics. The Austrian Emperor had to abdicate and go into exile, similar to his German counterpart.

7 Bundesarchiv, MSG 2_18031

Obviously, on the other side of the Alps, the German military saw that quite differently. From the legal perspective, one has to question whether the duty of secrecy sworn before 1918 persisted or not. The oath was considered legitimate if it was to the German Reich but not to the Kaiser and king. The former, although converted from monarchy into republic, persisted as the official denomination of the new state.

Though the political system changed, and the military had to downsize its structures according to the Treaty of Versailles, the Army and the Navy maintained their principal SIGINT and cryptology organizations, which also included part of the personnel.

Lasry et al. (2017) provide solved radiograms sent by the signals captain Walther Seifert. After the collapse and defeat of 1918, he switched over to the Chiffrierstelle (cipher-section) of the Reichswehrministerium (Ministry of the Armed Forces). In 1933, he was one of the founders of the Forschungsamt (Research Office). The latter became the technical intelligence agency of National-Socialist Germany, which was a part of Hermann Göring's Reich Air Ministry, and Seifert became its head of cryptanalysis.

Albert Praun started his military career in the signals troops of the Bavarian Army. He later took over several military commands until 1944 and then became the Army's Chief Signals Officer. From 1956 to 1965 he headed the SIGINT department of the West German foreign intelligence service, the Bundesnachrichtendienst (BND). Although Praun published some articles on that subject, he kept his imposed silence.

4 The Presence of the Past

In spite of the mentioned publications and sources, the imposed silence on the German culture of cryptology persists. The research on this area has recently begun and some questions might never be answered. Investigating the German culture of cryptology prior to and during World War I is linked to our modern security culture because both are parts of the same chain.

There are at least two further links, those of cryptology and intelligence during World War II and the Cold War. For the latter it would be interesting to know whether the cultures of cryptology in the two German states were different because of their opposite ideological views due to their particular intelligence cultures. The next step would be to compare them at least with the French, English, US, and Russian cultures of cryptology, if possible.

But before we follow the chain up to the present time, we should look back from the German Reich of 1871 to the earlier epoch of the 18th-century Black Chambers. Perhaps, on the one hand, this investigation can provide
information for closing the gap between the cryptologic and intelligence system of the late 19th century and that of Prussian king Frederick the Great in the 18th century. On the other hand, given the lack of reliable sources, it could be helpful to compare at least the code systems used in both periods. Maybe similarities could be found that shed some light on Prussian cryptology and its continuity.

As for the learning process, the German culture of cryptology underwent several changes between 1870 and 1918. The real activity of the Chiffrierbureau will never be discovered but at least the files on its personnel could provide information on how they entered the section and what kind of preparation they undertook for their work. In this context it would be interesting to analyze the path and networks of those cryptologists who started their career in the Army or in the Navy.

A part of a learning process is also how people handle their successes and above all their failures. As Figl mentioned, the German military and political elites avoided being held liable for their failures in matters of cryptology and intelligence. They covered up their first major defeat in France by calling the corresponding battle the "miracle of the Marne". In this and other cases, the history of German cryptology can correct the greater picture of World War I by demythologizing some of its narratives.

In this context, the fact that humans tend to copy behaviors becomes a problem. Changing certain cultures becomes even more difficult if people are not used to questioning ideals. The official silence imposed on cryptology and its history was absolutely not helpful. This might explain, among other reasons, why in World War II German officials kept using the Enigma cipher machine even though they knew of its weaknesses. Referring to Zimmermann's statement on code security, one has to question human ignorance because, to date, some things were not meant to be. The warning that the US philosopher George Santayana gave us fits this context: "Those who cannot remember the past are condemned to repeat it."

Acknowledgements

I would like to express my appreciation to Prof. Christof Paar (Bochum) who brought me from history into the world of cryptology, to Prof. Arno Wacker, Dr. Nils Kopal, and Dr. George Lasry who invited me to reconstruct the historical context of the German messages they had solved. I also thank the three anonymous reviewers whose commentaries made me rethink some aspects of this article. Last but not least, I am deeply indebted to Dr. Roopika Menon who helped me to improve my English text.

References

Maria Bada and Angela Sasse. 2014. Cyber Security Awareness Campaigns: Why do they fail to change behaviour? http://discovery.ucl.ac.uk/1468954/1/Awareness%20CampaignsDraftWorkingPaper.pdf, last seen 15.01.2018.

Hartmut Böhme. 1996. Vom Cultus zur Kultur(wissenschaft). Zur historischen Semantik des Kulturbegriffs. In: Renate Glaser and Matthias Luserke (Ed.). Literaturwissenschaft – Kulturwissenschaft. Positionen, Themen, Perspektiven. Westdeutscher Verlag, Opladen: 48-68.

Hilmar-Detlef Brückner. 2005. Germany's first Cryptanalysis on the Western Front: Decrypting British and French Naval Ciphers in World War I. Cryptologia, 29(1):1-22.

Paul Gannon. 2010. Inside Room 40: The Codebreakers of World War 1. Ian Allan Publishing, Hersham.

Heinz Höhne. 1993. Der Krieg im Dunkeln. Macht und Einfluss der deutschen und russischen Geheimdienste. [The war in the dark. Power and influence of the German and Russian secret services.] (Special printing). Gondrom Verlag, Bindlach.

David Kahn. 1996. The Codebreakers. The Story of Secret Writing, [Kindle, ipad mini version]. Downloaded from Amazon.com.

David Kahn. 2001. An Historical Theory of Intelligence. Intelligence and National Security, 16:79-92. http://david-kahn.com/articles-historical-theory-intelligence.htm, last seen 14.01.2012.

Friedrich Wilhelm Kasiski. 1863. Die Geheimschriften und die Dechiffrir-Kunst. [The Secret Scripts and the Art of Decipherment] E.S. Mittler, Berlin.
Daniel Larsen. 2014. Intelligence in the First World War: The State of the Field. Intelligence and National Security, 29(2):282-302.

George Lasry, Ingo Niebel, Nils Kopal, and Arno Wacker. 2017. Deciphering ADFGVX messages from the Eastern Front of World War I. Cryptologia, 41(2):101-136.

Wolfgang J. Mommsen. 1995. Bürgerstolz und Weltmachtstreben. Deutschland unter Wilhelm II. 1890 bis 1918. Propyläen Verlag, Berlin.

David Paull Nickles. 2003. Under the Wire: How the Telegraph Changed Diplomacy. Harvard Univ. Press, Cambridge, Mass.

M. Niethe. 1875. Das "Suum cuique" in neuer Interpretation seitens des Auswärtigen Amts: nothwendig gewordener Anhang zu des Verfassers Werk: Das bei der Chiffrir-Abtheilung des Deutschen Reichskanzleramts eingeführte telegraphische Chiffrirsystem etc. [The "Suum cuique" (to each his own) in a new Interpretation from the Foreign Office: An Annex, which had become necessary, to the Author's Work: The Telegraphic Cipher-System introduced into the Cipher-Department of the German Reich Chancellory etc.] M. Niethe, Berlin.

Markus Pöhlmann. 2005. German Intelligence at War, 1914-1918. Journal of Intelligence History, 5(2):25-54.

Politisches Archiv des Auswärtigen Amtes (PAAA). 1936. Organisation des Auswärtigen Amtes bis 1936. [handwritten chart] Berlin.

Anne-Simone Rous. 2011. Geheimschriften in sächsischen Akten der Neuzeit [Secret Writing in Saxon Files of the Modern Age]. Neues Archiv für sächsische Geschichte, 82:243-254.

Anne-Simone Rous and Martin Mulsow (Ed.). 2015. Geheime Post: Kryptologie und Steganographie der diplomatischen Korrespondenz europäischer Höfe während der Frühen Neuzeit [Secret Mail: Cryptology and Steganography in the Correspondence of the European Courts during the Early Modern Age]. Duncker & Humblot, Berlin.

Martin Samuels. 2016. Ludwig Föppl: A Bavarian Cryptanalyst on the Western front. Cryptologia, 40(4):355-373.

Bernhard Sassmann and Tobias Schmitt. 2016. Cultures of Intelligence Conference Report. German Historical Institute London Bulletin, 38(2):135-140.

Hermann Stützel. 1969. Geheimschrift und Entzifferung im Ersten Weltkrieg [Code and Decipherment in World War I]. Truppenpraxis, 7:541-545.

Geoff Sullivan and Frode Weierud. 2005. Breaking German Army Ciphers. Cryptologia, 29(3):193-232.
New Findings in a WWI Notebook of Luigi Sacco
Paolo Bonavoglia
former teacher of Mathematics
Convitto Nazionale Marco Foscarini,
Cannaregio 4942, I 30121 Venezia, Italy
paolo.bonavoglia@liceofoscarini.it
2 English: At Orsova our troops have gained ground again. 3 These pages are between a page dated 11-10-1916 and one
South of Hatzeg lost the Rumanians dated 17-10-1916.
Major Koppen deutsche Gesandtschaft Sofia Bitte um Nachricht ob Assistenzarzt Moritz
erbitte Draht Antwort welche Formationen zuletzt in Valievo7 als Arzttatig aus Serbischer
dort unterstellts in4 befreit wurde8
Another cryptogram is an irregular rectangle 5 October ‘16: three grille cryptograms
key transposition, much more difficult to break.
Near the end of the booklet, October 19169, a few
grilles appear; Sacco only presents them together
with some conjecture about a possible solution,
but no solution is given.
In the following couple of pages, Sacco
displays two 8x8 grilles, both unsolved, the
second incomplete.
Figure 5 : The original cryptogram is missing,
but of course it can easily be reconstructed.
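The turning grilles discussed here can be illustrated with a small sketch. It is not Sacco's grille and not a reconstruction of his method: the 4x4 hole pattern `TOY_HOLES`, the clockwise rotation, the row-by-row reading order and all function names are assumptions made for illustration. The sketch only shows why a cryptogram is easy to rebuild once the filled square is known, and it checks one observation about the source: the raw 7x7 decrypt quoted later in this paper has 48 letters, which is consistent with a 7x7 grille whose centre cell is left unused.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]

# Hypothetical 4x4 grille: one hole from each rotation orbit (illustration only).
TOY_HOLES: List[Cell] = [(0, 0), (1, 3), (2, 3), (2, 2)]

# Raw decrypt of the 7x7 grille as quoted in the paper (two lines joined).
RAW_7X7 = "ESWURDENDREIPUNKTEGESEHEN" + "OTLLICHWEITESRSSUCHENXY"


def rotate(cell: Cell, n: int) -> Cell:
    """Rotate a cell 90 degrees clockwise inside an n x n square."""
    r, c = cell
    return (c, n - 1 - r)


def hole_sequence(holes: List[Cell], n: int) -> List[Cell]:
    """Cells exposed over the four grille positions, in writing order."""
    seq: List[Cell] = []
    current = list(holes)
    for _ in range(4):
        seq.extend(sorted(current))  # holes are used row by row
        current = [rotate(cell, n) for cell in current]
    return seq


def usable_cells(n: int) -> int:
    """Number of cells a turning grille can cover (centre unused if n is odd)."""
    return n * n - (n % 2)


def is_valid_grille(holes: List[Cell], n: int) -> bool:
    """True if the four positions cover every usable cell exactly once."""
    seq = hole_sequence(holes, n)
    return len(seq) == usable_cells(n) and len(set(seq)) == usable_cells(n)


def cells_rowwise(n: int) -> List[Cell]:
    """All usable cells in reading order, skipping the centre of odd squares."""
    centre = (n // 2, n // 2) if n % 2 else None
    return [(r, c) for r in range(n) for c in range(n) if (r, c) != centre]


def encrypt(plaintext: str, holes: List[Cell], n: int) -> str:
    """Write the plaintext through the holes, then read the square row by row."""
    if not is_valid_grille(holes, n) or len(plaintext) != usable_cells(n):
        raise ValueError("invalid grille or wrong plaintext length")
    grid: Dict[Cell, str] = dict(zip(hole_sequence(holes, n), plaintext))
    return "".join(grid[cell] for cell in cells_rowwise(n))


def decrypt(ciphertext: str, holes: List[Cell], n: int) -> str:
    """Fill the square row by row, then read it through the turning grille."""
    if not is_valid_grille(holes, n) or len(ciphertext) != usable_cells(n):
        raise ValueError("invalid grille or wrong ciphertext length")
    grid: Dict[Cell, str] = dict(zip(cells_rowwise(n), ciphertext))
    return "".join(grid[cell] for cell in hole_sequence(holes, n))
```

With a known grille, `decrypt(encrypt(text, TOY_HOLES, 4), TOY_HOLES, 4)` returns the original 16-letter text. Breaking an unknown grille, as was done for Sacco's cryptograms, is a separate search problem that this sketch does not attempt.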
4 English: Major Koppen asks the German Embassy in Sofia 8 English: Please let us know if the assistant physician Moritz
a wired answer, which formations are placed there. recently in Valjevo as aid physician has been released.
5 English: You are asked to get by the Bulgarian Army 9 These pages have a beginning date, 17-10-1916; the next is
Command the disposition of the troops. Major Koppen, Stage dated 20-10-1916.
(rear) command.
10
under these two grilles he showed some unfinished and
6 Wikipedia is not the best source for serious research and its unsuccessful trials.
reliability is variable; but in this case it was the only source I
11
could find about this Major Koppen; and, after all, I just See (Bauer 1997) p. 96,97.
needed a confirmation he was a real German military officer.
12 The cryptogram was decrypted by Barth Wenmeckers with
7 Valjevo is a city of Serbia. a hill cipher algorithm and independently by the author with
a computer aided software implemented ad hoc.
ESWURDENDREIPUNKTEGESEHEN
OTLLICHWEITESRSSUCHENXY

There are a few typos and some extra S; the spaced and cleaned text is:

Es wurden drei Punkte gesehen östlich weiter suchen XY13

Figure 8 : The 7x7 grille

The 8x8 grilles were also decrypted; here is the first:

Feuer eingestellt feindliche Fahrzeuge abgewandte ausser Sicht Flotten K[ommando?]14

And here is the decrypted text of the second 8x8 grille, which happened to be encrypted with the same grille:

Krieg Ministerium ist ersucht beantragtes Guthaben von Zw15

Bauer in his book16 writes that the German Army “early in 1917 suddenly introduced turning grilles with denotations like ANNA (5x5), BERTA (6x6), CLARA (7x7), DORA (8x8), EMIL (9x9), FRANZ (10x10).” Are these grilles the first of this kind? A few months earlier than reported by Bauer? Why this small difference? Did Sacco manage to solve these grilles in the following months? At the end of October 1916, he moved to Rome, and his booklet ends in those same days. We simply do not know. The answer could be in the notebooks and papers of Sacco in his Rome office, but all these papers were likely destroyed or lost.

Could these cryptograms be Austrian rather than German? The first two cryptograms look like Navy messages and could come from the Austrian fleet in the Adriatic Sea; not very likely from German ships in the Black Sea.

13 English: Three points were seen eastwards, seek further XY.
14 English: Ceased fire, enemy vehicles got out of sight. Fleet Command.
15 English: The War Ministry is requested of the required balance by Zw
16 (Bauer, 1997) pag. 96; see also (Kahn, 1967) pag. 308.

7 Conclusion

As already stated, the booklet has 160 pages, and many of them remain to be studied; these are the most interesting found so far, but there is always the possibility of something more important to be found.

Other pages are about the Austrian diplomatic code, Austrian and German Navy codes, and others, but no complete cryptograms with decrypted texts are given.

I’m publishing the whole booklet on the web, so any researcher will be able to examine it.

Acknowledgements

I wish to thank Cosmo Colavito, engineer and telecommunications historian, for help with Italian Army history in WW 1, and Diana Schindler, for help in translating German cryptograms.

References

Friedrich L. Bauer. 1997. Decrypted Secrets. Springer, Berlin, D. ISBN: 3-540-24502-2

Yves Gylden. 1933. The contribution of the cryptographic bureaus in the world war. Signal Corps Bulletin 75 and 81, Washington, DC.

David Kahn. 1967. Codebreakers. Scribner, New York, NY. ISBN: 978-0-684-83130-5

Luigi Sacco. 1947. Manuale di Crittografia. Ist. Poligrafico dello Stato, Roma, Italy.

Luigi Sacco. 1977. Manual of Cryptography. Aegean Park Press, Laguna Hills (English translation).
WORLD WAR II
The First Classical Enigmas
Swedish Views on Enigma Development 1924-1930
Anders Wik
S Catalinagr 9
S-18368 Täby, Sweden
anders.h.wik@gmail.com
step in the encipherment process giving it a period length of “about 17500” (26^3 = 17576). The new machine could also be delivered with a 28-character alphabet whereas the Enigma A only could have a 26-character set. The Enigma A was apparently in stock since they stated that up to 10 machines could be sold with immediate delivery. Improvements for the Enigma B compared to the machine shown in Stockholm should be:

1. Different layout: From back to front first two rows of lamps, then the rotors, then two rows of keys.
2. The rotors move automatically when a key is pressed making the Antriebstaste unnecessary.
3. The letters are white on a black background.
4. The possibility to check that no lamp is faulty.

5 Swedish Enigmas

ChiMaAG had changed the layout compared to what they had offered, probably after consent from SGS. From the back are now rotors, three rows of lamps and then the keyboard set up in alphabetical order. The rotors have an adjustable ring setting.

The wiring of the rotors is as follows in cyclical notation:

I: (ÖAPRE) (CBSZYLÄKOFXN) (DGQTVI) (JH) (MUÅ)

II: (7, 1, 3, 14, 23, 21, 11, 20, 5, 24, 16, 27, 22, 17, 9, 13, 25, 6, 28, 10, 15, 2, 8, 4, 19, 26, 12, 18)

III: (5, 1, 26, 11, 28, 12, 25) (10, 2, 22, 4, 9) (15, 3, 17, 24, 13, 19, 14, 16, 6, 27, 7, 23) (8, 18, 21) (20)

The reflector is fixed in one position and has the following connections:
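The rotor wirings given in cyclical notation for rotors I, II and III can be checked mechanically. The sketch below is an illustration only, not part of the archival material: it assumes that a cycle (a b c) means a is wired to b, b to c and c back to a (the direction is a convention chosen here), and the function names are hypothetical. It confirms that each of the three wirings is a permutation of exactly 28 symbols, in line with the 28-character alphabet mentioned in the text.

```python
import re

# Rotor wirings as printed in the text (cyclical notation).
ROTOR_I = "(ÖAPRE) (CBSZYLÄKOFXN) (DGQTVI) (JH) (MUÅ)"
ROTOR_II = ("(7, 1, 3, 14, 23, 21, 11, 20, 5, 24, 16, 27, 22, "
            "17, 9, 13, 25, 6, 28, 10, 15, 2, 8, 4, 19, 26, 12, 18)")
ROTOR_III = ("(5, 1, 26, 11, 28, 12, 25) (10, 2, 22, 4, 9) "
             "(15, 3, 17, 24, 13, 19, 14, 16, 6, 27, 7, 23) "
             "(8, 18, 21) (20)")


def parse_cycles(text):
    """Split '(...) (...)' into a list of cycles (letters or integers)."""
    cycles = []
    for group in re.findall(r"\(([^)]*)\)", text):
        group = group.strip()
        if "," in group or group.isdigit():
            # Numeric notation: comma-separated contact numbers.
            cycles.append([int(item) for item in group.split(",")])
        else:
            # Letter notation: one symbol per character.
            cycles.append(list(group))
    return cycles


def cycles_to_mapping(cycles):
    """Turn cycles into a dict symbol -> next symbol in its cycle."""
    mapping = {}
    for cycle in cycles:
        for i, symbol in enumerate(cycle):
            if symbol in mapping:
                raise ValueError(f"symbol {symbol!r} appears twice")
            mapping[symbol] = cycle[(i + 1) % len(cycle)]
    return mapping


def cycle_type(text):
    """Sorted cycle lengths, a quick fingerprint of the permutation."""
    return sorted(len(cycle) for cycle in parse_cycles(text))
```

For example, `cycles_to_mapping(parse_cycles(ROTOR_I))` yields 28 letters (A to Z without W, plus Å, Ä and Ö), and `cycle_type(ROTOR_II)` shows that rotor II is a single cycle of length 28.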
“Funkschlüssel C” had 29 keys and 29 lamps where the letter X went straight to the lamps without being enciphered. The wiring of the rotors was most likely different. The Swedish machine had a reflector which was fixed in one position whereas the Funkschlüssel C had a possibility to fix the reflector in four different positions. Apart from that the two machine types would very likely have been the same.

A delivery of 50 machines to the Reichsmarine took place in January 1926 (Weierud 2014). These machines were supplied with two extra rotors. This was mentioned in a report from the Swedish military attaché in October 1925, which said that a customer had ordered two extra rotors which gives 60 different rotor combinations instead of 6. ChiMaAG suggested that SGS should do the same.

7 Swedish competition

Boris Hagelin, who was now in charge of AB Cryptograph, heard about the strong interest from SGS for the new Enigma machines. Cryptograph had good contacts with the Swedish military, but their only viable product was the A22, which was far less attractive than the Enigma.

Nevertheless, the two machines were tested against each other in May 1925. Captain Backlund limited his comparison to practical matters such as encryption speed where he stated that encrypting a 100 character message would take 6 1/2, 4 and 3 minutes respectively for 1, 2 or 3 persons whereas for the A22 it would take around 4 minutes independently of the number of people involved. Backlund noted that the Enigma machines had a stepping error. This is the double stepping effect described by Hamer (1997).

Lt Samsioe gave a preliminary assessment of the security in November 1925. He wrote that the A22 seems to give a fairly low security, which possibly could be improved. His study of Enigma B was not concluded. He notes that the period is 28^3 but that there are subperiods of 28 which it might be possible to isolate. (Actually the period is 28 x 27 x 28 because of the double stepping.) A22 and Enigma B share the problem that a change of message keys does not change the character of the cipher enough. A new key is just a new starting position in the same crypto period. Since the order of the three rotors could be changed there could be six different key series.

It seemed clear that SGS thought highly of the Enigma and were going to buy it. In his memoirs Boris Hagelin wrote that he visited the person in charge at SGS, Major Warberg, and asked him to wait six months with their decision. This would allow Cryptograph to make a prototype of a machine of Enigma type – but better. He was granted the time.

A G Damm had in 1919, independently of Scherbius and Koch, patented a form of wired rotors (Damm 1919). This was essential for Hagelin. By using parts of Damm's B1 and B13 machines he was able to produce a prototype of what was to become the B21 machine. His machine had lamps and rotors, an irregular stepping mechanism for the rotors and a wider variety of operator key settings. Hagelin's prototype was enough to stall immediate decisions and eventually secure the order from SGS and the Ministry for Foreign Affairs.

Cryptograph had acquired an Enigma (A344), which was sent to Damm in Paris. He sent back a preliminary report in August 1927 (Damm 1927). There he noted that he made a study already in September 1924 based on patent descriptions and other available information. That earlier report has not been found. In the 1927 report he wrote that the security is low if the wirings are known. He claims that he is developing a method to solve Enigma but writes that it would be improper to give details in a letter. Instead he goes into a detailed discussion of the machine.

He concluded his seven-page report with his verdict. Enigma is a reasonably handy method to encipher… but … the security is directly dependent on keeping absolutely secret not only machine details but also texts - even if they are meaningless – that have been enciphered.

His report might have helped Cryptograph by casting doubt on the Enigma even if his report does not contain arguments to show that the Enigma system is weak.

8 The next generation Enigmas

Contacts between SGS (through the embassy in Berlin) and ChiMaAG continued during 1925 while the two delivered machines were being evaluated in Stockholm. A request from SGS concerned a machine with printer and compatible with the lamp machines. At first the company seemed to be developing such a compatible pair.
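The period figures quoted in section 7 (28^3 according to Samsioe, 28 x 27 x 28 once double stepping is taken into account) and the rotor-order counts (6 with three rotors, 60 with five) can be reproduced with a short simulation. This is a sketch under stated assumptions, not a model of the actual Enigma B mechanism: it assumes one notch per rotor and the stepping rule commonly described for later Enigmas, in which the middle rotor steps again together with the left rotor when it sits at its notch. The function names and the `notch` parameter are hypothetical.

```python
from itertools import permutations


def step(state, n, notch):
    """Advance (left, middle, right) by one key press with double stepping."""
    left, middle, right = state
    if middle == notch:
        # Middle rotor at its notch: it steps again and carries the left rotor.
        left = (left + 1) % n
        middle = (middle + 1) % n
    elif right == notch:
        # Right rotor at its notch: it carries the middle rotor.
        middle = (middle + 1) % n
    right = (right + 1) % n  # the right rotor steps on every key press
    return (left, middle, right)


def cycle_length(n, notch=0):
    """Length of the repeating cycle of rotor positions for n contacts."""
    state = (0, 0, 0)
    # Warm-up: some positions are never revisited, so move onto the cycle first.
    for _ in range(n ** 3):
        state = step(state, n, notch)
    start = state
    count = 0
    while True:
        state = step(state, n, notch)
        count += 1
        if state == start:
            return count


def rotor_orders(available, used=3):
    """Number of ordered selections of `used` rotors out of `available`."""
    return len(list(permutations(range(available), used)))
```

Under these assumptions `cycle_length(28)` gives 28 x 27 x 28 = 21168 rather than 28^3 = 21952, and `rotor_orders(3)` and `rotor_orders(5)` give the 6 and 60 rotor combinations mentioned in the attaché's report.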
However, in April 1926 the company stated that such a solution would not be developed since the printing machine would lose functionality and the lamp machine would be heavier, costlier and less reliable.

In the summer of 1926 Lt. Colonel Carl Herslow succeeded Henry Peyron as military attaché. Herslow had a good knowledge of crypto matters and had worked in the group of officers at SGS which solved Russian diplomatic code traffic during WW1. This work was in cooperation with Germany, which may have benefitted Herslow's insights into German security matters (Grahn 2017). In August 1926 Herslow visited the company and reported that the new machine was in its final shape. It had been introduced at the Auswärtiges Amt and would soon also be presented to the Reichswehrministerium. The new machine had four rotors (presumably three plus a reflector) and also spare rotors in a separate box. The price was quoted as 600 RM. Warberg at SGS was interested. He would like to test the new machine, preferably with a 28-character alphabet.

In November 1926 Herslow wrote to Warberg to tell him that the Reichswehrministerium had got delivery of a small series of the new machine. With Swedish specifications (28 characters) the new machine would be slightly bigger and could be offered at a price of 600 RM a piece at an order of 30-40 machines. A counter (Zählwerk) was optional and would add 40 RM to the price. A new machine with printer (a development of the Handelsmaschine) was expected to be developed by March 1927. It was aimed for use by higher staffs and had a price of about 2000 RM.

Test machines meeting Swedish requirements would be quite costly. Therefore, in February 1927, Warberg asked Herslow to buy two 26-character Enigmas with Zählwerk (at 700 RM a piece). In March 1927 Warberg reminded him that he should check the machines on delivery so that they do not have the stepping error of the earlier machines (cf section 7 above). The new machines were Zählwerk machines and had a different stepping mechanism. That check should therefore have worked without problem.

Herslow was going to take up a position as military attaché in Moscow. He was instructed to take one machine with him to Moscow. The other one should be delivered to Stockholm so that the pair of machines could be tested in operational use. Herslow was quite familiar with the Enigmas after many discussions at ChiMaAG.

When Herslow left for Moscow in the beginning of April 1927 the new machines were not ready so the company supplied two machines on loan (A361, A362), one for Herslow, one for Stockholm. Warberg provided a 12-page document with detailed instructions for key settings etc. (FRA 1927). The two Zählwerk machines (A350 and A351) were delivered in May 1927 and the machines on loan were sent back.

9 Enigma or B21

Presumably the Zählwerk Enigma was studied and tested. There is no communication in the file for the coming six months. In December 1927 SGS asked for a quotation for the delivery of 40-60 machines with a 28-character alphabet – with or without Zählwerk. The reply from the company was prompt. They quoted a basic price of 600 RM for a 26-character machine and gave two options. A Zählwerk would add 100 RM and 28-character alphabet 30 RM.

Parallel negotiations were going on between SGS and Cryptograph and the decision was made. B21 was chosen as m/29, SGS standard machine model of 1929. No final evaluation has been found in the archives. Therefore one can only speculate about which arguments were the decisive ones. A longer key period? Rotors wired in Sweden? Wider user key space? Support to Swedish industry?

The Navy had some independence from SGS. Their order for three Enigmas was the last sign of interest from the Swedish armed forces. After delivery of A853, A854 and A855 in April 1929 there seems to be no interest in Enigmas from Swedish authorities.

10 Not quite the end

Carl Herslow, mentioned above, was in 1928 recruited by Ivar Kreuger, a Swedish industrialist and entrepreneur known as the “Match King”. By aggressive investments and innovative financial instruments he built a financial empire which in the end controlled between two thirds and three quarters of worldwide match production. His activities needed secure communications and the Swedish match
company Svenska Tändsticks AB (S.T.A.B) became one of just a few non-government buyers of Enigma machines.

There is a note that Herslow bought two machines “for Kreuger” in the spring of 1928. Also there is a note from 1935 that two machines (A343 and A344) were presumed to be at S.T.A.B. All in all it seems that Kreuger's company bought six Enigmas (numbered A327, A328, A343, A344, A801, A802) (Weierud 2014).

Apart from these regular machines S.T.A.B also bought three small Enigmas, model Z30, aimed at enciphering digital codes. No documentation concerning this has been found. The acquisition of these three machines, bought around 1930, concludes all dealings between Sweden and Chiffriermaschinen AG. The three Z30 machines are part of FRA Crypto collections (Wik 2016).

A broader picture of Swedish cryptography and early Swedish Sigint between the world wars is given by McKay and Beckman (2003).

Acknowledgements

The author is most grateful to Frode Weierud for sharing his Enigma expertise and for his valuable advice.

References

Damm, Arvid Gerhard. 1919. Swedish patent SE52 279. Filed Oct 10, 1919. US patent 1 502 376, July 22, 1924.

Damm, Arvid Gerhard. 1927. Preliminärt utlåtande angående “Glühlampen-Chiffriermaschine Enigma”. Krigsarkivet, Stockholm. Boris Hagelins privatarkiv, vol F II:3.

ENIGMA Chiffriermaschinen. 1924. Handelsmaschine. FRA Crypto collections. Booklet.

Faurholt, Niels O. 2006. Alexis Køhl: A Danish Inventor of Cryptosystems. Cryptologia vol 30.

FRA archive. 1924-1930. Bearbetningsbyrån F V:1. “Chifferapparaten Enigma”.

FRA archive. 1927. Bearbetningsbyrån F V:1. Instruktion för användning av Chifferapparat Enigma B /Chiffer EZ/.

Grahn, Jan-Olof. 2017. Om svensk signalspaning - Pionjärerna (“On Swedish SIGINT – The pioneers”). Medströms bokförlag, Stockholm.

Hamer, David. 1997. Enigma: Actions involved in the ‘double stepping’ of the middle rotor. Cryptologia vol 21.

McKay and Beckman. 2003. Swedish signal intelligence 1900-1945. Frank Cass, London.
An Inventory of Early Inter-Allied Enigma Cooperation
Marek Grajek
Freelance cryptography consultant and historian
Poland
mjg@interia.eu
methods” 2, and its preface partially reveals the 1945, had confused the question of the
identity of its, otherwise unsigned, authors; document’s attribution. The German reports
“Below we sketch how the Cipher Bureau of the based on his interrogation in 1944 mention only
Polish General Staff managed to reconstruct the two mathematicians; it seems probable that
Enigma model described above, and methods Langer’s mind adjusted (consciously or
invented to assure prompt deciphering of its unconsciously) to the situation after Różycki’s
messages, in spite of the changes and death.
improvements introduced by the German cipher
service to protect their security”. A brief mention While the scope of the document covers
in one of Lt. Col. Langer’s (former head of events having taken place between the Pyry
Polish Cipher Bureau) reports allowed this conference and the fall of France in June 1940,
author not only to place the document in its time- its basic structure and form, as well as
line, but also to understand the circumstances of comparison with other documents edited by
its creation. After his liberation from the German Marian Rejewski and his colleagues, suggest
internment camp, Langer (1945) was existence of their common source – presumed to
commissioned to write a report presenting the be the Pyry report. The term “abridged” used in
circumstances of his team’s evacuation from the title might suggest existence of a full version
southern France in 1942 and the events that of the same document. Working in France, in
followed. It is in that report that we find a 1940 or later, at Bertrand’s request, it would be
following statement: “At Château des Fouzes, natural for the codebreakers to prepare the text in
Bertrand requested that a report be prepared French (at least two members of the team were
presenting the contribution brought by each of
the three partners to the Enigma solution. The report was
prepared by Lt. Rejewski and Zygalski. After
Bertrand had studied the result he declared that
the work must be rewritten from scratch, as
reading it in its present form one gets the
impression that the contribution of the French
was negligible”. The declared purpose of the
report is consistent with its otherwise somewhat
mysterious fragment; Section 38 presents an
inventory of contributions of the three countries
towards the success over Enigma ciphers (see
Figure 1 below).
The analysed document is unsigned; the same
report by Langer sheds some light and a bit of
doubt on the question of its authorship.
According to that report, the document was
prepared by Marian Rejewski and Henryk
Zygalski. That would point to its creation either
in 1941 (during Jerzy Różycki’s detachment to
Algiers) or in 1942 (after Różycki’s death). This
author believes that the more probable time of its
creation was late 1940 or early 1941, when
Bertrand was still unable to provide the
codebreakers with enough intercepts to keep
them engaged. Moreover, should the document
have been written in 1942, it would most
probably include some references to
codebreakers’ work at P.C. Cadix. It is also
possible that Langer, when writing his report in

fluent in that language). However, the existence of
the German language reference and economy of
labour dictated the preparation of an abridged
version of the existing German language
document, complementing it with coverage of
the recent events and adding elements
specifically requested by Bertrand.
While working on the original Pyry report,
the codebreakers, having full access to their own
archive, could, and certainly would have wanted
to, demonstrate their mastery of the subject by
including as much detail as possible. However,
the archive of the Cipher Bureau was lost during
its evacuation towards the Romanian border.
When the team attempted to continue its work in
France, the Poles had to recreate their
documentation using their memory as the only
reference available. The process was slow and
gradual, as can be seen from the effects of its
first stage – the so-called “Dokument L”
(unsigned, 1940b), representing an appendix to
Langer’s report from the pre-war activity of the
Cipher Bureau. “Dokument L” was written
during the first half of 1940 and supposedly
covers the period 1930-1940 (although its scope
ends with the Pyry conference). In spite of its
scope being similar to that of the discussed document,
it counts only 31 pages – about half of the latter.
British reports prepared in 1945 include some
details of the Polish pre-war activities, which are
otherwise unknown from the available Polish
sources. Alexander (1945, p. 18) describes the
Polish attack on naval Enigma using the term
“Forty Weepy”. That term was coined by the

2 In original: ENIGMA. Kurzgefasste Darstellung der
Auflösungsmethoden.
Poles from the representation of numbers used
by Kriegsmarine cipher clerks in 1937. The
British codebreakers could not have known about
that from their own experience, as the system
was changed before they focused attention on the
naval Enigma. The same report by Alexander
names the call sign, AFA, of the German torpedo
boat whose signals permitted Polish
codebreakers to break the new Enigma procedure
adopted by the Kriegsmarine in May 1937. None of
those details (“Forty Weepy” or AFA) are
mentioned in the analysed document (or any
other Polish sources), so they must have been known
to the British codebreakers from the original
Pyry report.
The scope of information regarding pre-war
efforts of the Polish Cipher Bureau available in
the analysed document goes far beyond the limits
of the original, Polish sources available so far.
On the other hand it does not include some
details quoted in the existing British reports. The
structure of the document is very similar, even in
translation, to the structures of other documents
edited by members of the Cipher Bureau team
(“Dokument L” or Rejewski’s “Memories”),
hinting at their common source. All those details
considered together permit the positioning of the
document as an intermediate link between the
fragmentary sources known so far and their
common reference – the original Pyry report.

4 Preliminary findings and conclusions
Systematic analysis of this recently found
document is far beyond the scope of this paper,
although the preface to the edited version
(Grajek, 2017) of the report provides its early
stage. The document, although obviously not
identical to the original Pyry report, represents
the best approximation currently available. It has
been created by the same team, for a similar
purpose and using the same language. It is the
first material proof of an otherwise obvious fact –
the transfer of Enigma secrets by the Polish Cipher
Bureau to the Allies, which was found in the
Allied archives. This author hopes that this
information might spark a wider search for its
presumed predecessor – the original Pyry report.
Most facts presented in the report are known
from other sources, in particular from
“Dokument L” and Rejewski’s “Memories”;
however, in the discussed document they are
presented in a more systematic way than in other
versions. At least some novel elements deserve
special attention. The first one concerns the radio
network of the German Sicherheitsdienst (S.D.).
Section 34 presents the history of the Polish struggle
with the S.D. network between its first
appearance in October 1937 and a major change
on 1 August 1939. Messages in the S.D. network
were masked with a 3-letter code before
enciphering with Enigma. That did not prevent
Polish codebreakers from breaking both the code
and the Enigma key and reading the messages up
to 31 July 1939.
This statement contradicts the opinion
formulated in Dilly Knox’s (1939a) report from
the Pyry meeting, and repeated since then by
numerous sources, that the Poles were unable to read
Enigma after the change of the indicator
structure on 15 September 1938. The statement
in Section 27 reinforces this argument, indicating
that the military key from 25 August 1939, the
day of general German mobilization, was the last
broken day before the evacuation of the Cipher
Bureau from Warsaw.
Section 29 refers to the preparation by
Bletchley Park staff of a special catalogue
already proposed by the Poles before the
outbreak of war. Lack of resources prevented the
Polish team from implementing its own idea, but
the more resourceful British were able to
manufacture the proposed catalogue, which went
into history as Jeffreys’ sheets. Jeffreys’ sheets
represented an extension of Zygalski sheets;
while the latter identified only the location of a
female, the former also made it possible to identify the
character corresponding to the female (“(…) we
had the idea to create catalogues with characters
that would correspond to all female cases, (…)
now the British (…) put our plans into practice”).
Section 30 offers an update to the history of
the Herivel method, which was brilliantly
conceived but useless as long as the positions of
the turnover notches in rotors IV and V were
unknown. Herivel’s discovery was
complemented by the Polish team, who
identified the notch positions in both rotors and
communicated them to BP, thereby enabling the
practical application of the Herivel Tip.
Figure 1: Final section of the analysed document - contributions of the three states to the breaking of Enigma
Section 31 refers to the new Enigma
ciphering procedure used from 1 May 1940. We
learn that some German cipher clerks started to
use it prematurely, on 30 April. The Poles, who
managed to break the military key for that day,
were able to work out the procedure and
communicate its details to Bletchley.
While sections 1–32 have a more or less
chronological structure and section 33 is dedicated
to the S.D. network, sections 34–37 break the
chronological narration and represent an
appendix dedicated to the area only incidentally
covered in the reports known so far – the ciphers
of the German Kriegsmarine. The story long
established among Enigma historians states that
while the Poles provided the foundations for
breaking the Wehrmacht and Luftwaffe ciphers,
breaking the Kriegsmarine Enigma represented a
purely British adventure. The analysed document
presents this question in a new light. The Poles
were obviously watching the evolution and
breaking the Kriegsmarine ciphers from their
non-machine beginnings to the establishment in
May 1937 of the system used during the war.
The report confirms that they were able to work
out the details of the new procedure and, thanks
to the German blunder in the transition period, to
break enough messages to provide the British
codebreakers with the reference material for their
own efforts. Alan Turing and his team designed a
number of methods (EINS-ing, banburismus)
which could assure regular decryption operation
once the system was first broken; however, they
could not advance their practical mastery of the
cipher beyond the point reached by the Poles in
1937. Their final success in 1941 was based both
on the information provided by the Poles and the
documents captured on board the seized German
ships.
Section 38 represents an element of the
document most appealing to the reader’s mind; it
offers an enumerative list of elements
contributed by the three participants of the
cryptologic cooperation until June 1940 (cf.
Figure 1 below). While this picture has changed
significantly in the later stages of the war, there is no
doubt that during the first year of this conflict,
the Enigma adventure was still heavily
dominated by the achievements of the Polish
Cipher Bureau team.

Denniston, A. G., How News was Brought from Warsaw at the end of July 1939, NA 25/12.
Grajek, Marek. 2017. Sztafeta Enigmy. Odnaleziony raport polskich kryptologów, ABW, Centralny Ośrodek Szkolenia ABW, Emów.
Knox, A. D. 1939a. Letter to A. G. Denniston, 1939, NA HW 25/12.
Knox, A. D. 1939b. Memorandum, NA HW 25/12.
Langer, Gwido Karol. 1945. Sprawozdanie dotyczące ewakuacji Ekspozytury Nr 300, Instytut Józefa Piłsudskiego w Londynie, 709/133/5.
Mahon, A.P. 1945. The History of Hut Eight, NA HW 25/2.
Milner-Barry, Philip Stuart (ed.). 1945. The History of Hut Six, NA HW 4/70.
The Poles and Enigma after 1940: le voile se lève-t-il?
Dermot Turing
68 Marshalswick Lane, St Albans, UK
dermotturing@btinternet.com
without even an Enigma machine, presents some
difficulty; existing literature does not face up to
that challenge.

3 The Cadix Period
After the fall of France, the Polish code-breakers
were rapidly evacuated to French North Africa,
despite the plea of Alastair Denniston, the head
of GCCS, to assimilate them into his team at
Bletchley Park. There, there was a near-mutiny
when some of the team, including notably Marian
Rejewski and Jerzy Różycki, did not want to
return to France but to go to Britain instead.
Gwido Langer put down the rebellion and the team
moved to a new location near Uzès in the so-called
Zone Libre, the Château des Fouzes, in October
1940. The conditions were sub-optimal: the
code-breakers complained of having to peel potatoes,
chop wood, and do other manual labour, and the
nearest bath was 27 km away; but on the other
hand Bertrand had arranged for the team’s work,
accommodation and wages to be funded by the
Vichy Government (Bertrand, 1972).
Initially, the team had to struggle to obtain
intercept material to work on, though Bertrand
arranged a system by which the organs of the
Vichy state would feed intercepted encrypted
material to him to be worked on. Insofar
as this was manually-enciphered material, the
talented Polish team were able to tackle it
without special equipment or machinery. So it
appears that a substantial amount of the work
carried out consisted of an attack on German
transposition ciphers, notably a difficult
double-Playfair method, though there were also successful
attacks on Swiss machine ciphers and, in a
moment causing some embarrassment to the Poles
themselves, on the Poles’ own cipher machine
Lacida (Rejewski, 2011). The targets included
the Wehrmacht, operating all across Europe from
France to well beyond the Soviet frontier, the
SS and other ‘police’ units, the Abwehr and the
Sicherheitsdienst in France and North Africa, and
the German Armistice Commission (Kozaczuk,
1998).

3.1 Enigma
The paucity of resources at PC Cadix was not
limited to firewood and intercepted signals. In the
flight from Poland, the Polish team had been able
to bring with them only one of their synthetically
reconstructed Enigma machines. With the one
they had sent to Bertrand through the diplomatic
bag in 1939, that made a total of two to work
with. Bertrand had just made arrangements for the
production of duplicates of the synthetic Enigma
machines by a factory in Paris when the invasion
of France took place.9 For the purposes of
the reproduction, one of the precious machines
had been dismantled, leaving the team with only
one. Before the invasion, a teleprinter link
between Bertrand at PC Bruno and Britain had
enabled some degree of sharing of key-finding
results derived from Zygalski’s sheets, and some
decipherment of intercepts, but these had little
impact on military operations10 and in any case
the work had come to an end with the evacuation
of PC Bruno. Evidently, at PC Cadix, there was
at best the one surviving Polish reconstruction to
work with, and none of the sophisticated
key-finding machinery which the British Enigma team
at Bletchley Park were beginning to exploit from
mid-1940 onwards.
Thus it is legitimate to enquire to what extent
the Poles at PC Cadix were able to work on
Enigma, if at all, and if so how. In the
first place it must be mentioned that the attack
on Swiss machine ciphers was an attack on
Enigma. ‘The Swiss machine turned out to
be an ordinary commercial model of Enigma,
naturally with different internal rotor connections’
(Rejewski, 2011). Tackling this machine would
have been straightforward for Rejewski and his
colleagues, who had honed their skills on the much
harder Wehrmacht version of Enigma without
the modern machinery now in use at Bletchley
Park. Reverse-engineering the Swiss machine,
without the fearsome plugboard, would have been
a challenging but ultimately routine task, and
Rejewski gives a brief description of it in his
account.
However, a substantial contribution to
intelligence derived from Wehrmacht Enigma
messages was not likely to be feasible without
the assistance of modern technology. Zygalski’s
sheets had been rendered obsolete by the change
in key-transmission procedure adopted in May
1940, after which the Germans ceased to encipher
the ‘indicator’ (the required orientation of the

9 Bertrand 1949 report, and dossier No. 272.
10 As both the Langer account of 1946 and the Bertrand
account of 1949 graphically describe.
three Enigma rotors for the transmitted message)
twice over. From that point onwards, there were
basically two methods for key-finding. The first
was to use what the British called ‘Cillying’ and
‘Herivelismus’, and the Franco-Polish team called
the ‘Method Kx’ (after the British cryptanalyst
Dilly Knox, who had presumably described
the technique to them at one of the trilateral
conferences in 1939). Cillying assumes that the
German operator has chosen a predictable
six-letter word like HITLER, or another predictable
sequence like QWERTZ, for the indicator; the
first three letters (transmitted in clear) give a
clue to the second three (which are enciphered).
Herivelismus is named for John Herivel, a
Bletchley Park code-breaker who imagined an
operator would be lazy enough to use the last
position of the rotors (or a position very close to
it) showing at the end of the previous transmission
- which helped when a long message was broken
into several parts (Herivel, 2008). These methods
could have been exploited at PC Cadix without
the need for special technology - apart from the
much-needed replica Enigma machine itself.
The other method of tackling Enigma in the
period after October 1940 was machine-based.
Developing ideas suggested by the pre-war Polish
bomba, Bletchley Park cryptanalysts, including
Alan Turing, had invented a new means of
key-finding based on guessed-at message content and
running a logic-check through all 17,576 possible
combinations of rotor start-positions. Their
machine, the famous Bombe, was used to find
thousands of keys each month for the remainder
of the war. This option was denied to Bertrand
and the team at PC Cadix: indeed, it seems that
Bertrand was kept largely, if not wholly, in the
dark about the degree of success achieved by the
British with their Bombes.11
However, Bertrand had not lost contact with
his engineering firm in Paris, and eventually the
reproductions of the Polish reconstructed Enigmas
began to arrive in pieces for reassembly at PC
Cadix. By 10 September 1942, Bertrand was able
to contact his British liaison and report that he had
reassembled three of these Enigma machines,12
suggesting that one of them be used for secure
communication between London and PC Cadix.13
Bertrand’s cryptologist colleague Henri Braquenié
noted with amusement that the arrival of the
machines enabled PC Cadix to communicate
with MI6 using Enigma technology: to rub in
the irony he would sign off his messages (in
cipher) with the words ‘Heil Hitler’ (Braquenié,
1975). However, the use of Enigma machines
at PC Cadix, for any purposes, was short-lived.
Within weeks of the approval by London of
the use of the new Enigma-type machinery for
communications, the possibility of the Zone Libre
being overrun had become a live threat; the team
at PC Cadix knew they were being tracked by
the ‘Funkabwehr’, German counter-intelligence’s
radio direction-finding unit; and on 7 November
1942, continued operations at the château became
imprudent. The premises were evacuated and
code-breaking by the Poles in France came to an
end.

3.2 Results
By all accounts the Polish team at PC Cadix were
kept extremely busy for the two years they were
there. Much of the work involved relaying (and
re-enciphering) messages for London from the
outpost of Polish Intelligence in North Africa, an
activity which seems to have taken place under
Bertrand’s nose but without his knowledge. As for
the actual code-breaking, PC Cadix was able to
obtain copies of signals which were unavailable to
Bletchley Park, which meant that the Polish team’s
reports on the activities of the SS as German forces
moved east, following the outbreak of hostilities
with the USSR in 1941, were highly prized in
London.14 Those reports do not make comfortable
reading, as they itemize round-ups and ethnic
cleansing carried out in the newly-occupied areas
of Belarus and the Ukraine.
Towards the end of the Cadix period, the
code-breakers achieved a breakthrough against the
hand ciphers of the Funkabwehr. In another
irony, the trackers from the Funkabwehr who
were hunting down illicit radio transmissions
in the (increasingly Nazified) Zone Libre were
themselves being tracked by their own prey.
Gustave Bertrand built up a detailed profile of
the Funkabwehr, its activities and personnel, its
vehicles and locations, and above all its secret

11 TNA HW 65/7 (Mar-May 1942).
12 Medrala (2005), page 183, says there were seven
machines of which four were reassembled models;
unfortunately in this instance his source is not specified.
13 TNA HW 65/7.
14 TNA HW 65/7.
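The two hand methods described in Section 3.1 above, and the figure of 17,576 rotor start-positions, can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not a reconstruction of any procedure actually used at PC Cadix or Bletchley Park. The names cilly_guesses and nearby_positions are hypothetical; the list of predictable indicators holds only the two examples given in the text; and 'a position very close to it' is modelled, as an assumption, as at most one step either way on each rotor.

```python
from itertools import product
from string import ascii_uppercase

# Three rotors, 26 positions each: 26 ** 3 = 17,576 possible start positions.
START_POSITIONS = ["".join(p) for p in product(ascii_uppercase, repeat=3)]

# Hypothetical list of predictable six-letter indicators. Only HITLER and
# QWERTZ are named in the text; a working list would be far longer.
PREDICTABLE = ["HITLER", "QWERTZ"]


def cilly_guesses(clear_part, candidates=PREDICTABLE):
    """Guess the enciphered half of an indicator from the half sent in clear."""
    clear_part = clear_part.upper()
    return [word[3:] for word in candidates if word[:3] == clear_part]


def nearby_positions(position, spread=1):
    """List rotor positions within `spread` steps of `position` on each rotor.

    Assumption: "very close" means at most `spread` steps per rotor,
    wrapping round from Z to A.
    """
    offsets = range(-spread, spread + 1)
    found = []
    for deltas in product(offsets, repeat=3):
        letters = [
            ascii_uppercase[(ascii_uppercase.index(ch) + d) % 26]
            for ch, d in zip(position.upper(), deltas)
        ]
        found.append("".join(letters))
    return found


if __name__ == "__main__":
    print(len(START_POSITIONS))          # 17576
    print(cilly_guesses("HIT"))          # ['LER']
    print(len(nearby_positions("AAA")))  # 27
```

Under these assumptions, either kind of guess leaves a few dozen settings at most to try by hand, instead of all 17,576, which is why the text treats the two methods as feasible without special machinery.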
signals. The Cadix team thus knew exactly when
the net was closing in; and Bertrand himself was
able to equip de Gaulle with a detailed profile of
German direction-finding and radio-suppression
in occupied France, once he joined the Free French
in 1944.15
These examples show that the Polish team
continued to make a valuable contribution to
intelligence based on decrypted signals throughout
their time at PC Cadix. In conclusion, however,
it seems unlikely that any significant results were
obtained at PC Cadix by the Polish code-breakers
as a result of decrypting Enigma. However, a
different story emerges when the remnants of the
team reached Britain in August 1943.

4 The Felden Period
The story of what happened to the Poles of
PC Cadix after their forced departure is highly
dramatic and in some instances tragic. Suffice
it to say that only a handful, including Marian
Rejewski and Henryk Zygalski, eventually made
it over the Pyrenees, only to be arrested and spend
several months in Spanish prisons. On 3 August
1943 the escaped Polish code-breakers - only five
in number - were relocated to Britain, and assigned
to the Polish signals intelligence unit at Felden,
a rural hamlet situated on the outskirts of Hemel
Hempstead, north-west of London. Felden was
the heart of an operation, approved and directed
by MI6, which was clandestinely monitoring the
signals output of the USSR, notwithstanding that
the USSR was notionally the ally of both Britain
and Poland in the struggle against Germany
(Maresch, 2005).
On arrival at Felden, Rejewski, Zygalski and
their colleague Sylwester Palluth were assigned
to ‘Team N’, which was directed against German
rather than Russian traffic.16 During this period,
they enjoyed particular and noteworthy success
against ‘German Police’ signals, and received
commendation from Bletchley Park, relayed via
MI6, for their work. To understand this better,
it is necessary to know that the phrase ‘German
Police’ covered a wide range of uniformed
services carrying out a wide range of activities
ordinarily associated with armed forces rather than
law enforcement agencies. Nazi Germany had
many such organizations, some substituting for
mainstream Wehrmacht units in combat roles, and
others engaged in ‘special’ activities now known
to be part of the program for extermination of
Jews and other classes of society. ‘German Police’
signals were thus regarded as being of significant
value in building up an overall picture of German
military and political activities and plans. In
October 1943, the British told Polish Intelligence,
‘We are very glad to receive the T.G.D. German
traffic taken at Felden,’ and ‘Police Traffic is
steadily gaining in operational importance’.17

4.1 TGD
The specific version of German Police signals on
which the Poles were working was known by its
old call-sign ‘TGD’. TGD was described in the
GCCS History of Hut 6 as ‘the famous T.G.D.’,
with the comment ‘this key was never broken
during the war and to this day is one of the classic
mysteries of Hut 6. It never cillied so far as we
know and no convincing re-encodement from any
other key was ever produced.’18 Reports filed
by Gordon Welchman of Bletchley Park’s Bombe
team in 1942 reinforce the idea that Bletchley Park
had got nowhere with TGD, unlike other German
Police ciphers based on Enigma.19 However, from
the GCCS reports it is quite plain that TGD was
indeed an Enigma cipher, and one of particular
significance, since it was immune to ordinary
means of attack. The careful security measures in
place to protect TGD traffic imply that the content
of the signals was more sensitive than other SS
material.
In terms of TGD’s structure, the
recently-declassified Bertrand archive includes an
intriguing dossier (Dossier 278) prepared by
the Poles in approximately 1940. This dossier has
not been discussed in the previous literature, and
it gives the missing technical detail on the cipher.
The dossier was part of a series of intelligence
exchanges between PC Bruno and Bletchley
Park on technical matters, and it summarises
the key procedure being used, and thus explains
why TGD resisted the attacks which worked
for ordinary SS messages. In summary, TGD
used a rigorous key system which precluded
cillies. All three letters of the indicator had to
be different, and the message-setting was first
enciphered using a substitution alphabet before

15 SHD DE 2016 ZB 25/1, file 01H002.
16 PISM Kol 242/64 (Oct 1943).
17 PISM Kol 242/92, TNA HW 14/90.
18 TNA HW 43/71 (undated, c.1946).
19 TNA HW 25/27 (Mar, Jun, Dec 1942).
re-encipherment on the Enigma machine. (In
practice this is unlikely to have made a major
difference to security, and the dossier reports that
the preliminary encipherment of indicators was
discontinued before the war.) More significant
was the jumbling-up of material normally located
in a standardized way in a message’s preamble:
in TGD messages, message-data like the sender,
addressee, message-key and so forth could be
positioned differently on different days, albeit
following a pattern. The ‘biggest surprise’,
according to the Polish authors of the dossier,
related to the content of messages. A
coding-system was used to mask the content (before the
entire message was enciphered on the Enigma
machine), but with a twist: only part of the text
would be in code, and the rest was in plain-text.
The toggle between code and plain-text would
have made a crib-based attack to find the Enigma
key extremely hard. The code was in three-letter
groups which used no vowels and omitted Q, X
and Y; Q denoted a shift from alpha to numeric,
X was punctuation, and Y denoted a shift from
code to plain-text. Instead of spelling out numbers
in full, as in standard Enigma procedure, the
alphabet was used (A, B, C, ... standing for 1, 2, 3,
..., with redundancy, so that K, L, M, ..., and V, W,
Z would also stand for 1, 2, 3, ...). Unfortunately,
the dossier does not divulge the extent to which
the code-book had been reconstituted by the
Poles.
The significance of the messages is mentioned
briefly in the dossier. The Poles had, at the time
the dossier was written, been monitoring
exchanges between the Sicherheitsdienst
headquarters in Berlin and various border
outposts responsible for gathering political and
other intelligence from Germany’s annexed
territories and peripheral states. At the time,
before the outbreak of hostilities, this included
reports on subversive action being taken on
behalf of the Nazis. Evidently TGD traffic was at
that time more high-level political material than
short-term operational information. The extent
to which the nature of the traffic had evolved by
1943 is difficult to ascertain.

4.2 A veil half-raised
The declassified dossier thus unveils part of the
‘classic mystery’ of TGD. But in doing so, it
merely intrigues us with further unsolved puzzles.
First, how was it that Bletchley Park was unable
to exploit TGD, given that it had been armed
with the dossier? The answer may be a lack
of resources, or that Bletchley Park decided to
focus on the Enigma keys that were susceptible
to the Bombe technique. Breaking Enigma keys
on a Bombe requires a crib, i.e. guessed-at
plaintext, and without a history of prior decrypts
it is a tough assignment to come up with a
viable crib. Furthermore, the structure of TGD
will have precluded the use of cribs. The Poles
at Felden were not relying on Bombes, and it
seems reasonable to infer that they dusted off their
previous know-how and reapplied it in their new
working environment.
A second intriguing feature of the success
against TGD at Felden relates to Enigma
machines. Not only is it absurd to imagine that the
PC Cadix Poles managed to smuggle a counterfeit
Enigma with them when they escaped, but there
is sound evidence that the Enigma duplicates
made in France remained there, with Rejewski
and Zygalski making a special trip to France
after the war’s end to retrieve them from where
they had been concealed.20 Without an Enigma
machine the effort against TGD at Felden would
surely have been doomed. It would therefore
appear that the British, who had been supplying
Felden with equipment of various descriptions,
may also have provided an Enigma (or more likely,
a modified Typex machine reconfigured to emulate
an Enigma, as used by deciphering clerks at
Bletchley Park). Unfortunately there is no archival
evidence to clarify how exactly the Poles did their
work.

5 Rejewski’s 1944 request
By the summer of 1944, as the Allied forces
began their recapture of continental Europe from
the Wehrmacht, the importance of German Police
traffic to the overall intelligence picture waned.
The Polish General Staff were told by MI6 that
the British no longer required the ‘German Police
Intercepts’ on 8 July.21 If it is right that TGD
signals were being relied on for the insights they
provided into high-level thinking at the top of the
Nazi hierarchy, the timing of the shut-down of
work on TGD is no coincidence. By this stage
in the war, Bletchley Park had begun to tap into a

20 PISM Kol 242/69, Kol 242/93 (May 1945).
21 PISM Kol 242/92.
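The letter conventions of the TGD code, as summarised from the dossier in Section 4.1 above, can be illustrated with a short Python sketch. It is only an illustration: all names in it are hypothetical, and two points are assumptions not stated in the text, namely that the runs of numeral letters simply skip the reserved letters Q, X and Y, and that the tenth letter of each run (J and U) stands for 0.

```python
from string import ascii_uppercase

VOWELS = set("AEIOU")
# Q: shift from letters to numerals, X: punctuation, Y: shift from code to plain-text.
RESERVED = set("QXY")

# Letters available for three-letter code groups: no vowels, no Q, X or Y.
CODE_LETTERS = [c for c in ascii_uppercase if c not in VOWELS | RESERVED]

# Letters used as numerals. Assumption: the runs skip Q, X and Y, giving
# A-J, K-U (without Q) and the short final run V, W, Z.
NUMERAL_LETTERS = [c for c in ascii_uppercase if c not in RESERVED]


def is_code_group(group):
    """True if `group` could be a code group under the stated letter rules."""
    return len(group) == 3 and all(c in CODE_LETTERS for c in group.upper())


def letter_to_digit(letter):
    """Map a numeral letter to a digit; assumes J and U stand for 0."""
    index = NUMERAL_LETTERS.index(letter.upper())
    return (index % 10 + 1) % 10


def decode_number(letters):
    """Decode a run of numeral letters into a string of digits."""
    return "".join(str(letter_to_digit(c)) for c in letters)


if __name__ == "__main__":
    print(len(CODE_LETTERS), len(CODE_LETTERS) ** 3)  # 18 5832
    print(decode_number("ABC"), decode_number("KLM"), decode_number("VWZ"))
```

On these rules 18 letters remain for code groups, so at most 18 × 18 × 18 = 5,832 distinct groups were available, and any numeral could be written in two or three different ways, which would have made guessed plain-text harder to align with the cipher text.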
far more powerful and informative source, namely
the teleprinter traffic enciphered on the Lorenz
Schlüsselzusatz device and broken at Bletchley
Park with the help of novel electronic machinery.
The change in British priorities for Felden also
signalled a redisposition of Rejewski, Zygalski
and Palluth, who were assigned in November
1944 to ‘Team R’, which was responsible for
monitoring and decrypting Soviet traffic.22 Their
reassignment followed an unwelcome period of
idleness and was, for Rejewski at least, an
unwanted development. Rejewski was moved
to write a long note, dated 20 October 1944,
in which he eloquently sets out the
Enigma-related debt owed by the British to the Poles and
requests closer involvement in the British work
against Enigma.23 Rejewski’s request was viewed
sympathetically by Polish Intelligence, and passed
on to the British, but nothing came of it.
By this date, though, Bletchley Park had
become a thoroughly industrial operation,
churning out intelligence based on its Bombes,
in a volume which would have astonished
Rejewski if he had been aware of the scale of
the operation. While there remained brilliant
code-breakers at Bletchley whose skills were
put to use right up to the end of the war, the
focus of intellectual attention was no longer the
Enigma. The old hands who had met and learned
to respect Rejewski and Zygalski were out of
the picture: Denniston in a new role relating to
diplomatic ciphers, Knox dead, and Alan Turing
redeployed onto speech encipherment. Rejewski
had no advocates at Bletchley, and, in truth, no
Enigma-related role there. Moreover, it would
have been wholly counter to the culture of secrecy
at Bletchley Park to allow a Polish code-breaker
to see the nature of the new operation there. The
British brush-off must also be seen against the
prevailing political climate, where Poland was, in
1944, thought to be an ‘unreliable’ ally owing to
tension growing between the Poles, aggrieved at
the murders at Katyn, and the acquisitive USSR.
Viewed in the light of the politics of 1944,
Rejewski’s plea takes on a different colour. Like
all exiles whose family were left behind, Marian
Rejewski was in no doubt that he intended to
return home after the war. As future events would
show, this was a courageous thing to do; but
already in late 1944 it would have been plainly
obvious that the Soviet influence in Poland was
pervasive and pernicious. To be involved in
the assault on Russian ciphers was an extremely
unwelcome change for Rejewski, as it ratcheted up
the danger-level for him personally. Yet precisely
the same reasoning would have led Bletchley
Park, assuming they were aware of his request,24
to feel uncomfortable with Rejewski obtaining
knowledge of the achievements and methods in
use there, if Rejewski were going to go back to
Poland after the war. Regardless of all the rhetoric
about the USSR as an ally, the British were only
too well aware that the Soviets needed to be
watched, and what the dangers were. After all,
it was the British who were sponsoring the Polish
efforts at Felden which were directed against the
USSR’s secret messages.

6 Conclusion
The Polish attacks on the plugboard version of the
Enigma machine in the 1930s stand as one of the
most impressive achievements of mathematical
cryptanalysis of all time. The fact that, after
May 1940, the individuals who had created those
earlier successes did not become part of the
Bletchley Park team which took over, built from,
and multiplied, their achievements, has been a
source of dismay to many observers. It has been
considered shameful that no place was found in
Britain for Marian Rejewski and his colleagues
after the fall of Poland or after the German
takeover of the Zone Libre in France. No doubt,
until late 1942, a valuable role could have been
found for them at Bletchley Park alongside
code-breakers of other allied nations who were already
there. But the political weather had changed by
1943 when the Poles eventually arrived in Britain,
and in any event the Polish code-breakers were
still under Polish, not British, military command.
The fact is that the Poles did manage to carry
on valuable cryptanalytical work in France until
the end of 1942 and in Britain from 1943 until the
end of the war. Only to a limited extent was their
effort directed against Enigma, but that should
not be regarded as official lack of interest in the
Poles, rather as a decision about deployment of
cryptanalytic talent in a changing world. What

22 PISM Kol A.XII.24/63, Kol 242/54.
23 PISM Kol A.XII.24/63.
24 Rejewski’s paper, or a summary of it, was almost
certainly provided to MI6, but it may have gone no further.
There is no indication in the GCCS files that it was received
or acted upon at Bletchley Park.
the Poles actually did, both at PC Cadix and at
Felden, was of high quality and highly regarded,
and it should not be seen as a slight on them that
they were asked to carry out this work.

Acknowledgments
The author acknowledges the invaluable
assistance of Dr Janka Skrzypek in interpreting
and translating Polish-language material and Dr
Marek Grajek for useful discussions. He also
wishes to thank the staff of the SHD, and of the
Sikorski and Piłsudski Institutes in London for
help with archival material, and to acknowledge
the contribution of the reviewers of the draft
manuscript for their helpful comments. No
external funding was provided and no conflict of
interest is believed to exist in the creation of this
paper.

References
Bertrand, Gustave. 1972. Enigma, ou la plus grande énigme de la guerre 1939-1945. PLON, Condé-sur-Escaut, France.
Bloch, Gilbert. 1986. Quelques Eléments Relatifs au PC ‘Cadix’, à sa fin et au Sort de l’Équipe Polonaise (unpublished manuscript). SHD GR 1K 953/2.
Braquenié, Henri. 1975. Interview avec le capitaine Henri Braquenié, in ‘Geheimoperation Wicher’, p318-328, 1989, Karl Müller, Bonn, Germany.
Ciechanowski, Jan Stanisław, and Jacek Tebinka. 2005. Cryptographic Cooperation - Enigma, in The Report of the Anglo-Polish Historical Committee, vol 1, chapter 46 (Tessa Stirling, Daria Nałęcz and Tadeusz Dubicki, eds). Vallentine Mitchell, Edgware, UK.
Garliński, Józef. 1979. Intercept. J.M. Dent & Sons Ltd, London, UK.
Grajek, Marek. 2010. Enigma - Bliżej Prawdy. Rebis, Poznań, Poland.
Herivel, John. 2008. Herivelismus and the German Military Enigma. M and M Baldwin, Cleobury Mortimer, UK.
Kozaczuk, Władysław. 1998. Enigma. Greenwood Press, Westport, CT.
Kapera, Zdzisław J. 2015. The Triumph of Zygalski’s Sheets. The Enigma Press, Kraków-Mogilany, Poland.
Maresch, Eugenia. 2005. The Radio-intelligence Company in Britain, in ‘Living with the Enigma Secret’, p 185-200, Bydgoszcz City Council, Bydgoszcz, Poland.
Medrala, Jean. 2005. Les Reseaux de Renseignements Franco-Polonais 1940-1944. L’Harmattan, Paris, France.
Navarre, Henri. 1978. Le Service de Renseignements 1871-1944. Plon, Évreux, France.
Paillole, Paul. 1975. Services Spéciaux (1939-1945). Robert Laffont, Paris, France.
Paillole, Paul. 1985. Notre Espion chez Hitler. Robert Laffont, Paris, France.
Polak, Wojciech. 2005. Marian Rejewski in the sights of the Security Services, in ‘Living with the Enigma Secret’, p 75-88, Bydgoszcz City Council, Bydgoszcz, Poland.
Rejewski, Marian. 2011. Memories of my work at the Cipher Bureau of the General Staff Second Department. Adam Mickiewicz University Press, Poznań, Poland.
US Navy Cryptanalytic Bombe - A Theory of Operation and Computer
Simulation
1 Introduction

In 1942, with the help of Bletchley Park, the US Navy signals intelligence and cryptanalysis
group OP-20-G started working on a new Turing bombe design. The result was a machine with both
similarities and differences compared to its British counterpart.

There is an original US Navy bombe still in existence at the National Cryptologic Museum in
Fort Meade, MD, USA. The bombe on display is not in working order and the exact way it was
operated is not fully known.

The US Navy bombe was based on the same principles as its British version but had a different
appearance and thus a different way of operation. The bombes were used to search through a part
of the Enigma key space, looking for a possible Enigma rotor core starting position which would
not contradict a given enciphered message and its plaintext (Carter, 2008).

A theory, based on previous research (Wilcox, 2006) and knowledge of how the British bombe
works, is presented of how the US Navy bombe was operated and it is shown with a computer
simulation that the theory is sound.

The computer simulation presents a graphical user interface and runs at approximately histori-

Figure 1: An operator setting up the wheels on a US Navy bombe. Source: NSA

It is assumed that the reader is familiar with the Enigma machine. This knowledge is widely
available, for example in (Welchman, 2014).

To find an Enigma message key with the bombe it is necessary to have a piece of plaintext, a
crib, corresponding to a part of the encrypted message. A crib could be a common word or a
stereotyped phrase which is likely to be present in a message, for example Wettervorhersage
which is the German word for weather forecast. The crib is used to derive a configuration of the
bombe and an assumption of the Enigma rotor starting position is made. Once started the bombe
will scan through all possible Enigma rotor core positions and stop when a position has been
found that does not lead to a logical contradiction for the given crib (Carter, 2008). If a
logical contradiction occurs then the state of the bombe represents a setting of an Enigma where
it would not be possible to encipher the assumed plaintext to the ciphertext of the crib. Each
stop is subject to further tests after
count in the following discussion. In practice this could not have been known, but the bombe
would still have found a solution since there is no further second wheel movement during the
crib; the entire crib has one and only one wheel position for the second wheel. The difference
is that the second wheel now has to be set to A instead of Z which it otherwise would have been
assumed to be. Therefore the bombe is adjusted so that the wheels on wheel bank 1 are set to
25, 25, 0, 0. This corresponds to {ZZAA}.

The wheels of wheel bank 2 are set to 25, 25, 0, 1 = {ZZAB}, wheel bank 3 to
25, 25, 0, 2 = {ZZAC} and so on all the way up to wheel bank 13 which is set to
25, 25, 0, 12 = {ZZAM}.

The wheel order, reflector plugs and the start position of the bombe wheels are now set up. The
next step is to connect the wheel banks according to the letters of the message.

3.2 Bank Switches

There are two 26-step rotary switches for each wheel bank, one for the input letter to the bank
and one for the output letter. The rotary switch connects the rotor bank to the diagonal board
which utilises the symmetrical properties of the Enigma plugboard to interconnect the bombe
wheel banks. All of the 32 switches are located on the front of the bombe. These switches
eliminate the need for a plug board as found on the back of the British bombe and thus make
setting up a crib on the bombe much faster (Turing, 1942).

The British bombe, on the other hand, could have up to three cribs or wheel orders connected at
the same time on one bombe. The British bombes usually had 36 wheel banks of three wheels each,
corresponding to 36 Enigma machines.

The plaintext letters of the message are considered to be the input to the corresponding rotor
bank and the ciphertext letters to be the output.

For example, for wheel bank 1 which corresponds to the first letter of the message, the input
is K and the output L. Therefore the left switch of the two bank switches corresponding to rotor
bank 1 is set to 10 for the letter K. The right switch is set to 11 for L.

For wheel bank switch 2 the input switch is set to 17=R and the output switch to 0=A.

The rest of the wheel bank switches are set up in the same way with the last, number 13, set to
11=L, 1=B according to the last letter of the crib (see table 1).

3.3 Wheel Positioning

Apart from the four wheels in a wheel bank, one for each Enigma rotor, there is also a reflector
plug which has the same function as the reflector on the Enigma. Since the top wheel of the
bombe is connected to the reflector plug it can be assumed that this represents the leftmost
Enigma rotor, which is connected to the reflector of the Enigma. The bottom wheel of the bombe
corresponds to the rightmost Enigma rotor.

3.4 Input Switch

The bombe works by injecting a test current into a position corresponding to a certain letter
of the diagonal board. This current then propagates through the system and stops the bombe if it
fails to reach all other letters of the alphabet.

To select the letters where test currents are injected the bombe has two 26-step rotary
switches marked PRI and SEC for primary and secondary. Normally only the primary input is set.
When using a crib where the letters of the crib and the corresponding ciphertext form two
separate graphs the secondary input is also needed.

The input should be connected to a frequently occurring letter in the crib. L is selected as it
occurs at three places in the example message. The primary input switch is switched to 11 which
corresponds to L. The secondary input switch is not needed in this case and is set to OFF.

3.5 Printer

On the back of the bombe the cables of the printer are connected to the diagonal board sockets
representing the letters in the message. The following letters are present: A, B, C, E, F, K, L,
N, O, R, T, U, X. The printer cables for these letters should be connected to their respective
socket on the diagonal board with A=0, B=1 and so on.

4 US Navy Bombe Model

A theoretical model is presented of how it is assumed the different parts of the US Navy bombe
interacted.

4.1 Diagonal Board

The central component in the US Navy bombe is the diagonal board. The diagonal board has 26
input nodes, one for each letter of the alphabet. Each input node consists of 26 conductors, one
for each letter of the alphabet. The diagonal board utilises the fact that if a letter A on the
plugboard of the Enigma is connected to letter B, then it follows by the symmetrical design of
the plugboard that letter B must be connected to A. Let conductor y of diagonal board node x be
denoted DB(x, y), then the connections on the diagonal board can be described: DB(x, y) is
connected to DB(y, x).
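
The numbering convention (A=0 up to Z=25), the wheel bank start positions from the setup example and the diagonal board rule can be checked with a short Python sketch. This is an illustration only: the function names are hypothetical and not taken from the simulator, it is assumed that only the last of the four wheel positions differs between banks (one step per bank, as in the ZZAA to ZZAM example), and the code models only the wiring rule DB(x, y) to DB(y, x), not the electrical behaviour of the bombe.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25


def to_number(letter: str) -> int:
    """Switch position for a letter, using the A=0 ... Z=25 convention."""
    return ALPHABET.index(letter)


def to_letters(positions) -> str:
    """Wheel positions such as (25, 25, 0, 0) written as letters, e.g. 'ZZAA'."""
    return "".join(ALPHABET[p] for p in positions)


def positions_for_bank(bank: int, start=(25, 25, 0, 0)) -> tuple:
    """Start positions of wheel bank `bank` (1-based).

    Assumption: only the last wheel position changes between banks and it
    advances by one step per bank, as in the ZZAA ... ZZAM example.
    """
    first, second, third, last = start
    return (first, second, third, (last + bank - 1) % 26)


def diagonal_partner(node: str, conductor: str) -> tuple:
    """DB(x, y) is wired to DB(y, x): swap node and conductor."""
    return (conductor, node)


if __name__ == "__main__":
    for bank in (1, 2, 3, 13):
        print(bank, positions_for_bank(bank), to_letters(positions_for_bank(bank)))
    # Bank switch settings for wheel bank 1: input K, output L.
    print(to_number("K"), to_number("L"))
    print(diagonal_partner("A", "B"))
```

Applying `diagonal_partner` twice returns the original pair, which mirrors the symmetry of the Enigma plugboard that the diagonal board exploits.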
The exact format of the original printouts is
unclear. The information in the example above
would most likely have been represented by num-
bers only (Wilcox, 2006) as this is the norm on
the rest of the bombe. This matches the rotor core
starting position of the Enigma used to encrypt the
message (see section 2).
The setting found will be subject to further,
manual, tests using a simplified Enigma machine:
the M-9 Checking Machine. The output from this
process would be either more of the plugboard
connection pairs, or the conclusion that the stop
was in fact false.
After this there would be a brief set of trial and error tests to find a suitable ring setting
that would decrypt the whole message.

Figure 4: Sequence diagram showing how the central switchboard component of the simulator
distributes information between two rotor banks and the diagonal board.

6 Computer Simulation
Figure 5: US Navy Bombe computer simulation screenshot showing the front of the bombe. By
interacting with the various parts of the bombe in the simulation, a crib can be set up and run.
The simulator is written in the Haxe programming language and uses the NME framework.

tional Cryptologic Museum. This bombe is supposedly the last one manufactured.

8 Acknowledgments

We would like to thank the National Cryptologic Museum for providing us with useful information
on the US Navy bombe, and we thank the three anonymous reviewers for their valuable comments. We
would also like to thank Dr. J Jacob Wikner, Associate Professor at the Department of Electrical
Engineering, Linköping University, Sweden, for hints and tips on how to shape the manuscript.

CryptoMuseum. 2017. Enigma M4 message. http://www.cryptomuseum.com/crypto/enigma/msg/p1030681.htm. [Online; accessed 24-October-2017].
Joseph R. Desch. 1942. Memo of Present Plans for an Electro-Mechanical Analytical Machine. http://cryptocellar.org/USBombe/desch.pdf. [Published online by Frode Weierud in 2000, accessed 16-September-2016].
Alan M. Turing. 1942. Visit to NCR. http://cryptocellar.org/USBombe/turncr.pdf. [Published online by Frode Weierud in 2000, accessed 16-September-2016].
Gordon Welchman. 2014. The Hut Six Story. M & M Baldwin, 6 edition. ISBN: 978-0-947712-34-1.
What We Know About Cipher Device “Schlüsselgerät SG-41” so Far
Carola Dahlke
Deutsches Museum, Germany
c.dahlke@deutsches-museum.de
Although there are basic explanations of the working principle of the machine (e.g. WDGAS-14),
it was hitherto not possible to understand the exact mode of operation of the machine and to be
able to simulate it. No construction drawings were found, and interrogation papers from Menzer
himself and from colleagues have not been released so far (e.g. TICOM I-71, I-72, I-73 &
DF-174). In addition, only few devices are known. Mostly, they were destroyed, dumped or burnt
at the end of WW2. So after all, if a device is found nowadays, it is in most cases not in
working order anymore.

2.1 Standard Model

Menzer’s standard Schlüsselgerät 41 had a QWERTZ keyboard and was used from 1944 until the end
of the war by the Abwehr (Secret Service) (Mowry, 1983-84). The letter J replaces the space-key
(Kopacz, in prep) and is marked in red on the keyboard. According to Batey (2009), Bletchley
managed to decipher few messages due to handling mistakes of the user, but they could not
reconstruct the principle of the machine until they captured it after the end of WW2.

The Deutsches Museum owns an SG-41 that has lately been found in the forest grounds near Munich.
It seems that someone had deposited it there at the end of WW2. Of course, after approximately
70 years in the ground, it is completely corroded, so it is not possible to gain helpful
information from it regarding its encryption algorithm.

Figure 1: SG-41 Collection Deutsches Museum No. 2017-803, Photo: Konrad Rainer

A special model Z with ten figure traffic was constructed to be used for encrypting weather
reports. Originally, 2,000 to 7,000 pieces were ordered at Wanderer Werke AG at
Siegmar-Schönau/Chemnitz (see Sächsisches Staatsarchiv Chemnitz). But TICOM documents speak of
very few (TICOM I-194) or about 1,000 (TICOM I-57) pieces that were truly fabricated and used by
the Luftwaffe (Air Force) from 1944 until the end of the war.

Figure 2: SG-41Z Collection Deutsches Museum No. 2013-1092, Photo: Inga Ziegler

In 2013, the Deutsches Museum was able to purchase an SG-41Z that had been dumped in a lake near
Berlin at the end of WW2. As it was restored before it was put up for sale, it looks as new, at
least from the outside. Internally it is, like our other model, completely corroded.

3 Sources and Outlook

The Schlüsselgerät 41 and its inventor, Fritz Menzer, are largely unknown up to date. Some
interesting details have already been provided by documents from the Target Intelligence
Committee, USA and UK (TICOM). Immediately after the end of the war, TICOM conducted surveys and
investigations with prisoners of war and recorded these in the TICOM documents (since 2009
released by the NSA as so-called declassified documents).
But as long as the respective TICOM documents are not available it will only be possible to
reconstruct the encryption algorithm by the help of a functional Schlüsselgerät. Fortunately,
the engineer and specialist for cipher machines Klaus Kopacz from Stuttgart, Germany, was
recently able to purchase and repair an original SG 41. A publication about the working
principle and the complete technical details is planned by him in the near future.

As soon as the encryption details are published, it will be possible to simulate the algorithm
and to evaluate the real impact of this device for the development of cipher machines after
WW2. For example, the wheel-stepping mechanism, as well as the negation function of the sixth
wheel, were implemented again in other pin-and-lug cipher devices after WW2, although
mechanically solved in a different way (see H54 from Hell, and Version M of the CX52 from Crypto
AG; Kopacz, in prep).

Other sources, especially German, British and U.S. American sources from archives, museums, and
collectors, could provide more aspects and information. As well, we intend to perform a CT-scan
to retrieve information about the internal parts of our machines. This is the focus for the next
year.

Acknowledgments

First of all we would like to thank Dr. Marisa Pamplona and Christina Elsässer from the
restoration research department of the Deutsches Museum for the material analysis and the
helpful tips for designing the showcase, and Konrad Rainer and Inga Ziegler for the beautiful
photos. We also thank Robert Jahn from Libellulafilm for his research in the Chemnitz Archive.
Finally and most of all we thank Klaus Kopacz for his time and energy to explain the
Schlüsselgerät 41 and to share his exciting insights with us.

References

David Mowry. 1983-1984. Regierungs-Oberinspektor Fritz Menzer: Cryptographic Inventor Extraordinaire. Cryptologic Quarterly Articles, 2 (3-4).
David Mowry. 2014. German Cypher Machines of World War II. NSA history program.
Erich Hüttenhain. 1970. Einzeldarstellungen aus dem Gebiet der Kryptologie. Bavarian State Library, Reading Room for Manuscripts and Rare Books. Munich.
Klaus Kopacz. In prep. Schlüsselgerät 41.
Mavis Batey. 2009. Dilly, The Man Who Broke Enigmas. ISBN 978-1-906447-01-4.
Sächsisches Staatsarchiv Chemnitz, 31030 Wanderer-Werke AG, Siegmar-Schönau, Signatures: 1975, 3156 and 1212.
TICOM I-194: Report on German meteorological cipher systems and the German met. Intelligence service. Released by NSA 2009. No DOCID.
TICOM I-57: Enciphering devices worked on by Dr. Liebknecht at Wa Pruef 7. Released by NSA 2009. DOCID: 3541302.
The cryptology of German Intelligence Services. Released by NSA 2009. DOCID: 2525898
WDGAS-14: Volume 2 – Notes on German high level cryptography and cryptoanalysis. Released by NSA 2009. DOCID: 3560816.
POSTER AND DEMO
An Automatic Cryptanalysis of Playfair Ciphers Using
Compression
Noor R. Al-Kazaz
School of Computer Science, Bangor University, Bangor, UK
n.al-kazaz@bangor.ac.uk
noor82.nra@gmail.com

Sean A. Irvine
Real Time Genomics, Hamilton, New Zealand
sairvin@gmail.com

William J. Teahan
School of Computer Science, Bangor University, Bangor, UK
w.j.teahan@bangor.ac.uk
Plaintexts containing any numerical values such as contact number, house number or date of
birth can be easily enciphered using this extended method (Ravindra Babu et al., 2011).

2.1 Cryptanalysis of Playfair Ciphers

Different cryptanalysis methods have been invented to break Playfair ciphers using computer
methods. An evolutionary method for Playfair cipher cryptanalysis was presented by Rhew (2003).
The fitness function was based on a simple version of dictionary look-up with the fitness
calculated based on the number of words found. However, results obtained from this method were
poor with run-time requiring several hours. A genetic algorithm was proposed by Negara (2012)
where character unigram and bigram statistics were both used as a basis of calculating the
fitness function. The efficiency of the algorithm is affected by different parameters such as
the genetic operators, ciphertext length and fitness function. Five initial keys out of twenty
were successfully recognized in less than 1000 generations and ten out of twenty were fully
recovered in less than 2000 generations. Two ciphertexts were examined in this paper: one with
520 characters and the other with 870 characters. Hammood (2013) presented an automatic attack
against the Playfair cipher using a memetic algorithm. The fitness function calculation was
based on character bigram, trigram and four-gram statistics. A ciphertext of 1802 letters was
examined in this paper and 22 letters out of 25 were successfully recovered using this method.

Simulated annealing was successful at solving lengthy ciphers as reported by Stumpel (2017).
However, he found that short Playfair ciphers of 100 letters or so could not be solved.
Simulated annealing was also used with a tetragraph scoring function for the automatic
cryptanalysis of short Playfair ciphers by Cowan (2008). Cowan managed to solve seven short
ciphertexts (80-130 letters) that were published by the American Cryptogram Association.

In summary, several different cryptanalysis methods have been proposed aiming to break Playfair
ciphers with varying degrees of success. However, most of these methods were focused on long
ciphertexts of 500 letters or more, except Cowan’s method (2008). A large amount of information
that is provided by long ciphertexts makes breaking them easier while short Playfair ciphers are
extremely difficult to break without some known words. In our paper, even Playfair ciphertexts
as short as 60 letters (without a probable crib) have been successfully decrypted using our new
universal compression-based approach. We use simulated annealing in combination with
compression for the automatic decryption. Moreover, we have also effectively managed to break
extended Playfair ciphers that use a 6 × 6 key matrix.

2.2 Playfair’s Weaknesses

The Playfair cipher suffers from some major weaknesses. An interesting weakness is that repeated
bigrams in the plaintext will create repeated bigrams in the ciphertext. Furthermore, a
ciphertext bigram and its reverse will decipher to the same pattern in the plaintext. For
example, if the ciphertext bigram “CD” deciphers to “IS”, then the ciphertext “DC” will decrypt
to “SI”. This can help in recognising words easily, especially most likely words. Another
weakness is that English bigrams that are most frequently occurring can be recognised from
bigram frequency counts. This can help again in guessing probable plain words (Smith, 1955;
Cowan, 2008).

Breaking short Playfair ciphertexts (less than 100 letters) without good depth of knowledge of
previous messages or with no probable words has proven to be a challenge. Past research has
often used much longer ciphertexts; for example, Mauborgne (1914) developed his methods by
deciphering a Playfair ciphertext of 800 letters. Also, the Playfair messages that were
circulating between the Germans and the British during war had enough depth with many probable
words to make them easily readable between these two sides, with no predictor of decrypting
success for short messages on anonymous topics (Cowan, 2008). However, the two conditions that
the message is short with little depth (no probable words) apply to cryptograms published by the
American Cryptogram Association.

3 Our Method

This section describes our new method for the automated cryptanalysis of the Playfair cipher.
The problem of quickly recognising a valid decrypt in a ciphertext only attack has been
acknowledged as a difficult problem (Irvine, 1997). What we require is a computer model that is
able to accurately predict natural language so that we can use it as a metric for ranking the
quality of each possible permutation (Al-Kazaz et al., 2016). The PPM text compression algorithm
provides one possibility since it is known that PPM compression models can predict language
about as well as expert human subjects (Teahan and Cleary, 1996).

Hence, the main idea of our approach depends on using the PPM method to compute the compression
‘codelength’ for each putative decryption of the ciphertext with the given key. The codelength
of a permutation for a cryptogram in this case is the length of the compressed cryptogram, in
bits, when it has been compressed using the PPM language model. The smaller the codelength, the
more closely the cryptogram resembles the model. Experiments have shown that this metric is very
effective at finding valid solutions automatically in other types of cryptanalysis (Al-Kazaz et
al., 2016). In this paper, we show how to use this approach to quickly and automatically
recognise the valid decrypt in a ciphertext only attack specifically against Playfair ciphers.

In the PPM compression algorithm, the probability of the next symbol is conditioned using the
‘context’ of the previously transmitted symbols. These probabilities are based on simple
frequency counts of the symbols that have already been transmitted. The primary decision to be
made is the maximum context length to use to make the predictions of the upcoming symbol. The
‘order’ of the model is the maximum context length used to make the prediction. Many variants of
the original Cleary and Witten approach have been devised such as PPMA, PPMB, PPMC and PPMD.
These differ mainly by the maximum context length used, and the mechanism used to cope with
previously unseen or novel symbols (called the zero frequency problem). When a novel symbol is
seen in a particular context, an ‘escape’ is encoded, which results in the encoder backing off
to the next shorter context. Several escapes may be needed before a context is reached which
predicts the symbol. It may be necessary to escape down to the order 0 (null) context which
predicts each symbol based on the number of times it has occurred previously, or for symbols not
previously encountered in the transmission stream, a default model is used where an order ‘-1’
context predicts each symbol with equal probability.

Most experiments show that the PPMD variant developed by Howard (1993) produces the best
compression compared to the other variants. The probabilities for a particular context using
PPMD are estimated as follows:

$$p(s) = \frac{2c(s) - 1}{2n} \quad \text{and} \quad e = \frac{t}{2n}$$

where p(s) is the probability for symbol s, c(s) is the number of times symbol s followed the
context in the past, n is the number of times the context has occurred, t denotes the number of
symbol types and e is the probability assigned to perform an escape. For example, if a specific
context has occurred three times previously, with three symbols a, b and c following it one
time, then the probability of each one of them is equal to $\frac{1}{6}$ and the escape symbol
probability is $\frac{3}{6}$.

As PPM is normally an adaptive method, at the beginning there is insufficient data to
effectively compress the texts which results in the different permutations producing similar
codelength values. This can be overcome by priming the models using training texts that are
representative of the text being compressed. In our experiments described below, we use nineteen
novels and the Brown corpus converted to 25 letter English by case-folding to upper case with I
and J coinciding for the 5 × 5 grid and 36 alphanumeric characters for the 6 × 6 grid to train
our models. Also, unlike standard PPM which uses purely adaptive models, we use static models
which are not updated once they have been primed from the training texts.

Our new method is divided into two main phases. The first phase (Phase I) is based on trying to
automatically crack a Playfair ciphertext using a combination of two approaches: the compression
method for the plaintext recognition and simulated annealing for the search. The second phase
(Phase II) is based on achieving readability by automatically adding spaces to the decrypted
message produced from Phase I, as the spaces are omitted from the ciphertext traditionally. A
variation of an order 5 PPMD model without update exclusions has been used in our experiments
for both Phase I and Phase II. This variation is where symbol counts are updated for all
contexts unlike standard PPM where only the highest order contexts are updated
until the symbol has been seen in the context. In our experiments, this variation has proven to
be the most effective method that can be applied to the problem of automatically recognising the
valid decryption for Playfair ciphers, but also in other experiments with transposition ciphers
(Al-Kazaz et al., 2016).

Simulated annealing is a probabilistic method for approximating the global optimisation of a
given function in a large search space. It is a descendant of the hill-climbing technique. This
latter technique is based on starting with a random key, followed by a random change over this
key such as swapping two letters, to generate a new key. If this key produces a better solution
than the current key, it replaces the current one. Different n-graph statistics were used as the
scoring function to judge the quality of solutions. After millions of distinct random changes,
this technique attempts to discover the correct key.

The weakness of this approach lies in the possibility of being stuck in local optima, where the
search has to be abandoned and it is necessary to restart all over again. Simulated annealing
(inspired by a process similar to metal annealing) is similar to hill-climbing with a small
modification that often leads to an improvement in performance. In addition to accepting better
solutions, simulated annealing also accepts worse solutions in order to avoid the local optima.
This approach permits it to jump from local optima to different locations in order to find new
optima. The probability of the acceptance of the specific solution is dependent on how much the
score value is worse. The formula for calculating the acceptance probability is
$P_A = \frac{1}{e^{d/T}}$ where e is the exponential constant 2.718, d denotes the difference
between the score of the new solution and the score of the current solution, and T is a value
called temperature (further details concerning this parameter are described below). Whenever the
difference is small, the probability of accepting the new solution is high, while if this
solution is much worse than the current one (the difference is large in magnitude), the
probability becomes small. The probability value is also influenced by the temperature T.
Initially, the algorithm starts with a high temperature value, then it is reduced (‘cooled’) at
each step according to some annealing schedule, until it reaches zero or some low limit. As the
temperature drops, the probability of acceptance also decreases and when T is set to zero, the
simulated annealing becomes identical to the hill climbing technique.

The main idea of using simulated annealing for the breaking of Playfair ciphers is to modify the
current key in the hope of producing a better key. This is based on an approach proposed by
Cowan (2008). This can be done by randomly swapping two characters. However, this random change
is not enough to effectively break the Playfair cipher by itself. It will usually result in a
long search process that often gets stuck within reach of the final solution. So other
modifications are needed such as randomly swapping two rows, swapping two columns, reversing the
key, and reflecting the key vertically and horizontally (flipping the key top to bottom and left
to right). Using a mix of these modifications can lead to the valid solution. For example,
swapping two rows will help rearrange rows if they are out of order, as it is very important
that rows be in the correct order according to the encipherment rules (Lyons, 2012).

During the whole search process, the hope is that the best plaintext solution that appears is
also the correct plaintext. Alternatively, the whole process must be restarted all over again
and the value of the temperature should be reset to its original high value (Cowan, 2008). An
important aspect of this whole process is the metric that is used to rank the different
plaintexts (such as our PPM method). A good metric needs to be able to distinguish effectively
between good and poor plaintexts.

Algorithms 1 and 2 present the pseudo code for the first phase of our method. In a preprocessing
step prior to the applications of these algorithms, all non-letters including spaces, numbers
and punctuation were removed from the ciphertext if a grid of 5 × 5 is chosen. If a 6 × 6
grid-width is selected, all characters other than letters and numbers were removed from the
ciphertext instead. According to the selected grid-width, a random key is generated (line 1) and
the deciphering operation is initiated using this key. In order to rank the quality of the
solutions, the PPM compression method is used by calculating the codelength value for each
possible solution (lines 3 and 4). For each iteration, a sequence of changes is performed over
the generated key in order to find a solution with a smaller codelength value which represents
the valid decryption (lines 5 to 33). The greater the number of iterations, the more likely a
solution will be found, but longer execution time will be needed. It is important to note here
that we have used negative scores based on the PPM codelengths values in order to maximize
rather than minimize scores for the simulated annealing process as per the standard approach
adopted in various solutions (Cowan, 2008; Lyons, 2012).

ble combinations of 4 symbols to try to avoid that. Trying 3, 5, or even more combinations of
symbols is possible, but of course the higher the number, the search starts getting very
expensive, so 4 provides a reasonable compromise. Finally, the deciphered text is returned with
the smaller codelength value which represents the best solution found (line 34). This has proved
adequate for the solution of most ciphers, but if necessary, it is still possible to iterate the
attack several more times.

Algorithm 1: Pseudo code of the main decryption phase ‘Phase I’.
Input : ciphertext, Playfair grid-width to be either 5 × 5 or 6 × 6
Output: deciphered-text
1 generate a random key according to the Playfair grid-width selected
2 currentBestKey ← randomKey
3 decipher the ciphertext using the currentBestKey and calculate the codelength value using the
  PPM compression method
4 currentBestScore ← − PPM-codelength score (decipher-text)
5 for Iteration ← 0 to 99 by 1 do
6     maxKey ← currentBestKey
7     decipher and calculate the codelength value using the PPM compression method

The temperature for the simulated annealing based algorithm is initially set to 20 and reduced
by 0.2 in subsequent iterations. (The smaller this amount is, the more likely a solution will be
found but this will also result in longer execution time.) The initial temperature value is
essentially dependent on the cryptogram’s length. The shorter the ciphertext, the lower the
temperature will be needed and vice versa. We have found in experiments with different length
ciphertexts that for cryptograms of a length of around 70, an initial temperature will need to
start at around 10, but for the cryptogram of 700 characters, a temperature at 20 or so is
effective.
For each temperature, 10,000 keys are tested 8 maxScore ← − PPM-codelength score (decipher-text)
9 for Temp ← 20 downto 0 by 0.2 do
then a reduction in the temperature is per- 10 for Count ← 0 to 9999 by 1 do
modify maxKey by choose a random
formed (see lines 9 to 32 in the algorithm). A 11
number between (1, 50):
loop is executed 10,000 times (lines 10 to 31) 12 if the number is 0 then swap two
rows, chosen at random
that modifies the key in the hope of finding 13 if the number is 1 then swap two
a better key with a smaller codelength value. 14
columns, chosen at random
if the number is 2 then reverse the
A sequence of different modifications over the 15
key
if the number is 3 then reflect the
key is performed in lines 11 to 17. The en- key vertically, flip top to bottom
if the number is 4 then reflect the
crypted text is then deciphered using the mod- 16
key horizontally, flip left to right
ified key and the codelength value is calculated 17 if any other number then swap two
characters at random
using the PPM compression method (lines 19 18 newKey ← modi f ied-maxKey
decipher and calculate the codelength
and 20). Then, the difference is calculated be- 19
value using the PPM compression method
tween the new codelength value and the pre- 20 newScore ←
− PPM-codelength score (decipher-text)
vious one. If the new value (line 21) is bet- 21 calculate di f f ← newScore − maxScore
ter (that is, the codelength value is smaller), 22 if di f f >= 0 then {maxScore ← newScore;
maxKey ← newKey}
then the maximum score is set to the new score 23 else if Temp > 0 then
calculate probability ← exp(di f f /Temp)
(line 22), otherwise a probability of acceptance 24
25 generate a random number between
is calculated (line 24) if the temperature is 26
(0, 1)
if probability > randomNumber then
greater than 0 (line 23). In this case, a random 27 {maxScore ← newScore;
number between 0 and 1 is generated, and if 28
maxKey ← newKey}
if maxScore > currentBestScore then
the calculated probability is greater than this 29 currentBestScore ← maxScore;
currentBestKey ← maxKey
number, the modified key is accepted (see lines 30 Make systematic
26 to 27). If we have a new best score, then rearrangements(ciphertext,
currentBestKey, currentBestScore)
the old one is replaced (line 29) and system- 31 end
atic rearrangements are performed by calling 32 end
33 end
Algorithm 2. These include mutations (lines 34 return the deciphered text with the best key
4 to 10 in the new algorithm), row swapping
and column swapping (lines 11 to 17) and an
exhaustive search over all 4 ! possible permu- Concerning the second phase of our ap-
tations of each group of four symbols (lines 18 proach, Algorithm 3 illustrates the pseudo
to 24). Swapping single pairs of letters results code for this phase. The main idea of this
in the search getting stuck in local maxima too phase, as stated before, is to try to insert
often, so we added the swapping of all possi- spaces into the deciphered text outputted from
Phase I in order to achieve readability. PPM is again applied to rank the solutions. The Viterbi algorithm is used in this phase to find the best possible segmentation. In this algorithm, looping over the deciphered text (that was produced as output from Algorithm 1) is performed in line 2. A word segmentation algorithm based on the Viterbi algorithm (Teahan, 1998) is then used to search for the best performing segmentations to keep in a priority queue, and those which showed poor codelength values are pruned (see lines 3 to 5). The best segmented deciphered text is returned in the last line (line 6).

Algorithm 2: Make systematic rearrangements
Input : ciphertext, currentBestKey, currentBestScore
Output: currentBestKey, decipher-text
 1  flag ← true
 2  while flag do
 3      flag ← false
 4      perform systematic mutations over the currentBestKey:
 5          decipher and calculate the codelength value using the PPM compression method
 6          newScore ← − PPM-codelength score (decipher-text)
 7          if newScore > currentBestScore then
 8              flag ← true
 9              currentBestScore ← newScore; currentBestKey ← newKey
10              continue outer While loop
11      perform systematic row-swaps and column-swaps over the currentBestKey:
12          decipher and calculate the codelength value using the PPM compression method
13          newScore ← − PPM-codelength score (decipher-text)
14          if newScore > currentBestScore then
15              flag ← true
16              currentBestScore ← newScore; currentBestKey ← newKey
17              continue outer While loop
18      perform swapping of four characters:
19          decipher and calculate the codelength value using the PPM compression method
20          newScore ← − PPM-codelength score (decipher-text)
21          if newScore > currentBestScore then
22              flag ← true
23              currentBestScore ← newScore; currentBestKey ← newKey
24              continue outer While loop
25  end
26  return currentBestKey, decipher-text

Algorithm 3: Pseudo code for Phase II
Input : the deciphered text from Phase I
Output: segmented deciphered text
 1  maximum size of Q1 (priority queue) ← 1;
 2  do
 3      use the Viterbi algorithm to search for the best segmentation sequences;
 4      store the texts that have the best segmentation in Q1;
 5  while the end of the deciphered text is not reached;
 6  return the best segmented deciphered text from Q1;

4 Experimental Results

In this section, we discuss the experimental results of our approach. As stated, in our method the order-5 PPMD model has been trained on a corpus of nineteen novels and the Brown corpus using 25 English letters (when a 5 × 5 grid is used) and 36 alphanumeric characters (when a 6 × 6 grid is used). After this training operation and during cryptanalysis, these models remain static. Regarding the cryptograms test corpus, 70 different cryptograms were chosen at random from different resources including cryptograms published by the American Cryptogram Association, cryptograms published by geocache enthusiasts, and two cryptograms that were also experimented with by Negara (2012). Cryptogram lengths ranged from 60 to 750 letters.

A sample trace of a decryption is shown in Figure 1 for the cryptogram: ‘dohrxnwpscqusfrwchrnpctsehagvpstsfaprdtuipwolacgqupfwptslaqsizbedxqusfwscosfraevstngqu’. This shows the best score as it changes during the execution of Algorithm 2 for the main decryption phase. The scores are increasing (i.e., the codelengths are decreasing). The solution of this ciphertext is a proverbial wisdom that has been attributed to Damon Runyon: “It may be that the race is not always to the swift nor the battle to the strong but that is the way to bet”. This ciphertext is one of the short cryptograms (82 characters long) that have been published by the American Cryptogram Association, which usually publishes 100 ciphertexts every two months including one or more Playfair ciphers, as a challenge to its members (Cowan, 2008). Cowan has stated that it is extremely difficult to break short messages of 100 letters or so, especially when there are no suspected probable words or cribs and very little depth of knowledge of previous messages. However, our method is able to solve the following examples in addition to the other cryptograms that were listed by Cowan as well as even shorter ciphertexts of 60 letters or so.

A second example in Figure 2 illustrates the robustness of our compression approach by showing how it is able to solve a very short cryptogram. The ciphertext is a 60 letter sentence (a quote by Garrison Keillor): Cats are intended to teach us that not everything in nature has a purpose. The best solution for this example is ‘catsareintendedtoteachusthatnotexerythinginxnaturehaoapurposew’ with the best codelength value -137.68 resulting in only two errors: x→v in ‘exerything’ and o→s in ‘hao’.
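The Phase II idea of picking the cheapest segmentation (Algorithm 3) can be illustrated with a small dynamic program. The Python sketch below is only an illustration under stated assumptions: it replaces the PPM codelength model with a hypothetical toy word-count table (`TOY_COUNTS`), charges an arbitrary penalty for unknown words, and keeps a single best path instead of a priority queue. The names `segment` and `toy_cost` are illustrative and are not taken from the authors' implementation.

```python
import math

# Hypothetical toy word counts; the paper ranks candidates with PPM codelengths instead.
TOY_COUNTS = {"cats": 2, "are": 6, "intended": 1, "to": 8, "teach": 1,
              "us": 3, "cat": 2, "in": 6, "tend": 1, "a": 8}
TOTAL = sum(TOY_COUNTS.values())


def toy_cost(word):
    """Cost in bits of a word: -log2(relative frequency), or a penalty if unknown."""
    if word in TOY_COUNTS:
        return -math.log2(TOY_COUNTS[word] / TOTAL)
    return 10.0 * len(word)  # assumed penalty for out-of-vocabulary strings


def segment(text, word_cost, max_word_len=20):
    """Return the word list with the lowest total cost (Viterbi-style search)."""
    n = len(text)
    best = [0.0] + [math.inf] * n   # best[i]: cheapest cost of segmenting text[:i]
    back = [0] * (n + 1)            # back[i]: start index of the last word in that segmentation
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            cost = best[start] + word_cost(text[start:end])
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    words = []
    end = n
    while end > 0:                  # follow the back-pointers from the end of the text
        start = back[end]
        words.append(text[start:end])
        end = start
    return list(reversed(words))


print(" ".join(segment("catsareintendedtoteachus", toy_cost)))
```

With a real language model in place of `toy_cost`, the same search would prefer segmentations with a smaller total codelength, which is the ranking criterion the text describes.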
[Trace shown in Figure 1 (Damon Runyon cryptogram); caption not recovered]

Iteration: 35
Mutation gain: -221.66
    Text: ridaybetoktxtherkdeconotalwaystothescrufyorthebrtxtletothestucngmitxthatoithewaytobetx
    Key:  zkbncwagerfhmlduvxyitsqpo
Mutation gain: -220.15
    Text: ridaybetvstxthersaeksnotalwaystotheskrufyorthebatxtletothestukngmitxthatksthewaytobetx
    Key:  zcbnkwagerfhmlduvxyitsqpo
Mutation gain: -215.91
    Text: ridaybetvstxthersaemsnotalwaystothesmrufyorthebatxtletothestumngkitxthatmsthewaytobetx
    Key:  zcbnmwagerfhklduvxyitsqpo
Mutation gain: -207.60
    Text: rmdaybetvstxthersaeisnotalwaystothesirufnorthebatxtletothestningkmtxthatisthewaytobetx
    Key:  zcbniwagerfhklduvxymtsqpo
Mutation gain: -204.66
    Text: itzaybetvstxthersaeisnotalwaystotheswiufnorthebatxtletothestorngbutxthatisthewaytobetx
    Key:  dcbniwagerfhklzuvxymtsqpo
Mutation gain: -195.04
    Text: itmaybetvstxthersaeisnotalwaystotheswiufnorthebatxtletothestomngbutxthatisthewaytobetx
    Key:  dcbniwagerfhklmuvxyztsqpo
Row-swap gain: -184.98
    Text: itmaybethvtxthervaeisnotalwaystotheswiftnorthebatxtletothestzongbutxthatisthewaytobetx
    Key:  dcbniwagerfhklmtsqpouvxyz
Row-swap gain: -162.46
    Text: itmaybethatxtheraceisnotalwaystotheswiftnorthebatxtletothestrongbutxthatisthewaytobetx
    Key:  dcbnifhklmtsqpouvxyzwager

[Second trace from the same page; figure number and caption not recovered]

Iteration: 0
Mutation gain: -311.85
    Text: tuemuirecolstaurytrtforxreafmstoopanilercekqbsulatydopatiekedanalondtfulsmonytancheckandeqloilerchui
...
Iteration: 2
Score: -311.01
    Text: adpcdhowwhsvalarcaucedofowleyldmailiumwoseheatrelarbailaonmemlilgatstorelylscalikeesectswpgaumwokedh
...
Iteration: 24
Score: -309.15
    Text: hkxepmmaskbitesratokhdtvmabasedaeferelamlamntbinetnrefetlpgwhereeceithinesevaterbcalxleismecelambcpm
...
Iteration: 30
Row swap gain: -304.26
    Text: amsegplaysonasarealmcarblaterstcrustitalsniymeinsahirusaezkstatsorodtbinsrmreastpensnkodyboritalpegp
...
Iteration: 89
Row swap gain: -215.81
    Text: thecoxordinatesarenorthfortydegrexeszeropointfivethrexetwowestseventyfivedegreestwopointfivezerotwox
...
Iteration: 100
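The traces above come from the simulated annealing search of Algorithm 1. Its acceptance step (lines 21 to 27) and its cooling schedule (line 9) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: scores are negated codelengths as in the text, so a negative `diff` means a worse key, and the function names `acceptance_probability`, `accept` and `cooling_schedule` are illustrative rather than taken from the authors' implementation.

```python
import math
import random


def acceptance_probability(diff, temperature):
    """Probability of accepting a worse key: exp(diff / T), where diff < 0."""
    return math.exp(diff / temperature)


def accept(new_score, current_score, temperature, rng=random):
    """Decide whether the search moves to the new key (Algorithm 1, lines 21 to 27)."""
    diff = new_score - current_score
    if diff >= 0:
        return True                  # better or equal score: always accept
    if temperature <= 0:
        return False                 # at T = 0 only improvements are accepted
    return acceptance_probability(diff, temperature) > rng.random()


def cooling_schedule(start=20.0, step=0.2):
    """Yield temperatures from `start` down to 0 in equal steps (Algorithm 1, line 9)."""
    n_steps = round(start / step)
    for k in range(n_steps, -1, -1):
        yield k * step               # multiply instead of subtracting to limit rounding drift


for temperature in (20.0, 1.0):
    print(temperature, round(acceptance_probability(-2.0, temperature), 3))
```

For a key that is worse by 2 score units, the probability of acceptance is about 0.905 at T = 20 and about 0.135 at T = 1, which shows how cooling makes the search progressively greedier.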
[...] errors. Table 1 presents the results from testing ciphertexts for various lengths. The results overall showed that we are able to attain very high success rates and 60 ciphertexts out of 70 were efficiently solved. Also, 100% of ciphers [...]

Table 1
Length              (values not recovered)
No. of Ciphers      9     21    15    11    8     6
Success Rate (%)    67    81    80    100   100   100

Referring to the second phase of our method, as the spaces are traditionally omitted from the ciphertext, this phase focuses on segmenting the decrypted messages that are outputted from the first phase. The edit distance (or Levenshtein distance) metric is used to quantify how much the decrypted message differs from the original message by counting the minimum number of removal, insertion, or substitution operations required to transform one message into the other (Levenshtein, 1966). In almost all cases, the correct readable decryptions were efficiently found, as illustrated in Figure 5.

Ciphertext:     byntlbneonnuimmzqnhpbkxnqmfqoqnmugclqmeuersuqpnzigqbqyipilqtku
Decrypted text: cats are intended to teach us that not exerything in nature ha o a purpose

Ciphertext:     kuinbrnuikcnqmhuvgtnnmybkgbromruknqmmnknqdmpvgniignkoneumokgpgxytqsu
Decrypted text: experience is the worst teacher it gives the test before presenting the lesson

Ciphertext:     pqghqncnndqyhfqugqqmeusxqmfqdpkgqbitqdkunurqioinnlpgvqvpbmlwhuqoimigbzka
Decrypted text: a n egotist is a man who thinks that if he hadnt been born people would have wondered why

Ciphertext:     qmghblxytkyfihogkunugiqoqmgnqmgincimtlqmmpnuikiwszqmgiknliqrbhafgtigtldnnqgtxz
Decrypted text: the grass may be greener on the other side of the fence but there s probably more of it to mow

Ciphertext:     hmfnuwfntufdbgushmtuqmckqnutfpmatuzfmbfntylxqpthrkucnrkrmcqdamibarurntumucoffdummbrnki
Decrypted text: the likeliness of a thing happening is inversely proportional to its desirability fin agles first law

Ciphertext:     dohrxnwpscqusfrwchrnpctsehagvpstsfaprdtuipwolacgqupfwptslaqsizbedxqusfwscosfraevstngqu
Decrypted text: it may be that the race is not always to the swift nor the battle to the strong but that is the way to bet

Figure 5: Example of solved ciphertexts with spaces inserted after Phase II.

The number of space insertion errors for [...] average space insertion errors for the ciphertexts that were experimented with in Phase II is less than one error.

Figure 6: Segmenting errors produced as a result of the Phase II algorithm. (Plot not reproduced; vertical axis: number of errors.)

Table 2 lists the high recall and precision rates and the low error rate produced by our segmentation algorithm. The recall rate is calculated by dividing the number of successfully segmented words by the number of words in the original testing texts, the precision rate by dividing the number of successfully segmented words by the number of words which are correctly and incorrectly segmented, and the error rate by dividing the number of unsuccessfully segmented words by the number of words in the original testing texts (Al-Kazaz et al., 2016).

Recall (%)    Precision (%)    Errors (%)
96.72         96.12            3.28

Table 2: Recall, precision and error rates for our method for word segmenting the decrypted output produced from Phase I.

The execution times required to decrypt a number of Playfair ciphertexts by our method are presented in Table 3. This table shows the decryption time in seconds for Phase I of our method. The results indicate that our method produces reasonable decryption times, and in most cases the successful decrypts of longer ciphertexts were obtained after only one or two iterations.

Table 3
Ciphertext Length (Letter)    60     71     86     100    124    185    235    526    730
Time (Sec)                    457    539    507    93     36     17     135    107    101
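The recall, precision and error rates defined for Table 2 can be computed from word counts alone. The Python sketch below shows one way to do so, under the assumption that a word counts as "successfully segmented" when it covers exactly the same character span in the reference and in the predicted segmentation; the helper names `word_spans` and `segmentation_rates` are illustrative and do not come from the paper.

```python
def word_spans(segmented):
    """Return the set of (start, end) character spans of the words, ignoring spaces."""
    spans, pos = set(), 0
    for word in segmented.split():
        spans.add((pos, pos + len(word)))
        pos += len(word)
    return spans


def segmentation_rates(reference, predicted):
    """Return (recall, precision, error rate) for a predicted segmentation.

    Both strings must contain the same letters; only the spacing may differ.
    """
    ref, pred = word_spans(reference), word_spans(predicted)
    correct = len(ref & pred)                    # words with identical boundaries
    recall = correct / len(ref)                  # correct / words in the original text
    precision = correct / len(pred)              # correct / words produced by the segmenter
    error = (len(ref) - correct) / len(ref)      # missed words / words in the original text
    return recall, precision, error


print(segmentation_rates("the grass may be greener", "the grass maybe greener"))
```

In this toy case three of the five reference words are recovered, giving a recall of 0.6, a precision of 0.75 and an error rate of 0.4. As in Table 2, recall and error rate sum to one because both are divided by the number of words in the original text.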
5 Conclusion

An automatic cryptanalysis of Playfair ciphers using compression has been introduced in this paper. In particular, a combination of simulated annealing and PPM compression was used in the automatic decryption method. The compression scheme was found to be an effective method for ranking the quality of each possible permutation as the search was performed. In 60 of the 70 ciphertexts that were experimented with (without using a probable word) for different lengths (from as short as 60 letters up to 750), almost all the correct solutions were found. The exception was just two very short ciphers which resulted in two minor errors in the decrypted output. Moreover, we have also managed to decrypt an extended Playfair cipher for a 6 × 6 key matrix.

In addition, a compression-based method was used to segment the decrypted output by insertion of spaces in order to improve readability. Experimental results show that the segmentation method was very effective, producing on average less than one space insertion error with a recall and precision of over 96% for the ciphertexts that were tested.

As PPM provides a different type of scoring function compared to the standard n-gram analysis (such as update exclusions, the escaping back-off mechanism for smoothing the models), it is not clear whether using longer context for n-grams might lead to better results. It is also not clear how PPM compares to the standard n-grams approach and further experimentation (for example with hexagrams) needs to be done.

References

Noor R Al-Kazaz, Sean A Irvine, and William J Teahan. 2016. An automatic cryptanalysis of transposition ciphers using compression. In Int. Conference on Cryptology and Network Security, pages 36–52. Springer, Springer Int. Publishing.

Timothy C Bell, John G Cleary, and Ian H Witten. 1990. Text compression. Prentice-Hall, Inc.

John Cleary and Ian Witten. 1984. Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications, 32(4):396–402.

Michael J Cowan. 2008. Breaking short playfair ciphers with the simulated annealing algorithm. Cryptologia, 32(1):71–83.

Dalal Abdulmohsin Hammood. 2013. Breaking a playfair cipher using memetic algorithm. Journal of Engineering and Development, 17(5).

Swati Hans, Rahul Johari, and Vishakha Gautam. 2014. An extended playfair cipher using rotation and random swap patterns. In Computer and Communication Technology (ICCCT), 2014 International Conference on, pages 157–160. IEEE.

Paul Glor Howard. 1993. The design and analysis of efficient lossless data compression systems. Ph.D. thesis, Brown University, Providence, Rhode Island.

Sean A Irvine. 1997. Compression and cryptology. Ph.D. thesis, University of Waikato, New Zealand.

Richard E Klima and Neil P Sigmon. 2012. Cryptology: classical and modern with maplets. CRC Press.

Vladimir I Levenshtein. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, volume 10, pages 707–710.

James Lyons. 2012. Cryptanalysis of the playfair cipher. http://practicalcryptography.com/cryptanalysis/stochastic-searching/cryptanalysis-playfair/.

Joseph Oswald Mauborgne. 1914. An advanced problem in cryptography and its solution. Fort Leavenworth, Kansas: Leavenworth Press.

Packirisamy Murali and Gandhidoss Senthilkumar. 2009. Modified version of playfair cipher using linear feedback shift register. In Information Management and Engineering, 2009. ICIME’09. International Conference on, pages 488–490. IEEE.

G Negara. 2012. An evolutionary approach for the playfair cipher cryptanalysis. In Proc. of the Int. Conference on Security and Management (SAM), page 1. The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp).

K Ravindra Babu, S Uday Kumar, A Vinay Babu, IVN S Aditya, and P Komuraiah. 2011. An extension to traditional playfair cryptographic method. International Journal of Computer Applications, 17(5):34–36.

Benjamin Rhew. 2003. Cryptanalyzing the playfair cipher using evolutionary algorithms. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.129.4325&rep=rep1&type=pdf.

Laurence Dwight Smith. 1955. Cryptography: The science of secret writing. Courier Corporation.

Shiv Shakti Srivastava and Nitin Gupta. 2011. Security aspects of the extended playfair cipher. In Communication Systems and Network Technologies (CSNT), 2011 International Conference on, pages 144–147. IEEE.

Jan Stumpel. 2017. Fast playfair programs. www.jw-stumpel.nl/playfair.html. Last accessed December 13, 2017.

William J Teahan and John G Cleary. 1996. The entropy of English using PPM-based models. In Data Compression Conference, 1996. DCC’96. Proceedings, pages 53–62. IEEE.

William J Teahan. 1998. Modelling English text. Ph.D. thesis, University of Waikato, New Zealand.
ManuLab System Demonstration
In both cases, the filter takes a set of strings as input and produces a set of strings as output. In case b), the output corresponds with the input. This feature allows several filters to be joined into a chain of operations. This chain can also be saved and loaded.

We have already implemented the following filters:
• n-gram frequency,
• n-gram distances,
• index of coincidence,
• Shannon’s entropy,
• substitution,
• sub-pages selection,
• changing the read direction,
• pattern search.

The result of the analysis is visualised through a pop-up menu for each filter. In most cases, the data can also be exported into a csv file for further processing.

2.4 Source code
The source code is available online at the following GIT repository: https://bitbucket.org/jugin/manulab.git.

• 4 - selected filters palette.

Figure 1: Main components of the UI, displaying a page of the Rilke Cryptogram (Klausis Krypto Kolumne, 2018).

ManuLab was designed to provide a manuscript visualisation with a good user experience. This visualisation is visible in the major part of the application window (parts 1a and 2). A side-by-side image/transcription pair is displayed on the screen. In case of multiple images, the scrollbar (visible under part 1a) or the left arrow and right arrow keys of the keyboard can be used to switch to another page. The orientation/alignment of components 1a and 2 can be changed to display the parts vertically (see Figure 2).
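The filter model described above (a set of strings in, a set of strings out, with filters joined into a chain) can be sketched in a few lines of Python. This is an illustration only and not ManuLab's actual code: the function names are hypothetical, each character is treated as one symbol, and only two transforming filters and two of the listed statistics (index of coincidence and Shannon's entropy) are shown.

```python
import math
from collections import Counter
from functools import reduce


def reverse_direction(lines):
    """Transforming filter: read every line in the opposite direction."""
    return [line[::-1] for line in lines]


def substitute(mapping):
    """Build a transforming filter that replaces symbols according to `mapping`."""
    def apply(lines):
        return ["".join(mapping.get(ch, ch) for ch in line) for line in lines]
    return apply


def chain(*filters):
    """Join filters so that the output of one becomes the input of the next."""
    def run(lines):
        return reduce(lambda acc, f: f(acc), filters, lines)
    return run


def index_of_coincidence(lines):
    """Probability that two symbols drawn without replacement are equal."""
    counts = Counter("".join(lines))
    n = sum(counts.values())
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def shannon_entropy(lines):
    """Entropy of the symbol distribution in bits per symbol."""
    counts = Counter("".join(lines))
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


pipeline = chain(reverse_direction, substitute({"a": "x"}))
print(pipeline(["abc", "ba"]), shannon_entropy(["aa", "bb"]))
```

Because every filter has the same input and output type, a saved chain is just an ordered list of filter names and settings, which is consistent with the save and load feature mentioned in the text.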
The document transcription may contain any valid characters. It is recommended to use a line separator for each line and a custom delimiter between the symbols. This is very helpful for documents containing special symbols, like the Rohoncz codex, where each symbol can be transcribed into a unique number. The transcription can also be displayed using any custom font² (in Figure 2, the upper part is the original image and the lower part is the transcription using a custom font).
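A transcription that follows this recommendation is straightforward to load. The Python sketch below assumes one manuscript line per text line and a caller-chosen delimiter between symbols; the function name `parse_transcription` and the sample numbers are hypothetical and do not describe ManuLab's own file handling.

```python
def parse_transcription(text, delimiter=" "):
    """Split a transcription into lines, and each line into its symbols."""
    return [
        [symbol for symbol in line.split(delimiter) if symbol]
        for line in text.splitlines()
        if line.strip()                      # skip empty lines
    ]


# Each symbol of a hypothetical script is transcribed as a unique number.
sample = "12;7;103\n7;12\n"
print(parse_transcription(sample, delimiter=";"))
```

Keeping symbols as separate strings rather than single characters lets multi-digit codes such as `103` be counted as one symbol by later analysis steps.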
Figure 5: Pop-up menu for the Frequency filter,
with available filter settings.
Acknowledgments
This work was partially supported by grant VEGA
1/0159/17.
References
CrypTool Contributors. Cryptool Portal, https://www.cryptool.org/en/
Willard’s System
Niels O. Faurholt
MJ, DAA, retired
DDIS Technical-Historical Collection
faurholt@fasttvnet.dk
Figure 3: Willard "hardware" and the encipherment process

You want to encipher the word "HISTOCRYPT". You find the first plaintext letter "H" in the first column. The corresponding cipher letter stands x places above or below "H". This must be agreed beforehand. If the agreed rule is "x=go down 2", you go two places down from "H" in the first column and find the cipher letter "I". The second plaintext letter "I" is found in the second column. Down 2 gives cipher letter "J". The third plaintext letter "S" is found in the third column. Down 2 gives cipher letter "T". And so on. If a plaintext letter is at the bottom of a column, you go to the top of that column to find the cipher letter.

5 Security
How secure is this system? As the Danish school teacher showed in 1873, it is not particularly secure. If you have a cipher message long enough (15-20 times the length of the keyword) it can be broken. The periodicity is the weak point. Another weak point is that if you have to count many places up or down in the columns, it will be difficult and give frequent errors. So normally you could assume that 5 places up or down will be the maximum. The length of the keyword will also for practical reasons seldom exceed 6-10 letters. In theory all the sticks could be used, 28 in all, but that would give a very cumbersome operation.
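The encipherment rule described above (find the letter in its column, move a fixed number of places, wrap at the bottom) can be sketched in Python. The contents of the real Willard sticks are not reproduced in this text, so `TOY_STICK` below is an invented layout that merely places the 26 English letters in alphabetic order on every second place; the function names and the use of identical sticks for every column are likewise assumptions made only for illustration.

```python
import string

# Hypothetical stick: A..M and N..Z interleaved, so that moving down 2 places
# reaches the next letter of the alphabet. The real sticks use 28 letters.
HALF = len(string.ascii_uppercase) // 2
TOY_STICK = "".join(
    a + b
    for a, b in zip(string.ascii_uppercase[:HALF], string.ascii_uppercase[HALF:])
)


def encipher(plaintext, columns, steps):
    """Replace each letter by the one `steps` places further down its column.

    Letter i is looked up in columns[i mod len(columns)]; positions wrap from
    the bottom of a column back to its top.
    """
    out = []
    for i, letter in enumerate(plaintext):
        column = columns[i % len(columns)]
        pos = column.index(letter)
        out.append(column[(pos + steps) % len(column)])
    return "".join(out)


def decipher(ciphertext, columns, steps):
    """Undo `encipher` by moving the same number of places up."""
    return encipher(ciphertext, columns, -steps)


columns = [TOY_STICK] * 3            # stand-in for the sticks chosen by the keyword
cipher = encipher("HISTOCRYPT", columns, 2)
print(cipher, decipher(cipher, columns, 2))
```

With this toy layout the first three cipher letters are "I", "J" and "T", matching the worked example in the text. On the real device the keyword would select different sticks for the columns, which is what makes the cipher periodic.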
6 Cryptanalysis
Due to the construction of the Willard table, breaking the cipher depends largely upon the number of steps you go up or down in the operation. Even numbers are much easier to break than uneven numbers.

Even number of steps (2 or 4): As the columns are in alphabetic order with every second place skipped, only a few letters can come 2 or 4 places above or below a given cipher letter. We start with 2 places up and down: E.g. cipher "I" above will become plaintext "J" or "H" (there are exceptions in columns A and Ø). Cipher "J" will become plaintext "K" or "I". It is possible quickly to form tables with 2 up or down, and 4 up or down, and it will be obvious which of these is the correct one. The plaintext can then be read in the table. Finding the keyword requires a longer message. If the test for even number of steps does not succeed, uneven steps are probably used.

Uneven number of steps (1, 3 or 5): Here the construction of the table cannot help in the same way. In principle any cipher letter can become any plaintext letter. So we will have to use the Kasiski² method to find the length of the keyword, e.g. 5, and then solve the resulting monoalphabetic cipher texts the hard way. However, there is a possibility to find the keyword: You have a good chance to locate "E" in each of the 5 monoalphabetic cipher texts. You can construct a table that for each (cipher) letter shows in which column that letter stands 1, 3 and 5 steps above or below "E". If you combine the "E"-equivalents with that table, you may find which 5 columns (sticks) were used in the encipherment. This gives the keyword and an easy way to read the whole message.

Acknowledgements
I am most grateful to professor emeritus Ole Immanuel Franksen from the Danish Technical University. Mr. Franksen brought the old periodical, "Nær og Fjern", to my attention and kindly allowed me to use his extensive research in the Danish crypto history in this article.
Without the great help of my good colleague, Hans-Erik Hansen, this article would never have been converted to the required format.
The Danish Defence Intelligence Service (DDIS) Technical-Historical Collection owns the Willard hardware, presented to the collection by a former MFA employee. I am the curator (volunteer, unpaid) of the collection.

References

Buonafalce, Augusto, Niels Faurholt and Bjarne Toft. 2006. Julius Petersen - Danish Mathematician and Cryptologist. Cryptologia, Vol. 30: 353-360.

Faurholt, Niels. 2006. Alexis Køhl: A Danish Inventor of Cryptosystems. Cryptologia, Vol. 30: 23-29.

Johnsen, Erik, Morten Christensen, Ole Immanuel Franksen and Knud Nissen. 1994. Willard’s System. Matematiklærerforeningen DTU, Lyngby, DK

Kasiski, Friedrich W. 1863. Die Geheimschriften und die Dechiffrier-Kunst. E.S.Mittler und Sohn, Berlin

Kjølsen, Klaus and Viggo Sjöqvist. 1970. Den danske Udenrigstjeneste 1770-1970, Vol. I. J.H.Schultz, Copenhagen, DK

Petersen, Julius. 1875. Nær og Fjern, 154:4-7. Periodical, Copenhagen, DK
The Application of Hierarchical Clustering to Homophonic Ciphers
Anna Lehofer
Department of Philosophy and History of Science
Budapest University of Technology and Economics
Budapest H-1111, Egry József u. 1. E 610, Hungary
lehofer.anna@gmail.com
the logarithm of the number of possible keys of the ciphers and D is the redundancy of the language. The unicity point is the message length beyond which decipherment using a known system becomes a unique process. From the given formula it is clear that the lower the redundancy of a language, the greater the unicity point for a given cipher (Fischer, 1979 and Reeds, 1977).

I examined the original corpus in two ways. The first table shows how entropy (and thus redundancy) and the unicity point change when the number of homophones is increased gradually from 1 to 5 on the 700000-character-long corpus. Despite the indicated infinite limit of the 5th case, all five related dendrograms identified the vowel and consonant groups correctly, and clustering could even find the 1-2-3-4-5 element homophone groups of the ciphers.

Number of     Number of used     Hmax     Hreal    Redundancy    Unicity point
homophones    code characters
1             35                 5.129    4.58     0.107         1240
2             70                 6.129    5.579    0.09          3706
3             105                6.714    6.164    0.082         6816
4             138                7.109    6.579    0.074         10569
5             174                7.443    6.901    0.073         ∞

Table 1: Increasing the number of homophones in the full text

The second table shows how the unicity point changes when the text length is decreased, assuming 2 homophones for each plaintext letter. The first value (around 700000 characters) shows the full length of the text, 100%. Then follow 10%, 1%, 0.5% and finally 0.1%.

Text length               Number of used     Unicity point
(number of characters)    code characters
700934                    70                 3706
70093                     66                 3986
7009                      66                 4059
3504                      65                 4203
700                       62                 3515

Table 2: How unicity point changes when reducing text length using 2 homophones per letter

We can see that in the given artificial code, speaking of pure homophonic substitution, using two homophones for each plaintext letter, the efficiency of hierarchical clustering falls off around a text length of 3500 characters. Here the unicity point is around 4200 characters, so a longer text than the examined one is needed for safe codebreaking. The dendrograms of these cases also corroborate this statement: while the dendrogram of the 3500-character-long text can still separate a big cluster for vowels and another one for consonants almost perfectly, the dendrogram of the 700-character-long text (whose length is already much lower than its unicity point) falls into smaller clusters. These small clusters may still support the individual codebreaking process but neither separate vowels and consonants, nor identify the homophone pairs of the ciphertext in a proper way.

4 Early Modern Letters

In this section, I will investigate encrypted letters¹ from the early modern age. The cipher keys of these letters were also available (in an archive or reconstructed form), thus the keys offered help and control when examining the efficiency of clustering. The first letter I examined, C.Bay.01, was a 419-character-long, almost fully encrypted letter that uses a very complex cipher key: beyond the homophonic set of code characters it also indicates syllables, logograms and nulls with separate signs. The dendrogram outlined as a result of clustering proved that this letter was too short and the cipher key too complex to give any help in the decoding process.

¹ Up to now I have investigated 8 early modern Hungarian letters. Since this is a work in progress, this outcome will be better grounded after I have transcribed and analyzed several other manuscripts in the near future.

After the Bay letter I looked for a letter with a less complex cipher key than the first one, and examined C.Wes.03.a. It was a 2359-character-long letter using an all-in-all 43-element cipher key, assigning more (5-6) code characters only to the vowels. The cluster map of this cipher looked more promising. The software separated two bigger clusters: one showed only consonants, the other bigger cluster contained almost exclusively vowels. The program identified homophone pairs in five cases. The remaining smaller groups and the characters that were not grouped to other ones were mostly logograms, so they were "outranked" correctly from the homophones.

So far I have examined 6 more early modern ciphers to find out where the limits of applicability are. All of the scrutinized letters come from the period 1664-1706 and have their cipher keys in an available form as well.

To describe applicability, two outcomes were tested: 1) whether the clustering process could identify the vowels and consonants in different
clusters, and 2) whether the clustering process could identify the homophone groups belonging to the particular plaintext letters. In cases where clustering can show either of these two identifications, hierarchical clustering can be considered effective. In these cases hierarchical clustering can support the codebreaking process.

The outcomes of the examined letters are summarized in the following table. The first column shows the name of the letters following the notation of Benedek Láng (Láng, 2015, 233). The column of text length shows how many code characters the concrete letters are made of; number of used code characters shows how many characters were actually used in the concrete letters. Hmax shows the maximum value of entropy, Hreal stands for the actual values of entropy. Redundancy shows the text redundancy of the letters, the column of unicity point indicates the required text length. Vowel-consonant groups shows whether the method of hierarchical clustering could separate the vowels and the consonants in different clusters; and homophone groups shows if the clustering process could identify the homophone groups belonging to the particular plaintext letters.

homophones (20) than the early modern practice showed (2-6). Investigating the unicity points of ciphertexts it can be stated that hierarchical clustering was still efficient when text length was under the unicity point, but near to it. In cases when text length was much lower than the unicity point, the dendrograms could not give any help for the codebreaking process.

In a second part I have processed original early modern ciphers with the above methodology. I have stated that hierarchical clustering was efficient if it could clearly identify the vowels and consonants in separate clusters on the dendrogram and/or if it could find the homophone groups belonging to the particular plaintext letters. The features and outcomes of the eight early modern letters showed that when the unicity point was under or near the text length the dendrograms could help the codebreaking process. Hierarchical clustering could not bring any results in case of letters that were much shorter than the unicity point.

Consequently, speaking of homophonic substitution ciphers we can state that the longer an encrypted letter, or the fewer symbols its cipher key uses, the more likely it is that the cipher can be solved with the help of hierarchical clustering. Since the
Number of Vowel-
Letters2
Text
length
used code Hmax Hreal Redundancy
Unicity
point
consonant
Homophone
groups
historical manuscripts of the early modern age do
involve such encrypted letters – we can find ciphers
characters groups
Teaching and Promoting Cryptology at Faculty of Science
University of Hradec Králové
• the x-vector corresponding to the order of the first exchanged character is replaced by the x-vector corresponding to the second exchanged character;
• then the y-vector corresponding to the order of the first exchanged character is replaced by the y-vector corresponding to the second exchanged character;
• then the z-vector corresponding to the order of the first exchanged character is replaced by the z-vector corresponding to the second exchanged character;

After each substitution a new matrix D' is obtained, and the compliance of the relative frequencies of the trigrams in the cipher text with those of the reference text is evaluated using the evaluation function:

$$f' = \sum_{i=1}^{26} \sum_{j=1}^{26} \sum_{k=1}^{26} \left| D'_{ijk} - E_{ijk} \right| \qquad (3)$$

After each substitution the values f and f' are compared and if f' < f, the procedure immediately stops the process of the letter substitution and the exchange in the conversion table is proposed, which will improve the compliance (lowering the value of evaluation function f). Finally, the sub-procedure creates a new matrix D of relative frequencies of the trigrams of the cipher text, and it provides a new assessment of compliance with the reference text using the evaluation function (3).

The sub-procedure Frequency is run only once at the beginning of the program to set the appropriate initial conditions. The sub-procedures Trigrams and Exchange are run alternately.

The above-mentioned sub-procedures were realized in Visual Basic for Application in MS Excel Spreadsheet. Deciphering based on trigram analysis has been studied in cipher texts with a length in the range from 200 to 500 characters. It was shown that the algorithm for deciphering short simple substitution cipher texts based on automatic analysis of the trigrams enables decryption almost without manually performed exchanges

3.2 Computer Analysis of Encrypted Correspondence of House of Piccolomini

The bachelor thesis titled Computer Analysis of Encrypted Correspondence of House of Piccolomini interconnects the author's two study areas (Vlnas, 2017). The author is a student of Informatics and History in education. Archives, especially archives of aristocratic families whose members held important state, military or diplomatic positions, very often contain encrypted documents in addition to open documents.

The analysis of archive ciphertexts is a complex task. Usually, a large part of the task has to be done by hand, such as recognizing different forms of written fonts (different types of old handwriting or special cipher characters), counting frequencies of individual characters, etc. Some monotonous tasks can be performed by a computer. The author of the bachelor thesis appropriately used custom-made macros in Visual Basic for Applications to decrypt encrypted texts transcribed into Excel spreadsheets. After identifying a given cipher system and determining the transmission table, it is a purely mechanical matter. The macro combined both basic principles used to solve mono-alphabetic substitution, i.e. simple swapping and supposed words, and was created in such a way that it allows the gradual uncovering of the spreadsheets, which effectively supports codebreakers in the phase of stepwise reconstruction of the encryption table.

The first phase of the work was to obtain appropriate cipher texts. The State Regional Archive of Zámrsk offers researchers the family archive of the House of Piccolomini on microfilms. This has advantages and disadvantages. The advantage is the possibility of fast document browsing, where the scrolling of the microfilm in the reader can be significantly faster than working with original archives. The disadvantage is the lower contrast, the overall loss of quality and the worse possibility of photographic documentation. The student focused on documents related to the person of Ottavio Piccolomini and his activities during the Thirty Years' War. He found two microfilms with a number of cryptic texts, partially decrypted, apparently by the addressee immediately after receipt, but partly un-decrypted. The student took photographs of the corresponding microfilm images. He thought he had captured only a small part of Ottavio Piccolomini's cipher correspondence. The archive contains a large number of ciphertexts, more than can be found within one business day.

Firstly, the document number 25039 was analyzed, see Figure 1. Part of this letter was written in plain text and part was written in numerical code. Some letters of the alphabet had been written above the two-digit codes. This allowed the decryption conversion table to be reconstructed, see Table 1. The entire cipher text, composed only of digits, was read quite well. It was certainly easier to read digits than to read handwritten characters in the shapes written in individual manuscripts by many different scribes. This text was copied into the cells of an Excel spreadsheet. The deciphering table was also placed in the same sheet. The deciphering table was filled in step by step and simultaneously used for decrypting the prepared ciphertext. It was found that the open text in French was encrypted by simply replacing the alphabet letters with two-digit numbers, but with the use of vowel homophones and special codes for some selected short words, supplemented by several null characters. It is a simple nomenclator. The partially decrypted text was a significant help; however, the use of macros in Visual Basic for Application significantly simplified the complete decryption of the encrypted part of the letter.

Figure 1: A homophone substitution cipher - pair of digits represents a letter or null character

1 2 3 4 5 6 7 8 9
1 a b u C d que f
2 e g et L a e l
3 i m sko n p q
4 o r tre s t m o
5 u y s
6 a a g e i l
7 n o r s rc t
8 u * * e *
9 e n r

Table 1: Encryption table of the document 25039 (* = null characters)

„C'est une misère que si nous puissions nous entretenir un mois dans ce camp Xabucd que fais-je là et lim skonpqor très tmouysaageilnorsrctu et en joignant presque le leur, ladite armée, ennemie seroit réduite à la ruine, il nous faudra, à faute d'une trentaine de mil patacons pour faire la provision nécessaire de la prouiange laquelle nous manque, que prendre autre résolution à nous? sloigcer de quelques ligues d'ici où nous puissions muuerles dit su iures et fourrage à fin de ne voir ce que dit une plais est réduit et cette belle armée en teees état misérable comme elle samuue du passé au camp skoprocle Bérenburg. En quoi consiste néanmoins la conservation ou perte du reste de l'empire gedlusrc?“

A similar system, based on a square table, with otherwise delimited letters, was used in several other cipher letters. From the cryptanalysis point of view, the letter with non-decrypted cipher sections consisting of letters and numbers was more interesting. This letter was included in document number 24873, see Figure 2. The cipher was broken by standard methods, i.e. by frequency analysis and predicted words. The plaintext was in Italian. This cipher also contained null characters, represented by all even digits (2, 4, 6, 8). The cipher does not contain any homophones. Once these null characters were omitted, a very simple monoalphabetic substitution cipher remained: the consonants were not replaced, and the vowels were successively replaced by the numbers 1, 3, 5, 7, 9. As soon as the eyes of the solver became used to the Italian handwriting of the first half of the 17th century, reading the text was relatively simple, see Figure 3.

b c d f g h j k l m n p q
b c d f g h j k l m n p q

r s t v 1 2 3 4 5 6 7 8 9
r s t v a * e * i * o * u

Table 2: Encryption table of the document 24873 (* = null characters)

Figure 2: An example of a cipher text (cut-off from document 24873) that was not decrypted by the recipient
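The decryption rule for document 24873 (Table 2) is simple enough to sketch in a few lines. The following Python snippet is only an illustration under stated assumptions: the thesis itself used VBA macros in Excel, the function name `decrypt_24873` is hypothetical, and the sample ciphertext is invented for the demonstration rather than taken from the archive document.

```python
# Vowel digits as given in Table 2; the even digits 2, 4, 6, 8 are nulls.
VOWEL_DIGITS = {"1": "a", "3": "e", "5": "i", "7": "o", "9": "u"}
NULL_DIGITS = set("2468")


def decrypt_24873(ciphertext: str) -> str:
    """Drop null digits, map odd digits to vowels, keep consonants as they are."""
    plain = []
    for ch in ciphertext:
        if ch in NULL_DIGITS:
            continue  # null character: carries no information
        plain.append(VOWEL_DIGITS.get(ch, ch))
    return "".join(plain)


# Invented sample: "buongiorno" with a few nulls sprinkled in.
print(decrypt_24873("b927ng4576rn87"))
```

Any character that is neither a null nor a vowel digit is passed through unchanged, which matches the observation that the consonants were not replaced.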
Figure 3: Sample of decryption of the text from the Figure 2 (cut-off from document 24873)
as a method of the system approach in the algorithm development thinking. International journal of applied mathematics and informatics. 4 (4), 92-102. ISSN 2074-1278.

Hubálovský, S., & Musílek, M. 2014. Algorithm for Automatic Deciphering of Mono-Alphabetic Substituted Cipher Realized in MS Excel Spreadsheet. Applied Mechanics and Materials. p. 624-627

Musílek, M., Hubálovský, Š., & Hubálovská, M. 2017. Mathematical Modeling and Computer Simulation of Codes with Variable Bit-Length. International Journal of Applied Mathematics and Statistics. 56 (1), 1-12. ISSN 0973-7545.

Musílek, M. 2012. Morse telegraph alphabet and cryptology as a method of system approach in computer science education. In Proceedings of 9th International Scientific Conference on Distance Learning in Applied Informatics (DIVAI). Štúrovo, Slovakia: Wolters Kluwer. p. 223-231. ISBN 978‐80‐558‐0092‐9.

Musílek, P. 2017. History of ciphering of transposition ciphers with computer support. Bachelor Thesis at Faculty of Science University of Hradec Králové. Supervisor Š. Hubálovský. 58 p.

Procházka, L. 2014. Board games, puzzles, anagrams and ciphers as motivation in the teaching of algorithms and programming. Diploma Thesis at Faculty of Science University of Hradec Králové. Supervisor M. Musílek. 75 p.

Procházka, L. 2012. Deciphering of substitution ciphers with computer support. Bachelor Thesis at Faculty of Science University of Hradec Králové. Supervisor M. Musílek. 37 p.

Singh, S. 2000. The code book: the science of secrecy from Ancient Egypt to quantum cryptography. New York: Anchor Books. 411 p. ISBN 0-385-49532-3.

Vlnas, V. 2017. Computer Analysis of Encrypted Correspondence of House of Piccolomini. Bachelor Thesis at Faculty of Science, University of Hradec Králové. Supervisor M. Musílek. 58 p.
Examining The Dorabella Cipher with Three Lesser-Known
Cryptanalysis Methods
Klaus Schmeh
Freelanced Journalist
klaus@schmeh.org
With the advent of computer technology and suitable software (especially the open source tool CrypTool), breaking a mono-alphabetic substitution cipher (MASC) has become quite easy. Frequency analysis, which once was the most important way to break a MASC, is often not even necessary any more, as a word pattern search conducted by software is usually more effective. In addition, hill climbing, another technique that requires computer support, has proven a powerful technique to break MASCs.

For these cryptograms, frequency analysis, word pattern analysis, word guessing, and hill climbing have failed so far. An interesting question is whether there are other MASC breaking methods that can be applied in such a case. In fact, there are. The book Cryptanalysis by Helen Fouché Gaines mentions three MASC breaking methods that are worth considering (Fouché Gaines, 1939): vowel detection, digram analysis, and consonant lining. All three methods are hardly mentioned in the literature that has
Figure 1: The Dorabella Cryptogram is an unsolved ciphertext that has the appearance of a mono-
alphabetic substitution cipher (MASC).
As the next step, we determine the contacts of each letter. A contact is defined as a letter that stands directly before or behind a certain character. As the Dorabella Cryptogram doesn't indicate spaces (and as we ignore the spaces in the comparison text), each of the 87 letters, except the first and the last one, has two contacts. The examples described by Fouché Gaines contain spaces, which means that her contact analysis is a little different from the one performed here.

Here's the contact analysis for the comparison text (below each letter the left and the right contacts are listed, one contact pair per line):

Q R S T U V W X
- EG ET -H OT - YE -
- AD LH NO CP - OY -
- AY IT UI BI - -
- EE NO - -
- NS OH - -
- SI - -
- SE - -
- NH - -
- DH - -

Y
EW
WC
RO
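The contact analysis described above is easy to automate. The following Python sketch is an illustration only, not the author's tooling: the function names `contact_pairs` and `contact_variety` are hypothetical, and counting the number of distinct neighbouring letters is just one possible way to quantify how varied a letter's contacts are.

```python
from collections import defaultdict


def contact_pairs(text):
    """Map each letter to its contact pairs (left neighbour + right neighbour).

    Spaces and other non-letters are ignored. A missing neighbour at either
    end of the text is written as '-', as in the contact table.
    """
    letters = [ch for ch in text.upper() if ch.isalpha()]
    pairs = defaultdict(list)
    for i, ch in enumerate(letters):
        left = letters[i - 1] if i > 0 else "-"
        right = letters[i + 1] if i + 1 < len(letters) else "-"
        pairs[ch].append(left + right)
    return dict(pairs)


def contact_variety(text):
    """Count the distinct letters that touch each letter (assumed measure)."""
    variety = {}
    for letter, pairs in contact_pairs(text).items():
        neighbours = set("".join(pairs)) - {"-"}
        variety[letter] = len(neighbours)
    return variety


# Toy input, not the comparison text used in the paper.
print(contact_pairs("THE CAT"))
print(contact_variety("THE CAT"))
```

Because spaces are stripped before the neighbours are collected, contacts reach across word boundaries, which is the convention used for the Dorabella Cryptogram here.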
H I J K L M N
DA AJ IK JL KJ JJ JG
DR NP LM NG NG GI
QH JJ MJ PL PL FK
HQ QP JF UP UP UP
BN PD
NC
KN
NF

O P Q R S T U
GG IG FD HS RC FB FP
TK LH LU
IC HI UN
CF FK
UC
NC
KN
NF

3 Vowel Detection Method

The vowel detection method is the first one described by Fouché Gaines. It is based on eight criteria (Fouché Gaines calls them pointers) that can be used to identify vowels in a ciphertext. The idea of this method is to use the vowels identified for further investigations with other cryptanalysis methods (i.e., the vowel detection method alone will usually not break a cipher, but it can help to do so). Fouché Gaines' vowel detection method should not be confused with the Shukotin algorithm (Guy, 1991), which has the same purpose.

3.1 Pointer 1: High frequency of vowels A, E, I, and O

What Fouché Gaines writes: The vowels A, E, I, and O are normally found in the high-frequency section of a cryptogram.

Does this hold for the comparison text? It is true for the vowels E, I, and O. The letter A, however, has a lower frequency than expected.

What does this mean for the Dorabella Cryptogram? If Fouché Gaines is correct the ciphertext letters F, J, P, C, D, G, and N contain the cleartext vowels A, E, I, and O.

Vowel candidates in the comparison text: D, E, I, N, O, T

Vowel candidates in the Dorabella Cryptogram: F, J, P, C, D, G, N

3.2 Pointer 2: Letters contacting low-frequency letters

What Fouché Gaines writes: Letters contacting low-frequency letters are usually vowels.

Does this hold for the comparison text? Yes. The letters with frequency 1 are contacted by A, E, I, O, and U. The letters with frequency 2 are contacted by A, N, R, S, E, O, Y, and Y. This means that 10 of 13 letters contacting low-frequency letters are vowels. If we count only the contacts that appear more than once or that contact a letter that appears only once we get exactly A, E, I, O, U, and Y.

What does this mean for the Dorabella Cryptogram? The letters contacting low-frequency letters are B, B, D, D, F, H, I, J, J, L, G, G, H, S, R, C, and F. If we count only the contacts that appear more than once or that contact a letter that appears only once we get B, D, J, G, H, S, R, C, F, and B. As can be seen, the vowel detection doesn't work as well here as for the comparison text.

Vowel candidates in the comparison text: A, E, I, N, O, R, S, and Y

Vowel candidates in the Dorabella Cryptogram: B, D, F, H, I, J, L, G, H, S, R, C, F

3.3 Pointer 3: Wide variety in contact letters

What Fouché Gaines writes: Letters showing wide variety in their contact letters are vowels.

Does this hold for the comparison text? Yes. Fouché Gaines does not define exactly what wide variety means. However, if we look at the three letters with the widest variety we see that these are the vowels E, I and O.

What does this mean for the Dorabella Cryptogram? The three letters with the widest variety are J, F, and P.

Vowel candidates in the comparison text: E, I, O.

Vowel candidates in the Dorabella Cryptogram: J, F, P.

3.4 Pointer 4: Repeated digrams

What Fouché Gaines writes: In repeated digrams (in immediate succession), one letter is usually a vowel.

Does this hold for the comparison text? There is no repeated digram in the comparison text.

What does this mean for the Dorabella Cryptogram? There is no repeated digram in the Dorabella Cryptogram.

Vowel candidates in the comparison text: -

Vowel candidates in the Dorabella Cryptogram: -

3.6 Pointer 6: Doubled consonants

What Fouché Gaines writes: Doubled consonants are usually flanked by vowels, and vice-versa.

Does this hold for the comparison text? The comparison text contains three doubled letters: LL, CC, and DD. Only CC appears inside a word (OCCUPY), while the other two stand at the end of a word (STILL) or are spread over two words (OLD DOMIN). While CC is flanked by two vowels, LL and DD aren't. It seems that this pointer is not applicable if the word boundaries are not known.

What does this mean for the Dorabella Cryptogram: The digram JJ appears twice. HH and UU are two more doubled letters. No conclusions can be drawn from these facts at this stage.

Vowel candidates in the comparison text: -
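The doubled-letter check of Pointer 6 can be sketched as follows. This is a hedged illustration rather than the procedure used in the paper: the function names are hypothetical, spaces are dropped first (as in the analysis above, so a pair may straddle a word boundary), and treating Y as a vowel is an assumption borrowed from the vowel list in Pointer 2.

```python
VOWELS = set("AEIOUY")  # assumption: Y is counted as a vowel


def doubled_letters(text):
    """Return (pair, left_flank, right_flank) for every doubled letter.

    A flank is None when the pair sits at the very start or end of the text.
    """
    s = [ch for ch in text.upper() if ch.isalpha()]
    found = []
    for i in range(len(s) - 1):
        if s[i] == s[i + 1]:
            left = s[i - 1] if i > 0 else None
            right = s[i + 2] if i + 2 < len(s) else None
            found.append((s[i] * 2, left, right))
    return found


def flanked_by_vowels(entry):
    """True if both neighbours of the doubled letter are vowels."""
    _, left, right = entry
    return left in VOWELS and right in VOWELS


for word in ("OCCUPY", "STILL", "OLD DOMIN"):
    for entry in doubled_letters(word):
        print(word, entry, flanked_by_vowels(entry))
```

Run on the three examples quoted above, only CC in OCCUPY is flanked by vowels on both sides, which is consistent with the observation that the pointer is hard to apply without known word boundaries.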
Vowel candidates in the comparison text: -

Vowel candidates in the Dorabella Cryptogram: -

J P B C D F G H I K U
11 8 4 6 6 8 6 4 4 4 4
10 8 7 7 9 11 7 5 5 6 6
Figure 2: Consonant lines for the comparison text (left) and the Dorabella Cipher (right).

like a vowel. Figure 3 shows the Dorabella Cryptogram with marked vowels and consonants (this is the next step recommended by Fouché Gaines). In my view, there is no obvious way to proceed from here. So, I will leave further steps (for instance, checking if one of the different candidates for N and H makes sense) to future research.

All in all, it can be said that the consonant line method, which works surprisingly well on an ordinary English text of the same length and written in the same year, doesn't render a clear result for the Dorabella Cryptogram.

6 Conclusion

To my regret, none of the three methods described in this paper has led to a solution of the Dorabella Cryptogram. Nevertheless, there are a number of interesting conclusions that can be drawn from the examinations described in the previous paragraphs. Here are some general ones:
Figure 3: The Dorabella Cipher transcription with marked vowels (bold) and consonants (underlined).
Further research may show whether there are conclusions that can be drawn from this.
• The frequency count and the contact counts of the Dorabella Cryptogram are consistent with the English language.

• The following letters in the transcribed Dorabella Cryptogram could be vowels: F, J, and P.

• The number of reversed digrams in the comparison text is twice as high as in the Dorabella Cryptogram.

• Candidates for the letters H and N have been found in the Dorabella Cryptogram.

• The vowel detection method and the consonant line method work less well on the Dorabella Cryptogram than on the comparison text.

Here are some ideas for future work:

• Some of the instructions given by Fouché Gaines are a little fuzzy. In particular, the definition of letter frequency classes and the quantification of contact variety are not very precise. This leaves room for further research.

• Fouché Gaines' methods assume that the ciphertext examined contains spaces. As this is often not the case, the methods should be adapted to cryptograms without known word boundaries.

• The concept of letter contacts should be made more popular in cryptanalysis.

• The consonant line method should be applied to other cryptograms as well.

• Further tests of whether the Dorabella Cryptogram is a real text should be made.

• The consonant line method should be adapted to other languages.

• The concept of reversed digrams might be helpful for cryptanalysis. It should be explored.

I am optimistic that some of the unsolved cryptograms mentioned in the introduction can be solved with the methods covered here. I hope that additional research in this direction will be conducted.

References

Bauer, Craig. 2017. Unsolved!. Princeton University Press, Princeton, NJ.

Belfield, Richard. 2006. Can You Crack The Enigma Code. Orion Publishing, London, UK.

Fouché Gaines, Helen. 1939. Cryptanalysis. Dover Publications, New York, USA.

Guy, Jacques B. M. 1991. Vowel Identification: An Old (but Good) Algorithm. Cryptologia (4), 258-262 (1991)

Schmeh, Klaus. 2012. Nicht zu Knacken. Hanser, Munich, Germany.

Schmeh, Klaus 2017. The Top 50 unsolved encrypted messages: 31. The MLH cryptogram Klausis Krypto Kolumne, 2017-05-23

Schmeh, Klaus 2017. Who can decipher this encrypted inscription on a cigaret case? Klausis Krypto Kolumne, 2017-11-26

Schmeh, Klaus 2017. Mathematical formula needed Klausis Krypto Kolumne, 2017-12-22

Schneier, Bruce 2006. Handwritten Real-World Cryptogram Schneier on Security, 2006-01-30

Wikipedia. Dorabella Cipher. Retrieved 2018-01-08.

Wikipedia. Voynich manuscript. Retrieved 2018-01-08.
Design and Strength of a Feasible Electronic Ciphermachine from the
1970s
and a message key, unique for one message, of m letters. In practice it might be necessary to send the message key in the clear with the message, so the message key should not be used unmodified. This could be achieved by using (part of) the long-term key to change the bits of the message key, before they are entered into the machine.

The key space of this machine has the size $2^{5(n-1)}$.

3.4 Character encipherment

To encipher a plain letter, a key number K could be determined by reading off the contents of sections k of all the registers.

$$K = \sum_{i=1}^{5} R_{i,k} \cdot 2^{i-1}$$

(where $R_{i,j}$ is the content of section j of register i).

The plain letter could be converted to a number P using its ITA2 bit value. For the word separator representing the space, the value P = 4 could be used.

The formula for the encipherment could be:

$$C = P - K \bmod 32$$

where the number C is converted through the ITA2 table into a cipher letter. If C is equal to 0, 2, 8, 27, 31 or the number corresponding to the letter used as the word separator, it cannot be used in a ciphertext, because the corresponding character cannot be printed. (Such a character is considered illegal for this machine). In such a case the encipherment could be repeated with the same keynumber, until an acceptable result has been reached. It is easy to verify that this procedure always stops.

3.5 Character decipherment

Character decipherment is the inverse operation of the character encipherment. To decipher a cipher letter, a key number K is determined as described above, by reading off the contents of sections k of all the registers.

$$K = \sum_{i=1}^{5} R_{i,k} \cdot 2^{i-1}$$

The cipher letter is converted to a number C using its ITA2 bit value.

The formula for the decipherment is:

$$P = C + K \bmod 32$$

where the number P is converted through the ITA2 table into a plain letter. If P is equal to 0, 2, 8, 27, 31 or the number corresponding to the letter used as the word separator, it cannot be used in a plaintext, because the corresponding character cannot be printed. In such a case the decipherment is repeated with the same keynumber, until an acceptable result has been reached. If the plain number equals 4, the plain character is a space.

3.6 Stepping of the registers

After the encryption of a letter has ended successfully, the registers should step a number of times. To ensure a high period for the keystream generated by the machine, at most one register can have a constant step, so e.g. register 1 steps a constant number of steps. Then the higher numbered registers should make a variable number of steps, e.g. the number of steps could be determined by the XOR of the content of the sections s of the lower numbered registers.

$$S_i = \sum_{j=1}^{i-1} R_{j,s} \bmod 2$$

where $S_i$ determines the number of steps of register i for i > 1.

In this way it is guaranteed, that whenever one register completes its period, all the other registers do not. This ensures that the period of the keystream of the machine is in the order of magnitude of $(2^n)^5$.

3.7 Summary of the design

The design of the machine meets the necessary criteria for a secure algorithm that were mentioned earlier:
The keyspace has a large size.
The keystream has a large period.
The keycharacters generated have a flat distribution.
Different messages use different parts of the keystream.

4 Strength

When attacking the feasible cryptographic machine described above, it is reasonable to assume that the details of the design are known to an attacker. This approach is fully in line with the second Kerckhoffs' principle that was stated by Auguste Kerckhoffs in 1883 in (Kerckhoffs, 1883). It is clear that the variable stepping of the higher numbered registers implies that the start of an attempt to break this feasible machine, can only be an attack on the first register, the only one that has a constant step. After solving the first register the stepping pattern of the second one is known and
then the second register could be attacked and so on.

4.1 Fast correlation attack

The fast correlation attack is a procedure to determine the initial content of a shift register if only a part of the output sequence containing some errors is known.

This attack has been presented for the first time at Eurocrypt 1988 in Davos by Willi Meier and Othmar Staffelbach and has been published in (Meier and Staffelbach, 1989), although the attack was already known many years before.

The method works as follows:

The enciphering method modifies the bits of the register that will be attacked. This modification is not random, but biased to either 0 or 1. The part of the shift register sequence for which this biased information is available is initialised with values coming from the bias.

The feedback polynomial gives a relation that is satisfied by all the bits in the output sequence of the shift register. Moreover all the multiples and in particular all the $2^n$ powers of this polynomial (which are all also trinomials) give relations that are satisfied.

As an example if $X^n + X^f + 1$ is the feedback polynomial, then if $b_i$ is the i-th bit of the output sequence, $b_{i+n} + b_{i+n-f} + b_i = 0 \bmod 2$ is satisfied for all i > 0. But also $b_{i+2n} + b_{i+2(n-f)} + b_i = 0 \bmod 2$ and so on. Moreover one can add $b_{i+n+f} + b_{i+n} + b_{i+f} = 0 \bmod 2$ to $b_{i+n} + b_{i+n-f} + b_i = 0 \bmod 2$ and in this way get the new relation $b_{i+n+f} + b_{i+n-f} + b_{i+f} + b_i = 0 \bmod 2$. In this way many trinomial and tetranomial relations can be found which are satisfied by the bits of the produced output sequence.

Step 1 is: Count for every position the number of relations involving that position that are satisfied and the number of relations that are not satisfied. If the number of satisfied relations is much higher, then the bit at that position is probably reliable and nothing is changed. In the other case the bit is probably unreliable and it is declared unknown.

Step 2 is: Compute for all the unknown bits a new value on the basis of the reliable bits only. For many unreliable bits this will succeed and then a new sequence has been determined.

The counting and correcting steps are repeated until enough highly reliable bits have been found. Then the initial content is solved by solving the set of equations, generated by those reliable bits. Under certain conditions on the values of the bias and the length of the used part of the shift register sequence this method converges.

4.2 Correlation attack on register 1

For every letter of the ciphertext the 32 possible decryptions can be divided into two groups, one corresponding with an even key number and the other one corresponding with an odd key number. Depending on the letter frequencies in the language of the plaintext, the two groups have a different probability of occurrence. The consequence is that the probability of the last bit of the key number being 0, is different from the probability of the last bit of the key number being 1. In other words for every cipher letter there is a bias to either 0 or 1 for the last bit ($K_1$) of the key number K.

Compare
$P(K_1 = 0 \mid \text{cipherletter} = C)$
with
$P(K_1 = 1 \mid \text{cipherletter} = C)$

An example: For cipher letter H the set of decryptions with an even keynumber is:
{B,G,H,L,M,O,P,Q,T,V,W,Y,Z}
The letters L, W and Z occur twice as a decryption with different keynumbers.
For odd keynumbers the set is:
{A,C,D,E,F,G,I,J,K,N,R,S,U,X}
Here F and U occur twice as a decryption with different keynumbers.

In most languages the second set is much more frequent than the first one, especially because the X in the second set is the word separator and thus represents the space character. So for cipher letter H the bias is for a 1 as the last keybit.

This bias makes it possible to determine the initial shift register sequence. First the bias is used to make an approximation of the shift register sequence produced by the first register of the machine.

If the language statistics are favourable enough and the message length is sufficiently large, the fast correlation attack reconstructs the initial content of the first register of the machine.

4.3 Correlation attack on register 2

A similar approach can be used for the higher numbered registers. Once register one is known
the stepping of register two is also known. Moreover for every cipher letter one keybit is known. In this case the decryptions of a given cipher letter are divided into four groups. In each group the key numbers have a fixed value modulo 4.

An example: For cipher letter H the four groups are:

key = 0 mod 4 {H,L,P,Q,T,W,Y,Z}
key = 2 mod 4 {B,G,L,M,O,V,W,Z}
key = 1 mod 4 {C,D,F,J,K,N,R,U}
key = 3 mod 4 {A,E,F,G,I,S,U,X}

Then depending on the value of the keybit produced by register one, either the groups with key numbers equal to 0 and 2 modulo 4 are compared or the groups with key numbers equal to 1 and 3 modulo 4 are compared.

Compare
$P(K_2 = 0 \mid \text{cipherletter} = C \wedge K_1 = b)$
with
$P(K_2 = 1 \mid \text{cipherletter} = C \wedge K_1 = b)$
where b is the known keybit from the first register.

For every ciphertext letter the bias between the groups that are compared is used to initialise an approximation for the second register. Once again if the conditions are favourable the fast correlation attack finds the initial content of register two.

4.4 Registers 3, 4 and 5

In principle the same method can be used for the registers 3, 4 and 5. However, in many languages the statistics are unfavourable for a successful attack on register 3.

One way to improve the statistics is by using bigrams of the ciphertext instead of single ciphertext letters. That improves the bias because in that case the frequencies of plain letter bigrams are used to calculate the bias.

Compare
$P(K_3 = 0 \mid \text{cipherpair} = CC' \wedge K_1 = b_1 \wedge K_2 = b_2 \wedge K'_1 = b'_1 \wedge K'_2 = b'_2)$
with
$P(K_3 = 1 \mid \text{cipherpair} = CC' \wedge K_1 = b_1 \wedge K_2 = b_2 \wedge K'_1 = b'_1 \wedge K'_2 = b'_2)$

of the registers 1 and 2 are known only 8 possible decryptions remain possible.

If the cipher letter is H and the keybits from the first and second register both are 1 then the plainletter must be one of the set {A,E,F,G,I,S,U,X}.

Now it is possible to test for every position in the ciphertext, whether a word that probably will be present in the plaintext, fits the ciphertext.

So you might get the following situation:
(CT is the ciphertext letter and k1 and k2 are the known keybits from the registers 1 and 2).

CT D P I F E R U O B K
k1 0 0 1 1 0 1 0 1 0 0
k2 1 1 1 0 0 0 0 1 0 0
   S V V A E G U N B K
   A B B H S V X R O N
   U T P Z A B X C O N
   C L R O U X S D G R
   X Z D M E I I F O C
   K O O G X S E J M D
   I M M V A O S K G F
   E G G B I M A T V J

The boldface letters show that here the word AMBASSADOR is possible.

If the word has a sufficient length only correct positions are found. (Almost) every plaintext letter that is found in this way fixes a keybit in the registers 3, 4 and 5. In the example above that is the case for all ciphertext letters, except the ciphertext letter B, for which there are three possible decryptions as O.

As soon as enough independent keybits have been found, the initial content of the registers 3, 4 and 5 can be found by solving a set of linear equations for each register.

5 Conclusion

The cryptographic algorithm that has been described could very well have been invented in the 1970's, when new ideas in cryptography were generated by the new electronic components that became available. The paper shows that although the algorithm meets a number of design criteria, it still could have been broken using an idea that was published in 1989.
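The probable-word test from the AMBASSADOR example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight candidate rows are copied from the table in the text, while the function name `crib_positions` and the way the candidates are stored are hypothetical choices, not part of the described attack tooling.

```python
# The eight remaining decryptions per ciphertext position, copied row by
# row from the table in the text (ciphertext D P I F E R U O B K).
ROWS = [
    "SVVAEGUNBK",
    "ABBHSVXRON",
    "UTPZABXCON",
    "CLROUXSDGR",
    "XZDMEIIFOC",
    "KOOGXSEJMD",
    "IMMVAOSKGF",
    "EGGBIMATVJ",
]

# Transpose the rows: one string of candidate letters per position.
CANDIDATES = ["".join(column) for column in zip(*ROWS)]


def crib_positions(candidates, crib):
    """Return every offset where each crib letter is among the candidates."""
    hits = []
    for start in range(len(candidates) - len(crib) + 1):
        window = candidates[start:start + len(crib)]
        if all(letter in cands for letter, cands in zip(crib, window)):
            hits.append(start)
    return hits


print(crib_positions(CANDIDATES, "AMBASSADOR"))
# Under ciphertext letter B the letter O appears more than once,
# so that position does not fix the remaining keybits uniquely.
print(CANDIDATES[8].count("O"))
```

In a real attack the candidate lists would be derived from the ITA2 values and the recovered keybits of registers 1 and 2 for a much longer ciphertext, and the crib would be slid along every offset.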
Willi Meier and Othmar Staffelbach. 1989. Fast Cor-
relation Attacks on certain Stream Ciphers. Journal
of Cryptology, 1(3):159–176.
Author Index
Kopal, Nils, 29
Lasry, George, 55
de Leeuw, Karl, 49
Lehofer, Anna, 133
Niebel, Ingo, 65
Published by