Spy Tech: Unshredding Documents
Bureaucracies generate paper, usually lots of paper. Anything you consider private, especially anything that could get you in trouble, should go in a “burn box,” which is usually a locked trash can that is periodically emptied into an incinerator. But what about a paper shredder? Who hasn’t seen a movie or TV show where the office furiously shreds papers as the FBI, SEC, or some other three-letter agency is trying to break the door down?
That might have been the scene in 1989 and 1990 as East Germany collapsed ahead of German reunification. The East German Ministry of State Security, known as the Stasi, had records of unlawful activity and, probably, information about people of interest. The staff made a best effort to destroy these records, but they did not quite complete their task.
The collapsing East German government ordered documents destroyed, and many were pulped or burned. However, many of the documents were shredded by hand, stuffed into bags, and were awaiting final destruction. There were also some documents destroyed by the interim government in 1990. Today there are about 16,000 of these bags remaining, each with 2,500 to 3,000 pieces of pages in them.
Machine-shredded documents were too small to recover, but the hand-shredded documents should be possible to reconstruct. After all, they do it all the time in spy movies, right? With modern computers and vision systems, it should be a snap.
You’d think so, anyway.
Shield and Sword of the Party
The Stasi has been likened to the Soviet KGB. Using civilian informants, the agency contributed to the arrest of about a quarter of a million people between 1950 and 1990. It also kept extensive files, and where those files survived the destruction, people can ask to see the information the Stasi collected about them.
The agency was known for pervasive and invasive spying, with agents in every apartment building and all major companies. When it disbanded, the agency had over 90,000 employees and nearly 175,000 informants. This works out to one secret policeman for every 166 East Germans. By contrast, the Gestapo had one agent for every 2,000 people. The Stasi was also known for harassing enemies of the state. If you want to learn more about the Stasi, Deutsche Welle has an interesting short documentary about the agency and its spying activities that you can watch below.
So you can see why the Stasi leadership wanted to destroy files. Citizens occupied the Stasi offices, but not before about 5% of the documents — 1 billion sheets of paper — were destroyed somehow. As the German Democratic Republic fell, many citizens protested the destruction of the papers, primarily to ensure there was evidence to prosecute wrongdoing in the agency. However, some informants wanted the documents destroyed so they would not be identified.
The new government appointed an office to control the records, but there was a strong debate about what to do with them. Some wanted them sealed or destroyed. Others wanted them used for prosecution. In the end, the Unification Treaty allowed people to access their own files starting in 1992. Between 1991 and 2011, about 2.75 million people requested to see their files. Near relatives can also request the files of deceased or missing persons. The media and schools can access documents with personal information redacted. The archive is now the responsibility of the Federal Archives.
However, there are still these 16,000 bags of fragments, about 45 million pages’ worth. In some cases, the destroyed files had pages simply torn in half or quarters. Those are the easy ones, but not all of the fragments fall into that category. The 36 archivists tasked with reconstruction processed 327 bags in 13 years, which is hardly a speedy pace.
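To put the manual pace in perspective, here is a back-of-the-envelope calculation using only the figures quoted above. It assumes the team size and pace never change, which is our simplification and not something the archive has stated.

```python
# Figures quoted in the article; everything else is derived from them.
BAGS_DONE = 327
YEARS_SPENT = 13
ARCHIVISTS = 36
BAGS_REMAINING = 16_000


def manual_estimate(bags_done, years_spent, archivists, bags_remaining):
    """Return (bags per year, bags per archivist-year, years to finish)."""
    bags_per_year = bags_done / years_spent
    bags_per_archivist_year = bags_per_year / archivists
    # Assumption: the team size and the pace stay constant.
    years_left = bags_remaining / bags_per_year
    return bags_per_year, bags_per_archivist_year, years_left


per_year, per_archivist, years_left = manual_estimate(
    BAGS_DONE, YEARS_SPENT, ARCHIVISTS, BAGS_REMAINING
)
print(f"{per_year:.1f} bags per year for the whole team")
print(f"{per_archivist:.2f} bags per archivist per year")
print(f"about {years_left:.0f} years to clear the remaining bags")
```

At roughly 25 bags a year, clearing the remaining bags by hand would take more than 600 years.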
To help, the German government turned to computers and the Fraunhofer Institute. Scientists there demonstrated software known as e-Puzzler that was supposed to revolutionize the document reconstruction process. However, that turned out not to be the case. While it does work, the process is painfully slow.
In theory, it makes sense. An article in The Guardian from 2007 describes the machine. According to the article:
The machine works by scanning the document fragments into a computer image file. It treats each scrap as if it is part of a huge jigsaw puzzle. The shape, colour, font, texture and thickness of the paper is then analysed so that eventually it is possible to rebuild an electronic image of the original document.
Some marketing material from Fraunhofer itself says, “The system uses an adaptive, non-deterministic workflow to process a wide range of characteristics, such as the contour, color, writing, and lines of the fragments.” Seems plausible. You can see the system in action in the video below. That video also notes some of the possible reasons the project has been a failure.
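Neither description tells us how e-Puzzler actually weighs those characteristics, so the following is only a toy sketch of the general idea: score candidate neighbors by how well their torn edges, paper color, and line spacing agree. The `Fragment` fields, the weights, and the sample data are all invented for illustration and are not Fraunhofer's method.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    paper_rgb: tuple      # average paper color, 0-255 per channel
    line_pitch_mm: float  # spacing between lines of writing
    left_edge: tuple      # signed offsets of the left tear from a straight cut
    right_edge: tuple     # signed offsets of the right tear from a straight cut


def contour_gap(a, b):
    """Mean mismatch when a's right edge is pushed against b's left edge."""
    # Where one piece bulges (+), its true neighbor recedes (-),
    # so a perfect fit sums to zero at every sample point.
    pairs = list(zip(a.right_edge, b.left_edge))
    return sum(abs(right + left) for right, left in pairs) / len(pairs)


def mismatch(a, b, w_contour=1.0, w_color=0.05, w_pitch=1.0):
    """Lower is better. The weights are arbitrary placeholders."""
    color = math.dist(a.paper_rgb, b.paper_rgb)
    pitch = abs(a.line_pitch_mm - b.line_pitch_mm)
    return w_contour * contour_gap(a, b) + w_color * color + w_pitch * pitch


def best_right_neighbor(piece, candidates):
    """Pick the candidate that fits best against the piece's right edge."""
    others = [c for c in candidates if c is not piece]
    return min(others, key=lambda c: mismatch(piece, c))


# Made-up fragments: b was "torn" from a, c comes from a different sheet.
a = Fragment("a", (240, 235, 220), 4.2, (0, 0, 0, 0), (1, -2, 3, 0))
b = Fragment("b", (240, 235, 220), 4.2, (-1, 2, -3, 0), (0, 0, 0, 0))
c = Fragment("c", (200, 200, 255), 6.0, (0, 1, 1, -2), (0, 0, 0, 0))

print(best_right_neighbor(a, [a, b, c]).name)
```

A real system would have to extract these features from scans, handle rotation and both sides of each scrap, and cope with noisy edges, which is where the hard work lies.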
So What Went Wrong?
It isn’t clear why this isn’t feasible. The Fraunhofer system did help the Bundesbank match up damaged banknotes. However, banknotes are more uniform and have known features that the Stasi documents lack. And, as you can see in the video below, it still looks like there is some manual work required. Despite about 6.5 million euros of investment, the official word is that the process didn’t scale well for this many documents.
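The official explanation doesn't say where the scaling trouble lies. One plausible contributor, and this is purely our guess, is that naive jigsaw matching compares every scrap against every other scrap, so the work grows with the square of the number of pieces. The sketch below uses the upper figure of 3,000 pieces per bag from earlier in the article; the pooled case assumes scraps from one page could end up in different bags, which we don't know to be true.

```python
def pair_count(n):
    """Number of distinct pairs among n scraps: n choose 2."""
    return n * (n - 1) // 2


PIECES_PER_BAG = 3_000  # upper figure quoted in the article
BAGS = 16_000

# Best case: every page's scraps stayed together in one bag.
per_bag = pair_count(PIECES_PER_BAG)
all_bags_separately = BAGS * per_bag

# Worst case (our assumption): scraps may match across bags.
pooled = pair_count(BAGS * PIECES_PER_BAG)

print(f"candidate pairs in one bag:      {per_bag:,}")
print(f"all bags, matched bag by bag:    {all_bags_separately:,}")
print(f"all scraps pooled together:      {pooled:,}")
```

Real systems prune most of those pairs using cheap features such as paper color, but the numbers hint at why a method that works on a handful of bags might struggle with thousands.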
You have to imagine that computers and image processing have come a long way since 2013. It is surprising that you couldn’t do much better with modern hardware and techniques. Of course, if you were conspiracy-minded, you might wonder if someone doesn’t want the project to succeed.
To be fair, opening the bags is a chore. Archivists try not to disturb the order of the papers, and the bags often contain trash, which we are sure is pretty disgusting after all these years. Some papers have clips or staples and many are wrinkled. The machine needs the pages separated and flattened. To help speed up the process, each piece destined for the machine has to be about 2 cm square or larger. Another downside is that the documents are two-sided, which doubles the number of trips for each piece.
We get it. Building one radio is easy. Building 16,000 of them is hard. We know people can unshred documents that aren’t reduced to dust or ash. Even the crosscut shredder isn’t foolproof if you have the right open-source software. The Iranians famously employed carpet weavers to reassemble documents taken from the US embassy in 1979. You can hear more about some of these cases in Edward Robinson’s CHCon presentation in the video below.
We suspect the Stasi files will remain shredded and unread for a long time, but maybe not entirely for technical reasons. We also imagine that, since DARPA has sponsored challenges for unshredding, someone (maybe a lot of someones) has some great tech for this that they aren’t making public.