Wish you could access the Vatican's Secret Archives online? Sorry, they're not there, but a new project combining artificial intelligence with optical character recognition (OCR) could change that, The Atlantic reports. Called In Codice Ratio, it aims to scan the archives' 53 miles of shelved material dating back some 1,200 years. Only problem: OCR struggles with handwritten material like the cursive and calligraphy typical of the archives. OCR works by identifying groups of letter-images and comparing them to letters stored in memory, but handwriting makes that hard by stringing letters together with no spaces between them. So In Codice Ratio turned to high schoolers for help.
Students from 24 Italian schools compared medieval Latin letters with the OCR software's attempted letter scans and identified which ones worked best. Initially "the idea of involving high-school students was considered foolish," says a lead scientist on the Italian project. "But now the machine is learning thanks to their efforts." With In Codice Ratio still struggling, the researchers fed it 1.5 million digitized Latin words to show which letter combinations arise most often. Now the software has a 96% success rate. At stake is the chance to unveil the Pope's personal archive online, though that won't include material on the Church's sexual abuse scandal, History.com notes, because archive documents are only made available to scholars after 75 years.
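For readers curious how knowing "which letter combinations arise most often" can help, here is a minimal Python sketch of the general idea. It is an illustration under stated assumptions, not In Codice Ratio's actual code: the seven-word corpus, the candidate readings, and the function names `bigram_counts`, `score`, and `best_reading` are all hypothetical. It counts adjacent letter pairs in known Latin words, then prefers whichever candidate reading of an ambiguous scan is built from the more common pairs.

```python
from collections import Counter
import math

def bigram_counts(words):
    """Count adjacent letter pairs; ^ and $ mark word start and end."""
    counts = Counter()
    for word in words:
        padded = f"^{word}$"
        for a, b in zip(padded, padded[1:]):
            counts[a + b] += 1
    return counts

def score(candidate, counts, total, vocab_size):
    """Sum of log frequencies of the candidate's letter pairs (add-one smoothed)."""
    padded = f"^{candidate}$"
    logp = 0.0
    for a, b in zip(padded, padded[1:]):
        logp += math.log((counts[a + b] + 1) / (total + vocab_size))
    return logp

def best_reading(candidates, counts):
    """Return the candidate whose letter pairs are most common in the corpus."""
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 leaves room for unseen pairs
    return max(candidates, key=lambda c: score(c, counts, total, vocab_size))

# Toy stand-in for the 1.5 million digitized Latin words (assumption).
corpus = ["dominus", "domini", "anima", "animus", "minimum", "nomen", "omnia"]
counts = bigram_counts(corpus)

# In cursive, the strokes of "m" can also be read as "in" or "un".
print(best_reading(["anima", "aniina", "anuna"], counts))  # prints "anima"
```

A real system would work with far more data and a richer language model, but the principle is the same: when the shapes on the page are ambiguous, statistics about real Latin break the tie.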