Transcribathon gives community a glimpse into the history of medicine

Loretta Merlo, circulation manager, transcribes 19th-century case files at the Transcribathon.

Article credit: WCM Central

This summer, the WCM community came together to give artificial intelligence a human boost by transcribing handwritten medical notes from the 19th century into computer-readable files. The transcribed documents are part of a project by Cornell Tech master’s candidate Praveen Kumar Govindaraj to teach computers to decipher the florid script preferred by society at that time.

Govindaraj is using machine learning, a technique that allows a computer to learn from data without being explicitly programmed. For the handwriting recognition project, the computer analyzes a set of “gold standard” transcriptions that have been verified as accurate, and the more of them there are, the better.

The Transcribathon gave human participants a fascinating view into medicine in the 1800s, with cases ranging from an inebriated sailor who gashed his head after falling off the dock to a young woman who had been badly burned when a candle ignited her clothing. Treatments at the time varied from remedies still in use today to therapies such as bloodletting that were not very helpful.

Transcribathon participants weren’t working from the original case files. The fragile and sometimes decaying files had first been scanned and saved as digital documents by the Medical Center Archives staff as part of a project spearheaded by Dr. Curtis Cole, chief information officer, with a grant from the Frank Naeymi-Rad and Theresa A. Kepic Foundation. With digital copies, the documents will be both preserved and more accessible to those unable to visit the archives in person.

The hope for the machine learning project is that the computer will learn to generate keywords from the scanned handwritten documents so that they can be organized in a searchable database. It will likely take many years of refinement before the technology can produce complete, accurate transcriptions of the script.
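
To give a rough sense of what a searchable keyword database could look like, here is a minimal Python sketch. It is not the project's actual code: the function names, the short stop-word list and the two sample case notes are all invented for illustration, and it assumes the handwriting has already been turned into plain text.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real project would likely use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "on", "his", "her",
    "was", "after", "when", "by", "off", "had", "been",
}


def extract_keywords(text):
    """Lowercase the text, split it into words, and drop stop words and very short words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS and len(w) > 2}


def build_index(documents):
    """Map each keyword to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in extract_keywords(text):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Invented sample transcriptions, loosely modeled on the cases described in this article.
cases = {
    "case-001": "Sailor fell off the dock and gashed his head.",
    "case-002": "Young woman badly burned when a candle ignited her clothing.",
}

index = build_index(cases)
print(index.get("candle", []))  # look up which case files mention a keyword
```

Looking up a word in `index` returns the identifiers of the case files that mention it, which is the basic idea behind a keyword search; a production system would also need to handle misspellings, period abbreviations and transcription errors.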

Fortunately, five more Cornell Tech students have signed on to advance the project: Young Sang Choi, Evan Yates, Kelly Wang, Aaron Yingxiang Lu and Rohun Tripathi are spending this semester tackling the handwriting-recognition problem as part of a Product Studio challenge to develop a technology-driven solution to a business need.

The Transcribathons took place in the Samuel J. Wood Library computer lab on July 27 and Aug. 17. Additional events will be scheduled in the future.