KELLIA Products

White Papers

Transcription and Encoding Standards for Digital Coptic (forthcoming)

Metadata Recommendations (forthcoming)

Linked Data Standards for Digital Coptic (forthcoming)

Full final report for the KELLIA project (forthcoming)

Online digital Coptic dictionary

KELLIA members Frank Feder and Maxim Kupreyev of the Berlin-Brandenburg Academy of Sciences and Julien Delhez of the University of Göttingen created a digital lexicon of Coptic. Amir Zeldes and his student Emma Manning created a web interface for the lexicon. It is available as a standalone website and is linked to Coptic SCRIPTORIUM texts in the normalized visualization available through the web application and the ANNIS tool.

Online Coptic Natural Language Processing (NLP) service

KELLIA member Amir Zeldes created a web application and a machine-actionable API that simultaneously run Coptic SCRIPTORIUM's Natural Language Processing tools. Users can input text, including XML tags, and the online NLP pipeline will tokenize, normalize, and lemmatize the text, tag it for part of speech and language of origin, and return the result as SGML.

Coptic Neural Network OCR

KELLIA member So Miyagawa is developing Coptic OCR using artificial neural networks, in collaboration with Kirill Bulert (eTRAP/Max Planck Institute for Biophysical Chemistry), Marco Büchler (eTRAP), and Eliese-Sophia Lincke (Humboldt Universität zu Berlin/eTRAP).

SCRIPTORIUM-VMR converter

KELLIA member Uwe Sikora built a converter for the XML structures of Coptic SCRIPTORIUM and VMR.

Integration of data from the Database and Dictionary of Greek Loanwords in Coptic (DDGLC) into the NLP tools

Information from the DDGLC project (University of Leipzig) has been added to the NLP tools, improving the language of origin tagger, the tokenizer and morphology analysis, and the lemmatizer and part-of-speech tagger. All of these tools are also available from the online NLP service.

Growing Besa Corpus

KELLIA partner So Miyagawa has edited and translated Besa's On Lack of Food. The text is available for viewing in the Coptic SCRIPTORIUM web application and is searchable in ANNIS.

GitDox: Online multilayer corpus annotation editing tool

GitDox is a lightweight transcription and annotation tool that can be customized for individual projects and for multiple languages. Created by researchers at Georgetown University during the KELLIA project, it currently contains a transcription/text editor with customizable encoding validation options and a spreadsheet editor for collaborative editing of a multi-layer annotated document. The Coptic SCRIPTORIUM version is linked to the online Coptic Natural Language Processing service. After researchers transcribe a Coptic text with light XML markup for structural information (e.g. page breaks and missing text), they can click a button to run the text through the NLP tool pipeline; the annotated text is then presented to the researcher in a multilayer format in spreadsheet mode. Researchers commit the data and subsequent edits to repositories on GitHub. The tool also includes space for document metadata. It is open source (Apache 2.0 license) and available for download and installation.
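As an illustration of what a customizable encoding validation check on a lightly marked-up transcription might look like, here is a minimal sketch in Python. It is not GitDox's actual validation code: the function name validate_transcription, the wrapper element text, and the tag whitelist (pb for page breaks, gap for missing text, lb for line breaks) are assumptions chosen for the example. The sketch only confirms that the markup is well-formed and that every tag is on the project's list.

```python
import xml.etree.ElementTree as ET
from typing import List, Set

# Hypothetical per-project whitelist of structural tags (an assumption for
# this sketch, not GitDox's real configuration).
ALLOWED_TAGS: Set[str] = {"text", "pb", "gap", "lb"}


def validate_transcription(fragment: str, allowed: Set[str] = ALLOWED_TAGS) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    try:
        # Wrap the fragment in a single root so that mixed text and tags parse.
        root = ET.fromstring("<text>" + fragment + "</text>")
    except ET.ParseError as err:
        return ["not well-formed: " + str(err)]

    problems = []
    for elem in root.iter():
        if elem.tag not in allowed:
            problems.append("unexpected tag: <" + elem.tag + ">")
    return problems


if __name__ == "__main__":
    # A page break and a gap are on the whitelist, so no problems are reported.
    print(validate_transcription('<pb n="1"/>some text <gap/> more text'))
    # An unclosed tag is reported as a well-formedness problem.
    print(validate_transcription('<pb n="1">some text'))
```

A project with different structural conventions would pass its own set of tag names as the allowed argument; a check like this would run before the text is sent on to the NLP pipeline.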

Coptic WordNet

KELLIA member So Miyagawa is building Coptic WordNet with Laura Slaughter (University of Oslo) and Luís Morgado da Costa (Nanyang Technological University, Singapore). One of their papers is now available online.

KELLIA E-ditions

KELLIA members Uwe Sikora, Tiffany Ziegler, and So Miyagawa created a website that links to and describes digital humanities (DH) projects producing digital editions.