DigiLab + DiXiT 

The Scholarly Digital Edition and the Humanities. Theoretical approaches and alternative tools

Rome, 3-5 Dec 2014

The workshop ‘The Scholarly Digital Edition and the Humanities. Theoretical approaches and alternative tools’ was held in Rome from 3 to 5 December 2014. It was organized by DigiLab (Centro interdipartimentale di ricerca e servizi, Sapienza University of Rome) with the support of the DiXiT Marie Curie Network, of which DigiLab is a partner. The scientific project is directed by Domenico Fiormonte; the workshop was organized by Federico Caria and Isabella Tartaglia. It was widely attended: students, PhD candidates, professors and researchers; representatives of various projects from across Europe; philologists, philosophers, conservators, and experts in media and communication.

The talks by Domenico Fiormonte, Desmond Schmidt and Paolo Monella focus on the digital scholarly edition and problematize different aspects of it, addressing both its creation and its use. Their shared premise is that scholars should not simply adhere to existing practices, experiences or traditions, or adopt a standard uncritically, but should study them closely, identifying their strengths and weaknesses by tracing their history up to the present. This enables scholars to make well-informed scientific choices.

We will give a brief account of the talks. The complete materials of the workshop are available here.

In his talk, with the illustrative title ‘The socio-cultural foundations of the scholarly digital edition’, Fiormonte, later echoed by Schmidt, opens by focusing on the inner workings of the digital edition: software, i.e. code. The hermeneutical circle of digital humanities assumes a loop of hypotheses, practices and methodologies, and results. The references are to Manovich's identification of a software culture1 and to Kittler’s provocation on code: codes are what determine us today?2. The speaker also addresses the edition itself, which has to be seen as a digital representation of a textual artifact and read through a semiotics of culture. In the field of scholarly editing too, standards have imposed a formalization that shapes representation at every level, from characters to the definition of a text. The character level is explored in Monella’s talk; Fiormonte, for his part, sketches a history of character encoding, starting from the first international standard, which came into being in a particular cultural environment. The American Standard Code for Information Interchange (ASCII) was developed from telegraphic codes and first used in a Bell teleprinter; it makes no provision for accented characters, for example, which is unsurprising given its provenance. This became a limitation, a bias, when ASCII was accepted by the ISO in 1972 as an international standard. Its successor, which incorporates it, is UTF-8, one of the encoding forms of Unicode. It is worth noting that the board of directors of an initiative whose goal is the character encoding of all active languages of the world is composed of representatives from Google, Apple, Intel, Microsoft, IBM, ISM-Appature.
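The limitation described above can be seen in a minimal Python sketch. The sample words are chosen only for illustration and do not come from the talk; the sketch shows that ASCII cannot encode an accented letter, while UTF-8 encodes it and leaves pure ASCII text byte-for-byte unchanged.

```python
# An Italian word with an accented letter (illustrative sample, not from the talk).
text = "città"

# ASCII defines only 128 code points, so the accented "à" cannot be encoded.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 encodes the same string; "à" takes two bytes, the other letters one each.
utf8_bytes = text.encode("utf-8")

# Text made only of ASCII characters yields identical bytes in both encodings,
# which is the sense in which UTF-8 incorporates ASCII.
plain = "edition"
same_bytes = plain.encode("ascii") == plain.encode("utf-8")

print("ASCII can encode 'città':", ascii_ok)
print("characters:", len(text), "UTF-8 bytes:", len(utf8_bytes))
print("ASCII text identical under UTF-8:", same_bytes)
```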

The idea of the text underlying the scholarly edition, which perpetuates itself in the digital context, is that of an image, a simulacrum, to be restored and preserved. This approach is informed by structuralist philology and by the information-retrieval imperatives of computer science. Drawing on the teaching of those scholars who go beyond data in order to bring process and context into the hermeneutical circle (e.g., Contini, Halliday, Benozzo, Rico, Fiormonte, Schmidt), the talk sets out some desiderata for a new scholarly edition: digital philology should not be limited to the preservation and reconstruction of cultural heritage documents, but should act as an interface to knowledge; there should be no separation between the scope of the representation and the needs of the user3.

Before presenting Ecdosis, a software tool for creating digital editions, Schmidt briefly recalls the history and prehistory of the critical edition and of markup, recognizing common elements from the Hellenistic period to the present day (e.g., font size, formatting, markup, citation). In the current situation, digital editions have some problems: they are costly and few in number. Schmidt’s solutions boil down to better overall software design: easier data entry, editing and proofreading; reuse of standardized software modules across projects; lower software maintenance costs. If we recognize that digital editions are software, it is clear that their development should be governed by the procedures and principles of software development.

Schmidt distinguishes two ways of building applications: a bottom-up approach, which can be summarized in the provocative expression ‘build it, and they will come’, in which one starts by designing the data structures and ends with the user interface; and a top-down approach, or ‘user-centric design’, in which the user requirements come first and data entry comes last. The second is more suitable for the development of software, and therefore of digital editions.

Schmidt asks whether these goals can be achieved using the XML-TEI platform. The answer is negative, for the following reasons: lack of interoperability4, the complexity of the TEI Guidelines, and the decline of XML and XSLT in web development.

The MultiVersion Document (MVD), the internal format of Ecdosis, arose in response to these limitations and from the collaboration with Fiormonte. The format allows different versions of a work to be stored in a single file; one can query the file to identify variants across versions, and each version remains readable. Although MVD is the internal format, data can be imported and exported in a variety of formats. When an XML document is imported into Ecdosis, the program processes the encoding in order to create autonomous versions: each layer of markup, or nested markup, corresponds to a version. No collation tool is needed with MVDs, because that functionality is built into the format itself. Within the present panorama of editing tools, Ecdosis, which is still under development, offers some innovative features, such as a Minimal Markup Editor based on Markdown, automatic shape recognition (stored in GeoJSON format) for linking text and image, a plain text editor for editorial material, and the possibility of organizing data as events to be shown on a timeline.
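The idea of keeping several versions in one structure, reading each one back and querying the variants can be illustrated with a toy Python sketch. This is not the actual MVD file format or any Ecdosis API: the list-of-pairs structure, the sample sentence, the version labels and the function names `read_version` and `variants` are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical toy structure: each pair holds the set of version labels that
# share a fragment, together with the fragment itself.
Pair = Tuple[FrozenSet[str], str]

# Three invented versions (A, B, C) of one sentence, stored once.
document: List[Pair] = [
    (frozenset({"A", "B", "C"}), "The quick "),
    (frozenset({"A"}), "brown"),
    (frozenset({"B", "C"}), "red"),
    (frozenset({"A", "B", "C"}), " fox "),
    (frozenset({"A", "B"}), "jumps"),
    (frozenset({"C"}), "leaps"),
]


def read_version(doc: List[Pair], version: str) -> str:
    """Rebuild one version by joining the fragments that belong to it."""
    return "".join(text for versions, text in doc if version in versions)


def variants(doc: List[Pair]) -> List[Tuple[List[str], str]]:
    """Return the fragments that are not shared by every version."""
    all_versions = frozenset().union(*(versions for versions, _ in doc))
    return [(sorted(v), text) for v, text in doc if v != all_versions]


print(read_version(document, "A"))
print(read_version(document, "C"))
for labels, fragment in variants(document):
    print(labels, repr(fragment))
```

Because shared text is stored only once and every fragment carries its version labels, the variants fall out of the structure itself, which is one way to understand why no separate collation step is needed in such a design.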

Monella brings to the fore a topic that might seem too specialized to be of general interest to the attendees. His talk deals with the encoding of primary sources whose writing systems predate the normalization brought about by printing technology. It is indeed a relevant issue in the field of the digital scholarly edition, as shown by the many editions that offer both a diplomatic and a normalized transcription of their sources; moreover, the question has recently been (re)raised on the TEI mailing list5. Monella urges us to think about the formalization of standard versus non-standard characters: what consequences, including cultural ones, does this kind of representation entail? Besides the difficulty of encoding rare or ancient characters, increasingly overcome by initiatives like MUFI6, the consequence is a lack of attention to character encoding. We have mechanisms to define the meaning and usage of the XML elements used in the TEI grammar, since the meaning of an element such as <paragraph> is not self-evident; by contrast, an A is taken to be just an A.

The recording of ‘the complete set of discrete elements in which the encoder decides to divide the continuum of possible letters, macrons, dots, decorations etc’ has been implemented in Orlandi’s table of signs7. For his own Vespa project, Monella draws up a similar table, which a script converts into a set of XML files for the alphabetical and graphical levels; these are linked to the XML transcription files, which are themselves organized into alphabetical, graphical and linguistic layers. Monella is currently working on the latter, in order to develop a complete and sustainable workflow for a complex model of character encoding in digital editions8.
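A table of signs of this kind can be pictured with a small Python sketch. It does not reproduce Orlandi's table or the Vespa project's files: the entries, the field names and the function `to_alphabetic` are hypothetical, and the mapping of both "u" and "v" to one alphabetic value is an illustrative editorial decision. The sketch shows an explicit declaration of every sign used on the graphical layer, from which an alphabetical layer is derived.

```python
# Hypothetical table of signs: every sign the encoder allows on the graphical
# layer is declared, with the alphabetic value the encoder assigns to it.
TABLE_OF_SIGNS = {
    "ſ": {"alphabetic": "s", "description": "long s"},
    "s": {"alphabetic": "s", "description": "round s"},
    "e": {"alphabetic": "e", "description": "letter e"},
    "r": {"alphabetic": "r", "description": "letter r"},
    "u": {"alphabetic": "u", "description": "round form of u"},
    "v": {"alphabetic": "u", "description": "angular form of u"},
}


def to_alphabetic(graphical: str) -> str:
    """Derive the alphabetical layer from a graphical-layer transcription.

    A sign that is not declared in the table is rejected, so that no
    character enters the transcription without an explicit definition.
    """
    result = []
    for sign in graphical:
        if sign not in TABLE_OF_SIGNS:
            raise ValueError(f"sign {sign!r} is not declared in the table of signs")
        result.append(TABLE_OF_SIGNS[sign]["alphabetic"])
    return "".join(result)


# Invented sample transcription on the graphical layer.
print(to_alphabetic("ſervus"))
```

Rejecting undeclared signs is the point of the design in this sketch: each character is treated as something the encoder has defined, rather than as self-evident.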

1  Manovich, Lev, Software Takes Command, Bloomsbury, New York and London, 2013.

2  Software Studies: A Lexicon, edited by Matthew Fuller, Cambridge, Mass.: MIT Press, 2008.

3  Cf. Fiormonte, Domenico, «Chi l’ha visto? Testo digitale, semiotica, rappresentazione. In margine a un trittico di Dino Buzzetti», Informatica Umanistica, 2, 2009, pp. 21-63.

4  Schmidt, Desmond, «Towards an Interoperable Digital Scholarly Edition», Journal of the Text Encoding Initiative [Online], Issue 7, November 2014. http://jtei.revues.org/979.

5  In the TEI list archive, pertinent discussions can be found under the subject lines ‘Describing glyphs in <msDesc>’ and ‘Dealing with obscure characters in Unicode’.

6  http://folk.uib.no/hnooh/mufi/ [Accessed 15 Dec 2014].

7  Orlandi, Tito, Informatica testuale. Teoria e prassi, Laterza, Roma, 2010. Cf. http://www.cmcl.it/~orlandi/principe/ [Accessed 15 Dec 2014].

