Digital Medieval Data Curation

  • Published on
    18-Jul-2015


CLIR Postdoctoral Fellowship Seminar
Bryn Mawr, 2013
Benjamin Albritton, Stanford University Libraries
blalbrit@stanford.edu
@bla222

Current State: A World of Silos

  • Roman de la Rose
  • Parker on the Web
  • e-codices
  • And so on

Data Interoperability
  • Break down silos
  • Separate data from applications
  • Share data models and programming interfaces
  • Enable interactions at the tool and repository level

Three communities in this ecosystem
  • Porous boundaries
  • Shared requirements
  • Overlapping roles and responsibilities
  • Goal: Shared creation, curation, preservation, and access to data about manuscripts

Designing Modular Repositories and Tools
[Diagram, shown in three successive versions. Labels: Repository; Repository User Interface; 3rd-Party Tools; Image Data (Canonical); Non-image data (Canonical); Image Viewer; Discovery; Annotation; Transcription; Image Analysis; Tool X?]

Iterative Interactions

Multiple Data Sources
  • Existing structured data (catalogs)
  • User-added: comments, transcriptions, etc.
  • Digital images
  • Machine processing

Motivating Questions

What does this mean for medieval data?

  • How do we rethink medieval object data in a shared, distributed, global space?
  • How do we enable collaboration and encourage engagement?
  • How do we deal with tools that are producing new data on digital surrogates that are implicitly about a real world object?

Transcribing from Digital Surrogates
La Terre de Secille

Naïve Approach: Attach Transcription to Image
One problem example: Multiple Representations
  • CCC 26 f. iiiR
  • Fold A Open
  • Fold A and B Open
  • f. iiiV

The Shared Canvas

  • Represents a real world thing we want to talk about
  • Has a unique name: http://dms-data.stanford.edu/Parker/CCC026/canvas-12

Data Model: SharedCanvas
http://www.shared-canvas.org

Data is about a real thing

Canvas Paradigm
  • A Canvas is an empty space in which to build up a display
  • Makes explicit that the image is a surrogate

Open Annotation Model
  • Annotation (a document)
  • Body (the comment of the annotation)
  • Target (the resource the Body is about)
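The three parts of the Open Annotation Model can be sketched as a small record type. This is only an illustration in plain Python: Open Annotation itself is expressed as linked data, and the class name, field names, annotation URI, and comment text below are assumptions made for this sketch. Only the Canvas URI comes from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """An annotation is itself a document that links a Body to a Target."""
    uri: str     # hypothetical identifier for the annotation document
    body: str    # the comment of the annotation (or a URI pointing to it)
    target: str  # URI of the resource the body is about


# The Canvas URI is the one quoted in the slides.
CANVAS_URI = "http://dms-data.stanford.edu/Parker/CCC026/canvas-12"

note = Annotation(
    uri="http://example.org/annotations/1",   # placeholder, not a real resource
    body="An illustrative comment about this page.",  # invented text
    target=CANVAS_URI,
)

print(f"{note.uri} says {note.body!r} about {note.target}")
```

Because the target is the Canvas rather than any particular image file, the comment stays attached to the page itself, whichever surrogate is being displayed.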

Model: Annotations to Paint Canvas
  • The Canvas represents the empty page
  • Annotation links Image with Canvas
  • Annotation links Text with Canvas
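A rough sketch of "painting" a Canvas follows, assuming a simplified in-memory model rather than the actual SharedCanvas serialization. The class names, the `paint` and `bodies` helpers, the `"image"`/`"text"` labels, and every example.org URI are hypothetical. The sketch shows how several images of the same folio (for instance, with a fold closed and open) and a transcription can all point at one Canvas instead of at each other.

```python
from dataclasses import dataclass, field


@dataclass
class PaintingAnnotation:
    body: str       # URI or literal text of the thing painted onto the canvas
    body_type: str  # "image" or "text" (labels chosen for this sketch)
    target: str     # URI of the Canvas being painted


@dataclass
class Canvas:
    uri: str
    label: str
    annotations: list = field(default_factory=list)

    def paint(self, body, body_type):
        """Attach a body to this canvas through a new annotation."""
        anno = PaintingAnnotation(body, body_type, self.uri)
        self.annotations.append(anno)
        return anno

    def bodies(self, body_type):
        """Return every body of the given type painted onto this canvas."""
        return [a.body for a in self.annotations if a.body_type == body_type]


# All URIs and the transcription text below are placeholders.
canvas = Canvas("http://example.org/canvas/f-iiiR", "f. iiiR")
canvas.paint("http://example.org/images/f-iiiR.jpg", "image")
canvas.paint("http://example.org/images/f-iiiR-fold-a-open.jpg", "image")
canvas.paint("Illustrative transcription text", "text")

print(len(canvas.bodies("image")), "images and",
      len(canvas.bodies("text")), "text painted onto", canvas.label)
```

The transcription is linked to the Canvas once and remains valid for every image, which is the problem the naïve image-based approach runs into with multiple representations.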

Model: Missing Pages

Medieval Data Use-Cases: A Sampler
  • Structured data from existing sources
  • Transcription and glyphs
  • Structured data from new sources

Structured Data from Existing Sources

A Catalog of the Manuscripts of Salisbury Cathedral Library

Drives Discovery

Transcription: T-PEN (Saint Louis University)
http://t-pen.org
  • Transcription tool
  • Provides image parsing: columns, lines

BNF fr. 9221 column parsing
BNF fr. 9221 line parsing
BNF fr. 9221 transcription view

Drives Full-Text Search
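One way to picture how line-level transcriptions could drive full-text search is a small inverted index keyed by word. This is a minimal sketch under stated assumptions, not a description of how T-PEN or any repository actually indexes text: the `Line` record, the `build_index` and `search` helpers, the canvas URIs, and the sample text are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    canvas: str  # URI of the canvas the line sits on (placeholder values below)
    column: int  # column number produced by column parsing
    line: int    # line number within the column
    text: str    # transcribed text for that line


def build_index(lines):
    """Map each lowercased word to the set of lines that contain it."""
    index = defaultdict(set)
    for ln in lines:
        for word in ln.text.lower().split():
            index[word].add(ln)
    return index


def search(index, word):
    """Return matching lines ordered by canvas, column, and line."""
    hits = index.get(word.lower(), ())
    return sorted(hits, key=lambda l: (l.canvas, l.column, l.line))


# Invented sample data; not taken from any manuscript.
lines = [
    Line("http://example.org/canvas/1", 1, 1, "first sample line"),
    Line("http://example.org/canvas/1", 1, 2, "second sample line"),
    Line("http://example.org/canvas/2", 2, 1, "another Sample here"),
]
index = build_index(lines)

for hit in search(index, "sample"):
    print(hit.canvas, hit.column, hit.line, hit.text)
```

Each hit carries its canvas, column, and line, so a search result can point back to a specific region of a page rather than to a whole manuscript.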

http://t-pen.org/TPEN and other interfaces

http://stanford.edu/~blalbrit/v-machine-2/samples/DamedequiRF5.xml

T-PEN's PaleoTool
BNF fr. 1586 glyph parsing

Results for matching glyphs

Glyphs with multiple letters

Comparing results across manuscripts
BNF fr. 1586

CCCC 324

User-created Structured Data

Beinecke MS 310, f. 1r
  • Each row = 1 day (January 1, here)
  • Lists the feast of the Circumcision
  • Optionally provides additional information

Distributed Resources / Distributed Environments

Data capture in T-PEN

http://t-pen.org, Saint Louis University

Front-end: Exhibit
http://guillaumedemachaut.com/kalendar/sharedkalendar.html
  • Simple (really simple) Exhibit based on kalendar transcriptions (Exhibit: http://www.simile-widgets.org/exhibit/)
  • Allows filtering by date, item, and manuscript, as well as search across the items

For each record:

Enabling rapid comparison

Two mss. include the entry Thimotheus apostel
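The kalendar records and the comparison they enable can be sketched with a few lines of Python. This is a hedged illustration only: the `KalendarEntry` fields, the `filter_entries` and `manuscripts_by_item` helpers, the manuscript names "MS A" and "MS B", and the day assigned to the Thimotheus entry are assumptions made for this sketch. The slides confirm only that Beinecke MS 310 lists the Circumcision on January 1 and that two manuscripts share the entry "Thimotheus apostel".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class KalendarEntry:
    manuscript: str
    month: int
    day: int
    item: str       # the feast or other item listed for that day
    note: str = ""  # optional additional information


def filter_entries(entries, manuscript=None, month=None, day=None, item=None):
    """Keep entries matching every criterion that was supplied."""
    def keep(e):
        return ((manuscript is None or e.manuscript == manuscript)
                and (month is None or e.month == month)
                and (day is None or e.day == day)
                and (item is None or e.item == item))
    return [e for e in entries if keep(e)]


def manuscripts_by_item(entries):
    """Map each item to the set of manuscripts that include it."""
    shared = defaultdict(set)
    for e in entries:
        shared[e.item].add(e.manuscript)
    return shared


entries = [
    KalendarEntry("Beinecke MS 310", 1, 1, "Circumcision"),
    # "MS A", "MS B", and the date 1/24 are placeholders for illustration.
    KalendarEntry("MS A", 1, 24, "Thimotheus apostel"),
    KalendarEntry("MS B", 1, 24, "Thimotheus apostel"),
    KalendarEntry("MS A", 1, 1, "Circumcision"),
]

by_item = manuscripts_by_item(entries)
print(sorted(by_item["Thimotheus apostel"]))
print([e.manuscript for e in filter_entries(entries, month=1, day=1)])
```

Grouping by item is what makes the rapid comparison possible: once each row is a structured record, finding which manuscripts share an entry is a lookup rather than a page-by-page reading.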

Distributed Resources / Distributed Environments

SharedCanvas Demo Implementation
http://www.shared-canvas.org/impl/demodh

A Sea of Manuscript Data
  • Thousands of manuscripts currently available interoperably, with more coming rapidly
  • Discovery data is a mixed bag
  • Tools provide data back into the system that can be re-used
  • New data drives new discovery, new interfaces, and new visualization challenges
  • Management and manipulation of that wild data is a serious challenge
