Data Integration

Innovation: Integration of Geovisual and Humanities Databases

All of our work depends on our plan to integrate geovisual and humanities databases. Unlike other efforts that focus on manuscript transcription or GIS interpretation, our work will integrate a humanities-oriented database (historical details, events, and relationships) with a geovisual database (GIS and 3D modeling). FileMaker Pro is well suited to our humanities-oriented database because (1) it uses open ODBC and XML standards to exchange data and (2) its relational databases can exchange static and live data with SQL data sources. Further, FileMaker Pro’s relatively user-friendly tools will encourage our non-technical humanities scholars to learn to use more robust data analysis tools in their scholarship.
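As a rough illustration of the kind of linkage described above, the following minimal sketch joins a humanities-style table of events to a geovisual-style table of places through a shared key. It uses Python's built-in sqlite3 module purely as a stand-in for the project's actual SQL data sources; the table names, column names, and sample rows are hypothetical placeholders and are not drawn from the project's data.

```python
import sqlite3

# In-memory database used only for illustration (an assumption, not the
# project's actual storage system).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE places (            -- geovisual side: one row per mapped feature
    place_id INTEGER PRIMARY KEY,
    name     TEXT,
    lon      REAL,
    lat      REAL
);
CREATE TABLE events (            -- humanities side: one row per historical event
    event_id    INTEGER PRIMARY KEY,
    description TEXT,
    year        INTEGER,
    place_id    INTEGER REFERENCES places(place_id)
);
""")

# Placeholder rows; names, years, and coordinates are invented for the example.
conn.execute("INSERT INTO places VALUES (1, 'Example Plaza', 0.0, 0.0)")
conn.execute("INSERT INTO events VALUES (1, 'Example land sale', 1400, 1)")
conn.execute("INSERT INTO events VALUES (2, 'Example council meeting', 1401, 1)")

# Join historical events to their locations so each event can be mapped.
rows = conn.execute("""
    SELECT e.year, e.description, p.name, p.lon, p.lat
    FROM events AS e
    JOIN places AS p ON p.place_id = e.place_id
    ORDER BY e.year
""").fetchall()

for row in rows:
    print(row)
```

The shared place_id column is the only point of contact between the two sides in this sketch, which is one simple way that historical records could be tied to mapped features.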

Although the backbone of our data management scheme will utilize traditional relational databases, with the assistance of Dr. Kantabutra we will begin testing a new data organization process known as Intentionally-Linked Entities (ILE). It replaces the rigid “indexing of things” with a more flexible approach that links “entities” (things, usually nouns) and “entity sets” (e.g. the entity set of “municipalities”) through “pointers,” which point in both directions. ILE also allows for any number of relationships to be represented as relationship objects. That is, each relationship object relates entities that have “roles” to play in that particular relationship. This aspect of ILE makes it particularly simple (and efficient) for users to navigate among the stored entities involved in that relationship and empowers scholars to explore more effectively the interconnections among individuals and groups defined by kinship, faith, and office.
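The ILE concepts described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not Dr. Kantabutra's implementation: the class names, the `related` helper, and the sample people and town are all hypothetical. It shows entities, entity sets, two-way pointers, and relationship objects whose participants hold named roles.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Entity:
    """A stored thing (usually a noun), e.g. a person or a municipality."""
    name: str
    entity_sets: list = field(default_factory=list)    # pointers back to containing sets
    relationships: list = field(default_factory=list)  # pointers to relationship objects


@dataclass(eq=False)
class EntitySet:
    """A named collection of entities, e.g. the entity set of municipalities."""
    name: str
    members: list = field(default_factory=list)

    def add(self, entity):
        # Link in both directions: set -> entity and entity -> set.
        self.members.append(entity)
        entity.entity_sets.append(self)


@dataclass(eq=False)
class Relationship:
    """A relationship object that relates entities through named roles."""
    kind: str
    roles: dict = field(default_factory=dict)  # role name -> Entity

    def bind(self, role, entity):
        # Link in both directions: relationship -> entity and entity -> relationship.
        self.roles[role] = entity
        entity.relationships.append(self)


def related(entity):
    """Return (kind, role, other entity name) for everything linked to `entity`."""
    found = []
    for rel in entity.relationships:
        for role, other in rel.roles.items():
            if other is not entity:
                found.append((rel.kind, role, other.name))
    return found


# Hypothetical example data, not drawn from the project's records.
people = EntitySet("people")
municipalities = EntitySet("municipalities")
juan, pedro, town = Entity("Juan"), Entity("Pedro"), Entity("Example Town")
people.add(juan)
people.add(pedro)
municipalities.add(town)

kinship = Relationship("kinship")
kinship.bind("father", juan)
kinship.bind("son", pedro)

office = Relationship("office")
office.bind("official", juan)
office.bind("jurisdiction", town)

print(related(juan))
```

Because every pointer is stored in both directions, moving from an entity to its relationships, and from there to the other participants, requires no index lookup or table scan. In this sketch, that is what lets a scholar start from one individual and reach kin and offices directly.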

Data Management Plan

Dr. Martinez at UCCS and Dr. Schinazi at ETH-Zurich will be responsible for overseeing the successful collection, preservation, and dissemination of project information. In his capacity as a database expert, Dr. Kantabutra will advise on all efforts to integrate our data processes and storage. Generally speaking, this project deals with data of two types: spatial (i.e., location-based) and non-spatial (e.g., properties of different places and things in space). Microsoft Access, FileMaker Pro, SQL, and ESRI’s ArcSDE geodatabase will be used to support the storage and management of spatial and non-spatial data. Our storage and management scheme includes three components:

Copyright 2014. Revealing Cooperation and Conflict Project.


Expected Data and Data Format

The project will generate a comprehensive collection of information, including:

Some of the particular data formats (and storage processes) that we will utilize are:

Data Format and Dissemination

Digital data will be stored on a secure server at ETH. All data will be made available to project collaborators through a previously established, secure Microsoft SharePoint site. With the support of the Information Technology (IT) team, appropriate access levels will be determined and granted to each collaborator. Microsoft SharePoint will be configured to adhere to the ISO 15489 records management standard and to the COBIT (Control Objectives for Information and Related Technologies) guidelines for data governance. The ETH IT team has extensive experience setting up collaborative environments for research purposes. All data will be the shared intellectual property of the participating scholars and institutions. Materials will also be publicly accessible via our public project website and other dissemination strategies.

Period of Data Retention

UCCS and ETH-Zurich will guarantee the hosting of their respective public project websites and project data for a period of no less than ten years, and the PIs intend to host these sites on a perpetual basis. In the event that neither UCCS nor ETH-Zurich can permanently host the sites, the Project Director will seek another public access hosting provider.

Data Storage and Preservation of Access

The primary repositories for all data will be UCCS and ETH-Zurich servers and websites, which will be properly indexed and searchable via the respective campuses’ official university websites. Both universities’ data storage systems are equipped with automated, redundant backup services and run full backups on a regular schedule set by university policy.

Data Protection

Data used in this project are already in the public domain. All personal data collected using the transcript tool will be anonymized and will not be made available for public access.