The design of the ASDN digital repository will be based on six pilot datasets containing records about individual slaves collected by scholars from diverse types of source documents and a wide range of geographic areas. In addition to being valuable materials to be made accessible to other scholars and the public for the first time (only one of the datasets is currently available online), they will be excellent datasets for expanding the descriptive fields in the ASDN repository and testing the digital analytical tools.
PI Gwendolyn Midlo Hall’s Louisiana Slave Database 1719-1820 (LSD) contains some 100,000 descriptions of slaves collected over a period of 15 years from numerous original manuscript sources. The following table lists the types of sources that were consulted and the number of slaves found in each type of document.
Types of Source Documents in Louisiana Slave Database
Document Type                             Frequency
Estate inventory                             27,638
Estate sale                                   3,761
Sale (other than probate)                    50,124
Criminal litigation                             150
Other litigation                                  6
Mortgage                                        930
Marriage contract                               964
Will                                            463
Seizure for debt                                574
Confiscation in criminal proceedings             13
Report of runaway                               196
Miscellaneous                                 5,060
List (e.g., census)                           1,552
Slave testimony                                 589
Atlantic Slave Trade Voyage                   8,645
Total number of slaves found                100,665
Because of its diverse sources, Hall’s dataset includes a wide range of descriptive data fields, including but not limited to the type of document in which the slave was described, the location where the slave lived, name of owner, name of slave, gender, age, racial designation (black, Indian, mulatto, etc.), African ethnicity when recorded (if born in Africa, as many slaves documented here were), marriage partner, children, skills, illnesses, injuries, perceived value (inventory and sale price, if sold alone), and date and retrieval information for the document in which the slave was described. (See list of 109 fields in the LSD in Appendix A.) This database was first published as a CD in 2000 and was made available online free of charge in 2001 at: www.ibiblio.org/laslave. Hall’s work, discussed further in the History, Scope, and Duration section, was recognized in Rediscovering America: 35 Years of the National Endowment for the Humanities (NEH, 2001).
In 2005, PI Walter Hawthorne assembled his Maranhão Inventories Slave Database (MISD), containing information about almost 8,500 slaves in the Brazilian state of Maranhão (located in the Amazonia region) from 1767-1832. Hawthorne discovered the documents that form the basis for the MISD in the Arquivo Histórical do Ministério das Finanças in Maranhão. There they sit uncatalogued in boxes, rarely viewed by professional historians. The documents are plantation inventories, which were recorded by the state when planters in the captaincy died. Among other things, the inventories list details about slave holdings, including slaves’ names, ethnicities, genders, skills, values and injuries. The documents provide rare details of slave populations recorded in a standardized way beginning in an early period of Amazonian development as a colony.
Douglas B. Chambers, Associate Professor of History at University of Southern Mississippi, will contribute his dataset, “Jamaican Runaways: A Compilation of Fugitive Slaves, 1718-1817” (2004), based on newspaper ads for approximately 7,500 runaway slaves from Jamaica, about half of whom were African.
Virginia Meacham Gould, Adjunct Faculty, Department of History at Tulane University, has compiled datasets based on civil documents, including police records, as well as Catholic Church sacramental records in New Orleans (1727-1852) and Mobile, Alabama and Pensacola, Florida (1809-1843), from which she has published both books and journal articles.1
Paul F. LaChance has agreed to contribute all his databases about Louisiana, including “Index to New Orleans Indentures, 1809-1843,” which contains records from five indenture books housed in the Office of the Mayor of New Orleans. The indentures are an exceptionally rich source regarding both the characteristics of the indentured apprentice or laborer and the conditions of the contract: the years of service owed the employer and what the worker (many of whom were slave apprentices) or the master of the slave was promised in return.2
O. Vernon Burton will contribute the census records about slaves in Edgefield District, South Carolina, that he collected and used for his book, In My Father’s House are Many Mansions: Family and Community in Edgefield, SC (Chapel Hill: University of North Carolina Press, 1985).
A number of other scholars have compiled databases about individual slaves in the U.S. and abroad that have not been made publicly available. We believe there will be wide interest in adding to the ASDN database, thus making it an increasingly rich resource about slaves throughout the Atlantic World. For example, beginning two decades ago, Manolo Florentino constructed databases in Brazil using dBASE for DOS software. Laird Bergad, Fé Iglesias and Maria Carmen Barcelo have created a slave sale database from documents in Cuba, and Bergad created slave databases for Matanzas, Cuba and Minas Gerais, Brazil, but these have not yet been made available to other scholars. As the collection of ASDN datasets expands, it has the promise of bringing many scholars into a collaborative space to conduct research on new topics, including international comparative analysis, based upon vast quantitative data that will be available for analysis together for the first time.
At present, people combing through historical sources for genealogical research are ahead of humanities scholars in making their data available publicly online. Amateur genealogists are an important audience for the ASDN project, and we expect that many users searching the ASDN datasets will come with this interest. Many of these people, inspired by their genealogical research, become interested in history and become very competent amateur historians. AfriGeneas – African Ancestored Genealogy (http://www.afrigeneas.com/), an impressive website hosted at Mississippi State University, is a freely accessible site with both rich data and valuable discussion networks. The almost 4,000 people following AfriGeneas in the few months since it went on Facebook in spring 2010 attest to the large community of African-Americans and others doing genealogical research about slave ancestors. The ancestry.com commercial websites in both the United Kingdom3 and United States have expanded their records of former slaves, making the basic index data available to the public but historical documents available by subscription only. For example, records from the Slave Manifests Filed at New Orleans, Louisiana, 1807-1860 Forms (http://community.ancestry.com/project.ashx?pid=31204) will soon go live on ancestry.com. This is an Ancestry World Archives Project in which data have been entered by volunteers from online digitized documents. But the fields are chosen based on the interests of genealogists, not historians. For example, while the slave manifest documents include the height of each slave, this information was expressly excluded from ancestry.com’s data entry system.4
Why have few humanities scholars working in the field of Atlantic slavery, who are the project’s primary audience, made their datasets available online? Several factors are at play. First, they have not had access to the technology to put their databases online, and some have been reluctant to do so because of professional hesitancy about sharing data. Second, many humanities scholars lack access to, or training in, SPSS and other statistical programs. A new generation of digital tools is needed, and very few websites with large datasets for humanities research are yet providing them.
B. Relation to Existing Databases and Scholarly Networks
Many decades of research about slave voyages by numerous scholars following in the path of Philip D. Curtin have culminated in the ambitious and innovative large-scale slave database hosted at Emory University, Voyages: The Transatlantic Slave Trade Database (TSTD2). It has received substantial financial support from NEH and deserved praise from scholars. Several historians working on the ASDN also participate in the TSTD2 project. PI Hawthorne has advised on this project, and two scholars who have agreed to serve on the ASDN Advisory Board also work with the TSTD2: Manolo Florentino is one of four primary editors, and Paul LaChance is a member of the Development Team.
The TSTD2 project documents about 35,000 slave ship voyages. On the site, users can trace flows of slave trade voyages from ports in Africa to ports in the Americas and can generate information about, among other things, the volume of those flows over time. TSTD2 shows better than any other scholarship the nature of the linkages between broad regions of Africa and broad regions of the Americas. TSTD2 allows examinations of port-to-port transfers of African slaves; ASDN will permit users to look beyond the regions and ports of departure and arrival, following voyages by land and sea into the African and American interiors.
What the TSTD2 does not contain, but the ASDN will contain, is information about the enslaved Africans aboard these ships, including their final destinations in the Americas. TSTD2 PI David Eltis and others wrote that there is very little information in voyage documents about the names or ethnic designations of the slave “cargo” aboard these ships.5 The exception is the 67,000 “recaptives” returned to Africa: Africans liberated by British anti-slave trade patrols from slave ships they captured after Britain outlawed the African slave trade in 1808. These recaptives have been entered in the TSTD2’s African Names Database. Although of great interest, these named slaves represent a relatively narrow time and place within the context of the Atlantic slave trade.
The Atlantic Slave Data Network, with its focus on biographical information and calculations about slaves and their descendants, will not, then, replicate the TSTD2 in any way either in content or as a tool. The ASDN data fields flow from descriptions of individual slaves, not of slave trade voyages. It will be a wholly different data collection that will augment and be augmented by knowledge and experience gained from TSTD2 and will encourage further research.
C. Preserving data collected by scholars and making it accessible for collaborative research
Too often, valuable historical data that are carefully collected by individual scholars are subsequently lost. After a specific article or monograph is finished, scholars may lock the data away in file cabinets or discard it. This project will build a digital repository where scholars can deposit and then share their data. Many scholars have a strong interest in preserving their data, although migrating it to new formats is often daunting.
Many scholars feel ambivalent about sharing their research data with others, however, having spent a significant amount of their precious research time searching for and compiling it. Here, a shift in the culture of scholarship from individual production to scholarly collaboration is needed. The ASDN will offer a model intended to assist in this cultural shift.
Scholars whose datasets are approved as being relevant to Atlantic slavery and of good quality by a rotating editorial committee will be able to obtain the benefits of preserving their data and using the site’s digital tools while they initially analyze their data privately. Scholars will have the option to make their datasets inaccessible to others (in a password-protected section of the digital repository) for a period of time to be decided by the editorial committee. Although the project is founded on the principle that broad sharing of data is valuable to humanities scholarship, we recognize that scholars often need to publish articles and books based on data they have collected over a considerable period of time. Sensitive to this concern, we want to ensure that scholars (especially outside the U.S.) do not see this project as an attempt to “grab” data. Scholars will be assured that the ASDN site will back up their data on multiple servers and make it accessible to them anywhere in the world in an easy-to-analyze fashion. Once these scholars have made private use of material they have collected for the time period established by the ASDN project, their datasets will be opened along with the other datasets on the ASDN site for use by scholars and the public. We envision that allowing scholars to “stage” their data releases will help them to move towards open access.
D. Creating Tools for Data Analysis
Historians, their students, and the public are often unaware of how to use statistical and graphing tools, such as SPSS, SAS or ORACLE, all of which are expensive, proprietary software, to manipulate and make sense of data. This is particularly true of scholars in poor countries in Africa and the Americas who often have no access to expensive data analysis tools. There is a tremendous digital divide between U.S.-based scholars and our colleagues in Africa and the Americas. For many, the value of quantitative data remains a mystery. Moreover, even those knowledgeable about such tools often work alone and are unable to collect the density of data necessary to make broad claims about African and African-American slaves’ lives (or the lives of any population). The proposed ASDN will create new tools to encourage and assist in collaborative, international studies of massive and widely scattered collections of materials about slavery in the Atlantic World.
The digital repository that will be used for ASDN is the open source application KORA (http://kora.sourceforge.net/), designed by Michigan State University’s MATRIX digital humanities center and now in its third generation. KORA has been used successfully to create websites with large quantities of digital objects. (See: http://www2.matrix.msu.edu/wp-content/uploads/2008/11/kora_sheet.pdf.) Because KORA is a robust web-based application, it supports the distributed development of collections and datasets, in which multiple users can work with the same resource simultaneously from separate locations. This capacity can facilitate the rapid development of multiple online datasets contributed by project participants in widely scattered places.
MATRIX has national and international experience using this application to build collaborative, educational digital projects with multiple partnering institutions and scholars, particularly in Africa. We will create both written instructions and a web-based tutorial—initially in English, French, Spanish, and Portuguese—explaining how to enter data into the ASDN schema in KORA. The project will take advantage of KORA’s multilingual capabilities; the ASDN data ingestion pages, data fields, and controlled vocabulary for many of the fields will be translated into French, Spanish, and Portuguese.
Making statistical data easily available and securely preserved is, then, one aspect of the project. Making that data understandable is another. Scholars and students, and anyone else with Internet access, will be able to search and browse (or download) individual datasets or the entire collection of datasets. Users will be able to formulate questions (like those spelled out in section I.A. above) and get calculated answers not only in the form of numbers but also as visual graphs: pie charts, line charts, and histograms. Also, the project will build a Flash-based visualization platform that will work with KORA and allow users to map the data; maps of the Americas and Africa will illustrate the places from which individual slaves hailed (based on ethnic identifications) and the places to which they were finally brought. A time scrubber will allow scholars and students to see temporal and spatial shifts in patterns, a visualization of the slave trade attached to names and individuals. The map visualizations will be especially appealing to teachers and students, as they will visually link specific places in Europe, the Americas and Africa to other places in Africa, connecting the slave trade to real people.
E. Making Social Knowledge Networking Available
The ASDN project will have extensive social knowledge networking features that will be designed not only to stimulate communication and collaboration among the diverse range of scholars interested in slavery, but to break down traditional knowledge silos, open new perspectives, and facilitate the advancement of knowledge within the domain. To these ends, the ASDN project will have several core concepts that cut across the user generated social content. The most important of these are tags. All user-generated content discussed below (regardless of how formal or informal it is) is taggable. These tags are vital for the creation of connections between scholars and research.
In addition to tags, the ASDN project will include a range of features, all of which are designed to facilitate the advancement of knowledge within the domain. First and foremost, the system will feature a variety of ways for scholars to contribute original content (either standalone or linked to data within the system):
Threads. Threads are the most informal unit of user-generated knowledge in the ASDN project. The feature will allow scholars to post short, almost casual, pieces of content. These could include original thoughts or simply contextualized links to other content (internal or external). Threads, which are analogous to microblogging, are designed specifically as a low-cost way for scholars to contribute content.
Research Notes. This feature is the basic unit of formal user-generated knowledge in the ASDN project. Each user has their own set of Research Notes, into which everything of value relating to their research or scholarly activities goes: preliminary conclusions, field observations, textual analysis, notes, etc. While Research Notes resemble blog posts, they are not personal or professional announcements. Instead, they are substantive units of research thought. As with all scholar-generated content, Research Notes are taggable, thereby facilitating connections between scholars.
Research Discussions. While scholars have the ability to engage in discussions around specific pieces of content (Threads & Research Notes), they also will be able to initiate standalone discussions. Like other user-generated content in the ASDN project, Research Discussions will be taggable, thereby allowing them to be included in system-wide searches and content interconnectivity. We anticipate that Research Discussions will focus on a wide variety of issues: interpreting data (such as the meaning of certain ethnonyms when written in a variety of languages and with changing meanings over time and place, the way in which slaves’ ages were determined, the rootedness of specific African names to a specific region, etc.), research strategies using various source documents, what analysis of the data reveals (and sharing of calculations that can be replicated and checked), and approaches to teaching the next generation of scholars about Atlantic slavery and analysis of large-scale datasets in the humanities.
Knowledge Collections. ASDN users will have the ability to create Knowledge Collections, to which any unit of content (Threads, Research Notes, and Research Discussions) can be added. These Knowledge Collections can then be contextualized by the collection’s “curator,” and shared with other ASDN users. Knowledge Collections will allow combination (and recombination) of knowledge, producing new insight into a particular topic.
Research Connections. The ASDN system will create connections between scholars who are working in the same intellectual space (or complementary intellectual spaces) but who, because they are in different sub-disciplines, might never become aware of each other’s work. Such scholars do not publish in the same journals or attend the same conferences; essentially, they don’t travel within the same academic “circles.” These connections will be displayed as recommendations, and will be generated automatically by the system (based on tags created by users to describe their areas of research) and by the users themselves.
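As one illustration of how tag-based recommendations of this kind might be computed, the following minimal sketch scores pairs of scholars by the overlap of their tag sets (Jaccard similarity). The scoring method, the threshold, the function names, and the sample scholars and tags are all assumptions made for illustration; the proposal does not specify how the system will generate recommendations.

```python
# Illustrative sketch only: scholar IDs, tags, and the 0.2 threshold are hypothetical.

def tag_overlap(a: set, b: set) -> float:
    """Jaccard similarity: shared tags divided by all distinct tags."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(scholar: str, tags_by_scholar: dict, min_score: float = 0.2) -> list:
    """Return (other_scholar, score) pairs at or above min_score, best first."""
    own_tags = tags_by_scholar[scholar]
    scored = [
        (other, tag_overlap(own_tags, tags))
        for other, tags in tags_by_scholar.items()
        if other != scholar
    ]
    kept = [pair for pair in scored if pair[1] >= min_score]
    # Sort by descending score, then by name so the order is stable.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical research-area tags entered by three users.
TAGS = {
    "scholar_a": {"ethnonyms", "louisiana", "inventories"},
    "scholar_b": {"ethnonyms", "maranhao", "inventories"},
    "scholar_c": {"runaway ads", "jamaica"},
}

print(recommend("scholar_a", TAGS))
```

In this sketch, two scholars working on different regions are still connected through the tags they share, which is the kind of cross-sub-discipline link the feature is meant to surface.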
In sum, the project will make tremendous amounts of data available to scholars, students and general audiences throughout the world by using the latest best practices in digital technology and social knowledge networking to study, organize, and make new connections among previously inaccessible data. This ASDN project also will lay the basis for expanding innovative tools in comparative, international humanities; increase international collaboration among scholars, especially those with limited funds for advanced technology and travel; and promote global understanding and consciousness.
F. Selection Criteria for Datasets and Designating Database Fields
We face a number of methodological and practical issues with this project that we have begun to address in meetings with scholars engaged in the study of Atlantic slavery.
Selection Criteria: The ASDN repository and digital tools will be available for datasets whose records describe named individual slaves. Datasets collected by scholars will be reviewed by a rotating board of experts, who will assess the quality of the data before the datasets are posted. Datasets from throughout the Atlantic World and from diverse source documents will be welcome.
Descriptive Fields in the Database: Hawthorne and Hall, other scholars contributing pilot datasets, and the Advisory Board will develop a comprehensive list of fields to be included in the ASDN digital repository schema. In a meeting in June 2010 at the Harriet Tubman Institute for the Study of Migration of African Peoples at York University with faculty (including ASDN Advisory Board member Paul Lovejoy and historian of Angola José Curto) and graduate students, we began this discussion. Fields need to reflect the full range of data that are available in different types of source documents, as well as bibliographical and retrieval information about each record from the sources themselves. Both accuracy of transcription from sources and standardization are essential for a database that can be used for calculations and comparison across datasets. Standardized lists of values for particular characteristics of slaves can be quite extensive, as can be seen from the lists of values for illnesses and skills in Hall’s dataset that appear in Appendix B. One clear example of the need for standardization is ethnonyms, the spelling and meaning of which vary greatly from document to document. For example, in documents Hawthorne has examined, the ethnic group Balanta is often written Balante, Balant, Ballanta, and Ballandra. “Nago” in one era may appear as “Yoruba” in another. Spellings of place names also differ, as do the ways in which professions are characterized. Addressing these issues will require sets of “documentary fields” and “imputed fields.” “Documentary fields” will contain the exact spellings of words as they appear in the historical sources that are consulted. “Imputed fields” will contain an expanding list of “standardized spellings.” Hence, for a slave named Joze who is recorded as Ballanta in a given record, the “documentary ethnicity” would be entered as “Ballanta” and the “imputed ethnicity” as “Balanta,” the standardized spelling.
A notes field will allow for explanations of how an imputed entry was derived. The challenges of both creating standardized values for descriptive data and deriving imputed data are hardly new; they have been addressed by the TSTD2 project, although for a very different set of data fields.6 Also, seminal exchanges about African ethnicities in Louisiana took place on the H-Africa discussion list during the 1990s.7 The Discussion Logs of H-Africa messages since March 1995 are available at: http://www.h-net.org/~africa/. The ASDN project will benefit from the experience of the TSTD2 and will further the debate about these issues in the ASDN scholarly network.
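The pairing of documentary, imputed, and notes fields described above can be sketched as a small record structure. This is a minimal illustration under stated assumptions, not the ASDN schema: the class name, field names, and lookup table are hypothetical, and the only variant spellings included are the ones cited above for Balanta. A real list of standardized spellings would be curated by the project's scholars and would grow over time.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from lowercased variant spellings to a standardized spelling.
STANDARD_ETHNONYMS = {
    "balanta": "Balanta",
    "balante": "Balanta",
    "balant": "Balanta",
    "ballanta": "Balanta",
    "ballandra": "Balanta",
}


@dataclass
class EthnicityEntry:
    documentary: str        # exact spelling as it appears in the source
    imputed: Optional[str]  # standardized spelling, or None if unresolved
    notes: str = ""         # how the imputed value was derived


def impute_ethnicity(documentary: str) -> EthnicityEntry:
    """Keep the source spelling untouched and attach a standardized one if known."""
    cleaned = documentary.strip()
    imputed = STANDARD_ETHNONYMS.get(cleaned.lower())
    if imputed is None:
        notes = "No standardized spelling on file; left for editorial review."
    elif imputed == cleaned:
        notes = "Documentary spelling already matches the standardized spelling."
    else:
        notes = f"Standardized from documentary spelling '{cleaned}'."
    return EthnicityEntry(documentary=documentary, imputed=imputed, notes=notes)


entry = impute_ethnicity("Ballanta")
print(entry.documentary, "->", entry.imputed, "|", entry.notes)
```

Keeping the documentary value unchanged means calculations can be run on the imputed field while the original transcription remains available for checking. Cases such as “Nago” and “Yoruba,” where meaning shifts over time, are deliberately left out of the lookup here because they call for scholarly judgment rather than a spelling table.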
G. Providing Access to Historical Objects
While this project is primarily aimed at preserving and providing access to information about Atlantic slaves and slavery codified in a database, the site will enable scholars to display documents and images associated with their datasets. We recognize the value of digitizing large collections of documents and having them made accessible online. However, our task as historians is more than to preserve digital images of primary sources. The overarching purpose of this project is to enable scholars to collaborate in providing access to data from multiple sources and interpreting them. Scholars working at archives and libraries where digitizing is possible will be able to upload digital objects with appropriate descriptive metadata and associate these objects with the records containing data derived from them. This will allow other scholars to observe directly their practices in recording data into their dataset and will also preserve the source material itself in the project’s digital repository. (See sample source document and record derived from it in Hall’s LSD in Appendix C.) We will not require all scholars who contribute datasets to digitize the documents from which they have derived data, although providing samples of original documents will be strongly encouraged. We recognize that scholars in many parts of the world do not have access to digital cameras or scanners. Also, implementing best practices for digitizing archival materials requires considerable time and financial resources that are often prohibitive. Moreover, many archives do not allow digitization of materials in their collections. Finally, great collections have already been databased without digital files being made to accompany entries. We will provide best practice guidelines for digitization for scholars who undertake both digitizing and data collection to add to the data network.
These standards will assist particularly another important audience of the project—ambitious new junior scholars, graduate students and undergraduates who are beginning their research and are interested in digital scholarship.
II. HISTORY, SCOPE AND DURATION
A. History of Hall’s and Hawthorne’s database research leading to the ASDN
The Atlantic Slave Data Network is grounded in the research of Principal Investigators Gwendolyn Midlo Hall and Walter Hawthorne and the datasets they have collected. A pioneer in the digital humanities domain, Hall spent a large part of her career constructing the Louisiana Slave Database 1719-1820 (LSD). The LSD was begun in 1984, and results from calculations were first published in the applicant’s book, Africans in Colonial Louisiana: The Development of Afro-Creole Culture in the Eighteenth Century (Baton Rouge, 1992, pb1995), which received nine book prizes. An expanded database was created under an NEH Collaborative Research grant awarded in 1991. Databases and supporting documentation, calculations, and images were first published on a CD.8 This CD publication includes census databases and spreadsheets by Jeffrey and Virginia Gould and Paul LaChance, all attendees at the Gulf South Database Group Conference held in January 1993 at the Historic New Orleans Collection under the original NEH Collaborative Research contract. The conference was attended by 25 scholars from the United States and Canada, including Jane I. Landers and Patrick Manning. The Louisiana Slave Database was created primarily as a tool for historical research. But it took on a life of its own and attracted a great deal of attention from the media and the wider public, both in the United States and abroad. This can be explained by its innovative methodology as well as the hunger for concrete knowledge about slaves.
On Sunday, July 30, 2000, the New York Times published a front-page story about the LSD (David Firestone, “Louisiana Slaves Lose their Anonymity,” http://www.nytimes.com/library/national/073000la-slaves.html). In November 2001, a website with a user-friendly search engine was mounted by ibiblio at the University of North Carolina (http://www.ibiblio.org/laslave), but it omits a few important fields and does not include manumission records in the search engine. The databases and supporting files can be downloaded from this web site free of charge.
Hall’s database has received positive reviews in the United States and as far away as Senegal and has seen increasing use among various audiences.9 It has been incorporated into the www.ancestry.com search engine, as well. Calculations were used in the applicant’s most recent book, Slavery and African Ethnicities in the Americas: Restoring the Links (Chapel Hill, 2005, pb2007) as well as in other publications. Andres Perez y Mena wrote a lesson plan on the LSD for high school students as a fellowship project for Columbia Teachers College. Hall also demonstrated how to use her database to Advanced Placement high school history students in a session that was broadcast throughout high schools in Central New Jersey.
Nine years after the Louisiana Slave Database was put online, user statistics provided by ibiblio document that it is still being widely used. The site received an average of 1,677 hits per day during the 11 months up to June 30, 2010. The most recent monthly statistics reported a total of 21,355 page views, including 13,000 views of the search page and 93 users who went to the webpage where datasets or dataset explanatory codesheets can be downloaded. Ibiblio does not have the resources to add records to the database or create the broader network described in this proposal.
In July 2010, the LSD was named by Family Tree Magazine as one of the 101 Best Websites of 2010 for genealogical research.10 Hall regularly receives thanks from people who were able to trace their ancestry using her database. One exciting recent example is Lieutenant Commander Michael Nolden Henderson, a retired U.S. Naval Officer and graduate of Xavier University of Louisiana, who, on June 29, 2010, became the first African American in Georgia to be inducted into the National Society of the Sons of the American Revolution, thanks to information he uncovered in the LSD. The story of Henderson's research about his fourth-generation great-grandparents is the subject of an upcoming segment on the PBS series History Detectives. The recognition of this database attests to its value for amateur historians, but the ASDN project, with its easy-to-use tools, will make it more accessible and useful to humanist scholars who heretofore have had to rely on SPSS to analyze the data.
In 2005, Hawthorne assembled his Maranhão Inventories Slave Database (MISD), containing information about almost 8,500 slaves in the Brazilian state of Maranhão (located in the Amazonia region) from 1767-1832. The more recently created MISD is an ideal companion piece to the LSD. Hawthorne recorded data for it when he was in Brazil with funding from a Fulbright Hays Faculty Research Fellowship, and he received an NEH Faculty Fellowship in 2008-09 to analyze the data and write a book manuscript. The resulting book, From Africa to Brazil: Culture, Identity and an Atlantic Slave Trade, 1600 to 1830, will be published by Cambridge University Press in September 2010. Data from LSD and MISD also served as the main evidentiary source for companion articles by Hall and Hawthorne in the February 2010 issue of American Historical Review. Hawthorne plans a second major research trip to Brazil’s other Amazonian state, Pará, to augment the data that make up the MISD.
B. Scope and Duration of the Atlantic Slave Data Network Project
Phase 1: Network Development. We will create schema in a digital repository for the project. The six pilot datasets will be cross-walked and ingested into the repository. To create a database where a network of scholars can incorporate other datasets from a variety of sources and geographical areas, we will undertake the major task of developing a comprehensive list of database fields and protocols for data collection that take into account the work of similar projects and the scholarly debates on the complex topic of data collection about individual slaves. Controlled vocabulary for many fields will also be developed, building on the pilot datasets. Translations of field names and controlled vocabulary for a number of fields will be provided initially in French, Spanish and Portuguese as well as English. Database schema will also be created for digitized source documents and descriptive metadata about them, which will be able to be linked to database records derived from them.
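The cross-walking step described above, in which fields from differently structured pilot datasets are mapped onto one common schema, can be sketched roughly as follows. Every name here is hypothetical: the dataset labels, the source field names, and the target field names are invented for illustration and do not reflect the actual fields of any pilot dataset or of the ASDN schema, which remains to be developed.

```python
# Illustrative sketch only: all dataset labels and field names are hypothetical.

# One mapping per contributed dataset: source field name -> common schema field name.
CROSSWALKS = {
    "dataset_a": {"NAME": "name", "SEX": "gender", "AGE": "age"},
    "dataset_b": {"nome": "name", "sexo": "gender", "idade": "age"},
}


def crosswalk(record: dict, dataset: str) -> dict:
    """Rename a record's fields to the common schema, keeping unmapped fields aside."""
    mapping = CROSSWALKS[dataset]
    result = {"source_dataset": dataset}
    unmapped = {}
    for field_name, value in record.items():
        target = mapping.get(field_name)
        if target is None:
            # Nothing is discarded: fields without a mapping are kept for review.
            unmapped[field_name] = value
        else:
            result[target] = value
    result["unmapped"] = unmapped
    return result


sample = {"nome": "Joze", "idade": 30, "valor": 100}
print(crosswalk(sample, "dataset_b"))
```

Retaining unmapped fields rather than dropping them is one way to let the comprehensive field list grow as new datasets reveal data that the schema does not yet cover.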
During this phase, MATRIX will begin to develop the scholarly knowledge networking applications, working with the PIs and Advisory Board to carefully establish needs and uses for them. Cross-language communications among scholars (and also in developing the database translations) will be facilitated because our Principal Investigators are fluent in French, Spanish and Portuguese.
Phase 2: Network Integration. Additional scholars will be invited to join the data network and encouraged to contribute slave data. Development of the scholarly network platform will continue as use, needed revisions, and enhancements dictate. The work of finding and collecting slave data can be painstaking and difficult, and years of work often lead to only small returns. By establishing a data network, the smaller sets can be enhanced by being made accessible along with the work of others. By creating a platform of sharing and exchange, scholars can be encouraged to improve methods, follow best practices, and help new scholars into the field. During this phase, development of digital tools will begin for visualizing data, including mapping movements of Africans of various ethnicities and their descendants over both time and space.
Phase 3: Network Sourcing. Beyond contributing datasets, the network of scholars will be encouraged to submit sample documents, photographs, and lesson plans, as well as short explanations of how they collected their data (problems and solutions). This material will not only enhance the site for teaching and learning but also help to build a stronger network as scholars begin to see the ASDN as a place to turn to for scholarly expertise and information sharing.
To facilitate this work, project researchers will develop sample collections of original historical documents and photographs that complement the datasets. Some model lesson plans will be made available for teachers.
Future Plans: In sum, during the grant period we will produce a flexible, extensible database, the records of which will be based on individual slaves. Additional participating scholars will be able to enter new records into the digital repository, keeping their data private for a period of time, making use of advanced analytical tools, and later releasing their data to the public. Beyond the grant period, we envision continuing to demonstrate and disseminate information about the Atlantic Slave Data Network at conferences and in professional publications, to train new scholars in the data network’s capabilities, and to invite them to participate. We also envision developing the data network further as a teaching tool by working with teachers to create course plans around specific sets of data as they become available. We will further explore ways to integrate this data network with other existing online slave datasets that address issues surrounding slavery and the Atlantic slave trade. The PIs and Advisory Board will listen closely to scholars’ input into the Research Discussions and will track use of all features of the website. Further development, based on user feedback, will include enhancing interface and tools usability; expanding to other languages; engaging with a continually expanding, interdisciplinary community of scholars; and growing the number of datasets.
III. METHODOLOGY AND STANDARDS
KORA digital repository: MATRIX's open source digital repository application will be used to store access copies of the project's digital content – both datasets and digitized historical objects – and display them online. KORA’s architecture is unique in that it can accommodate any set of metadata schema in individualized digital datasets. Project staff can easily create metadata elements using a simple point-and-click interface, select the type of form control for each element (e.g., required formats for date, URL, audio file upload, etc.), and then determine whether the element is required for each record and whether it should appear in database search results and in the advanced search feature. KORA automatically generates storage structures, ingestion forms, and validation requirements for each metadata scheme.
KORA, developed with funding from the NSF Digital Library Initiative, IMLS, and NEH, has three main strengths in relation to other digital repository packages. First, designed for small and medium-sized institutions with limited technological resources, KORA greatly facilitates ingestion. The modular programming, accessible from any Internet connection and browser platform, allows users to master the technology using a project-specific training manual.
A MATRIX team, including experts in electronic records archiving, libraries, and programming, undertook to rewrite and improve the KORA application in 2008-09 with an emphasis on best practices in digital preservation. In keeping with the need to ensure authenticity and integrity of files ingested into KORA, as described in the International Research on Permanent Authentic Records in Electronic Systems (InterPARES) guidelines, automatic fixity checking has been built into the new version of KORA to verify that data has been kept free of tampering and corruption. Long-term access to digital material can be assured by storing this preservation information in the digital repository, as described by the ISO Reference Model for an Open Archival Information System (OAIS) model and Preservation Metadata: Implementation Strategies (PREMIS).
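To make the idea of fixity checking concrete, the following is a minimal sketch and not KORA's actual implementation. The function names and the choice of SHA-256 as the checksum algorithm are assumptions made for illustration. A checksum is recorded when a file is ingested, and the same calculation is repeated later; any difference indicates that the file has been altered or corrupted.

```python
import hashlib


def compute_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded at ingest."""
    return compute_checksum(path) == recorded_checksum
```

In practice the recorded checksum would be stored with the record's preservation metadata, and the verification would be run on a regular schedule.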
Interoperability: Using best practices in digital archiving, the digital repository, KORA, has been optimized for interoperability. The Biographies will comply with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard for sharing metadata so metadata about digital objects may be harvested and aggregated for use by other repositories. This work will build upon KORA’s already-implemented MySQL and XML platform-agnostic data design in a manner that allows extensible interoperability between repositories.
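As an illustration of what OAI-PMH harvesting looks like from the harvester's side, the sketch below builds a ListRecords request. The verb, the metadataPrefix argument, and the oai_dc prefix are defined by the protocol itself; the function name and the base URL used in the example are hypothetical, and nothing here describes how KORA exposes its endpoint.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting of one set within the repository.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only to show the shape of the request.
example_url = build_list_records_url("https://repository.example.org/oai")
```

A harvesting repository would issue this request over HTTP and parse the XML response, following resumption tokens when the result list is split across several responses.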
Digital Preservation Strategies: The KORA digital repository system incorporates several digital preservation strategies, particularly as regards the generation and regular checking of fixity information on ingested files and the ability to add technical metadata to KORA records (as discussed above). The files stored in the KORA system on MATRIX servers will be access copies.
Preservation copies of the files will be stored on archival quality tape and kept in MATRIX's climate-controlled digital laboratory. These tapes will be sampled annually to ensure readability, and the data will be refreshed to new media every five years. To guard against file format obsolescence, format (data) migration plans that preserve the significant properties of the digital objects will be specified and developed for both access and preservation files.
Establishing comprehensive fields for records about slaves: To create a collaborative site where large quantities of data from multiple datasets about individual slaves will be made accessible, it is necessary to create a list of fields to be used in the digital repository scheme that will be at once as comprehensive as possible and expandable. This requires consultation with experts in the field and careful review of the history of datasets about individual slaves. Consultation will begin with the distinguished group of historians from the United States, Senegal, Brazil, Trinidad, Canada, and Cuba (teaching in the United Kingdom) who have agreed to serve on the Advisory Board. An expanding scholarly community will be brought into this process via the social knowledge networking on the website.
The six pilot datasets provide good initial diversity of geographical location (Louisiana, South Carolina, Alabama, Florida, and Brazil), languages (English, French, Spanish, and Portuguese), and source documents that are needed to begin development of fields and controlled vocabularies. Other datasets known to the PIs and members of the Advisory Board also will be consulted. The first pilot dataset, Hall’s Louisiana Slave Database, is unusually extensive, having been created over a period of 15 years and by consulting 15 types of sources (see Table on page 3). The list of its fields, shown in Appendix A, provides an extremely rich starting point. Hall also developed extensive controlled vocabularies for many of these fields. Appendix B gives samples of terms for the fields concerning Sickness (99 terms) and Skills (159 terms).
In addition to the expertise brought by the historians on the project, MATRIX has valuable experience creating and documenting comprehensive fields and controlled vocabulary for a digital humanities project. The Quilt Index, a project of MATRIX, the Michigan State University Museum, and The Alliance for the American Quilt, has worked since 2000 to create what has become a definitive list of fields in quilt research (including more than 100 fields for describing the appearance and composition of a quilt) and a massive digital repository of quilt images and descriptive metadata. Quilt Index project staff presented a paper about the Quilt Index planning process, and particularly the development of comprehensive fields, at the Museums and the Web conference in 2004.11 This project was funded by an NEH planning grant (2000) and two Preservation and Access of Humanities Collections grants (2001-2004 and 2006-2009).12
KORA Descriptive Metadata Schema: KORA is extraordinarily flexible, allowing for the addition of new metadata elements as well as new search and display tools. To support search and retrieval across all datasets and records in the repository and to support metadata harvesting and interoperability, MATRIX researchers have developed a Dublin Core scheme (dcKORA) for use with KORA projects that will be used for schema that hold digitized historical objects including sample source documents associated with records in datasets. Recommendations for this core KORA metadata scheme are based on Dublin Core and the Dublin Core Metadata Best Practices Version 2.1.1 document (September 2006) by the Metadata Working Group of the CDP (formerly known as Colorado Digitization Project, a partnership of the Colorado Historical Society and Colorado State Library).13 CDP is a large digital library project that brings together materials from archives, libraries and small historical centers into a unified repository. MATRIX researchers adapted the CDP’s metadata best practices to meet MATRIX’s needs by incorporating suggestions from other metadata standards (e.g., the PBCore used by public broadcasting entities).
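The sketch below suggests how descriptive metadata for a digitized source document might be expressed in simple Dublin Core. It is an illustration only: it uses the standard Dublin Core element namespace, but the wrapper element, the function name, and the sample values are assumptions, and it does not reproduce the dcKORA scheme itself.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the fifteen simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Serialize a dict of Dublin Core element names and values as XML.

    The enclosing <record> element is a hypothetical wrapper for this sketch.
    """
    root = ET.Element("record")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented sample values for a digitized source document.
sample_xml = to_dublin_core({"title": "Estate inventory (sample)", "language": "fr"})
```

Keeping to the common Dublin Core elements is what allows records of this kind to be searched together and shared with other repositories.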
Social Knowledge Networking: During the design and development of the social knowledge networking portions of the ASDN project, MATRIX will employ a measured, user-centered design approach. Direct input from members of the Advisory Board as well as key members of the user community will be solicited at the beginning of the design process in order to determine the system’s optimum usability. In addition, formative design (both visual and interaction) of the system will be carried out using a parallel design model. The main design and development of the system will be done using an iterative design model. This will require conducting a series of usability tests (both formal and informal) at key stages during the development process, the results of which will be used to inform changes to the system’s overall design (both visual and interaction) and usability.
Information Security: MATRIX runs its operations, including its KORA software system, on several servers kept in a climate-controlled, physically secured room; these servers run the Debian distribution of Linux. Incremental tape backups of the data stored on the servers are performed daily using the NetVault backup software application, with a full backup performed on a weekly basis. Tapes are taken offsite to another facility on the MSU campus and exchanged for the tapes stored there the previous week. Backup tapes cycle through the system approximately every six weeks and are replaced as needed. The MATRIX systems administrator keeps a wiki-based log of all tape backups. A Redundant Array of Independent Disks (RAID) divides and replicates data among multiple hard disk drives for better system reliability.
MATRIX has also adopted two offsite storage plans to better protect and ensure the continued availability of its server-based data. First, an additional full set of backup tapes is created every four months and logged into the MATRIX tape backup wiki. Through an arrangement with the Michigan State University Archives, MATRIX plans to store these tapes in a secure, climate-controlled storage facility in nearby Lansing, Michigan. These long-term backup tapes are kept on a three-year retention schedule.
In addition to maintaining the long-term backup tapes at the Lansing storage facility, MATRIX has a reciprocal storage arrangement with the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan, Ann Arbor, 60 miles from the MSU campus. On a daily basis, ICPSR uses rsync software to synchronize and copy MATRIX data into “dark” storage—that is, storage that cannot be accessed by general users—and MATRIX provides the same service for ICPSR.
Sustainability: Michigan State University is committed to maintaining the Biographies repository, network and its related applications in perpetuity. The KORA digital repository application is used for all MATRIX digital library projects, and MATRIX has invested and will continue to invest in its development, with institutional commitment from Michigan State University.
IV. WORK PLAN
The schedule of work for the three-year ASDN project, from May 1, 2011 to April 30, 2014, is spelled out in six-month phases. (Dissemination activities appear in section VI rather than in the Work Plan.)
Phase I (May – October 2011)
Database: PIs and Advisory Board will develop plan for creating set of comprehensive fields and controlled vocabularies for ASDN database. Obtain pilot datasets and record and compare field names and vocabulary for data entry in fields. Design schema in KORA digital repository application for dataset records of individual slaves and digital source documents.
Digital Tools: PIs, Advisory Board, and MATRIX decide upon and document desired functionality of digital tools for making calculations from datasets and map visualization platform.
Networking: Establish networking system among PIs and the Advisory Board and archive content for possible use in Research Discussions when ASDN website goes live (on such topics as comprehensive set of data fields and functionality of digital tools useful to humanities scholars). Begin developing networking features; solicit direct input from members from the Advisory Board as well as key members of the user community at the beginning of the design process in order to determine the system’s optimum usability.
Phase II (November 2011 – April 2012)
Database: Migrate first two pilot datasets from PIs Hall and Hawthorne into the ASDN project in KORA. Begin to draft data guidebook of fields and controlled vocabulary to share with contributors of other pilot datasets to use for data entry into the KORA schema. Establish editorial committee and policies and procedures for accepting datasets.
Digital Tools: MATRIX begins design of tools for making calculations and displaying calculation results.
Networking: Design and development of social knowledge networking features (initial efforts following parallel design model).
Website: Design beta ASDN website. Write specifications for outputting datasets from KORA to the ASDN website.
Phase III (May – October 2012)
Database: Migrate third and fourth pilot datasets into KORA. Extend fields and controlled vocabulary as necessary for additional pilot datasets. Update data guidebook and make it available online in English for online users of datasets. Translate ASDN data ingestion pages into French, Spanish and Portuguese.
Digital Tools: Implement search and browse functions and downloading for datasets from KORA. User testing of beta calculations tools leading to possible programming improvements.
Networking: Announce ASDN website and the opportunity for scholars to add datasets (initially for private use, then to be made public) in scholarly discussion lists and professional associations. Begin use of Threads, Research Notes, and Research Discussions on the website. Establish user statistics reporting.
Website: Go live with website with first two pilot collections.
Phase IV (November 2012 – April 2013)
Database: Migrate fifth and sixth pilot datasets into KORA, making any necessary extensions to data fields and controlled vocabularies. Translate data guidebook into French, Spanish and Portuguese.
Digital Tools: Go live with calculation tools. Develop map visualization platform.
Networking: Add Knowledge Collections function to the site.
Website: Make third and fourth datasets live on the website, along with display of digital objects associated with records of datasets for which digital objects are available. Implement private space in website for dataset contributors who wish to work with their data before it is made public.
Phase V (May – October 2013)
Database: Accept first contributed datasets beyond the six pilots.
Digital Tools: Go live with map visualization platform. Create video tutorial and written instructions in English for using calculation functions. Design time scrubber for visualizing movement over time. Conduct user testing of calculation and visualization tools.
Networking: Add Research Connections function to the site.
Website: Make fifth and sixth datasets live on the website. Make data guidebook available online in French, Spanish and Portuguese.
Phase VI (November 2013 – April 2014)
Database: Make third and fourth datasets live. Migrate fifth and sixth pilot datasets into KORA.
Digital Tools: Go live with time scrubber tool. Survey users about digital tools.
Networking: Survey users about social knowledge networking features and post results of surveys to Research Discussion.
Website: Make available online instructions for all digital tools in all four languages. Add sample lesson plans for high school and undergraduate students.
V. STAFF
Principal Investigators
Walter Hawthorne, Professor of History and Chair of the MSU Department of History (as of August 15), will serve as Principal Investigator (PI) and Project Director, devoting 20% of his time to the project. As described in the History, Scope, and Duration section, Hawthorne has considerable experience with quantitative analysis of data about slavery from primary sources in Brazil.
Gwendolyn Midlo Hall, Adjunct Professor in the MSU Department of History, also will serve as Principal Investigator, devoting 15% of her time to the project. Hall brings a wealth of experience not only with compiling her large-scale database but also with the project’s varied audiences, including historians, linguists, anthropologists, creolists and genealogists interested in discovering African Americans’ roots in the African continent.
Hawthorne and Hall will jointly be responsible for the design of the database and consultation with the Advisory Board to develop a comprehensive set of fields and controlled vocabulary for many fields to create a database that will expand to meet the needs of a wide network of scholars. Hawthorne will meet regularly with MATRIX staff, and he and Hall will provide input to them about the digital tools and social networking features to be designed for the project to meet needs of humanities scholars for analyzing large-scale datasets and networking for collaborative research. Both Hall and Hawthorne also will reach out to their extensive network of colleagues to contribute datasets to the project and take advantage of the content, tools, and social knowledge networking developed by the ASDN project.
Ethan Watrall is an Assistant Professor at MATRIX: The Center for Humane Arts, Letters, and Social Sciences Online and Assistant Professor in the Department of History and the Department of Telecommunication, Information Studies, and Media. In addition, he is a Principal Investigator in the MSU Communication Technology Lab and the Games for Entertainment and Learning (GEL) Lab. Watrall has written numerous popular press books on user-experience design and standards-based web design and has presented and published academic work in the areas of cultural heritage informatics, digital scholarly practice, digital humanities, and serious games for cultural heritage learning. Watrall will devote 15% of his time to this project and will lead the development of the project’s social knowledge networking features and digital tools and oversee the work on this project done at MATRIX by the programmer, designer, project manager, and students.
Atlantic Slave Data Network Advisory Board
Advisory Board members will play crucial roles in developing the project. First, they will work with PIs Hawthorne and Hall to review and contribute to the list of comprehensive fields needed for the digital repository to reflect data available from diverse sources. Second, several Advisory Board members have agreed to contribute datasets that they have created (as described in the Significance section). Third, they will contribute ideas about tools useful to historians and other scholars for analyzing large quantities of data that will be used for the final tool design and will then test beta versions of digital tools created for the project. Fourth, they will provide input and feedback about social networking applications useful for international discussion and collaboration. Lastly, they will spread the word about the project and bring into the network their colleagues and talented students.
The following distinguished scholars have agreed to serve on the Advisory Board. The research they have done and their engagement in other collaborative projects involving Atlantic slavery are strong evidence of the significant value that this project will add to the field.
Manuel Barcia, Lecturer in Latin American Studies and Deputy Director of the Institute for Colonial and Postcolonial Studies, University of Leeds, is a young Afro-Cuban scholar and author of the outstanding book, Seeds of Insurrection: Domination and Resistance on Western Cuban Plantations, 1808-1848. This book is based heavily on slave testimony in the Conspiracy of the Ladder trials in Cuba, listing slaves’ ethnic designations.
O. Vernon Burton, Distinguished Professor of Southern History and Culture at Coastal Carolina University, was founding director of the Institute for Computing in Humanities, Arts, and Social Science (ICHASS) at the University of Illinois at Urbana-Champaign until his retirement, and currently serves as chair of its Advisory Board. He has agreed to contribute data about slavery in South Carolina to this project.
Manolo Florentino, faculty member in the Department of History at Universidade Federal do Rio de Janeiro, is an outstanding Brazilian scholar who relies heavily on his own databases for his publications. As a member of the TSTD2 Steering Committee, he is one of the four primary editors of TSTD2, responsible mainly for Brazilian and Portuguese voyages.
Linda Heywood, Professor of African History and African-American Studies at Boston University, is author of many edited books and articles. Her latest book, with John K. Thornton, Central Africans, Atlantic Creoles, and the Foundation of the Americas, 1585-1660 (Cambridge University Press, 2007), won the Herskovits prize of the African Studies Association for the best scholarly work from the previous year.
Mark Kornbluh, Dean of College of Arts & Sciences, University of Kentucky, is a digital technologies pioneer. Kornbluh, a historian, is one of the founders of H-Net and was the first director of the MATRIX digital humanities center at Michigan State University. He was crucial in the creation of many on-line history projects including The American Black Journal (http://www.matrix.msu.edu/~abj/) and South Africa: Overcoming Apartheid (http://overcomingaparatheid.msu.edu), among others.
Paul F. LaChance, Invited Professor of History, University of Ottawa, has wide experience in database creation. He serves on the TSTD2 Development Team and is the data specialist and member of the editorial board for the Atlas of the Transatlantic Slave Trade (Yale University Press, forthcoming, November 2010). He is also a widely published historian of slavery and the slave trade. LaChance and Hall have worked together on projects for over 15 years.