What North American Archaeology Needs to Take Advantage of the Digital Data Revolution

Until the last quarter of the twentieth century, archaeology was a data-poor science, and data limitations were its primary weakness. Indeed, almost all early studies on broad topics of general interest—such as the origins of agriculture, urbanism, human impacts on the environment, and civilization—ended with the lament that the findings were preliminary due to the paucity of pertinent data. Starting in the 1960s, several developments—including the passage of the National Historic Preservation Act (NHPA), the National Environmental Policy Act (NEPA), and other legislation and regulations in the United States; the passage of similar statutes in most other industrial nations; and the imposition of cultural heritage safeguards on loans and financing in developing countries—have transformed the discipline. Indeed, the past 50 years have been a golden age for archaeological field and laboratory work, expanding our evidentiary base exponentially. And yet, the result is that we have learned much more about the archaeological record than we have learned about past human behavior. Indeed, one of the most important things we have learned is that a large amount of information, in and of itself, is not sufficient to provide firm answers to the most compelling questions people ask about the past or about human society in the big picture.

This article explores why this is the case and suggests one possible solution. First, we detail those aspects of cultural resource management (CRM) that have been successful and those aspects that have fallen short. Then, we consider the extent to which the discipline's main problem today is not data but data integration. We examine current attempts at data integration in archaeology in North America and Europe and contrast them with those in other fields.
We put forward the precepts and basic framework of a data integration service that might transform archaeological practice so that the data collected through CRM can be used in ways that more closely match the needs of heritage management and archaeological research. We close with a call to action to create a data integration service in the United States. Much of what we suggest could apply to other countries, other regions, and even the entire world.

THE ACCOMPLISHMENTS AND CHALLENGES OF CRM

CRM archaeology has been hugely successful in finding, protecting, and excavating archaeological properties. Extrapolating from the Secretary of the Interior reports to Congress on the Federal Archaeological Program (National Park Service [NPS] 2022) for the period 1985–2012, since passage of the NHPA in 1966, CRM activities have resulted in recording more than a million archaeological sites, conducting more than a million field studies, excavating and analyzing the remains from more than 100,000 sites, and curating more than a billion artifacts and associated records (see Altschul 2016:Table 1). Beyond the numbers is the fact that the NHPA has been renewed and amended multiple times. If anything, the Act's reach has increased. Importantly, the 1992 amendments provided Indian Tribes and Native Hawaiian organizations with an expanded role in decisions on projects that impact ancestral and sacred resources. The law's popularity has been repeatedly demonstrated in public surveys. For example, a poll conducted by Harris Interactive (Ramos and Duganne 2020; see also Ipsos 2018) found that 96% of the public believes that there should be laws to protect archaeological sites, and 80% believe that public funds should be used to this end.

Our perception from reading the relevant literature (e.g., Sebastian and Lipe 2010) and discussing the situation with colleagues is that most archaeologists are satisfied with the documentation of the archaeological record that is being achieved (cf.
Schlanger et al. 2015). Standards for field and laboratory work have improved, and CRM, unlike academia, has a good track record of finishing projects, producing reports, and curating collections and records within reasonable time frames. Data recovery has been so successful as a mitigation practice that it has become the “go-to” method to resolve the adverse effects of projects that will disturb archaeological sites. The reliance on a science-based practice has led some archaeologists and Native Americans to argue that archaeological CRM has become yet another way that the dominant society disenfranchises Indigenous people from their heritage (see Dongoske 2020). Yet, for most, recovering and curating archaeological remains prior to their destruction remains the bedrock tenet of historic preservation.

The area where we believe most archaeologists would not be so sanguine concerns what we have learned about the past from all this documentation. The primary failure of modern CRM practice is not that we dig too much but that we seem to learn less and less that is new with each project. Indeed, over time, there has been a trend in CRM to favor documentation of the archaeological record over analysis and interpretation, saving the latter for someone else to do at some undefined point in the future. CRM remains good at filling in the gaps of regional culture history—the who, what, when, and where of the past—but the practice rarely probes deeper questions of how past societies worked, how they affected the environment, and why they changed as they did over time (cf. Kintigh et al. 2014).

It is not that synthetic studies are lacking in CRM. For the most part, these types of studies include syntheses based on existing literature (Class I overviews and historic contexts) and predictive models.
Class I overviews consist of summaries of all or a very large proportion of published and unpublished reports that are organized into chronological or thematic categories, and that highlight what is known and what is still to be learned for a region. The areas covered can vary from small project areas to vast regions. Some states, for example, have been completely covered by first being divided into regions based on physiography or culture, with each region the subject of a comprehensive overview (see, for example, Altschul and Fairley 1989; Lipe et al. 1999). Historic contexts (NPS 1983) are a second type of synthetic study that gather and organize information about related historic properties around a common theme, place, and time (NPS 1999:6). Many states have developed context statements as aids in determining eligibility for inclusion in the National Register of Historic Places (NRHP). These contexts often compile and synthesize the results of vast numbers of published and unpublished reports on topics as diverse as Paleoindian and Archaic sites in Arizona (Mabry 1998) or social, political, and economic trends in post–World War II Ohio (Sweeten et al. 2010).

Predictive modeling has its roots in settlement pattern studies of the 1950s (e.g., Chang 1967; Willey 1953, 1956). In the early 1970s, logical and quantitative rigor was added to the analysis of why sites are located where they are as typified by the work of the Southwest Anthropological Research Group (Plog and Hill 1971). Multivariate statistical models that correlated site location with environmental variables were introduced to the discipline by Green (1973) in her study of Mayan settlement in Belize. The potential of locational models for CRM was recognized almost immediately, and by the late 1970s, federal agencies and State Historic Preservation Offices began sponsoring what became known as “predictive models” in earnest (see Kohler 1988; Thoms 1988).
In 1988, the Bureau of Land Management provided the first comprehensive primer on archaeological predictive modeling (Judge and Sebastian 1988). Since then, interest in predictive modeling has remained strong, not just in the United States but throughout the world. Furthermore, there have been tremendous advances in modeling with all types of models emerging—correlative, deductive, expert, subsurface, significance—each employing different logic, methods, and goals (see Doelle et al. 2016; Heilen 2020; Verhagen and Whitley 2020). Predictive models are now used to manage archaeological resources for federal installations (e.g., Fort Polk; Anderson and Smith 2003), states (e.g., Minnesota and Washington), and even nations (e.g., the Netherlands; Kamermans and van Leusen 2005). For the most part, these models are used to assess the likelihood that any predefined area will or will not contain an archaeological site. Although they have not eliminated the need for archaeological surveys, predictive models have proven useful for planning purposes, answering such questions as these: What is the likelihood of encountering cultural resources in the Area of Potential Effect and what are appropriate levels of effort to locate and document them? What types of archaeological properties will be found? How important might such resources be to understanding the past? How significant are such resources to descendant communities?

There continues to be a need to compile and synthesize reports for literature reviews and historic contexts as well as to organize, analyze, and display cultural and environmental spatial data in predictive models. Two key activities of these endeavors—data compilation and data integration—are among the most time consuming in CRM. Over the years, data compilation has been made easier with the systemization and digitization of state and agency site files.
But even if one can find and access archaeological reports and data, there remains the problem of integrating data from different sources in ways that allow the combined data set to be used in meaningful ways. Most states and federal agencies have their own data collection methods and forms. Terms vary for such categories as site and feature type, chronological period, and artifact types and function. Enormous amounts of time, effort, and money are required to integrate data (Beebe 2017; Kansa et al. 2020; Kintigh 2013; Kintigh et al. 2018). In CRM, most integration efforts focus on management needs such as NRHP eligibility, property type (e.g., archaeological site, historic building, traditional cultural property, etc.), site size, and level of disturbance. Although these variables are critical for resource management, they are generally not sufficient to address larger research questions (Kintigh et al. 2014) or are not of pressing concern to disadvantaged groups (Flewellen et al. 2021; Franklin et al. 2022). Consequently, CRM data remain outside the realm of all but the best-funded grant research (e.g., Kohler and Reese 2014; Mills et al. 2015; Ortman et al. 2007), and they are absent from the public discourse on issues such as climate change (Kohler and Rockman 2020) and human migration (Altschul et al. 2020).

But does it have to be this way? We do not think so and neither do others (Anderson 2018). Below, we develop a vision for a national data integration service that would address many of these issues. Such a service would not only better serve cultural resource management but also enable the discipline to pursue long-standing questions about human society using data and, in the process, contribute to the public debate about our future.

ENVISIONING DATA INTEGRATION IN CRM

We begin by noting that the practice of CRM archaeology contributes to the total stock of human knowledge in two very different ways.
First, it provides information that expands contemporary peoples’ understandings of their heritage. This dimension, which is most closely aligned with the humanities, focuses on translating archaeological traces into accounts of past social and cultural practices and integrating sequences of these into narratives of the past and important events in the histories of specific societies and identities. This effort increasingly (and appropriately) takes place in the context of collaboration with local, Indigenous, and descendant communities (Atalay 2012; Schmidt and Kehoe 2019; Silliman and Ferguson 2010). The role of archaeology for heritage is highlighted in the preamble of the NHPA, which states that “the spirit and direction of the Nation are founded upon and reflected in its historic heritage” and that “the historical and cultural foundations of the Nation should be preserved as a living part of our community life and development in order to give a sense of orientation to the American People” (https://www.achp.gov/sites/default/files/2018-06/nhpa.pdf). The heritage dimension of archaeology has also become increasingly prominent in CRM in the years since the passage of the Native American Graves Protection and Repatriation Act in 1990 and amendments to the NHPA in 1992. Archaeology can and does contribute this type of knowledge at a variety of scales, including the scale of individual CRM projects. So although data integration across projects can develop knowledge of heritage more powerfully than any single project, it is not required for this form of knowledge to accumulate over time.

The situation is different for the second way archaeology contributes to human knowledge: as a source of data for studies of social and cultural processes.
This dimension, which is more closely aligned with the social sciences and with National Register Eligibility Criterion D (see below), was initially articulated by advocates of processual archaeology (Ortman 2019), and it still permeates CRM archaeology today (Altschul 2005). Here, in contrast to heritage, archaeology's contribution to the total stock of human knowledge is most apparent at broad spatial and temporal scales (Perreault 2019). Human societies are fundamentally social networks embedded in physical space through which goods, energy, and information flow. From this perspective, all human societies share a set of fundamental social properties and processes (Lobo et al. 2020), but it is also clear that important aspects of these properties and processes are easier to investigate through direct observation of social behavior. So there is a distinction to be made between aspects of human social behavior that can be inferred using the archaeological record as opposed to aspects that can only be inferred using the archaeological record. The distinction is parallel to that noted by David Sepkoski (2012) concerning paleontology: there are aspects of the processes of biological evolution that can only be learned about using the fossil record, and others that are more easily learned through other means.

From this perspective, what archaeology uniquely contributes is a basis for integrating the outcomes of fundamental social and cultural processes over long time scales and in a greater number and diversity of societies than exist today. It also provides opportunities to examine fundamental properties and processes that are easier to isolate analytically in smaller and simpler (though still complex) systems than is often the case for present-day systems (Ortman et al. 2020).
One implication of these contributions, however, is that the continued accumulation of knowledge related to social and cultural processes depends on data integration to a much greater extent than is the case for the accumulation of knowledge related to heritage.

A key step in managing archaeological resources under the NHPA is determining which resources are eligible for listing in the NRHP. Of the four criteria established by the National Park Service for evaluating historic resources for listing in the NRHP, most archaeological resources are determined eligible under Criterion D—their potential to provide information relevant to history or prehistory. It is important to recognize that when cultural resources are considered one at a time, this sort of information exhibits decreasing returns. To use an example from the well-known Permian Basin Programmatic Agreement, after excavating hundreds of lithic scatters, we learn less and less that is new about the properties of a lithic scatter with each additional excavation (Larralde et al. 2016; Schlanger et al. 2013). What this means is that, as documentation of the archaeological record accumulates, an increasing fraction of the total information is manifest in relationships both among cultural resources and between resources and other aspects of the total physical and cultural environment—not in individual resources themselves. To continue with the lithic scatter example, each additional excavation will not add much to our understanding of the lithic scatter as a resource type, but it can continue to contribute information regarding resource procurement, cultural landscapes, human–environment relationships, and technological change if the new data can be integrated with the results of previous lithic scatter documentations.
As CRM proceeds, the significance of a lithic scatter, or any other type of resource, becomes less inherent in the property itself and more embedded in the relationships among many such properties across broader spatial contexts (Altschul 2005; Douglass et al. 2023). To us, this is the primary reason data integration is crucial for the continued development of CRM.

Existing Data Integration Efforts in Archaeology

The idea of integrating information from many projects into a single research tool is not new, and archaeologists have pursued several strategies in their efforts to achieve it (Table 1). One notable strategy involves a broad-scale compilation of a specific and especially useful data type. Radiocarbon dates are a good example. Several recent projects have shown that one can learn a tremendous amount regarding human demographic processes simply by compiling a very large number of independently dated events from known spatial locations using the “dates as data” approach pioneered by John Rick (Bird et al. 2022; Kelly et al. 2022; Rick 1987; Robinson et al. 2019; Shennan et al. 2013). Radiocarbon dates represent only a very small fraction of the total information collected by archaeologists through field and laboratory work, and they are conceptually quite simple, representing the measurement of a ratio of specific isotopes in an organic sample.
Indeed, this is probably why researchers imagined that it would be feasible and worthwhile to compile radiocarbon dates at a continental scale in the first place.

Table 1. Select Database, Data Archives, and Data Integration Efforts Mentioned in the Text.

| Project Name | Type | Purpose (paraphrased from website) | Primary Spatial Focus | Website |
| --- | --- | --- | --- | --- |
| Archaeological Information System of the Czech Republic (AIS CR) | Data Integration | A tool designed to integrate digital resources on Czech archaeology. | Czech Republic | https://www.aiscr.cz/en/ |
| Archaeology Data Service | Digital Repository | Long-term digital preservation of data entrusted to our care. | United Kingdom | https://archaeologydataservice.ac.uk |
| ARIADNEplus | Digital Integration of Archaeological Repositories | Integration of European archaeological repositories. It is a searchable catalog of online datasets. | Europe | https://ariadne-infrastructure.eu |
| Canadian Archaeological Radiocarbon Database (CARD) | Database | A compilation of radiocarbon measurements, primarily from archaeological sites in North America. | North America | https://www.canadianarchaeology.ca |
| Compiled Tree-Ring Dates from the Southwestern United States | Database (with restricted use) | Tree-ring dates from archaeological sites in New Mexico, Arizona, Colorado, and Utah. | US Southwest | https://core.tdar.org/dataset/399314/compiled-tree-ring-dates-from-the-southwestern-united-states-restricted |
| CyberSW | Data Integration | Merges several existing databases from the US Southwest into one scalable, networked database. | US Southwest | https://cybersw.org |
| Digital Index of North American Archaeology (DINAA) | Digital Index | Aggregates archaeological and historical datasets developed over the past century from numerous sources. | North America | ux.opencontext.org/archaeology-site-data |
| Digital Archaeological Archive of Comparative Slavery (DAACS) | Digital Archive | A Web-based initiative that fosters comparative archaeological research on slavery throughout the Chesapeake Bay area, the Carolinas, and the Caribbean. | Eastern US and Caribbean | https://www.daacs.org |
| Digital Archiving and Networked Services (DANS) | Data Repository | A data station that allows one to deposit and search for data within the field of archaeology. | Netherlands | https://dans.knaw.nl/en/data-stations/archaeology/ |
| Paleoindian Database of the Americas (PIDBA) | Database | Provides locational, attribute, and image data on Paleoindian materials (>ca. 10,000 cal yr BP) from all across the Americas. | North and South America | https://pidba.utk.edu/main.htm |
| Portable Antiquities Scheme (PAS) | Database | Records of archaeological finds discovered by members of the public. | United Kingdom | https://finds.org.uk/database |
| The Digital Archaeological Record (tDAR) | Digital Repository | An international digital repository for the digital records of archaeological investigations. | Worldwide | https://core.tdar.org/ |
| The Role of Culture in Early Expansions of Humans (ROCEEH) Out of Africa Database (ROAD) | Database | Compilation of data within the chronological and geographic range. | Africa, Asia, and Europe | https://www.hadw-bw.de/en/research/research-center/roceeh/digital-resources |

There are many other examples of efforts to compile all examples of a specific type of observation in a single database—tree-ring dates from the US Southwest (Kohler and Bocinsky 2016; Robinson and Cameron 1991), isolated finds (especially coins) from England and Wales, Clovis points from North America (Anderson et al. 2010, 2019), and so forth. What all these efforts share is a focus on a class of observation that is specific and not too abundant, and for which interobserver variation is limited. These sorts of compilations are extremely useful, but they would be even more useful if they were connected to a wider range of information. This is exponentially more difficult than compiling a single class of observation, as we discuss further below.

A second strategy archaeologists have pursued is digital archives.
Examples include general repositories that hold reports and associated project data such as the UK-based Archaeology Data Service (ADS), the US-based Digital Archaeological Record (tDAR), the Dutch Digital Archiving and Networked Services (DANS), and the Archaeological Information System of the Czech Republic (AIS CR). Another set of digital archives focuses on specific subjects, including the Role of Culture in the Early Expansion of Humans (ROCEEH) Out of Africa Database (ROAD) and the Digital Archaeological Archive of Comparative Slavery (DAACS; Galle et al. 2019). There are even archives of archives, such as the ARIADNEplus Portal. These tools focus on making digital databases from many specific projects discoverable and accessible via a search engine. This facilitates the discovery of datasets, but it leaves much of the work of integrating the discovered datasets to the downstream user. Some archaeologists do possess the relevant disciplinary knowledge and technical skills, but leaving integration to individual users means that every effort at data integration will lead to a different database, making reproduction and replication of results almost impossible (National Academies of Sciences, Engineering, and Medicine 2019). It also ensures that researchers from other disciplines who are interested in questions that can be answered with archaeological data will not consider the archaeological evidence, except through close collaboration with archaeologists.

A third strategy is reflected in cultural resource databases that have been developed by state historic preservation offices and some federal land management agencies. These databases contain massive amounts of survey-level information, but they are designed to manage cultural resources at the state or agency level and generally fall short of what is needed for cumulative knowledge production.
For example, data fields that are most important for cultural resource management—including site numbers, locations, resource types, and culture-historical associations—are usually systematically recorded. But many other types of data that would be useful for research (including site areas, artifact assemblage information, and feature inventories) are captured much less systematically. In some systems, there are well-defined fields for storing certain types of information, but the fields are often blank because fieldworkers are not required to collect these data in this format. In others, the same information is tabulated in free-text entry fields. This captures more information but not in a format that can be analyzed quantitatively. In addition, database designs often differ across states and agencies, making it very difficult to integrate anything more than basic identifying information across databases (Halford and Ables 2023). Researchers can request and obtain data extracted from these databases, but policies regarding data access and use vary across jurisdictions. It takes a major effort to transform the data from each database into a format suitable for analysis, much less integrate data across databases.

One successful data integration initiative is the Digital Index of North American Archaeology (DINAA), which is aggregating “site file” records from various state management databases and making them available through an online interface through which one can filter and download records (Anderson et al. 2017, 2019; Kansa et al. 2018, 2020; Wells et al. 2014). Currently, information from more than a million sites distributed over more than 40 states is available through DINAA (Figure 1). The platform is free to use and abides by the strictures of open source and open data projects, understanding and conforming to ethical obligations regarding access to sensitive data (Kansa et al. 2021).
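To make the cross-database problem described above more concrete, the following Python sketch shows one way that site-type terms recorded differently by two sources might be mapped onto a shared vocabulary. Everything in it is hypothetical: the source names (`state_a`, `state_b`), the local terms, the shared terms, and the `harmonize` helper are illustrative assumptions and do not describe how DINAA or any state database actually works.

```python
# Hypothetical crosswalk: (source database, local term) -> shared term.
# None of these names come from a real state or agency system.
CROSSWALK = {
    ("state_a", "lithic scatter"): "lithic_scatter",
    ("state_a", "habitation site"): "habitation",
    ("state_b", "ls"): "lithic_scatter",
    ("state_b", "roomblock"): "habitation",
}


def harmonize(records):
    """Split records into those with a shared term and those left unmapped."""
    harmonized, unmapped = [], []
    for rec in records:
        # Normalize case and whitespace before looking up the local term.
        key = (rec["source"], rec["site_type"].strip().lower())
        shared = CROSSWALK.get(key)
        if shared is None:
            # Keep unmapped records visible so an expert can review them.
            unmapped.append(rec)
        else:
            harmonized.append({**rec, "shared_type": shared})
    return harmonized, unmapped


records = [
    {"source": "state_a", "site_id": "A-001", "site_type": "Lithic Scatter"},
    {"source": "state_b", "site_id": "B-017", "site_type": "LS"},
    {"source": "state_b", "site_id": "B-018", "site_type": "rock art"},
]
harmonized, unmapped = harmonize(records)
```

Returning the unmapped records separately, rather than dropping them, reflects the point made in the text: terms that do not fit the shared vocabulary still require expert judgment.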
Because DINAA aggregates data from various sources, data accuracy and consistency are major hurdles that its developers must confront and overcome. Not surprisingly, there are only a few fields that contain consistent information and for which accuracy can be tested or assumed. These are mostly nominal variables—site types and culture-historical classifications—and as such, they limit the scale of analyses that can be done.

Figure 1. Distribution of DINAA Data Records as of 2022.

A central issue confronted by DINAA, which is common to all cultural resource databases in the United States, is the level of spatial precision available to the user. In many cases, states and agencies are reluctant to share precise spatial information of archaeological site locations for fear that such information will find its way to looters and vandals. Also, representatives of some descendant communities do not want locations of ancestral sites to be known to the public, or even to researchers. DINAA addresses these concerns on a project-by-project basis. Users are directed to site file managers to obtain permission for precise locational data, but it is up to the user to obtain it, and these policies can vary from state to state and from manager to manager.

The DINAA team has achieved some remarkable results using this resource. For example, in 2017, the team published an article highlighting the effect of projected sea-level rise on archaeological sites in the US Southeast (Anderson et al. 2017). Using precise locational data on about 130,000 sites drawn mostly from eight state site files in the Southeast, Anderson and his colleagues demonstrated that tens of thousands of sites were at risk from projected sea-level rise (Figure 2). They correlated site location with elevation to show that a 1 m rise above current sea level will submerge nearly 20,000 known sites, of which more than 1,300 are eligible for listing in the NRHP.
Even this number is low because it only includes recorded sites. The number of submerged sites increases as sea levels rise, reaching an astonishing 32,898 with a 5 m rise in sea level. These results were widely reported, resulting in some states sponsoring further research on this issue (Heilen et al. 2018). But despite these benefits, it is important to acknowledge that the information on which this study was based is not available from DINAA directly. In fact, the DINAA team had to obtain permission to use site location information from the relevant managers for each state involved, and other researchers would need to obtain the same permissions to reuse these data. These administrative burdens clearly limit the effectiveness of DINAA as a data integration service and have a chilling effect on synthetic archaeological research of all types (Robinson et al. 2019).

Figure 2. Distribution of cultural resources potentially affected by rising sea levels along the eastern United States (from Anderson et al. 2017).

Finally, a fourth approach to data integration involves stand-alone databases that address specific research problems. This is the approach taken by cyberSW, a research infrastructure consisting of a database of information for all known multiple habitation sites across the Greater Southwest dating between AD 800 and 1600, and a user interface through which researchers can select and analyze data using online tools or download datasets for offline analysis. One of the strengths of this research platform is the ability to construct demographic profiles for any group of sites, selected spatially or by site attributes. One tool translates the pottery assemblage from each selected site into a probability distribution representing the intensity of occupation (pottery deposition) over time using an approach known as uniform probability density analysis.
Basically, this approach translates each pottery type into a uniform distribution based on its production span, multiplies each distribution by the number of sherds of each type in an assemblage, and then applies Bayes's Theorem to account for sampling error (see Ortman 2016). The second tool allocates the observed rooms at each site in accordance with the posterior summed probability distribution to produce a population history. The results can be examined site by site or aggregated across all sites in the selection to produce a regional population history of sedentary farmers in the region, and the underlying data and results can also be downloaded for additional analysis. Notably, this tool can be applied across the entirety of the greater US Southwest, thereby enabling demographic studies that transcend traditional culture-historical boundaries.

For example, Figure 3 presents a demographic summary for the San Juan drainage of Colorado, New Mexico, Arizona, and Utah, constructed in cyberSW. The upper panel shows the spatial distribution of all multiple habitations within the San Juan Drainage that are currently in cyberSW, and the lower panel shows the allocation of all rooms in these sites to 50-year time slices based on their associated pottery assemblages (or a simple logistic growth model if no pottery assemblage is available). Although this analysis is incomplete in that single habitations are excluded, it does represent the population that was living in aggregated settlements over an eight-century period, integrating pottery data from several different culture-historical units (Tusayan, Mesa Verde, Cibola, Upper San Juan) in a single result. Although previous studies of specific areas within the San Juan drainage have reconstructed dynamic population histories for local areas (e.g., Schwindt et al.
2016), when all the data are integrated, one sees a consistent pattern of population growth across the entire area, at an average annual rate of 0.3%, followed by a sudden depopulation.

Figure 3. Demographic summary for the San Juan drainage, based on the cyberSW dataset as of 2022 (2,542 multiple habitations with occupation between AD 800 and 1600): (a) distribution of sites included in the analysis; (b) allocation of rooms. Both figures are exported directly from cyberSW.

CyberSW is the closest example we know of to an active, large-scale database that brings together information from many different projects in such a way that users can conduct synthetic research on their own. But it is still far from ideal. The cyberSW team has focused on compiling legacy data, but the system for adding new data as it is collected is much less developed. In addition, cyberSW focuses entirely on multiple habitations, which are mostly already known, whereas much CRM work focuses on single habitation and special-use sites, which are much more abundant but only known for surveyed areas. An overall demographic reconstruction tool should include single habitations and incorporate methods for extrapolating from surveyed to unsurveyed areas. Site locations in cyberSW are masked by displacing locations randomly within a 1.6 km diameter annulus centered on the actual location. The effective spatial resolution is adequate for some but not all questions archaeologists typically ask of survey data. Most importantly, and in common with all the other approaches discussed here, the data are made available with limited digestion. The analysis tools developed for the cyberSW platform return results for any data selection, but in practice, users need to know the caveats associated with the data for specific sites and regions to interpret the results appropriately. In other words, the platform does not remove the need for expert professional judgment.
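The uniform probability density approach and the room allocation step described above can be illustrated with a short Python sketch. It is a deliberately simplified reading of the text, not the cyberSW implementation: the Bayesian adjustment for sampling error (see Ortman 2016) is omitted, rooms are split in direct proportion to the summed distribution (an assumption), and the type names, production spans, sherd counts, and function names are all invented for illustration.

```python
def uniform_density(start, end, bin_edges):
    """Fraction of a type's production span that falls in each time bin."""
    span = end - start
    fractions = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        overlap = max(0, min(end, hi) - max(start, lo))
        fractions.append(overlap / span)
    return fractions


def occupation_profile(assemblage, type_spans, bin_edges):
    """Sum the sherd-weighted uniform densities and normalize to 1."""
    summed = [0.0] * (len(bin_edges) - 1)
    for pottery_type, sherds in assemblage.items():
        start, end = type_spans[pottery_type]
        density = uniform_density(start, end, bin_edges)
        for i, d in enumerate(density):
            summed[i] += sherds * d
    total = sum(summed)
    if total == 0:
        raise ValueError("no sherds fall inside the time bins")
    return [v / total for v in summed]


def allocate_rooms(rooms, profile):
    """Split a site's room count across bins in proportion to the profile."""
    return [rooms * p for p in profile]


# Hypothetical inputs: 50-year bins and two invented pottery types.
bin_edges = [800, 850, 900, 950]
type_spans = {"type_a": (800, 900), "type_b": (850, 950)}
assemblage = {"type_a": 10, "type_b": 30}

profile = occupation_profile(assemblage, type_spans, bin_edges)
rooms_by_bin = allocate_rooms(80, profile)
```

With these invented inputs, half of the weighted pottery falls in the AD 850 to 900 bin, so half of the 80 rooms are assigned there. As the text stresses, output of this kind still has to be interpreted with knowledge of each site's data quality.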
Finally, cyberSW has been funded by grants that emphasize development over maintenance, so the long-term sustainability of the platform is by no means assured.

Data Integration in Other Fields

The examples reviewed above illustrate that there has been significant progress with data integration in archaeology over the past few decades. Nevertheless, this review shows that, overall, archaeology still lacks the ability to integrate archaeological data in ways that facilitate synthetic research by individuals who are not experts in the relevant data, and at a level that matches the scale and scope of ongoing data collection by CRM. Our data are distributed among a variety of federal, state, and tribal agencies, and many of them exist only on the computers of individual researchers and companies. And even when tools that improve the discoverability of datasets are created, the work of integrating these into larger datasets suitable for broad-scale research is left to the individual researcher (Heilen and Manney 2023). In other words, there is no system for integrating CRM data in ways that are directly useful for the broader social science research community. As a result, we cannot currently synthesize at broad scales most of the data archaeologists routinely collect. This is not a good recipe for cumulative knowledge production.

The situation is quite different in other social sciences. If someone wants to do regional or national-scale research in economics, geography, demography, or sociology, there is a government agency staffed by large numbers of experts whose job is to translate the raw data collected by that agency into useful datasets that researchers can use. There are standard data products that are released on a consistent schedule, and they have different levels of access depending on the sensitivity of the associated data. One can obtain nonsensitive datasets simply through an internet search.
These agencies basically generate, curate, and provide canonical datasets to the research community in the public interest. Good examples in the United States include the Bureau of Economic Analysis, the United States Census Bureau, the Centers for Disease Control and Prevention, the Environmental Protection Agency, and USA Facts.

There is no US government agency that provides comparable services for cultural properties, including archaeological data. The National Park Service is responsible for the National Register, but translating information from register-eligible sites into data products that are useful for research is not something this agency has done to date. One reason for this reluctance may be concerns over sensitive geographical information, especially site locations. Although it is certainly appropriate to safeguard this information, this should not be our excuse for avoiding data integration. The US Census Bureau, for example, collects and collates a far greater range of much more sensitive information than archaeologists do. To deal with sensitive information, their data products either aggregate data in ways that maintain anonymity or are available only to individuals who go through an appropriate approval process and agree not to divulge sensitive information. The data are still aggregated and maintained, and there are mechanisms and procedures to guard against inappropriate use. We believe archaeology needs something similar.

It is important to point out that government agencies are not the only option, given that private companies are also in the data integration business. Zillow, for example, is a real estate company that estimates the market value of every US property based on public data maintained by county assessors. The company also provides a research product known as ZTRAX, which is free of charge to approved researchers.
This dataset contains everything one would find on the Zillow app, including the location of a property, its square footage and age, its rooms and amenities, and its history of purchases, including the dollar amounts going back to the mid-1990s. This database contains all the basic information that is relevant for research on real estate across the United States. This example demonstrates that government agencies are not the only option for providing the data aggregation services archaeology needs. However, Zillow has recently announced that it is shutting down the ZTRAX program. We suspect this is because it has proven too expensive to maintain. This suggests that it will probably require government support, either in the form of a federal budget allocation or a requirement that developers contribute financially to data integration services through their contracts with CRM companies, for any sort of archaeological data integration service to emerge.

What Might an Archaeology Data Integration Service Look Like?

Below, we engage in a broad visioning exercise to begin imagining what an archaeology data integration service might look like. Most of the details will need to be figured out through collaborative effort and federally funded and/or sanctioned initiatives. Here, we focus on the general characteristics of such a service, setting aside the steps that will be required to flesh out the details for future collaboration, planning, fact finding, and funding.

An effective data integration service for archaeology needs to recognize the varying quality, quantity, and accuracy of archaeological data in legacy collections and ongoing academic and CRM projects. The quality of locational data, for example, was quite poor prior to the advent of global positioning systems (GPS). These data became much better during the adoption of GPS, and they are now reasonably accurate, reliable, and consistent.
Similarly, site maps are quite variable depending on the time allocated to this effort and the quality of the surveyors. Artifact data also vary from quite good (for artifacts that are cleaned and analyzed in a laboratory) to abysmal (for in-field analysis; Heilen and Altschul 2013). Accuracy and reliability, of course, are to be desired, but what is critical is the ability to estimate the error rate for each data category so that the end user can calculate the confidence to place on data served out. In short, for an archaeological data integration service, the perfect need not be the enemy of the good.

In recognition of these issues, an archaeological data integration service will need to have a few basic properties. First, the underlying data organization will need to be built around spatial information, given that this is the only property of every archaeological site that archaeologists can consistently know and record. Second, it will need to work closely with CRM so that providing data in predetermined formats becomes part of the standard CRM workflow. And third, the service will need to address the issues associated with managing sensitive geographical information, distinguishing the collection and compilation of this information from the ways it is served out to the research and preservation and management communities, for whom, and for what purposes.

For this scheme to work, the service will need to work with agencies and CRM companies to rethink the kinds of information archaeologists routinely collect from archaeological sites and the format in which these data are collected. We suspect that culture-historical categories will still be needed, but it will also be important to think more about how to capture the human behavior represented by archaeological sites, features, and artifacts.
More attention may need to be paid to functional and behavioral associations of artifacts, and to measurements of areas and densities of features and remains, than has been typical of documentation practices tailored to culture-historical purposes. Leckman and Heilen (2023) illustrate one such system that calculates these quantities using imposed grid cells.

Given the wide variation in the ways excavation data are organized, we suspect that the best place to start is with the kinds of information typically recorded through surface surveys in the western United States and shovel-testing programs in the eastern United States, with excavation results aggregated to match survey and testing datasets. The building blocks of a useful system, reflected in state site files, are a good starting point, but these databases remain tailored to the needs of management over research, and they typically focus on assigning cultural resources to cultural-historical units and to very basic site type categories. Artifact tabulations that combine culture-historical and functional properties of assemblages are rare, and actual measurements of features within archaeological sites even more so, especially in legacy data. For this reason, the information in these files is not yet adequate as a basis for empirical, data-driven research at the scales that are necessary for archaeology to contribute to knowledge of fundamental social processes. Finally, the system will need to be something that all major stakeholders in archaeology, including agencies and private CRM companies, buy into. Both contributing data to the service and using the resulting data products will need to become part of the standard practice of CRM archaeology.

The most fundamental aspect of the underlying database is that it will have to be based on spatial objects: points, lines, and polygons that have known and accurate spatial coordinates, geometries, and references.
This is crucial because the only realistic way to reliably aggregate archaeological data at different scales (features within sites, sites within project areas, project areas within larger regional units, and all of this with other kinds of geographical and environmental information) is through their locations (McKeague et al. [2020] make similar points). It will also need to focus on aspects of archaeological sites that are amenable to consistent measurement across the intrinsic variation in the archaeological record that occurs across the United States, such as feature counts, dimensions, and functional associations; and artifact tabulations that capture both the time-space and behavioral associations of each object.

Most of the remaining data would then be attributes of these spatial objects, including their dimensions and locations, associated absolute dates when available, and a count and weight of objects from that unit, along with the date range and behavioral associations of each object category (see Holdaway et al. 2019). The logic here is that functional classifications of artifacts are much more consistent across cultures than pottery or projectile point types are. This approach has already been implemented in the cyberSW demography tools discussed earlier, and it has worked remarkably well to integrate the bewildering diversity of pottery classifications used in the US Southwest. There is no reason a similar approach could not be implemented even more broadly. Finally, faunal remains are also well suited to large-scale aggregation in that there is a well-defined taxonomy and procedures for identification that are not tailored to specific culture-historical contexts, and preliminary explorations of faunal data integration have shown great promise (Arbuckle et al. 2014; Kintigh et al. 2018; Neusius et al.
2019; Spielmann and Kintigh 2011).

Given that a data integration service will need to be integrated into standard practice to work, it will require a Web-based data ingest system that is clearly defined and easy to integrate with the data management systems of CRM companies. And it will need to become a standard nationwide repository for basic archaeological data: nothing less than a retrospective census bureau that creates data products from archaeological evidence. Finally, the system will need to include procedures for evaluating data quality and generating standard data products with varying levels of access that are released on a consistent schedule.

An archaeological data integration service like this would be far too big for a grant. To be practical and sustainable, the service will require an organization with a permanent staff that is supported in some way, either through user fees, allocations from professional societies, a federal budget allocation, or other forms of institutional support. The staff will need to maintain, manage, and revise the data ingest system and develop mechanisms for producing standard data products. The tradition in archaeology has been to make relatively undigested data available, leaving the responsibility for dealing with all the caveats to users. This has a chilling effect on the use of archaeological data by nonarchaeologists. If we want more researchers to take our data and results seriously, we need to invest in people who are responsible for making those decisions as part of the process of translating raw archaeological observations into useful datasets, at a variety of levels of data aggregation, just as government agencies do for other social sciences. Then, we need to make the datasets available to nonarchaeologists in ways that do not threaten site preservation.
At this stage, we are unsure if the best setting for an organization like this is a private company, a nonprofit, a university center, an agency office, or some sort of partnership. But we do think it will need to be an organization.

IS A DATA INTEGRATION SERVICE WORTH THE EFFORT?

The vision we have developed here is ambitious, and it is one that archaeology is not yet close to realizing. But we do not think this vision is impractical. DINAA and cyberSW have shown that with relatively small investments of time and money, data integration is possible and within reach. Nowadays, most archaeological data are “born digital.” So long as the basic data are recorded as attributes of spatial objects, there is no reason why we could not integrate all the information archaeologists collect into a single resource from which standard data products can be developed. It will take time to deal with the backlog of existing information, and some legacy data may not be suitable, but many archaeological sites are being rerecorded to a higher standard every year, so this problem can be expected to fade with time once a system is in place. It will also be challenging to settle on approaches for capturing data recorded at different times. Although recording methods have improved over time, older recordings may be preferable for sites that have experienced recent damage, and in other cases, different parts of the same site will have been recorded at different times. Finally, it will be hard to let go of the idea that all the details we can collect from the archaeological record are relevant for research on fundamental social processes. But the advantage of accepting this will be archaeological data products that truly allow empirical, data-driven research across traditional culture-historical boundaries.
We want to emphasize again that such a service will benefit not only researchers but also cultural resource managers, preservationists, and planners, who would gain the ability to manage resources at multiple scales and from a more holistic and integrated landscape perspective.

Still, after tallying the bureaucratic impediments, the data constraints, and the financial hurdles, many readers will surely be questioning whether the effort to build and maintain a data integration service is worth it. For us, the best way to answer this question is to think about those things that will not happen if we do not have a data integration service. Here is a short list:

• Resource Management: The incorporation of cultural resources early in the planning process has been a bedrock of CRM since its founding (e.g., NPS 1983:44717). Planning documents such as literature reviews and planning tools such as predictive models have been mainstays of CRM. Creating models entails a huge investment in time and money to compile, standardize, and organize cultural and environmental data. The actual statistical analysis and interpretation, albeit the stated purpose of the study, is generally a minor component of the budgeted resources and is often not fully realized. Consequently, cultural resources are not often incorporated into land-use planning in a meaningful way, leading to project designs that disturb archaeological sites that might otherwise be avoided. A data integration service would turn this equation on its head, allowing models to be built with the entirety of existing site data, to have greater spatial scope, and to be available in a timely manner for project planning.

• Landscape Management: During the twenty-first century, the United States will witness land disturbance on a scale never seen before.
Climate-induced changes in sea levels, storm surges, forest fires, hurricanes, and flood events will affect hundreds of thousands, if not millions, of sites (Heilen 2020; Hollesen 2022). Land modifications resulting from infrastructure improvement, resource extraction, urban development, and energy production and distribution also will affect sites on scales heretofore unimagined. Treating each event as a separate undertaking under Section 106 of the NHPA will likely overwhelm the regulatory framework (Heilen et al. 2018). Equally important, evaluating resources on a project basis, or even a regional basis, may not provide the proper context for making hard decisions about which archaeological resources to save and which to let go. For some resources, decision-makers need continent-wide information and displays of site types, cultural features (e.g., earthworks, rock art, Civil War camps, slave outbuildings), and artifact types to prioritize management decisions impacting cultural resources. A data integration service would provide such information. Without it, social and cultural components of our heritage will be lost.

• Social and Environmental Justice: Descendant and Indigenous communities have strong attachments to archaeological sites. Whereas federally recognized Indian tribes and Native American communities commonly manage the cultural resources within their reservations, ancestral lands and sites outside reservations are managed by other governmental agencies or by private property owners. Non-Indigenous descendant communities generally do not know the locations of heritage sites and rarely have management control over them. Yet, according to the United Nations (2007), everyone has the right to know their heritage and have a say in how it is managed. A data integration service could go a long way toward meeting these basic human rights and rectifying historical injustices.
The service would provide descendant and Indigenous communities with comprehensive information about the location and contents of heritage resources, information critical to making informed decisions about what information to share and with whom to share it.

• Social Science: Archaeology examines human behavior on spatial and temporal scales that are outside the realm of other social sciences. Long-term trends are routinely analyzed by archaeologists studying issues of migration, sustainability, resilience, urbanization, population dynamics, and technological change, all pressing issues of our time (Altschul et al. 2017; Kintigh et al. 2014). Yet, archaeology is generally left out of the public discourse on these issues. To some extent, this results from our focus on case studies of relatively small regions or systems that are difficult to generalize or to relate to the contemporary situation. When archaeologists have been able to synthesize continental or worldwide data, policy makers and the public take notice (e.g., Xu et al. 2020). But such efforts are rare because the effort to compile and synthesize data at this level is usually beyond individuals or even teams. A data integration service that publishes standardized reports and serves custom datasets at a continental scale would facilitate this research, which the world needs and which only archaeologists and their collaborators can do.

LOOKING AHEAD

A data integration service of the type and nature articulated in this article will only succeed if it is a discipline-wide effort. Federal mandates and funds are required to create and maintain the service. There also needs to be a shift in the culture of archaeology away from individual control of data to an ethic of data sharing. We need to convince descendant and Indigenous communities and the public at large that archaeologists can be trusted and that our research is in the public interest.
To do so, archaeologists will need to come together to develop data collection standards; archaeologists and other stakeholders, particularly descendant and Indigenous communities, will need to work together to establish protocols and procedures to protect sensitive data and provide informed consent to research; and the public and their political representatives will need to be assured that economic development will not be hampered and that private property rights will not be infringed upon. In all of this, we will need our professional societies and agencies to work together as forums through which the archaeological community defines and develops the data integration service and to prioritize the service as they press issues with politicians and agency representatives. In no small way, ours is a call to action to use the data collected on behalf of the American people to better benefit the American people. The forthcoming Airlie House Revisited workshops may provide an arena in which some of these discussions could be initiated.

Advances in Archaeological Practice, Cambridge University Press


Publisher
Cambridge University Press
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press on behalf of Society for American Archaeology
eISSN
2326-3768
DOI
10.1017/aap.2022.42

Abstract

sUntil the last quarter of the twentieth century, archaeology was a data-poor science, and data limitations were its primary weakness. Indeed, almost all early studies on broad topics of general interest—such as the origins of agriculture, urbanism, human impacts on the environment, and civilization—ended with the lament that the findings were preliminary due to the paucity of pertinent data. Starting in the 1960s, several developments—including the passage of the National Historic Preservation Act (NHPA), the National Environmental Policy Act (NEPA), and other legislation and regulations in the United States; the passage of similar statutes in most other industrial nations; and the imposition of cultural heritage safeguards on loans and financing in developing countries—have transformed the discipline. Indeed, the past 50 years have been a golden age for archaeological field and laboratory work, expanding our evidentiary base exponentially. And yet, the result is that we have learned much more about the archaeological record than we have learned about past human behavior. Indeed, one of the most important things we have learned is that a large amount of information, in and of itself, is not sufficient to provide firm answers to the most compelling questions people ask about the past or about human society in the big picture.sThis article explores why this is the case and suggests one possible solution. First, we detail those aspects of cultural resource management (CRM) that have been successful and those aspects that have fallen short. Then, we consider the extent to which the discipline's main problem today is not data but data integration. We examine current attempts at data integration in archaeology in North America and Europe and contrast them with those in other fields. 
We forward the precepts and basic framework of a data integration service that might transform archaeological practice so that the data collected through CRM can be used in ways that more closely match the needs of heritage management and archaeological research. We close with a call to action to create a data integration service in the United States. Much of what we suggest could apply to other countries, other regions, and even the entire world.sTHE ACCOMPLISHMENTS AND CHALLENGES OF CRMsCRM archaeology has been hugely successful in finding, protecting, and excavating archaeological properties. Extrapolating from the Secretary of the Interior reports to Congress on the Federal Archaeological Program (National Park Service [NPS] 2022) for the period 1985–2012, since passage of the NHPA in 1966, CRM activities have resulted in recording more than a million archaeological sites, conducting more than a million field studies, excavating and analyzing the remains from more than 100,000 sites, and curating more than a billion artifacts and associated records (see Altschul 2016:Table 1). Beyond the numbers is the fact that the NHPA has been renewed and amended multiple times. If anything, the Act's reach has increased. Importantly, the 1992 amendments provided Indian Tribes and Native Hawaiian organizations with an expanded role in decisions on projects that impact ancestral and sacred resources. The law's popularity has been repeatedly demonstrated in public surveys. For example, a poll conducted by Harris Interactive (Ramos and Duganne 2020; see also Ipsos 2018) found that 96% of the public believes that there should be laws to protect archaeological sites, and 80% believe that public funds should be used to this end.sOur perception from reading the relevant literature (e.g., Sebastian and Lipe 2010) and discussing the situation with colleagues is that most archaeologists are satisfied with the documentation of the archaeological record that is being achieved (cf. 
Schlanger et al. 2015). Standards for field and laboratory work have improved, and CRM, unlike academia, has a good track record of finishing projects, producing reports, and curating collections and records within reasonable time frames. Data recovery has been so successful as a mitigation practice that it has become the “go-to” method to resolve the adverse effects of projects that will disturb archaeological sites. The reliance on a science-based practice has led some archaeologists and Native Americans to argue that archaeological CRM has become yet another way that the dominant society disenfranchises Indigenous people from their heritage (see Dongoske 2020). Yet, for most, recovering and curating archaeological remains prior to their destruction remains the bedrock tenet of historic preservation.sThe area where we believe most archaeologists would not be so sanguine concerns what we have learned about the past from all this documentation. The primary failure of modern CRM practice is not that we dig too much but that we seem to learn less and less that is new with each project. Indeed, over time, there has been a trend in CRM to favor documentation of the archaeological record over analysis and interpretation, saving the latter for someone else to do at some undefined point in the future. CRM remains good at filling in the gaps of regional culture history—the who, what, when, and where of the past—but the practice rarely probes deeper questions of how past societies worked, how they affected the environment, and why they changed as they did over time (cf. Kintigh et al. 2014).sIt is not that synthetic studies are lacking in CRM. For the most part, these types of studies include syntheses based on existing literature (Class I overviews and historic contexts) and predictive models. 
Class I overviews consist of summaries of all or a very large proportion of published and unpublished reports that are organized into chronological or thematic categories, and that highlight what is known and what is still to be learned for a region. The areas covered can vary from small project areas to vast regions. Some states, for example, have been completely covered by first being divided into regions based on physiography or culture, with each region the subject of a comprehensive overview (see, for example, Altschul and Fairley 1989; Lipe et al. 1999). Historic contexts (NPS 1983) are a second type of synthetic study that gather and organize information about related historic properties around a common theme, place, and time (NPS 1999:6). Many states have developed context statements as aids in determining eligibility for inclusion in the National Register of Historic Places (NRHP). These contexts often compile and synthesize the results of vast numbers of published and unpublished reports on topics as diverse as Paleoindian and Archaic sites in Arizona (Mabry 1998) or social, political, and economic trends in post–World War II Ohio (Sweeten et al. 2010).sPredictive modeling has its roots in settlement pattern studies of the 1950s (e.g., Chang 1967; Willey 1953, 1956). In the early 1970s, logical and quantitative rigor was added to the analysis of why sites are located where they are as typified by the work of the Southwest Anthropological Research Group (Plog and Hill 1971). Multivariate statistical models that correlated site location with environmental variables were introduced to the discipline by Green (1973) in her study of Mayan settlement in Belize. The potential of locational models for CRM was recognized almost immediately, and by the late 1970s, federal agencies and State Historic Preservation Offices began sponsoring what became known as “predictive models” in earnest (see Kohler 1988; Thoms 1988). 
In 1988, the Bureau of Land Management provided the first comprehensive primer on archaeological predictive modeling (Judge and Sebastian 1988). Since then, interest in predictive modeling has remained strong, not just in the United States but throughout the world. Furthermore, there have been tremendous advances in modeling with all types of models emerging—correlative, deductive, expert, subsurface, significance—each employing different logic, methods, and goals (see Doelle et al. 2016; Heilen 2020; Verhagen and Whitley 2020). Predictive models are now used to manage archaeological resources for federal installations (e.g., Fort Polk; Anderson and Smith 2003), states (e.g., Minnesota and Washington), and even nations (e.g., the Netherlands; Kamermans and van Leusen 2005). For the most part, these models are used to assess the likelihood that any predefined area will or will not contain an archaeological site. Although they have not eliminated the need for archaeological surveys, predictive models have proven useful for planning purposes, answering such questions as these: What is the likelihood of encountering cultural resources in the Area of Potential Effect and what are appropriate levels of effort to locate and document them? What types of archaeological properties will be found? How important might such resources be to understanding the past? How significant are such resources to descendant communities?sThere continues to be a need to compile and synthesize reports for literature reviews and historic contexts as well as to organize, analyze, and display cultural and environmental spatial data in predictive models. Two key activities of these endeavors—data compilation and data integration—are among the most time consuming in CRM. Over the years, data compilation has been made easier with the systemization and digitization of state and agency site files. 
But even if one can find and access archaeological reports and data, there remains the problem of integrating data from different sources in ways that allows the combined data set to be used in meaningful ways. Most states and federal agencies have their own data collection methods and forms. Terms vary for such categories as site and feature type, chronological period, and artifact types and function. Enormous amounts of time, effort, and money are required to integrate data (Beebe 2017; Kansa et al. 2020; Kintigh 2013; Kintigh et al. 2018). In CRM, most integration efforts focus on management needs such as NRHP eligibility, property type (e.g., archaeological site, historic building, traditional cultural property, etc.), site size, and level of disturbance. Although these variables are critical for resource management, they are generally not sufficient to address larger research questions (Kintigh et al. 2014) or are not of pressing concern to disadvantaged groups (Flewellen et al. 2021; Franklin et al. 2022). Consequently, CRM data remain outside the realm of all but the best-funded grant research (e.g., Kohler and Reese 2014; Mills et al. 2015; Ortman et al. 2007), and they are absent from the public discourse on issues such as climate change (Kohler and Rockman 2020) and human migration (Altschul et al. 2020).sBut does it have to be this way? We do not think so and neither do others (Anderson 2018). Below, we develop a vision for a national data integration service that would address many of these issues. Such a service would not only better serve cultural resource management but also enable the discipline to pursue long-standing questions about human society using data and, in the process, contribute to the public debate about our future.sENVISIONING DATA INTEGRATION IN CRMsWe begin by noting that the practice of CRM archaeology contributes to the total stock of human knowledge in two very different ways. 
First, it provides information that expands contemporary peoples’ understandings of their heritage. This dimension, which is most closely aligned with the humanities, focuses on translating archaeological traces into accounts of past social and cultural practices and integrating sequences of these into narratives of the past and important events in the histories of specific societies and identities. This effort increasingly (and appropriately) takes place in the context of collaboration with local, Indigenous, and descendant communities (Atalay 2012; Schmidt and Kehoe 2019; Silliman and Ferguson 2010). The role of archaeology for heritage is highlighted in the preamble of the NHPA, which states that “the spirit and direction of the Nation are founded upon and reflected in its historic heritage” and that “the historical and cultural foundations of the Nation should be preserved as a living part of our community life and development in order to give a sense of orientation to the American People” (https://www.achp.gov/sites/default/files/2018-06/nhpa.pdf). The heritage dimension of archaeology has also become increasingly prominent in CRM in the years since the passage of the Native American Graves Protection and Repatriation Act in 1990 and amendments to the NHPA in 1992. Archaeology can and does contribute this type of knowledge at a variety of scales, including the scale of individual CRM projects. So although data integration across projects can develop knowledge of heritage more powerfully than any single project, it is not required for this form of knowledge to accumulate over time.

The situation is different for the second way archaeology contributes to human knowledge: as a source of data for studies of social and cultural processes.
This dimension, which is more closely aligned with the social sciences and with National Register Eligibility Criterion D (see below), was initially articulated by advocates of processual archaeology (Ortman 2019), and it still permeates CRM archaeology today (Altschul 2005). Here, in contrast to heritage, archaeology's contribution to the total stock of human knowledge is most apparent at broad spatial and temporal scales (Perreault 2019). Human societies are fundamentally social networks embedded in physical space through which goods, energy, and information flow. From this perspective, all human societies share a set of fundamental social properties and processes (Lobo et al. 2020), but it is also clear that important aspects of these properties and processes are easier to investigate through direct observation of social behavior. So there is a distinction to be made between aspects of human social behavior that can be inferred using the archaeological record as opposed to aspects that can only be inferred using the archaeological record. The distinction is parallel to that noted by David Sepkoski (2012) concerning paleontology: there are aspects of the processes of biological evolution that can only be learned about using the fossil record, and others that are more easily learned through other means.

From this perspective, what archaeology uniquely contributes is a basis for integrating the outcomes of fundamental social and cultural processes over long time scales and in a greater number and diversity of societies than exist today. It also provides opportunities to examine fundamental properties and processes that are easier to isolate analytically in smaller and simpler (though still complex) systems than is often the case for present-day systems (Ortman et al. 2020).
One implication of these contributions, however, is that the continued accumulation of knowledge related to social and cultural processes depends on data integration to a much greater extent than is the case for the accumulation of knowledge related to heritage.

A key step in managing archaeological resources under the NHPA is determining which resources are eligible for listing in the NRHP. Of the four criteria established by the National Park Service for evaluating historic resources for listing in the NRHP, most archaeological resources are determined eligible under Criterion D—their potential to provide information relevant to history or prehistory. It is important to recognize that when cultural resources are considered one at a time, this sort of information exhibits decreasing returns. To use an example from the well-known Permian Basin Programmatic Agreement, after excavating hundreds of lithic scatters, we learn less and less that is new about the properties of a lithic scatter with each additional excavation (Larralde et al. 2016; Schlanger et al. 2013). What this means is that, as documentation of the archaeological record accumulates, an increasing fraction of the total information is manifest in relationships both among cultural resources and between resources and other aspects of the total physical and cultural environment—not in individual resources themselves. To continue with the lithic scatter example, each additional excavation will not add much to our understanding of the lithic scatter as a resource type, but it can continue to contribute information regarding resource procurement, cultural landscapes, human–environment relationships, and technological change if the new data can be integrated with the results of previous lithic scatter documentations.
As CRM proceeds, the significance of a lithic scatter, or any other type of resource, becomes less inherent in the property itself and more embedded in the relationships among many such properties across broader spatial contexts (Altschul 2005; Douglass et al. 2023). To us, this is the primary reason data integration is crucial for the continued development of CRM.

Existing Data Integration Efforts in Archaeology

The idea of integrating information from many projects into a single research tool is not new, and archaeologists have pursued several strategies in their efforts to achieve it (Table 1). One notable strategy involves a broad-scale compilation of a specific and especially useful data type. Radiocarbon dates are a good example. Several recent projects have shown that one can learn a tremendous amount regarding human demographic processes simply by compiling a very large number of independently dated events from known spatial locations using the “dates as data” approach pioneered by John Rick (Bird et al. 2022; Kelly et al. 2022; Rick 1987; Robinson et al. 2019; Shennan et al. 2013). Radiocarbon dates represent only a very small fraction of the total information collected by archaeologists through field and laboratory work, and they are conceptually quite simple, representing the measurement of a ratio of specific isotopes in an organic sample.
Indeed, this is probably why researchers imagined that it would be feasible and worthwhile to compile radiocarbon dates at a continental scale in the first place.

Table 1. Select Databases, Data Archives, and Data Integration Efforts Mentioned in the Text.

Project Name | Type | Purpose (paraphrased from website) | Primary Spatial Focus | Website
Archaeological Information System of the Czech Republic (AIS CR) | Data Integration | A tool designed to integrate digital resources on Czech archaeology. | Czech Republic | https://www.aiscr.cz/en/
Archaeology Data Service | Digital Repository | Long-term digital preservation of data entrusted to our care. | United Kingdom | https://archaeologydataservice.ac.uk
ARIADNEplus | Digital Integration of Archaeological Repositories | Integration of European archaeological repositories. It is a searchable catalog of online datasets. | Europe | https://ariadne-infrastructure.eu
Canadian Archaeological Radiocarbon Database (CARD) | Database | A compilation of radiocarbon measurements, primarily from archaeological sites in North America. | North America | https://www.canadianarchaeology.ca
Compiled Tree-Ring Dates from the Southwestern United States | Database (with restricted use) | Tree-ring dates from archaeological sites in New Mexico, Arizona, Colorado, and Utah. | US Southwest | https://core.tdar.org/dataset/399314/compiled-tree-ring-dates-from-the-southwestern-united-states-restricted
CyberSW | Data Integration | Merges several existing databases from the US Southwest into one scalable, networked database. | US Southwest | https://cybersw.org
Digital Index of North American Archaeology (DINAA) | Digital Index | Aggregates archaeological and historical datasets developed over the past century from numerous sources. | North America | ux.opencontext.org/archaeology-site-data
Digital Archaeological Archive of Comparative Slavery (DAACS) | Digital Archive | A Web-based initiative that fosters comparative archaeological research on slavery throughout the Chesapeake Bay area, the Carolinas, and the Caribbean. | Eastern US and Caribbean | https://www.daacs.org
Digital Archiving and Networked Services (DANS) | Data Repository | A data station that allows one to deposit and search for data within the field of archaeology. | Netherlands | https://dans.knaw.nl/en/data-stations/archaeology/
Paleoindian Database of the Americas (PIDBA) | Database | Provides locational, attribute, and image data on Paleoindian materials (>ca. 10,000 cal yr BP) from all across the Americas. | North and South America | https://pidba.utk.edu/main.htm
Portable Antiquities Scheme (PAS) | Database | Records of archaeological finds discovered by members of the public. | United Kingdom | https://finds.org.uk/database
The Digital Archaeological Record (tDAR) | Digital Repository | An international digital repository for the digital records of archaeological investigations. | Worldwide | https://core.tdar.org/
The Role of Culture in Early Expansions of Humans (ROCEEH) Out of Africa Database (ROAD) | Database | Compilation of data within the chronological and geographic range. | Africa, Asia, and Europe | https://www.hadw-bw.de/en/research/research-center/roceeh/digital-resources

There are many other examples of efforts to compile all examples of a specific type of observation in a single database—tree-ring dates from the US Southwest (Kohler and Bocinsky 2016; Robinson and Cameron 1991), isolated finds (especially coins) from England and Wales, Clovis points from North America (Anderson et al. 2010, 2019), and so forth. What all these efforts share is a focus on a class of observation that is specific and not too abundant, and for which interobserver variation is limited. These sorts of compilations are extremely useful, but they would be even more useful if they were connected to a wider range of information. This is exponentially more difficult than compiling a single class of observation, as we discuss further below.

A second strategy archaeologists have pursued is digital archives.
Examples include general repositories that hold reports and associated project data such as the UK-based Archaeology Data Service (ADS), the US-based Digital Archaeological Record (tDAR), the Dutch Digital Archiving and Networked Services (DANS), and the Archaeological Information System of the Czech Republic (AIS CR). Another set of digital archives focuses on specific subjects, including the Role of Culture in Early Expansions of Humans (ROCEEH) Out of Africa Database (ROAD) and the Digital Archaeological Archive of Comparative Slavery (DAACS; Galle et al. 2019). There are even archives of archives, such as the ARIADNEplus Portal. These tools focus on making digital databases from many specific projects discoverable and accessible via a search engine. This facilitates the discovery of datasets, but it leaves much of the work of integrating the discovered datasets to the downstream user. Some archaeologists do possess the relevant disciplinary knowledge and technical skills, but it means that every effort at data integration will lead to a different database, making reproduction and replication of results almost impossible (National Academies of Sciences, Engineering, and Medicine 2019). It also ensures that researchers from other disciplines who are interested in questions that can be answered with archaeological data will not consider the archaeological evidence, except through close collaboration with archaeologists.

A third strategy is reflected in cultural resource databases that have been developed by state historic preservation offices and some federal land management agencies. These databases contain massive amounts of survey-level information, but they are designed to manage cultural resources at the state or agency level and generally fall short of what is needed for cumulative knowledge production.
For example, data fields that are most important for cultural resource management—including site numbers, locations, resource types, and culture-historical associations—are usually systematically recorded. But many other types of data that would be useful for research (including site areas, artifact assemblage information, and feature inventories) are captured much less systematically. In some systems, there are well-defined fields for storing certain types of information, but the fields are often blank because fieldworkers are not required to collect these data in this format. In others, the same information is tabulated in free-text entry fields. This captures more information but not in a format that can be analyzed quantitatively. In addition, database designs often differ across states and agencies, making it very difficult to integrate anything more than basic identifying information across databases (Halford and Ables 2023). Researchers can request and obtain data extracted from these databases, but policies regarding data access and use vary across jurisdictions. It takes a major effort to transform the data from each database into a format suitable for analysis, much less integrate data across databases.

One successful data integration initiative is the Digital Index of North American Archaeology (DINAA), which is aggregating “site file” records from various state management databases and making them available through an online interface through which one can filter and download records (Anderson et al. 2017, 2019; Kansa et al. 2018, 2020; Wells et al. 2014). Currently, information from more than a million sites distributed over more than 40 states is available through DINAA (Figure 1). The platform is free to use and abides by the strictures of open source and open data projects, understanding and conforming to ethical obligations regarding access to sensitive data (Kansa et al. 2021).
Because DINAA aggregates data from various sources, data accuracy and consistency are major hurdles that its developers must confront and overcome. Not surprisingly, there are only a few fields that contain consistent information and for which accuracy can be tested or assumed. These are mostly nominal variables—site types and culture-historical classifications—and as such, they limit the scale of analyses that can be done.

Figure 1. Distribution of DINAA Data Records as of 2022.

A central issue confronted by DINAA, which is common to all cultural resource databases in the United States, is the level of spatial precision available to the user. In many cases, states and agencies are reluctant to share precise spatial information of archaeological site locations for fear that such information will find its way to looters and vandals. Also, representatives of some descendant communities do not want locations of ancestral sites to be known to the public, or even to researchers. DINAA addresses these concerns on a project-by-project basis. Users are directed to site file managers to obtain permission for precise locational data, but it is up to the user to obtain it, and these policies can vary from state to state and from manager to manager.

The DINAA team has achieved some remarkable results using this resource. For example, in 2017, the team published an article highlighting the effect of projected sea-level rise on archaeological sites in the US Southeast (Anderson et al. 2017). Using precise locational data on about 130,000 sites drawn mostly from eight state site files in the Southeast, Anderson and his colleagues demonstrated that tens of thousands of sites were at risk from projected sea-level rise (Figure 2). They correlated site location with elevation to show that a 1 m rise above current sea level will submerge nearly 20,000 known sites, of which more than 1,300 are eligible for listing in the NRHP.
Even this number is low because it only includes recorded sites. The number of submerged sites increases as sea levels rise, reaching an astonishing number of 32,898 with a 5 m rise in sea level. These results were widely reported, resulting in some states sponsoring further research on this issue (Heilen et al. 2018). But despite these benefits, it is important to acknowledge that the information on which this study was based is not available from DINAA directly. In fact, the DINAA team had to obtain permission to use site location information from the relevant managers for each state involved, and other researchers would need to obtain the same permissions to reuse these data. These administrative burdens clearly limit the effectiveness of DINAA as a data integration service and have a chilling effect on synthetic archaeological research of all types (Robinson et al. 2019).

Figure 2. Distribution of cultural resources potentially affected by rising sea levels along the eastern United States (from Anderson et al. 2017).

Finally, a fourth approach to data integration involves stand-alone databases that address specific research problems. This is the approach taken by cyberSW, a research infrastructure consisting of a database of information for all known multiple habitation sites across the Greater Southwest dating between AD 800 and 1600, and a user interface through which researchers can select and analyze data using online tools or download datasets for offline analysis. One of the strengths of this research platform is the ability to construct demographic profiles for any group of sites, selected spatially or by site attributes. One tool translates the pottery assemblage from each selected site into a probability distribution representing the intensity of occupation (pottery deposition) over time using an approach known as uniform probability density analysis.
Basically, this approach translates each pottery type into a uniform distribution based on its production span, multiplies each distribution by the number of sherds of each type in an assemblage, and then applies Bayes's Theorem to account for sampling error (see Ortman 2016). The second tool allocates the observed rooms at each site in accordance with the posterior summed probability distribution to produce a population history. The results can be examined site by site or aggregated across all sites in the selection to produce a regional population history of sedentary farmers in the region, and the underlying data and results can also be downloaded for additional analysis. Notably, this tool can be applied across the entirety of the greater US Southwest, thereby enabling demographic studies that transcend traditional culture-historical boundaries.

For example, Figure 3 presents a demographic summary for the San Juan drainage of Colorado, New Mexico, Arizona, and Utah, constructed in cyberSW. The upper panel shows the spatial distribution of all multiple habitations within the San Juan drainage that are currently in cyberSW, and the lower panel shows the allocation of all rooms in these sites to 50-year time slices based on their associated pottery assemblages (or a simple logistic growth model if no pottery assemblage is available). Although this analysis is incomplete in that single habitations are excluded, it does represent the population that was living in aggregated settlements over an eight-century period, integrating pottery data from several different culture-historical units (Tusayan, Mesa Verde, Cibola, Upper San Juan) in a single result. Although previous studies of specific areas within the San Juan drainage have reconstructed dynamic population histories for local areas (e.g., Schwindt et al.
2016), when all the data are integrated, one sees a consistent pattern of population growth across the entire area, at an average rate of 0.3% per year, followed by a sudden depopulation.

Figure 3. Demographic summary for the San Juan drainage, based on the cyberSW dataset as of 2022 (2,542 multiple habitations with occupation between AD 800 and 1600): (a) distribution of sites included in the analysis; (b) allocation of rooms. Both figures are exported directly from cyberSW.

CyberSW is the closest example we know of to an active, large-scale database that brings together information from many different projects in such a way that users can conduct synthetic research on their own. But it is still far from ideal. The cyberSW team has focused on compiling legacy data, but the system for adding new data as they are collected is much less developed. In addition, cyberSW focuses entirely on multiple habitations, which are mostly already known, whereas much CRM work focuses on single habitation and special-use sites, which are much more abundant but only known for surveyed areas. An overall demographic reconstruction tool should include single habitations and incorporate methods for extrapolating from surveyed to unsurveyed areas. Site locations in cyberSW are masked by displacing locations randomly within a 1.6 km diameter annulus centered on the actual location. The effective spatial resolution is adequate for some but not all questions archaeologists typically ask of survey data. Most importantly, and in common with all the other approaches discussed here, the data are made available with limited digestion. The analysis tools developed for the cyberSW platform return results for any data selection, but in practice, users need to know the caveats associated with the data for specific sites and regions to interpret the results appropriately. In other words, the platform does not remove the need for expert professional judgment.
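The pottery-based occupation profile and room allocation described above for the cyberSW demography tools can be pictured with a short sketch. It is only an illustration under stated assumptions, not the cyberSW implementation: the pottery types, production spans, sherd counts, and room count are invented, the function names `occupation_profile` and `allocate_rooms` are hypothetical, the Bayesian correction for sampling error (see Ortman 2016) is left out, and the rule that every room was in use during the peak period is a simplification made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical production spans (start, end, in years AD) for invented pottery types.
SPANS: Dict[str, Tuple[int, int]] = {
    "Type A": (800, 900),
    "Type B": (850, 1000),
    "Type C": (1000, 1100),
}


def occupation_profile(
    sherd_counts: Dict[str, int],
    spans: Dict[str, Tuple[int, int]],
    start: int = 800,
    end: int = 1600,
    step: int = 50,
) -> Tuple[List[int], List[float]]:
    """Spread each type's sherds evenly over its production span, then
    sum across types and normalize so the period values add up to 1."""
    periods = list(range(start, end, step))
    weights = [0.0] * len(periods)
    for pottery_type, count in sherd_counts.items():
        first, last = spans[pottery_type]
        for i, period_start in enumerate(periods):
            period_end = period_start + step
            # Years of the production span that fall inside this period.
            overlap = max(0, min(last, period_end) - max(first, period_start))
            weights[i] += count * overlap / (last - first)
    total = sum(weights)
    if total == 0:
        raise ValueError("no sherds fall inside the study interval")
    return periods, [w / total for w in weights]


def allocate_rooms(room_count: int, profile: List[float]) -> List[float]:
    """Assumption of this sketch: all rooms were in use in the peak period,
    and other periods are scaled in proportion to the profile."""
    peak = max(profile)
    return [room_count * p / peak for p in profile]


# Invented assemblage and room count for a single hypothetical site.
periods, profile = occupation_profile({"Type A": 10, "Type B": 30, "Type C": 20}, SPANS)
rooms = allocate_rooms(40, profile)
for year, share, n_rooms in zip(periods, profile, rooms):
    if share > 0:
        print(f"AD {year}-{year + 50}: share={share:.3f}, rooms={n_rooms:.1f}")
```

Summing the per-site room series across a selection of sites would give a regional curve of the kind shown in Figure 3, with the caveats about data quality noted above still applying.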
Finally, cyberSW has been funded by grants that emphasize development over maintenance, so the long-term sustainability of the platform is by no means assured.

Data Integration in Other Fields

The examples reviewed above illustrate that there has been significant progress with data integration in archaeology over the past few decades. Nevertheless, this review shows that, overall, archaeology still lacks the ability to integrate archaeological data in ways that facilitate synthetic research by individuals who are not experts in the relevant data, and at a level that matches the scale and scope of ongoing data collection by CRM. Our data are distributed among a variety of federal, state, and tribal agencies, and much data exists only on the computers of individual researchers and companies. And even when tools that improve the discoverability of datasets are created, the work of integrating these into larger datasets suitable for broad-scale research is left to the individual researcher (Heilen and Manney 2023). In other words, there is no system for integrating CRM data in ways that are directly useful for the broader social science research community. As a result, we cannot currently synthesize at broad scales most of the data archaeologists routinely collect. This is not a good recipe for cumulative knowledge production.

The situation is quite different in other social sciences. If someone wants to do regional or national-scale research in economics, geography, demography, or sociology, there is a government agency staffed by large numbers of experts whose job is to translate the raw data collected by that agency into useful datasets that researchers can use. There are standard data products that are released on a consistent schedule, and they have different levels of access depending on the sensitivity of the associated data. One can obtain nonsensitive datasets simply through an internet search.
These agencies basically generate, curate, and provide canonical datasets to the research community in the public interest. Good examples in the United States include the Bureau of Economic Analysis, the United States Census Bureau, the Centers for Disease Control and Prevention, the Environmental Protection Agency, and USA Facts.

There is no US government agency that provides comparable services for cultural properties, including archaeological data. The National Park Service is responsible for the National Register, but translating information from register-eligible sites into data products that are useful for research is not something this agency has done to date. One reason for this reluctance may be concerns over sensitive geographical information, especially site locations. Although it is certainly appropriate to safeguard this information, this should not be our excuse for avoiding data integration. The US Census Bureau, for example, collects and collates a far greater range of much more sensitive information than archaeologists do. To deal with sensitive information, their data products either aggregate data in ways that maintain anonymity or are available only to individuals who go through an appropriate approval process and agree not to divulge sensitive information. The data are still aggregated and maintained, and there are mechanisms and procedures to guard against inappropriate use. We believe archaeology needs something similar.

It is important to point out that government agencies are not the only option, given that private companies are also in the data integration business. Zillow, for example, is a real estate company that estimates the market value of every US property based on public data maintained by county assessors. The company also provides a research product known as ZTRAX, which is free of charge to approved researchers.
This dataset contains everything one would find on the Zillow app, including the location of a property, its square footage and age, its rooms and amenities, and its history of purchases, including the dollar amounts going back to the mid-1990s. This database contains all the basic information that is relevant for research on real estate across the United States. This example demonstrates that government agencies are not the only option for providing the data aggregation services archaeology needs. However, Zillow has recently announced that it is shutting down the ZTRAX program. We suspect this is because it has proven too expensive to maintain. This suggests that it will probably require government support, either in the form of a federal budget allocation or a requirement that developers contribute financially to data integration services through their contracts with CRM companies, for any sort of archaeological data integration service to emerge.

What Might an Archaeology Data Integration Service Look Like?

Below, we engage in a broad visioning exercise to begin imagining what an archaeology data integration service might look like. Most of the details will need to be figured out through collaborative effort and federally funded and/or sanctioned initiatives. Here, we focus on the general characteristics of such a service, setting aside the steps that will be required to flesh out the details for future collaboration, planning, fact finding, and funding.

An effective data integration service for archaeology needs to recognize the varying quality, quantity, and accuracy of archaeological data in legacy collections and ongoing academic and CRM projects. The quality of locational data, for example, was quite poor prior to the advent of global positioning systems (GPS). These data became much better during the adoption of GPS, and they are now reasonably accurate, reliable, and consistent.
Similarly, site maps are quite variable depending on the time allocated to this effort and the quality of the surveyors. Artifact data also vary from quite good (for artifacts that are cleaned and analyzed in a laboratory) to abysmal (for in-field analysis; Heilen and Altschul 2013). Accuracy and reliability, of course, are to be desired, but what is critical is the ability to estimate the error rate for each data category so that the end user can calculate the confidence to place on data served out. In short, for an archaeological data integration service, the perfect need not be the enemy of the good.

In recognition of these issues, an archaeological data integration service will need to have a few basic properties. First, the underlying data organization will need to be built around spatial information, given that this is the only property of every archaeological site that archaeologists can consistently know and record. Second, it will need to work closely with CRM so that providing data in predetermined formats becomes part of the standard CRM workflow. And third, the service will need to address the issues associated with managing sensitive geographical information, distinguishing the collection and compilation of this information from the ways it is served out to the research and preservation and management communities, for whom, and for what purposes.

For this scheme to work, the service will need to work with agencies and CRM companies to rethink the kinds of information archaeologists routinely collect from archaeological sites and the format in which these data are collected. We suspect that culture-historical categories will still be needed, but it will also be important to think more about how to capture the human behavior represented by archaeological sites, features, and artifacts.
More attention may need to be paid to functional and behavioral associations of artifacts, and to measurements of areas and densities of features and remains, than has been typical of documentation practices tailored to culture-historical purposes. Leckman and Heilen (2023) illustrate one such system that calculates these quantities using imposed grid cells.

Given the wide variation in the ways excavation data are organized, we suspect that the best place to start is with the kinds of information typically recorded through surface surveys in the western United States and shovel-testing programs in the eastern United States, with excavation results aggregated to match survey and testing datasets. The building blocks of a useful system, reflected in state site files, are a good starting point, but these databases remain tailored to the needs of management over research, and they typically focus on assigning cultural resources to culture-historical units and to very basic site type categories. Artifact tabulations that combine culture-historical and functional properties of assemblages are rare, and actual measurements of features within archaeological sites even more so, especially in legacy data. For this reason, the information in these files is not yet adequate as a basis for empirical, data-driven research at the scales that are necessary for archaeology to contribute to knowledge of fundamental social processes. Finally, the system will need to be something that all major stakeholders in archaeology, including agencies and private CRM companies, buy into. Both contributing data to the service and using the resulting data products will need to become part of the standard practice of CRM archaeology.

The most fundamental aspect of the underlying database is that it will have to be based on spatial objects: points, lines, and polygons that have known and accurate spatial coordinates, geometries, and references.
This is crucial because the only realistic way to reliably aggregate archaeological data at different scales—features within sites, sites within project areas, project areas within larger regional units, and all of this with other kinds of geographical and environmental information—is through their locations (McKeague et al. [2020] make similar points). It will also need to focus on aspects of archaeological sites that are amenable to consistent measurement across the intrinsic variation in the archaeological record that occurs across the United States, such as feature counts, dimensions, and functional associations; and artifact tabulations that capture both the time-space and behavioral associations of each object.

Most of the remaining data would then be attributes of these spatial objects, including their dimensions and locations, associated absolute dates when available, and a count and weight of objects from that unit, along with the date range and behavioral associations of each object category (see Holdaway et al. 2019). The logic here is that functional classifications of artifacts are much more consistent across cultures than pottery or projectile point types are. This approach has already been implemented in the cyberSW demography tools discussed earlier, and it has worked remarkably well to integrate the bewildering diversity of pottery classifications used in the US Southwest. There is no reason a similar approach could not be implemented even more broadly. Finally, faunal remains are also well suited to large-scale aggregation in that there is a well-defined taxonomy and procedures for identification that are not tailored to specific culture-historical contexts, and preliminary explorations of faunal data integration have shown great promise (Arbuckle et al. 2014; Kintigh et al. 2018; Neusius et al.
2019; Spielmann and Kintigh 2011).

Given that a data integration service will need to be built into standard practice to work, it will require a Web-based data ingest system that is clearly defined and easy to integrate with the data management systems of CRM companies. And it will need to become a standard nationwide repository for basic archaeological data: nothing less than a retrospective census bureau that creates data products from archaeological evidence. Finally, the system will need to include procedures for evaluating data quality and generating standard data products with varying levels of access that are released on a consistent schedule.

An archaeological data integration service like this would be far too big for a grant. To be practical and sustainable, the service will require an organization with a permanent staff that is supported in some way, whether through user fees, allocations from professional societies, a federal budget allocation, or other forms of institutional support. The staff will need to maintain, manage, and revise the data ingest system and develop mechanisms for producing standard data products. The tradition in archaeology has been to make relatively undigested data available, leaving the responsibility for dealing with all the caveats to users. This has a chilling effect on the use of archaeological data by nonarchaeologists. If we want more researchers to take our data and results seriously, we need to invest in people who are responsible for resolving those caveats as part of the process of translating raw archaeological observations into useful datasets, at a variety of levels of data aggregation, just as government agencies do for other social sciences. Then, we need to make the datasets available to nonarchaeologists in ways that do not threaten site preservation.
At this stage, we are unsure if the best setting for an organization like this is a private company, a nonprofit, a university center, an agency office, or some sort of partnership. But we do think it will need to be an organization.

IS A DATA INTEGRATION SERVICE WORTH THE EFFORT?

The vision we have developed here is ambitious, and it is one that archaeology is not yet close to realizing. But we do not think this vision is impractical. DINAA and cyberSW have shown that with relatively small investments of time and money, data integration is possible and within reach. Nowadays, most archaeological data are “born digital.” So long as the basic data are recorded as attributes of spatial objects, there is no reason why we could not integrate all the information archaeologists collect into a single resource from which standard data products can be developed. It will take time to deal with the backlog of existing information, and some legacy data may not be suitable, but many archaeological sites are being rerecorded to a higher standard every year, so this problem can be expected to fade with time once a system is in place. It will also be challenging to settle on approaches for capturing data recorded at different times. Although recording methods have improved over time, older recordings may be preferable for sites that have experienced recent damage, and in other cases, different parts of the same site will have been recorded at different times. Finally, it will be hard to let go of the idea that all the details we can collect from the archaeological record are relevant for research on fundamental social processes. But the advantage of accepting this will be archaeological data products that truly allow empirical, data-driven research across traditional culture-historical boundaries.
We want to emphasize again that such a service will benefit not only researchers but also cultural resource managers, preservationists, and planners, who would gain the ability to manage resources at multiple scales and from a more holistic and integrated landscape perspective.

Still, after tallying the bureaucratic impediments, the data constraints, and the financial hurdles, many readers will surely be questioning whether the effort to build and maintain a data integration service is worth it. For us, the best way to answer this question is to think about those things that will not happen if we do not have a data integration service. Here is a short list:

• Resource Management: The incorporation of cultural resources early in the planning process has been a bedrock of CRM since its founding (e.g., NPS 1983:44717). Planning documents such as literature reviews and planning tools such as predictive models have been mainstays of CRM. Creating models entails a huge investment in time and money to compile, standardize, and organize cultural and environmental data. The actual statistical analysis and interpretation, albeit the stated purpose of the study, are generally a minor component of the budgeted resources and are often not fully realized. Consequently, cultural resources are not often incorporated into land-use planning in a meaningful way, leading to project designs that disturb archaeological sites that might otherwise be avoided. A data integration service would turn this equation on its head, allowing models to be built with the entirety of existing site data, to have greater spatial scope, and to be available in a timely manner for project planning.

• Landscape Management: During the twenty-first century, the United States will witness land disturbance on a scale never seen before.
Climate-induced changes in sea levels, storm surges, forest fires, hurricanes, and flood events will affect hundreds of thousands, if not millions, of sites (Heilen 2020; Hollesen 2022). Land modifications resulting from infrastructure improvement, resource extraction, urban development, and energy production and distribution also will affect sites on scales heretofore unimagined. Treating each event as a separate undertaking under Section 106 of the NHPA will likely overwhelm the regulatory framework (Heilen et al. 2018). Equally important, evaluating resources on a project basis, or even a regional basis, may not provide the proper context for making hard decisions about which archaeological resources to save and which to let go. For some resources, decision-makers need continent-wide information and displays of site types, cultural features (e.g., earthworks, rock art, Civil War camps, slave outbuildings), and artifact types to prioritize management decisions impacting cultural resources. A data integration service would provide such information. Without it, social and cultural components of our heritage will be lost.

• Social and Environmental Justice: Descendant and Indigenous communities have strong attachments to archaeological sites. Whereas federally recognized Indian tribes and Native American communities commonly manage the cultural resources within their reservations, ancestral lands and sites outside reservations are managed by other governmental agencies or by private property owners. Non-Indigenous descendant communities generally do not know the locations of heritage sites and rarely have management control over them. Yet, according to the United Nations (2007), everyone has the right to know their heritage and have a say in how it is managed. A data integration service could go a long way toward meeting these basic human rights and rectifying historical injustices.
The service would provide descendant and Indigenous communities with comprehensive information about the location and contents of heritage resources, information critical to making informed decisions about what information to share and with whom to share it.

• Social Science: Archaeology examines human behavior on spatial and temporal scales that are outside the realm of other social sciences. Long-term trends are routinely analyzed by archaeologists studying issues of migration, sustainability, resilience, urbanization, population dynamics, and technological change, all pressing issues of our time (Altschul et al. 2017; Kintigh et al. 2014). Yet, archaeology is generally left out of the public discourse on these issues. To some extent, this results from our focus on case studies of relatively small regions or systems that are difficult to generalize or to relate to the contemporary situation. When archaeologists have been able to synthesize continental or worldwide data, policy makers and the public have taken notice (e.g., Xu et al. 2020). But such efforts are rare because the effort to compile and synthesize data at this level is usually beyond individuals or even teams. A data integration service that publishes standardized reports and serves custom datasets at a continental scale would facilitate this research, research that the world needs and that only archaeologists and their collaborators can do.

LOOKING AHEAD

A data integration service of the type and nature articulated in this article will only succeed if it is a discipline-wide effort. Federal mandates and funds are required to create and maintain the service. There also needs to be a shift in the culture of archaeology away from individual control of data to an ethic of data sharing. We need to convince descendant and Indigenous communities and the public at large that archaeologists can be trusted and that our research is in the public interest.
To do so, archaeologists will need to come together to develop data collection standards; archaeologists and other stakeholders, particularly descendant and Indigenous communities, will need to work together to establish protocols and procedures to protect sensitive data and provide informed consent to research; and the public and their political representatives will need to be assured that economic development will not be hampered and that private property rights will not be infringed upon. In all of this, we will need our professional societies and agencies to work together as forums through which the archaeological community defines and develops the data integration service and to prioritize the service as they press issues with politicians and agency representatives. In no small way, ours is a call to action to use the data collected on behalf of the American people to better benefit the American people. The forthcoming Airlie House Revisited workshops may provide an arena in which some of these discussions could be initiated.

Journal

Advances in Archaeological Practice, Cambridge University Press

Published: Feb 1, 2023

Keywords: cultural resource management; data integration; archaeological synthesis; historic preservation; gestión de recursos culturales; integración de datos; síntesis arqueológica; preservación histórica
