Beginning in 2011, the National Children’s Study (NCS) Data Linkages Program, under the direction of the National Institutes of Health's Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), compiled a select set of demographic and socioeconomic statistics from the ongoing American Community Survey (ACS) program for each of the 37 NCS Vanguard Study locations and 3 Provider Based Sites (PBS) into a Study Center Profile (see Appendix 1 for the list of 40 study locations). Study Center Profiles were created using 2010 and 2012 ACS data.
In 2015, the NCS PO requested that the NCS Study Center Profiles be expanded to include items beyond those found in the ACS. The Data Linkages team identified and evaluated data sources that would provide supplemental items such as health, neighborhood, environmental, and socioeconomic indicators. These data, when combined with data collected as a part of the NCS protocols, could inform epidemiological analyses by allowing models to control for a broader list of potential confounders, to examine effect modification by participant and neighborhood characteristics, and to determine how parental and neighborhood factors relate to environmental exposures and children's well-being and growth.
Enhancements were made to the existing Study Center Profiles, and more years were added so that each year of the NCS (i.e., 2008-2014) had an enhanced profile. Companion all-county files for each year have also been produced and include data for all counties nationally for which each measure was available. In all, 14 datasets have been produced (i.e., seven files specific to the NCS counties and seven files for all counties nationally). These data are expected to enhance the analytic utility of the NCS participant data to be archived. This document summarizes the processes by which data sources and measures were selected and describes the measures included and the contents of this delivery.
Selection of Data Sources and Measures
The NCS Extant Data Library¹ served as the main resource for identifying data sources that contained additional measures relevant to the NCS. These data sources (see Appendix 2) had already been evaluated as relevant to the NCS as a part of creating the Extant Data Library. Sources were determined relevant to the NCS if they supplied information on the respondents' personal, situational, economic, and health characteristics, along with corresponding information about the socioeconomic conditions of the communities in which they lived.
An expanded search outside the Library was also conducted to identify additional data sources. Databases such as The Health Indicators Warehouse, Data.gov, and HealthData.gov were mined for additional measures that could describe the NCS counties. These databases served as a good starting point for this review because they are compilations of data sources, some with existing measures available for download.
Data sources were of interest if they could produce measures relevant to the NCS, i.e., measures that could describe a child’s environment in its totality, including both the physical (e.g., air quality metrics, number of supermarkets in the vicinity) and social (e.g., percent of the county population in the Women, Infants, and Children (WIC) program) determinants of health. Data sources containing measures that could be easily linked to NCS counties, with minimal data processing, were prioritized.² Table 1 lists the data sources that were consulted in preparing the enhanced Study Center Profiles and notes whether each source already existed in the Extant Data Library or was identified outside the Library. The sources identified outside the Library were eventually added to it, with two exceptions: Community Commons (CC) and the Health Indicators Warehouse (HIW) were not added as data sources because they are themselves libraries of extant data sources. The indicator retrieved from HIW was based on a model whose main input data source (i.e., EPA’s Air Quality System) was already included in the Library. Likewise, the data sources used to create the indicators within Community Commons, such as the Census or the American Community Survey, were already in the Library.
Table 1. Data sources consulted for profile enhancements
The Data Linkages team reviewed the selected data sources for measures that characterized exposures in the child’s environment. All potential measures were compiled and evaluated for inclusion in the enhanced Study Center Profiles. The following considerations were used in evaluating the feasibility of including the additional measures in the Study Center Profiles.
Table 2. Considerations for evaluating potential enhancement measures
The geographic level (e.g., state, county) at which the data element was available: sources not at the county level were considered if county-level estimates could be generated with minimal processing.
The years in which the data were available: a measure did not have to be available for every year.
The cost of acquiring the dataset: no cost was preferred; minimal cost was considered; high cost or a rigorous process for obtaining the data (e.g., a Data Use Agreement) was not considered.
The level of effort to manipulate the data into a format that could be appended to the existing Study Center Profile: measures requiring extensive processing or calculations were not considered.
A suffix denoting the data source has been added to each variable name in the format VarName_DataSource. Suffixes are as follows:
American Community Survey (_ACS)
Air Quality System (_AQS)
Behavioral Risk Factor Surveillance System (_BRFSS)
Community Commons (_CC)
Cleanups in my Community (_CUC)
EPA Geospatial Data Download (_EPA)
Food Environment Atlas (_FEA)
Health Resources and Services Administration (_HRSA)
National Transportation Atlas (_NTA)
Pesticide National Synthesis Project (_PNS)
Toxics Release Inventory (_TRI)
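The naming convention above can be checked programmatically when working with the files. The following Python sketch is illustrative only and is not part of the delivery: the suffix table is copied from the list above, while the function name and the example variable names are hypothetical.

```python
from typing import Tuple

# Suffix-to-source table copied from the list above.
SOURCE_BY_SUFFIX = {
    "ACS": "American Community Survey",
    "AQS": "Air Quality System",
    "BRFSS": "Behavioral Risk Factor Surveillance System",
    "CC": "Community Commons",
    "CUC": "Cleanups in my Community",
    "EPA": "EPA Geospatial Data Download",
    "FEA": "Food Environment Atlas",
    "HRSA": "Health Resources and Services Administration",
    "NTA": "National Transportation Atlas",
    "PNS": "Pesticide National Synthesis Project",
    "TRI": "Toxics Release Inventory",
}


def split_variable_name(name: str) -> Tuple[str, str]:
    """Split 'VarName_DataSource' into (base name, data source name).

    Only the text after the LAST underscore is treated as the suffix,
    so base names that themselves contain underscores are preserved.
    """
    base, sep, suffix = name.rpartition("_")
    if not sep or not base or suffix not in SOURCE_BY_SUFFIX:
        raise ValueError(f"no recognized data-source suffix in {name!r}")
    return base, SOURCE_BY_SUFFIX[suffix]


# Hypothetical variable name, used only to show the call.
base, source = split_variable_name("ExampleMeasure_ACS")
```

Splitting on the last underscore is a design assumption; it keeps the check valid even if some base variable names contain underscores.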
Each variable has been assigned to a topic area to facilitate analysis of similar variables. Topic areas are noted in the data dictionary and layout file for each dataset. The following topic areas appear in the enhanced profiles and all-county files:
Air and Rail Transportation
Food and Food Security
Health Care Access and Use
Health Conditions and Behaviors
Data Dictionary and Layout Files
The accompanying data dictionary covers all measures included in this delivery. For each measure, it lists the variable attributes and data source and indicates the years for which the measure is available. The data dictionary thus provides a comprehensive overview of the annual files. To supplement it, a layout file accompanies each annual file and includes the attributes for only the variables in that particular dataset.
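The relationship between the data dictionary and the per-year layout files can be pictured as a simple filter on year. The Python sketch below is only an illustration under stated assumptions: the record structure (keys "variable", "source", "years"), the example entries, and their years are all hypothetical and do not reflect the actual contents or format of the delivered files.

```python
from typing import Dict, List

# Hypothetical data dictionary entries; names and years are placeholders.
DATA_DICTIONARY: List[Dict] = [
    {"variable": "ExampleMeasureA_ACS",
     "source": "American Community Survey",
     "years": {2008, 2009, 2010}},
    {"variable": "ExampleMeasureB_TRI",
     "source": "Toxics Release Inventory",
     "years": {2010, 2011}},
]


def layout_for_year(dictionary: List[Dict], year: int) -> List[Dict]:
    """Return only the entries for measures available in the given year."""
    return [
        {"variable": entry["variable"], "source": entry["source"]}
        for entry in dictionary
        if year in entry["years"]
    ]


# Variables that would appear in a (hypothetical) 2010 layout.
layout_2010 = layout_for_year(DATA_DICTIONARY, 2010)
```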
The following counties were aggregated into one site for the purposes of NCS data collection. For the purposes of the enhanced Study Center Profiles, these counties have been presented separately.
Lincoln County, MN
Pipestone County, MN
Yellow Medicine County, MN
Brookings County, SD
Note: ACS summary files contain both yearly and multi-year estimates. Their availability depends on the population size of the geography in question (in this case, the county) and, for some estimates, on when the survey question was added. Many of the study locations are in counties of sufficient size (65,000 persons or greater) to obtain the most recent yearly estimate starting in 2012. However, several of the study locations are associated with counties that, because of the U.S. Census Bureau’s standards for confidentiality and survey precision, are not large enough to obtain yearly estimates. For counties that contain more than 20,000 but fewer than 65,000 persons, the U.S. Census Bureau provides three-year estimates rather than yearly values. For example, for Duplin County, NC, the U.S. Census Bureau reports ACS estimates based on data collected during the three-year period from 2010 through 2012. For counties that contain fewer than 20,000 persons, the U.S. Census Bureau provides five-year estimates. See “A Compass for Understanding and Using American Community Survey Data: What General Data Users Need to Know” (2008) for a detailed discussion of the ACS, at https://www.census.gov/library/publications/2008/acs/general.html.
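The population thresholds in the note above can be expressed as a small decision rule. The Python sketch below is illustrative only: the function name and return labels are hypothetical, and the handling of a county with exactly 20,000 persons, which the note does not specify, is an assumption (treated here as three-year).

```python
def acs_estimate_type(county_population: int) -> str:
    """Return the finest ACS estimate period described in the note above.

    Thresholds follow the note: 65,000 or more -> yearly estimates;
    20,000 up to 64,999 -> three-year; fewer than 20,000 -> five-year.
    ASSUMPTION: exactly 20,000 is treated as three-year.
    """
    if county_population < 0:
        raise ValueError("population cannot be negative")
    if county_population >= 65_000:
        return "1-year"
    if county_population >= 20_000:
        return "3-year"
    return "5-year"


# Hypothetical population values, used only to show the call.
estimate_types = [acs_estimate_type(p) for p in (150_000, 40_000, 8_000)]
```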
Alphabetical List of Data Sources contained in the NCS Extant Data Library