As stewards of data derived from clinical care, HMORN investigators have an ethical and legal obligation to use these data appropriately. This includes ensuring that the results of scientific investigations have the potential to improve care for study participants and the general population, and abiding by all HIPAA and other regulatory compliance policies.
Each institution has its own institutional review board (IRB) review process. As part of this process, many have arrangements for streamlining the review of multi-site studies. The HMORN has also established a facilitated review process available for a variety of study designs (data-only studies, surveys, and other minimal-risk research).
Institutional impact assessment
In addition to IRB review, many institutions require one or more reviews to ensure that new studies are feasible, have scientific merit, have adequate resources, do not pose unreasonable institutional risk, do not duplicate existing projects, and do not interfere with patient care. These review processes are especially important for intervention studies.
Data access
Data for HMORN investigations are drawn from multiple sources including, but not limited to, standardized data warehouses, site-specific disease registries, EMRs, and pharmacy databases, as well as primary data collection. For most studies, data are accessed at the site level by analytic and data management staff who are familiar with the databases and have institutional permissions to access specialized databases such as pharmacy data and EMR summary data. For multi-site studies that use data from the standardized VDW, efficiencies are achieved by sharing data extraction code that has been written and validated at a single site and then deployed at other sites to be run against local VDW files. Data management staff at all sites work closely with site investigators to refine data queries and prepare analytic data sets.
Data sources and tools are described in more detail in Section VI.
Cohort development and utilization
Analytic cohorts are usually study-specific and initially developed through the data extraction processes described above. HMORN electronic data may be supplemented through data linkages with other sources, such as vital records and cancer registries. Depending on study design, analytic cohorts may also contain data on survey responses, variables from clinical trials, alternatively collected data such as home blood pressure readings, and other data points specific to a given research investigation. Depending on the study, cohorts may be reusable, or a de-identified extract of the data may be made available to other investigators for secondary data analysis. Procedures for re-contact and/or re-consent will vary from project to project. See the data sharing section below for more information.
Intervention studies
Studies that involve interventions with patients or providers require close partnerships with any involved health care delivery systems or health plans. It is important to make sure that interventions do not interfere with clinical care and that all outreach efforts appropriately protect patient privacy. In some instances, a study may require notifying the primary care provider of the patient/member, since the research could affect care coordination or delivery. An example might be a study that provides participants with a study drug that could interact with other medications the participant takes. Hence, any impact of research on care must be anticipated and addressed through communication with the health system or the patient’s care team. As noted above, some sites have embedded research clinics, which can facilitate data collection for interventional trials.
Methods and analytic processes
HMORN analytic and methodological experts at each site are knowledgeable in biostatistics, epidemiology, qualitative methods, and other methods relevant to the research and to the often unique data sources available at their site. Special emphases are described in detail in the previous section and in the site profiles (Section VII). Investigators and analytic staff with relevant methodological knowledge participate in study design, analytic plan development, the conduct and interpretation of analyses, and the preparation of scientific products. HMORN researchers make regular contributions to the methodological literature, publishing on topics that include risk adjustment methods, validation of electronic data and paper medical records, statistical methods, cost-effectiveness and cost-utility analysis, and Bayesian methods.
For multi-site investigations, limited data sets or de-identified data are commonly aggregated and analyzed by the lead site. Depending on the data use agreements in place, some analyses may be undertaken by a site investigator with an interest in a particular aspect of a study.
Cohort access and re-contacting participants
The HMORN follows federal funding agency and IRB policies regarding post-study access to de-identified data. When specific projects meet criteria for such access, HMORN investigators arrange contact processes for interested parties. For open, ongoing investigations, re-contacting study participants may be possible with approval from the principal investigator and the IRB. For closed studies, re-contact depends on the original study design and on the original IRB approvals for managing linkage files between valid participant identifiers and anonymous study identifiers. Certain studies (e.g., those designed as ongoing cohorts, such as the Framingham Study) have designs that incorporate ongoing re-contact, while others are designed to be finite and have their linkage files destroyed after a specified time period. In an effort to facilitate data sharing and maximize data utility, the HMORN has developed tools to help streamline the development of data use agreements, as well as guidelines that urge the use of bidirectional or multilateral data use agreements.
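The role of a linkage file can be illustrated with a short sketch. This is a minimal example under stated assumptions, not a description of any HMORN procedure: the field name `member_id`, the function names, and the sample records are all hypothetical.

```python
import secrets

def build_linkage(participant_ids):
    """Assign each participant identifier a random, non-derivable study ID."""
    linkage = {}
    used = set()
    for pid in participant_ids:
        if pid in linkage:
            continue  # one study ID per participant
        study_id = secrets.token_hex(8)
        while study_id in used:  # regenerate on the (unlikely) collision
            study_id = secrets.token_hex(8)
        used.add(study_id)
        linkage[pid] = study_id
    return linkage

def deidentify(records, linkage, id_field="member_id"):
    """Replace the direct identifier in each record with its study ID."""
    rows = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != id_field}
        row["study_id"] = linkage[rec[id_field]]
        rows.append(row)
    return rows

def recontact_lookup(linkage, study_id):
    """Return the participant identifier for a study ID, or None if unknown."""
    for pid, sid in linkage.items():
        if sid == study_id:
            return pid
    return None

# Hypothetical sample records
records = [
    {"member_id": "M001", "systolic_bp": 128},
    {"member_id": "M002", "systolic_bp": 141},
]
linkage = build_linkage(r["member_id"] for r in records)
analytic = deidentify(records, linkage)
```

While the linkage file exists, `recontact_lookup` can map a study ID back to a participant. Once the file is destroyed, the analytic rows can no longer be traced back through it, which is why re-contact for closed studies depends on how the linkage file was handled.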
Research Data
HMORN sites have substantial experience acquiring data from various sources at their organizations and using them for research. Additionally, many HMORN researchers are skilled at the analysis of merged data from multiple HMORN health systems. This particular data use often requires crosswalks among various coding conventions, interpretation of data in light of local clinical and coding practices, and efforts to ensure that data are fully de-identified. Research use of data from clinical systems, whether electronic medical records, billing claims, or other sources, requires transformation, quality checks, and careful, knowledgeable interpretation. These data systems are not inherently interoperable and are subject to periodic updates and revisions. The data analysts at the HMORN sites have worked with investigators to develop tested, standardized approaches to handling these issues.
Data sources
In the HMORN, data sources vary by institution and over time. However, researchers typically have access to health plan claims data, medical record data (which are increasingly electronic), and other information from health care data systems, public records, and other sources. Table 9 summarizes commonly used data sources.
Table 9: Typical data sources for HMORN research
Data type | Considerations
Source: health plan
Health plan claims | Claims data provide a wealth of information, such as diagnoses, procedures, and treating clinician. Because claims data are not created for clinical or research purposes, special attention may be required for interpretation, depending on the clinical domain. Within the HMORN, claims data may include more detailed information than is usually contained in systems such as Medicare, but this means standard approaches from other claims data systems will require substantial review and revision.
Source: data derived from health care encounters
EMR data | All HMORN research sites have experience using EMR data. Most members use EpicCare, and many of the sites’ analysts are expert in extracting data from the underlying database with SQL. Most sites have established standard data tables for lab test results, blood pressure measures, and height and weight measures with data drawn from the EMR. Use of other data within the EMR has also progressed, as illustrated by one cancer screening research study in which four sites extracted electronic text from cytopathology reports of Pap smear results and established standard data tables of categorized results. Many sites have used natural language processing (NLP) to enable use of the electronic text contained in the EMR, and HMORN scientists and programmers with NLP experience meet monthly by conference call.
Medical charts | Patient medical charts, both paper and electronic, are the gold standard for studies that require review of a patient’s history. Typically, medical charts are only reviewed for patients in a medical group that is part of the same covered entity as the researcher; review of charts from other health care providers may occasionally be possible with additional funding and relevant permissions. Several research projects have developed and tested methods for training chart abstractors across multiple sites and accomplishing chart abstraction efficiently and accurately.
Lab data | Typically available from the EMR or other systems. Coding standards vary around the country. LOINC is the most common coding standard used by HMORN members. Sites that do not use this coding standard have created an internal crosswalk to it. The HMORN has a lab data committee that guides the development of standardized data tables for the lab test results most commonly called for in research studies.
Clinical registries | Several studies have access to registries developed for clinical operations that can be made available for research purposes. The clinical areas vary, but include information on diabetes, heart disease, cancer, genetics, perinatology, chronic diseases, and preventive services.
Biospecimen resources | Several HMORN-affiliated sites have large, well-established biobanks. They operate independently. Currently, clinical and research repositories differ in terms of consent, material collected, and restrictions on use.
Patient reported outcomes | Several sites collect information self-reported by patients, including race or ethnicity, language preference, general health, functional or mental status, and preventive health behaviors. Systematic collection of patient-reported data is increasing on a national level. Currently, the amount of patient-reported data available for research varies across sites but is increasing.
Data on health care cost, utilization, benefit designs | There is great interest in the ability to link clinical, benefit, and cost data. Sites vary in their capacity based on investigators’ interests and available data.
Pharmacy data | Pharmacy data include data on two types of transactions: ordering prescriptions and dispensing medications. Data availability may depend on whether the medication was ordered and dispensed in the inpatient or outpatient setting.
Source: Primary data collection
Survey data | About one-third of the research centers have dedicated survey units, or a comparable ability to deploy trained survey interviewers to collect telephone, web, and in-person survey data.
Clinical trials data | Clinical trials data collection follows carefully designed procedures using a range of commercial and custom-built clinical trial management systems. VDW data, such as those described above, have the potential to facilitate subject sampling and recruitment by narrowing the population based on a trial’s predetermined eligibility criteria.
Source: Secondary data
Cancer registries | Most HMORN research centers have independent cancer registries. Many others have linked information from state tumor registries.
Medicare/Medicaid | All HMORN research sites can link information from their records to Medicare/Medicaid with appropriate permissions.
Vital records | All HMORN research sites can link information from their records to state birth and death records with appropriate permission.
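The internal crosswalk mentioned for lab data in Table 9 can be pictured as a lookup from site-specific test codes to LOINC codes. The sketch below is illustrative only: the local codes, field names, and function name are hypothetical, and the LOINC codes shown (believed to correspond to hemoglobin A1c, serum or plasma glucose, and total cholesterol) should be verified against the LOINC database before any real use.

```python
# Hypothetical site-specific lab codes mapped to LOINC codes.
# Verify every LOINC code against the official LOINC database before use.
LOCAL_TO_LOINC = {
    "LAB_HBA1C": "4548-4",  # assumed: hemoglobin A1c
    "LAB_GLU": "2345-7",    # assumed: glucose, serum or plasma
    "LAB_CHOL": "2093-3",   # assumed: total cholesterol
}

def apply_crosswalk(local_results, crosswalk):
    """Split results into rows mapped to LOINC and rows lacking a mapping.

    Unmapped rows are returned rather than dropped so that an analyst
    can review them and extend the crosswalk.
    """
    mapped, unmapped = [], []
    for result in local_results:
        loinc = crosswalk.get(result["local_code"])
        if loinc is None:
            unmapped.append(result)
        else:
            mapped.append({
                "study_id": result["study_id"],
                "loinc": loinc,
                "value": result["value"],
                "unit": result["unit"],
            })
    return mapped, unmapped

# Hypothetical sample results
sample_results = [
    {"study_id": "S1", "local_code": "LAB_HBA1C", "value": 6.8, "unit": "%"},
    {"study_id": "S2", "local_code": "LAB_XYZ", "value": 1.2, "unit": "mg/dL"},
]
mapped_rows, unmapped_rows = apply_crosswalk(sample_results, LOCAL_TO_LOINC)
```

Returning the unmapped rows separately is a design choice in this sketch: it makes gaps in the crosswalk visible instead of silently losing results.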
Site-specific data tools
Each HMORN member research organization has developed a set of tools to access, link and manage data for research. Depending on a study’s specific needs, an HMORN researcher may be able to draw entirely on pre-compiled data in a research warehouse, or may need to work with partners at the health plan, health care provider group, or outside institution or agency to assemble and link new data.
The Virtual Data Warehouse (VDW)
A cornerstone of the HMORN’s success is the ability to coordinate comprehensive data resources in support of a large and varied program of multi-site, multi-purpose collaborative research. A key example is the Virtual Data Warehouse, or VDW, a one-of-a-kind shared resource created by translating data from local systems into a common format based on agreed-upon data standards. The VDW combines demographic and clinical information from EMRs, insurance claims, and registries for defined, diverse, and geographically distributed patient populations. Active input from researchers with each successive project has progressively enriched its utility. Figure 4 shows a schematic of the VDW and its use in the research context.
Figure 4: Virtual Data Warehouse within HMORN
The VDW is a federated or distributed data warehouse. There is no central data repository; data reside at each home HMORN site, and sites maintain control over their data and its uses. The VDW model relies upon computerized datasets stored behind separate security firewalls at participating HMORN sites. Each site’s datasets include variables with identical names, formats, and specifications, as well as identical labels, coding, and definitions. Table 10 provides an overview of the current data domains housed within the VDW. A set of informatics tools (hardware and software) facilitates storing, retrieving, processing, and managing VDW datasets. Specific policies and procedures govern the use of VDW resources. An internal website provides extensive documentation on all components of the VDW, including years of data availability, types of variables contained in a data area, and known anomalies.
This approach allows multi-site data checking, data characterization and analysis to be conducted using a single computer program developed at one HMORN site which is then distributed to all sites and executed locally. Programs written at one site can be run at other sites with a minimum of site-specific customization; sites then review findings and return results or subsets of data (via secure mechanisms) to the requester for merging and further analysis.
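The distributed pattern described above (one program, executed locally at each site, with only results returned to the requester) can be sketched as follows. This is a simplified illustration, not actual VDW code: the record layout, the field names `dx` and `year`, and both function names are hypothetical.

```python
from collections import Counter

def site_program(local_encounters, dx_prefix):
    """The shared program: runs against one site's local data and
    returns only aggregate counts per year, never row-level records."""
    counts = Counter()
    for enc in local_encounters:
        if enc["dx"].startswith(dx_prefix):
            counts[enc["year"]] += 1
    return dict(counts)

def merge_site_results(results_by_site):
    """Run by the requesting site: sum the per-year counts returned by each site."""
    total = Counter()
    for site_counts in results_by_site.values():
        total.update(site_counts)  # Counter.update adds counts together
    return dict(total)

# Hypothetical local data; in practice each list never leaves its own site.
site_a = [
    {"dx": "E11.9", "year": 2010},
    {"dx": "I10", "year": 2010},
    {"dx": "E11.65", "year": 2011},
]
site_b = [{"dx": "E11.9", "year": 2010}]

# Each site runs the identical program behind its own firewall...
returned = {
    "site_a": site_program(site_a, "E11"),
    "site_b": site_program(site_b, "E11"),
}
# ...and the requester merges only the returned aggregates.
combined = merge_site_results(returned)
```

Because every site exposes the same variable names and formats, `site_program` needs no site-specific branches here; in practice a small amount of local customization may still be required, as noted above.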
Data standardization is a dynamic and active process that involves the following steps: 1) specifying common variable names, labels, coding, and definitions; 2) writing programs to extract and convert variables stored in legacy information systems to the common standards; 3) testing standardized data for consistency and accuracy; 4) standardizing methods by writing macros that are used across projects; and 5) instructing researchers and analysts on how to use the VDW to guide construction of analytic files for approved research projects. The HMORN’s VDW Operational Committee (VOC) provides direction to each HMORN site on VDW implementation. The VOC is also responsible for maintaining current documentation of data availability across sites, including site variations and site-specific issues, quality control evaluation of domain-specific data at each site, and policies and procedures for initiation and conduct of multi-site research within the HMORN.
This structure avoids the security risks inherent in a centralized model (where all data are pooled at a single site). Another key advantage of the VDW model is that data remain with the health plan staff, data analysts, investigators, and providers who are best positioned to consult on proper use of the data, help interpret findings, and investigate anomalies. Each site extracts, transforms, and loads its local data into the common VDW data model, which enforces uniform data element naming conventions, definitions, and data storage formats (i.e., semantic and syntactic interoperability).
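A site-level extract, transform, and load step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the VDW specification: the common field names, the local column names (`PatID`, `DOB`, `Gender`), the sex coding, and the date format are all hypothetical.

```python
from datetime import date, datetime

# Hypothetical common data model: field name -> required Python type.
COMMON_FIELDS = {"person_id": str, "birth_date": date, "sex": str}

# Hypothetical mapping from common field names to one site's local columns.
SITE_MAPPING = {"person_id": "PatID", "birth_date": "DOB", "sex": "Gender"}

# Hypothetical recoding of local values to a shared code set ("U" = unknown).
SEX_CODES = {"Female": "F", "Male": "M", "F": "F", "M": "M"}

def transform(local_row):
    """Rename local columns and recode values to match the common model."""
    raw_birth = local_row[SITE_MAPPING["birth_date"]]
    return {
        "person_id": str(local_row[SITE_MAPPING["person_id"]]),
        "birth_date": datetime.strptime(raw_birth, "%m/%d/%Y").date(),
        "sex": SEX_CODES.get(local_row[SITE_MAPPING["sex"]], "U"),
    }

def validate(row):
    """Return a list of problems; an empty list means the row conforms."""
    problems = []
    for name, expected_type in COMMON_FIELDS.items():
        if name not in row:
            problems.append(f"missing field: {name}")
        elif not isinstance(row[name], expected_type):
            problems.append(f"wrong type for {name}")
    return problems

# Hypothetical local record
local_record = {"PatID": 123, "DOB": "03/15/1960", "Gender": "Female"}
standard_row = transform(local_record)
issues = validate(standard_row)
```

The renaming and type checks correspond to syntactic interoperability, while the recoding of local values to a shared code set is a small example of semantic interoperability.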