As individuals we accept that our health is something that can and should be monitored. A routine medical checkup begins and ends with measurement. Weight, height, blood pressure, temperature, and pulse are all checked. Routine blood work includes blood sugar and cholesterol, and all are compared with what are considered normal values. If nothing else, we leave the exam content that something scientific has taken place. The premise of this checkup is that these metrics provide a basis for assessing health that can be used to inform decision-making. Yet while measures of blood sugar, cholesterol, and blood pressure are grossly prognostic for several chronic health conditions, they reveal little about what may be lurking beneath the surface for a large subset of important diseases such as cancer and Alzheimer’s disease, or about our psychological wellbeing. The fact that self-reported health is often a stronger predictor of mortality than clinical measures suggests just how lacking these standard metrics can be.1
Measuring health at the scale of whole communities is more daunting and involves a different set of indicators. Epidemiological researchers have long recognized the multidimensionality and scale-dependent nature of efforts to measure and advance public health. The challenges of finding adequate metrics are amplified considerably as we consider the goal of building sustainable communities. The environmental movement is only beginning to recognize that, like human health, sustainability is fundamentally an integrative and multi-scale phenomenon. Building long-term resilience at the community level involves physical, economic, and social systems, all working together.
In recent years “sustainability report cards,” which attempt to benchmark sustainability efforts, have been developed by a range of entities including states, cities, regions, and individual organizations. Like a checkup, periodic measures like these reflect a growing realization that assessment and informed action require reliable data. But how useful is the information currently gathered among governments, policy makers, and practitioners? How often are data integrated across physical, economic, and social systems in ways that can be truly prognostic of the community’s overall “health”? How often are data from the indicators actually consulted to inform the design of policies or to evaluate the success of interventions?
These are urgent questions: the rapidly changing planet demands that we act quickly and effectively to transform our methods of living on this earth in ways that are consistent with its biophysical limits and that allow us to adapt to changes over the next century. To do that, we need good assessment tools along with solid data that match the scales at which different types of interventions can be deployed across a wide range of outcomes related to physical, economic, and social systems.
This paper explores efforts to develop “metrics of sustainability.” The conceptual foundation of the research driving this initiative is full-spectrum sustainability—a holistic, comprehensive, applied paradigm of sustainability that considers integrated socio-economic and physical measures of progress for communities in tandem with efforts to maximize community engagement and social/environmental justice.2 The research is situated in Oberlin—a small college town in Ohio—where multiple initiatives are underway to effect transformations towards full-spectrum sustainability. Below we explain why metrics are important, identify the problems with existing metrics, and describe our approach as one potential solution.
Effective metrics need quality data
Given growing climate destabilization, quality data is essential to support effective responses by policy makers. Advances in our understanding of sustainability and measurement technology are ready to support the development of this foundation. Despite the urgency of developing new and cost-effective interventions, we must be careful to consider lessons from the past on the importance of quality evaluation. It is not always obvious what is working and what is not, particularly when outcomes are multi-faceted and social or psychological in nature, such as overall wellbeing, social justice, and shifts in values. Joan McCord3 describes a now classic intervention executed in the 1930s where the goal was to take boys at risk for delinquency and provide them with sustained social services and support. The program was well designed and well run. Through a professional social worker, each boy received tutoring, referrals to special services such as eye doctors, and a number of enrichment activities such as trips to sporting events, access to a wood shop, and a summer camp. During, immediately after, and 30 years after the program, participants swore the program was the best thing that ever happened to them. Sounds like a success! But was it?
Luckily, the researchers did an equally outstanding job of assessing their program by including a control group of boys paired on major demographic variables who received no intervention. Three years after the program the boys who received services were just as likely to have appeared in court as controls. Thirty years later, the intervention group had a higher rate of mental illness, arrests, serious crime, alcoholism, and death. Not only did this program not help, it possibly hurt. Yet it sounded so good. If the data had not been collected to track individual outcomes over a sufficiently long time scale, we would have never known.
Existing data are inadequate
Unfortunately, good data is not always readily available and quality evaluation does not always come easily or cheaply. The ability to link human actions and policies to real and measurable changes in quality of life, quality of the environment, equity, and social change requires types of data that are not currently collected in a systematic way. As cities large and small embrace broad sustainability goals and initiatives, there is renewed enthusiasm for using ready-made social and economic indicators to benchmark, compare, and evaluate progress on key social and economic dimensions of sustainability. Yet for most key measurements of the social, economic, and physical health of communities, researchers and decision makers remain at the mercy of traditional administrative sources of data, collected and updated at time intervals and geographic scales that have remained largely unchanged over the past fifty years.
Consider the kinds of economic and environmental indicators that are readily available to communities who want to measure progress towards sustainability goals. Census data is available every 10 years and can be used to track basic demographic and economic characteristics such as income, unemployment, poverty, and education as well as sustainability-relevant indicators such as population density, housing characteristics, commute times, and transportation modes. Though many census variables are collected at the block level, which offers some granularity, comparable environmental measures such as biodiversity and air and water quality are not collected at the same or overlapping scales. Social indicators that reveal issues of equity, social norms, and networks are virtually non-existent. And unlike many economic or physical variables that are routinely monitored and recorded, these variables cannot be captured or recreated retrospectively. Though nationally available social and economic indicators provide an opportunity to compare places as far apart as Homer, Alaska and Key West, Florida, this benefit of geographic coverage is often outweighed by their lack of richness. Most indicators were not designed to support decision making, nor were they created with sustainability goals in mind. As such, they are insufficient to support comprehensive assessment tools at the community scale.
An additional challenge is that these standard metrics are collected on time scales (years or decades) that do not allow for immediate feedback regarding the effectiveness of local interventions. As aggregate measures, many of these indicators tell us very little about the variability within the community or who among us might be experiencing the brunt of certain social, economic, or environmental ills. Pearsall and Pierce4 find that despite overall improvements in environmental quality, focusing exclusively on aggregate data fails to consider potential inequalities among neighborhoods or social groups. The authors note a few exceptions to this, such as initiatives in Portland, Oregon, to measure changes in biodiversity by neighborhood, demonstrating the power of ecological indicators being integrated with social indicators as opposed to addressing the two separately.
More recently there have been a number of efforts to develop national metrics specific to sustainability. The majority of sustainability indicators focused on community-level sustainability fall under one of two broad methodological paradigms: a top-down approach that is predominantly expert-led, and a bottom-up approach that is community-based.5 Indicators produced by top-down approaches (such as the Environmental Sustainability Index6 or the Genuine Progress Indicator7,8) attempt to quantify the complexities of dynamic systems, but suffer from the same limitations noted above, such as the lack of a social dimension and the use of large spatial and temporal scales. For example, GPI has been successfully applied at the national and state levels and has more recently been developed for some counties and major cities. But it was not originally intended for tracking community-level change, and there are no smaller-scale equivalents.
From the bottom-up side, many communities have active programs to track indicators of economic, social, and environmental quality at neighborhood, city, and metropolitan regional scales.9,10 These measures are often rich in detail, and are scaled at a level appropriate for guiding decision makers. However, they lack standardization across communities, which makes comparisons across regions extremely difficult, and they generally include only a limited set of social measures. Furthermore, these community-derived metrics are often developed by community groups or city employees with no formal training in assessment, and thus these measures are rarely formally validated.
A Framework of Social Metrics
Our research team set out to pilot an approach to community-level metrics of sustainability that would address many of the limitations described above while providing useful and timely information to decision makers. Our approach was informed by the following tenets:
• Gathering baseline data on social, economic, and physical metrics is critical to documenting change related to programmatic initiatives.
• Data about physical systems must be integrated with information about social and economic systems.
• Metric development must balance the need for standardization with its relevance to local conditions, initiatives, and interests.
• Metrics must capture both general and site-specific community patterns, processes, and control mechanisms that hinder or advance transformation.
• Metrics must be sensitive and reliable enough to capture changing conditions and the effects of interventions.
• The approach should be relevant to and replicable in other communities, and deployable with limited resources.
We conducted a thorough literature review of sociological, public health, and psychological theories of behavior change and community transformation, as well as a review of existing metrics of sustainability. While metric development was influenced by existing validated metrics, it also had a significant bottom-up component. The research team solicited input from a broad range of constituents to ensure that the survey items included were of broad community interest and value. Community stakeholders were invited to submit questions to ensure that we had measures relevant for evaluating existing community initiatives and programs. Upon completion of the surveys, researchers reported back to community stakeholders, and these results have been used to inform outreach efforts and strengthen funding proposals.
The major constructs we measured were:
• Individual-level psychological variables such as attitudes, norm perceptions, identity, and efficacy around environmental issues.
• Household behavior that impacts physical and economic systems, including recycling, transportation, buying habits, and resource use. Energy-related behaviors would be used to estimate the carbon footprint.
• Measures of the individual’s relationship to the community and the extent to which core social needs are met through the community.11 This included measures of belonging that are sensitive to social justice concerns (gentrification), measures of trust (neighborliness, town-gown relations), and measures of community members’ sense that their social and demographic groups are valued by the community.
• Measures relevant to particular local initiatives, including the plan to build a new LEED-certified public school campus and a burgeoning local foods market.
The social metrics we developed are significantly different from traditional socio-economic variables commonly used as sustainability indicators. Our indicators are informed by developments in the fields of sociology, social psychology, and community-based social marketing (see figure 1). The final survey was made up of 137 questions, and took approximately 25 minutes to complete. Although our main interests were in assessing what was happening in Oberlin, we included a control community, a nearby college town similar to Oberlin in terms of demographic characteristics but not yet engaged in comprehensive efforts to become more sustainable. This control community will allow us to determine if interventions in Oberlin are actually speeding up the transition to sustainability, simply mirroring a broader cultural shift, or possibly even proving counterproductive.
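One common way to use a control community for this purpose is a difference-in-differences comparison: the change in the intervention community minus the change in the comparison community over the same period. The sketch below is only an illustration of that logic and is not necessarily the analysis our team ran. The function name, the 1-to-5 scale, and all of the scores are hypothetical.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated community minus change in the control community.

    Each argument is a list of survey scores (e.g. a 1-5 attitude scale)
    from one community in one survey wave.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical scores, invented purely for illustration.
oberlin_wave1 = [3.0, 3.5, 4.0]
oberlin_wave2 = [4.0, 4.5, 5.0]
comparison_wave1 = [3.0, 3.0, 3.0]
comparison_wave2 = [3.5, 3.5, 3.5]

effect = difference_in_differences(
    oberlin_wave1, oberlin_wave2, comparison_wave1, comparison_wave2
)
print(f"Estimated effect beyond the background trend: {effect:+.2f}")
```

With these invented numbers, both communities improve, but only half a scale point of the one-point gain in the intervention community exceeds the shift seen in the comparison community. A result near zero would suggest the intervention community is mirroring a broader cultural shift, and a negative result would point to a counterproductive intervention.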
Another important goal was to develop a survey strategy that could be replicated in communities without high levels of resources and expertise. Developing and validating the survey instrument took considerable staff time and expertise, but once standard metrics are established, they can be shared freely with other communities. By using an online survey platform we minimized costs (print copies were made available upon request). Sample recruitment was also low budget. Sampled households were initially sent letters inviting participation, followed by two rounds of postcards and hand-delivered reminder cards. We offered no financial incentives, and yet achieved a response rate of 36 percent. In all, the direct expenses of data collection did not exceed $1500.
Connecting Social with Physical Metrics
The transformation of physical systems is central to creating full-spectrum sustainability and requires fundamental changes in the pattern and magnitude of energy flows, material cycles, and land use, along with their measurement. A physical metrics framework to collect, organize, and track information on physical resources pertaining to a community should include both the natural capital of the area and the flow of resources through it, including consumptive uses (see figure 1). Data availability varies widely across indicators, with some areas (e.g. energy, water, waste) having long-term baseline data available whereas others (e.g. biodiversity, air quality) have very limited data availability at the local scale.
Existing community sustainability indicators often fail to make meaningful connections between physical and social metrics. For example, measures of total energy and resource use can be reliably calculated with access to public utility data. But in order to understand how individual and collective behavioral changes translate into community-level energy use reductions, measures of related behavioral, economic, and social dimensions of change are also necessary. Without such a connection, it is impossible to evaluate the efficacy and cost effectiveness of competing local intervention strategies.
Figure 2 presents an example of an approach to reduce residential energy use in Oberlin combining different kinds of interventions. Several policies and programs currently exist to provide support to homeowners of varying incomes to reduce electricity use. These include financial incentives for buying energy-efficient appliances, replacing incandescent with more energy-efficient light bulbs, and improving insulation and other structural measures to reduce the heating and cooling loads. Additionally, a system of real-time monitoring of electricity and water use provides direct feedback to residents about electricity use in the community – an effort to more closely connect people with the systems that support their resource use.12 These are delivered through partnerships among the city, local utilities, academic researchers, and local non-profit organizations. While the above framework is focused on physical systems, it is supplemented by social initiatives to engage, educate, and motivate people to participate in these energy-related initiatives and become proactive agents of change. Integrating social metrics into physical system interventions ensures that programs do not yield disproportionate benefits across the community, are calibrated to the socio-economic fabric of different neighborhoods, and are built upon core social motives.
Connecting Social with Economic Metrics
The metrics for economic transformation are, not surprisingly, more consistent and well-developed than others (see figure 1). Income, employment, education, poverty, and many other demographic and socio-economic data are regularly collected by federal agencies. Yet geographic and temporal scale issues persist in the way economic indicators are being used to monitor and track community-level sustainability. Economic priorities often follow fiscal or electoral cycles rather than sustainability thresholds informed by physical systems. Finally, economic indicators often do not distinguish between the local circulation of money and economic productivity driven by large corporate establishments that transcend geographic boundaries.13 We have thus far developed the metrics for economic transformations based on existing economic indicators, modeling them especially after initiatives in Seattle, Washington and Dubuque, Iowa. We categorized the measures under the broad categories of economic progress, vulnerability, and resilience. This framework includes specific indicators on education, access to affordable housing, equitable distribution of income, and level of local investment across multiple scales.
Getting Local
Local, repeated surveys are just one facet of long-term evaluation efforts, but they can be valuable tools in the short run to help shape the direction of local initiatives, combining both bottom-up community self-reflection and more universal and systematic approaches to evaluation. For example, the basic drivers of overall community satisfaction (or, conversely, dissatisfaction) must be understood before embarking on sustained community initiatives. Using basic statistical models and local data, the pilot survey in Ohio revealed that in the comparison community, access to employment and concerns over drinking water quality were the two most important positive and negative drivers, respectively, of overall community satisfaction. Thus any intervention or initiative related to sustainability, even if it is not directly related to drinking water or employment, must recognize these priorities and drivers of satisfaction with community life. In Oberlin, relatively poor access to retail shopping was among the most important negative drivers of overall satisfaction. This locally based information can be used to frame local initiatives and targeted programs while evaluating more global strategies across diverse communities with different priorities. For example, “buying local” is a common initiative to reduce fuel consumption for shopping trips and improve the economic self-sufficiency and resilience of local communities. In one community this may be framed to improve the quality and diversity of local shopping, while in the other community it can be framed to improve access to local jobs. In both cases, this framing is data-driven and based on feedback from the local community, as opposed to expert-driven approaches that may misfire.
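As a rough illustration of what identifying "drivers" of satisfaction can look like, the sketch below ranks survey items by the strength of their correlation with an overall satisfaction rating. This is a simplified stand-in: the specific statistical models used in the pilot are not described here, and the item names and responses are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_drivers(overall, items):
    """Sort items by absolute correlation with overall satisfaction."""
    scores = {name: pearson(values, overall) for name, values in items.items()}
    return sorted(scores.items(), key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical responses from five residents on 1-5 scales.
overall_satisfaction = [2, 3, 4, 5, 4]
survey_items = {
    "job_access": [1, 2, 4, 5, 4],
    "water_quality_concern": [5, 4, 2, 1, 2],
    "parks": [3, 3, 4, 3, 3],
}

ranked = rank_drivers(overall_satisfaction, survey_items)
for name, r in ranked:
    direction = "positive" if r > 0 else "negative"
    print(f"{name}: r = {r:+.2f} ({direction} driver)")
```

A simple correlation ignores overlap among items, so a multiple regression on a properly sampled data set would be the more defensible choice in practice. The sketch only shows how positive and negative drivers can be separated and ordered.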
Even if an intervention takes different forms, the effectiveness of these strategies can be assessed globally through standardized metrics such as a household’s annual total miles traveled for shopping-related trips or the percentage of disposable income spent locally. Meanwhile, these harder end points can be assessed concurrently with dimensions of overall satisfaction and a sense of belonging that informed the local strategy. Over time, effective initiatives would see measurable improvements in both the more specific standardized metrics designed to capture local economic resilience and energy use and more global ratings of satisfaction with place that informed the strategy.
Beyond the community level
What happens in a particular community is of great importance to those who live there, of course, but an adequate response to climate change will require many communities to simultaneously make transformations using cost-effective methods. Below we consider one promising model for sustainability practitioners to consider, and outline key strategies for linking together data sets generated at the local level to maximize the speed and accuracy with which we evaluate potential solutions.
The National Violent Death Reporting System14 provides one model for how communities, states, and government agencies might coordinate data collection efforts. The NVDRS began as a smaller local effort to better understand violent deaths by integrating multiple sources of administrative data with death records on a case-by-case basis. This effort has been adopted by the Centers for Disease Control and Prevention and expanded to a national initiative including 18 states. The system builds upon vital statistics data and coroner records to provide a far richer set of measures to understand precipitants to violent deaths. Using a standardized coding scheme, local and state-wide sources of data are systematically compiled and integrated. It also allows flexibility to include additional indicators based on the availability of state-wide sources of data and local priorities.
An analogous network of sustainability databases would open up the possibility of addressing a whole new set of important questions: how might peaks of energy use be related to other indicators on the same day? Are sulfur dioxide emissions related to asthma hospitalizations and how does this relate to peaks and types of household energy use? Is the strategy to inform sensitive individuals about limiting exposure during peak ozone days effective in reducing the burden to individuals and to healthcare? Does providing information to the public about critical periods for limiting energy use change behavior as measured through local utility data? These are only a few examples of cross-system questions spanning multiple entities (EPA, healthcare systems, local utilities) using existing data sources that might be routinely monitored and enhanced through integration rather than as isolated strategies and metrics.
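Cross-system questions of this kind depend on being able to line up records from different entities on a shared key, such as the calendar date. The following is a minimal sketch of that linkage step, assuming each source can export a simple date-to-value table. The source names, field names, dates, and values are all hypothetical.

```python
from collections import defaultdict


def join_by_date(sources):
    """Merge several {date: value} tables into one {date: {field: value}} table.

    `sources` maps a field name to that source's daily records,
    keyed by ISO date strings.
    """
    merged = defaultdict(dict)
    for field, records in sources.items():
        for date, value in records.items():
            merged[date][field] = value
    return dict(merged)


def complete_days(merged, required_fields):
    """Keep only the days on which every required source reported a value."""
    return {
        date: row
        for date, row in merged.items()
        if all(field in row for field in required_fields)
    }


# Hypothetical daily records from three separate organizations.
sources = {
    "peak_kw": {"2013-07-01": 410.0, "2013-07-02": 455.0, "2013-07-03": 430.0},
    "asthma_admissions": {"2013-07-01": 2, "2013-07-02": 5},
    "ozone_ppb": {"2013-07-02": 78, "2013-07-03": 61},
}

merged = join_by_date(sources)
usable = complete_days(merged, required_fields=list(sources))
print(f"{len(usable)} of {len(merged)} days have data from every source")
```

Even in this toy case, most days drop out because one source or another did not report, which is the practical argument for agreeing on collection schedules and coding schemes in advance.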
And while an individual community committed to measurement and assessment could implement such an integrated system on its own, the true power comes from the ability to compare results across sites. Rather than relying on national data sources designed for other purposes or working in isolation, local communities that are motivated to capture a richer set of measures could form the foundation of a network of local laboratories while retaining the flexibility to collect measures specific to their goals. Interventions to address economic, environmental, and social problems within and across these networked laboratories could then provide a solid evidence base for the effectiveness of local interventions. In fact, smaller towns and cities may serve as the best laboratories, as their boundaries are less fluid than those of big cities and they are less prone to spill-over effects and to in- and out-migration from adjacent communities. Networks across different geographic scales could develop standard protocols for collecting information needed to evaluate local initiatives.
Local laboratories could maximize their impact by the use of randomized designs (with a control group and one or more experimental groups) to assess particular interventions. As McCord’s analysis of the intervention for troubled youth demonstrates, it is all too easy to reach false conclusions and overstate the effectiveness of strategies that community members are invested in. We need to do more than just identify programs that people feel good about. Instead, we need to identify a set of ‘best practices’ tested across a range of communities using a mixed evidence base to evaluate their effectiveness.
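The core of a randomized design is that chance, rather than staff judgment or resident enthusiasm, decides who receives an intervention. A minimal sketch of balanced random assignment follows. The household identifiers, group labels, and seed are hypothetical, and a real trial would also need consent procedures and possibly stratification by neighborhood or income.

```python
import random


def assign_groups(household_ids, groups=("control", "treatment"), seed=0):
    """Randomly assign households to groups of (nearly) equal size.

    A fixed seed makes the assignment reproducible for later auditing.
    """
    rng = random.Random(seed)
    shuffled = list(household_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled households out to the groups in turn.
    return {
        household: groups[position % len(groups)]
        for position, household in enumerate(shuffled)
    }


# Hypothetical household identifiers.
households = [f"HH-{number:03d}" for number in range(1, 11)]
assignment = assign_groups(households, seed=42)

for household in households:
    print(household, assignment[household])
```

Passing more labels in `groups` extends the same procedure to designs with one control group and several experimental groups.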
Conclusions
Ultimately, what is most needed is not just more and better measurements but systematic approaches to implementing and evaluating change. Academics and policy makers should not have blind faith in existing indicators to evaluate progress. Instead they should think carefully about the curious properties of aggregates, the science of measurement, and the importance of developing innovative, comprehensive, and systematic approaches to measuring change. Networked micro-laboratories, enhanced national indicator systems, wider use of randomized trials, and increased attention to measuring social and physical systems are just some of the possible ways to better test the effectiveness of local interventions using a standard set of measures while allowing for flexibility based on the priorities, motivation, and resources of local communities. Such approaches have the potential to represent local realities and priorities much more closely and to provide opportunities for greater engagement of residents and empowerment of local initiatives.
Acknowledgment: This research was funded by a grant from The Schmidt Family Foundation.