Data Critique

Social Vulnerability Index

The dataset reports the number of people who meet certain “vulnerability criteria” in each US county, using data from 2018 to 2022. Vulnerability is measured across four themes: socioeconomic status, household characteristics, racial & ethnic minority status, and housing type/transportation. In the context of our research, this dataset will help us illuminate the degree to which certain people and communities suffer in the event of disasters such as wildfires.
This dataset is built from census data covering about sixteen factors, such as unemployment, racial and ethnic minority status, and disability status, which are then organized into the four themes. Its sources are the census figures themselves and the ZIP/ZCTA location codes used to categorize them. Because the CDC produces it, the dataset is government-created and based on government census data.
Since the spreadsheet is based on census data, gaps in the census trickle down into our dataset. For example, the census does not include a Middle Eastern/North African category, even though this group is a growing share of the population and faces its own unique challenges. Moreover, the census may undercount certain populations, such as undocumented people who are wary of sharing information with the government. The housing theme also has no measure for the unhoused population, a large group left out of the data. This missing information keeps us from revealing the struggles and needs of underrepresented groups in the US, especially during times of emergency.
If this dataset were our only source, we would not be able to assess the extent to which certain communities struggle during disasters and what resources they need.

California Wildfire Dataset

The California wildfire dataset covers wildfires that occurred in California between 2019 and 2023. It focuses on large wildfire perimeters (over 5,000 acres) and provides geospatial data on these fires, including their size, location, and containment status. For each fire it records the year, the responding agency, the fire name, the acres burned, the alarm date, the containment date, and the cause. We will use this dataset along with the California drought dataset to examine whether the two are correlated, since many wildfires are fueled by dried-out vegetation. Moreover, we will use the Social Vulnerability Index dataset with the California wildfire dataset to investigate which communities were displaced and whether displacement correlates with vulnerability.
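As a rough illustration of the scope described above, the following Python sketch keeps only fires from 2019 through 2023 that burned more than 5,000 acres. The Fire record, its field names, and the sample rows are hypothetical; the actual CAL FIRE file may label its columns differently.

```python
from dataclasses import dataclass


@dataclass
class Fire:
    # Hypothetical record layout, not the real CAL FIRE schema.
    year: int
    name: str
    acres: float


def select_large_fires(fires, start_year=2019, end_year=2023, min_acres=5000):
    """Return fires within the year range (inclusive) that burned more than min_acres."""
    return [
        f for f in fires
        if start_year <= f.year <= end_year and f.acres > min_acres
    ]


# Made-up sample rows, used only to show the filter in action.
sample = [
    Fire(2018, "A", 12000.0),  # dropped: outside the year range
    Fire(2019, "B", 4800.0),   # dropped: under the acreage threshold
    Fire(2020, "C", 96000.0),
    Fire(2021, "D", 5200.0),
    Fire(2023, "E", 7100.0),
]

large_fires = select_large_fires(sample)
```

Applying the same filter to the real file would only require mapping these hypothetical fields to the file's own column names.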
The California wildfire dataset was created by the California Department of Forestry and Fire Protection (CAL FIRE), drawing on records from wildfires across the state. The data is collected through satellite imagery, field reports, and firefighting operations, and it is updated regularly to reflect ongoing fire activity. One limitation is that some wildfires are missing because historical records were lost, damaged, or too incomplete to be added to the dataset.
This dataset lacks detailed accounts of people displaced by wildfires, including temporary shelter use and relocation. It also omits data on health outcomes (e.g., respiratory illnesses or stress-related conditions linked to wildfires) and on the destruction of culturally significant sites or ecosystems.
If this dataset were our only source, we would be missing a qualitative understanding of vulnerability: without data on the lived experiences of affected communities, policymakers would have an incomplete picture of wildfire impacts.

California Drought Dataset

The U.S. Drought Monitor (USDM) data is generated each week through assessments from scientists at the National Drought Mitigation Center (NDMC), the National Oceanic and Atmospheric Administration (NOAA), and the United States Department of Agriculture (USDA). Scientists measure temperature, precipitation levels, water levels, and soil moisture to determine drought severity. The data spans 25 years to the present and records, for each week, the drought severity and drought categorization of each California county. Drought levels are divided into five categories: D0 indicates that a region is either going into or coming out of a drought, and severity increases from D1 to D4. These levels are determined by scientific assessments and climatological indices such as the Palmer Drought Severity Index and the Standardized Precipitation Index.
The dataset does not take into account the local communities that may be affected by drought, and the USDM itself advises users not to use the dataset to make inferences about local conditions. While the weekly data provides ample information about drought conditions, it does not capture the social, economic, and political factors specific to individual cities.
If this dataset were our only source of information, we could not reliably correlate drought levels with fires or with broader community experiences. Different counties have varying methods of allocating resources to manage their water systems and support vulnerable populations during natural disasters. Without the context of each region’s resource availability and distribution, the dataset alone cannot explain the relationship between fires and communities.
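One possible way to line up the weekly drought reports with wildfire alarm dates is sketched below in Python. It assumes a county's reports are available as a date-sorted list of (report date, category) pairs; that layout, the function names, and the sample values are hypothetical and not drawn from the USDM files. Given the USDM's caution about local inference, such a match is only a starting point for analysis.

```python
from bisect import bisect_right
from datetime import date

# Categories ordered from least to most severe, following the D0 to D4 scale.
DROUGHT_ORDER = ["D0", "D1", "D2", "D3", "D4"]


def severity_rank(category):
    """Return the position of a category on the D0 to D4 scale (0 is least severe)."""
    return DROUGHT_ORDER.index(category)


def drought_at(weekly_reports, when):
    """Return the category from the latest report dated on or before `when`.

    weekly_reports must be a list of (report_date, category) pairs sorted by date.
    Returns None if `when` falls before the first report.
    """
    report_dates = [report_date for report_date, _ in weekly_reports]
    i = bisect_right(report_dates, when)
    if i == 0:
        return None
    return weekly_reports[i - 1][1]


# Made-up weekly reports for a single county.
reports = [
    (date(2021, 7, 6), "D2"),
    (date(2021, 7, 13), "D3"),
    (date(2021, 7, 20), "D3"),
]

# Drought category in effect for a fire with a hypothetical alarm date of July 14, 2021.
category_at_alarm = drought_at(reports, date(2021, 7, 14))
```

Looking up the most recent report on or before the alarm date avoids attributing to a fire a drought level that was only published after it started.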