Reading and writing and … location. Visualizing where different performance metrics correlate.

5 February 2010

Our parent company, Universal Mind, was tasked by the Colorado Department of Education and Center for Assessment to visualize data from their innovative models for measuring student progress. The public version of that project is available at schoolview.org. SchoolVIEW has some great features to visually compare school performance in terms of proficiency and growth (improvement over prior years) in reading, writing, and math. (You can learn more about the project here.)

CDE's SchoolVIEW

SchoolVIEW data in SpatialKey

I was interested in seeing the correlation between these different metrics, and (since we’re obsessed with location) how that correlation relates to geography. So, I imported that data into SpatialKey.  The source file was a CSV with a row for each school.  Here’s what that data looks like:

SpatialKey’s bivariate renderer let me quickly explore the data in just that manner. It lets you select two numeric attributes in your dataset and an aggregate calculation for each. In the image below, I selected average math growth percentile and average math proficiency. Each dot in the scatterplot legend at the upper right represents a colored location (grid cell) on the map. The position of the dot represents its relative score for average math growth (y axis) and average math proficiency (x axis). The color “behind” each dot is the color used for the corresponding grid cell on the map.

This visualization shows the correlation between math proficiency and growth, as it relates to location. (Click the image for a larger view.)

We can see there is a general positive correlation, where most locations have a similar relative score for math performance and growth: Most points on the scatterplot are along an imaginary diagonal line from the bottom left (low in both metrics) to upper right (high in both metrics).  What’s often interesting and informative is to see areas that deviate from the norm in terms of the correlation.  Areas with relatively high proficiency but low growth – “strong but losing ground” – are colored blue, while areas with low proficiency but high growth – “risin’ up” – are colored red.  These are both negative correlations.  Areas that score low on both metrics are shaded white, while those high in both attributes are shaded black – both positive correlations.  For this type of visual analysis, areas that fall toward the middle of both ranges are usually less interesting, and so those colors are more transparent to allow you to focus on the extremes.  It may take a few seconds to orient yourself to this view, but once acclimated it’s a powerful way to visualize some complex – and otherwise difficult to express – relationships.
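For readers who like to see the idea in code, here is a minimal sketch of the color logic described above. It is not SpatialKey's implementation: the function names, the one-third and two-thirds cutoffs, and the sample cells are all assumptions made up for illustration.

```python
def rescale(values):
    """Rescale values linearly to 0..1 so each becomes a relative score."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bivariate_label(x, y, low=1 / 3, high=2 / 3):
    """Label a grid cell from its relative proficiency (x) and growth (y).

    The cutoffs are assumed; the post does not say where the ranges split.
    """
    if x >= high and y >= high:
        return "black (high in both)"
    if x <= low and y <= low:
        return "white (low in both)"
    if x >= high and y <= low:
        return "blue (strong but losing ground)"
    if x <= low and y >= high:
        return "red (rising up)"
    return "transparent (middle of the range)"


# Made-up grid cells: (average proficiency, average growth percentile)
cells = {
    "A": (80.0, 70.0),
    "B": (30.0, 25.0),
    "C": (78.0, 28.0),
    "D": (32.0, 68.0),
    "E": (55.0, 50.0),
}
names = list(cells)
xs = rescale([cells[n][0] for n in names])
ys = rescale([cells[n][1] for n in names])
labels = {n: bivariate_label(x, y) for n, x, y in zip(names, xs, ys)}

for name in names:
    print(name, labels[name])
```

Rescaling first is what makes the scores "relative": a cell is black or white because of where it sits within the dataset's own range, not because it crosses a fixed threshold.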

You can correlate any pair of attributes by simply selecting from one of the axes in the scatterplot legend. This next image compares average math and reading proficiency. First, notice there seems to be an even stronger correlation between these two variables than the previous set. (The points line up even closer on the imaginary diagonal line.) It’s also interesting to compare these two images: notice how the schools in some locations are relatively strong (shaded black) or weak (shaded white) in both visualizations, while others show a particular weakness in one of the metrics.
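"Points line up closer on the diagonal" can also be expressed as a number. The sketch below computes a Pearson correlation coefficient by hand. The three lists are invented values, not SchoolVIEW data, and SpatialKey itself does not require (or, as far as this post shows, report) this calculation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up per-location averages, for illustration only
math_proficiency = [40, 55, 60, 72, 85]
reading_proficiency = [42, 57, 61, 70, 88]
math_growth = [55, 40, 62, 48, 70]

r_reading = pearson(math_proficiency, reading_proficiency)
r_growth = pearson(math_proficiency, math_growth)
print(f"math vs reading proficiency: {r_reading:.2f}")
print(f"math proficiency vs growth:  {r_growth:.2f}")
```

A coefficient closer to 1 corresponds to a tighter diagonal in the scatterplot legend. In this invented sample the proficiency pair is the more strongly correlated of the two.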

Selecting a point on the scatterplot shows the corresponding location on the map. In this case, we've highlighted a school that is an outlier because it's relatively strong in math versus its performance in reading, relative to other schools. We can easily see this school is in Moffat County. (Click the image for a larger view.)

SpatialKey makes it easy to uncover and visualize these relationships, and to share them with others. From uploading the spreadsheet with school data to presentation, this only took a few minutes to create – without any programming or hassle. And, this is just the start. By adding filters we can see these trends for schools of certain sizes or types, or compare these trends over time.

Further Analysis

An interesting next step would be to see if there is any correlation between the areas that deviate from the norm in school performance and changes in property values. For example, are the “rising up” areas ones where real estate values have been growing faster than average, or where gentrification is taking place? (Of course, determining causality is a whole different conversation!) One could bring additional real estate or demographic data into SpatialKey to help answer those questions. SpatialKey makes it easier to understand the relationships between disparate datasets.

Try it out for yourself

Don’t take our word for it. You can start uploading your own data and visually correlating it right away by signing up for the 30-day trial of SpatialKey. Or, contact us and we’ll be happy to walk you through the process.


Comparing Thematic Maps with Density Heatmaps

4 February 2010

Now that we’ve rolled out thematic mapping by state, county, and zip code in SpatialKey, you can produce some fantastic thematic maps with only a few mouse clicks. But it’s important to understand how these thematic maps represent your data, and when it might be appropriate to use thematic maps versus density maps. Both are useful, and SpatialKey makes switching between the two methods easier than it has ever been before.

We’ll compare a zip-code thematic map with a heatmap. Both maps show average home sale price by geographic area (either zip codes or clusters of points). The image below shows the two map types side by side.
thematic_heatmap_comparison

Now we’ll step through an analysis of these different map types to see why they produce different views of the same data.

Thematic map by zip code

First, let’s take a look at mapping home sales in Sacramento by zip code. The map below shows thematic zip codes colored by the average sale price. You can see the highest range is $400,000 and up and includes 3 zip codes in the image below. I want to focus on comparing the two labeled zip codes, 95818 and 95822. You can see that the 95822 zip code area has a much lower average sale price than 95818, which is immediately north of it.

sacramento_prices_zip_thematic

Density heatmap with zip-code boundaries

However, if we switch to a density heatmap we see a different picture. Switching from thematic zip codes to a density map takes literally 3 clicks in SpatialKey. The map below shows average sale price as a density map, with the boundaries of the zip codes overlaid in red. This is the exact same data showing the exact same attribute (home sales showing average sale price). But if you compare this image with the thematic map above you’ll notice that the hotspots tell a different story. A fluid area that overlaps both the zip codes we looked at above is actually the area with the high average prices. That area doesn’t cleanly fall into a single zip code.

sacramento_prices_heatmap_w_zip

This isn’t too shocking, since it intuitively makes sense that fairly arbitrary boundaries like zip codes wouldn’t directly map to more or less expensive areas of town. But it illustrates the difficulty of rendering your data thematically by certain shapes, like zip codes or counties.
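The effect is easy to reproduce with a handful of points. The sketch below averages the same made-up sales two ways: by zip code, and by a coarse grid that stands in for the density view. The coordinates, prices, and the 0.01-degree cell size are assumptions for illustration, and the grid is a simplification, not SpatialKey's heatmap algorithm.

```python
import math
from collections import defaultdict


def average_by(records, key):
    """Average sale price of records grouped by key(record)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        group = key(record)
        sums[group] += record["price"]
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def grid_cell(record, size=0.01):
    """Snap a point to a square cell that is `size` degrees on a side."""
    return (math.floor(record["lat"] / size), math.floor(record["lon"] / size))


# Made-up sales: two expensive homes sit on either side of a zip boundary
sales = [
    {"zip": "95818", "lat": 38.5750, "lon": -121.495, "price": 260000},
    {"zip": "95818", "lat": 38.5558, "lon": -121.495, "price": 520000},
    {"zip": "95822", "lat": 38.5512, "lon": -121.495, "price": 500000},
    {"zip": "95822", "lat": 38.5250, "lon": -121.495, "price": 180000},
    {"zip": "95822", "lat": 38.5150, "lon": -121.495, "price": 170000},
]

by_zip = average_by(sales, key=lambda r: r["zip"])
by_cell = average_by(sales, key=grid_cell)

print("by zip: ", by_zip)
print("by cell:", by_cell)
```

Each zip average blends the expensive home with cheaper ones elsewhere in the same zip, while the grid cell that straddles the boundary keeps the two expensive homes together. That is the hotspot the zip-code map smooths away.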

Density heatmap with neighborhood boundaries

To further analyze the dataset I decided to load in the boundaries of the neighborhoods in Sacramento (the file was downloaded here). Now we see boundaries that come much closer to matching the home prices. Intuitively this also makes sense; if you think about home prices in your city you’ll likely think of expensive and cheap neighborhoods, not zip codes.

sacramento_prices_heatmap_w_neighborhoods

Everything has its place

Both thematic maps and density maps are useful when exploring geographic data. Both show you important aspects of your data, but it’s important to keep in mind the inherent limitations of the different methods. With SpatialKey, we provide you with the tools to easily switch back and forth between these rendering methods in seconds.

Try it out for yourself

You can start uploading your own data and making thematic maps right away by signing up for the 30-day trial of SpatialKey.


Visual mapping and analysis for “regular” business users?

1 February 2010

We all know that a picture is worth a thousand words. Images from Tiananmen Square, September 11th, or the recent devastation in Haiti are universally understood and move people to action more than words ever could. Seeing events, rather than reading about them, is becoming more and more prevalent as an increasing number of people get their information from the web or a cell phone. In parallel with the upsurge in the use of images and multimedia content to communicate information, the advent of Google Earth, online maps, and car and phone navigation tools has created an explosion in the use of visual maps in everyday life. Instead of reading text, we are now given maps that make it easier to see how to get from point A to point B, or where to find open homes in a specific neighborhood. For most of us, seeing is understanding and believing.

Photo courtesy of CNN and Google maps.

On the business side, 80% of business data has a location component, which provides a goldmine of untapped information for marketing, sales, and operations. But current visual mapping and analysis tools are expensive, can only be used by trained specialists, and require heavy IT involvement to set up and maintain. This is a big barrier to entry for most businesses. They want to “see”, understand, and communicate data trends, but have neither the time nor the means to invest in yet another expensive infrastructure.

The businesses that already do leverage visual mapping and analysis can more effectively and more quickly see geographic or time-based data and trends critical to sales and operations. This provides them a real competitive advantage. Many oil and gas companies for example have invested in sophisticated Geographic Information Systems (GIS) and brought in GIS specialists to gain insight on their location intelligence via visual maps. This allows them not only to plot areas with the highest potential to drill in, but also better manage their pipelines, operations, retail facilities, and more.

Thankfully, a revolution is taking place that allows “regular” business users (with neither GIS training nor deep pockets) to leverage the power of visual mapping and analysis. Enter Software as a Service (SaaS). SaaS is transforming mapping and data visualization in the business world the same way Google Maps revolutionized mapping for consumers. Using cost-effective, user-friendly SaaS mapping and analysis applications, such as SpatialKey, organizations of all types and sizes can now import their business data, combine it with geographic or competitive information, and start visually analyzing trends critical to their business. Where are key customers located? How can they maximize results in their sales territories? How best to map their sales territories? Where should they open a new retail outlet? How do Q2 sales compare to Q1 on a geographic basis? What marketing campaign resulted in the highest ROI? And so much more.

Opportunities and threats previously hidden within row and column-based datasets are now clearly visible via interactive maps. Concepts difficult to explain in text or PowerPoint presentations can now also be shown and therefore easily understood resulting in better decision making. What’s more, since everyday decision makers can use these applications, “what if” questions can be answered on the fly versus having to wait for an analyst to do a new data query. Decision-making, communication, and collaboration are improved. After all, seeing is understanding and believing, even in the business world.

Note: we’ll be adding blog posts around visual mapping for sales and marketing users over the next few weeks. In the meantime you can find out more at our sales and marketing and/or enterprise solutions pages.


The easiest way to create thematic maps by state, county, or zip code

21 January 2010

We’ve just launched new thematic mapping features in SpatialKey that let you create maps of your data by state, county, or zip code with a few simple mouse clicks. We think this is the easiest way to create thematic maps – ever. To show off these abilities I’ll show an example of creating a thematic map of unemployment rate by US county. The end result will look like this:

unemployment_thematic_counties

Find some data

Your data must have location details down to the level of granularity that you are trying to map. For instance, if you want to show a map of states, every record in your data should at least have a US state. (Your data can be more granular, too: you can map address-level data by state if you want.) In this example I’ll be mapping the US unemployment rate. The data for unemployment is provided by the Bureau of Labor Statistics and can be found here. I took the latest stats by US county and extracted only the data for October 2009.

After just a little massaging in a spreadsheet program my data looked like this:

spreadsheet_counties

You can download the CSV file that I used if you’d like to try it out for yourself.
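I did this step in a spreadsheet, but if you would rather script it, the extraction might look something like the sketch below. The column names, the period format, and every value in the sample are hypothetical. The real BLS file layout is not shown in this post, so adjust the names to match whatever you download.

```python
import csv
import io

# Made-up sample in a hypothetical layout (not the real BLS format)
RAW = """state,county,period,labor_force,unemployment_rate
State A,County 1,2009-09,120000,7.9
State A,County 1,2009-10,121000,8.2
State A,County 2,2009-10,9000,15.4
State B,County 3,2009-09,50000,6.0
State B,County 3,2009-10,50500,6.3
"""


def rows_for_period(text, period):
    """Return the CSV rows whose 'period' column equals `period`."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["period"] == period]


def to_csv(rows, fieldnames):
    """Write the selected columns of `rows` back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


october = rows_for_period(RAW, "2009-10")
upload_ready = to_csv(
    october, ["state", "county", "labor_force", "unemployment_rate"]
)
print(upload_ready)
```

The output keeps one row per county with its state, which is the location detail the map needs for a county-level or state-level view.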

Upload to SpatialKey

Once you have your data ready, you can upload it to SpatialKey. If you don’t already have a SpatialKey account, you can sign up for the free 30-day trial to get access right away and start uploading. During the upload process you’ll be asked to identify the location columns in your data, like street address, zip code, etc. We’ll do our best to automatically identify these columns based on your data, but you might have to help us out.

upload_1 upload_2 upload_3

Make your map

When you load your data onto a map you’ll be asked what kind of map you want to create. We’ll make a thematic shape map, and we’ll choose to map the data by US geography (this includes state, county, or zip code).

map_wizard_1 map_wizard_2

Then we choose what the map should display. In this example we want to show the unemployment rate, so I’ll pick average unemployment rate, which starts me off with a map of the US states showing the average unemployment rate across the counties in each state.

map_wizard_3 map_wizard_4

Now our thematic map shows the average unemployment rate for all the counties aggregated by state.
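It helps to be clear about what "average of the county rates" means. The sketch below assumes a plain mean over the county rows in each state and compares it with a labor-force-weighted mean. The rows are made up, and the weighted version is shown only for contrast; it is not a claim about what SpatialKey computes.

```python
from collections import defaultdict

# Made-up county rows, for illustration only
counties = [
    {"state": "State A", "county": "County 1", "labor_force": 100000, "rate": 5.0},
    {"state": "State A", "county": "County 2", "labor_force": 10000, "rate": 15.0},
    {"state": "State B", "county": "County 3", "labor_force": 50000, "rate": 8.0},
]


def mean_rate_by_state(rows):
    """Plain average of county rates; every county counts equally."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["state"]].append(row["rate"])
    return {state: sum(rates) / len(rates) for state, rates in grouped.items()}


def weighted_rate_by_state(rows):
    """Average of county rates weighted by each county's labor force."""
    weighted_sum = defaultdict(float)
    total_force = defaultdict(float)
    for row in rows:
        weighted_sum[row["state"]] += row["rate"] * row["labor_force"]
        total_force[row["state"]] += row["labor_force"]
    return {state: weighted_sum[state] / total_force[state] for state in weighted_sum}


plain = mean_rate_by_state(counties)
weighted = weighted_rate_by_state(counties)
for state in plain:
    print(f"{state}: plain {plain[state]:.1f}%, weighted {weighted[state]:.1f}%")
```

With a plain mean, a small county with a high rate pulls the state figure as hard as a large county does, so a state-level value computed this way is not the same thing as the state's overall unemployment rate.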

thematic_states

I can switch between this state view of the data and counties by changing the options in the layer’s settings panel.

thematic_options

Now we have a map of all the counties in the US (including Puerto Rico) that shows the unemployment rate of each county.

unemployment_thematic_counties

Customize and Explore

You can easily customize the bin ranges if you want to tweak them, or you can control the colors used (all maps are the same, just with a different color scheme):

unemployment_us_counties_blue unemployment_us_counties_green

unemployment_us_counties_purple unemployment_us_counties_bw
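Under the hood, a binned thematic map comes down to looking up which range a value falls into and then picking that range's color. Here is a minimal sketch of that lookup. The break values and color names are assumptions, not SpatialKey's defaults.

```python
from bisect import bisect_right

# Assumed upper-exclusive break points, in percent unemployment
BREAKS = [5.0, 7.5, 10.0, 15.0]
# One color per bin: len(BREAKS) + 1 entries, lightest to darkest
COLORS = ["lightest", "light", "medium", "dark", "darkest"]


def bin_index(value, breaks=BREAKS):
    """Index of the bin holding value; a value equal to a break goes up."""
    return bisect_right(breaks, value)


def color_for(value, breaks=BREAKS, colors=COLORS):
    """Color assigned to a value under the given bins and color scheme."""
    return colors[bin_index(value, breaks)]


for rate in (4.9, 5.0, 12.0, 20.0):
    print(rate, color_for(rate))
```

Changing the color scheme swaps only the `COLORS` list, which is why the four maps above show the same pattern. Changing the bin ranges edits `BREAKS` and can move counties from one shade to another.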

You can also use all the filtering options that SpatialKey offers to filter the data in your thematic maps. Here’s an example of filtering to only show counties where the total labor force is over 100,000.

thematic_labor_force_100000

And here’s another example to show only the counties where the unemployment rate is greater than 15%:

theamtic_unemployment_15percent
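The two filters above amount to simple threshold tests on each county row. As a rough sketch (with made-up rows and hypothetical field names), they could be expressed like this:

```python
def over(field, threshold):
    """Build a test that passes rows whose field is strictly above threshold."""
    return lambda row: row[field] > threshold


def apply_filters(rows, *tests):
    """Keep only the rows that pass every test."""
    return [row for row in rows if all(test(row) for test in tests)]


# Made-up county rows, for illustration only
counties = [
    {"county": "County 1", "labor_force": 250000, "rate": 9.5},
    {"county": "County 2", "labor_force": 8000, "rate": 16.2},
    {"county": "County 3", "labor_force": 120000, "rate": 15.8},
    {"county": "County 4", "labor_force": 100000, "rate": 4.1},
]

large = apply_filters(counties, over("labor_force", 100000))
high_rate = apply_filters(counties, over("rate", 15.0))
both = apply_filters(counties, over("labor_force", 100000), over("rate", 15.0))

print([row["county"] for row in large])
print([row["county"] for row in high_rate])
print([row["county"] for row in both])
```

Passing more than one test narrows the map to counties that satisfy all of them at once.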

No Programming Required

To generate these maps you don’t have to write a single line of code. It’s as simple as uploading your data and stepping through a few guided steps. If you wanted to change the map to show the total labor force per county instead of the unemployment rate, it only takes 3 clicks. There are lots of ways to make these maps, as this great tutorial on FlowingData shows, but we think SpatialKey gives you the easiest way to create and analyze thematic maps.
