tutorial 6.1:

Working with tabular data in Excel

1) Introduction to Excel

The Excel spreadsheet software is hugely useful as a tool for working with tabular data. Many manipulations of data that can take some work to accomplish in code are immediately solved with built-in functions in Excel, so it's possible to save considerable time by pre-structuring your CSV in Excel before bringing it in to Processing. Here are a few types of functionality in Excel that can save you time:

  • sort the rows based on values in a selected column or columns
  • table rotation (change rows to columns and vice-versa)
  • find the average, max, min, and standard deviation for a given column
  • convert time values to Unix Timestamp
  • insert new columns in a table with values pulled from another table based on conditional evaluation

In addition to using Excel to restructure and extract information from the data in your table(s), Excel also has extensive built-in capacity for visualization. And, this built-in capacity can be extended through the use of custom templates which provide additional functionality, such as the NodeXL network graphs template for Excel.

Here are a few ways to visualize tabular data using Excel:

  • use the built-in charts (pie chart, bar chart, etc)
  • conditionally colour cells in table to give a quick visualization of the distribution of values
  • plot values on a map

2) Alternatives to Excel

If you don't happen to have a license for Excel, there are several online spreadsheet applications and open source alternatives for download. Here are a few of these:

3) Workshop examples

You can download all the examples referred to in class here (143MB).