I am currently using this blog for a series of exploratory visualizations based on the 1941 South American tour of American Ballet Caravan. The related posts begin with a bit about the datasets themselves here and here. The menu at the top left lists all of the posts in chronological order. A few of my favorites are this one, this one, this one, and this one.

Mapping Tours through the Dancers’ Eyes

These maps are based on an archival document I have been fascinated by for a while: the dancer William Dollar’s pseudonymised 410-page account of American Ballet Caravan’s South American tour, entitled “Old Granny Spreads Goodwill.” They continue the work I did with the dancer birthplaces and citizenship, in that they are meant to complicate the exchange of touring by visualizing it in terms of a more global network instead of something point-a-to-point-b or even a loop. I see these maps as the first stage of an associative map that depicts the world as it was seen and imagined from the perspectives of dancers on tour (as recorded in this case by a single dancer).

I constructed the underlying dataset for these by searching through Dollar’s manuscript for moments when allusions or comparisons to other places appear. For example, late in the tour, a location just outside Cucuta is compared to the “Badlands of North Dakota.” Early on in the tour, European cities tend to provide more regular points of reference for Dollar, while later on he increasingly references the South American cities the company previously visited. Languages were also part of how dancers built imaginative associations that remapped their world; there is a great description of “Maestro” (Balanchine) in Rio de Janeiro attempting to apply his English, French, and Russian to the Portuguese journal in front of him. While some associations seem to be based on places to which the dancers had previously traveled, others were likely imagined, such as the comparison between the Guaya and Congo rivers.

For this first map, I used color to indicate the city from which the association is being made, and a composite operation makes overlapping lines appear darker (i.e., wherever the same line would be drawn more than once). Note: As always, if you want to get more up close and personal than the basic zoom and drag functions that are embedded, the best way is to click the share button in the upper right, and then “link to this map” will take you to CartoDB’s site.

This first iteration of the dataset required many decisions. In order to be able to filter later, I categorized each GeoJSON linestring by location, nationality, or language, depending on how the association was made. However, it would be useful to introduce additional gradation in order to account for the difference between encountering people from a place — for example, a dancer taking up with a Polish refugee in Bogota — and a comparison to the distant place itself. At this point, I omitted all references to historical backgrounds about places, as well as the backgrounds of the dancers themselves, if neither were part of framing a particular experience. While I stuck primarily to locations that are referenced by name in general or specific form, I did fill in gaps in a few instances, such as locating “Nazi” in Germany, or correcting the transliterated “Rawshia” to Russia.
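As a rough illustration of the categorization described above, here is a minimal Python sketch of how one association might be stored as a GeoJSON linestring and filtered later. The property names (`from_city`, `to_place`, `category`), the helper function, and the coordinates are all assumptions for the sake of the example (the coordinates are approximate), not the actual schema of my dataset.

```python
import json

# The three categories named in the post; the identifiers are illustrative.
ALLOWED_CATEGORIES = {"location", "nationality", "language"}

def association_feature(from_city, from_lonlat, to_place, to_lonlat, category):
    """Build one GeoJSON Feature: a line from the tour city to the place it evokes."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON orders coordinates as [longitude, latitude].
            "coordinates": [list(from_lonlat), list(to_lonlat)],
        },
        "properties": {
            "from_city": from_city,
            "to_place": to_place,
            "category": category,
        },
    }

# Approximate coordinates, for illustration only.
features = [
    association_feature("Cucuta", (-72.5, 7.9),
                        "Badlands of North Dakota", (-103.5, 46.9), "location"),
    association_feature("Bogota", (-74.1, 4.7),
                        "Poland", (19.0, 52.0), "nationality"),
]

collection = {"type": "FeatureCollection", "features": features}

# Filtering by category is then a one-liner.
by_location = [f for f in features if f["properties"]["category"] == "location"]

print(json.dumps(collection, indent=2))
```

An extra property (say, a hypothetical `encounter` flag) would be one way to add the gradation between meeting people from a place and comparing to the place itself.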

As one final test, I layered this dataset with another that I built previously regarding the number of shows in each location, from lightest (fewest shows) to darkest (most shows). Here, I also flipped the coloring of the lines. Whereas on the first map, lines were colored by the city from which the association originated, on this second map, the association lines are colored by the (most common) cities to which they refer. Later on in the tour, you see how dancers returned to Rio and Buenos Aires, where they had stayed longer, as newly acquired points of reference.

This is still a first step. While I have noted some of the ways this dataset could be further refined, there are other places to go, as well. It would be great, for example, to begin to warp the basemap itself, whether historical or contemporary, in order to redraw a picture of the world from the dancers’ eyes.

Colliding with Other Datasets: Population Versus Number of Shows

One of the things on my to-do list has been to collide my touring data with other available datasets from 1941. In the long term, such digital work not only offers new ways to analyze dance history, but also points up new ways to make dance materials available to historical studies in other fields.

Unfortunately, very little historical data from the tour year is already collated online. But I did use a combination of the World Almanac and Book of Facts from 1941 and 1942, together with the 1941 Rand McNally Commercial Atlas to construct a little dataset of populations for the cities in which American Ballet Caravan performed. I then ran these against the number of shows performed in each city, in order to see whether any stood out as anomalies.

For this map, I have divided the color of the cities by standard deviations above or below the mean. The farthest outliers are -2 and +3. The dataset is not really large enough to support a high level of confidence, but it at least gives some sense of outliers. Here it’s possible to compare Mendoza, where 3 shows were performed for a population of 76,780 to Rosario, where 2 shows were performed for a population of over half a million, or Sao Paulo, which had 4 shows for 1,151,249 residents. Medellin comes out at precisely the mean. So why did they do so few shows in Sao Paulo? Or so many in Mendoza?

The map is color coded by standard deviation, with relevant information (city, population, and number of shows) appearing on the hover.
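For readers who want to try this kind of comparison, here is a minimal sketch in Python. It makes several assumptions: that the quantity compared is shows per 100,000 residents (one plausible way to run shows against population, not necessarily the one behind the map), that cities are binned by rounding to whole standard deviations, and that Rosario’s population is the placeholder 500,000 (the figure above is only “over half a million”). With three cities the numbers mean very little; the point is the procedure.

```python
from statistics import mean, pstdev

# (shows, population). Mendoza and Sao Paulo figures come from the post;
# Rosario's population is a placeholder for "over half a million".
cities = {
    "Mendoza": (3, 76_780),
    "Rosario": (2, 500_000),
    "Sao Paulo": (4, 1_151_249),
}

def shows_per_100k(shows, population):
    """Assumed metric: performances per 100,000 residents."""
    return shows / population * 100_000

rates = {city: shows_per_100k(*values) for city, values in cities.items()}

mu = mean(rates.values())
sigma = pstdev(rates.values())  # population standard deviation

# z-score: how many standard deviations a city sits above or below the mean.
z_scores = {city: (rate - mu) / sigma for city, rate in rates.items()}

# Round to whole standard deviations to get a color class per city.
color_class = {city: round(z) for city, z in z_scores.items()}

for city in cities:
    print(f"{city}: {rates[city]:.2f} shows per 100k, z = {z_scores[city]:+.2f}")
```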

Mapping a More Distributed Picture of Exchange: ABC Traveler Demographics

With these maps, I take a break from the South American route itself to look instead at where the dancers and other travelers on the 1941 tour came from. This creates a (somewhat) more distributed picture of exchange than one might imagine when looking at only those singular lines that begin and end in New York harbor.

The data comes from a series of documents collected in preparation for travel visas, which required that dancers provide proof of birthplace, birthdate, and current citizenship. For most dancers born inside the United States, the lists are relatively straightforward, except where documentation had been lost. For those born in one of seven countries outside of the United States, there is additional information and dates for immigration status (“alien registration”, “first papers”), quota numbers, and cities of application. For some reason, Balanchine’s birthplace is never specified beyond “Russia.”

In manually collating the datasets, I focused primarily on place of birth and current citizenship, which are only two of the many possible ways to mark affiliations to place. A dancer born in London was domiciled in Montreal, worked in the US under a quota visa, but traveled under a valid Canadian passport. Two dancers born in different cities in Germany required extensive documentation, since they were officially listed as “stateless.” These examples suggest how many lives exceed the chosen anchors.

To take a general overview: the travelers were born in 36 cities across 8 countries: the United States, England, Germany, Mexico, Cuba, Australia, Russia, and Canada. Between them, their birthplaces touch several continents. New York City dominates the United States cities, followed by Philadelphia. Beyond what we think of now as the northeast corridor (New Jersey, Connecticut, Massachusetts, Pennsylvania), travelers came from Illinois, Ohio, Utah, Oklahoma, Washington, North Carolina, California, and Missouri. Dancers held citizenships in four countries, plus the stateless Germans, whose passports had been invalidated.

Here is a map marking places of birth. The coloring is by country, but specific city names appear when you hover over a particular dot:

Whereas here is a map where places of birth are colored by the travelers’ citizenship as of spring 1941, when they set out on tour:

Maps like these do not take into account the travels of dancers and other tour workers between their birth and the spring of 1941. For example, while northern California is not represented, two of the dancers had come from San Francisco Ballet. Likewise, the wardrobe mistress from Russia had actually been on Ballet Russes’s South American tour, and another dancer is listed as previously performing for “Monte Carlo,” although the dates are not specified.

But there is another map that I’d like to make even more. It’s a point-to-point map, based not on factual data such as this, but on the “fictional” novel written on tour. (Update: I did it and it’s here.) Each time a location is mentioned, I would draw a line between the city where the novel is set at that moment and the place being referenced. In this way, it would be possible to build a map of interconnected places from the perspective of those on tour. There are many decisions to make with this: would it need to distinguish between external and internal referents, i.e., world events versus associations in the dancers’ own minds? What counts as a concrete enough reference to use? And how might enough lines ultimately warp the basemap to create a different picture of the world landscape?

In the meantime, one last map experiment that colors birthplace by company role to think a bit more about labor:

Historical Basemaps

Up until now, I have been working with relatively generic contemporary baselayer maps. But I was interested to see how different certain older maps might be, given some of the political upheaval in South America in the past 80 years. Of course, however, I first had to get lost in the fabulousness of the historical maps themselves.

For example (and extremely apropos), in 1942 Ernest Dudley Chase published what was called the “Good Neighbor Pictorial Map of South America.” While one digital collection explains that this was meant to offer up “a positive message of solidarity between South America and the U.S.”, another indicates that it was used by the Moore-McCormack Lines. This was in fact the ocean liner company that American Ballet Caravan used between New York and Rio at the start of their tour.

Other maps were created for different uses, such as this 1942 physical-political classroom map, designed for viewing from up to 40 feet away (and used in UC Berkeley’s Department of Geography). For the moment, I have started with a Rand McNally road map from 1941.

Unfortunately, none of the pre-referenced maps in the extensive David Rumsey Map Collection corresponded closely enough to locations and dates. But the collection does allow users to contribute their own georeferencing, so I started working on the road map with some clumsy correlations of my own. Mike then took the next step to develop a Mollweide projection, which he then rubbersheeted to match some prominent cities. He also wrote me out a tutorial in the process.

The historical maps do more than simply provide an accurate baselayer. For example, I have been wondering about some of the empty spaces on the maps that I have been creating so far. In other words, where didn’t the tour go and why? Or why choose these particular cities? I have been searching for other open datasets pertaining to South America in 1941, which might help answer such questions via population, GDP, etc. However, looking at the 1941 atlas, it is conspicuous that the most dominant lines and therefore the largest roads on the map trace out almost exactly the path taken by American Ballet Caravan (even though the company itself often traveled via other means). I’ve turned on the breadcrumbs in the cumulative animation, so that this pattern is visible.

Merging Tables and Layering Maps

In progress: a post about researching and georeferencing fabulous historical basemaps. (Really… who can resist a 1942 map entitled “The Good Neighbor Pictorial Map of South America”?) But for now: a bit of experimentation with composite maps. Before moving on to keying off of other datasets, there is more to do with putting together my own data differently.

CartoDB allows users to merge datasets and also to layer multiple datasets. This screengrab comes from a map that is not very pretty (yet), but informative nonetheless.

This map consists of two layers. The farther-back layer is drawn from a dataset with the routes converted to GeoJSON linestrings. Here each string is labeled with the type of transportation. Right now, these are simply represented as direct lines, although in the future they could be arced, or even made to run along a suggested route (so that boats would not be, for example, running on land).

The second layer is based on several tables which have been merged inside CartoDB on the basis of “stay” identifiers. The choropleth is based on number of performances per city, from 1 to 18. I made the distribution uneven, because there is a much finer gradient of performances in the lower numbers than in the upper, and therefore the difference between 1 and 2 performances seems more significant than, say, 15 and 18. The cities with null values are white with a 50% opacity (they were only passed through in transit). I’ve also added labels that appear on a hover and list the city name and number of performances for the cities in which performances occurred, although they are currently blank for transit cities.
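The merge and the uneven binning happen inside CartoDB, but the logic can be sketched outside it. The Python below is a rough stand-in, not CartoDB’s own mechanism: the stay identifiers, the one-row-per-show table layout, and the bin breaks are all assumptions. The show counts for Sao Paulo and Mendoza come from the population post above, Vina del Mar is treated as a single performance, and Callao stands in for a transit-only city.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical "stays" table: one row per stay, keyed by a stay identifier.
stays = [
    {"stay_id": "s1", "city": "Sao Paulo"},
    {"stay_id": "s2", "city": "Mendoza"},
    {"stay_id": "s3", "city": "Vina del Mar"},
    {"stay_id": "s4", "city": "Callao"},  # transit only, no shows
]

# Hypothetical "shows" table: one row per performance, pointing at a stay.
show_rows = (
    [{"stay_id": "s1"}] * 4 + [{"stay_id": "s2"}] * 3 + [{"stay_id": "s3"}]
)

# Count performances per stay, then join the counts onto the stays.
counts = Counter(row["stay_id"] for row in show_rows)
merged = [
    {**stay, "performances": counts.get(stay["stay_id"])}  # None if no shows
    for stay in stays
]

# Uneven upper bounds: narrow classes at the low end, wide ones at the top.
BREAKS = [1, 2, 3, 5, 9, 18]

def bin_index(performances):
    """Return the color class for a count, or None for transit cities."""
    if performances is None:
        return None
    return bisect_left(BREAKS, performances)

for row in merged:
    row["bin"] = bin_index(row["performances"])
```

Rows whose `bin` is `None` are the ones that would be drawn white at 50% opacity.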

This composite map is useful in how it begins to pull the pieces together. Montevideo, for example, suddenly appears very important as the only location not “on the way” — compare this with the one-night stands in Vina del Mar near the water, or Manizales along the rail journey. The addition of certain transit (non-performance) cities also begins to flesh out a more circuitous journey, such as the transition in Callao from boat to train between Valparaiso and Lima.

Getting an Outside Eye

Yesterday, I met with Mike Migurski to look over some of the datasets and maps that I’ve been setting up. A few key takeaways from the conversation:

  • There is a project from last November, which was doing something very similar in circling around the same datasets in different ways to see what insights turn up: Paul Downey’s One CSV, Thirty Stories (he only actually makes it to 21, but still fascinating).
  • How to key off of other datasets (for example with a GeoNames id), but also the complications of this for historical work, where borders, roads, population, etc. will have been different. Another interesting collision might be to explicitly follow this historical thread and run up against a WWII dataset.
  • Restructuring my database. While I’ve been working with two primary tables (“shows” and “routes”), Mike suggested adding a third for “stays,” so that each has a unique identifier, for example if dancers pass through a town more than once. We also discussed the difference between entering a value, for example in a “transportation” column, versus having five columns, one for each possible type of transportation, into which one selects only y/n.
  • How to deal with ambiguity. I’ve been struggling with knowing when dancers are in a given place in order to perform, but not how long they traveled to get there versus how much in advance of a given performance they arrived to rehearse (other archival evidence suggests this was not consistent — sometimes they even performed the night they arrived). We talked about dealing with these stays and trips by keeping track of four dates for each: i) earliest possible begin; ii) latest possible begin; iii) earliest possible end; iv) latest possible end. This is appealing to me, because it has the potential to highlight the offstage time, which is harder to track than the performances.
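The four-date idea from the last point can be sketched as a small record type. This is only an illustration in Python, with hypothetical field names and entirely made-up dates (they do not come from the archive): each stay carries its four bounds, and the shortest and longest possible durations fall out of them.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Stay:
    """One stay in one place, with uncertain start and end dates."""
    stay_id: str
    city: str
    earliest_begin: date
    latest_begin: date
    earliest_end: date
    latest_end: date

    def __post_init__(self):
        # The bounds must be ordered for the record to make sense.
        if not (self.earliest_begin <= self.latest_begin <= self.latest_end):
            raise ValueError("begin bounds are inconsistent")
        if not (self.earliest_begin <= self.earliest_end <= self.latest_end):
            raise ValueError("end bounds are inconsistent")

    def min_days(self):
        """Shortest possible stay: arrive as late, leave as early as possible."""
        return max(0, (self.earliest_end - self.latest_begin).days)

    def max_days(self):
        """Longest possible stay: arrive as early, leave as late as possible."""
        return (self.latest_end - self.earliest_begin).days

# Placeholder values only; none of these dates are archival.
example = Stay(
    stay_id="s1",
    city="(hypothetical city)",
    earliest_begin=date(1941, 7, 1),
    latest_begin=date(1941, 7, 3),
    earliest_end=date(1941, 7, 8),
    latest_end=date(1941, 7, 10),
)

print(example.min_days(), example.max_days())
```

The gap between the two durations is one way to see how much offstage time is unaccounted for in a given stay.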