Ghost Hack

Design and Programming by Michael Chang
Arms Trade Visualization

    *edit* This post was created a few days ago but had to be taken down. Should be okay now that it’s up on Google’s official blog.

    ---

    I work for the Data Arts Team at Google, and we get a lot of cool oddball projects here and there, more of which I might write about in the future, permissions pending.

    Here’s something that is just wrapping up.

    Arms Trade Visualization (best viewed on Chrome on full-screen).

    A few weeks ago I was introduced to Dr. Robert Muggah, principal of the SecDev Group, who was giving a talk at the INFO summit (Illicit Networks Forces in Opposition), a conference about how information technologies can expose shitty things happening in the world (slave trade, illegal arms dealing, bad-actor networks). Somehow our team was asked to build a data viz from the available data, and the project fell into my lap.

    Of the data sets that were presented to me, one stood out as the most ripe for visualization: the (legal) arms trade data set collected by the UN and compiled by Robert Muggah and PRIO (Peace Research Institute Oslo).

    This data set contains about a hundred thousand “trades” between countries for weapons, as reported by the UN. The columns are: the year of the trade (as high a resolution as we could get, or at least that’s what they said); the exporter and importer country names; the exporter and importer country codes (who knows what they are? I didn’t look very far); a UN-specified weapons code that refers to categories like ammunition, military rifles, sporting rifles, etc.; and the US dollar amount for the trade, unadjusted for inflation.

    As an aside, I really enjoy data that comes in as “flat” as possible such as this CSV, especially for time series things. The trick is now to ‘bin’ this into something I can parse for a data viz.

    The first order of business is to sanitize the country names against something I could use. Upon reflection I probably should have done this by matching the country’s ‘code’ id with some database of countries, but my inquiry about this with the data provider didn’t turn up anything.

    From my previous projects I already had a country list that had ISO-3166 codes matched to country names, for example “United States” to “US”, or “Taiwan” to “TW”. In addition, the country codes are already paired up with latitudes and longitudes, which become important later on.

    Here are some examples of country names mismatching:

    In the CSV: “Antigua & Barbuda”
    In my set: “Antigua and Barbuda”

    In the CSV: “Congo (Republic of)”
    In my set: “Congo” 
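
    A quick way to surface mismatches like these is to list every name in the CSV that has no exact match in the reference list. Here is a minimal sketch of that idea. The helper name `findUnmatchedNames` and both sample arrays are made up for illustration and are not from the actual project:

```javascript
// Hypothetical helper: return the CSV country names that are missing
// from the reference list, each reported once, in first-seen order.
function findUnmatchedNames(csvNames, referenceNames) {
  const known = new Set(referenceNames);
  const unmatched = new Set();
  for (const name of csvNames) {
    if (!known.has(name)) {
      unmatched.add(name);
    }
  }
  return Array.from(unmatched);
}

// Made-up sample data built from the two mismatches mentioned above.
const csvNames = [
  "Antigua & Barbuda",
  "Congo (Republic of)",
  "Taiwan",
  "Antigua & Barbuda"
];
const referenceNames = ["Antigua and Barbuda", "Congo", "Taiwan", "United States"];

const unmatchedNames = findUnmatchedNames(csvNames, referenceNames);
console.log(unmatchedNames);
```

    A list like this is short enough to fix by hand, however many rows the CSV has.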

    I decided it’s better to ‘fix’ the data set’s naming than to come up with yet another match list.

    Instead of a straight “find and replace”, which would have been horribly messy with a hundred thousand entries and over two hundred and fifty countries, I opened the CSV in Google Refine, a nice little utility that lets me quickly sort the data set and see its issues. In addition to fixing the names, I could save out the “change-list” so that if I later found out I had screwed up somewhere I could essentially revert or adjust, as the change-list JSON acts as a modification history for the data (this came in super handy later when I discovered that I had missed a few countries).

    The next step is to “bin” the data into groups by year, instead of attempting to parse a hundred thousand pieces of data each time. This is helpful since I knew my data viz was going to be a time series, and I knew that it would be useful to scrub the timeline and see changes, so organizing the data in a way that makes this easy was high priority. 

    I wrote a small Python script to both convert and save the data out as JSON, which is really useful since I’m eventually going to parse it with JavaScript anyway, and here I can organize it in exactly the way I want to access the data. (I’m not sure if I can share this part, but it’s really simple.)
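
    The original script was Python and isn’t shown here, so the following is only a JavaScript sketch of the binning idea. The field names (`year`, `exporter`, `importer`, `code`, `value`) and the sample rows are assumptions, not the real CSV columns or data:

```javascript
// Group flat trade rows into per-year bins so the viz can look up one
// year at a time instead of scanning every row.
// Field names are assumptions; the real CSV columns may differ.
function binByYear(rows) {
  const bins = {};
  for (const row of rows) {
    if (!bins[row.year]) {
      bins[row.year] = [];
    }
    bins[row.year].push({
      exporter: row.exporter,
      importer: row.importer,
      code: row.code,
      value: row.value
    });
  }
  return bins;
}

// Made-up rows, only to show the shape of the output.
const sampleRows = [
  { year: 1999, exporter: "A", importer: "B", code: "930200", value: 1000 },
  { year: 1999, exporter: "C", importer: "B", code: "930621", value: 250 },
  { year: 2000, exporter: "A", importer: "C", code: "930330", value: 75 }
];

const binned = binByYear(sampleRows);
const binnedJson = JSON.stringify(binned);
console.log(binnedJson);
```

    With this layout, scrubbing the timeline to a given year is a single key lookup.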

    Originally the weapon codes were described as such:

    930100 – military weapons, and includes some light weapons and artillery as well as machine guns and assault rifles etc.  

    930190 – military firearms – eg assault rifles, machineguns (sub, light, heavy etc), combat shotguns, machine pistols etc

    930200 – pistols and revolvers

    930320 – Sporting shotguns (anything that isn’t rated as a military item).

    930330 – Sporting rifles (basically anything that isn’t fully automatic).

    930621 – shotgun shells

    930630 – small caliber ammo (anything below 14.5mm which isn’t fired from a shotgun).

    At some point we decided that the UN’s weapons trade categories were far too arbitrary and confusing, even to the expert we asked about them. So instead we just decided to combine the categories.
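
    Combining categories can be as simple as a lookup from weapon code to a broader bucket. The post does not say which codes ended up in which bucket, so the three bucket names and the grouping below are purely illustrative assumptions, as are the sample trades:

```javascript
// Illustrative grouping only: which codes were actually merged together
// is not stated, so these bucket names are assumptions.
const CATEGORY_BY_CODE = {
  "930100": "military",
  "930190": "military",
  "930200": "civilian",
  "930320": "civilian",
  "930330": "civilian",
  "930621": "ammunition",
  "930630": "ammunition"
};

// Sum the dollar value of trades per combined category.
function totalsByCategory(trades) {
  const totals = {};
  for (const trade of trades) {
    const category = CATEGORY_BY_CODE[trade.code] || "unknown";
    totals[category] = (totals[category] || 0) + trade.value;
  }
  return totals;
}

// Made-up trades for demonstration.
const categoryTotals = totalsByCategory([
  { code: "930100", value: 500 },
  { code: "930190", value: 300 },
  { code: "930621", value: 40 },
  { code: "999999", value: 1 }
]);
console.log(categoryTotals);
```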

    Okay! The JSON came out to be roughly the same file size as the CSV, weighing in at about 5 MB. Again, upon reflection, we could have used country codes to save space, but I wasn’t really worried at this prototype stage. I just wanted something on-screen so we could see it.

    You can see the output JSON here.

    Time to parse this guy. I won’t cover too much on WebGL, THREE.js and globe code, but I’ll gloss over a bit of it here.

    First I read in the country names, then matched each country name with its country code, latitude, and longitude. Here is the JSON that I produced using public data found on the internet.

    The following snippet is how I converted the lat lons into 3D geometry coordinates. The final code included a bit of a fudge factor to make things match up, but for the most part this works.

        // take the lat lon from the data and convert this to 3d globe space
        // (rad is the globe radius, defined elsewhere)
        var lon = country.lon - 90;
        var lat = country.lat;

        // phi is the polar angle measured from the north pole, theta the azimuth
        var phi = Math.PI / 2 - lat * Math.PI / 180;
        var theta = 2 * Math.PI - lon * Math.PI / 180;

        var center = new THREE.Vector3();
        center.x = Math.sin(phi) * Math.cos(theta) * rad;
        center.y = Math.cos(phi) * rad;
        center.z = Math.sin(phi) * Math.sin(theta) * rad;

        // save and catalogue
        country.center = center;

    Now we’re getting somewhere. Each line you see represents a trade. You can barely make out Europe in that tight, really bright patch of countries. The curvature is just a spline construction (see the code here).
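
    The linked spline code isn’t reproduced here. As a rough stand-in, one way to get a similar arc is a quadratic Bezier whose control point is the midpoint of the two country centers pushed outward from the globe’s center. This is a sketch under that assumption, not the project’s actual construction. The `lift` and `segments` parameters are hypothetical:

```javascript
// Vectors are plain [x, y, z] arrays; the globe is centered on the origin.
function vecLength(v) {
  return Math.hypot(v[0], v[1], v[2]);
}

// Sample a quadratic Bezier from start to end. The control point is the
// midpoint pushed away from the globe center so the curve arcs above the
// surface. Antipodal points (zero-length midpoint) are not handled here.
function arcPoints(start, end, lift, segments) {
  const mid = [
    (start[0] + end[0]) / 2,
    (start[1] + end[1]) / 2,
    (start[2] + end[2]) / 2
  ];
  const controlScale = (vecLength(start) + lift) / vecLength(mid);
  const control = [mid[0] * controlScale, mid[1] * controlScale, mid[2] * controlScale];

  const points = [];
  for (let i = 0; i <= segments; i++) {
    const t = i / segments;
    const a = (1 - t) * (1 - t);
    const b = 2 * (1 - t) * t;
    const c = t * t;
    points.push([
      a * start[0] + b * control[0] + c * end[0],
      a * start[1] + b * control[1] + c * end[1],
      a * start[2] + b * control[2] + c * end[2]
    ]);
  }
  return points;
}

// Two made-up points on a globe of radius 100.
const arc = arcPoints([100, 0, 0], [0, 100, 0], 50, 10);
console.log(arc.length, vecLength(arc[5]));
```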

    Already it’s rendering slowly since *each* trade is a GL line draw call, so I decided to dump all of them into a vertex buffer, and it works much faster, but we have this laser ball now pointing to the middle. That’s okay though, since it will be covered up by an eventual sphere.
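
    The batching idea can be sketched without any GL code: flatten every trade’s points into one typed array laid out as independent segments, then hand that single array to the GPU and draw it once. The function name and the sample polylines are made up, and the upload step is left out:

```javascript
// Flatten many polylines into one Float32Array laid out as independent
// segments (two vertices per segment), which is the layout gl.LINES expects.
function buildSegmentBuffer(polylines) {
  let segmentCount = 0;
  for (const line of polylines) {
    segmentCount += Math.max(line.length - 1, 0);
  }

  const buffer = new Float32Array(segmentCount * 2 * 3);
  let offset = 0;
  for (const line of polylines) {
    for (let i = 0; i + 1 < line.length; i++) {
      buffer.set(line[i], offset);         // segment start
      buffer.set(line[i + 1], offset + 3); // segment end
      offset += 6;
    }
  }
  return buffer;
}

// Two made-up trade lines: one with three points, one with two.
const segmentBuffer = buildSegmentBuffer([
  [[0, 0, 0], [1, 1, 0], [2, 0, 0]],
  [[5, 5, 5], [6, 6, 6]]
]);
console.log(segmentBuffer.length);
```

    Repeating the shared vertex of neighboring segments costs some memory, but it keeps separate trades from being joined to each other.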

    It’s also important to identify the trades by country, so here each country is assigned a random color that is then applied to its lines. By using additive blending and reducing the brightness, we can get a sense of volume for the trading area.

    Next we need to see the countries! The most obvious way is to bring in some NASA texture map or a PNG of some country border, but that’s kind of boring, so I decided to use a “pin map” version of the world map represented by vertices. As you can see, each country is composed of a series of vertices, and I’m simply extruding them up to form hexagon bricks that I then color with the randomized country colors.

    The data viz is now lacking a key layer of information… “which country is selling, and which country is buying”? How do I represent importers and exporters? I ended up with this experiment where particles represent weapon sales and are traveling from the country of origin to destination so you can see who are the primary exporters and importers.

    Obviously you can only see how this works in motion, but you can tell from the final version that this solution is what I ended up with.
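
    The particle motion boils down to moving a point along the trade’s line from exporter to importer. Here is a sketch of that interpolation, assuming the line is available as a list of points. It steps by point index rather than true arc length, which is a simplification, and the names are hypothetical:

```javascript
// Return the position a fraction t (0..1) of the way along a polyline.
// Progress is measured by point index, not arc length (a simplification).
// Expects at least two points.
function positionAlong(points, t) {
  const clamped = Math.min(Math.max(t, 0), 1);
  const scaled = clamped * (points.length - 1);
  const index = Math.min(Math.floor(scaled), points.length - 2);
  const local = scaled - index;
  const a = points[index];
  const b = points[index + 1];
  return [
    a[0] + (b[0] - a[0]) * local,
    a[1] + (b[1] - a[1]) * local,
    a[2] + (b[2] - a[2]) * local
  ];
}

// A made-up path from an exporter (first point) to an importer (last point).
const path = [[0, 0, 0], [10, 0, 0], [10, 10, 0]];
const quarterWay = positionAlong(path, 0.25);
const halfWay = positionAlong(path, 0.5);
console.log(quarterWay, halfWay);
```

    Advancing `t` a little each frame and wrapping it back to 0 gives a steady stream flowing in one direction.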

    At this stage there were still a few problems that I needed to solve. How do I represent categories of weapons sales: do I use color? Particle icons? Something else? Another problem you can see is the country centers. At this point they were computed by averaging each country’s vertex locations, which usually doesn’t give what we think of as the ‘center of a country’. For example, the United States gets pulled north due to the number of vertices Alaska contains. This was eventually fixed by looking up the actual lat lons of countries instead of averaging vertices.

    To prepare for the eventual UI that was going to be built, I prototyped the filter functionality by adding DAT.gui, a simple UI library made by friends on the team. This is really handy for getting quick checkboxes and dropdowns when doing creative coding work that needs to be adjusted on the fly.

    I decided to change a few things in my design at this point, given feedback from the team and after examining our options. First, the pins (hexagonal bricks) representing the countries were too visually noisy, since the lines and particles were the actual “meat” of what we should be looking at. Second, the colors needed to be fixed so that they actually represented something useful other than simply showing different countries. Third, the particles themselves needed a bit of love. Finally, I also planned on adding some sort of 2D “HUD” markers that sit on top of the globe to act as selectors for the countries, in lieu of doing actual country picking (more on this later).

    The first problem was really tricky: I wanted to flatten the representation of countries back to a texture on the globe, but I also needed to highlight countries. Essentially I needed a dynamically updatable texture map, with any of the 250+ countries capable of lighting up when I needed them to.

    A few considerations came to mind… I could render each country as vertex / quad geometry, but I would have to produce this geometry somehow, which would be pretty time intensive, and it was questionable whether it would work out well.

    Alternatively … I could have two hundred and fifty country image panels and somehow map their UVs for rendering. Again, this sounded really painful and I dreaded the notion of trying to map these correctly back onto the globe.

    I tried this wonky solution of using an SVG of the world map (the equirectangular svg map found on Wikipedia), and using this cool little library called CanVG that renders SVGs onto canvas, and then injecting the resulting canvas into a GL texture.

    It worked alright, but the rendering time was around two seconds per render (!). Given the level of interactivity I wanted, that was unacceptable.

    I talked to Ryan Alexander about this problem, asking if there was a GL shader solution. He recommended that I try a lookup table which I’ll attempt to explain here.

    The strategy revolves around having a globe texture with each country colored at a different grayscale index. I wrote a small Processing app that took the SVG I was using previously and rendered each country (thank god the SVG was already split by ISO-3166 country codes!!) as a different gray value, between 1 and 255. If there were more than 255 countries I would run into problems… but thankfully that wasn’t the case. Additionally, pure black 0 was used to represent the ocean (no country).

    In addition to outputting this image, I also had the Processing program write out a JSON array of grayscale lookup to country code.
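
    The Processing program isn’t shown, so here is only a JavaScript sketch of the bookkeeping it describes: each country code gets a gray index from 1 to 255, index 0 is reserved for the ocean, and the resulting array maps a gray value back to a country code. The function names and the order of the sample codes are assumptions:

```javascript
// Assign each country code a gray index from 1 to 255.
// Index 0 is reserved for the ocean (no country).
function buildIndexTable(countryCodes) {
  if (countryCodes.length > 255) {
    throw new Error("An 8-bit gray channel can only index 255 countries");
  }
  const indexToCode = [null]; // slot 0: ocean
  const codeToIndex = {};
  countryCodes.forEach(function (code, i) {
    indexToCode.push(code);
    codeToIndex[code] = i + 1;
  });
  return { indexToCode: indexToCode, codeToIndex: codeToIndex };
}

// Map a gray value read from the index image back to a country code,
// or null for the ocean and for unused gray values.
function codeForGray(table, grayValue) {
  return table.indexToCode[grayValue] || null;
}

// Made-up ordering; the real assignment came from the SVG.
const indexTable = buildIndexTable(["US", "TW", "FR"]);
console.log(JSON.stringify(indexTable.indexToCode));
```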

    Finally I cooked up this bit of GLSL code. The idea is really simple… there are two image buffers: the first holds the image above, and the second is a 256x1 pixel buffer that holds the actual color to render for each country. For example, ‘FR’ (France) has a grayscale value of 3, so in the second image buffer I look up the pixel at index 3 (counting from 0 at the left) for the color to render France with.

    With this I was able to mask any number of countries and color them instantaneously, all done from the video card. Using the same lookup map I was able to render one pass at the original grayscale (0-255 index) values to do picking, so that ended up being a two for one victory.

        <script id="globeFragmentShader" type="x-shader/x-fragment">
            uniform sampler2D mapIndex;
            uniform sampler2D lookup;
            uniform sampler2D outline;
            uniform float outlineLevel;
            varying vec3 vNormal;
            varying vec2 vUv;
            void main() {

                vec4 mapColor = texture2D( mapIndex, vUv );
                float indexedColor = mapColor.x;
                vec2 lookupUV = vec2( indexedColor, 0. );
                vec4 lookupColor = texture2D( lookup, lookupUV );
                float mask = lookupColor.x + (1.-outlineLevel) * indexedColor;
                mask = clamp(mask,0.,1.);
                float outlineColor = texture2D( outline, vUv ).x * outlineLevel;
                float diffuse = mask + outlineColor;
                gl_FragColor = vec4( vec3(diffuse), 1.  );

            }
        </script>
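
    On the JavaScript side, the 256x1 lookup buffer is just 256 pixels of data that get rewritten whenever the selection changes. How the project filled and uploaded it isn’t shown, so the following is a sketch under assumptions: the function name, the brightness levels, and the RGBA byte layout are all hypothetical, and the upload to a GL texture is left out. The shader above reads only the red channel of `lookup`, so the same level is written to all three color channels.

```javascript
// Build pixel data for a 256x1 RGBA lookup texture. Pixel i holds the
// brightness used for the country whose gray index is i.
function buildLookupPixels(highlightedIndices, baseLevel, highlightLevel) {
  const pixels = new Uint8Array(256 * 4);
  for (let i = 0; i < 256; i++) {
    let level = baseLevel;
    if (i === 0) {
      level = 0; // index 0 is the ocean and stays black
    } else if (highlightedIndices.has(i)) {
      level = highlightLevel;
    }
    pixels[i * 4] = level;     // red
    pixels[i * 4 + 1] = level; // green
    pixels[i * 4 + 2] = level; // blue
    pixels[i * 4 + 3] = 255;   // alpha
  }
  return pixels;
}

// Highlight the country with gray index 3 and dim all the others.
const lookupPixels = buildLookupPixels(new Set([3]), 30, 255);
console.log(lookupPixels[3 * 4], lookupPixels[2 * 4]);
```

    Changing the highlight means rewriting at most 256 pixels, which is why it feels instantaneous compared to re-rendering the SVG.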

    The next problem I needed to tackle was the ‘country markers’, which were planned to include country names. Originally I used a crappy custom-written SVG-pinning library I wrote for shits and giggles. This allowed me to see what 2D HUD elements looked like on the globe.

    The 2D elements are really DOM elements, since I didn’t want to render any type or solve those kinds of issues in WebGL. This is what the web is best for, after all. The trick is to translate the 3D position to a 2D screen-space position and then use CSS to place the element correctly.

    The following snippet does the conversion, using THREE.js’s projector class.

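    // Project a world-space position to 2D screen-space pixel coordinates.
    // Relies on a global `camera`.
    function screenXY(vec3){
    	var projector = new THREE.Projector();
    	// after projection, x and y are in normalized device coordinates (-1 to 1)
    	var vector = projector.projectVector( vec3.clone(), camera );
    	var result = {};
    	result.x = Math.round( vector.x * (window.innerWidth/2) ) + window.innerWidth/2;
    	// screen y grows downward, so the sign is flipped
    	result.y = Math.round( (0-vector.y) * (window.innerHeight/2) ) + window.innerHeight/2;
    	return result;
    }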
    
    then…
    var matrix = rotating.matrixWorld;
    var abspos = matrix.multiplyVector3( country.center.clone() );
    var screenPos = screenXY(abspos);

    Now we can assign these values to the element’s CSS left and top styles for placement.
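
    That last step is tiny, but for completeness here is a sketch. It assumes the marker is absolutely positioned in CSS, and it uses a plain object as a stand-in for a real DOM element so that it runs outside a browser. `placeMarker` is a hypothetical name:

```javascript
// Position a marker by writing pixel offsets into its style.
// `marker` is any object with a `style` property, for example a DOM span
// that has `position: absolute` set in CSS (an assumption of this sketch).
function placeMarker(marker, screenPos) {
  marker.style.left = screenPos.x + "px";
  marker.style.top = screenPos.y + "px";
}

// Stand-in for a real element, so the sketch runs without a DOM.
const fakeMarker = { style: {} };
placeMarker(fakeMarker, { x: 320, y: 240 });
console.log(fakeMarker.style.left, fakeMarker.style.top);
```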

    Eventually I replaced the crappy SVG lib with just straight DOM spans as markers, and started adding typography so we can see the country names. There are also a few tricks to figuring out the sizing, occlusion behind the globe, and LODing, but I won’t cover that here. You can see the (ugly) logic behind this in markers.js.
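
    For the occlusion part, one common approach (which may or may not be what markers.js does) is a dot-product test: a point on a sphere centered at the origin faces the camera when the vector from that point to the camera points the same way as the surface normal. Here is a sketch under that assumption, with positions given in world space after the globe’s rotation has been applied:

```javascript
// A point P on a sphere centered at the origin is on the camera-facing side
// when (C - P) . P > 0, where C is the camera position.
// Both arguments are plain [x, y, z] arrays in world space.
function isFacingCamera(point, cameraPos) {
  const dot =
    (cameraPos[0] - point[0]) * point[0] +
    (cameraPos[1] - point[1]) * point[1] +
    (cameraPos[2] - point[2]) * point[2];
  return dot > 0;
}

// Made-up setup: globe radius 100, camera 300 units out on the z axis.
const cameraPosition = [0, 0, 300];
const nearSideVisible = isFacingCamera([0, 0, 100], cameraPosition);
const farSideVisible = isFacingCamera([0, 0, -100], cameraPosition);
console.log(nearSideVisible, farSideVisible);
```

    A marker that fails the test can simply be hidden with CSS.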

    If you’ll recall, the color space was initially being used to represent countries. Now that we have a much cleaner country separation, we could use color to represent other things. At first I tried using color to represent weapons categories, but this led to problems due to additive blending, which tends to output a rainbow splash instead of anything visually useful.

    We finally decided to distill (simplify) the color space down to two colors that represent purely imports and exports, and upon reflection this was probably the wisest decision I’ve made in my career doing data viz. It’s still imperfect, since the colors do tend to merge (and I stubbornly refuse to give up on additive blending, see this image for comparison), but the end result strikes a decent balance between aesthetics and actual utility.

    Our team hired Pitch Interactive to complete the remaining UI elements which ‘completes’ the visualization, including bar graphics for import export and category comparisons, filtering options, and a timeline scrubber with a historical graph showing change over time. 

    I wish I could talk even more about Robert Muggah’s investigation of the data. Listening to him talk about it is really jaw-dropping, since he can point out global conflicts directly on the visualization: for example, in 1999 the arms race between India and Pakistan caused a gigantic spike in ammunition sales from Russia to India. Or, depending on the country, you can tell the exact brand of pistol or rifle (e.g. if it came from Italy vs. Germany).

    Here are images from the INFO event in Los Angeles featuring this visualization. Photos courtesy of Google.

    Overall I’m pretty happy with the way the visualization turned out. It was a pretty classic “make this data look sexy” exercise, and for all the technical hurdles, the project itself was very straightforward.

    • July 30, 2012 (4:05 pm)
© 2011–2013 Ghost Hack